DataMigrator User's Guide

Feedback

Copyright © 2016 iWay Software

This documentation describes how to install and use DataMigrator, formerly ETL Manager. It is intended for the user responsible for transforming data by designing and executing data and process flows.

In some instances, DataMigrator is abbreviated as DM.

For up-to-the-minute information, please refer to the release notes.

How This Manual Is Organized

This manual includes the following chapters:

Chapter/Appendix

Contents

1

Introduction to DataMigrator

Provides background information about DataMigrator architecture, components, concepts, and uses.

2

Setting Up DataMigrator

Provides setup information for DataMigrator developers and administrators, including overviews for local and remote DataMigrator Server configurations.

3

Planning a DataMigrator Application

Offers conceptual and practical information to be considered before designing a DataMigrator application.

4

Working in the Data Management Console

Describes the Data Management Console (DMC) interface.

5

Designing a Data Flow

Provides detailed information about all aspects of Data Flow design.

6

Designing a Process Flow

Provides detailed information about all aspects of Process Flow design.

7

Tutorial: Creating Data and Process Flows With DataMigrator

Provides step-by-step instructions for building a Data Flow and a Process Flow using sample data.

8

Generating Reports, Viewing the Log File, and Printing Flows

Enables you to view reports on how flows run, what the DataMigrator Server environment looks like, and what error messages the DataMigrator Server generates.

9

Optimizing Performance

Provides tips for ensuring successful deployment of your data mart and data warehouse implementations.

10

Working With Synonyms

Provides information about creating synonyms in the Data Management Console and using the Synonym Editor.

11

Advanced DataMigrator Tips and Techniques

Provides a series of examples to demonstrate the capabilities of DataMigrator.

12

Change Data Capture

Describes how to use Change Data Capture to enable reading from database logs to determine what rows in a table have changed.

A

DataMigrator Log Statistics and Tables

Provides information about DataMigrator internal information.

B

Event-Based Scheduling

Describes event-based scheduling options that enable execution a flow outside of DataMigrator.

C

Problems, Errors, and Troubleshooting

Assists you in resolving problems that may occur during the development of your DataMigrator application.

D

Customizing Your Environment

Provides information about customizing the DataMigrator environment.

E

Source Management

Source control programs enable you to manage flows, stored procedures, and synonyms, typically providing file backup, change tracking, and version control

Summary of New Features

This table describes the new features and enhancements for Version 7.7.06.

New Feature

For more information, see...

Data Management Console

Check File

A report that shows the number of segments, fields, and the total length of all fields is available from the Properties pane for a synonym.

File or Synonym Context Menu Options

Transformations

When you enter a transformation directly in a transformation grid, the format is now determined and displayed.

 

Dependencies Analysis

A new context menu item when selecting a flow, procedure, or synonym runs a new report that lists all the synonyms used by the object. For flows, it lists any flows and procedures that the flow calls.

Navigation Pane: Procedures and Navigation Pane: Synonyms

Update Synonym

Create Synonym for relational database tables dialog box has a new update option that allows you to select which attributes should be updated from the database.

Creating Metadata

Quick ETL Copy

Table and synonym names are only suffixed with _t unless necessary, in order to avoid a conflict. An existing synonym can now be used as a target. When selected rows with matching key values are updated, new rows are inserted. A new option allows selecting columns to copy. When copying a single table, key columns are now propagated. For a synonym that contains a cluster join that represents a Star Schema, a new option allows retaining the multi-segment and multi-dimension structure when creating the target table. A new option Copy auxiliary metadata copies Defines, Filters, and Variables from the source synonym to the target.

How to Create a Quick ETL Copy for a Single Table

Add Server Node

The dialog box when adding or editing a server node has been improved.

How to Add a Node to Identify the Server to the Data Management Console

Copy Server Node

A server node can now be copied to make it simpler to create a similar node.

Server Shortcut Menu Options

Noncontiguous Keys

Key columns for a relational database table no longer have to be grouped together as the first fields in a synonym. The synonym can now show fields in the same order as the column appears in the table. Key column selection in target transformation now also allows selecting discontiguous keys.

Segment Attributes Summary and Create a Data Target and Target Properties Pane for Relational Targets

Tabbed Traces View

Server traces now appear in a tabbed view, making it easy to switch between the list of traces and the traces.

How to View Traces

Accessibility

New menu options have been added so that all operations can be performed without the use of a mouse.

Global Variables Editor

A new editor facilitates editing global variables.

How to Set a Value for Global Variables

Tutorial Files

Samples files have been renamed tutorials.

Data and Process Flows

Updates for Type I Fields

A new option, Updates for Type I fields, is available for a Slowly Changing Dimension Load. This setting specifies how updates will affect Type I fields. The available options are Change all rows or Change only active rows. The default value is Change all rows.

Target Properties Pane for Relational Targets

New Target Synonym Location

Instead of always defaulting to baseapp, new targets are now created in the same directory as the flow, or if not specified in the first directory, in the user app path.

 

Discontiguous Keys

New tables in an RDBMS can be created with discontiguous key columns.

Annotations in Flows

Additional notes can be added to properties for objects in data and process flows to document them.

Find Objects in Flows

On the Flow Ribbon, a new Find Flow Object group allows searching for objects in data and process flows.

Flow Tab

Find Text in SQL Object

Search dialog box for text in the SQL object can be invoked from a button or Ctrl-F.

How to Edit the SQL Statement

Upgrade to Extended Bulk Load

When a data flow that uses load type Bulk Load via Disk File is opened and resaved, the load type is updated to use Extended Bulk Load for supported databases: Teradata 14, Sybase ASE, Sybase IQ, and Hyperstage.

Load Images from Files

Images and other binary files can be loaded into BLOB (Binary Large Object) fields in supported relational databases.

Loading Images Into a Database Table

JSON Document Target

A flow can now use an existing JSON synonym as a target, allowing mapping to a pre-defined structure.

Creating a Data Flow Using a Target Based on a Predefined XML Schema

Options for Data Flow Designer

A new option when all columns are selected allows also copying Defines, Filters, and Variables.

Calculators

Simplified Functions

Some frequently used functions have new alternates with a simplified syntax requiring fewer arguments.

Balloon Tips

If you do not specify a format, or if the format is incorrect, a balloon tip now appears notifying you of the change. A balloon tip also appears if you do not specify a name for a new field.

Creating a Source Transformation in a Data Source Object

Missing

You can specify whether missing (Null) values are allowed directly from the calculator.

Transformation Calculator

Customize Column Display

The function list can also include the Category and SQL conversion notes for each function.

How to Change the Display of Columns and Variables in a Calculator

ANY and ALL in WHERE Calculator

The calculator for WHERE filters now has ANY and ALL buttons and supports their use as predicates.

WHERE Filter Calculator

Reporting

Impact Analysis for Connections

An Impact Analysis report is now available for a connection showing every flow and synonym where the connection is used.

Navigation Pane: Adapters Folder

Impact Analysis

The impact analysis report now includes synonyms used as an Iterator file.

Impact Analysis Reports on Synonyms, Procedures and Columns

Running

Slowly Changing Dimensions

This load option can now be used with Direct Load Flows.

Iterator

The iterator can now be used for procedures, as well as for flows.

Properties Attribute Pane for Procedures

Extended Bulk Load

Now available for Greenplum, Hive/Impala, Sybase ASE, and Sybase IQ, in addition to Hyperstage and Teradata.

Load Options

New Scheduling Options

New options for flows scheduled yearly, monthly, or weekly allow running a flow on specific or recurring days of the week. For example, on the third Thursday of November.

Source Code Control

Configure Source Control

This can now be done from the ribbon Home tab.

Home Tab

Multi Select

You can now multi select objects with shift or ctrl and left click before a right-click to check-in, check-out, or perform other source code control operations.

Using a Source Control System to Manage Procedures and Synonyms

Adapters

Flat File or Delimited Flat File

The source or target location for a file can now be on an SFTP (SSH FTP) server, in addition to an FTP server.

Sending a Target File to an FTP or SFTP Server

View Delimited Flat File

When creating a synonym for a Delimited Flat File, a new option allows viewing the file to ascertain its format.

Change Data Capture for MS SQL Server

MS SQL Server 2012 is now supported.

This table describes new features and enhancements for Version 7.7.05.

New Feature

For more information, see...

DataMigrator Flows

Direct Load Flows

A Direct Load flow can now have a load type of IUD. This means that a source synonym for table log records used for Change Data Capture can be used as input.

 

Flow Properties Email

Email messages configured from Flow Properties, which include the log on completion or failure of the flow, can now have a customized subject line, message and importance.

Flow Properties Pane - Email Attribute

Column Naming Strategy

The column Alias (for relational databases this is the actual name) can be used.

 

Prior to load - Truncate

This option is now available for DB2 and UDB tables on all platforms, except IBM i which still does not support it.

 

Show/Modify Data

Data in a table can be viewed and edited.

 

Default DBMS Error Limit Set

You can specify when to stop processing the number of DBMS errors received if the number of errors specified is exceeded.

User Preferences - Run Options

Sending Messages in Process Flows

You can now send email messages to email addresses specified in a procedure.

Flow Properties Pane - Email Attribute

Data Management Console - General

New Files

The New menu now lets you create a file on the server. The Save As button or menu options lets you save it as any supported file type.

 

Traces

You can view traces of the application from the DMC.

Setting User Preferences

New Column Option

There is a new Index option that shows the name of any indexes on the underlying relational database table.

How to Change the Information Displayed for Columns

Backup and Restore Server Configuration Files

There are new options to backup and restore server configuration files.

How to Create a Backup of Server Configuration Files

and

How to Restore Server Configuration Files

Star Schemas and SCD

Change Flag

A new SCD type of Change Flag for Type 1 changes records the date or date and time when a row is updated.

 

This table describes the new features and enhancements for Version 7.7.04.

New Feature

For more information, see...

DataMigrator Flows

Procedures in Parallel Group

A parallel group can now contain procedures in addition to flows.

 

Conditional Action After Email Objects

Multiple objects with different conditions can be used after an email object.

 

Select Multiple Objects

You can select multiple objects with Ctrl + Click to move them as a group.

Moving Objects in the Workspace

Recreate DBMS Table

Recreate DBMS table creates tables with indexes when included in the synonym.

Navigation Pane: Synonyms

Parallelization and Partitioning

Quick ETL Copy for a single source table can now split the table into a user specified number of partitions based on a user selected numeric key column and run them in parallel.

Parallelization and Partitioning

Calculator

Where Condition Sample Values

The list of columns displayed in the where condition calculator can be expanded to show sample values retrieved from the data source or the synonym.

 

Function Assist for EXISTS

A new function assistant for the SQL EXISTS clause simplifies building this test for rows containing specified value(s).

 

Subquery in Select List

A subquery can be used in the list of columns in the select list.

Using Subqueries

HAVING Filter

A condition can be applied to restrict data retrieved after aggregation has been performed with a HAVING filter.

 

Data Management Console - Browser

Additional File Types

The browser now displays additional types of files, including text and XML documents that can be managed, as well as opened and edited with the text editor.

 

Data Profiling - Row Count

When only the count of rows is needed, this new data profiling option is quicker than running the statistics report.

 

File Types

Additional file types are now displayed in the browser. Files with text data can be edited in the text editor.

 

Data Management Console - General

Change Case in Text Editor

When editing files in the text editor, selected text can be changed to all upper case, all lower case, or the case can be inverted.

Text Editor Context Menu

Print Preview

When viewing a report, Print Preview now provides a more accurate image of what the report will look like when printed.

 

Core Engine Settings

Certain core engine settings can now be set from a user dialog without the need to edit a profile.

How to Use Core Engine Settings

Line Colors

You can now control the line colors using Tools Options.

Setting User Preferences

Recent Files

A list of recently used files is now retained and can be selected from the File menu choice Recent Files. The number of files to display can be selected.

 

Autosave

Open files can be automatically saved after a specified number of minutes have elapsed.

User Preferences - General

Synonym Editor

Indexes

Indexes, in addition to the primary key, are shown in the synonym when created. Additional indexes can be added to the synonym when creating a table from a synonym are created too.

 

Column Field Names

The maximum length of field names in a synonym and column names supported in an RDBMS table has been increased from 64 to 128 characters.

 

Undo/Redo

Editor actions can be undone by clicking the Undo button on the toolbar in the Synonym Editor or Data Flow Designer. The maximum number of undo operations can be specified.

User Preferences - Synonym Editor or User Preferences - Data Flow Designer

Database Default Values

If a database column has default value assigned in the database, and no value is assigned in Target Transformations, the database will now assign the default value. In previous releases, a NULL value was always assigned for NULLABLE columns and space or zero for NOT NULL columns.

Unassigned Fields

DataMigrator Reporting

Printing Reports

A new Print Preview option gives a more accurate view of what the report output will look like when printed. Forward and Back buttons enable paging through the report.

The print dialog box allows printing the entire report or just the selected range.

Working With a Report

Report Output

Output from stored procedures that generate reports can now either be sent as text to the log as in prior release or formatted and sent to a special output queue.

Scheduler Configuration Window

Backup

When a backup is made of the DataMigrator Log and Statistics tables, a synonym is also created that can be used to read the tables. A message appears in the Console Log with the tables, locations, and names.

DataMigrator Scheduler

Start or Stop Scanning

You can start or stop the scheduler from scanning from the Data Management Console.

How to Start or Stop the Scheduler

Run When Server Starts

Flows can be scheduled to start whenever the server is started. This is used for flows that need to run all the time, such as those using Change Data Capture or the File Listener capabilities.

 

File Listener

File Listener can now monitor a directory for files or sub-directories that are added, changed, or deleted.

See File Listener Attributes

Documentation Conventions

The following conventions apply throughout this manual:

Convention

Description

THIS TYPEFACE

or

this typeface

Denotes syntax that you must enter exactly as shown.

this typeface

Represents a placeholder (or variable) in syntax for a value that you or the system must supply.

underscore

Indicates a default setting.

this typeface

Represents a placeholder (or variable), a cross-reference, or an important term. It may also indicate a button, menu item, or dialog box option that you can click or select.

Key + Key

Indicates keys that you must press simultaneously.

{  }

Indicates two or three choices. Type one of them, not the braces.

[  ]

Indicates a group of optional parameters. None are required, but you may select one of them. Type only the parameter in the brackets, not the brackets.

|

Separates mutually exclusive choices in syntax. Type one of them, not the symbol.

...

Indicates that you can enter a parameter multiple times. Type only the parameter, not the ellipsis (...).

.
.
.

Indicates that there are (or could be) intervening or additional commands.

Related Publications

Visit our Technical Documentation Library at http://documentation.informationbuilders.com. You can also contact the Publications Order Department at (800) 969-4636.

Customer Support

Do you have any questions about this product?

Join the Focal Point community. Focal Point is our online developer center and more than a message board. It is an interactive network of more than 3,000 developers from almost every profession and industry, collaborating on solutions and sharing tips and techniques, http://forums.informationbuilders.com/eve/forums.

You can also access support services electronically, 24 hours a day, with InfoResponse Online. InfoResponse Online is accessible through our website, http://www.informationbuilders.com. It connects you to the tracking system and known-problem database at the Information Builders support center. Registered users can open, update, and view the status of cases in the tracking system and read descriptions of reported software issues. New users can register immediately for this service. The technical support section of www.informationbuilders.com also provides usage techniques, diagnostic tips, and answers to frequently asked questions.

Call Information Builders Customer Support Service (CSS) at (800) 736-6130 or (212) 736-6130. Customer Support Consultants are available Monday through Friday between 8:00 a.m. and 8:00 p.m. EST to address all your questions. Information Builders consultants can also give you general guidance regarding product capabilities. Please be ready to provide your six-digit site code number (xxxx.xx) when you call.

To learn about the full range of available support services, ask your Information Builders representative about InfoResponse Online, or call (800) 969-INFO.

Information You Should Have

To help our consultants answer your questions most effectively, be ready to provide the following information when you call:

User Feedback

In an effort to produce effective documentation, the Technical Content Management staff welcomes your opinions regarding this document. Please use the Reader Comments form at the end of this document to communicate your feedback to us or to suggest changes that will support improvements to our documentation. You can also contact us through our website http://documentation.informationbuilders.com/connections.asp.

Thank you, in advance, for your comments.

iWay Software Training and Professional Services

Interested in training? Our Education Department offers a wide variety of training courses for iWay Software and other Information Builders products.

For information on course descriptions, locations, and dates, or to register for classes, visit our website (http://education.informationbuilders.com) or call (800) 969-INFO to speak to an Education Representative.

Interested in technical assistance for your implementation? Our Professional Services department provides expert design, systems architecture, implementation, and project management services for all your business integration projects. For information, visit our website (http://www.informationbuilders.com/support).