Specifying Data Target Options

After the data targets have been added to the flow, two types of options govern the behavior of the data flow when copying data into the targets: data flow properties, which apply to the flow as a whole, and target-specific properties.

Target Load Options

Load options specify the way DataMigrator loads data into the targets. If you are moving data to more than one target, they will all use the same load type. The load type is displayed on all target objects and, if changed in one target, the change will be reflected in all the target objects.

Load options vary depending on the type of target. They can include:

  • Insert/Update. Allows you to set a behavior when loading records.

    If you select this option, you can set the behavior for duplicate records using the If the record exists drop-down menu. Insert/Update options can be set on a per-target basis.

    If the record exists:

    • Include the record. Allows the relational database target to handle duplicates.
    • Reject the record. Issues a SELECT command against the target table to see if a record exists with the key values. If there are no key columns, DataMigrator screens all columns. If the record is found, the record is rejected.
    • Update the existing record. Updates the record if the key value on the input record, or the entire record if there are no keys, is found in the table. All non-key values are updated in the target.
    • Delete the existing record. Deletes the record if the key on the input record, or entire record if no keys, is found in the table.

    If the record does not exist:

    • Include the record. Includes the key in the data target.
    • Reject the record. Does not include the key in the data target. This option is only available if Update or Delete the existing record was selected.

      Note: When loading relational data targets, the Reject, Update, and Delete options can adversely affect performance, because they determine the existence of a key value on an incoming record before performing the specified action. This is done by issuing a SELECT command against the target table, and then waiting for the response. If one of these actions is required, try to limit the volume of records in the incremental change. These actions will perform best if there are unique keys on the table.

    If you select Update the existing record or Delete the existing record from the If the record exists drop-down menu, you can also set the behavior when the record does not exist, using the If the record does not exist drop-down menu.

    • Commit every row(s). Specifies the row interval to commit, or write, transactions to the database.
  • Insert records from memory. Speeds the loading of the data target by inserting a block of rows at once. You can set the row interval to commit or write transactions, and the number of records to load in a block.

    Note: This option is only available for Db2/UDB CLI, MySQL, MS SQL Server, Oracle, Informix, Sybase ASE, and Teradata 13. For other databases, the number of records in a block will default to 1 and cannot be changed. For details about loading relational targets, see How to Set Options for Relational Targets and Target Properties Pane for Relational Targets.

    Insert records from memory:

    • Loads much faster, since a block of rows is inserted at once.
    • Requires input data to be clean. If any one row in the block causes a data source constraint violation, such as a NOT NULL or unique index violation, the entire block is rejected for an MS SQL Server target. For other targets, the behavior depends on the database.
    • Does not provide row counts (the number of records inserted or rejected) in the detail log or statistics.
    • Does not provide record logging (used to write rejected records to a flat file for review).
  • Bulk load utility via a disk file. Uses database bulk loaders instead of iWay to insert data into a target. DataMigrator automates bulk loading for Greenplum, Hyperstage, Ingres, Informix, Microsoft SQL Server, IBM Db2, MySQL, Teradata, Nucleus, Oracle, Sybase Adaptive Server Enterprise, and Sybase Adaptive Server IQ. You can set format version, maximum number of errors, first and last row to copy, packet size, and row interval to commit or write transactions.

    The options available will depend on the RDBMS of your target type.

    Note that rows rejected by a DBMS vendor bulk load program are not counted as errors. Users of the bulk loader should consult the vendor-specific documentation on how to handle errors from the bulk loader program.

    Note:
    • To use a database vendor bulk load program, such as bcp from MS SQL Server or Sybase, sqlload from Oracle, or mysql from MySQL, that program must be installed on the system where the DataMigrator Server is running, and must be in the PATH.
    • Does not provide row counts (the number of records inserted or rejected) in the detail log or statistics.
  • Extended Bulk Load Utility. Uses a database loader program with additional capabilities. Currently available for Apache Hive, Cloudera Impala, Google BigQuery, EXASol, Greenplum, Hyperstage, Informix, Jethro, MongoDB, MS SQL Server ODBC, MySQL, Netezza, Oracle, PostgreSQL, Salesforce.com, Sybase ASE, Sybase IQ, Teradata, and 1010data.

    The options available will depend on the RDBMS of your target type. For more information on the options available for each adapter, see Extended Bulk Load Utility Options.

    Note: Does not provide row counts (the number of records inserted or rejected) in the detail log or statistics.

  • IUD Processing. Enables you to load a data target with only the records that have changed. This feature is an optional, add-on component to DataMigrator available for selected relational databases.

    Note: This setting is not available if the Optimize Load option is enabled in the Data Flow properties.

  • Slowly Changing Dimensions. Enables you to load a data target with column values that change over time. There are two column types. Type I simply overwrites the value. Type II allows you to track activity before and after the change. Type II changes are handled either using an activation flag or end date/begin date. When you update a Type II column, you do not actually change its value. Instead, you update the activation flag to inactive, or the end date to the current date. You then add a new record with the new value and the activation flag on by default or the begin date set.

    Note: This setting is not available if the Optimize Load option is enabled in the Data Flow properties.

  • IUD/Slowly Changing Dimensions. A combination of the previous two load types. The Insert/Update/Delete indications in the CDC logs are not applied directly to the target. Instead, the SCD Types of the target are used.

    Note: This setting is not available if the Optimize Load option is enabled in the Data Flow properties.
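The per-record existence check that underlies the Insert/Update behaviors above can be illustrated with a minimal sketch in Python against a SQLite table. This is an illustration of the logic only, not DataMigrator's implementation; the table and column names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO target VALUES (1, 10)")

def load_record(conn, rec, if_exists="update"):
    # Existence check: a SELECT against the target table, keyed on the
    # input record (this is the step the performance note warns about).
    row = conn.execute("SELECT 1 FROM target WHERE id = ?",
                       (rec["id"],)).fetchone()
    if row:
        if if_exists == "reject":
            return "rejected"
        if if_exists == "update":
            conn.execute("UPDATE target SET qty = ? WHERE id = ?",
                         (rec["qty"], rec["id"]))
            return "updated"
        if if_exists == "delete":
            conn.execute("DELETE FROM target WHERE id = ?", (rec["id"],))
            return "deleted"
    conn.execute("INSERT INTO target VALUES (?, ?)", (rec["id"], rec["qty"]))
    return "inserted"

print(load_record(conn, {"id": 1, "qty": 99}))  # existing key: updated
print(load_record(conn, {"id": 2, "qty": 5}))   # new key: inserted
```

Because every incoming record costs one round trip before the action, the volume of records and the presence of a unique key on the lookup column dominate performance, as the note above describes.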

Extended Bulk Load Utility Options

This option uses a database loader program with additional capabilities and is currently available for specific adapters. When extended bulk load is used for large data volumes, faster throughput is achieved by creating multiple load sessions (or threads). The source is read in parallel, a block at a time, where the block size is set by the Commit every row(s) parameter. Multiple sessions are also created for loading, if the database load utility supports that functionality, up to the number specified in the Maximum number of load sessions parameter.

Note: The options available will depend on the RDBMS of your target type.

An example of the Extended Bulk Load Utility options for MS SQL Server can be seen in the following image:

MS SQL Server options

The following option is available for every adapter that has an Extended Bulk Load Utility load type.

Commit every row(s)

Number of rows in a block. Use at least the default value of 1,000,000.

The following parameters depend on the database:

Maximum Number of Load Sessions

The maximum number of concurrent load sessions logged on to the database. This parameter is applicable when the BULKLOAD parameter is set to ON. It can be set at the flow level and at the adapter level. The adapter-level setting applies to newly designed data flows, while the value set at the flow level takes precedence during execution.

The default value is 10. However, set this value carefully, considering environment factors such as the number of CPU cores, available memory, the parameters of the data access agents on the server, network capabilities, and the number of users running requests.

Maximum Number of Load Session Restarts

The maximum number of restart attempts after a recoverable load session error.

The above three options are the only Extended Bulk Load options available for the following adapters:

  • EXASol
  • Hyperstage
  • MySQL
  • MariaDB
  • Netezza
  • Oracle
Note:
  • The Extended Bulk Load Utility option must not be used on a target synonym for an Oracle table containing a Sequence, because the Oracle API used by the Extended Bulk Load Utility rejects triggers.
  • The "Bulk thread postponed: too many connections" informational message means that the maximum number of load threads has already been started, so the next available intermediate load file is queued and processed when one of the load threads becomes available. This situation occurs when either the Maximum number of load sessions value or the Commit every row(s) value is too low.
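The interaction between block size and session count can be sketched in Python with a thread pool. This is a schematic of the scheduling behavior only, under stated assumptions; the tiny values and the `load_block` stand-in are hypothetical, not actual DataMigrator internals:

```python
from concurrent.futures import ThreadPoolExecutor

COMMIT_EVERY = 3        # block size (Commit every row(s)), tiny for illustration
MAX_LOAD_SESSIONS = 2   # Maximum number of load sessions

def blocks(rows, size):
    # The source is read a block at a time; each block becomes an
    # intermediate load file handed to a load session.
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

def load_block(block):
    # Stand-in for one bulk load session consuming one intermediate file.
    return len(block)

rows = list(range(10))
# At most MAX_LOAD_SESSIONS blocks load concurrently; further blocks queue
# until a session frees up (the "Bulk thread postponed" situation).
with ThreadPoolExecutor(max_workers=MAX_LOAD_SESSIONS) as pool:
    loaded = sum(pool.map(load_block, blocks(rows, COMMIT_EVERY)))
print(loaded)
```

Raising either value reduces queuing: a larger Commit every row(s) produces fewer, bigger blocks, and a larger session limit drains the queue faster.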

The following sections describe the additional options that are available for each adapter.

Reference: Apache Hive, Cloudera Impala, and Salesforce.com Extended Bulk Load Options

Apache Hive, Cloudera Impala, and Salesforce.com targets have only the default Extended Bulk Load Utility option, as shown in the following image.

Hive extended bulk load options

Note: For the best performance when loading Apache Hive or Cloudera Impala, it is recommended that the DataMigrator Server be located on a data node or Edge Node (with the Hadoop client installed) of your Hadoop cluster.

However, it is possible to use DataMigrator to load data into Hadoop even when the servers are not co-located. In this case, the DataMigrator Server creates a temporary file and uses FTP to copy it to the Hadoop node. This requires that an FTP server be available on the Hadoop cluster and, in addition, for use with Impala, an REXEC daemon.

Example: Configuring an Auxiliary Connection for Extended Bulk Load

Using Extended Bulk Load with several adapters requires that intermediate data files are transferred to the server where the database is running or a data node of a Hadoop cluster. These adapters include:

  • Apache Hive
  • Apache Drill
  • Cloudera Impala
  • EXASol
  • Jethro
  • MongoDB
  • 1010Data

For these adapters, you must enable the Flat File adapter, create a connection to the server using FTP or SFTP, and configure Bulk Load to stage files using that connection. From the DMC or Web Console, perform the following steps:

  1. Expand the Adapters folder, or click the Connect to Data button on the sidebar.
  2. Expand the Configured folder.
  3. Right-click Flat File and click Add Connection.
  4. Right-click the adapter connection and click Configure Bulk Load.

    The Configure Bulk Load dialog box opens.

  5. In the Connection Name box, type the name of the connection for your Hive, Impala, or database server. The connection names must match.
  6. From the Protocol drop-down menu, select FTP or SFTP.
  7. In the Host Name box, type the name of the host for your server.

    Note: For Hive and Drill, this must be a data node of your Hadoop cluster. For Impala, it must be a data node or an Edge node. For Jethro, it must be the system where the Jethro server is running. For EXASol, it should not be the system where EXASol is running, but rather another system that EXASol can access using FTP.

  8. In the Initial Directory box, type the name of a directory that you have write access to, such as your home directory.
  9. In the User and Password boxes, type your user ID and password.
  10. Click Test.

    The following message should appear: User has read access to the FTP server.

    Note: Although not tested at this time, you must also have write access. A future release will add testing for write access.

  11. Click OK to close the dialog box.
  12. If successful, click Configure.
  13. Click Close.
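As a rough sketch of what the staging step amounts to (not DataMigrator's actual implementation; the host, directory, file name, and credentials below are hypothetical, and the transfer call is shown but not executed), the server writes a delimited intermediate file and copies it to the configured node over FTP:

```python
import csv
import io
from ftplib import FTP  # used by the transfer step, shown but not executed here

# Step 1: a DataMigrator-style intermediate file, written as delimited text.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t")
writer.writerows([(1, "a"), (2, "b")])
payload = buf.getvalue().encode()

def stage_to_node(host, user, password, directory, name, data):
    # Step 2: copy the intermediate file to the data node over FTP,
    # into a directory the user has write access to.
    ftp = FTP(host)
    ftp.login(user, password)
    ftp.cwd(directory)
    ftp.storbinary(f"STOR {name}", io.BytesIO(data))
    ftp.quit()

# stage_to_node("datanode1", "user", "secret", "/home/user",
#               "stage.tsv", payload)  # requires a reachable FTP server
print(payload.decode().splitlines())
```

This is why the Test button checks access to the FTP server: the load fails if the configured user cannot write the staged file to the initial directory.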

Reference: Greenplum Extended Bulk Load Options

Greenplum targets have the following additional options, as shown in the following image.

Greenplum bulk load options
Column Delimiter

Delimiter characters between fields in the intermediate flat file that is loaded into the database. The delimiter can be up to four characters and can be specified as the following:

  • TAB. A tab character. This value is the default.
  • a - A. A character string. For example, ~.
  • 0x nn. A hex code. For example, 0x2C (a comma), or 0x0D0A (a carriage return and a linefeed). The hex code uses ASCII for Windows or UNIX systems and EBCDIC for IBM Mainframes.
Escape Character

Specifies the single character that is used for escape sequences, and for escaping data characters that might otherwise be interpreted as row or column delimiters.

Column Enclosure

Characters used to enclose each alphanumeric value in the intermediate flat file that is loaded into the database. The enclosure consists of up to four printable characters. The most common enclosure is one double quotation mark. Numeric digits; symbols used in numbers, such as a period (.), plus sign (+), or minus sign (-); and 0x09 (Horizontal Tab), 0x0A (Line Feed), or 0x0D (Carriage Return) cannot be used in the enclosure sequence. To specify a single quotation mark as the enclosure character, you must enter four consecutive single quotation marks.

Note: When a non-default value is used, the intermediate file is created in what Greenplum calls CSV format, instead of TEXT format.
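The TEXT versus CSV distinction can be illustrated with Python's csv module, which exposes analogous delimiter and enclosure (quoting) controls. This is an analogy for how the options shape the intermediate file, not the actual loader output; the sample rows are hypothetical:

```python
import csv
import io

rows = [(1, "alpha"), (2, "beta,comma")]

# TEXT-style intermediate file: TAB delimiter, no enclosure (the defaults).
text = io.StringIO()
csv.writer(text, delimiter="\t", quoting=csv.QUOTE_NONE).writerows(rows)

# CSV-style intermediate file: comma delimiter, with a double-quote
# enclosure around alphanumeric (non-numeric) values.
quoted = io.StringIO()
csv.writer(quoted, delimiter=",", quoting=csv.QUOTE_NONNUMERIC).writerows(rows)

print(text.getvalue())    # fields separated by tabs, values bare
print(quoted.getvalue())  # alphanumeric values enclosed in double quotes
```

With a tab delimiter and no enclosure, an embedded comma needs no special handling; once a non-default enclosure is in play, the file must be parsed as CSV, which is the switch the note above describes.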

Global Settings

These options are only available when BULKLOAD=ON and apply to all new flows using Greenplum as the target.

Note: ALIAS_CASE controls the case of column names when new tables are created. Greenplum users may want to select Enforce lower case.

Reference: Informix Extended Bulk Load Options

Informix targets have the following additional options, as shown in the following image.

Informix extended bulk load options
Column Delimiter

Delimiter characters between fields in the intermediate flat file that is loaded into the database. The delimiter can be up to four characters and can be specified as the following:

  • TAB. A tab character. This value is the default.
  • a - A. A character string. For example, ~.
  • 0x nn. A hex code. For example, 0x2C (a comma), or 0x0D0A (a carriage return and a linefeed). The hex code uses ASCII for Windows or UNIX systems and EBCDIC for IBM Mainframes.

Reference: Jethro Extended Bulk Load Options

Jethro targets have the following additional options, as shown in the following image.

Column Delimiter

Delimiter characters between fields in the intermediate flat file that is loaded into the database. The delimiter can be up to four characters and can be specified as the following:

  • TAB. A tab character. This value is the default.
  • a - A. A character string. For example, ~.
  • 0x nn. A hex code. For example, 0x2C (a comma), or 0x0D0A (a carriage return and a linefeed). The hex code uses ASCII for Windows or UNIX systems and EBCDIC for IBM Mainframes.

Reference: MS SQL Server ODBC Extended Bulk Load Options

MS SQL Server ODBC targets have the following additional options, as shown in the following image.

MS SQL Server options
Packet size to send

The packet size in bytes.

Reference: PostgreSQL Extended Bulk Load Options

PostgreSQL targets have the following additional options, as shown in the following image.

PostgreSQL bulk load options
Column Delimiter

Delimiter characters between fields in the intermediate flat file that is loaded into the database. The delimiter can be up to four characters and can be specified as the following:

  • TAB. A tab character. This value is the default.
  • a - A. A character string. For example, ~.
  • 0x nn. A hex code. For example, 0x2C (a comma), or 0x0D0A (a carriage return and a linefeed). The hex code uses ASCII for Windows or UNIX systems and EBCDIC for IBM Mainframes.
Escape Character

Specifies the single character that is used for escape sequences, and for escaping data characters that might otherwise be interpreted as row or column delimiters.

Column Enclosure

Characters used to enclose each alphanumeric value in the intermediate flat file that is loaded into the database. The enclosure consists of up to four printable characters. The most common enclosure is one double quotation mark. Numeric digits; symbols used in numbers, such as a period (.), plus sign (+), or minus sign (-); and 0x09 (Horizontal Tab), 0x0A (Line Feed), or 0x0D (Carriage Return) cannot be used in the enclosure sequence. To specify a single quotation mark as the enclosure character, you must enter four consecutive single quotation marks.

Note: When a non-default value is used, the intermediate file is created in what PostgreSQL calls CSV format, instead of TEXT format.

Global Settings

These options are only available when BULKLOAD=ON and apply to all new flows using PostgreSQL as the target.

Note: ALIAS_CASE controls the case of column names when new tables are created. PostgreSQL users may want to select Enforce lower case.

Reference: Sybase Extended Bulk Load Options

Sybase ASE and Sybase IQ targets have the following additional options, as shown in the following image.

Sybase Extended Bulk Load options
Column Delimiter

Is the delimiter character(s) used between fields in the intermediate flat file that is loaded into the database.

The delimiter can be up to four characters and can be specified as:

TAB. A tab character. This is the default value.

a - A. A character string. For example, ~.

0x nn. A hex code, for example, 0x2C (a comma) or 0x0D0A (a carriage return and a linefeed).

Row Delimiter

Is the delimiter character used between records in the intermediate flat file that is loaded into the database. The row delimiter can be specified in the same manner as the column (field) delimiter, except that the comma character (,) is not permitted.

Activate Server Log

This parameter is available for Sybase IQ only.

Activates logging of information about integrity constraint violations and the types of violations. With this option, two types of LOG files can be written to a temporary space in the configuration directory: the MESSAGE LOG file etlblk.msg and the ROW LOG file etlblk.log.

For Sybase IQ:

  • Yes. Writes both MESSAGE LOG and ROW LOG files: etlblk.msg and etlblk.log.
  • No. The LOG files are not written. This is the default value.
Note:
  • Extended Bulk Load supports all Sybase IQ native data types, except BINARY and VARBINARY.
  • A user running a data flow using Extended Bulk Load to Sybase IQ must have the following permissions:
allow_read_client_file = on
allow_write_client_file = on

This can be confirmed by using the following Sybase stored procedure:

sp_iqcheckoptions

If these permissions are not set, the flow will fail and display the following error:

FOC1261 PHYSICAL INTERFACE NOT FOUND FOR SUFFIX SQLSYB. MODULE   
NAME : dbodbc12_r 

Reference: 1010data Extended Bulk Load Options

1010data targets have the following additional options, as shown in the following image.

Column Delimiter

Delimiter characters between fields in the intermediate flat file that is loaded into the database. The delimiter can be up to four characters and can be specified as the following:

  • TAB. A tab character. This value is the default.
  • a - A. A character string. For example, ~.
  • 0x nn. A hex code. For example, 0x2C (a comma), or 0x0D0A (a carriage return and a linefeed). The hex code uses ASCII for Windows or UNIX systems and EBCDIC for IBM Mainframes.

Reference: Teradata Extended Bulk Load Options

The Teradata Extended Bulk Load Utility uses TPT (Teradata Parallel Transporter) to load data.

Teradata targets have the following additional options, as shown in the following image.

Teradata Extended Bulk Load options
Rejected records maximum number

The maximum number of rejected records that can be stored in one of the error tables before the job is terminated. If unspecified, the default is unlimited.

TDP Id

The Teradata Director Program Id.

Account Id

The Account Id used to access the database.

Data error table

Name of the table to contain records that were rejected because of data conversion errors, constraint violations, or AMP configuration changes. This must be a new table name. If not specified, the database default is used.

Index violation table

Name of table to contain records that violated the unique primary index constraint. This must be a new table name. The default is the database default.

Reference: Db2 Target Bulk Load Options

Db2 targets have the following options:

Load Options window
Loading method

INSERT. Insert the rows only. This is the default.

REPLACE. Drops the existing destination table, and creates a new table.

Note:
  • The bulk load option for Db2 is available for z/OS and IBM i only when configured for CLI, not CAF.
  • For bulk load on IBM i, the target table name for Db2 must not exceed 10 characters.
  • For bulk load on IBM i, the target table name must be a qualified name. The current library name must be specified, for example, library1/table1.

Reference: Db2 Target Insert Records from Memory Options

The Bulk Insert API (FASTLOAD) can be in effect for Db2 targets with the load option Insert records from memory. The command that changes the setting for Db2 FASTLOAD is:

ENGINE DB2 SET FASTLOAD [ON|OFF]

When in effect, the default value for FASTLOAD is ON.

A SET command for Db2 to control whether or not the FASTLOAD operations are recoverable is:

SQL DB2 SET FASTLOAD_OPTION RECOVERABLE/NONRECOVERABLE

The default value is RECOVERABLE.
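Taken together, the two settings above can be placed in the server profile. A sketch, using exactly the syntax of the commands shown above, that enables FASTLOAD without recoverability:

```
ENGINE DB2 SET FASTLOAD ON
SQL DB2 SET FASTLOAD_OPTION NONRECOVERABLE
```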

Reference: MySQL Bulk Load Options

There are no additional bulk load options for MySQL.

Reference: Nucleus Target Bulk Load Options

Nucleus targets have the following options:

Load options window
Version of ndl
Single-user

Starts the server in exclusive mode. No other database sessions may be established.

Multi-user

The server must already be started in multi-user mode and listening on the default port 1285.

Overwrite the error log file
Yes

Creates a new file.

No

Appends to existing log.ndl file.

Commit loaded rows if error occurs
Yes

Issues a commit.

No

Issues a rollback.

First input file bytes to ignore

Number of bytes to skip before loading data.

Null indicator

Specifies the character to represent null.

First input file rows to skip

Number of rows to skip before loading data.

Number of input file rows to process

Number of rows to process before stopping.

Disconnect all other connections

Disconnects connections.

Skip rows w/ unprintable characters

Skips rows with unprintable characters.

Server password

NDL server password. Required in Single-user (ndls) mode when a server password exists.

Reference: Red Brick Target Bulk Load Options

Red Brick targets have the following options:

Load Options window
Start record number in input field

Row number to begin copying data to the target.

End record number in input field

Row number to end copying data to the target.

Locale

The combination of language and location.

Maximum number of discarded records

The maximum number of discarded records allowed before loading stops. The default is the target RDBMS default.

Discard file name

File where duplicate records are discarded. The records will be stored for possible reloading.

RI discard file name

File where discarded records based on referential integrity will be stored for possible reloading.

Loading mode

INSERT. Insert the rows only. If the table is not empty, the load operation ends. This is the default.

APPEND. Used to insert additional rows of data into an existing table. Each new row must have a primary-key value that does not already exist in the table. Otherwise, the record is discarded.

REPLACE. Replaces the entire contents of a table.

MODIFY. Used to insert additional rows or to update existing rows in a table. If the input row has the same primary-key value as an existing row, the new row replaces the existing row. Otherwise, it is added as a new row.

UPDATE. Updates existing rows in an existing table. Each new row must have a primary-key value that is already present in the table. Otherwise, the record is discarded.

MODIFY AGGREGATE. If the primary key of the input row matches an existing row in the table, the existing row is updated as defined for the specified aggregate operator. If the primary key of the input row does not match an existing row in the table, the row is inserted.

UPDATE AGGREGATE. If the primary key of the input row does not match the primary key of a row already in the table, the input row is discarded. If it does match an existing row, the existing row is updated as defined for the specified aggregate operator.

Optimize

OFF. Indexes are updated when each input row is inserted into the data file, which provides better performance when the data being loaded contains many duplicate rows. This is the default.

ON. Overrides the global optimize mode setting in the rbw.config file.

Reference: Teradata Target Bulk Load Options

Teradata targets have the following options:

Bulk Teradata Options

Note: Teradata has two bulk load programs that DataMigrator supports. FastLoad is used for new target tables, and MultiLoad is used for existing target tables. Not all options shown in the image above are available for new target tables.

UNIX only

Two environment variables, $FASTLOAD_EXE and $MLOAD_EXE, must be set to use bulk load.

$FASTLOAD_EXE. Specifies the location of the Teradata FastLoad utility.

$MLOAD_EXE. Specifies the location of the Teradata MultiLoad utility.

Two lines should be added to the profile, or server startup shell script (edastart), similar to the following:

export FASTLOAD_EXE=~teradata/fastload
export MLOAD_EXE=~teradata/mload
Rejected records maximum number

The maximum number of rejected records allowed before loading stops. The default is the target RDBMS default.

TDP Id

The Teradata Director Program Id.

Account Id

The account Id used to access the database.

Maximum number of sessions

The maximum number of MultiLoad or FastLoad sessions logged on to the Teradata database.

Start record number in source

Row number to begin copying data to the target.

End record number in source

Row number to end copying data to the target.

Work table

Name of the work table.

Acquisition phase errors table

This table provides information about all errors that occur during the acquisition phase of your Update operator job, as well as some errors that occur during the application phase if the Teradata RDBMS cannot build a valid primary index.

Application phase errors table

This table provides information about uniqueness violations, field overflow on columns other than primary index fields, and constraint errors.

Loading Method

INSERT. Inserts the rows only. This is the default.

UPSERT. Does inserts for missing update rows.

Note: This option is available for existing target tables with keys.

AMP Check

The MultiLoad response to a down Access Module Processor (AMP) condition.

ALL. Pauses the MultiLoad job when AMP is down.

NONE. Allows the MultiLoad job to start, restart, or continue as long as no more than one AMP is down in a cluster.

APPLY. Inhibits the MultiLoad job from entering or exiting the application phase when an AMP is down.

Note: This option is available for existing targets only.

Log Table

Restarts the log table for the MultiLoad checkpoint information. This option is available for existing target tables only.

Notes

Enter an annotation for this target.

Procedure: How to Set the Data Flow Record Logging Options

Setting the data flow record logging options allows you to write particular types of transactions to log files.

Although Record Logging options appear in every data target, they are set on a per flow basis. Changing the options in one target will reset them for all targets.

  1. On the Flow tab, in the Tools group, click Properties, or right-click anywhere in the workspace and click Flow Properties.

    The Properties pane opens.

  2. Expand the Record Logging attribute.
  3. Select the check box next to the options that you want to log.

    Record logging options are optional and are not available when the data target is a formatted file.

  4. Click Save in the Quick Access Toolbar.

    For more information on record logging options for a data flow, see How to Set the Flow Record Logging Properties.

Target Properties Pane

To access the Target Properties pane, right-click a target object and click Properties. The following image shows the Properties pane for an existing Relational target.

All targets have the following properties:

Display Name

By default, shows the application name, if enabled, and synonym name. Controls the name that is shown under the object in the flow.

Notes

Enter an annotation for this target.

Type

Is the target type.

Adapter

Is the target adapter.

Synonym

Is the synonym name.

Note: Synonym names cannot be longer than eight characters.

The following sections describe the Target Options available for each different target.

Reference: Target Properties Pane for Relational Targets

The Target Options section of the Properties pane for relational targets contains the following additional options:

Connection

Is the connection for the data target. For a relational data target, this is a database server. For ODBC, this is a data source.

Table

Is the name of data target table.

Keys

Are the names of the key columns, in order. This option is only available for new targets.

Prior to Load Options:
These options are only available for existing targets.
No changes

Does not delete the records already in a data target.

Delete all rows from table

Deletes all rows and creates a database log.

Truncate table

Deletes all rows from the table, but does not generate a database log. This is a faster option than using Delete all rows from table.

The Target Load Options include the following load options:

Load Type:

Note: Although Load Type appears in every data target, it is set on a per flow basis. Changing the type in one target will reset it for all targets.

Insert/Update

Specifies the behavior of DataMigrator while loading data.

Insert/Update options can be set on a per target basis.

If you select this option, you can set a behavior when duplicate records are found from the If the record exists drop-down menu.

  • Include the record. Allows the relational database target to handle duplicates.

    Note: If you select Include the record, the record is passed directly to the relational database, which determines whether to accept it or not. If inserting a record would result in a duplicate key, the RDBMS will reject it due to a Unique Index constraint violation and return an error. Processing continues even if such errors occur, up to the number of errors specified under Stop processing after __ DBMS errors in the General properties of the flow.

  • Reject the record. Issues a SELECT command against the target table to see if a record exists with the key values. If there are no key columns, DataMigrator screens all columns. If the record is found, the record is rejected.
  • Update the existing record. Updates the record if the key value on the input record, or the entire record if there are no keys, is found in the table. All non-key values are updated in the target.
  • Delete the existing record. Deletes the record if the key on the input record (or entire record if no keys) is found in the table.

    If you select Update the existing record or Delete the existing record from the If the record exists drop-down menu, you can also set behavior when the record does not exist using the If the record does not exist drop-down menu:

  • Include the record. Includes the key in the data target.
  • Reject the record. Does not include the key in the data target.

    Note: The Reject, Update, and Delete options can adversely affect performance because they determine the existence of a key value on an incoming record before performing the specified action. This is done by issuing a SELECT command against the target table, then waiting for the response. If one of these actions is required, try to limit the volume of records in the incremental change. These actions will perform best if there are unique keys on the table.

  • Commit every row(s). Specifies the row interval to commit or write transactions to the database.
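
The existence checks and actions described above can be sketched as simple decision logic. The following Python sketch uses an in-memory dict as a stand-in for the relational target table; the option names mirror the drop-down choices and are not a DataMigrator API.

```python
# Sketch of the Insert/Update decision logic, assuming an in-memory dict
# keyed on the target's key columns stands in for the relational table.

def apply_record(table, key, record, if_exists, if_not_exists="include"):
    """Return 'inserted', 'updated', 'deleted', or 'rejected'."""
    if key in table:                      # the SELECT existence check
        if if_exists == "reject":
            return "rejected"
        if if_exists == "update":
            table[key] = record           # all non-key values are replaced
            return "updated"
        if if_exists == "delete":
            del table[key]
            return "deleted"
        # "include" passes the duplicate straight through; a real RDBMS
        # would reject it with a unique-index constraint violation
        return "rejected"
    if if_not_exists == "reject":
        return "rejected"
    table[key] = record
    return "inserted"
```

Every call checks for the key first, which models why the Reject, Update, and Delete options cost one lookup per incoming record.
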
Insert records from memory

When you specify this load type, you need to specify:

  • Commit every row(s). Specifies the row interval to commit or write transactions to the database.
  • Block size. Specifies how many records you want to process at a time.

Note: The value for Commit every row(s) should be a multiple of the Block size, for example, 10000 and 2000.
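
To see why aligning the two settings matters, the following sketch (hypothetical numbers, not a DataMigrator API) simulates block-wise loading with a commit interval:

```python
# Simulates block inserts with a periodic commit. When commit_rows is a
# multiple of block_size, every commit lands exactly on a block boundary.

def load_events(total_rows, block_size, commit_rows):
    events = []
    sent = 0
    while sent < total_rows:
        n = min(block_size, total_rows - sent)
        sent += n
        events.append(("insert_block", n))
        if sent % commit_rows == 0:
            events.append(("commit", sent))
    if not events or events[-1][0] != "commit":
        events.append(("commit", sent))   # commit any remaining rows
    return events
```

With a block size of 2000 and a commit interval of 10000, a commit occurs after every fifth block; with an interval that is not a multiple of the block size, commits can only fall on block boundaries, so the requested interval cannot be honored exactly.
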

Bulk load utility via a disk file

These options will depend on your target type.

Extended bulk load utility

Uses a database loader program with additional capabilities. For more information, see Extended Bulk Load Utility Options.

Slowly Changing Dimensions

This option is only available for existing relational targets.

Enables you to load a data target with column values that change over time. There are two column types. Type I simply overwrites the value. Type II allows you to track activity before and after the change. Type II changes are handled either using an activation flag or an end date/begin date. When you update a Type II column, you do not actually change its value. Instead, you update the activation flag to inactive, or the end date to the current date. You then add a new record with the new value, and the activation flag on by default or the begin date set.

  • Updates for Type I. Enables you to specify how updates will affect Type I fields. The options are Change all rows or Change only active rows. Change all rows is the default value.
  • Commit every row(s). Specifies the row interval to commit or write transactions to the database.
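
The Type II mechanics can be sketched as follows, assuming an activation-flag implementation with illustrative column names (not the actual generated table layout):

```python
# Sketch of a Type II slowly changing dimension change using an activation
# flag: the current row is flagged inactive and a new active row is added,
# so the history of values is preserved.

def scd2_update(rows, key, new_value):
    for row in rows:
        if row["key"] == key and row["active"]:
            row["active"] = False          # retire the current version
    rows.append({"key": key, "value": new_value, "active": True})

dim = [{"key": "C1", "value": "Boston", "active": True}]
scd2_update(dim, "C1", "Chicago")
# dim now holds both versions: Boston (inactive) and Chicago (active)
```

An end date/begin date implementation works the same way, except that the retired row gets its end date set to the current date instead of a flag change.
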
IUD Processing

Enables you to load a data target with only the records that have changed. This feature is an optional, add-on component to DataMigrator. When this option is selected, the Prior to Load option is eliminated.

IUD/Slowly Changing Dimensions

This load type is a combination of the previous two load types. The Insert/Update/Delete indications in the CDC logs are not applied directly to the target. Instead, the SCD Types of the target are used and deleted records in the source are marked as inactive in the target.

Reference: Set Options for Relational Targets
  1. In the data flow workspace, right-click the data target, and click Properties.

    The target Properties pane opens.

  2. For existing targets, select whether to remove data prior to loading the data target in the Prior to Load Option drop-down menu. The options are:
    • No Changes. Does not delete the records already in a data target.
    • Delete all rows from table. Deletes all rows and creates a database log.
    • Truncate table. Deletes all rows from the table but does not generate a database log. This is a faster option than Delete all rows from table.

      Note: Truncate table is not supported by Db2 on IBM i.

  3. Select a Load Type. The options are:
    • Insert/Update. Allows you to set a behavior when loading records.
    • Insert records from memory. Speeds the loading of the data target by inserting a block of rows at once. You can set the row interval to commit or write transactions and the number of records to load in a block. This option is only available for relational databases that support it, including Db2 on i V6R1 for CLI, Db2 on z, Informix, MS SQL Server, MySQL, Oracle, Sybase ASE, Teradata 13, and UDB.

      This option:

      • Requires clean input data. If any row in the block causes a data source constraint violation, such as not null or unique index, the entire block is rejected.
      • Does not provide row counts (the number of records inserted or rejected) in the detail log or statistics. NA (not available) will appear instead.
      • Does not provide record logging. For example, rejected records cannot be written to a flat file for review.
    • Bulk load utility via a disk file and Extended Bulk Load Utility. Use database bulk loaders instead of iWay to insert data into a target. DataMigrator automates bulk loading for Hyperstage, Ingres, Informix, Microsoft SQL Server, IBM Db2, Teradata, Nucleus, Oracle, Sybase Adaptive Server Enterprise, and Sybase Adaptive Server IQ. You can set format version, maximum number of errors, first and last row to copy, packet size, and row interval to commit or write transactions.
    • Slowly Changing Dimensions. Enables you to load a data target with column values that change over time. There are two column types. Type I simply overwrites the value. Type II allows you to track activity before and after the change. Type II changes are handled using either an activation flag or an end date/begin date. When you update a Type II column, you do not actually change its value. Instead, you update the activation flag to inactive, or the end date to the current date by default. You then add a new record with the new value and the activation flag on by default or the begin date set.
    • IUD Processing. Enables you to load a data target with only the records that have changed. This feature is an optional, add-on component to DataMigrator.

      Note: The IUD Processing option is only available for existing relational targets.

    • IUD/Slowly Changing Dimension. Is a combination of the previous two load types. The Insert/Update/Delete indications in the CDC logs are not applied directly to the target. Instead, the SCD Types of the target are used.

Reference: Target Properties Pane for FOCUS/FDS or XFOCUS Targets

How to:

The Target Options section of the Properties pane for FOCUS/FDS or XFOCUS targets contains the following additional options:

Data File

Is the name of the data file pointed to by the synonym.

On IBM z/OS, to create a file in HFS (hierarchical file system), enter the name. To create a dataset, enter the name as

//'qualif.tablename.FOCUS'

where:

qualif

Is a fully qualified location.

tablename

Should match the synonym name.

Keys

Is the number of key columns. This option is only available for new targets.

Prior to Load Options:

These options are only available for existing targets.

No changes

Does not drop (delete) the data target.

Drop Table

Drops and recreates the data target.

Load Type

Specifies the method DataMigrator uses to load data.

Insert/Update

Since DataMigrator uses Insert/Update to load FOCUS/FDS or XFOCUS targets, you can set a behavior when duplicate records are found from the If the record exists drop-down menu. These options can be set on a per target basis.

  • Include the record. Includes the duplicate record in the data target.
  • Reject the record. Issues a SELECT command against the target table to see if a record exists with the key values. If there are no key columns, DataMigrator screens all columns. If the record is found, the record is rejected.
  • Update the existing record. Updates the record if the key value on the input record, or the entire record if there are no keys, is found in the table. All non-key values are updated in the target.
  • Delete the existing record. Deletes the record if the key on the input record (or entire record if no keys) is found in the table.

    If you select Update the existing record or Delete the existing record from the If the record exists drop-down menu, you can also set behavior when the record does not exist using the If the record does not exist drop-down menu:

    • Include the record. Includes the key in the data target.
    • Reject the record. Does not include the key in the data target.
    • Commit every row(s). Specifies the row interval to commit or write transactions to the database.
Reference: Set Options for FOCUS/FDS or XFOCUS Targets
  1. In the data flow workspace, right-click the data target and click Properties.

    The Target Properties pane opens.

  2. For existing targets, select whether to remove data prior to loading the data target in the Prior to Load Options section. The options are:
    • No changes. Does not delete the records already in a data target.
    • Drop Table. Drops and recreates the data target.
  3. Select a behavior for loading records using the If the record exists and If the record does not exist drop-down menus. These options can be set on a per target basis. They are:
    • Include the record. Includes the duplicate record in the data target.
    • Reject the record. Issues a SELECT command against the target table to see if a record exists with the key values. If there are no key columns, DataMigrator screens all columns. If the record is found, the record is rejected.
    • Update the existing record. Updates the record if the key value on the input record, or the entire record if there are no keys, is found in the table. All non-key values are updated in the target.
    • Delete the existing record. Deletes the record if the key on the input record (or entire record if no keys) is found in the table.
  4. If you select Update the existing record or Delete the existing record from the If the record exists drop-down menu, you can also set behavior when the record does not exist using the If the record does not exist drop-down menu:

    Include the record includes the key in the data target. Reject the record does not include the key in the data target.

  5. Click OK.
    Note:
    • For FOCUS/FDS or XFOCUS targets, the only Load Type available is Insert/Update.
    • Synonym names for FOCUS/FDS files cannot be longer than eight characters.

Reference: Target Properties for New XML Targets

DataMigrator can create a new XML document with name/value pairs. For more information on any other structures, see Creating a Data Flow Using a Target Based on a Predefined XML Schema.

Note: In order to create an XML document, a Server Administrator must first configure an adapter for XML.

The Target Options section of the Properties pane for XML targets contains the following additional options:

Data File

Is the name of the XML document described by the synonym. DataMigrator will also create an XML Schema Definition with the same name as the data file and an extension of .xsd.

Load Type

Is set to Insert/Update for XML documents. Note that no updates are currently performed.

Records

Is the name of the top level element.

Record

Is the name of the row level element.

Column

Is the name of the column level element.

Example: New XML Targets

The following example uses the same table ibisamp/dminv as a source with all the actual columns, and then creates a new XML target called dminvx. The element names are entered as shown:

Records - Inventory

Record - Item

Column - Detail

The first three rows of the resulting XML document would look like the following example when viewed in a browser:

[Image: XML Target Example]
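
Since the screenshot is not reproduced here, the nesting implied by those element names can be sketched as follows (field content is elided; the exact name/value representation depends on the generated synonym):

```xml
<Inventory>
  <Item>
    <Detail>...</Detail>
  </Item>
  <Item>
    <Detail>...</Detail>
  </Item>
</Inventory>
```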

Reference: Target Properties Pane for Delimited Flat File Targets

The Target Options section of the Properties pane for delimited flat file targets contains the following additional options:

Note: Delimited Flat File will only appear as a target type if it is configured on the server.

[Image: Delimited Flat File Properties]
Connection

Is the connection (an FTP or SFTP server) where the data will be written.

Data File

Is the name of the data file pointed to by the synonym.

On IBM z/OS, to create a file in HFS (hierarchical file system), enter the name. To create a dataset, enter the name as

//'qualif.tablename.DATA'

where:

qualif

Is a fully qualified location.

tablename

Should match the synonym name.

Code Page

Specifies a code page for the target data.

Field Delimiter

Is the delimiter character(s) used between fields. The delimiter can be up to 30 characters. It can be specified as:

TAB. A tab character. This is the default.

A character string, for example, ~.

0xnn. A hex code, for example, 0x2C (a comma) or 0x0D0A (a carriage return and a linefeed). The hex code uses ASCII on Windows or UNIX systems and EBCDIC on IBM mainframes.

Header

Inserts column headings as the first row of data and surrounds the column names with the character specified in the Enclosure field. When set to Yes, the FIELDNAME is used for the header.

You can change this to use the TITLE as the header instead. To do so, expand the Adapters folder, right-click Delimited Flat File (CSV/TAB), and click Change Settings. From the HEADER drop-down menu, select Build header based on field title and click Save.

Enclosure

Is the character used to surround the column headings when Header is set to Yes.
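
The combined effect of the Field Delimiter, Header, and Enclosure settings can be illustrated with Python's csv module (hypothetical field names; this mimics the output shape only and is not DataMigrator itself):

```python
import csv
import io

# Mimic a delimited target with delimiter ~, Header=Yes, and enclosure ":
# the first row carries the column names surrounded by the enclosure
# character, followed by the data rows.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="~", quotechar='"',
                    quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(["PROD_NUM", "QTY"])   # header row; strings get enclosed
writer.writerow([1004, 12])            # data row; numbers are left bare
output = buf.getvalue()
```

Note that the csv module accepts only a single-character delimiter, whereas DataMigrator allows a delimiter string of up to 30 characters.
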

Reference: Target Properties Pane for Flat File Targets

The Target Options section of the Properties pane for flat file targets contains the following additional options:

[Image: Flat File Targets properties]
Connection

Is the connection configured for the selected adapter.

Data File

Is the name of the data file pointed to by the synonym.

On IBM z/OS, to create a file in HFS (hierarchical file system), enter the name. To create a dataset, enter the name as

//'qualif.tablename.DATA'

where:

qualif

Is a fully qualified location.

tablename

Should match the synonym name.

Code Page

Specifies a code page for the target data.

Prior to Load Options:

These options are only available for existing targets.

No changes

Does not delete the data target.

Delete File

Deletes and recreates the data target.

Reference: Target Properties Dialog Box for Formatted File Targets

How to:

The Target Options section of the Properties pane for formatted file targets contains the following additional options:

[Image: Formatted Flat File target properties]
Connection

Is the connection configured for the adapter. The default value is <local>.

Format

Is the format of the target data.

Data File

Is the name of the data file pointed to by the synonym.

Reference: Set Options for Flat, Delimited Flat, Formatted, and XML File Targets
  1. In the data flow workspace, right-click the data target and click Properties.

    The Target Properties pane opens.

  2. For existing targets, select whether to remove data prior to loading the data target in the Prior to Load Options section. The options are:

    No changes does not delete the data target. New records are appended.

    Delete File deletes and recreates the data target.

  3. Click OK.
    Note:
    • For flat files and XML files, the Load Type cannot be changed from Insert/Update, and the data is always appended.
    • For formatted file type EXL2K, the only Load Type available is Loading Flat File using HOLD. Excel workbooks will be created in .xlsx format.

Loading Data to Salesforce.com

How to:

Reference:

To insert or update rows in a Salesforce object, you can use a custom field identified as an External ID field.

Procedure: How to Create a Custom Field as an External ID

  1. Log in to salesforce.com.
  2. Click your name and select Setup.
  3. Under App Setup, expand Customize. Select an object, for example, Account.
  4. On the object page, click Add a field.
  5. Under Custom fields and relationships, click New.
  6. Select a data type appropriate to an ID field, such as Text or Number. Click OK.
  7. Enter a name, for example, Account ID. Fill in the other fields as desired.
  8. Select the External ID checkbox.
  9. Click Next twice and then click Save.

    The Account (or other object) Fields page opens. The new field is shown with its API name, for example, Account_ID__C.

  10. Create, or update, a synonym for Accounts (or another object).

Procedure: How to Identify an External ID Field In a Synonym

The synonyms created for Salesforce.com objects identify the ID field as the key. To use the External ID field as the key, you must edit the generated synonym and save a copy of it.

  1. In the DMC, double-click the synonym for Account (or another object).
  2. Right-click the ACCOUNT segment and click Properties.
  3. For the attribute KEY, type the name of the External ID field, such as ACCOUNT_ID__C.
  4. In the DMC, click Save As.
  5. Enter a new name, such as accountx. Click Save.

Reference: Using an External ID Field to Insert or Update Rows in a Salesforce.com Object

Using the alternative synonym, you can now use the External ID field to identify rows to be inserted or updated, as you would with a table in a relational database. Salesforce.com automatically performs an upsert: if a value of the External ID field matches a value already in the object being loaded, an update occurs. If not, an insert occurs.
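
That upsert behavior can be simulated with an in-memory stand-in for the Salesforce object (this illustrates the semantics only; it is not the Salesforce API):

```python
# Simulates upsert keyed on an External ID field: a matching External ID
# value updates the existing row; otherwise a new row is inserted.

def upsert(obj, external_id_field, record):
    key = record[external_id_field]
    for row in obj:
        if row[external_id_field] == key:
            row.update(record)
            return "updated"
    obj.append(dict(record))
    return "inserted"

accounts = [{"Account_ID__C": "A100", "Name": "Acme"}]
```
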

Reference: Deleting Rows from a Salesforce.com Object

The internal ID field is used to delete rows from a Salesforce.com object. To delete rows using the External ID, create a data flow that performs a DB_LOOKUP. You can also create a data flow that joins a table with a list of External ID values to be deleted to the Salesforce.com object to obtain the values of the corresponding ID field.
