Generating a Surrogate Key

Many data warehousing applications make use of a surrogate key instead of using the keys from an operational system, like an alphanumeric employee ID or an integer employee number.

A surrogate key is usually an incrementing integer with no intrinsic meaning. One reason applications use surrogate keys is to improve the performance of JOINs.

You can build a data flow to load a table that has surrogate keys by using a self-referential transformation to generate the values for the surrogate key field.

In the following example, DMSALE, a sales table, is the data source, and the extracted records are loaded into a table named DMREPS. Values for the surrogate key REPNO are generated when the data flow is run.

Refer to the sample data flow DMSURRG for the complete example.

Example: Generating a Surrogate Key in a Data Flow

In the DMC:

  1. Right-click an application directory in the navigation pane, point to New, and then click Flow. The Data Flow tab opens in the right pane, with the SQL object displayed.
  2. Drag the data source object DMSALE from the ibisamp application directory in the navigation pane into the workspace, to the left of the SQL object.
  3. Right-click the SQL object and click Column Selection.

    The Column Selection dialog box opens.

  4. Select EMPID, PLANT, and SALESREP in the Available Columns list and click the arrow to add them to the Selected Columns list.
  5. With all three columns selected, select Group By from the Aggregate drop-down menu.
  6. Click the Insert Columns button.

    The SQL Calculator opens.

  7. In the Alias box, enter REPNO for the virtual column.
  8. Click the Functions tab, expand the Miscellaneous folder, and double-click the COUNTBY function.

    The Function Assist dialog box opens.

  9. Leave the attribute value set to 1 and click OK.

    The SQL Calculator should appear as shown in the following image.

    SQL Calculator
  10. Click OK to close the SQL Calculator.

    The Column Selection dialog box should appear as the following image.

    Column Selection dialog box
  11. Click OK to close the Column Selector dialog box.
  12. Drag the data target object DMREPS from the ibisamp application directory into the workspace, to the right of the SQL object.
  13. Right-click the target object DMREPS, and click Properties.

    The Properties dialog box opens.

  14. If the flow may be rerun, you should clear the target using the Prior to Load Option drop-down menu. For a relational target, select Delete all rows from table or Truncate Table. For a FOCUS/FDS target, select Drop Table. Close the Properties dialog box.
  15. Right-click the target object and click Target Transformations.

    The Transformations dialog box opens.

  16. Click the AutoMap button to map REPNO, EMPID, and PLANT. This moves them to the Expressions list.

    To map FNAME and LNAME, use substring to extract the necessary characters from the SALESREP source column, of which the first 16 characters are first name and the next 34 characters are last name.

  17. Double-click FNAME in the Target Columns list.

    The Transformation Calculator opens.

  18. Type the following expression under the Expression tab:
    SUBSTR(50, SALESREP, 1, 16, 16, 'A16')

    Click OK.

  19. Double-click LNAME in the Target Columns list.

    The Transformation Calculator opens.

  20. Type the following expression under the Expression tab:
    SUBSTR (50, SALESREP, 17, 50, 34, 'A34')

    Note: While the creation of a surrogate key does not require the use of the SUBSTR function, it is a useful routine that can be used in a number of different situations.

    Click OK.

  21. To map STATE, double-click STATE in the Target Columns list.

    The Transformation Calculator opens.

  22. Type the following expression under the Expression tab:
    DECODE PLANT (BOS MA DAL TX LA CA ORL FL SEA WA STL MO)

    This expressions will convert a three-character city name to its two-character state code.

    Click OK.

  23. Double-click ACTIVE in the Target Column list.

    The Transformation Calculator opens.

  24. Type 1 in the Expression tab.
  25. Click OK.

    When you have completed the transformations, the Transformations dialog box should appear, as shown in the following image.

    Transformations dialog box
  26. Test the transformations by clicking the Test Transforms button.

    The Test Transformations dialog box opens, as shown in the following image.

    Test Transformations dialog box

    Note that REPNO is incremented by one for each record and that FNAME and LNAME are transformed correctly.

    Close the Test Transformations dialog box.

  27. Click OK to close the Transformations dialog box.
  28. Save the data flow as DMXSURRG.
  29. On the Flow tab, in the Run group, click Run, and then click Run from the drop-down menu.

When the data flow run is complete, verify that the target was loaded properly and that the surrogate key was generated by right-clicking the DMREPS object, clicking Expand, and then clicking the Sample Data tab in the window that opens. The following image shows the DMREPS window with the Sample Data tab selected.

There are two ways to confirm that the job ran correctly:

On the Flow tab, in the Reports group, click View Last Log.

or

Right-click the flow from the navigation pane, point to Logs, and then click Last Log.

The following image shows the log of the most recently run procedure.

WebFOCUS

Feedback