SQL Translation

Topics:

This section describes how the Extender for Db2 and the SQL Translator access heterogeneous relational and non-relational databases. The topics include:

Column Name Resolution

When resolving column names, the SQL Translator does not accept a unique truncation of a column name as a valid name for that column. For example, if you had a table with a column named EMPID, you cannot refer to that column as EMP (assuming no other column referenced in the request began with those three letters). You must refer to the column by its full name, EMPID.

Alternate Column Names

The SQL Translator is fully ANSI compliant; therefore, the user cannot use ALIAS= facility in the Master File. The ANSI specification states that a column has only one name. To rename a column logically, the application should create a view of that table with a different column name.

Dynamically Defined Virtual Fields

It is not possible to define a virtual field dynamically in a remote procedure and use it in subsequent SQL statements against that table. Virtual fields must be defined in the Master File. For more information about Master Files, consult the server Adapter Administration manual.

Answer Set Generation

Topics:

To provide completely transparent SQL access, SQL Translation Services create a Cartesian product style answer set in all cases, regardless of the nature of the underlying DBMS. A Cartesian product style answer set is in keeping with the SQL-based nature of the data access mechanism.

What Is a Cartesian Product?

A Cartesian product or set multiplication is defined as the pairing of each element of x with every element of y. This type of response is the expected result of a relational JOIN.

What Does This Mean to You?

This means that in some instances, the answer sets received are larger than expected. The Cartesian product generation only affects situations involving JOINs, either implicit or explicit.

  • Explicit join. Defined in the SQL statement used to generate the answer set.
  • Implicit join. Reference to any data structure made up of independent parts, such as segments in a hierarchy.

For example, there is a three-segment hierarchical database. The top segment represents departments, one child segment represents employees, and the other—the furniture used by that department. Assume that the payroll department has 20 employees and 22 desks. If you ask for all of the employees and furniture from the payroll department, the SQL user would expect to get each employee listed 22 times, once for each desk. This type of answer is the Cartesian product set answer, and is consistent with the result you would expect from an SQL-based DBMS, such as Db2.

The Cartesian product set answer can appear only under certain specific circumstances, as in the above example, with multi-path requests in a hierarchical data structure. In general, it only results in a repetition of rows.

Answer Set Generation Logic

The algorithm used to interpret the generation of answer sets is straightforward. This algorithm is the structure around which answer set generation is performed. This algorithm will be familiar to any experienced SQL user, but may be a new experience to application developers and users that are more familiar with other DBMS systems.

This algorithm does not correspond to the internal mechanism of generating answer sets, but is a convenient means of thinking about that process. The internal mechanisms are different because they have been optimized for performance in specific DBMS environments.

The (simplified) algorithm is:

  1. Create the Cartesian product of every logical table referenced in the answer set. A logical table is defined as:
    • A relational table or view.
    • A flat file (VSAM, C-ISAM, etc.).
    • A segment of a hierarchical database.
    • A segment of a network database.
    • Any other data structure designated as a segment in the Master File.
  2. Remove all rows from the Cartesian product that do not pass the screening criteria specified in the WHERE clause of the SQL statement.
  3. Calculate any valued expressions in the SQL statement.
  4. Perform the ordering and grouping specified in the SQL statement.
  5. Remove any repeated values if specified with the DISTINCT operator.
  6. Calculate the results of any column functions (SUM, COUNT).
  7. Remove the result rows that do not correspond to screening conditions in the HAVING clause of the SQL statement.
  8. Return the answer set.

What to Look For

Three things help to explain the Cartesian product answer set generation, particularly for users unfamiliar with SQL-based DBMSs.

  • Results of aggregate functions, such as sum or count, are generated after the Cartesian product is created. This means that in the department/employee/furniture example, if the user requests a sum of the employees' salaries in every department that had enough desks for all employees, they receive the result of 22 times the sum of the salaries for the payroll department. This is because each employee is associated with each desk, and each desk with each employee, resulting in 440 (20 X 22) items in the Cartesian product instead of 20.
  • The Cartesian product is generated for the referenced logical tables in the request. For a hierarchical database, this refers to the referenced subtree. If the department/employee/furniture request is changed to ask for only a sum of the salaries of the employees, with no reference to the desks, the result is the expected sum of the salaries. Since the desks were not referenced, the employees would not be repeated in the personnel department.
  • A row is created only when every logical table in the join exists. If you ask for the sum of salaries where there are enough desks, you do not get a result for departments that did not own any desks. This behavior is usually referred to as an inner join.

Additional Features for SQL Translation Services

  • ANSI Level 2.The SQL Translator is compliant with the ANSI Level 2 SQL definition.
  • Virtual Column Support.The SQL Translator supports the definition of virtual column in the Master File, and the use of these in any capacity in which you would use a regular database column.
  • SQL Join Improvements.The SQL Translator handles virtually any join predicate based on an equality condition, regardless of the indexing or other characteristic of the column. This eliminates the necessity to understand any of the characteristics of the DBMS in which the data is stored.
  • View Creation/Deletion.You can CREATE and DROP temporary views in server databases.
  • SQL Translation Performance Enhancements.The SQL Translator provides improved functionality and performance. A sophisticated JOIN optimizer is included.

These and other features allow server users to develop the client/server applications quickly and easily.

SQL Translation Services Limitations

The following limitations exist in the Extender for Db2.

  1. A maximum of 16 tables may be referenced in a single SQL statement.
  2. A maximum of six SELECT statements may be joined by the UNION operator.
  3. Correlated subqueries are not supported, except for Db2 tables.
  4. The maximum number of columns that may be in the column list of a SELECT statement is 256. Your actual limit for a given query may be less, because the SQL Translation Services may reserve several of these items for its own use.
  5. Date and time arithmetic are not supported.
  6. The maximum number of fields in a GROUP BY or ORDER BY clause is 32. Again, the SQL Translation Services may reserve a small number of these for its own use.
  7. The maximum length of an SQL statement is 4000 bytes.
  8. The maximum size of a row is 32K.

WebFOCUS

Feedback