About Column Options

Hyperstage supports NULL and NOT NULL specifications for columns.



LOOKUP Columns

Hyperstage provides an additional modifier for string data type columns, called a LOOKUP column. The LOOKUP column uses an integer substitution for values. You can declare a LOOKUP column on a CHAR or VARCHAR column to increase its compression and performance in queries. However, to use a LOOKUP column, the CHAR or VARCHAR column should meet the following criteria:

Typically, a LOOKUP column is useful for fields like state, gender, category, and similar fields where the number of instances is very high, but the number of unique values is very low. To determine the ratio of records to distinct values, determine the number of distinct values using SELECT COUNT (DISTINCT <COLUMN>) FROM… Then, compare this to the number of records using a SELECT COUNT(<COLUMN>) FROM…

Note: Using a LOOKUP on a column where there are more than 10,000 distinct values will result in greatly reduced load speeds.

To declare a column as a LOOKUP column, use the following example syntax when creating a table:

CREATE TABLE (a VARCHAR(200), b VARCHAR(200) LOOKUP=TRUE, c INTEGER) with (engine=infobright);

In this example, a LOOKUP attribute is associated with columns a and b, but column c is a standard integer column.

Note:



Optimizing Columns for INSERTs

Hyperstage provides an additional modifier for columns to help optimize for INSERT operations, called a for_insert column. The for_insert modifier ensures that the most recent data pack is left uncompressed allowing for faster INSERTs in the case of a large number of single INSERTs with AUTOCOMMIT enabled, or small frequent LOADs (< 10000 rows each) with AUTOCOMMIT enabled.

If you are expecting a large number of individual INSERTs or small frequent LOADs with AUTOCOMMIT enabled, you should consider setting the for_insert modifier on character columns and large numeric columns (for example, 64-bit random identifiers or partNo). Small numeric columns (for example, color number or region ID) can be decompressed and recompressed with ease, and are unlikely to gain performance benefit from the for_insert modifier. For columns marked as lookup, the for_insert modifier may give very little benefit only. For smaller machines, you may wish to leave the for_insert modifier off in order to maximize compression for disk space.

Note: Currently, you can only set the for_insert modifier at the time of table creation. Modifying the column using ALTER TABLE to add or remove the for_insert modifier will be supported in a future release.

To declare a column as a for_insert column, add the comment 'for_insert' on the column. Enter the following command:

psql> create table …
(…
<<column name>> <<column type>> … comment 'for_insert'  …
…)
engine=infobright;

Issuing a SHOW CREATE TABLE command will display if the 'for_insert' modifier has been used for each column.



Unsupported Column Options

The following column options are ignored by Hyperstage:



Unsupported Indices Options

Hyperstage uses Knowledge Grid technology instead of standard indices and does not support explicit indices. The following elements of CREATE TABLE syntax related to indices are not allowed:


WebFOCUS