Adapters for the following types of data sources support Unicode:
For information about all adapters, see the Adapter Administration technical content.
Relational adapters in a Unicode environment assume that the DBMS returns character data to the server already converted to Unicode. When writing to a relational data source, the relational adapters convert data to the encoding that the DBMS API expects (for example, UTF-8 for Oracle, UTF-16 for Microsoft SQL Server, and UTF-EBCDIC for Db2 on MVS).
XML-based adapters obtain the code page from the XML declaration of the processed XML document.
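As a rough illustration of what reading the code page from an XML declaration involves, the following Python sketch extracts the `encoding` pseudo-attribute from the start of a document. The function name `declared_encoding` and the fallback to UTF-8 when no declaration is present are assumptions made for this example; it is not the adapter's actual implementation, and it only handles documents whose declaration is stored in an ASCII-compatible encoding.

```python
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def declared_encoding(document: bytes, default: str = "UTF-8") -> str:
    """Return the encoding named in the XML declaration, or the default."""
    match = _XML_DECL.match(document)
    return match.group(1).decode("ascii") if match else default

print(declared_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
print(declared_encoding(b"<a/>"))
```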
The Adapter for Web Services generates SOAP requests using the UTF-8 code page.
Information Builders supports Db2 databases, version 8 and higher. To prepare the Db2 environment for Unicode on:
DB2CODEPAGE=1208
For example, for American English, you would export the following variables:
export LANG=EN_US.UTF-8
CURRENTAPPENSCH=UNICODE
The adapter supports Unicode only with the CLI interface.
In a Unicode environment, the Adapter for Db2 requires a BIND command for PREPARE/EXECUTE logic using parameter markers.
For information about data type support, see Relational Adapter Data Type Support for Unicode.
When retrieving a fixed-format sequential file, the server determines which code page to read the file with by first checking the CODEPAGE attribute in the Master File. If the Master File does not contain the CODEPAGE attribute, the server uses the value specified by the APP PROPERTY CODEPAGE command, if one was issued. If neither the attribute nor the command specifies a code page, the server code page is used to read the file.
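The lookup order described above can be sketched as a simple fallback chain. The function name `resolve_code_page` and the code page numbers in the usage line are hypothetical and serve only to illustrate the precedence; the server's internal logic is not shown in this documentation.

```python
from typing import Optional

def resolve_code_page(master_file_cp: Optional[int],
                      app_property_cp: Optional[int],
                      server_cp: int) -> int:
    """Pick the code page for reading a fixed-format sequential file."""
    if master_file_cp is not None:    # 1. CODEPAGE attribute in the Master File
        return master_file_cp
    if app_property_cp is not None:   # 2. APP PROPERTY ... CODEPAGE command
        return app_property_cp
    return server_cp                  # 3. server code page

# No Master File attribute, so the application-level setting is used.
print(resolve_code_page(None, 1252, 65001))
```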
If you use the Data Management Console to generate a data flow that creates a fixed-format sequential file, you can specify a code page in the Properties panel for the target object. DataMigrator will then create the fixed-format file in a way that can be read by the server when that server has been configured for the specified code page.
In a Unicode configuration, HOLD files in BINARY and ALPHA formats are created using UTF-8 conversion, which assigns each character three bytes of storage in ASCII environments or four bytes in EBCDIC environments. Fields defined in the Master File using the data type A in both the USAGE and ACTUAL attributes are described in terms of characters. Fields defined using any other combination of USAGE and ACTUAL attribute values are described in terms of bytes.
To force a field in a fixed-format sequential file to be described in terms of bytes, add B to the end of the field's ACTUAL attribute. For example, to specify that a field is stored in 10 bytes, you would specify:
ACTUAL=A10B
The adapter will then read the specified number of bytes from the record and convert their contents to the number of characters specified by the file code page.
Regardless of how much storage a character occupies, it occupies only one space on a report, as always.
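To make the difference between bytes and characters concrete, here is a minimal Python sketch of reading a field that is described in bytes. The helper `read_byte_field`, the sample record, and the choice of UTF-8 as the file code page are assumptions for illustration only.

```python
def read_byte_field(record: bytes, offset: int, length: int,
                    encoding: str = "utf-8") -> str:
    """Slice `length` bytes from the record and decode them to characters."""
    return record[offset:offset + length].decode(encoding)

# A 10-byte field holding "Zoë" plus blank padding, followed by another field.
# The character "ë" (U+00EB) occupies 2 bytes in UTF-8.
record = "Zo\u00eb".encode("utf-8").ljust(10, b" ") + b"NEXTFIELD"
value = read_byte_field(record, 0, 10)

print(len(value))  # 9 characters were decoded from the 10 bytes
```

The same 10 bytes yield fewer than 10 characters because one of the characters needs more than one byte of storage.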
To set an application's code page, issue the following command from the application's profile or a server profile:
APP PROPERTY appname CODEPAGE pagenum
where:
appname
Is the name of the application.
pagenum
Is the number of the code page that the server will use to read a fixed-format sequential file in the named application.
The adapter, using the OLE DB interface, supports Unicode data stored in NCHAR and NVARCHAR fields (where N stands for national). N columns can support data of any language or combination of languages.
For information about data type support, see Relational Adapter Data Type Support for Unicode.
The Adapter for MySQL is implemented using JDBC. This implementation supports Unicode data stored in character fields with CHARACTER SET set to UTF-8.
You must set the LANG environment variable in the edastart file or in a separate shell file before you start the server. For example, for American English, you would export the following variable:
export LANG=EN_US.UTF-8
or
export LANG=en_US.utf-8
For information about data type support, see Relational Adapter Data Type Support for Unicode.
The adapter supports Unicode data in Oracle release 10g or higher databases that have been configured with the NLS_CHARACTERSET parameter set to either UTF8 or AL32UTF8. However, AL32UTF8 is preferred. You must set the NLS_LANG environment variable in the edastart file, in a separate shell file, in a database profile, or in a user profile.
Set NLS_LANG using the following syntax:
NLS_LANG = language_territory.characterset
where:
language
Is the selected language.
territory
Is the name of the country associated with the selected language.
characterset
Is the value of the NLS_CHARACTERSET variable that is set in the Oracle database. For Unicode, this can be AL32UTF8 or UTF8. AL32UTF8 is preferred.
For example, for American English UTF-8, you would use the following setting:
NLS_LANG=American_America.AL32UTF8
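The structure of an NLS_LANG value can be checked with a small parser. The following Python sketch is illustrative only: the names `NlsLang` and `parse_nls_lang` are hypothetical, and the sketch assumes that all three components are present, as in the syntax shown above.

```python
from typing import NamedTuple

class NlsLang(NamedTuple):
    language: str
    territory: str
    characterset: str

def parse_nls_lang(value: str) -> NlsLang:
    """Split language_territory.characterset into its three components."""
    locale_part, dot, characterset = value.strip().partition(".")
    language, underscore, territory = locale_part.partition("_")
    if not (dot and underscore and language and territory and characterset):
        raise ValueError(
            f"expected language_territory.characterset, got {value!r}")
    return NlsLang(language, territory, characterset)

setting = parse_nls_lang("American_America.AL32UTF8")
# Check that the character set is one of the two Unicode values named above.
print(setting.characterset in {"AL32UTF8", "UTF8"})
```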
For information about data type support, see Relational Adapter Data Type Support for Unicode.
SAP uses UTF-16 encoding in its Unicode system. The server uses UTF-8 and handles all conversions between the two encoding schemes.
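The kind of conversion involved can be sketched in Python as follows. This is not the server's implementation; the function names are hypothetical, and the little-endian byte order without a byte order mark is an assumption chosen for the example.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Transcode UTF-16 (little-endian, no BOM) bytes to UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")

def utf8_to_utf16(data: bytes) -> bytes:
    """Transcode UTF-8 bytes to UTF-16 (little-endian, no BOM)."""
    return data.decode("utf-8").encode("utf-16-le")

text = "Stra\u00dfe"                     # "Straße", 6 characters
sap_side = text.encode("utf-16-le")      # 2 bytes per character here
server_side = utf16_to_utf8(sap_side)    # "ß" takes 2 bytes, the rest 1 byte

print(len(sap_side), len(server_side))   # 12 7
```

The text is identical on both sides; only the byte representation, and therefore the storage length, differs.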
NLS settings for the Reporting Server should be configured so that the Application Server code page can handle all of the chosen languages. For example, ISO 8859-1 can accommodate most Western European languages. The code pages in the ISO 8859 family handle language-specific characters, with the lower half of each code page mapping almost entirely to US ASCII. Therefore, with ISO 8859-1 one could request English, German, French, and Spanish. When a language requires a code page that takes more than one byte per character (for example, many Asian languages), the only choice for the server is 65001 (UTF-8).
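One way to reason about this choice is to test whether representative text in each chosen language can be encoded in the candidate single-byte code page. The Python sketch below uses the standard `iso-8859-1` codec; the function name `fits_code_page` and the sample strings are hypothetical.

```python
def fits_code_page(samples, codec="iso-8859-1"):
    """Return True if every sample string can be encoded in the given codec."""
    try:
        for text in samples:
            text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True

# Sample words in English, German, French, and Spanish.
western = ["English", "Gr\u00fc\u00dfe", "fran\u00e7ais", "espa\u00f1ol"]
# The same list with Japanese text added.
mixed = western + ["\u65e5\u672c\u8a9e"]

print(fits_code_page(western))  # True
print(fits_code_page(mixed))    # False, so a Unicode code page is needed
```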
The SAP adapters provide access to Unicode SAP BW and SAP ECC systems. This extends support of data and metadata in multiple languages to the server, consistent with support by the SAP server. A synonym can be created using one or more languages. Those languages will be used to create titles and descriptions.
The adapter supports Unicode data in Sybase ASE version 15.0 and higher databases that have been created with the CHARACTER SET option set to UTF-8. You must set the LANG variable in the edastart file or in a separate shell file before starting the server.
For example, for American English, you would export the following variable:
export LANG=EN_US.UTF-8
For information about data type support, see Relational Adapter Data Type Support for Unicode.
The Adapter for Teradata (CLI) supports Unicode UTF-8 format if:
Contact your database administrator (DBA) to determine whether international language support has been enabled in your Teradata system and/or consult the Teradata documentation for details about International Character Set support.
Note that, at the present time, when Unicode is enabled, the length of a Teradata Column Name and/or TITLE cannot exceed 21 characters (bytes).
For information about data type support, see Relational Adapter Data Type Support for Unicode.
In Unicode databases the information in CHAR(n) columns is stored in a UTF-8 encoding scheme. Most RDBMS Unicode columns of CHAR type specify length in bytes, not characters; the B-modifier in the Actual format denotes that a character column with a fixed byte length might contain a varying number of UTF-8 characters. This is reflected in the AnV Usage format, as shown in the following table.
| DBMS | Column Type | Usage | Actual * |
|---|---|---|---|
| Db2 | CHAR(n) | AnV | AnB |
| | GRAPHIC(n) | An | An |
| | VARCHAR(n) | AnV | AnVB |
| | VARGRAPHIC(n) | AnV | AnV |
| Microsoft SQL Server | CHAR(n) single byte code page | An | An |
| | CHAR(n) double byte code page | AnV | AnV |
| | NCHAR(n) | An | An |
| | VARCHAR(n) | AnV | AnV |
| | NVARCHAR(n) | AnV | AnV |
| MySQL | CHAR(n) | An | An |
| | VARCHAR(n) | AnV | AnV |
| Oracle | CHAR(n CHAR) | An | An |
| | CHAR(n BYTE) | AnV | AnB |
| | NCHAR(n) | An | An |
| | VARCHAR(n CHAR) | AnV | AnV |
| | VARCHAR(n BYTE) | AnV | AnVB |
| | NVARCHAR(n) | AnV | AnV |
| Sybase ASE | CHAR(n) | An | AnB |
| | UNICHAR(n) | An | An |
| | VARCHAR(n) | AnV | AnVB |
| | UNIVARCHAR(n) | AnV | AnV |
| Teradata | CHAR(n) | An | An |
| | VARCHAR(n) | AnV | AnV |
Note that on EBCDIC platforms the ACTUAL size for a B-suffixed format is increased 1.5 times to accommodate the expansion when converting from UTF-8 to UTF-EBCDIC. For example, on MVS the synonym created for a Db2 CHAR(10) column contains the following: USAGE=A10, ACTUAL=A15B.
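The size adjustment in this note amounts to simple arithmetic, sketched below in Python. The function name `ebcdic_actual` is hypothetical, and rounding up for odd column lengths is an assumption, because the note gives only an even-length example.

```python
def ebcdic_actual(column_bytes: int, varying: bool = False) -> str:
    """Build a B-suffixed ACTUAL format with the 1.5x EBCDIC expansion."""
    # Integer form of ceil(column_bytes * 1.5); rounding up is an assumption.
    size = (column_bytes * 3 + 1) // 2
    return f"A{size}{'V' if varying else ''}B"

print(ebcdic_actual(10))  # A15B, matching the CHAR(10) example in the note
```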