HP Neoview Character Sets Administrator's Guide HP Part Number: 544818-001 Published: April 2008 Edition: HP Neoview Release 2.
© Copyright 2007 Hewlett-Packard Development Company, L.P. Legal Notice Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Table of Contents
About This Document
  Intended Audience
  New and Changed Information in This Edition
  Document Organization
How Pass-Through Mode and UTF16 Conversion Are Implemented From the Java Client
Encoding Data Sources
  Control File Option Syntax
  Control File Example
List of Tables
1-1 Character Sets Stored in ISO88591 and UCS2 Columns for the Neoview Character Set Configurations
1-2 Default Prefixes for Character String Literals
About This Document This manual contains the information needed by database administrators and end users to use, configure, and troubleshoot the Neoview Character Sets feature for Release 2.3. Intended Audience This manual is intended for database administrators and other users of the Neoview Character Sets feature on the Neoview platform. New and Changed Information in This Edition This is a new manual.
Computer Type
Computer type letters within text indicate case-sensitive keywords and reserved words. Type these items exactly as shown. Items not enclosed in brackets are required. For example:
myfile.sh

Bold Text
Bold text in an example indicates user input typed at the terminal. For example:
ENTER RUN CODE
?123
CODE RECEIVED: 123.00
The user must press the Return key after typing the input.

[ ] Brackets
Brackets enclose optional syntax items.
expression-n…

Punctuation
Parentheses, commas, semicolons, and other symbols not previously described must be typed as shown. For example:
DAY (datetime-expression)
@script-file
Quotation marks around a symbol such as a bracket or brace indicate the symbol is a required character that you must type as shown. For example:
"{" module-name [, module-name]... "}"

Item Spacing
Spaces shown between items are required unless one of the items is a punctuation symbol such as a parenthesis or a comma.
Neoview Guide to Stored Procedures in Java
Information about how to use stored procedures that are written in Java within a Neoview database.

Neoview Management Dashboard Client Guide for Database Administrators
Information on using the Dashboard Client, including how to install the Client, start and configure the Client Server Gateway (CSG), use the Client windows and property sheets, interpret entity screen information, and use Command and Control to manage queries from the Client.
Publishing History
Part Number    Product Version           Publication Date
544818–001     HP Neoview Release 2.3    April 2008

HP Encourages Your Comments
HP encourages your comments concerning this document. We are committed to providing documentation that meets your needs. Send any errors found, suggestions for improvement, or compliments to: pubs.comments@hp.com
Include the document title, part number, and any comment, error found, or suggestion for improvement you have concerning this document.
1 Introduction to Neoview Character Sets The Neoview Character Sets feature allows clients to store data encoded in any supported character set, including multibyte data, into Neoview database objects on the Neoview platform. Clients include customer applications running on other systems and users accessing Neoview client applications from client workstations.
client locale character set, SJIS characters, or UTF8 characters, depending on the selected Neoview character set configuration. Table 1-1 (page 14) identifies the character set encodings that the Neoview database uses to store characters in ISO88591 and UCS2 columns for the three Neoview character set configurations.
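Table 1-2 Default Prefixes for Character String Literals

Neoview Character Set Configuration: ISO88591
  Default Column Character Set Definition (Does Not Need to Be Specified): CHARACTER SET ISO88591
  Non-Default Column Character Set Definition (Must Be Specified): CHARACTER SET UCS2
  Default Prefix for Character String Literals (Does Not Need to Be Specified): _ISO88591
  Non-Default Prefixes for Character String Literals (Must Be Specified): _UCS2, N

Neoview Character Set Configuration: SJIS
  Default Column Character Set Definition (Does Not Need to Be Specified): CHARACTER SET ISO88591
  Non-Default Column Character Set Definition (Must Be Specified): CHARACTER SET UCS2
  Default Prefix for Character String Literals (Does Not Need to Be Specified): _ISO8859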
the Neoview database can store data from different client locale character sets. Every client locale character is mapped to a plane within the Unicode encoding and equivalent characters are mapped to the same Unicode code point. Java-based Neoview client applications such as Neoview DB Admin, Neoview Command Interface, and JDBC applications can display data from multiple client locale character sets.
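The mapping of equivalent characters to a single Unicode code point can be illustrated outside Neoview. The following is a minimal sketch that assumes Python's standard codecs as a stand-in (it is not Neoview code, and the helper name to_code_points is hypothetical): the same Japanese character, encoded differently in Shift JIS and EUC-JP, decodes to the same code point.

```python
# Illustrative only: uses Python's built-in codecs, not any Neoview API.

def to_code_points(raw: bytes, encoding: str) -> list:
    """Decode client-locale bytes and return the Unicode code points."""
    return [ord(ch) for ch in raw.decode(encoding)]

# HIRAGANA LETTER A as it would arrive from two different client locales.
sjis_bytes = "あ".encode("shift_jis")
eucjp_bytes = "あ".encode("euc_jp")

# The byte sequences differ between the two client encodings...
print(sjis_bytes != eucjp_bytes)

# ...but both decode to the same Unicode code point.
print(to_code_points(sjis_bytes, "shift_jis") == to_code_points(eucjp_bytes, "euc_jp"))
```

This is why data that originated in different client locales can be compared and displayed together once it has been converted to a Unicode encoding.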
Table 1-3 Features, Behaviors, and Limitations of the Neoview Character Set Configurations Configuration Features and Behaviors Limitations ISO88591 The ISO88591 configuration replicates the Neoview character set environment for Release 2.2. It allows users to store data encoded in any character set—including ISO8859-1 through ISO8859-15 and East Asian multibyte character sets—in ISO88591 columns.
Table 1-3 Features, Behaviors, and Limitations of the Neoview Character Set Configurations (continued)
Configuration Features and Behaviors
• Supports the character sets EUC-JP, KS-Code, BIG5, GB2312, GB18030, GBK, UTF8, and UTF16 from client locales. Multibyte client locale character encoding is converted to UTF16 encoding when it is stored in UCS2 columns in the Neoview database.
• Requires Release 2.3 Neoview ODBC and Neoview JDBC drivers. If you connect a Release 2.2 driver to a Release 2.
NOTE: For information about how to check the version compatibility of Neoview ODBC drivers and Neoview JDBC drivers and install them, see these Readme files:
• README for the HP Neoview ODBC Driver for Windows
• README for the HP Neoview UNIX Drivers
• README for the HP Neoview JDBC Type 4 Driver
2 How to Select the Correct Neoview Character Set Configuration Table 2-1 identifies the criteria you should use to identify the correct Neoview Character Set configuration for your Neoview platform. For professional assistance, contact your HP service provider. Table 2-1 Criteria for Selecting the Correct Neoview Character Set Configuration Select This Configuration... If Your Neoview Platform Environment Meets Any One of These Conditions or Sets of Conditions...
3 Using SQL Language Elements to Define and Manage Database Encoding Rules for Encoding SQL Language Elements Table 3-1 describes the rules that govern the use of character set data in SQL language elements for each of the three Neoview character set configurations. NOTE: Failure to observe the rules described in Table 3-1 can cause SQL queries to fail and return error messages.
Table 3-1 Summary of SQL Language Rules by Neoview Character Set Configuration (continued)

SQL Language Rule: Explicitly specify a valid character set prefix value (_character-set) for every string literal in a column that is not the default column for the configuration.
ISO88591 Configuration: Use these prefixes:
• In ISO8859-1 string literals, you can but are not required to specify:
  — _ISO88591 for ISO8859-1 characters in
Table 3-1 Summary of SQL Language Rules by Neoview Character Set Configuration (continued)

SQL Language Rule: Size SQL identifiers as dictated by the selected configuration. SQL identifiers are limited to 128 byte lengths in
ISO88591 Configuration:
• SQL identifiers can be up to 128 characters (bytes) in length.
• Regular identifiers are used.
SJIS Configuration:
• SQL identifiers can be a maximum of 64 to 128 characters in length, depending on the SJIS characters stored.
Table 3-1 Summary of SQL Language Rules by Neoview Character Set Configuration (continued)

ISO88591 Configuration: EMS event messages use client locale character encoding and might not be readable from Neoview DB Admin.
SJIS Configuration: Uses UTF8 encoding.
Unicode Configuration: Uses UTF8 encoding.

EMS Event Messages
EMS event messages from the Neoview platform are normally sent in UTF8 encoding.
Table 3-3 Behavior of SQL String Functions (continued)

CHAR_LENGTH
  ISO88591 Configuration: Returns the number of ISO88591 characters.
  SJIS Configuration: Returns the number of bytes.
  Unicode Configuration: Returns the number of UCS2 characters.

CODE_VALUE
  ISO88591 Configuration: Returns the code point for the first byte in the expression.
  SJIS Configuration: Returns the code point for the first byte in the expression.
  Unicode Configuration: Returns the code point for the first bytes in the expression.
Table 3-3 Behavior of SQL String Functions (continued)

OCTET_LENGTH
  ISO88591 Configuration: Returns the number of bytes in an ISO8859-1 string.
  SJIS Configuration: Returns the number of bytes in the character string.
  Unicode Configuration: Returns the number of bytes in a UCS2 string.

POSITION
  ISO88591 Configuration: Same as LOCATE
  SJIS Configuration: Same as LOCATE
  Unicode Configuration: Same as LOCATE

REPEAT
  Returns a character string made by the repetition of the provided character string by the number of times specified.
Table 3-3 Behavior of SQL String Functions (continued)

UCASE
  For information about the UPSHIFT function, see the Neoview SQL Reference Manual.
  ISO88591 Configuration: Upshifts ISO8859-1 characters.
  SJIS Configuration: Upshifts ISO8859-1 characters.
    NOTE: If the expression contains SJIS characters, the result might contain invalid SJIS characters. The LCASE function works on one byte at a time, not on one character at a time.
  Unicode Configuration: Upshifts UCS2 characters.
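The effect of shifting case one byte at a time on multibyte data can be reproduced with a small simulation. The sketch below is an assumption-laden illustration, not the Neoview implementation: bytewise_upshift is a hypothetical helper that upshifts any byte in the ASCII range a through z, and Python's shift_jis codec stands in for SJIS.

```python
# Illustrative only: simulates a byte-at-a-time upshift; not Neoview code.

def bytewise_upshift(raw: bytes) -> bytes:
    """Upshift each byte in the ASCII range a-z, ignoring character boundaries."""
    return bytes(b - 0x20 if 0x61 <= b <= 0x7A else b for b in raw)

original = "ヂ"                             # KATAKANA LETTER DI
sjis_bytes = original.encode("shift_jis")   # two bytes; the trail byte equals ASCII "a"
shifted = bytewise_upshift(sjis_bytes)

# The trail byte was treated as a lowercase letter, so the bytes changed.
print(sjis_bytes.hex(), "->", shifted.hex())

# Decoding the shifted bytes no longer yields the original character.
result = shifted.decode("shift_jis", errors="replace")
print(result == original)
```

In this case the altered bytes still form a valid but different character. With other trail bytes the result could be an invalid sequence, which is consistent with the warning in the table.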
4 Capabilities and Limitations of Neoview Client Applications
This chapter describes the capabilities and limitations of these Neoview client applications with respect to the Neoview Character Sets feature for this Neoview release:
• “Neoview Transporter Java Client” (page 31)
• “Neoview Loader” (page 31)
• “Neoview DB Admin” (page 32)
• “Neoview Management Dashboard Client” (page 32)
• “Neoview Command Interface (NCI)” (page 32)
• “Workload Management Services (WMS)” (page 32)

Neoview Transporter Java Cl
For information about using gcmd to set the value of the cSetConversion (-cc) argument to specify whether or not the Neoview Loader should perform character set conversion, see “How Character Encoding Is Implemented on the Neoview Loader” (page 47).

Neoview DB Admin
For this Neoview release, Neoview DB Admin references the character set values defined in the SYSTEM_DEFAULTS table. It displays and allows users to enter characters in all the supported client locale character encodings.
• Within WMS, SQL statements are always assumed to be in UTF8 encoding for the SJIS and Unicode configurations. In the ISO88591 configuration, SQL statements are in the client locale character encoding and will therefore display properly through a WMS Java client but not through a WMS ODBC client.
• Within WMS, query plans are assumed to be in the encoding specified by the ISO_MAPPING value for the SJIS and Unicode configurations.
5 Troubleshooting Guidelines for Neoview Character Sets Users Table 5-1 identifies the Neoview Character Sets-related problems that users might need to troubleshoot. For each problem type, the symptoms, probable causes, and recommended corrective actions are provided. If you encounter problems that are not described here, contact your HP service provider.
Table 5-1 Troubleshooting Symptoms, Causes, and Recommended Corrective Actions for Users (continued)

Problem Type: Correct character string literal prefixes not provided
Symptoms: An incompatible character sets error (4039) is displayed at a client workstation.
Probable Causes: This error is displayed when a user fails to explicitly specify the correct prefix for a character string literal (for example, _ISO88591 or _UCS2).
Table 5-1 Troubleshooting Symptoms, Causes, and Recommended Corrective Actions for Users (continued)

Problem Type: Incompatible client locale errors in the ISO88591 configuration
Symptoms: These symptoms occur when a client workstation attempts to query the Neoview database:
• An error is generated stating
Probable Causes: Causes can include:
• You inserted characters from one client workstation into an SQL table column and attempted to retrieve incompatible characters from
Table 5-1 Troubleshooting Symptoms, Causes, and Recommended Corrective Actions for Users (continued)

Problem Type: Incompatible client locale errors when an ODBC driver-connected query application is used in the Unicode configuration.
A Character Set Mapping Tables
The Neoview platform and its clients use mapping tables for these character sets to support the Neoview Character Sets feature for this Neoview release:
• Big5
• EUC-JP
• GB2312
• GB18030
• GBK
• KSC5601-1987
• SJIS
To access these mapping tables, see Mapping Tables for Neoview Character Sets.
B Capabilities and Limitations of Multiple Client Locales in the Unicode Configuration This appendix describes the capabilities and limitations imposed on multiple client locales in the Unicode configuration for this Neoview release.
C Configuring Neoview Client Applications
The Neoview Transporter, Neoview Loader, Neoview ODBC drivers, and Neoview JDBC driver each provide certain translation functions on client locale character encoding inserted into the Neoview database and on database encoding retrieved by the client workstations.
Table C-1 How Pass-Through Mode and UTF16 Conversion are Implemented From the Neoview Transporter Java Client How Pass-Through Mode is Enabled and Disabled for the Java Client How to Enable and Disable UTF16 Conversion for Java Strings Additional Guidelines The JDBC connectivity server communicates the current ISO_MAPPING value to the Java Client so that it knows what character set to store in ISO88591 columns in each of the three configurations: • If the value of ISO_MAPPING is ISO88591 (ISO88591 configu
occur. Data is extracted from the Neoview database and converted from its database encoding into UTF16 Java strings. Those strings are then encoded using the encoding specified in the control file or, if none is specified there, the default encoding, and are written to the target. You can control how encoding and decoding errors are handled when user data is loaded. The NVT.encoding-error-disposition system property controls how unmappable or malformed characters are handled.
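The choice that an encoding-error disposition represents can be shown by analogy. The sketch below assumes Python's codec error handlers ("strict" and "replace") as stand-ins; these names belong to Python and are not values of NVT.encoding-error-disposition, whose accepted values are not listed here. The helper encode_for_target and the sample row are hypothetical.

```python
# Illustrative analogy only: Python's codec error handlers stand in for the
# kind of choice an encoding-error disposition makes. The handler names are
# Python's, not values of NVT.encoding-error-disposition.

def encode_for_target(text: str, encoding: str, on_error: str = "strict") -> bytes:
    """Encode an in-memory Unicode string for a target that expects `encoding`."""
    return text.encode(encoding, errors=on_error)

row = "Zürich 東京"   # the kanji cannot be represented in ISO 8859-1

# "replace" substitutes a placeholder for each unmappable character.
print(encode_for_target(row, "iso-8859-1", on_error="replace"))

# "strict" rejects the data instead of silently altering it.
try:
    encode_for_target(row, "iso-8859-1")
except UnicodeEncodeError as exc:
    print("unmappable character at position", exc.start)
```

Substituting a placeholder keeps the transfer running at the cost of data fidelity, while rejecting the data surfaces the problem immediately. Which behavior is appropriate depends on whether lossy output is acceptable for the target.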
. }

Encoding Control Files
A control file is a text file that tells the Java Client how to move your data from source to target when loading or extracting. Control file characters are encoded in UTF8. UTF8 supports existing control files and allows non-ASCII characters to be used in newly created control files.
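One likely reason UTF8 accommodates existing control files is that ASCII text has the same byte representation in UTF8. The following minimal sketch illustrates that property and the handling of non-ASCII text; the sample strings are hypothetical and are not real control file syntax.

```python
# Illustrative only: the sample text is hypothetical, not real control file syntax.

ascii_only = "sample ASCII-only control file text"
with_non_ascii = "sample text with a non-ASCII name: 顧客"

# ASCII text produces identical bytes in ASCII and UTF-8.
print(ascii_only.encode("ascii") == ascii_only.encode("utf-8"))

# Non-ASCII characters are representable in UTF-8 as multibyte sequences.
utf8_bytes = with_non_ascii.encode("utf-8")
print(len(with_non_ascii), "characters ->", len(utf8_bytes), "bytes")

# The text round-trips without loss when read back as UTF-8.
print(utf8_bytes.decode("utf-8") == with_non_ascii)
```

A control file saved in a different multibyte encoding would not have this property, so it is worth confirming the editor's save encoding before adding non-ASCII names.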
How Character Encoding Is Implemented on the Neoview Loader
• Because the Neoview Loader does not perform character set translation (its pass-through flag is always set to ON), the character data in any input file for the Neoview Loader must use the same encoding that the database encoding requires.
• You must enable the loader's pass-through mode flag to load UTF8 files into ISO88591 columns.
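Because the loader passes bytes through unchanged, an input file in a different encoding has to be converted before it is loaded. The sketch below shows one way to do that with Python's standard library. The file names, the sample rows, and the choice of EUC-JP as the source and UTF-8 as the destination are assumptions for illustration; the required database encoding depends on the configuration in use.

```python
# Illustrative only: file names and encodings below are hypothetical examples.
import os
import tempfile

def transcode_file(src_path: str, src_encoding: str,
                   dst_path: str, dst_encoding: str) -> None:
    """Rewrite a text file in a different encoding, failing on bad input."""
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input_eucjp.dat")
dst_path = os.path.join(workdir, "input_utf8.dat")

# Create a small EUC-JP input file to stand in for client data.
with open(src_path, "w", encoding="euc_jp", newline="") as f:
    f.write("1|東京\n2|大阪\n")

transcode_file(src_path, "euc_jp", dst_path, "utf-8")

# Confirm that the converted file holds the same text in the new encoding.
with open(dst_path, "rb") as f:
    converted = f.read()
print(converted == "1|東京\n2|大阪\n".encode("utf-8"))
```

The default strict error handling is deliberate here: a conversion that fails loudly is easier to diagnose than corrupted characters discovered after the load.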
Table C-6 Attribute Values Used by the Neoview ODBC Driver for UNIX for a Sample DSN Configuration

Attributes Values
Description = Data Source for Charset Support
Catalog = NEO
Schema = example-schema-name
DataLang = 0
ReplacementCharacter = ?
FetchBufferSize = SYSTEM_DEFAULT
Server = TCP: example-IP-address
Service Name = HP_DEFAULT_SERVICE
SQL_ATTR_CONNECTION_TIMEOUT = SYSTEM_DEFAULT
SQL_LOGIN_TIMEOUT = SYSTEM_DEFAULT
SQL_QUERY_TIMEOUT = NO_TIMEOUT

The DataLang attribute is used for tr
D Neoview ODBC Driver and Neoview JDBC Driver Mappings of Character Sets and Language IDs This appendix provides information about the language ID values that map to the client locale character sets supported by the Neoview ODBC drivers and Neoview JDBC driver. The Language attribute used on the client side of the Neoview platform can take one of several values. The default value for the character set is SYSTEM_DEFAULT. For this Neoview release, users cannot specify other values for these character sets.
Because there is no Microsoft driver manager on the *nix side, the Neoview ODBC driver for UNIX takes as input any character that is sent “as is” by the client application. If your language settings match any of those listed in Table D-2 (page 49), the Neoview ODBC driver for UNIX performs the required translations. If the language settings do not match, the driver uses pass-through mode, meaning that all character data is sent to the server “as is.
Glossary

character set
A mapping of characters to code point values.

client locale
In the context of the Neoview Character Sets feature, the character set used by a client.

compatible character sets
Two or more character sets are compatible when every character in one character set can be successfully mapped to a character in the other character set, although not necessarily with the same code point values.