Neoview Character Sets Administrator's Guide (R2.4, R2.5)
Unicode format conversions (UTF16, UTF8, or UCS2) along the way. Once these conversions
map the SJIS character hexadecimal value to a Unicode code point value that is shared by other
SJIS character hexadecimal values, it can be difficult to determine the original SJIS character
hexadecimal value. When the hexadecimal values of SJIS characters with the same Unicode code
point value are compared and found not to be the same, a SJIS character mismatch has occurred.
SJIS character mismatches can occur on the client and SQL sides of the Neoview platform. This
discussion focuses on the more critical SQL-side SJIS character mismatches that occur from the
Neoview database.
SQL-Side SJIS Character Mismatch Examples
SJIS characters with the hexadecimal values 0x81CA, 0xEEF9, and 0xFA54 all represent the same
glyph and map to the Unicode code point value 0xFFE2.
Assume in these examples that you are operating in a Windows environment and using the
Neoview ODBC driver for Windows. If you execute an SQL statement that contains any of these
three SJIS characters and the SQL compiler is invoked to compile the statement, the compiler
translates the character to the UCS2 code point value of 0xFFE2 before parsing the statement.
When the SQL compiler converts the character string literal back to SJIS encoding, it automatically
converts 0xFFE2 to the lowest value (0x81CA) and returns it to the client. This is fine if the
hexadecimal value of the SJIS character originally provided by the client was 0x81CA, but not if
it was 0xEEF9 or 0xFA54.
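The round trip described above can be sketched in a few lines of Python. This is a minimal model and not the Neoview conversion code: the three SJIS values and the code point 0xFFE2 come from the text, while the table and the function names (to_unicode, to_sjis, round_trip) are hypothetical and exist only for illustration.

```python
# Hypothetical model of the many-to-one mapping described in the text.
# Only the three SJIS values and the code point 0xFFE2 come from the guide.
SJIS_TO_UNICODE = {0x81CA: 0xFFE2, 0xEEF9: 0xFFE2, 0xFA54: 0xFFE2}


def to_unicode(sjis_value: int) -> int:
    """Map an SJIS hexadecimal value to its Unicode code point."""
    return SJIS_TO_UNICODE[sjis_value]


def to_sjis(code_point: int) -> int:
    """Map a code point back to SJIS, choosing the lowest candidate value."""
    candidates = [s for s, u in SJIS_TO_UNICODE.items() if u == code_point]
    return min(candidates)


def round_trip(sjis_value: int) -> int:
    """SJIS -> Unicode -> SJIS, as happens to a character string literal."""
    return to_sjis(to_unicode(sjis_value))


for original in (0x81CA, 0xEEF9, 0xFA54):
    returned = round_trip(original)
    status = "match" if returned == original else "MISMATCH"
    print(f"sent 0x{original:04X}, got back 0x{returned:04X}: {status}")
```

In this model only 0x81CA survives the round trip unchanged. The other two values come back as 0x81CA, which is the mismatch the examples below illustrate.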
First, assume you perform either or both of these two SQL operations:
CREATE TABLE T1 (col1 char(2) character set ISO88591) no partition;
and
INSERT INTO T1 values ( 'character-glyph' ) ;
where character-glyph represents the common SJIS glyph and has a SJIS hexadecimal value
of 0x81CA.
In these examples, assume that the lowest hexadecimal value is chosen, so both operations produce
a row where the value in col1 is 0x81CA.
The SQL operation:
SELECT * from T1 where col1 = 'character-glyph' ;
returns the row because the compiler translates the SJIS character glyph in the WHERE clause
to 0x81CA, allowing the comparison routine used by the SELECT statement to find the row.
Next, assume ODBC is instructed to insert a row where the col1 value is 0xEEF9 and is bound
to the statement by a parameter:
INSERT INTO T1 values ( ? ) ;
The new row assumes the value 0xEEF9 in col1. The previous SELECT statement, which used a
character string literal, does not select this row because its WHERE clause searches
for 0x81CA.
If a new row is inserted using JDBC, a SJIS character mismatch does not occur because 0xEEF9
(or 0xFA54) would be converted to Unicode and back to the SJIS character value 0x81CA before
the data is put in the new row.
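The three insert paths in these examples can be simulated as follows. This is a sketch under stated assumptions and not Neoview behavior reproduced exactly: the list table_t1 stands in for column col1 of T1, and the helper names (normalize, insert_literal, insert_odbc_parameter, insert_jdbc, select_where_literal) are hypothetical. It assumes, as the text does, that literals and JDBC data are converted to the lowest SJIS value while ODBC parameter data is stored unchanged.

```python
# Hypothetical simulation of the T1 examples; all names are illustrative only.
SAME_GLYPH = {0x81CA, 0xEEF9, 0xFA54}  # SJIS values that share code point 0xFFE2


def normalize(sjis_value: int) -> int:
    """Model the SJIS -> Unicode -> SJIS round trip (lowest value wins)."""
    return min(SAME_GLYPH) if sjis_value in SAME_GLYPH else sjis_value


table_t1 = []  # stands in for the col1 values stored in T1


def insert_literal(sjis_value: int) -> None:
    # A character string literal passes through the SQL compiler.
    table_t1.append(normalize(sjis_value))


def insert_odbc_parameter(sjis_value: int) -> None:
    # Assumption from the text: bound parameter data is stored as supplied.
    table_t1.append(sjis_value)


def insert_jdbc(sjis_value: int) -> None:
    # JDBC data is converted to Unicode and back before it is stored.
    table_t1.append(normalize(sjis_value))


def select_where_literal(sjis_value: int) -> list:
    # The literal in the WHERE clause is also translated by the compiler.
    target = normalize(sjis_value)
    return [row for row in table_t1 if row == target]


insert_literal(0x81CA)
insert_odbc_parameter(0xEEF9)
insert_jdbc(0xFA54)

matches = select_where_literal(0xEEF9)
print([f"0x{row:04X}" for row in table_t1])
print(f"rows matched by the literal SELECT: {len(matches)} of {len(table_t1)}")
```

In this model the SELECT finds the rows inserted through the literal and through JDBC, but not the row inserted through the ODBC parameter, even though all three rows hold the same glyph.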
SQL-Side SJIS Character Mismatch Scenarios
SJIS character mismatches can occur from the Neoview database in the SJIS configuration using
a Neoview ODBC connection in either of these two scenarios:
• The Neoview ODBC driver converts an SQL statement's SQL identifiers and SJIS
MS932-encoded character string literals to UTF8 and sends the UTF8 data to the Neoview
database, where it is converted back to SJIS and stored in a SJIS column. At the same time,
the Neoview ODBC driver also sends SJIS MS932-encoded characters with the same original