Neoview Character Sets Administrator's Guide (R2.3)

occur. Data is extracted from the Neoview database and converted from its database encoding
into UTF-16 Java strings. Those strings are then encoded using the encoding specified in the
control file or, if none is specified there, the default encoding, and written to the target source.
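The extract path described above (database bytes, then UTF-16 Java strings, then the target encoding) can be sketched in plain Java. This is only an illustration under stated assumptions, not the Transporter's actual code: the class name, the method name, and the use of a null argument to mean "no encoding given in the control file" are all hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names come from the Neoview Transporter.
final class ExtractTranscodeSketch {

    // Decodes database bytes into a (UTF-16) Java String, then re-encodes the
    // String with the control-file encoding, or the default when none is given.
    static byte[] transcode(byte[] dbBytes, Charset dbEncoding,
                            String controlFileEncoding, Charset defaultEncoding) {
        String text = new String(dbBytes, dbEncoding);
        Charset target = (controlFileEncoding != null)
                ? Charset.forName(controlFileEncoding)
                : defaultEncoding;
        // Note: String.getBytes silently substitutes the charset's default
        // replacement for unmappable characters.
        return text.getBytes(target);
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 }; // e-acute in ISO-8859-1
        byte[] out = transcode(latin1, StandardCharsets.ISO_8859_1, null,
                               StandardCharsets.UTF_8);
        System.out.println(out.length + " byte(s) written");
    }
}
```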
You can control how encoding and decoding errors are handled when user data is loaded. The
NVT.encoding-error-disposition system property controls how unmappable or malformed
characters are handled. Allowed property values are REPLACE, REPORT and IGNORE, all of
which are case-insensitive. The default is REPORT, which means the record containing the
characters that cannot be encoded is rejected as a bad record. REPLACE replaces the offending
character with a replacement character specified by the NVT.replacement-char system
property. IGNORE causes the offending character to be skipped over and the process continues
with the next character.
For extract, characters that cannot be translated into the specified data file encoding are
replaced with the character specified in NVT.replacement-char or, if that property is not set,
with the default replacement character, a question mark.
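The three dispositions described above correspond closely to the REPLACE, REPORT, and IGNORE actions of the standard java.nio.charset.CodingErrorAction class. The following is a minimal sketch of that correspondence, assuming the property values are read with System.getProperty; it does not claim to show how the Transporter itself is implemented, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.util.Locale;

// Hypothetical sketch; not the Transporter's own code.
final class EncodingDispositionSketch {

    // Maps a case-insensitive disposition name to the java.nio action.
    static CodingErrorAction actionFor(String disposition) {
        switch (disposition.toUpperCase(Locale.ROOT)) {
            case "REPLACE": return CodingErrorAction.REPLACE;
            case "IGNORE":  return CodingErrorAction.IGNORE;
            case "REPORT":  return CodingErrorAction.REPORT;
            default:
                throw new IllegalArgumentException("Unknown disposition: " + disposition);
        }
    }

    // Encodes text; with REPORT, a CharacterCodingException signals a bad record.
    static byte[] encode(String text, String charsetName, String disposition,
                         byte replacement) throws CharacterCodingException {
        CodingErrorAction action = actionFor(disposition);
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        if (action == CodingErrorAction.REPLACE) {
            // The replacement byte must be legal in the target charset.
            encoder.replaceWith(new byte[] { replacement });
        }
        ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // Assumption: the properties are read like ordinary Java system properties.
        String disposition = System.getProperty("NVT.encoding-error-disposition", "REPORT");
        String replacement = System.getProperty("NVT.replacement-char", "?");
        try {
            byte[] bytes = encode("a\u3042b", "US-ASCII", disposition,
                                  (byte) replacement.charAt(0));
            System.out.println("Encoded " + bytes.length + " byte(s)");
        } catch (CharacterCodingException e) {
            System.out.println("Record rejected as a bad record: " + e);
        }
    }
}
```

Using one action for both malformed input and unmappable characters mirrors the text above, which describes a single property governing both cases.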
The Java Client automatically chooses the encoding that best matches the mapping tables used
by the Neoview platform when specific Java encodings are specified in the control file or through
the default encoding. Table C-2 (page 45) maps these specified or default encodings to the
encodings that are actually used. All other encodings are used as they are specified. If the
encoding used is not supported by Java, a fatal error is reported and execution terminates.
Table C-2 Mapping of Specified and Used Java Encodings
Specified or Default Java Encoding    Actual Java Encoding Used
Shift-JIS or SJIS                     MS932
EUC_KR or EUC-KR                      MS949
Big5                                  MS950
GB2312, GBK, or EUC_CN                MS936
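The substitution in Table C-2, together with the fatal error for unsupported encodings, can be sketched as a simple lookup followed by a java.nio.charset.Charset.isSupported check. This is an illustration only: the class and method names are hypothetical, matching the specified name case-insensitively is an assumption, and an exception stands in for the fatal error.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; not the Transporter's own code.
final class EncodingMappingSketch {

    // Specified-to-actual pairs copied from Table C-2 (keys stored upper-case).
    private static final Map<String, String> TABLE = new HashMap<>();
    static {
        TABLE.put("SHIFT-JIS", "MS932");
        TABLE.put("SJIS", "MS932");
        TABLE.put("EUC_KR", "MS949");
        TABLE.put("EUC-KR", "MS949");
        TABLE.put("BIG5", "MS950");
        TABLE.put("GB2312", "MS936");
        TABLE.put("GBK", "MS936");
        TABLE.put("EUC_CN", "MS936");
    }

    // Returns the encoding actually used; unlisted names pass through unchanged.
    static String actualEncoding(String specified) {
        String mapped = TABLE.get(specified.toUpperCase(Locale.ROOT));
        return (mapped != null) ? mapped : specified;
    }

    // Resolves the charset, failing if this Java runtime does not support it.
    static Charset resolve(String specified) {
        String actual = actualEncoding(specified);
        if (!Charset.isSupported(actual)) {
            throw new IllegalStateException("Encoding not supported by Java: " + actual);
        }
        return Charset.forName(actual);
    }

    public static void main(String[] args) {
        System.out.println("SJIS is encoded as " + actualEncoding("SJIS"));
        System.out.println("UTF-8 resolves to " + resolve("UTF-8"));
    }
}
```

Whether the MS9xx charsets are available depends on the Java runtime in use, which is why the sketch checks support before resolving the name.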
When the Java Client is enabled for pass-through mode, all the data source encodings specified
in the control file are ignored.
Control File Option Syntax
[ encoding = "encoding" ]
Where encoding specifies any valid Java character set encoding.
Control File Example
options {
    # encoding to use if not specified by data source
    encoding = UTF-8,
    truncate = true
    .
    .
    .
}
sources {
    # encoding overrides the UTF-8 specified in the
    # global options section
    ex_file_1 file /data/ex_file_1 options (encoding = SJIS),
    # encoding is UTF-8 as specified in global options
    ex_file_2 pipe ./data-files/test_data_FSR030-pipe
    .
    .