ALLBASE/SQL Reference Manual (36216-90216)

Data Types

Native Language Data

Character data in the DBEnvironment can be represented in the native language specified
by the DBEnvironment language. When native language character columns are created,
they follow the same rules as CHAR and VARCHAR columns. For character columns, size
is defined in bytes. Thus a column defined as CHAR (20) could hold 20 characters in ASCII
or 10 characters in Japanese Kanji.
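
The byte-based sizing rule can be illustrated outside the DBEnvironment with a short Python sketch. It assumes Shift-JIS as a stand-in for the native language encoding (the encoding actually used depends on the DBEnvironment language); COLUMN_BYTES and fits are hypothetical names introduced only for this illustration.

```python
# Sketch only: Shift-JIS is an assumed stand-in for the native language encoding.
COLUMN_BYTES = 20  # CHAR (20): the size is counted in bytes, not characters


def fits(text: str, encoding: str = "shift_jis") -> bool:
    """Return True if the encoded text occupies at most COLUMN_BYTES bytes."""
    return len(text.encode(encoding)) <= COLUMN_BYTES


ascii_text = "A" * 20        # 20 one-byte ASCII characters
kanji_text = "\u6f22" * 10   # 10 two-byte Kanji characters

print(len(ascii_text.encode("shift_jis")), fits(ascii_text))
print(len(kanji_text.encode("shift_jis")), fits(kanji_text))
print(fits("\u6f22" * 11))   # an eleventh Kanji needs 22 bytes
```

Both strings occupy exactly 20 bytes, although one holds 20 characters and the other 10.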

Numeric data must be in ASCII representation.

Pattern matching is in terms of conceptual characters rather than bytes. This is
necessary for languages in which there are both one-byte and two-byte characters
frequently mixed in the same string. An example is Japanese, in which the Kanji and
Hiragana characters occupy 16 bits each, whereas the Katakana characters use only 8 bits.
Conceptual character matching is also necessary to establish a collating sequence that
includes the one-byte ASCII character set as a subset of a two-byte character set such as
Chinese.
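
A minimal sketch of why byte-wise matching is unsafe for mixed one-byte and two-byte data follows. It again assumes Shift-JIS as a stand-in encoding, in which the second byte of some Kanji has the same value as a one-byte ASCII character. The function names are hypothetical, and the sketch does not reproduce how ALLBASE/SQL itself performs matching.

```python
# Sketch only: Shift-JIS is an assumed stand-in for the native language encoding.
ENCODING = "shift_jis"


def byte_contains(haystack: str, needle: str) -> bool:
    """Byte-wise search: can match inside the second byte of a two-byte character."""
    return needle.encode(ENCODING) in haystack.encode(ENCODING)


def char_contains(haystack: str, needle: str) -> bool:
    """Character-wise search: compares whole conceptual characters only."""
    return needle in haystack


kanji = "\u8868"   # one Kanji; in Shift-JIS it encodes as the bytes 0x95 0x5C
backslash = "\\"   # ASCII backslash, the single byte 0x5C

print(byte_contains(kanji, backslash))   # false match on the trailing byte
print(char_contains(kanji, backslash))   # no match on a character basis
```

Matching on conceptual characters avoids the false hit that the byte-wise search reports.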

Truncation is done on a character basis. For example, imagine a column defined as CHAR
(20). If a string contains 11 Kanji characters, or 22 bytes, the last character is truncated if
you try to insert it into the column. In a case where a string contains both Kanji and
Katakana characters and is 21 bytes long, the truncation depends on the size of the last
character. If it is a 2-byte Kanji character, the data is truncated to 19 bytes; if it is a 1-byte
Katakana character, the data is truncated to 20 bytes.
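
The three truncation cases above can be reproduced with the following Python sketch. It assumes Shift-JIS as a stand-in encoding, in which Kanji occupy two bytes and half-width Katakana one byte; truncate_to_bytes is a hypothetical helper, not an ALLBASE/SQL routine.

```python
# Sketch only: Shift-JIS is an assumed stand-in for the native language encoding.
ENCODING = "shift_jis"
KANJI = "\u6f22"      # a two-byte Kanji character
KATAKANA = "\uff71"   # a one-byte half-width Katakana character


def truncate_to_bytes(text: str, max_bytes: int) -> bytes:
    """Keep whole characters only; stop before the first one that does not fit."""
    out = bytearray()
    for ch in text:
        encoded = ch.encode(ENCODING)
        if len(out) + len(encoded) > max_bytes:
            break
        out += encoded
    return bytes(out)


all_kanji = KANJI * 11                       # 22 bytes
ends_in_kanji = KANJI * 9 + KATAKANA + KANJI  # 18 + 1 + 2 = 21 bytes
ends_in_katakana = KANJI * 10 + KATAKANA      # 20 + 1 = 21 bytes

for value in (all_kanji, ends_in_kanji, ends_in_katakana):
    print(len(value.encode(ENCODING)), "->", len(truncate_to_bytes(value, 20)))
```

Because a character is never split, the stored length can be shorter than the column size, as in the 19-byte case.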

An implicit type conversion occurs when a NATIVE-3000 string is compared to a native
language CHAR or VARCHAR type. The shorter string is padded with ASCII blanks before
the comparison is done.
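
The blank-padding step can be sketched as follows. This is an illustration of padding only, under the assumption that both operands are already available as byte strings; padded_equal is a hypothetical helper, and it tests simple equality rather than applying any language collation rules.

```python
def padded_equal(left: bytes, right: bytes) -> bool:
    """Pad the shorter operand with ASCII blanks (0x20), then compare."""
    width = max(len(left), len(right))
    return left.ljust(width, b" ") == right.ljust(width, b" ")


print(padded_equal(b"abc", b"abc  "))  # trailing blanks do not affect the result
print(padded_equal(b"abc", b"abd"))
```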

When a case-insensitive ASCII expression is compared to a case-insensitive NLS
expression, the two expressions are compared using the NLS collation rules. The
case-insensitive NLS comparison is done by using the NLSCANMOVE and NLCOLLATE intrinsics.
The same ASCII characters in upper and lower case are equivalent. The same accented
characters (extended characters) in upper and lower case are also equivalent. However, an
accented character may not be the same as its ASCII equivalent, depending on the specific
language collation table.
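
The equivalences described above can be illustrated with a rough Python analogy. Python's str.casefold is used here only as a stand-in for a case-insensitive comparison; it is not the NLSCANMOVE or NLCOLLATE intrinsic, and a real language collation table may treat accented characters differently. The function name is hypothetical.

```python
def case_insensitive_equal(left: str, right: str) -> bool:
    """Compare two strings after case folding (a stand-in, not NLS collation)."""
    return left.casefold() == right.casefold()


print(case_insensitive_equal("A", "a"))            # same ASCII letter, different case
print(case_insensitive_equal("\u00c9", "\u00e9"))  # same accented letter, different case
print(case_insensitive_equal("e", "\u00e9"))       # accented letter vs. ASCII letter
```

In this stand-in, upper and lower case forms of the same letter compare equal, while an accented letter and its unaccented ASCII counterpart do not.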