HP-UX Reference (11i v1 05/09) - 4 File Formats (vol 8)

localedef(4) localedef(4)
character constants A single character (e.g., A) having the numerical value of the character
in the machine’s character set.
symbolic names A string enclosed between < and > is a symbolic name.
It is recommended that localedef input files be written entirely in symbolic names,
using a user-defined or system-supplied charmap file. This aids portability of
localedef input files between different encoded character sets (see charmap(4)).
Symbolic names can be defined within a locale definition file by the collating-element
and collating-symbol keywords. These are not character constants. It is an error if such
an internally defined symbolic name collides with one defined in a charmap file.
integer lists
Integer list operands consist of one or more decimal digits separated by semicolons.
shift
Shift operands follow the keywords toupper and tolower, and must consist of two
character-code constants enclosed by left and right parentheses and separated by a comma.
Each such character pair is separated from the next by a semicolon. For tolower, the
first constant represents an uppercase character and the second the corresponding
lowercase character. For toupper, the first constant represents a lowercase character and
the second the corresponding uppercase character.
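As an illustration of the pair syntax only, the following Python sketch splits a shift
operand list into (first, second) pairs. The helper name parse_shift_operands and the
sample operand are hypothetical and not part of localedef. The sketch assumes each
constant is either a single character or a symbolic name in angle brackets, and that no
constant is itself a semicolon, comma, or parenthesis.

```python
import re

# Hypothetical helper, not part of localedef: split a shift operand list such
# as "(<a>,<A>);(<b>,<B>)" into (first, second) constant pairs.
_PAIR = re.compile(r"\(\s*(<[^>]+>|[^\s,()]+)\s*,\s*(<[^>]+>|[^\s,()]+)\s*\)")

def parse_shift_operands(text):
    pairs = []
    # Pairs are separated from one another by semicolons.
    for item in text.split(";"):
        item = item.strip()
        match = _PAIR.fullmatch(item)
        if match is None:
            raise ValueError(f"malformed shift operand: {item!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs

# For toupper, the first constant is lowercase and the second is uppercase.
toupper_pairs = parse_shift_operands("(<a>,<A>);(<b>,<B>)")
```

Under these assumptions, toupper_pairs holds one tuple per parenthesized pair, in the
order written.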
collating element entry
The order_start keyword is followed by collating element entries, one per line, in
ascending order by collating position. The collating element entries have the form:
collation_element[weight[;weight]]
collation_element can be a character, a collating symbol enclosed in angle brackets
representing a character or collating element, the special symbol UNDEFINED, or an
ellipsis (...).
A character stands for itself; a collating symbol can be a symbolic name for a character that
is interpreted by the charmap file, a multi-character collating element defined by a
collating-element keyword, or a collating symbol defined by the collating-symbol keyword.
The special symbol UNDEFINED specifies the collating position of any characters not
explicitly defined by collating element entries. For example, if some group of characters is
to be omitted from the collation sequence and just collate after all defined characters, a
collating symbol might be defined before the order_start keyword:
collating-symbol <HIGH>
Then somewhere in the list of collating element entries:
UNDEFINED <HIGH>
Notice that there is no second weight. This means that on a second pass all characters
collate by their encoded value.
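The effect can be sketched in Python with a deliberately simplified model. The function
collation_key and the sample order list are hypothetical. The sketch assumes that <HIGH>
sorts after every explicitly defined character, compares single characters only, and
uses Python's ord() as a stand-in for the encoded value.

```python
# Hypothetical model, not localedef's implementation: 'order' lists the
# explicitly defined characters in ascending collating position.
def collation_key(ch, order):
    if ch in order:
        # Defined character: first pass uses its position in the list.
        return (order.index(ch), 0)
    high = len(order)        # stands in for the <HIGH> collating symbol
    return (high, ord(ch))   # second pass: fall back to the encoded value

order = ["a", "b", "c"]
result = sorted("b!a#c", key=lambda ch: collation_key(ch, order))
```

In this model the undefined characters ! and # both land after a, b, and c, and are
ordered among themselves by encoded value.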
An ellipsis is interpreted as a list of characters with an encoded value higher than that of
the character on the preceding line and lower than that on the following line. Because it is
tied to the encoded value of characters, the ellipsis is inherently non-portable. If it is
used, a warning is issued and no output is generated unless the -c option was given.
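A minimal Python sketch of this interpretation follows. The helper expand_ellipsis is
hypothetical, and ord() and chr() stand in for the encoded values of whatever character
set is actually in use, so the result differs between code sets, which is the
portability problem described above.

```python
# Hypothetical helper: list the characters an ellipsis line would stand for,
# given the characters on the preceding and following lines.
def expand_ellipsis(before, after):
    low, high = ord(before), ord(after)
    if low >= high:
        raise ValueError("preceding character must have the lower encoded value")
    # Strictly between the two neighbors: both endpoints are excluded.
    return [chr(code) for code in range(low + 1, high)]

between = expand_ellipsis("a", "e")
```

Here between contains b, c, and d; the characters on the neighboring lines are not
included because they have entries of their own.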
The weight operands provide information about how the collating element is to be collated
on first and subsequent passes. Weight can be a two-character string, the special symbol
IGNORE, or a collating element of any of the forms specified for collating_element except
UNDEFINED. If there are no weights, the character collates strictly by its position in
the list. If only one weight is given, the character sorts by its relative position in the
list on the second collation pass.
An equivalence class is defined by a series of collating element entries all having the same
character or symbol in the first weight position. For example, in many locales all forms of
the character ’A’ collate equal on the first pass. This is represented in the collating
element entries as:
HP-UX 11i Version 1: September 2005 7 Hewlett-Packard Company Section 4161