HP-UX 11.0 - 11i Internationalization Features White Paper
Encoding Characters
Unicode 2.1 Support [11.0 patch, 11i v1]
HP-UX provides system-level support for the Unicode 2.1/ISO 10646 character set. Hewlett-Packard’s support
for Unicode is the basis for interoperability across heterogeneous systems in all locales.
ISO 10646 is an international standard that defines a single encoding that uniquely encodes all the world’s
characters. Unicode 2.1 is the companion specification to ISO 10646. Unicode support conforms to existing
X/Open (The Open Group), POSIX, ISO C and other relevant UNIX-based standards.
HP-UX 11.0 supports Unicode/ISO 10646 by using the UTF-8 (UCS Transformation Format, 8-bit)
representation for persistent storage. UTF-8 is an industry-recognized multibyte encoding of Unicode built
from 8-bit units. Because every unit is a single byte, UTF-8 data can be transmitted over 8-bit networking
protocols and stored and retrieved safely within a historically byte-oriented operating system such as HP-UX.
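To make the byte-oriented nature of UTF-8 concrete, the following is a minimal sketch of how a single code
point maps to one to four bytes. The function name utf8_encode is hypothetical and is not an HP-UX
interface. The sketch assumes the modern UTF-8 range (up to U+10FFFF, surrogates excluded), which is
narrower than the range ISO 10646 originally allowed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an HP-UX API): encode one code point as UTF-8.
 * Returns the number of bytes written to out (1 to 4), or 0 if the code
 * point is a surrogate or lies above U+10FFFF.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                        /* 7 bits: plain ASCII byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 11 bits: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* 16 bits: 1110xxxx + 2 trail */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 21 bits: 11110xxx + 3 trail */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* outside the assumed range */
}
```

Only ASCII characters produce bytes below 0x80. Every byte of a multibyte sequence has its high bit set,
which is why byte-oriented tools and protocols that pass 8-bit data unchanged can carry UTF-8 text.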
For internal processing, HP-UX uses the four-octet (32-bit) canonical form specified in ISO 10646. This
maintains parity with the existing HP-UX wchar_t implementation, which is based on a 32-bit
representation.
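The split between UTF-8 for storage and a 32-bit form for processing can be sketched with the standard
ISO C conversion routines. The helper name utf8_to_wide is hypothetical. The sketch assumes that the named
utf8 locale is installed and that wchar_t values in that locale are ISO 10646 code points.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper (not an HP-UX API): convert a UTF-8 string to wide
 * characters under the named locale.
 * Returns the number of wide characters stored in dst, or -1 if the locale
 * is unavailable or the input is not a valid multibyte string.
 * Side effect: changes the LC_CTYPE category of the process locale.
 */
long utf8_to_wide(const char *locale_name, const char *utf8,
                  wchar_t *dst, size_t dst_len)
{
    size_t n;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;                  /* locale is not installed */

    /* mbstowcs stores at most dst_len wide characters; dst is
     * null-terminated only when the result is smaller than dst_len. */
    n = mbstowcs(dst, utf8, dst_len);
    if (n == (size_t)-1)
        return -1;                  /* invalid multibyte sequence */

    return (long)n;
}
```

A caller might pass "C.utf8" and a buffer of wchar_t, then process the text one wide character at a time
instead of tracking variable-length byte sequences.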
Full system-level support is available for all locales provided in the release.
For more information on the Unicode features of the Asian System Environment, refer to the
/usr/share/doc/ASX-UTF8 directory.
The following tables list a subset of the locale binaries provided for 32-bit application
processing:
Table 2-13 Base utf8 Locales for 32-bit Application Processing
Locale
C.utf8          C UTF-8
univ.utf8       universal
Table 2-14 European utf8 Locales for 32-bit Application Processing
Locale          Language (Region)
fr_CA.utf8      French (Canada)
fr_FR.utf8      French (France)
de_DE.utf8      German (Germany)
it_IT.utf8      Italian (Italy)
es_ES.utf8      Spanish (Spain)
sv_SE.utf8      Swedish (Sweden)
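An application that wants one of the locales listed above, but must still run where that locale is not
installed, could try a preference list in order. The following is a minimal sketch. The function name
select_locale is hypothetical, and the locale names passed to it are the caller's choice.

```c
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper (not an HP-UX API): try each locale name in order
 * and return the first one that setlocale accepts, or NULL if none does.
 * Side effect: the process locale is set to the returned name.
 */
const char *select_locale(const char *const names[], size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (setlocale(LC_ALL, names[i]) != NULL)
            return names[i];        /* locale binary found and loaded */
    }
    return NULL;                    /* no name in the list was usable */
}
```

For example, a list of "fr_FR.utf8", "C.utf8", "C" always yields a result, because ISO C requires the "C"
locale to be available.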