HP-UX Reference (11i v3 07/02) - 1 User Commands A-M (vol 1)
eucset(1) eucset(1)
NAME
eucset - set and get code widths for ldterm
SYNOPSIS
eucset [-p]
eucset [[-c HP15-codeset] or [-c UTF8] or [-c ASIAN_UTF8] or [-c GB18030] or [cswidth]]
DESCRIPTION
The eucset command sets or gets (reports) the encoding and display widths of the Extended UNIX Code
(EUC), UCS Transformation Format (UTF8), or GB18030 characters processed by the current input
terminal. EUC is an encoding method for codesets composed of single or multiple bytes. EUC permits
applications and the terminal hardware to use the 7-bit US ASCII code and up to three single byte or
multibyte codesets simultaneously.
ldterm is a STREAMS terminal line discipline module which obtains codeset information from eucset.
See ldterm(7).
The cswidth value defines the character widths for codesets. If the eucset command is invoked with no
argument, so that cswidth is defined neither explicitly nor implicitly, the cswidth value is determined
by the following criteria, in descending priority:
1. Use the cswidth value stored in the current locale, if defined.
2. Use predefined cswidth values if the codeset name defined in the locale is GB18030, UTF8, or one of the
four HP15 codesets.
3. Use the CSWIDTH environment variable if defined and in the correct format.
4. Use 7-bit US ASCII as the default codeset and its cswidth value.
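The priority order above can be sketched as a simple fallback chain. This is an illustration only, not the eucset implementation: the function name cswidth_source, its parameters, and the returned labels are hypothetical, and because this section does not describe the CSWIDTH format, the format check is left as a caller-supplied placeholder (is_valid).

```python
import os

# Codeset names this page lists as having predefined cswidth values:
# GB18030, UTF8, and the four HP15 codesets.
PREDEFINED_CODESETS = {"GB18030", "UTF8", "SJIS", "CCDC", "GB", "BIG5"}


def cswidth_source(locale_cswidth, locale_codeset, env=None, is_valid=bool):
    """Return a label for the rule that would supply cswidth (hypothetical helper).

    locale_cswidth: cswidth stored in the current locale, or None if undefined.
    locale_codeset: codeset name defined in the locale.
    env:            mapping of environment variables (defaults to os.environ).
    is_valid:       placeholder for the CSWIDTH format check; the real format
                    is not given in this section, so any non-empty string passes.
    """
    env = os.environ if env is None else env
    if locale_cswidth:                                   # rule 1
        return "locale"
    if locale_codeset in PREDEFINED_CODESETS:            # rule 2
        return "predefined"
    env_value = env.get("CSWIDTH")
    if env_value is not None and is_valid(env_value):    # rule 3
        return "environment"
    return "ascii-default"                               # rule 4
```

Each rule is consulted only when every higher-priority rule fails, so a malformed CSWIDTH value falls through to the 7-bit US ASCII default rather than causing an error.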
This command must be used to specify EUC or non-EUC codesets, whether they are single byte or
multibyte. However, in most cases the eucset command can correctly set the cswidth parameter without
any options; ASIAN_UTF8 is the exception. See the WARNINGS section for special warnings on the values
of the cswidth argument.
For the GB18030, ASIAN_UTF8, or UTF8 setting, use the -c option.
Options
The eucset command recognizes the following options and arguments:
-p Displays the current settings of the EUC character widths for the terminal.
-c Sets the width to one of the four HP15 codesets, UTF8, ASIAN_UTF8, or GB18030.
The HP15 codesets supported are SJIS, CCDC, GB, and BIG5.
cswidth Defines the character widths for codesets 1 through 3. See the EUC Code Set Classes
section in this manpage for more information.
EUC Code Set Classes
EUC divides codesets into four classes. Each codeset has two characteristics: the number of bytes for
encoding the characters in the codeset, and the number of display columns to display the characters in the
codeset. All characters within a codeset possess the same characteristics. ASIAN_UTF8 is used for setting
double width display, and UTF8 is used for single width.
• Codeset 0 consists of all 7-bit, single byte ASCII characters. The most significant bit of each of
these characters is 0 (zero). Characters in codeset 0 require one byte for encoding, and occupy one
display column. These values are fixed for codeset 0 (zero). The 7-bit US ASCII code is the
primary EUC codeset, which is available to users without direct specification.
• Codeset 1 is a supplementary EUC codeset. Codeset 1 characters have an initial byte whose most
significant bit is 1. Characters in codeset 1 may require more than one byte for encoding, and may
require more than one display column. The eucset command must be used to set the
characteristics for codeset 1.
• Codesets 2 and 3 are supplementary EUC codesets. Characters in these codesets have an initial
byte of SS2 or SS3, respectively. They require more than one byte for encoding, and may require
more than one display column. The eucset command must be used to set the characteristics for
codesets 2 and 3.
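The four classes above can be told apart by the first byte of a character. The following is a minimal sketch of that classification, not code taken from ldterm or eucset. The function name euc_codeset is hypothetical, and the SS2 and SS3 byte values (0x8E and 0x8F) are the standard single-shift control codes used by EUC; this page does not state them.

```python
SS2 = 0x8E  # single-shift 2: introduces a codeset 2 character (standard EUC value)
SS3 = 0x8F  # single-shift 3: introduces a codeset 3 character (standard EUC value)


def euc_codeset(first_byte: int) -> int:
    """Return the EUC codeset class (0 to 3) implied by a character's first byte."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must be in the range 0..255")
    if (first_byte & 0x80) == 0:
        return 0  # most significant bit is 0: 7-bit US ASCII
    if first_byte == SS2:
        return 2
    if first_byte == SS3:
        return 3
    return 1      # any other byte with the most significant bit set
```

The classification alone does not say how many bytes or display columns a character in codesets 1 through 3 occupies; those are the characteristics that the cswidth value supplies.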
298 Hewlett-Packard Company − 1 − HP-UX 11i Version 3: February 2007