Preparing your LDAP Directory for HP-UX Integration
Numeric UID & GID
The uidNumber and gidNumber attributes represent a user's POSIX user ID number and a group's
group ID number, respectively. These values must be integers and must not contain any alphabetic characters. An invalid
format could cause unpredictable results and is potentially a security risk. The LDAP-UX Client Services
product will not return user or group entries that contain an invalidly formatted uidNumber or gidNumber.
However, when using the NIS/LDAP Gateway, the syntax checking is up to the NIS client. Thus a poorly
written client may return an invalid uid number. Although some LDAP directory servers do check the
syntax of these attributes, the Directory Administrator should not rely on this feature.
Aside from the integer syntax requirement, uidNumber and gidNumber values must fit within
31 bits. The maximum uid and gid number allowed is 2,147,483,647 (2^31 - 1). This limit is defined by the
HP-UX operating system and may be different on other OS architectures. The earliest versions of the Unix
operating system defined the uid and gid number to be at most 15 bits, so 32,767 was the maximum.
Some legacy applications may have problems with uids larger than 32,767.
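The checks described in this section can be sketched as a small validation routine to run before entries are loaded into the directory. This is only an illustrative sketch, not part of the LDAP-UX product: the function name check_posix_id and its three result labels are hypothetical, and the limits are the ones quoted above for HP-UX.

```python
import re

HPUX_MAX_ID = 2_147_483_647   # 2**31 - 1, the HP-UX limit quoted above
LEGACY_MAX_ID = 32_767        # 2**15 - 1, the limit in the earliest Unix systems

# Only ASCII digits are accepted; signs, spaces and letters are rejected.
_DIGITS = re.compile(r"[0-9]+")


def check_posix_id(value: str) -> str:
    """Classify a uidNumber/gidNumber string (hypothetical helper).

    Returns "invalid" for a non-integer or out-of-range value,
    "legacy-risk" for a valid value above 32,767, and "ok" otherwise.
    """
    if not _DIGITS.fullmatch(value):
        return "invalid"
    number = int(value)
    if number > HPUX_MAX_ID:
        return "invalid"
    if number > LEGACY_MAX_ID:
        return "legacy-risk"
    return "ok"


if __name__ == "__main__":
    for candidate in ["1001", "12a", "40000", "2147483648"]:
        print(candidate, check_posix_id(candidate))
```

Flagging values above 32,767 separately, rather than rejecting them, reflects the text above: such ids are valid on HP-UX but may trouble legacy applications.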
Multi-language Support
In order to support international character sets the LDAP v3 standard defines that textual data shall be
presented in UTF-8 format (see footnote 6). UTF-8 is a sophisticated process for encoding the Unicode defined character
sets. As originally specified, UTF-8 encodes each character in 1 to 6 bytes and can represent any Unicode
character, including the UCS-2 and UCS-4 characters (the current definition, RFC 3629, limits sequences to
4 bytes). UTF-8 was created to be compatible with the 7-bit ASCII character set. An
ASCII text file, when converted to UTF-8, is bit for bit the same. However, characters beyond ASCII
require encoding in a 2 to 6 byte sequence. For example, the UTF-8 encoding of the character "é" is displayed as the two
characters "Ã©" in the ISO8859-1 (LATIN-1) character set, when no translation is performed.
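The behavior described above can be reproduced with a few lines of Python, shown here only as an illustration (the variable names are arbitrary). It encodes "é" to UTF-8, reinterprets the resulting bytes as ISO8859-1 without translation, and confirms that plain ASCII text is unchanged by UTF-8 encoding.

```python
# "\u00e9" is the character é (LATIN SMALL LETTER E WITH ACUTE).
text = "\u00e9"

# Encoding to UTF-8 yields the two bytes 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

# Reading those bytes as ISO8859-1 maps each byte to its own character,
# which is what a LATIN-1 display does when no translation is performed.
misread = utf8_bytes.decode("latin-1")

# Pure ASCII text produces identical bytes in ASCII and in UTF-8.
ascii_unchanged = "root".encode("utf-8") == "root".encode("ascii")

print(utf8_bytes)        # b'\xc3\xa9'
print(ascii(misread))    # '\xc3\xa9', i.e. the two characters Ã and ©
print(ascii_unchanged)   # True
```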
However, UTF-8 is merely a suggested format for text data, used to promote interoperability of
LDAP-based applications. The LDAP v3 specification says that the directory server should not modify or
reject textual data, even if it contains an invalid UTF-8 sequence. This means that an LDAP directory
could store data in any encoding, such as LATIN-1 or Shift JIS. However, that would be poor practice and
would lead to interoperability problems.
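Because the directory server is not expected to reject badly encoded text, a client or migration script may want to check values itself. The following is a minimal sketch under that assumption; the helper name is_valid_utf8 is hypothetical. Note that the check is one-directional: bytes that fail to decode are certainly not UTF-8, but some LATIN-1 or Shift JIS byte strings can decode as UTF-8 by coincidence.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    samples = {
        "UTF-8 'café'": b"caf\xc3\xa9",
        "LATIN-1 'café'": b"caf\xe9",
        "Shift JIS hiragana 'a'": b"\x82\xa0",
        "plain ASCII": b"root",
    }
    for label, raw in samples.items():
        print(label, is_valid_utf8(raw))
```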
The HP-UX LDAP integration products merely act as conduits for data from the directory. Translation from
UTF-8 to other formats is not performed. If you plan to use UTF-8 formatted data you should install the
UTF-8 locales. For the 10.20 OS, please check with your HP support representative to see if a patch is
available for your system. For the 11.00 OS, UTF-8 locales are available with the core operating system, as
of the "9812" release.
The bigger question for the Directory Administrator and the HP-UX System Administrator arises when
migrating the POSIX data to the LDAP directory. If the existing POSIX data to be migrated is not
in ASCII or UTF-8 format, the administrators must decide whether it should be converted to UTF-8 (see the
iconv(1) man page). For example, suppose the /etc/passwd file contains HP-Roman8 characters. Should
those characters be converted to UTF-8 before being added to the directory? Leaving the data in
HP-Roman8 means that the data will not be interoperable with other LDAP enabled applications. However,
converting the data to UTF-8 means that you will need to change your HP-UX system locales to UTF-8
(which may also require installing several patches which include OS UTF-8 locale support and application
6. The Unicode Consortium, "The Unicode Standard" internet web page, http://www.unicode.org/unicode/standard/standard.html