HP-UX Reference (11i v3 07/02) - 5 Miscellaneous Topics (vol 9)

regexp(5) regexp(5)
. * [ \
(period, asterisk, left bracket, and backslash) lose their special meaning within a
bracket expression.
The character sequences:
[. [= [:
(left-bracket followed by a period, equal-sign or colon) are special inside a bracket
expression and are used to delimit collating symbols, equivalence class expressions
and character class expressions. These symbols must be followed by a valid expression
and the matching terminating .], =], or :].
matching list
A matching list expression specifies a list that matches any one of the characters
represented in the list. The first character in the list cannot be the circumflex. For
example, [abc] is an RE that matches any of a, b, or c.
non-matching list
A non-matching list expression begins with a circumflex (^), and specifies a list that
matches any character or collating element except newline and the characters
represented in the list. For example, [^abc] is an RE that matches any character
except newline or a, b, or c. The circumflex has this special meaning only when it
occurs first in the list, immediately following the left square bracket.
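The two list forms can be tried out with a short sketch. Python's re module is used here only as a stand-in and is not the regexp(5) engine: in particular, a negated list in Python also matches newline, so \n is added to the list to mimic the behavior described above. The names MATCHING, NON_MATCHING, LITERAL_CARET and matches are illustrative, not part of any HP-UX interface.

```python
import re

# Stand-in patterns; Python's re is not the HP-UX regexp(5) implementation.
MATCHING = re.compile(r"[abc]")          # matches any one of a, b, or c
NON_MATCHING = re.compile(r"[^abc\n]")   # \n added to mimic "except newline"
LITERAL_CARET = re.compile(r"[a^c]")     # circumflex not first: ordinary character


def matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None
```

With these definitions, matches(MATCHING, "b") and matches(NON_MATCHING, "d") are true, while matches(NON_MATCHING, "a") is false. matches(LITERAL_CARET, "^") is true because the circumflex is not the first character in the list.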
collating element
A collating element is a sequence of one or more characters that represents a single
element in the collating sequence as identified via the most current setting of the
locale variable
LC_COLLATE (see setlocale(3C)).
collating symbol
A collating symbol is a collating element enclosed within bracket-period ([. .])
delimiters. Multicharacter collating elements must be represented as collating symbols
to distinguish them from single-character collating elements. For example, if the
string ch is a valid collating element, then [[.ch.]] is treated as an element
matching the same string of characters, while [ch] is treated as a simple list of the
characters c and h. If the string within the bracket-period delimiters is not a valid
collating element in the current collating sequence definition, the symbol is treated as
an invalid expression.
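The distinction between a collating symbol and a simple list can be illustrated with a toy splitter that works on the text between the outer square brackets. This is only a sketch: split_elements is a hypothetical helper, it recognizes the bracket-period delimiters alone, and it does not consult the locale, so it cannot tell whether a symbol is a valid collating element.

```python
def split_elements(body):
    """Split the text between the outer [ and ] into collating elements.

    A [.xyz.] collating symbol yields one multicharacter element; any other
    character yields a single-character element. No locale check is made.
    """
    elements = []
    i = 0
    while i < len(body):
        if body.startswith("[.", i):
            end = body.find(".]", i + 2)
            if end == -1:
                raise ValueError("unterminated collating symbol")
            elements.append(body[i + 2:end])
            i = end + 2
        else:
            elements.append(body[i])
            i += 1
    return elements
```

Under this model, the body of [[.ch.]] splits into the single element ch, whereas the same two characters written without the delimiters split into c and h.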
noncollating character
A noncollating character is a character that is ignored for collating purposes. By
definition, such characters cannot participate in equivalence classes or range
expressions.
equivalence class
An equivalence class expression represents the set of collating elements belonging
to an equivalence class. It is expressed by enclosing any one of the collating elements
in the equivalence class within bracket-equal ([= =]) delimiters. For example, if a
and A belong to the same equivalence class, then [[=a=]b] and [[=A=]b] are
each equivalent to [aAb].
range expression
A range expression represents the set of collating elements that fall between two
elements in the current collation sequence as defined via the most current setting of
the locale variable LC_COLLATE (see setlocale(3C)). It is expressed as the starting
point and the ending point separated by a hyphen (-).
The starting range point and the ending range point must each be a collating element,
collating symbol, or equivalence class expression. An equivalence class expression used
as an end point of a range expression is interpreted such that all collating elements
within the equivalence class are included in the range. For example, if the collating
order is A, a, B, b, C, c, ch, D, and d and the characters A and a belong to the same
equivalence class, then the expression [[=a=]-D] is treated as
[AaBbCc[.ch.]D].
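The expansion in this example can be reproduced with a small model. It is a sketch under stated assumptions: the collation order and the equivalence class are hard-coded from the example rather than read from LC_COLLATE, and the function names (equivalence_class, expand_range, render) are hypothetical. An equivalence class at either end point is assumed to contribute all of its members, so the range runs from the lowest-collating start element to the highest-collating end element.

```python
# Hypothetical collation data taken from the example in the text; a real
# locale would supply this through LC_COLLATE.
COLLATION_ORDER = ["A", "a", "B", "b", "C", "c", "ch", "D", "d"]
EQUIVALENCE_CLASSES = [{"A", "a"}]


def equivalence_class(element):
    """Return every collating element equivalent to element (itself included)."""
    for members in EQUIVALENCE_CLASSES:
        if element in members:
            return set(members)
    return {element}


def expand_range(start_points, end_points):
    """List the collating elements from the start point through the end point.

    Each argument is a set of elements, so that an equivalence class used as
    an end point brings all of its members into the range.
    """
    low = min(COLLATION_ORDER.index(e) for e in start_points)
    high = max(COLLATION_ORDER.index(e) for e in end_points)
    if high < low:
        raise ValueError("ending range point collates lower than starting point")
    return COLLATION_ORDER[low:high + 1]


def render(elements):
    """Write elements as a bracket expression, using [. .] for multicharacter ones."""
    parts = [e if len(e) == 1 else "[." + e + ".]" for e in elements]
    return "[" + "".join(parts) + "]"
```

Calling render(expand_range(equivalence_class("a"), {"D"})) yields [AaBbCc[.ch.]D], the same list as in the example. A range whose ending point collates lower than its starting point, such as D through B, raises ValueError in this model.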
Both starting and ending range points must be valid collating elements, collating symbols,
or equivalence class expressions, and the ending range point must collate equal
to or higher than the starting range point; otherwise the expression is invalid. For
404 Hewlett-Packard Company 2 HP-UX 11i Version 3: February 2007