HP-UX Reference (11i v3 07/02) - 5 Miscellaneous Topics (vol 9)

regexp(5) regexp(5)
example, with the above collating order and assuming that E is a noncollating character,
then both the expressions [[=A=]-E] and [d-a] are invalid.
An ending range point can also be the starting range point in a subsequent range
expression. Each such range expression is evaluated separately. For example, the
bracket expression [a-m-o] is treated as [a-mm-o].
The hyphen character is treated as itself if it occurs first (after an initial ^, if any) or
last in the list, or as the rightmost symbol in a range expression. As examples, the
expressions [-ac] and [ac-] are equivalent and match any of the characters a, c, or -;
the expressions [^-ac] and [^ac-] are equivalent and match any characters except newline,
a, c, or -; the expression [%--] matches any of the characters in the defined collating
sequence between % and - inclusive; the expression [--@] matches any of the characters
in the defined collating sequence between - and @ inclusive; and the expression [a--@]
is invalid, assuming - precedes a in the collating sequence.
If a bracket expression must specify both - and ], the ] must be placed first (after the
^, if any) and the - last within the bracket expression.
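The range and hyphen rules above can be sketched as a small expander. This is an
illustrative Python sketch, not the HP-UX implementation: the function name
expand_bracket is hypothetical, code point order stands in for the collating sequence
(roughly the C locale), and collating symbols, equivalence classes, and character
classes are not handled.

```python
def expand_bracket(expr):
    """Expand a simple bracket expression into (negated, set_of_characters).

    Assumptions of this sketch: code point order replaces the collating
    sequence, and only plain characters and ranges appear in the list.
    """
    if len(expr) < 3 or expr[0] != "[" or expr[-1] != "]":
        raise ValueError("not a bracket expression: %r" % expr)
    body = expr[1:-1]
    negated = body.startswith("^")
    if negated:
        body = body[1:]
    if not body:
        raise ValueError("empty bracket list: %r" % expr)

    chars = set()
    i = 0
    while i < len(body):
        start = body[i]
        # A hyphen forms a range only when a character follows it, so a
        # hyphen that is first or last in the list is taken literally.
        if i + 2 < len(body) and body[i + 1] == "-":
            end = body[i + 2]
            if ord(end) < ord(start):
                raise ValueError("invalid range %s-%s" % (start, end))
            chars.update(chr(c) for c in range(ord(start), ord(end) + 1))
            # An ending range point may also start the next range:
            # a-m-o is read as a-m followed by m-o.
            if i + 4 < len(body) and body[i + 3] == "-":
                i += 2
            else:
                i += 3
        else:
            chars.add(start)
            i += 1
    return negated, chars
```

With this sketch, expand_bracket("[-ac]") and expand_bracket("[ac-]") return the same
set, expand_bracket("[a-m-o]") equals expand_bracket("[a-mm-o]"), and both
expand_bracket("[a--@]") and expand_bracket("[d-a]") raise ValueError. A leading ] as
in []a-] is simply the first literal character of the list.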
character class
A character class expression represents the set of characters belonging to a character
class, as defined via the most current setting of the locale variable LC_CTYPE. It is
expressed as a character class name enclosed within bracket-colon ([: :]) delimiters.
Standard character class expressions supported in all locales are:
[:alpha:] letters
[:upper:] upper-case letters
[:lower:] lower-case letters
[:digit:] decimal digits
[:xdigit:] hexadecimal digits
[:alnum:] letters or decimal digits
[:space:] characters producing white-space in displayed text
[:print:] printing characters
[:punct:] punctuation characters
[:graph:] characters with a visible representation
[:cntrl:] control characters
[:blank:] blank characters
For example, if the locale variable LC_CTYPE is set to C, the expression
[[:upper:]] is equivalent to [A-Z]. Similarly, the expression [[:digit:]]
is the same as [0-9].
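As an illustration of the C-locale equivalences just described, the following Python
sketch lists the members of each standard class. It assumes LC_CTYPE is C with an
ASCII code set; the names C_LOCALE_CLASSES and class_members are hypothetical, and
other locales define different members.

```python
import string

# Members of each standard class in the C locale (ASCII). This table is an
# assumption of the sketch; other LC_CTYPE settings give different sets.
C_LOCALE_CLASSES = {
    "alpha": set(string.ascii_letters),
    "upper": set(string.ascii_uppercase),
    "lower": set(string.ascii_lowercase),
    "digit": set(string.digits),
    "xdigit": set(string.hexdigits),
    "alnum": set(string.ascii_letters + string.digits),
    "space": set(" \t\n\v\f\r"),
    "print": {chr(c) for c in range(0x20, 0x7F)},
    "punct": set(string.punctuation),
    "graph": {chr(c) for c in range(0x21, 0x7F)},
    "cntrl": {chr(c) for c in range(0x20)} | {"\x7f"},
    "blank": set(" \t"),
}


def class_members(expr):
    """Return the C-locale members of an expression such as '[:upper:]'."""
    if not (expr.startswith("[:") and expr.endswith(":]")):
        raise ValueError("not a character class expression: %r" % expr)
    return C_LOCALE_CLASSES[expr[2:-2]]
```

Under these assumptions, class_members("[:upper:]") is exactly the letters A through Z
and class_members("[:digit:]") is exactly the digits 0 through 9.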
REs Matching Multiple Characters
The following rules may be used to construct REs matching multiple characters from REs matching a single
character:
RE RE The concatenation of REs is an RE that matches the first encountered concatenation
of the strings matched by each component of the RE. For example, the RE bc
matches the second and third characters of the string abcdefabcdef.
RE* An RE matching a single character followed by an asterisk (*) is an RE that matches
zero or more occurrences of the RE preceding the asterisk. The first encountered
string that permits a match is chosen, and the matched string will encompass the
maximum number of characters permitted by the RE. For example, in the string
abbbcdeabbbbbbcde, both the RE b*c and the RE bbb*c are matched by the
substring bbbc in the second through fifth positions. An asterisk as the first character
of an RE loses this special meaning and is treated as itself.
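The two examples above can be checked with a short sketch. It uses Python's re module
as a stand-in, which is an assumption: re is not the HP-UX regexp implementation, but
it gives the same results for these particular patterns. Python reports zero-based
offsets, so the span (1, 3) covers the second and third characters. The leading-asterisk
rule is not shown because Python's re rejects such a pattern instead of treating the
asterisk as a literal.

```python
import re

# Concatenation: the first occurrence of "bc" is matched.
concat = re.search("bc", "abcdefabcdef")
print(concat.span())            # (1, 3): second and third characters

# RE*: both patterns select the same first, longest substring.
text = "abbbcdeabbbbbbcde"
star_a = re.search("b*c", text)
star_b = re.search("bbb*c", text)
print(star_a.group(), star_a.span())   # bbbc (1, 5)
print(star_b.group(), star_b.span())   # bbbc (1, 5)
```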
HP-UX 11i Version 3: February 2007 3 Hewlett-Packard Company 405