regexp.5 (2010 09)

regexp(5) regexp(5)
For example, with the above collating order and assuming that E is a
noncollating character, both the expressions [[=A=]-E] and [d-a] are invalid.
An ending range point can also be the starting range point in a subsequent range
expression. Each such range expression is evaluated separately. For example, the
bracket expression [a-m-o] is treated as [a-mm-o].
The hyphen character is treated as itself if it occurs first (after an initial
^, if any) or last in the list, or as the rightmost symbol in a range expression.
As examples, the expressions [-ac] and [ac-] are equivalent and match any of the
characters a, c, or -; the expressions [^-ac] and [^ac-] are equivalent and match
any characters except newline, a, c, or -; the expression [%--] matches any of the
characters in the defined collating sequence between % and - inclusive; the
expression [--@] matches any of the characters in the defined collating sequence
between - and @ inclusive; and the expression [a--@] is invalid, assuming -
precedes a in the collating sequence.
If a bracket expression must specify both - and ], the ] must be placed first
(after the ^, if any) and the - last within the bracket expression.
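The range and hyphen rules above can be modeled with a short Python sketch. It
is an illustration only, not the HP-UX implementation: the helper name
expand_bracket is hypothetical, the collating sequence is assumed to be plain
code-point order (as in the C locale), and collating symbols, equivalence
classes, and character classes are not handled. The function takes the text
between the outer brackets.

```python
def expand_bracket(body):
    """Return (negated, members) for the text between [ and ].

    Hypothetical helper; assumes collating order == code-point order.
    """
    negated = body.startswith("^")
    if negated:
        body = body[1:]
    members = set()
    prev = None  # most recent character that may start a range
    i = 0
    while i < len(body):
        ch = body[i]
        # A hyphen forms a range only if a start point precedes it and
        # another character follows it; otherwise it stands for itself.
        if ch == "-" and prev is not None and i + 1 < len(body):
            end = body[i + 1]
            if ord(end) < ord(prev):
                raise ValueError(f"invalid range {prev}-{end}")
            members.update(chr(c) for c in range(ord(prev), ord(end) + 1))
            prev = end  # an ending point may also start the next range
            i += 2
        else:
            members.add(ch)
            prev = ch
            i += 1
    return negated, members


# [a-m-o] is treated as [a-mm-o]
print(expand_bracket("a-m-o") == expand_bracket("a-mm-o"))  # True
# [-ac] and [ac-] are equivalent
print(expand_bracket("-ac") == expand_bracket("ac-"))       # True
```

Under the same assumptions, expand_bracket("a--@") and expand_bracket("d-a")
raise ValueError, because the ending range point precedes the starting point.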
character class
A character class expression represents the set of characters belonging to a
character class, as defined via the most current setting of the locale variable
LC_CTYPE. It is expressed as a character class name enclosed within bracket-colon
([: :]) delimiters.
Standard character class expressions supported in all locales are:
[:alpha:]     letters
[:upper:]     upper-case letters
[:lower:]     lower-case letters
[:digit:]     decimal digits
[:xdigit:]    hexadecimal digits
[:alnum:]     letters or decimal digits
[:space:]     characters producing white-space in displayed text
[:print:]     printing characters
[:punct:]     punctuation characters
[:graph:]     characters with a visible representation
[:cntrl:]     control characters
[:blank:]     blank characters
For example, if the locale variable LC_CTYPE is set to C, the expression
[[:upper:]] is equivalent to [A-Z]. Similarly, the expression [[:digit:]] is the
same as [0-9].
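As a rough illustration of the class names listed above, the following Python
sketch tabulates their membership when LC_CTYPE is C. The table and the helper
name class_members are assumptions made for this example (Python's re module
does not implement the [: :] notation), and the sets cover 7-bit ASCII only.

```python
import string

# Assumed C-locale (ASCII) membership for each standard class name.
C_LOCALE_CLASSES = {
    "alpha": set(string.ascii_letters),
    "upper": set(string.ascii_uppercase),
    "lower": set(string.ascii_lowercase),
    "digit": set(string.digits),
    "xdigit": set(string.hexdigits),
    "alnum": set(string.ascii_letters + string.digits),
    "space": set(" \t\n\v\f\r"),
    "print": {chr(c) for c in range(0x20, 0x7F)},
    "punct": set(string.punctuation),
    "graph": {chr(c) for c in range(0x21, 0x7F)},
    "cntrl": {chr(c) for c in range(0x00, 0x20)} | {"\x7f"},
    "blank": {" ", "\t"},
}


def class_members(name):
    """Return the characters in [:name:] under the assumed C locale."""
    return C_LOCALE_CLASSES[name]


# [[:upper:]] is equivalent to [A-Z]; [[:digit:]] is the same as [0-9]
print(sorted(class_members("upper")) == list("ABCDEFGHIJKLMNOPQRSTUVWXYZ"))  # True
print(sorted(class_members("digit")) == list("0123456789"))                  # True
```

In other locales the membership of each class can differ, so a table like this
one should not be relied on outside the C locale.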
REs Matching Multiple Characters
The following rules may be used to construct REs matching multiple characters
from REs matching a single character:
RE RE     The concatenation of REs is an RE that matches the first encountered
          concatenation of the strings matched by each component of the RE. For
          example, the RE bc matches the second and third characters of the
          string abcdefabcdef.
RE*       An RE matching a single character followed by an asterisk (*) is an RE
          that matches zero or more occurrences of the RE preceding the
          asterisk. The first encountered string that permits a match is chosen,
          and the matched string will encompass the maximum number of characters
          permitted by the RE. For example, in the string abbbcdeabbbbbbcde, both
          the RE b*c and the RE bbb*c are matched by the substring bbbc in the
          second through fifth positions. An asterisk as the first character of
          an RE loses this special meaning and is treated as itself.
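The concatenation and asterisk examples above can be checked with a small
Python sketch. Python's re module is used here only as a stand-in: it is not the
regexp(5) engine, but for these simple patterns it reports the same first,
longest match. The helper name first_match is hypothetical. The leading-asterisk
rule is not shown, because Python rejects a pattern that begins with * instead
of treating the asterisk as itself.

```python
import re


def first_match(pattern, text):
    """Return (start, end, matched text) of the first match, 1-based inclusive."""
    m = re.search(pattern, text)
    if m is None:
        return None
    # m.start() is 0-based; m.end() is exclusive, so it already equals the
    # 1-based position of the last matched character.
    return m.start() + 1, m.end(), m.group()


print(first_match("bc", "abcdefabcdef"))          # (2, 3, 'bc')
print(first_match("b*c", "abbbcdeabbbbbbcde"))    # (2, 5, 'bbbc')
print(first_match("bbb*c", "abbbcdeabbbbbbcde"))  # (2, 5, 'bbbc')
```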
HP-UX 11i Version 3: September 2010 3 Hewlett-Packard Company 3