regexp.5 (2010 09)
regexp(5) regexp(5)
[]      square brackets
*+?     asterisk, plus sign, question mark
^$      anchoring
        concatenation
|       alternation
For example, the ERE abba|cde is interpreted as "match either abba or cde". It does not mean
"match abb followed by a or c followed in turn by de" (because concatenation has a higher order
of precedence than alternation).
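The precedence rule can be checked with a short sketch. It assumes Python's re module, which is not a POSIX ERE implementation but also gives concatenation a higher precedence than alternation. The helper name matches_whole is illustrative only.

```python
import re


def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# abba|cde groups as (abba)|(cde): concatenation binds tighter than alternation.
print(matches_whole(r"abba|cde", "abba"))      # True
print(matches_whole(r"abba|cde", "cde"))       # True

# The reading "abb, then a or c, then de" is not what the ERE means.
print(matches_whole(r"abba|cde", "abbade"))    # False

# That reading has to be requested with explicit grouping.
print(matches_whole(r"abb(a|c)de", "abbade"))  # True
```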
Expression Anchoring
An ERE can be limited to matching strings that begin or end a line (i.e., anchored) according to the
following rules:
• A circumflex (^) matches the beginning of a line (anchors the expression to the beginning of a
line). For example, the ERE ^ab matches the string ab in the line abcdef, but not the same
string in the line cdefab.
• A dollar sign ($) matches the end of a line (anchors the expression to the end of a line). For
example, the ERE ab$ matches the string ab in the line cdefab, but not the same string in the
line abcdef.
• An ERE anchored by both ^ and $ matches only strings that are lines. For example, the ERE
^abcdef$ matches only lines consisting of the string abcdef. Only empty lines match the
ERE ^$.
The use of duplication characters (+,*) following anchors is illegal.
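The anchoring rules above can be illustrated with a minimal sketch. It assumes Python's re module as a stand-in for an ERE matcher and treats each string as a single line with no trailing newline. The helper names found and compiles are hypothetical. Python happens to reject a duplication character placed directly after an anchor; other implementations may report the error differently.

```python
import re


def found(pattern: str, line: str) -> bool:
    """Return True if pattern matches somewhere in the given line."""
    return re.search(pattern, line) is not None


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(found(r"^ab", "abcdef"))        # True: ab begins the line
print(found(r"^ab", "cdefab"))        # False: ab is not at the beginning
print(found(r"ab$", "cdefab"))        # True: ab ends the line
print(found(r"ab$", "abcdef"))        # False: ab is not at the end
print(found(r"^abcdef$", "abcdef"))   # True: the string is the whole line
print(found(r"^abcdef$", "xabcdef"))  # False: extra leading character
print(found(r"^$", ""))               # True: only an empty line matches
print(compiles(r"^*"))                # False: duplication after an anchor
```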
PATTERN MATCHING NOTATION
The following rules apply to pattern matching notation except as noted in the descriptions of the specific
utilities using pattern matching.
Patterns Matching a Single Character
The following patterns match a single character or a single collating element:
Ordinary Characters
An ordinary character is a pattern that matches itself. An ordinary character is any character in the
supported character set except newline and the pattern matching special characters listed in Special
Characters below. Matching is based on the bit pattern used for encoding the character, not on the
graphic representation of the character.
Special Characters
A pattern matching special character preceded by a backslash (\) is a pattern that matches the special
character itself. When not preceded by a backslash, such characters have special meaning in the
specification of patterns. The pattern matching special characters and the contexts in which they have
their special meaning are:
?*[ The question mark, asterisk, and left square bracket are special except when used
in a bracket expression (see Pattern Bracket Expression).
Question Mark
A question mark (?), when used outside of a bracket expression, is a pattern that matches any printable
or nonprintable character except newline.
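As a rough illustration of the question mark, the sketch below assumes Python's fnmatch module, a shell-style matcher that is similar to, but not identical with, the notation described here. In particular, Python's matcher does not exclude newline from the characters that ? can match, and it does not treat backslash as an escape character.

```python
from fnmatch import fnmatchcase

# Outside a bracket expression, ? matches exactly one character.
print(fnmatchcase("cat", "c?t"))    # True
print(fnmatchcase("ct", "c?t"))     # False: no character for ? to match
print(fnmatchcase("coat", "c?t"))   # False: two characters where one is allowed

# Inside a bracket expression, ? loses its special meaning.
print(fnmatchcase("c?t", "c[?]t"))  # True: matches a literal question mark
print(fnmatchcase("cat", "c[?]t"))  # False
```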
Pattern Bracket Expression
The syntax and rules for pattern bracket expressions are the same as for RE bracket expressions found
above with the following exceptions:
• The exclamation point character (!) replaces the circumflex character (^) in its role in a
non-matching list in the regular expression notation.
• The backslash is used as an escape character within bracket expressions.
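The first exception can be sketched with Python's fnmatch module, used here only as an assumed stand-in for a pattern matcher that follows this notation. It accepts ! as the non-matching-list marker. It does not implement the backslash escape described in the second exception, so that case is not shown.

```python
from fnmatch import fnmatchcase

# [!a] is a non-matching list: any single character other than a.
print(fnmatchcase("b", "[!a]"))            # True
print(fnmatchcase("a", "[!a]"))            # False

# The same marker works with a range expression.
print(fnmatchcase("file3", "file[!0-2]"))  # True: 3 is outside 0-2
print(fnmatchcase("file1", "file[!0-2]"))  # False: 1 is inside 0-2
```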
6 Hewlett-Packard Company − 6 − HP-UX 11i Version 3: September 2010