regexp.5 (2010 09)


regexp(5) regexp(5)

\(RE\)    A subexpression can be defined within an RE by enclosing it between
          the character pairs \( and \). Such a subexpression matches whatever
          it would have matched without the \( and \). Subexpressions can be
          arbitrarily nested. An asterisk immediately following the \( loses
          its special meaning and is treated as itself. An asterisk
          immediately following the \) is treated as an invalid character.

\n        The expression \n matches the same string of characters as was
          matched by a subexpression enclosed between \( and \) preceding the
          \n. The character n must be a digit from 1 through 9, specifying the
          n-th subexpression (the one that begins with the n-th \( and ends
          with the corresponding paired \)). For example, the expression

               ^\(.*\)\1$

          matches a line consisting of two adjacent appearances of the same
          string.

          If the \n is followed by an asterisk, it matches zero or more
          occurrences of the subexpression referred to. For example, the
          expression

               \(ab\(cd\)ef\)Z\2*Z\1

          matches the string

               abcdefZcdcdZabcdef
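The two back-reference examples above can be checked with a short script. This is only a sketch: Python's re module is not the HP-UX basic RE engine, and it writes groups as ( ) rather than \( \), so the patterns below are hand translations made for this illustration. The helper names is_doubled and nested_groups are hypothetical and do not come from regexp(5).

```python
import re

# Hand-translated from the BRE forms (assumption): \( \) becomes ( ) in Python.
DOUBLED = re.compile(r"^(.*)\1$")          # BRE: ^\(.*\)\1$
NESTED = re.compile(r"(ab(cd)ef)Z\2*Z\1")  # BRE: \(ab\(cd\)ef\)Z\2*Z\1


def is_doubled(line):
    """Return True if line consists of two adjacent copies of one string."""
    return DOUBLED.search(line) is not None


def nested_groups(text):
    """Return (subexpression 1, subexpression 2) if the whole text matches."""
    m = NESTED.fullmatch(text)
    return (m.group(1), m.group(2)) if m else None


print(is_doubled("abcabc"))                   # two copies of "abc"
print(is_doubled("abcabd"))                   # second half differs
print(nested_groups("abcdefZcdcdZabcdef"))    # \1 is "abcdef", \2 is "cd"
```

Subexpressions are numbered by the position of their opening delimiter, which is why the outer group is \1 and the inner group is \2.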

\{m,n\}   An RE matching a single character followed by \{m\}, \{m,\}, or
          \{m,n\} is an RE that matches repeated occurrences of the RE. The
          values of m and n must be decimal integers in the range 0 through
          255, with m specifying the exact or minimum number of occurrences
          and n specifying the maximum number of occurrences. \{m\} matches
          exactly m occurrences of the preceding RE, \{m,\} matches at least
          m occurrences, and \{m,n\} matches any number of occurrences
          between m and n, inclusive.

          The first encountered string that matches the expression is chosen;
          it will contain as many occurrences of the RE as possible. For
          example, in the string

               abbbbbbbc

          the RE b\{3\} is matched by characters two through four, the RE
          b\{3,\} is matched by characters two through eight, and the RE
          b\{3,5\}c is matched by characters four through nine.
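The character positions quoted in this example can be reproduced with a small script. This is a sketch under stated assumptions: it uses Python's re module, which writes interval expressions as {m,n} rather than \{m,n\} and uses a backtracking matcher rather than the POSIX matching rules, although the two agree on these three patterns. The helper span_1based is a hypothetical name introduced here to convert Python's 0-based spans to the 1-based, inclusive positions used in the text.

```python
import re


def span_1based(pattern, text):
    """Return (first, last) 1-based positions of the leftmost match, or None."""
    m = re.search(pattern, text)
    if m is None:
        return None
    start, end = m.span()      # 0-based start, exclusive end
    return (start + 1, end)    # 1-based start, inclusive end


text = "abbbbbbbc"
print(span_1based(r"b{3}", text))     # BRE: b\{3\}
print(span_1based(r"b{3,}", text))    # BRE: b\{3,\}
print(span_1based(r"b{3,5}c", text))  # BRE: b\{3,5\}c
```

The third pattern cannot match starting at character two or three, because at most five occurrences of b may precede the c, so the leftmost match begins at character four.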

Expression Anchoring

An RE can be limited to matching strings that begin or end a line (i.e.,
anchored) according to the following rules:

•  A circumflex (^) as the first character of an RE anchors the expression
   to the beginning of a line; only strings starting at the first character
   of a line are matched by the RE. For example, the RE ^ab matches the
   string ab in the line abcdef, but not the same string in the line cdefab.

•  A dollar sign ($) as the last character of an RE anchors the expression
   to the end of a line; only strings ending at the last character of a line
   are matched by the RE. For example, the RE ab$ matches the string ab in
   the line cdefab, but not the same string in the line abcdef.

•  An RE anchored by both ^ and $ matches only strings that are lines. For
   example, the RE ^abcdef$ matches only lines consisting of the string
   abcdef.

The use of duplication characters (+,*) following anchors is illegal.
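
The anchoring rules above can be exercised with the same example lines. This is a sketch that assumes Python's re module as a stand-in for the utilities described here. The ^ and $ anchors behave the same way for these inputs, but Python's $ also matches just before a trailing newline, so the test lines contain no newline. The helper name matches is hypothetical.

```python
import re


def matches(pattern, line):
    """Return True if pattern matches somewhere in line (no trailing newline)."""
    return re.search(pattern, line) is not None


# (pattern, line) pairs taken from the examples in the text, plus one extra
# non-matching line ("abcdefg") added for illustration.
cases = [
    (r"^ab", "abcdef"),
    (r"^ab", "cdefab"),
    (r"ab$", "cdefab"),
    (r"ab$", "abcdef"),
    (r"^abcdef$", "abcdef"),
    (r"^abcdef$", "abcdefg"),
]

for pattern, line in cases:
    print(pattern, line, matches(pattern, line))
```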

EXTENDED REGULAR EXPRESSIONS

The extended regular expression (ERE) notation and construction rules apply
to utilities defined as using extended REs. Any exceptions to the following
rules are noted in the descriptions of the specific utilities using EREs.

EREs Matching a Single Character

The following EREs match a single character or a single collating element:

Ordinary Characters

An ordinary character is an ERE that matches itself. An ordinary character is
any character in the supported character set except newline and the regular
expression special characters listed in Special Characters below. An ordinary
character preceded by a backslash (\) is treated as the ordinary character
itself. Matching is based on the bit pattern used for encoding the character,
not on the graphic representation of the character.

Special Characters

A regular expression special character preceded by a backslash is a regular
expression that matches the special character itself. When not preceded by a
backslash, such characters have special meaning in the specification of EREs.
The extended regular expression special characters and the contexts in which
they

4 Hewlett-Packard Company − 4 − HP-UX 11i Version 3: September 2010