HP-UX Reference (11i v3 07/02) - 5 Miscellaneous Topics (vol 9)

regexp(5) regexp(5)

\(RE\)   A subexpression can be defined within an RE by enclosing it between the character
         pairs \( and \). Such a subexpression matches whatever it would have matched
         without the \( and \). Subexpressions can be arbitrarily nested. An asterisk
         immediately following the \( loses its special meaning and is treated as itself. An
         asterisk immediately following the \) is treated as an invalid character.

\n       The expression \n matches the same string of characters as was matched by a
         subexpression enclosed between \( and \) preceding the \n. The character n must
         be a digit from 1 through 9, specifying the n-th subexpression (the one that
         begins with the n-th \( and ends with the corresponding paired \)). For example,
         the expression ^\(.*\)\1$ matches a line consisting of two adjacent appearances
         of the same string.

         If the \n is followed by an asterisk, it matches zero or more occurrences of the
         subexpression referred to. For example, the expression \(ab\(cd\)ef\)Z\2*Z\1
         matches the string abcdefZcdcdZabcdef.
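The back-reference rules above can be tried out with a short sketch. It uses Python's re module, which is an assumption made for illustration and not part of HP-UX: Python writes grouping as ( and ) where this page writes \( and \), while \1 through \9 keep the same meaning. The names DOUBLED, NESTED and is_doubled are hypothetical.

```python
import re

# ^\(.*\)\1$ rewritten in Python syntax: \( and \) become ( and ).
DOUBLED = re.compile(r"^(.*)\1$")

# \(ab\(cd\)ef\)Z\2*Z\1 rewritten the same way.
# Group 1 begins at the first "(", group 2 at the second "(".
NESTED = re.compile(r"(ab(cd)ef)Z\2*Z\1")


def is_doubled(line):
    """Return True if line consists of two adjacent copies of one string."""
    return DOUBLED.search(line) is not None


if __name__ == "__main__":
    print(is_doubled("abcabc"))   # two copies of "abc"
    print(is_doubled("abcab"))    # not two copies of anything
    m = NESTED.fullmatch("abcdefZcdcdZabcdef")
    print(m.group(1), m.group(2))
```

In the second pattern, \2* repeats whatever the inner subexpression matched (cd), so the text between the two Z characters may be empty, cd, cdcd, and so on.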

\{m,n\}  An RE matching a single character followed by \{m\}, \{m,\}, or \{m,n\} is an
         RE that matches repeated occurrences of the RE. The values of m and n must be
         decimal integers in the range 0 through 255, with m specifying the exact or minimum
         number of occurrences and n specifying the maximum number of occurrences.
         \{m\} matches exactly m occurrences of the preceding RE, \{m,\} matches at
         least m occurrences, and \{m,n\} matches any number of occurrences between m
         and n, inclusive.

         The first encountered string that matches the expression is chosen; it will contain as
         many occurrences of the RE as possible. For example, in the string abbbbbbbc the
         RE b\{3\} is matched by characters two through four, the RE b\{3,\} is matched
         by characters two through eight, and the RE b\{3,5\}c is matched by characters
         four through nine.
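The character positions in the example above can be checked with a small sketch. It assumes Python's re module, which writes intervals as {m,n} without the backslashes used on this page. The helper one_based_span is hypothetical and converts Python's 0-based, end-exclusive match span into the 1-based, inclusive character positions used in the text.

```python
import re

TEXT = "abbbbbbbc"  # one a, seven b characters, one c


def one_based_span(pattern, text):
    """Return (first, last) 1-based positions of the first match, or None."""
    m = re.search(pattern, text)
    if m is None:
        return None
    # m.start() is 0-based; m.end() is exclusive, so it already equals
    # the 1-based position of the last matched character.
    return (m.start() + 1, m.end())


if __name__ == "__main__":
    # b\{3\}, b\{3,\} and b\{3,5\}c in Python syntax
    for pattern in (r"b{3}", r"b{3,}", r"b{3,5}c"):
        print(pattern, one_based_span(pattern, TEXT))
```

The last pattern cannot start at character two, because at most five b characters may precede the c; the first position from which five b characters are followed directly by c is character four.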

Expression Anchoring

An RE can be limited to matching strings that begin or end a line (i.e., anchored) according to the following
rules:

• A circumflex (^) as the first character of an RE anchors the expression to the beginning of a line;
  only strings starting at the first character of a line are matched by the RE. For example, the RE
  ^ab matches the string ab in the line abcdef, but not the same string in the line cdefab.

• A dollar sign ($) as the last character of an RE anchors the expression to the end of a line; only
  strings ending at the last character of a line are matched by the RE. For example, the RE
  ab$ matches the string ab in the line cdefab, but not the same string in the line abcdef.

• An RE anchored by both ^ and $ matches only strings that are lines. For example, the RE
  ^abcdef$ matches only lines consisting of the string abcdef.

The use of duplication characters (+,*) following anchors is illegal.
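The three anchoring rules can be exercised with a minimal sketch, again assuming Python's re module as a stand-in for a utility that uses these REs. The helper matches is hypothetical, and each line is assumed to be passed without its trailing newline.

```python
import re


def matches(pattern, line):
    """Return True if pattern matches somewhere in line."""
    return re.search(pattern, line) is not None


if __name__ == "__main__":
    print(matches(r"^ab", "abcdef"), matches(r"^ab", "cdefab"))
    print(matches(r"ab$", "cdefab"), matches(r"ab$", "abcdef"))
    print(matches(r"^abcdef$", "abcdef"), matches(r"^abcdef$", "xabcdef"))
```

Each printed pair shows the anchored RE matching the first line and rejecting the second.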

EXTENDED REGULAR EXPRESSIONS

The extended regular expression (ERE) notation and construction rules apply to utilities defined as using
extended REs. Any exceptions to the following rules are noted in the descriptions of the specific utilities
using EREs.

EREs Matching a Single Character

The following EREs match a single character or a single collating element:

Ordinary Characters

An ordinary character is an ERE that matches itself. An ordinary character is any character in the
supported character set except newline and the regular expression special characters listed in Special
Characters below. An ordinary character preceded by a backslash (\) is treated as the ordinary character
itself. Matching is based on the bit pattern used for encoding the character, not on the graphic
representation of the character.
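The point about bit patterns can be illustrated with a sketch that assumes Python's re module working on UTF-8 encoded bytes; the encoding and the names below are assumptions for illustration, not taken from this page. An accented e can be encoded either as one precomposed code point or as a plain e followed by a combining accent. Both display the same, but their byte sequences differ, so one does not match the other.

```python
import re

# The same visible character in two different encodings (UTF-8 assumed).
PRECOMPOSED = "\u00e9".encode("utf-8")   # bytes C3 A9
DECOMPOSED = "e\u0301".encode("utf-8")   # bytes 65 CC 81


def found(pattern, data):
    """Return True if the literal byte string pattern occurs in data."""
    # re.escape keeps every byte of pattern an ordinary character.
    return re.search(re.escape(pattern), data) is not None


if __name__ == "__main__":
    print(found(b"a", b"cat"))               # ordinary character matches itself
    print(found(PRECOMPOSED, PRECOMPOSED))   # identical encoding
    print(found(PRECOMPOSED, DECOMPOSED))    # same glyph, different bytes
```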

Special Characters

A regular expression special character preceded by a backslash is a regular expression that matches the

special character itself. When not preceded by a backslash, such characters have special meaning in the

406 Hewlett-Packard Company − 4 − HP-UX 11i Version 3: February 2007