HP-UX Reference (11i v1 05/09) - 5 Miscellaneous Topics (vol 9)
regexp(5) regexp(5)
NAME
regexp - regular expression and pattern matching notation definitions
DESCRIPTION
A regular expression is a mechanism supported by many utilities for locating and manipulating patterns
in text. Pattern matching notation is used by shells and other utilities for file name expansion. This
manual entry defines two forms of regular expressions, Basic Regular Expressions and Extended
Regular Expressions, and one form of Pattern Matching Notation.
BASIC REGULAR EXPRESSIONS
Basic regular expression (RE) notation and construction rules apply to utilities defined as using basic REs.
Any exceptions to the following rules are noted in the descriptions of the specific utilities that use REs.
REs Matching a Single Character
The following REs match a single character or a single collating element:
Ordinary Characters
An ordinary character is an RE that matches itself. An ordinary character is any character in the supported
character set except <newline> and the regular expression special characters listed in Special Characters
below. An ordinary character preceded by a backslash ( \ ) is treated as the ordinary character itself,
except when the character is (, ), {, or }, or one of the digits 1 through 9 (see REs Matching Multiple Characters).
Matching is based on the bit pattern used for encoding the character, not on the graphic representation of
the character.
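The rule that matching follows the encoding rather than the displayed glyph can be illustrated with a short sketch. It uses Python's re module as a stand-in, which is an assumption of this example: re is not the HP-UX RE implementation and compares Unicode code points rather than locale-encoded bytes, but it shows the same effect for ordinary characters. The names precomposed, decomposed, and matches are hypothetical.

```python
import re

# Two encodings of what is displayed as the same accented "e".
precomposed = "\u00e9"    # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # two code points: "e" plus COMBINING ACUTE ACCENT


def matches(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text."""
    return re.search(pattern, text) is not None


# Each ordinary character in the pattern matches itself in the text.
same_encoding = matches("caf" + precomposed, "caf" + precomposed)

# Same graphic representation, different encoding: no match.
different_encoding = matches("caf" + precomposed, "caf" + decomposed)

print(same_encoding, different_encoding)
```

The two strings look identical when displayed, yet only the pattern and text that share an encoding match.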
Special Characters
A regular expression special character preceded by a backslash is a regular expression that matches the
special character itself. When not preceded by a backslash, such characters have special meaning in the
specification of REs. Regular expression special characters and the contexts in which they have special
meaning are:
.[\ The period, left square bracket, and backslash are special except when used in a
bracket expression (see RE Bracket Expression).
* The asterisk is special except when used in a bracket expression, as the first character
of a regular expression, or as the first character following the character pair \( (see
REs Matching Multiple Characters).
^ The circumflex is special when used as the first character of an entire RE (see Expression
Anchoring) or as the first character of a bracket expression.
$ The dollar sign is special when used as the last character of an entire RE (see Expression
Anchoring).
delimiter Any character used to bound (i.e., delimit) an entire RE is special for that RE.
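A few of the rules above can be checked with a minimal sketch. It again assumes Python's re module as a stand-in, which is not a basic RE engine: re treats a leading asterisk as an error and handles \( and \) differently, so only the cases where both notations agree are shown. The helper name matches and the sample strings are hypothetical.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text."""
    return re.search(pattern, text) is not None


# Raw strings keep each backslash in the pattern exactly as written.
checks = {
    "unescaped period matches any character": matches(r"a.c", "abc"),
    "escaped period rejects other characters": not matches(r"a\.c", "abc"),
    "escaped period matches a period": matches(r"a\.c", "a.c"),
    "escaped asterisk matches an asterisk": matches(r"2\*3", "2*3"),
    "escaped left bracket matches a bracket": matches(r"x\[0", "x[0]"),
    "escaped backslash matches a backslash": matches(r"C:\\tmp", r"C:\tmp"),
    "leading circumflex anchors at the start": matches(r"^ab", "abc")
    and not matches(r"^ab", "cab"),
    "trailing dollar sign anchors at the end": matches(r"ab$", "cab")
    and not matches(r"ab$", "abc"),
    "escaped circumflex matches a circumflex": matches(r"\^ab", "x^ab"),
    "escaped dollar sign matches a dollar sign": matches(r"ab\$", "ab$x"),
    "period in a bracket expression is not special": matches(r"a[.]c", "a.c")
    and not matches(r"a[.]c", "abc"),
}

for description, ok in checks.items():
    print(f"{'ok' if ok else 'FAIL'}: {description}")
```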
Period
A period ( . ), when used outside of a bracket expression, is an RE that matches any printable or nonprintable
character except <newline>.
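As a brief illustration of the period rule, the sketch below tests one character at a time. It assumes Python's re module, whose default handling of the period (without the DOTALL flag) agrees with the rule stated here; it is not the HP-UX implementation. The function name period_matches is hypothetical.

```python
import re


def period_matches(ch: str) -> bool:
    """Return True if the RE "." matches the whole one-character string ch."""
    return re.fullmatch(".", ch) is not None


samples = {
    "printable letter": "a",
    "space": " ",
    "tab (nonprintable)": "\t",
    "control character 0x01 (nonprintable)": "\x01",
    "newline": "\n",
}

for label, ch in samples.items():
    print(f"{label}: {period_matches(ch)}")
```

Only the newline entry is reported as not matching.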
RE Bracket Expression
A bracket expression enclosed in square brackets ( [ ] ) is an RE that matches a single collating element
contained in the nonempty set of collating elements represented by the bracket expression.
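The point that a bracket expression matches exactly one element of its set can be sketched as follows. The example assumes Python's re module, which has no notion of multi-character collating elements, so each element here is a single character. The function name in_set is hypothetical.

```python
import re


def in_set(bracket_expression: str, text: str) -> bool:
    """Return True if the bracket expression matches all of text."""
    return re.fullmatch(bracket_expression, text) is not None


one_member = in_set("[abc]", "b")       # "b" is in the set
non_member = in_set("[abc]", "d")       # "d" is not in the set
two_members = in_set("[abc]", "ab")     # two elements, but only one is matched

# Scanning a longer string yields one element per match.
found = re.findall("[abc]", "cabbage")

print(one_member, non_member, two_members, found)
```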
The following rules apply to bracket expressions:
bracket expression
A bracket expression is either a matching list expression or a non-matching list
expression, and consists of one or more expressions in any order. Expressions can
be: collating elements, collating symbols, noncollating characters, equivalence classes,
range expressions, or character classes. The right bracket ( ] ) loses its special meaning
and represents itself in a bracket expression if it occurs first in the list (after an
initial ^, if any). Otherwise, it terminates the bracket expression (unless it is the ending
right bracket for a valid collating symbol, equivalence class, or character class, or
it is the collating element within a collating symbol or equivalence class expression).
The special characters
HP-UX 11i Version 1: September 2005 − 1 − Hewlett-Packard Company Section 5−−299