HP-UX Reference (11i v2 07/12) - 5 Miscellaneous (vol 9)
regexp(5) regexp(5)
NAME
regexp - regular expression and pattern matching notation definitions
DESCRIPTION
A Regular Expression is a mechanism supported by many utilities for locating and manipulating patterns
in text. Pattern Matching Notation is used by shells and other utilities for file name expansion. This
manpage defines two forms of regular expressions, Basic Regular Expressions and Extended Regular
Expressions, and one form of Pattern Matching Notation.
BASIC REGULAR EXPRESSIONS
Basic regular expression (RE) notation and construction rules apply to utilities defined as using basic REs.
Any exceptions to the following rules are noted in the descriptions of the specific utilities that use REs.
REs Matching a Single Character
The following REs match a single character or a single collating element:
Ordinary Characters
An ordinary character is an RE that matches itself. An ordinary character is any character in the
supported character set except newline and the regular expression special characters listed in
Special Characters below. An ordinary character preceded by a backslash (\) is treated as the
ordinary character itself, except when the character is (, ), {, or }, or the digits 1 through 9
(see REs Matching Multiple Characters). Matching is based on the bit pattern used for encoding the
character, not on the graphic representation of the character.
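The encoding rule above can be illustrated with a minimal Python sketch. Python's re module is used only as a convenient stand-in; it is not a POSIX basic RE implementation, and it compares code points rather than raw bit patterns. The helper name matches_whole and the sample strings are assumptions made for this illustration.

```python
import re

# Two characters whose glyphs look alike but whose encodings differ.
LATIN_A = "\u0041"     # LATIN CAPITAL LETTER A
CYRILLIC_A = "\u0410"  # CYRILLIC CAPITAL LETTER A


def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the entire text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


# Ordinary characters match themselves.
print(matches_whole("cat", "cat"))         # True

# Same graphic representation, different encoding: no match.
print(matches_whole(LATIN_A, CYRILLIC_A))  # False
```

The second call fails even though both characters are usually drawn with the same glyph, which is the behavior the rule describes.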
Special Characters
A regular expression special character preceded by a backslash is a regular expression that matches the
special character itself. When not preceded by a backslash, such characters have special meaning in the
specification of REs. Regular expression special characters and the contexts in which they have special
meaning are:
. [ \ The period, left square bracket, and backslash are special except when used in a
bracket expression (see RE Bracket Expression).
* The asterisk is special except when used in a bracket expression, as the first character
of a regular expression, or as the first character following the character pair \( (see
REs Matching Multiple Characters).
^ The circumflex is special when used as the first character of an entire RE (see
Expression Anchoring) or as the first character of a bracket expression.
$ The dollar sign is special when used as the last character of an entire RE (see
Expression Anchoring).
delimiter Any character used to bound (that is, delimit) an entire RE is special for that RE.
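The effect of the backslash on special characters, and the anchoring role of ^ and $, can be sketched as follows. This is an illustration only: it uses Python's re module, whose syntax agrees with basic REs for these particular cases but differs elsewhere (for example, Python rejects a leading * as an error instead of treating it as an ordinary character). The helper name found and the sample strings are assumptions.

```python
import re


def found(pattern: str, text: str) -> bool:
    """Return True if the pattern matches anywhere in text (hypothetical helper)."""
    return re.search(pattern, text) is not None


# Unescaped, the period is special; escaped, it matches only a literal period.
unescaped_dot = found(r"a.b", "axb")         # True
escaped_dot = found(r"a\.b", "axb")          # False

# Escaped *, [, \ and $ match themselves.
literal_star = found(r"2\*3", "2*3")         # True
literal_bracket = found(r"\[x", "[x")        # True
literal_backslash = found(r"a\\b", "a\\b")   # True; the text is a, backslash, b
literal_dollar = found(r"\$5", "cost $5")    # True

# ^ as the first character and $ as the last character anchor the RE.
anchored_start = found(r"^ab", "cab")        # False
anchored_end = found(r"bc$", "abc")          # True

print(unescaped_dot, escaped_dot, anchored_start, anchored_end)
```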
Period
A period (.), when used outside of a bracket expression, is an RE that matches any printable or
nonprintable character except newline.
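A short Python sketch of the period rule follows. As before, Python's re module stands in for a basic RE implementation; its default treatment of the period (any character except newline) agrees with the rule stated here. The helper name single_char_match and the sample characters are assumptions.

```python
import re


def single_char_match(pattern: str, ch: str) -> bool:
    """Return True if the pattern matches exactly the one character ch (hypothetical helper)."""
    return re.fullmatch(pattern, ch) is not None


samples = {
    "letter": "a",
    "space": " ",
    "bell (nonprintable)": "\x07",
    "newline": "\n",
}

# The period matches printable and nonprintable characters alike, but not newline.
for name, ch in samples.items():
    print(f"{name}: {single_char_match('.', ch)}")

# Inside a bracket expression the period is not special: it matches only itself.
print(single_char_match("[.]", "a"))  # False
print(single_char_match("[.]", "."))  # True
```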
RE Bracket Expression
A bracket expression enclosed in square brackets ([]) is an RE that matches a single collating element
contained in the nonempty set of collating elements represented by the bracket expression.
The following rules apply to bracket expressions:
bracket expression
A bracket expression is either a matching list expression or a nonmatching list
expression, and consists of one or more expressions in any order. Expressions can
be: collating elements, collating symbols, noncollating characters, equivalence classes,
range expressions, or character classes. The right bracket (]) loses its special meaning
and represents itself in a bracket expression if it occurs first in the list (after an
initial ^, if any). Otherwise, it terminates the bracket expression (unless it is the ending
right bracket for a valid collating symbol, equivalence class, or character class, or
it is the collating element within a collating symbol or equivalence class expression).
The special characters
HP-UX 11i Version 2: December 2007 Update − 1 − Hewlett-Packard Company 353