regexp.5 (2010 09)
regexp(5) regexp(5)
NAME
regexp - regular expression and pattern matching notation definitions
DESCRIPTION
A Regular Expression is a mechanism supported by many utilities for locating and manipulating pat-
terns in text. Pattern Matching Notation is used by shells and other utilities for file name expansion.
This manual entry defines two forms of regular expressions, Basic Regular Expressions and
Extended Regular Expressions, and one form of Pattern Matching Notation.
BASIC REGULAR EXPRESSIONS
Basic regular expression (RE) notation and construction rules apply to utilities defined as using basic
REs. Any exceptions to the following rules are noted in the descriptions of the specific utilities that use
REs.
REs Matching a Single Character
The following REs match a single character or a single collating element:
Ordinary Characters
An ordinary character is an RE that matches itself. An ordinary character is any character in the sup-
ported character set except newline and the regular expression special characters listed in Special
Characters below. An ordinary character preceded by a backslash (\) is treated as the ordinary
character itself, except when the character is (, ), {, or }, or one of the digits 1 through 9 (see
REs Matching Multiple Characters). Matching is based on the bit pattern used for encoding the
character, not on the graphic representation of the character.
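The last point can be sketched with Python's re module. This is an illustration only: re is not a POSIX basic RE implementation, but it likewise compares encoded characters (code points in a Python str) rather than glyphs. The helper name found and the sample strings are assumptions made for this sketch.

```python
import re

# Two encodings of what is displayed as the same glyph.
PRECOMPOSED = "\u00e9"   # single code point U+00E9
DECOMPOSED = "e\u0301"   # "e" followed by combining acute accent U+0301

def found(pattern, text):
    """Return True if the RE matches anywhere in text."""
    return re.search(pattern, text) is not None

# An ordinary character is an RE that matches itself.
assert found("a", "cat")

# Identical encoding: the RE matches.
assert found(PRECOMPOSED, "caf\u00e9")

# Same graphic representation, different encoding: no match.
assert not found(PRECOMPOSED, "cafe\u0301")
```

Text that may arrive in more than one encoding of the same glyph generally has to be normalized before matching.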
Special Characters
A regular expression special character preceded by a backslash is a regular expression that matches the
special character itself. When not preceded by a backslash, such characters have special meaning in the
specification of REs. Regular expression special characters and the contexts in which they have special
meaning are:
. [ \      The period, left square bracket, and backslash are special except when used in a
           bracket expression (see RE Bracket Expression).
*          The asterisk is special except when used in a bracket expression, as the first
           character of a regular expression, or as the first character following the
           character pair \( (see REs Matching Multiple Characters).
^          The circumflex is special when used as the first character of an entire RE (see
           Expression Anchoring) or as the first character of a bracket expression.
$          The dollar sign is special when used as the last character of an entire RE (see
           Expression Anchoring).
delimiter  Any character used to bound (i.e., delimit) an entire RE is special for that RE.
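The effect of the backslash on special characters can be sketched as follows. The example uses Python's re module as a stand-in, on the assumption that it agrees with basic REs for the cases shown. It does not agree everywhere: for instance, re rejects an RE that begins with an asterisk, and treats ^ and $ as special in more positions. The helper first_match and the sample text are hypothetical.

```python
import re

# Hypothetical sample text containing several special characters.
TEXT = r"cost: 3.50 * [2] \ end"

def first_match(pattern, text=TEXT):
    """Return the text of the first match, or None if there is none."""
    m = re.search(pattern, text)
    return m.group() if m else None

# Preceded by a backslash, each special character matches itself.
assert first_match(r"3\.50") == "3.50"
assert first_match(r"\*") == "*"
assert first_match(r"\[2]") == "[2]"
assert first_match(r"\\") == "\\"

# Unescaped, the period is special and matches other characters too.
assert first_match(r"3.50", "3x50") == "3x50"
assert first_match(r"3\.50", "3x50") is None

# ^ as the first character and $ as the last character anchor the RE.
assert first_match(r"^cost") == "cost"
assert first_match(r"^end") is None
assert first_match(r"end$") == "end"
```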
Period
A period (.), when used outside of a bracket expression, is an RE that matches any printable or nonprint-
able character except newline.
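A minimal sketch of this behavior, again using Python's re module as an assumed stand-in for a basic RE matcher (with default flags, its period also excludes newline). The helper dot_matches is hypothetical.

```python
import re

def dot_matches(ch):
    """Return True if the period in the RE a.c matches the character ch."""
    return re.fullmatch("a.c", "a" + ch + "c") is not None

# The period matches printable and nonprintable characters alike...
for ch in ["b", " ", "\t", "\x01"]:
    assert dot_matches(ch)

# ...but not newline.
assert not dot_matches("\n")

# Inside a bracket expression the period is an ordinary character.
assert re.fullmatch("a[.]c", "a.c") is not None
assert re.fullmatch("a[.]c", "abc") is None
```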
RE Bracket Expression
A bracket expression enclosed in square brackets ([]) is an RE that matches a single collating element
contained in the nonempty set of collating elements represented by the bracket expression.
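As a rough sketch, the following shows a bracket expression matching exactly one element, together with the rule (stated in the list that follows) that a right bracket placed first in the list is literal. Python's re module is used as an assumed stand-in. It handles these simple cases the same way, but it does not support collating symbols, equivalence classes, or character classes. The helper matches_one is hypothetical.

```python
import re

def matches_one(bracket_expr, text):
    """Return True if the whole of text is matched by the bracket expression."""
    return re.fullmatch(bracket_expr, text) is not None

# A matching list matches exactly one element from the set.
assert matches_one("[abc]", "b")
assert not matches_one("[abc]", "d")
assert not matches_one("[abc]", "ab")   # two characters, one bracket expression

# A right bracket that occurs first in the list represents itself.
assert matches_one("[]ab]", "]")

# After an initial ^ the list is non-matching, and ] is still literal.
assert not matches_one("[^]ab]", "]")
assert matches_one("[^]ab]", "c")
```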
The following rules apply to bracket expressions:
bracket expression
A bracket expression is either a matching list expression or a non-matching
list expression, and consists of one or more expressions in any order. Expressions
can be: collating elements, collating symbols, noncollating characters, equivalence
classes, range expressions, or character classes. The right bracket (]) loses its special
meaning and represents itself in a bracket expression if it occurs first in the list
(after an initial ^, if any). Otherwise, it terminates the bracket expression (unless
it is the ending right bracket for a valid collating symbol, equivalence class, or char-
acter class, or it is the collating element within a collating symbol or equivalence
class expression). The special characters
HP-UX 11i Version 3: September 2010 − 1 − Hewlett-Packard Company 1