HP-UX Reference (11i v1 05/09) - 3 Library Functions N-Z (vol 7)


re_comp(3X) re_comp(3X)

(TO BE OBSOLETED)

NAME

re_comp(), re_exec() - compile and execute regular expressions

SYNOPSIS

#include <re_comp.h>

char *re_comp(const char *string);

int re_exec(const char *string);

DESCRIPTION

The re_comp() function converts a regular expression string (RE) into an internal form suitable for pattern matching. The re_exec() function compares the string pointed to by the string argument with the last regular expression passed to re_comp().

If re_comp() is called with a null pointer argument, the current regular expression remains unchanged.

Strings passed to both re_comp() and re_exec() must be terminated by a null byte, and may include newline characters.
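The calling pattern described above (one retained compiled RE, replaced by each successful compile and left alone by a null pointer) can be sketched with the POSIX regcomp() and regexec() functions. This is only an illustration under stated assumptions, not the HP-UX implementation: the names sketch_re_comp() and sketch_re_exec() are hypothetical, and the return values (NULL on success or an error message for the compile step; 1, 0, or -1 for the match step) follow the usual convention for these functions rather than anything stated in the text above.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-ins for re_comp()/re_exec(), built on POSIX regcomp().
 * Two slots are kept so a failed compile never disturbs the current RE. */
static regex_t slots[2];
static int current = -1;            /* index of the current RE, -1 if none */

/* Assumed convention: NULL on success, an error message otherwise. */
const char *sketch_re_comp(const char *string)
{
    int next;

    if (string == NULL)             /* null pointer: current RE is unchanged */
        return current >= 0 ? NULL : "no previous regular expression";

    next = (current == 0) ? 1 : 0;
    /* No REG_EXTENDED flag, so the pattern is a POSIX basic RE. */
    if (regcomp(&slots[next], string, REG_NOSUB) != 0)
        return "invalid regular expression";

    if (current >= 0)
        regfree(&slots[current]);   /* discard the RE being replaced */
    current = next;
    return NULL;
}

/* Assumed convention: 1 on match, 0 on no match, -1 if nothing is compiled. */
int sketch_re_exec(const char *string)
{
    if (current < 0)
        return -1;
    return regexec(&slots[current], string, 0, NULL, 0) == 0 ? 1 : 0;
}
```

With this sketch, sketch_re_comp("a.c") followed by sketch_re_exec("xxabcxx") yields 1, and a later sketch_re_comp(NULL) leaves "a.c" in effect for subsequent sketch_re_exec() calls.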

The re_comp() and re_exec() functions support simple regular expressions, which are defined below.

The following one-character REs match a single character:

1.1 An ordinary character (not one of those discussed in 1.2 below) is a one-character RE that matches itself.

1.2 A backslash (\) followed by any special character is a one-character RE that matches the special character itself. The special characters are:

a. ., *, [, and \ (period, asterisk, left square bracket, and backslash, respectively), which are always special, except when they appear within square brackets ([]; see 1.4 below).

b. ^ (caret or circumflex), which is special at the beginning of an entire RE (see 3.1 and 3.2 below), or when it immediately follows the left bracket of a pair of square brackets ([]) (see 1.4 below).

c. $ (dollar symbol), which is special at the end of an entire RE (see 3.2 below).

d. The character used to bound (delimit) an entire RE, which is special for that RE.

1.3 A period (.) is a one-character RE that matches any character except new-line.

1.4 A non-empty string of characters enclosed in square brackets ([]) is a one-character RE that matches any one character in that string. If, however, the first character of the string is a circumflex (^), the one-character RE matches any character except new-line and the remaining characters in the string. The ^ has this special meaning only if it occurs first in the string. The minus (-) may be used to indicate a range of consecutive ASCII characters; for example, [0-9] is equivalent to [0123456789]. The - loses this special meaning if it occurs first (after an initial ^, if any) or last in the string. The right square bracket (]) does not terminate such a string when it is the first character within it (after an initial ^, if any); for example, []a-f] matches either a right square bracket (]) or one of the letters a through f inclusive. The four characters listed in 1.2.a above stand for themselves within such a string of characters.
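Rules 1.1 through 1.4 can be tried out with the sketch below. It is an assumption-laden illustration: it uses the POSIX regcomp() interface in basic-RE mode as a stand-in for re_comp(), and the helper name bre_matches() is hypothetical. REG_NEWLINE is passed so that the period and a [^...] list do not match new-line, as described above; that flag also changes how ^ and $ anchor, which is not exercised here.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the basic RE `re` matches somewhere
 * in `text`, 0 if it does not, and -1 if `re` fails to compile. */
int bre_matches(const char *re, const char *text)
{
    regex_t compiled;
    int rc;

    /* REG_NOSUB: only match/no-match is wanted.
     * REG_NEWLINE: '.' and [^...] do not match a new-line. */
    if (regcomp(&compiled, re, REG_NOSUB | REG_NEWLINE) != 0)
        return -1;
    rc = regexec(&compiled, text, 0, NULL, 0);
    regfree(&compiled);
    return rc == 0 ? 1 : 0;
}
```

Under these assumptions (note that each backslash is doubled in a C string literal):

bre_matches("a", "cat") returns 1 (rule 1.1).
bre_matches("\\.", "a.b") returns 1 and bre_matches("\\.", "ab") returns 0 (rule 1.2).
bre_matches("a.c", "abc") returns 1 and bre_matches("a.c", "a\nc") returns 0 (rule 1.3).
bre_matches("[^0-9]", "123") returns 0 and bre_matches("[^0-9]", "12a") returns 1 (rule 1.4).
bre_matches("[]a-f]", "]") returns 1 and bre_matches("[]a-f]", "xyz") returns 0 (rule 1.4).
bre_matches("[*.]", "a*b") returns 1 and bre_matches("[*.]", "abc") returns 0 (special characters stand for themselves inside brackets).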

The following rules may be used to construct REs from one-character REs:

2.1 A one-character RE is a RE that matches whatever the one-character RE matches.

2.2 A one-character RE followed by an asterisk (*) is a RE that matches zero or more occurrences of the one-character RE. If there is any choice, the longest leftmost string that permits a match is chosen.

2.3 A one-character RE followed by \{m\}, \{m,\}, or \{m,n\} is a RE that matches a range of occurrences of the one-character RE. The values of m and n must be non-negative integers less than 256; \{m\} matches exactly m occurrences; \{m,\} matches at least m occurrences; \{m,n\} matches any number of occurrences between m and n inclusive. Whenever a choice exists, the RE matches as many occurrences as possible.

2.4 The concatenation of REs is a RE that matches the concatenation of the strings matched by each component of the RE.
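The repetition and concatenation rules 2.2 through 2.4, including the longest-leftmost choice, can be observed by asking for the position of the match. The sketch below is an illustration only: it relies on POSIX regcomp() in basic-RE mode as a stand-in, because re_exec() itself reports only whether a match exists, and the helper name bre_find() is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: finds the first match of basic RE `re` in `text`.
 * On a match, stores its start offset and length and returns 1;
 * returns 0 for no match and -1 if `re` does not compile. */
int bre_find(const char *re, const char *text, size_t *start, size_t *len)
{
    regex_t compiled;
    regmatch_t m[1];
    int rc;

    if (regcomp(&compiled, re, 0) != 0)   /* flags 0: basic RE syntax */
        return -1;
    rc = regexec(&compiled, text, 1, m, 0);
    regfree(&compiled);
    if (rc != 0)
        return 0;

    *start = (size_t)m[0].rm_so;
    *len   = (size_t)(m[0].rm_eo - m[0].rm_so);
    return 1;
}
```

Under these assumptions:

"ab*" against "xabbbc" matches at offset 1 with length 4 (rule 2.2, as many occurrences as possible).
"b*" against "abb" matches at offset 0 with length 0 (rule 2.2, zero occurrences; the leftmost match wins).
"a\\{2\\}" against "caaab" matches at offset 1 with length 2 (rule 2.3).
"a\\{2,\\}" against "caaab" matches at offset 1 with length 3 (rule 2.3).
"a\\{1,2\\}" against "caaab" matches at offset 1 with length 2 (rule 2.3).
"[0-9]\\{3\\}-[0-9]\\{4\\}" against "tel 555-1234" matches at offset 4 with length 8 (rule 2.4).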

HP-UX 11i Version 1: September 2005 − 1 − Hewlett-Packard Company Section 3−−783