HP-UX Reference (11i v3 07/02) - 3 Library Functions N-Z (vol 7)
regcomp(3C) regcomp(3C)
Within bracket expressions: Collation ranges, character classes, and equivalence
classes are effectively expanded into equivalent lists of collating elements and
characters. Opposite-case counterparts are then generated for each collating element
or character to form the complete matching list or non-matching list for the
bracket expression. Opposite-case counterparts for a multi-character collating
element include all possible combinations of opposite-case counterparts for each
individual character comprising the collating element. These are then combined to
form new valid multi-character collating elements. For example, the opposite-case
counterparts for [.ch.] could be [.Ch.], [.cH.], and [.CH.].
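The effect of this expansion can be observed from application code. The following is a minimal sketch, assuming the paragraph above belongs to the description of the REG_ICASE cflags value (which precedes this page) and that the default "C" locale is in effect. The helper matches_with_flags() is a hypothetical name introduced only for illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of the regcomp() interface.
 * Returns 1 if text matches pattern compiled with cflags,
 * 0 if it does not match, and -1 if the pattern fails to compile. */
int matches_with_flags(const char *pattern, const char *text, int cflags)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;
    /* nmatch is 0, so no match offsets are requested. */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With REG_ICASE, the range [a-c] should also accept B, and the non-matching list [^a-c] should also reject it, since the opposite-case counterparts are added to the list before it is applied.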
The default regular expression type for pattern is Basic Regular Expression. The application can specify
Extended Regular Expressions by using the REG_EXTENDED cflags value.
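The difference between the two types is easiest to see with a character that is special in only one of them. The sketch below relies on the POSIX rule that + is an ordinary character in a basic regular expression and a repetition operator in an extended one. The helper match_as() is a hypothetical name, not part of the interface.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compiles pattern as an ERE when extended is
 * non-zero, otherwise as a BRE (the default).
 * Returns 1 on match, 0 on no match, -1 on compile failure. */
int match_as(const char *pattern, const char *text, int extended)
{
    regex_t re;
    int cflags = extended ? REG_EXTENDED : 0;
    int rc;

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Compiled as an ERE, ^a+$ accepts the string aaa. Compiled as a BRE, the same pattern accepts only the literal two-character string a+.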
If the function regcomp() succeeds, it returns zero; otherwise it returns a non-zero value indicating the
error.
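A caller would normally test this return value before using preg. The sketch below converts a non-zero value into a message with regerror(), the POSIX companion function declared in <regex.h>; it is not described in this part of the page, so its use here is an assumption about the rest of the interface. The helper compile_error() is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of the regcomp() interface.
 * Returns NULL if pattern compiles; otherwise returns a pointer to a
 * static buffer holding the regerror() text for the failure.
 * The static buffer makes this non-reentrant: a sketch only. */
const char *compile_error(const char *pattern, int cflags)
{
    static char msg[128];
    regex_t re;
    int rc = regcomp(&re, pattern, cflags);

    if (rc == 0) {
        regfree(&re);       /* only free a successfully compiled preg */
        return NULL;
    }
    regerror(rc, &re, msg, sizeof msg);
    return msg;
}
```

For example, compile_error("[a", 0) should return a non-empty message because the bracket expression is not closed, while compile_error("a*b", 0) should return NULL.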
If regcomp() succeeds, and if the REG_NOSUB flag was not set in cflags, regcomp() sets re_nsub to
the number of parenthesized subexpressions (delimited by \( and \) in basic regular expressions or
( and ) in extended regular expressions) found in pattern.
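The re_nsub member is typically read to size the pmatch array for a later regexec() call (re_nsub + 1 elements cover pmatch[0] and every subexpression). A minimal sketch, using a hypothetical helper count_subexpressions():

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns the re_nsub value that regcomp()
 * stores for pattern, or -1 if the pattern fails to compile.
 * Callers should not pass REG_NOSUB in cflags, since re_nsub is
 * only documented as set when that flag is absent. */
long count_subexpressions(const char *pattern, int cflags)
{
    regex_t re;
    long n;

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;
    n = (long)re.re_nsub;
    regfree(&re);
    return n;
}
```

Note that the delimiters depend on the expression type: (a)(b) contains two subexpressions as an extended regular expression but none as a basic one, where unescaped parentheses are ordinary characters.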
regexec() matches the null-terminated string specified by string against the compiled regular expression
preg initialized by a previous call to regcomp(). If it finds a match, regexec() returns zero; otherwise
it returns non-zero indicating either no match or an error. The eflags argument is the bitwise inclusive
OR of zero or more of the following flags:
REG_NOTBOL The first character of the string pointed to by string is not the beginning of the
line. Therefore, the circumflex character (^), when taken as a special character,
does not match at the beginning of string.
REG_NOTEOL The last character of the string pointed to by string is not the end of the line.
Therefore, the dollar sign ($), when taken as a special character, does not match
at the end of string.
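These flags are useful when string is a fragment taken from the middle of a longer line. The following sketch shows their effect on anchored patterns; match_with_eflags() is a hypothetical helper that compiles a basic regular expression and passes eflags through to regexec().

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compiles pattern as a BRE and matches it
 * against text using the given eflags.
 * Returns 1 on match, 0 on no match, -1 on compile failure. */
int match_with_eflags(const char *pattern, const char *text, int eflags)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, 0) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, eflags);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

An unanchored pattern is unaffected by either flag, so match_with_eflags("abc", "abc", REG_NOTBOL | REG_NOTEOL) still reports a match.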
If nmatch is not zero, and REG_NOSUB was not set in the cflags argument to regcomp(), then
regexec() fills in the pmatch array with byte offsets to the substrings of string that correspond to the
parenthesized subexpressions of pattern: pmatch[i].rm_so is the byte offset of the beginning of
substring i, and pmatch[i].rm_eo is the byte offset one byte past its end. (Subexpression i begins at the
ith matched left parenthesis, counting from 1.) Offsets in pmatch[0] identify the substring that corresponds
to the entire regular expression. Unused elements of pmatch are set to −1. If there are more than nmatch
subexpressions in pattern (pattern itself counts as a subexpression), regexec() still does the match, but
only records the first nmatch substrings.
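The offsets can be read back as in the following sketch. The helpers nth_match(), sub_start(), and sub_end(), and the capacity MAX_SUBS, are hypothetical names chosen for illustration; extended regular expressions are assumed for brevity.

```c
#include <regex.h>
#include <stddef.h>

#define MAX_SUBS 8   /* illustrative capacity; includes pmatch[0] */

/* Hypothetical helper: matches the ERE pattern against text and
 * copies pmatch[i] into *out. Returns 0 on success, -1 if i is out
 * of range, the pattern fails to compile, or there is no match. */
static int nth_match(const char *pattern, const char *text,
                     size_t i, regmatch_t *out)
{
    regex_t re;
    regmatch_t pm[MAX_SUBS];
    int rc;

    if (i >= MAX_SUBS)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, text, MAX_SUBS, pm, 0);
    regfree(&re);
    if (rc != 0)
        return -1;
    *out = pm[i];
    return 0;
}

/* Byte offset of the start of substring i; -2 on failure
 * (so that -1 keeps its documented meaning of "unused"). */
long sub_start(const char *pattern, const char *text, size_t i)
{
    regmatch_t m;
    return nth_match(pattern, text, i, &m) == 0 ? (long)m.rm_so : -2;
}

/* Byte offset one past the end of substring i; -2 on failure. */
long sub_end(const char *pattern, const char *text, size_t i)
{
    regmatch_t m;
    return nth_match(pattern, text, i, &m) == 0 ? (long)m.rm_eo : -2;
}
```

For the pattern ([a-z]+)=([0-9]+) and the string "  width=42;", pmatch[0] should span bytes 2 through 10, pmatch[1] bytes 2 through 7 (width), and pmatch[2] bytes 8 through 10 (42). The length of substring i is rm_eo minus rm_so. Elements beyond the two subexpressions, such as pmatch[3], are unused.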
When matching a regular expression, any given parenthesized subexpression of pattern might participate in
the match of several different substrings of string, or it might not match any substring, even though the
pattern as a whole did match. The following explains which substrings are reported in pmatch when
matching regular expressions:
1. If subexpression i in a regular expression is not contained within another subexpression, and it
participated in the match several times, the byte offsets in pmatch[i] delimit the last such
match.
2. If subexpression i is not contained within another subexpression, and it did not participate in an
otherwise successful match (because either *, ?, or | was used), then the byte offsets in
pmatch[i] are −1.
3. If subexpression i is contained in subexpression j, and a match of subexpression j is reported in
pmatch[j], the match or no-match reported in pmatch[i] is the last one that occurred within the
substring in pmatch[j].
4. If subexpression i is contained in subexpression j, and the offsets in pmatch[j] are −1, the offsets
in pmatch[i] will also be −1.
5. If subexpression i matched a zero-length string, both offsets in pmatch[i] refer to the character
immediately following the zero-length substring.
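Several of these rules can be checked with small patterns. The sketch below is illustrative only: sub_span() and NSLOTS are hypothetical names, extended regular expressions are assumed, and the cases were chosen to be simple ones (rule 3 with a repeated outer subexpression is deliberately not exercised).

```c
#include <regex.h>
#include <stddef.h>

#define NSLOTS 4   /* illustrative: pmatch[0] plus three subexpressions */

/* Hypothetical helper: returns pmatch[i] after matching the ERE
 * pattern against text. Both offsets are -2 if i is out of range,
 * the pattern fails to compile, or there is no match, so that -1
 * stays reserved for "did not participate in the match". */
regmatch_t sub_span(const char *pattern, const char *text, size_t i)
{
    regmatch_t fail;
    regmatch_t pm[NSLOTS];
    regex_t re;
    int rc;

    fail.rm_so = -2;
    fail.rm_eo = -2;
    if (i >= NSLOTS)
        return fail;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return fail;
    rc = regexec(&re, text, NSLOTS, pm, 0);
    regfree(&re);
    return rc == 0 ? pm[i] : fail;
}
```

Under rule 1, (a|b)* matched against ab should report the last iteration, the b at offsets 1 to 2, in pmatch[1]. Under rule 2, (a)|(b) matched against b should leave pmatch[1] at −1. Under rule 4, (a(b))?c matched against c should leave both pmatch[1] and pmatch[2] at −1. Under rule 5, a(b*) matched against a should set both offsets of pmatch[1] to 1.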
If REG_NOSUB was set in cflags in the call to regcomp(), and nmatch is not zero in the call to
regexec(), the content of the pmatch array is unspecified.
regfree() frees any memory allocated by regcomp() associated with preg.
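A compiled expression can be reused for any number of regexec() calls and then released once. The following sketch counts non-overlapping matches in a string to show that life cycle: one regcomp(), repeated regexec() calls, and a single regfree(). The helper count_matches() is hypothetical, and the handling of empty matches (advance by one byte) is a design choice of this sketch, not something the page prescribes.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: counts non-overlapping matches of the ERE
 * pattern in text. Returns -1 if the pattern fails to compile. */
long count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    const char *p = text;
    long count = 0;
    int eflags = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;                      /* nothing to free on failure */

    while (regexec(&re, p, 1, &m, eflags) == 0) {
        count++;
        if (m.rm_eo == m.rm_so) {       /* empty match */
            if (p[m.rm_eo] == '\0')
                break;                  /* at end of string: stop */
            p += m.rm_eo + 1;           /* step over one byte */
        } else {
            p += m.rm_eo;               /* resume after this match */
        }
        eflags = REG_NOTBOL;            /* p no longer starts the line */
    }

    regfree(&re);                       /* release regcomp() memory once */
    return count;
}
```

After regfree() returns, preg must be compiled again with regcomp() before it is passed to regexec().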
276 Hewlett-Packard Company − 2 − HP-UX 11i Version 3: February 2007