User Guide

Validating form data with regular expressions 673
Multicharacter regular expressions
Use the following rules to build a multicharacter regular expression:
• Parentheses group parts of a regular expression together into a subexpression that can be
  treated as a single unit. For example, “(ha)+” matches one or more instances of ha.
• A one-character regular expression or grouped subexpression followed by an asterisk (*)
  matches zero or more occurrences of the regular expression. For example, “[a-z]*” matches
  zero or more lowercase letters.
• A one-character regular expression or grouped subexpression followed by a plus sign (+)
  matches one or more occurrences of the regular expression. For example, “[a-z]+” matches
  one or more lowercase letters.
• A one-character regular expression or grouped subexpression followed by a question mark (?)
  matches zero or one occurrence of the regular expression. For example, “xy?z” matches either
  xyz or xz.
• The caret (^) at the beginning of a regular expression matches the beginning of the field.
• The dollar sign ($) at the end of a regular expression matches the end of the field.
• The concatenation of regular expressions creates a regular expression that matches the
  corresponding concatenation of strings. For example, “[A-Z][a-z]*” matches any capitalized
  word.
• The OR character (|) allows a choice between two regular expressions. For example,
  “jell(y|ies)” matches either jelly or jellies.
• Braces ({}) indicate a range of occurrences of a regular expression. You use them in the form
  “{m,n}”, where m is an integer greater than or equal to zero that indicates the start of the
  range, and n is an integer greater than or equal to m that indicates the end of the range.
  For example, “(ba){0,3}” matches up to three consecutive occurrences of ba. The form “{m,}”
  requires at least m occurrences of the preceding regular expression. The form “{m}” requires
  exactly m occurrences of the preceding regular expression. The form “{,n}” is not allowed.
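The rules above can be tried out with a short script. The following is only a sketch: it uses Python's re module as a stand-in engine, which is an assumption and not the engine this guide describes, so behavior may differ in details (the “{,n}” form, for instance, is deliberately not exercised). The helper names matches_whole and check_cases are hypothetical and exist only for this illustration.

```python
import re

# (pattern, input, expected) triples; the patterns are the examples from the rules above.
CASES = [
    (r"(ha)+", "hahaha", True),         # grouping plus +: one or more "ha"
    (r"(ha)+", "", False),
    (r"[a-z]*", "", True),              # *: zero occurrences are allowed
    (r"[a-z]+", "", False),             # +: at least one occurrence is required
    (r"[a-z]+", "abc", True),
    (r"xy?z", "xz", True),              # ?: zero or one "y"
    (r"xy?z", "xyz", True),
    (r"xy?z", "xyyz", False),
    (r"[A-Z][a-z]*", "Word", True),     # concatenation: capitalized word
    (r"[A-Z][a-z]*", "word", False),
    (r"jell(y|ies)", "jelly", True),    # |: choice between alternatives
    (r"jell(y|ies)", "jellies", True),
    (r"jell(y|ies)", "jello", False),
    (r"(ba){0,3}", "bababa", True),     # {m,n}: between m and n occurrences
    (r"(ba){0,3}", "babababa", False),
    (r"(ba){2,}", "ba", False),         # {m,}: at least m occurrences
    (r"(ba){2,}", "bababa", True),
    (r"(ba){2}", "baba", True),         # {m}: exactly m occurrences
    (r"(ba){2}", "bababa", False),
]


def matches_whole(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


def check_cases(cases=CASES):
    """Return the (pattern, input) pairs whose result differs from the expectation."""
    return [(p, t) for p, t, expected in cases if matches_whole(p, t) != expected]


if __name__ == "__main__":
    print("mismatches:", check_cases())
    # Anchors: without ^ and $, a pattern may match anywhere inside the field.
    print(bool(re.search(r"[a-z]+", "123abc")))    # unanchored search
    print(bool(re.search(r"^[a-z]+$", "123abc")))  # anchored to the whole field
```

Matching the whole input with re.fullmatch plays the same role as wrapping a pattern in ^ and $, which is usually what form validation needs: the entire field must conform, not just a fragment of it.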
Escape seq   Meaning
\D           Any character except a digit.
\f           Form feed.
\n           Line feed.
\r           Carriage return.
\W           Any character not matched by \w. The equivalent of [^A-Za-z0-9_].
\n           Backreference to the nth expression in parentheses (here n is a number).
             See “Backreferences” on page 674.
\ooctal      The character represented in the ASCII character table by the specified
             octal number.
\xhex        The character represented in the ASCII character table by the specified
             hexadecimal number.
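As a rough illustration of the escape sequences in the table, the sketch below again uses Python's re module as a stand-in engine (an assumption; it is not the engine this guide describes). The re.ASCII flag is passed so that \W behaves like [^A-Za-z0-9_], as the table states. The octal form is left out because its spelling varies between engines. The helper name whole and the EXAMPLES dictionary are hypothetical.

```python
import re


def whole(pattern, text, flags=0):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text, flags) is not None


# Each entry pairs a short label with the result of one whole-string match.
EXAMPLES = {
    "non_digit": whole(r"\D+", "abc"),                        # \D: no digits present
    "non_digit_rejects_digit": whole(r"\D+", "a1c"),          # the "1" breaks the match
    "non_word": whole(r"\W", "!", re.ASCII),                  # \W: not a letter, digit, or _
    "non_word_rejects_underscore": whole(r"\W", "_", re.ASCII),
    "carriage_return_line_feed": whole(r"\r\n", "\r\n"),      # \r followed by \n
    "form_feed": whole(r"\f", "\f"),                          # \f: form feed
    "backreference": whole(r"(ab)\1", "abab"),                # \1 repeats what group 1 matched
    "backreference_rejects": whole(r"(ab)\1", "abba"),
    "hex": whole(r"\x41", "A"),                               # hexadecimal 41 is "A" in ASCII
}

if __name__ == "__main__":
    for label, result in EXAMPLES.items():
        print(f"{label}: {result}")
```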