User Guide

Chapter 28: Validating Data
Special characters
Special characters act as operators in regular expressions. To match a special character as an
ordinary one, you must escape it by preceding it with a backslash. For example, use two
backslash characters (\\) to represent a single backslash character.
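
As a minimal sketch of the escaping rule, the following uses Python's re module as a stand-in engine. That choice is an assumption: the engine this guide describes may differ in details. The helper name matches_whole is hypothetical and introduced only for illustration.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# Unescaped, + is an operator meaning "one or more of the preceding character".
plus_as_operator = matches_whole(r"1+1", "111")   # True
plus_not_literal = matches_whole(r"1+1", "1+1")   # False

# Escaped, + is an ordinary character that matches only a plus sign.
plus_escaped = matches_whole(r"1\+1", "1+1")      # True

# Two backslashes in the pattern match one backslash in the text.
# (The Python literal "C:\\temp" is the 7-character string C:\temp.)
backslash_escaped = matches_whole(r"C:\\temp", "C:\\temp")  # True
```

Raw string literals (r"...") are used so that Python itself does not interpret the backslashes before the regular expression engine sees them.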
Single-character regular expressions
The following rules govern regular expressions that match a single character:
- Special characters are: + * ? . [ ^ $ ( ) { | \
- Any character that is not a special character, or that is escaped by a preceding backslash (\),
  matches itself.
- A backslash (\) followed by any special character matches the literal character itself; that is,
  the backslash escapes the special character.
- A period (.) matches any character except newline.
- A set of characters enclosed in brackets ([]) is a one-character regular expression that matches
  any one of the characters in that set. For example, “[akm]” matches an a, k, or m. If you include
  ] (closing square bracket) in the set, it must be the first character after the opening bracket.
  Otherwise, it does not work, even if you use \].
- A dash can indicate a range of characters. For example, [a-z] matches any lowercase letter.
- If the first character of a set of characters in brackets is the caret (^), the expression matches
  any character except those in the set. It does not match the empty string. For example, “[^akm]”
  matches any character except a, k, or m. The caret loses its special meaning if it is not the
  first character of the set.
- You can make regular expressions case insensitive by replacing each letter with a character set
  that contains both its uppercase and lowercase forms; for example, “[Nn][Ii][Cc][Kk]” is a
  case-insensitive pattern for the name Nick (or NICK, or nick, or even nIcK).
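
The character-set rules above can be tried out with a short sketch. It again assumes Python's re module as a stand-in engine, which agrees with the rules shown here but may differ from the engine this guide describes in other respects (for example, Python also accepts \] inside a set). All variable names are illustrative.

```python
import re

# A set matches any one of its characters.
set_hits = re.findall(r"[akm]", "make")          # ['m', 'a', 'k']

# A dash inside a set denotes a range.
range_hits = re.findall(r"[a-z]", "Ab3c")        # ['b', 'c']

# A leading caret negates the set.
negated_hits = re.findall(r"[^akm]", "make")     # ['e']

# A negated set still needs one character to match, so it
# does not match the empty string.
negated_on_empty = re.search(r"[^akm]", "")      # None

# A caret that is not first in the set is an ordinary character.
caret_hits = re.findall(r"[a^]", "x^a")          # ['^', 'a']

# A closing bracket placed first in the set is taken literally.
bracket_hits = re.findall(r"[]x]", "a]bx")       # [']', 'x']

# Case-insensitive matching built from character sets.
nick = re.compile(r"[Nn][Ii][Cc][Kk]")
nick_variants = [s for s in ["Nick", "NICK", "nick", "nIcK", "Nico"]
                 if nick.fullmatch(s)]           # 'Nico' is rejected
```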
You can use the following escape sequences to match specific characters or character classes:
Escape seq   Matches
----------   ----------------------------------------------------------------
[\b]         Backspace.
\b           A word boundary, such as a space.
\B           A nonword boundary.
\cX          The control character Ctrl-x. For example, \cv matches Ctrl-v,
             the usual control character for pasting text.
\d           A digit character [0-9].
\s           Any of the following white space characters: space, tab,
             form feed, and line feed.
\S           Any character except the white space characters matched by \s.
\t           Tab.
\v           Vertical tab.
\w           An alphanumeric character or underscore. The equivalent of
             [A-Za-z0-9_].
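
The following sketch exercises most of these escape sequences, again assuming Python's re module as a stand-in engine. Two differences are worth flagging as assumptions. First, Python has no \cX sequence, so the sketch matches Ctrl-v through its character code \x16 instead. Second, Python's \s also matches characters beyond the four listed above, such as carriage return, and \d and \w are limited to [0-9] and [A-Za-z0-9_] only when the re.ASCII flag is given.

```python
import re

digit_hits = re.findall(r"\d", "a1b22", re.ASCII)       # ['1', '2', '2']
word_hits = re.findall(r"\w", "a_1-b!", re.ASCII)       # ['a', '_', '1', 'b']

# Space, tab, form feed, and line feed all count as white space.
space_hits = re.findall(r"\s", "a b\tc\fd\ne")          # [' ', '\t', '\x0c', '\n']
non_space_hits = re.findall(r"\S", " a\tb\n")           # ['a', 'b']

tab_found = re.search(r"\t", "x\ty") is not None        # True
vtab_found = re.search(r"\v", "x\vy") is not None       # True

# \b matches a position between a word and a nonword character,
# so only the standalone occurrences of "cat" are found.
text = "cat concat cats cat."
boundary_starts = [m.start() for m in re.finditer(r"\bcat\b", text)]  # [0, 16]

# \B matches where there is no boundary: only the "cat" inside "concat".
nonboundary_starts = [m.start() for m in re.finditer(r"\Bcat", "cat concat")]  # [7]

# Inside brackets, \b means the backspace character (code 8).
backspace_found = re.search(r"[\b]", "ab\x08c") is not None  # True

# Stand-in for \cv: Ctrl-v is character code 0x16.
ctrl_v_found = re.search(r"\x16", "paste\x16here") is not None  # True
```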