        String doubleIdentifierRE = "\\b(\\w+)\\s+\\1\\b";
        Pattern classPattern = Pattern.compile(unadornedClassRE);
        Pattern doublePattern = Pattern.compile(doubleIdentifierRE);
        Matcher classMatcher, doubleMatcher;
        int lineNumber = 0;
        try {
            BufferedReader br = new BufferedReader(new FileReader(fileName));
            String line;
            while ((line = br.readLine()) != null) {
                lineNumber++;
                classMatcher = classPattern.matcher(line);
                doubleMatcher = doublePattern.matcher(line);
                // At most one unadorned class declaration is reported per line
                if (classMatcher.find()) {
                    System.out.println("The class [" +
                            classMatcher.group(1) +
                            "] is not public");
                }
                // A line may contain several doubled words, so keep searching
                while (doubleMatcher.find()) {
                    System.out.println("The word \"" + doubleMatcher.group(1) +
                            "\" occurs twice at position " +
                            doubleMatcher.start() + " on line " +
                            lineNumber);
                }
            }
            br.close();
        } catch (IOException ioe) {
            System.out.println("IOException: " + ioe);
            ioe.printStackTrace();
        }
    }
}
The first regular expression, ^\\s*class (\\w+), searches for unadorned class keywords. It anchors the
match at the beginning of the line, allows zero or more white space characters, and then requires the
literal class followed by a single space. The group operator surrounds one or more word characters
(A–Z, a–z, 0–9, and the underscore), so the class name is captured and can be retrieved with group(1).
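To see the first expression in isolation, the following minimal sketch applies it to a few sample lines. The class name UnadornedClassDemo, the helper unadornedClassName, and the sample lines are illustrative assumptions and are not part of the program above; only the regular expression itself comes from the text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnadornedClassDemo {
    // The expression described in the text; each backslash is doubled in Java source.
    static final Pattern CLASS_PATTERN = Pattern.compile("^\\s*class (\\w+)");

    // Hypothetical helper: returns the captured class name, or null when the line
    // does not begin (after optional white space) with the bare keyword "class".
    static String unadornedClassName(String line) {
        Matcher m = CLASS_PATTERN.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(unadornedClassName("class EmptyClass {"));                 // EmptyClass
        System.out.println(unadornedClassName("    class MyArrayList extends Object {")); // MyArrayList
        System.out.println(unadornedClassName("public class Visible {"));             // null
    }
}
```

Because ^ only matches at the start of the input here (the MULTILINE flag is not set), a line that begins with a modifier such as public never matches, which is what makes this expression useful for spotting classes that lack one.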
The second regular expression, \\b(\\w+)\\s+\\1\\b, uses the word boundary meta-character (\b) to
ensure that only whole words are compared. Without it, the string public class would match on the
letter c, because the c that ends public is followed by white space and then the c that begins class. A
back reference (\1) matches the same text that the group already matched, in this case, one or more word
characters. One or more characters of white space must appear between the two words. Executing the
previous program on the preceding test Java source file gives you the following output:
The class [EmptyClass] is not public
The class [MyArrayList] is not public
The word "extends" occurs twice at position 18 on line 6
The word "test" occurs twice at position 32 on line 11
The word "code" occurs twice at position 49 on line 11
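The effect of the word boundaries on the second expression can be checked with a small side-by-side comparison. This is only a sketch: the class DoubledWordDemo, the helper repeats, and the two sample lines are assumptions made for illustration, and the variant without \b is included solely to show the false match on public class described earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DoubledWordDemo {
    // The expression from the text, and a variant with the \b anchors removed.
    static final Pattern WITH_BOUNDARIES = Pattern.compile("\\b(\\w+)\\s+\\1\\b");
    static final Pattern WITHOUT_BOUNDARIES = Pattern.compile("(\\w+)\\s+\\1");

    // Hypothetical helper: lists every repeated string as "text@startIndex".
    static List<String> repeats(Pattern pattern, String line) {
        List<String> found = new ArrayList<String>();
        Matcher m = pattern.matcher(line);
        while (m.find()) {
            found.add(m.group(1) + "@" + m.start());
        }
        return found;
    }

    public static void main(String[] args) {
        String declaration = "public class Foo";
        String doubled = "class MyArrayList extends extends ArrayList {";

        // Without \b, the "c" ending "public" pairs with the "c" starting "class".
        System.out.println(repeats(WITHOUT_BOUNDARIES, declaration)); // [c@5]
        // With \b, only whole words are compared, so nothing is reported.
        System.out.println(repeats(WITH_BOUNDARIES, declaration));    // []
        // A genuinely doubled word is still found.
        System.out.println(repeats(WITH_BOUNDARIES, doubled));        // [extends@18]
    }
}
```

The start index reported for each match is the same value the program above prints as the position, since both come from Matcher.start().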
Chapter 1: Key Java Language Features and Libraries