❑ data—A directory containing sample XML files.
❑ docs—A directory containing all the documentation.
❑ Readme.html—The jump-off point for the Xerces documentation; open it with your Web
browser.
❑ samples—A directory containing the source code for the samples.
❑ xercesImpl.jar—A jar file containing the parser implementation.
❑ xercesSamples.jar—A jar file containing the sample applications.
❑ xml-apis.jar—A jar file containing the parsing APIs (SAX, DOM, and so on).
You must include xml-apis.jar and xercesImpl.jar in your Java classpath in order to use Xerces in your
application. There are a variety of ways to accomplish this, including setting the CLASSPATH environment
variable in your DOS command window or UNIX shell window. You can also set the CLASSPATH
variable for the application server you’re using.
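One way to confirm that the classpath change took effect is to ask JAXP which parser factories it resolves at run time. The following is a minimal sketch, not part of the Xerces distribution; the class name ParserCheck is hypothetical. With xercesImpl.jar on the classpath the printed names would normally fall under the org.apache.xerces package, but the exact names depend on the Xerces and JDK versions in use.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper: reports which JAXP factory implementations are picked up.
class ParserCheck {

    // Name of the class that implements the DOM parser factory.
    static String domFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Name of the class that implements the SAX parser factory.
    static String saxFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("DOM factory: " + domFactoryClass());
        System.out.println("SAX factory: " + saxFactoryClass());
    }
}
```

If the names printed do not come from the Xerces jars you installed, the classpath is probably not being picked up by the JVM that runs your application.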
Another installation option is to make Xerces the default XML parser for your JDK installation. This
option only works for JDK 1.3 and above.
JDK 1.3 introduced an Extension Mechanism for the JDK. It works like this. The JDK installation
includes a special extensions directory where you can place jar files that contain extensions to Java. If
JAVA_HOME is the directory where your JDK has been installed, then the extensions directory is
<JAVA_HOME>\jre\lib\ext using Windows file delimiters and <JAVA_HOME>/jre/lib/ext using
UNIX file delimiters.
If you’re using JDK 1.4 or above, you should use the Endorsed Standards Override Mechanism, not the
Extension Mechanism. The JDK 1.4 Endorsed Standards Override Mechanism works like the Extension
Mechanism, but it’s specifically designed to allow incremental updates of packages specified by the JCP.
The major operational difference between the Extension Mechanism and the Endorsed Standards
Override Mechanism is that the directory name is different. The Windows directory is named
<JAVA_HOME>\jre\lib\endorsed, and the UNIX directory is named <JAVA_HOME>/jre/lib/endorsed.
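The two directory locations can be derived from JAVA_HOME in a platform-neutral way. The sketch below is illustrative only; the class name JdkDirs and its helper method are hypothetical, and it assumes a JDK laid out as described above, with a jre/lib directory under JAVA_HOME.

```java
import java.io.File;

// Hypothetical helper: builds the extensions and endorsed directory paths.
class JdkDirs {

    // Returns <javaHome>/jre/lib/<name> using the platform's file separator.
    static File libDir(String javaHome, String name) {
        File jre = new File(javaHome, "jre");
        File lib = new File(jre, "lib");
        return new File(lib, name);
    }

    public static void main(String[] args) {
        // Take JAVA_HOME from the first argument, else from the environment.
        String javaHome = args.length > 0 ? args[0] : System.getenv("JAVA_HOME");
        if (javaHome == null) {
            System.err.println("JAVA_HOME is not set; pass the JDK directory as an argument.");
            return;
        }
        for (String name : new String[] {"ext", "endorsed"}) {
            File dir = libDir(javaHome, name);
            // The endorsed directory may not exist until you create it.
            System.out.println(dir + (dir.isDirectory() ? "" : " (not present)"));
        }
    }
}
```

Copying xml-apis.jar and xercesImpl.jar into whichever directory matches your JDK version is then an ordinary file copy.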
Development Techniques
Now that you have Xerces installed, let’s look at some techniques for getting the most out of Xerces and
XML. We’re going to start by looking at how to set the Xerces configuration through the use of features
and properties. We’ll look at the Deferred DOM, which uses lazy evaluation to improve the memory
usage of DOM trees in certain usage scenarios. Two sections follow, one on handling schemas and
grammars and one on handling entities. These are followed by a section on serialization, which is the job of
producing XML as opposed to consuming it. We’ll finish up by examining how the Xerces Native
Interface (XNI) gives us access to capabilities that are not available through SAX or DOM.
Xerces Configuration
The first place we’ll stop is the Xerces configuration mechanism. There are a variety of configuration
settings for Xerces, so you’ll need to be able to turn these settings on and off.
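As a rough illustration of turning a setting on and off, the sketch below toggles two features on a SAX XMLReader through the standard setFeature and getFeature methods. It makes some assumptions: the reader is obtained through JAXP, which may differ from how a given application creates its parser; the two feature identifiers are the standard SAX2 ones rather than Xerces-specific ones; and the class name FeatureToggle and its helpers are hypothetical.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper: shows features being switched on and off on a SAX parser.
class FeatureToggle {

    // Standard SAX2 feature identifiers (URIs used purely as names).
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    // Creates an XMLReader via JAXP (an assumption of this sketch).
    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create a SAX parser", e);
        }
    }

    // Sets a feature, then reads it back so the caller can see the effective value.
    static boolean setAndRead(XMLReader reader, String featureId, boolean value) {
        try {
            reader.setFeature(featureId, value);
            return reader.getFeature(featureId);
        } catch (SAXException e) {
            // Thrown when the parser does not recognize or support the feature.
            throw new IllegalStateException("Parser rejected feature " + featureId, e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        System.out.println("namespaces: " + setAndRead(reader, NAMESPACES, true));
        System.out.println("validation: " + setAndRead(reader, VALIDATION, false));
    }
}
```

Features are boolean switches identified by a URI. A parser that does not know a given identifier reports that through an exception rather than silently ignoring it, which is why the helper reads the value back and surfaces failures.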