❑ Mismatched encoding declaration—The character encoding used in a file and the encoding
name specified in the encoding declaration must match. The encoding declaration is the
encoding="name" part of the XML declaration, <?xml version="1.0" encoding="name"?>, which
appears at the very start of an XML document. If the encoding of the file and the declared
encoding don’t match, you may see errors about invalid characters.
❑ Forgetting to use namespace-aware methods—If you’re working with namespaces, be sure to
use the namespace-aware versions of the methods. With SAX this is fairly easy because most
people are using the SAX 2.0 ContentHandler, which has only the namespace-aware callback
methods. If you’re using the older SAX 1.0 DocumentHandler and trying to work with namespaces,
you’re in the wrong place: you need to use ContentHandler. With DOM-based parsers this is a
little harder, because the namespace-aware methods sit alongside the original ones and are
distinguished only by the letters NS appended to their names. For example,
Element#getAttributeNS is the namespace-aware version of the Element#getAttribute method.
❑ Out of memory using the DOM—Depending on the document you’re working with, you may
see out-of-memory errors if you’re using the DOM. This happens because the DOM tends to be
very memory intensive. There are several possible solutions. You can increase the size of the
Java heap. You can use the DOM in deferred mode; note that if you’re using the JAXP interfaces,
then you aren’t using the DOM in deferred mode. Finally, you can try to reduce the number of
nodes in the DOM tree by setting the feature
http://apache.org/xml/features/dom/include-ignorable-whitespace to false, so that ignorable
whitespace text nodes are left out of the tree.
❑ Using appendChild instead of importNode across DOM trees—The Xerces DOM implementa-
tion tries to enforce some integrity constraints on the contents of the DOM. One common thing
developers want to do is create a new DOM tree and then copy some nodes from another DOM
tree into it. Usually they try to do this using Node#appendChild, and then they start seeing
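exceptions like DOMException: DOM005 Wrong document, which is confusing. To copy nodes
between DOM trees you need to call Document#importNode on the destination document. It
returns a copy of the node that belongs to that document, and you can then pass the copy to
Node#appendChild (or whichever insertion method you want) to put it into its new home.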
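The namespace and importNode pitfalls above can be illustrated with a small sketch. It uses only the standard JAXP and W3C DOM interfaces. The class name DomPitfalls, the urn:example:book namespace, and the element names are all made up for illustration. The exact exception message depends on the parser version, so the sketch checks the DOMException code rather than the text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; the namespace URI and element names are assumptions.
class DomPitfalls {
    static final String BOOK_NS = "urn:example:book";
    static final String XML =
        "<b:book xmlns:b='urn:example:book' b:isbn='12345'>"
        + "<b:title>Sample</b:title></b:book>";

    static DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // JAXP factories are not namespace-aware unless asked to be.
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static Document parse(String xml) {
        try {
            return newBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates a second, independent DOM tree with a single root element.
    static Document newTarget() {
        Document target = newBuilder().newDocument();
        target.appendChild(target.createElementNS(null, "library"));
        return target;
    }

    // Tries the common mistake: appending a node owned by another document.
    static boolean appendFailsAcrossDocuments(Document source, Document target) {
        try {
            target.getDocumentElement().appendChild(source.getDocumentElement());
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.WRONG_DOCUMENT_ERR;
        }
    }

    // The fix: import a deep copy into the target document, then append the copy.
    static Node copyInto(Document source, Document target) {
        Node copy = target.importNode(source.getDocumentElement(), true);
        return target.getDocumentElement().appendChild(copy);
    }

    public static void main(String[] args) {
        Document source = parse(XML);
        Element book = source.getDocumentElement();
        // The non-namespace lookup misses the attribute; the NS version finds it.
        System.out.println("getAttribute: '" + book.getAttribute("isbn") + "'");
        System.out.println("getAttributeNS: '" + book.getAttributeNS(BOOK_NS, "isbn") + "'");

        Document target = newTarget();
        System.out.println("appendChild fails: " + appendFailsAcrossDocuments(source, target));
        Node copy = copyInto(source, target);
        System.out.println("copy owned by target: " + (copy.getOwnerDocument() == target));
    }
}
```

Note that importNode leaves the original node where it was in the source tree; only the copy ends up in the target document.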
Applications
We’ve covered a lot of ground in this chapter, and yet we’ve hardly begun. XML parsing has so many
applications that it’s hard to show all the ways you might use it in your application. Here are a couple of
ideas.
One place you end up directly interacting with the XML parser is in the kind of example we’ve been
using throughout this chapter: turning XML documents into domain-specific objects within your
application. Although there are some proposals for tools that can do it for you, this is a task where
you’ll still see developers working directly with the parser, at least for a little while longer.
Another task people use the parser for directly is filtering XML. When you have a very large
XML document and you need only part of it, using SAX to cut out the parts you don’t need is a
practical solution, because SAX streams the document instead of building it all in memory.
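As a rough sketch of the filtering idea, the class below extends the standard SAX helper XMLFilterImpl and drops every element with a given local name, along with everything nested inside it. The class name SubtreeFilter, the filter helper method, and the sample element names are hypothetical. An identity Transformer is used here only as a convenient way to write the surviving events back out as text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: suppresses all SAX events inside elements named dropLocalName.
class SubtreeFilter extends XMLFilterImpl {
    private final String dropLocalName;
    private int depth = 0; // greater than 0 while inside a dropped subtree

    SubtreeFilter(XMLReader parent, String dropLocalName) {
        super(parent);
        this.dropLocalName = dropLocalName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (depth > 0 || localName.equals(dropLocalName)) {
            depth++; // entering, or nesting deeper inside, a dropped subtree
            return;
        }
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (depth > 0) {
            depth--; // leaving one level of the dropped subtree
            return;
        }
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (depth == 0) {
            super.characters(ch, start, length);
        }
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        if (depth == 0) {
            super.ignorableWhitespace(ch, start, length);
        }
    }

    // Parses xml through the filter and serializes whatever events survive.
    static String filter(String xml, String dropLocalName) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so localName is populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            SubtreeFilter filter = new SubtreeFilter(reader, dropLocalName);

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(
                new SAXSource(filter, new InputSource(new StringReader(xml))),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<order><id>7</id><audit><user>x</user></audit>"
            + "<total>9.50</total></order>";
        System.out.println(filter(xml, "audit"));
    }
}
```

For a genuinely large document you would pass a file or stream InputSource rather than a string, and you would probably send the filtered events to your own ContentHandler instead of serializing them. The string round trip here just keeps the sketch self-contained.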