SAX vs. DOM

SAX (Simple API for XML) and DOM (Document Object Model) were both designed to let programmers access the information in XML documents without having to write a parser. By keeping your information in XML format and using either the SAX or DOM APIs, your program is free to use whatever parser it wishes. This is possible because parser writers implement the SAX and DOM APIs, so your code talks to the API rather than to a specific parser.

So SAX and DOM were created to serve the same purpose: giving you access to the information stored in XML documents. However, they take very different approaches to providing that access.

DOM

DOM gives you access to the information stored in your XML document as a hierarchical object model. DOM creates a tree of nodes (based on the structure and information in your XML document) and you can access your information by interacting with this tree of nodes. This works out really well because XML is hierarchical in nature.
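As a rough illustration of the tree-of-nodes idea, here is a minimal sketch using Python's standard `xml.dom.minidom` module rather than Delphi's TXMLDocument (Python is chosen here only for illustration). The sample document and the variable names are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, used only for illustration.
XML = (
    "<library>"
    "<book id='1'><title>XML Basics</title></book>"
    "<book id='2'><title>Delphi Notes</title></book>"
    "</library>"
)

doc = parseString(XML)        # the parser builds the whole tree in memory
root = doc.documentElement    # the <library> element node

# Walk down the tree: every <book>, then the text inside its <title>.
titles = [
    book.getElementsByTagName("title")[0].firstChild.data
    for book in root.getElementsByTagName("book")
]

# Navigation works in any direction: pick a node, then go back up to its parent.
second_book = root.getElementsByTagName("book")[1]
parent_name = second_book.parentNode.tagName

print(titles, second_book.getAttribute("id"), parent_name)
```

Because the whole tree stays in memory, the code can visit nodes in any order and revisit them as often as it likes.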

Delphi facilitates use of a DOM through its TXMLDocument component (available in the Enterprise version of Delphi 6, or in the Professional version of Delphi 7) and through the XML Data Binding wizard (available in the Enterprise versions of Delphi 6 and 7).

SAX

SAX gives you access to the information in your XML document not as a tree of nodes, but as a sequence of events. How is this useful? The answer is that SAX does not create a default Delphi object model on top of your XML document (as DOM does). This makes SAX faster, but it also requires the following:

Creation of your own custom object model;
Creation of a class that listens to SAX events and properly creates your object model.

Note that these steps are not necessary with DOM, because DOM already creates an object model for you (which represents your information as a tree of nodes).

In the case of DOM, the parser does almost everything: it reads the XML document in, creates an object model on top of it, and then gives you a reference to this object model (a Document object) so that you can manipulate it. SAX is not called the Simple API for XML for nothing; it really is simple. SAX doesn't expect the parser to do much: all it requires is that the parser read in the XML document and fire a series of events depending on what tags it encounters. You are responsible for interpreting these events by writing an XML document handler class, which makes sense of the tag events and creates objects in your own object model.
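The division of labour described above can be sketched as follows. This is a minimal example using Python's standard `xml.sax` module, not SAX for Pascal. The `Book` class, the `BookHandler` class and the sample document are all hypothetical names introduced for illustration.

```python
import xml.sax
from dataclasses import dataclass


@dataclass
class Book:
    """Hypothetical custom object model: one object per <book> element."""
    id: str
    title: str = ""


class BookHandler(xml.sax.ContentHandler):
    """Listens to SAX events and builds Book objects from them."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._current = None      # the Book being built, if any
        self._in_title = False    # state we must track ourselves

    def startElement(self, name, attrs):
        if name == "book":
            self._current = Book(id=attrs.get("id", ""))
        elif name == "title" and self._current is not None:
            self._in_title = True

    def characters(self, content):
        # Text may arrive in several chunks, so append rather than assign.
        if self._in_title:
            self._current.title += content

    def endElement(self, name):
        if name == "title":
            self._in_title = False
        elif name == "book" and self._current is not None:
            self.books.append(self._current)
            self._current = None


# Hypothetical sample document, used only for illustration.
XML = (
    b"<library>"
    b"<book id='1'><title>XML Basics</title></book>"
    b"<book id='2'><title>Delphi Notes</title></book>"
    b"</library>"
)

handler = BookHandler()
xml.sax.parseString(XML, handler)   # the parser only fires events
print(handler.books)
```

Note that the parser never builds a tree here. The only objects that exist after parsing are the ones the handler chose to create.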

What kinds of events does a SAX parser fire? They are really very simple. SAX fires an event for every open tag and every close tag. It also fires events for #PCDATA and for CDATA sections, as well as for processing instructions, DTDs, comments, etc. Your document handler (which is a listener for these events) has to interpret them in some meaningful way and create your custom object model based on them. The sequence in which the events are fired is very important to that interpretation.
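To make the event sequence concrete, the following sketch records the events as they arrive. It again assumes Python's standard `xml.sax` module as a stand-in, and the `EventLogger` class and the tiny sample document are hypothetical.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records each SAX event in the order the parser fires it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        if content.strip():                      # skip whitespace-only text
            self.events.append(("text", content))

    def processingInstruction(self, target, data):
        self.events.append(("pi", target))


logger = EventLogger()
xml.sax.parseString(
    b"<note><?app hint?><to>Ann</to><body>Hi</body></note>", logger
)

for event in logger.events:
    print(event)
```

The events arrive in document order, so a "start" for `note` comes first and its matching "end" comes last. A handler can only tell which element a piece of text belongs to by remembering which "start" events it has seen and not yet closed.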

Delphi by default does not provide any support for SAX. This is why the SAX for Pascal project was initiated.

SAX vs. DOM

Both SAX and DOM have their respective strengths and weaknesses, as the following table shows:

	SAX	DOM
Advantages	Speed; can process arbitrarily large documents due to low memory requirements	Arbitrary navigability; makes XSL-style transformations much easier
Disadvantages	No navigability; event-based programming is hard for some programmers; need to maintain state externally (makes XSL-style transformations hard to code)	Uses lots of memory, so not well suited for large documents; slow, as it needs to build an in-memory representation; tight coupling between XML document and object model

The SAX document handler you write does element-to-object mapping. So, if your information is structured in a way that makes this mapping easy to create, you should use the SAX API. On the other hand, if your data is much better represented as a tree, then you should use DOM.

Other references

xml.com contains more links to articles about the differences and advantages of SAX and DOM.
