The Birth of SAX

Before SAX, almost every XML parser offered its own interface, so applications were built to use specific parsers. The interfaces were low-level and generally similar in structure; the differences were mostly in the details. When a new parser became available, an application had to be modified extensively to work with its interface before it could take advantage of that parser, even though the fundamental structure was essentially unchanged.

As is so often the case, the solution lay in introducing another layer of indirection. A group of XML developers using Java, led by David Megginson on the XML-DEV mailing list, defined a set of Java interfaces that allowed an application to work with any parser. The only requirement was that there be a driver for the new API for each parser. The driver was a class that used the parser-specific interface to make calls back to the application using the new, general interface. The application would create handler objects that implemented methods the driver would use to call back to the application. When Megginson released the specification, he also released a set of drivers for many of the more popular Java XML parsers. The initial specification supported the XML 1.0 recommendation, but not any of the more complex layers that have been built on top of it; the initiatives to create those were largely in their infancy at the time. The group of developers called the new API the “Simple API for XML,” or SAX, because it was actually simpler than most of the parser-specific interfaces it was designed to abstract away.
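The handler-and-callback arrangement described above can be sketched in Python. This is only an illustration, assuming the `xml.sax` package from the current Python 3 standard library rather than the original Java interfaces; the `ElementCounter` class and the `count_elements` function are hypothetical names invented for the example.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler object: the parser driver calls its methods back."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Invoked by the driver each time an element start tag is parsed.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(document):
    """Parse a bytes document with whatever parser xml.sax selects."""
    handler = ElementCounter()
    # The application never touches the parser-specific interface;
    # it only supplies the handler that receives the callbacks.
    xml.sax.parseString(document, handler)
    return handler.counts


COUNTS = count_elements(b"<catalog><book/><book/><note/></catalog>")
```

The application code here depends only on the general handler interface, which is the point of the extra layer of indirection: the underlying parser could be swapped without changing `ElementCounter`.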

The new API was widely received as a major step forward for application writers—it was easy to use, allowed the use of arbitrary parsers with an application, and was carefully defined before any other common APIs were available. Java programmers became extremely happy as the stress levels dropped in their professional lives. Developers in other languages adapted the specification in ways that allowed SAX to remain an identifiable API even as it was made to work with the native conventions used in those languages. Python programmers in the XML-SIG, led by Lars Marius Garshol, created an adaptation of the API and implemented drivers for several parsers. This implementation was accepted as part of the PyXML package.

The W3C then released the Namespaces recommendation. This recommendation changed the very concept of what constituted a name. While there was great debate over the value of the new recommendation, most people recognized that it did solve real problems and that it was here to stay. No one wanted to go back to chasing incompatible APIs, so the SAX developers quickly dug in and worked on a version of SAX that could support Namespaces. The revised API is known as SAX2. It is interesting to note that some of the first implementations of Namespaces were filters written as SAX handlers; the SAX events were used to drive the SAX2 handlers, with a little bit of processing in the middle to add the Namespaces support. Information on the Java version of SAX2 and links to additional SAX resources can be found at the SAX home page at
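To make the change in "what constituted a name" concrete, here is a small sketch, again assuming Python 3's standard `xml.sax` package. With the namespaces feature switched on, the handler receives each element name as a (namespace URI, local name) pair instead of a plain string. The `NamespaceCollector` class, the `collect_names` function, and the `urn:example:a` namespace are hypothetical and exist only for this illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceCollector(ContentHandler):
    """Hypothetical handler that records namespace-aware element names."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # In namespace mode, name is a (uri, localname) tuple.
        uri, localname = name
        self.names.append((uri, localname))


def collect_names(document):
    """Parse a bytes document with namespace processing enabled."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = NamespaceCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(document))
    return handler.names


NAMES = collect_names(
    b'<a:doc xmlns:a="urn:example:a"><a:item/><plain/></a:doc>'
)
```

Elements outside any namespace are reported with `None` in place of the URI, so a handler can tell a qualified name from an unqualified one.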

Python developers rapidly adopted the SAX2 interface, taking the opportunity to clean up some warts in the early mapping of SAX from the Java-based specification. The SAX2 API soon became part of PyXML and was adopted for use in the Python standard library. When Python programmers speak of the SAX API, they are generally referring to the second version.
