Learn SAX Tutorial
Introduction
In the SAX Zone you will learn about SAX as an alternative to the XMLDOM and how to use it in
your applications. Start reading to learn XML.
Table of Contents
Python & XML
After a brief introduction to the world of XML, this chapter looks at what Python brings to the XML world. We'll review the Python features that apply to XML, and then we'll give some specific examples of Python with XML. As a very high-level language, Python includes many powerful data structures as part of the core language and libraries. The more recent versions of Python, from 2.0 onward, include excellent support for Unicode and an impressive range of encodings, as well as an excellent (and fast!) XML parser that provides character data from XML as Unicode strings. Python's standard library also contains implementations of the industry-standard DOM and SAX interfaces for working with XML data, and additional support for alternate parsers and interfaces is available.

SAX2: Producing SAX2 Events
This chapter drills more deeply into how to produce XML events with SAX, including further customization of SAX parsers. A "SAX parser" interface usually works in a kind of "pull" mode: when a thread makes an XMLReader.parse() request, it blocks until the XML document has been fully read and processed. You can also have pure "push" mode event producers. The most common kind writes events directly to event handlers and doesn't use any kind of input abstraction to indicate the data's source; it's not parsing XML text. We discuss several types of such producers later in this chapter.

Information on the MinML SAX parser
MinML, presumably standing for minimal XML, can be downloaded from http://www.wilson.co.uk/xml/minml.htm. It is SAX 1.0 compliant, and consumes less memory than NanoXML. Read this article for more details.

Introduction to MSXML and SAX2
Traditional XML parsers parse an entire XML file before you can do anything with the contents. If you're working with large files, this can be costly both in resources and time, particularly when all you want is a single piece of data buried deep within the file. Often, you just need to parse a portion of a file or process a file from start to end, extracting information from specific nodes. The Simple API for XML, SAX, currently in version 2, takes a whole different approach. Those who like open source software will love SAX. It was developed by participants on the XML-DEV mailing list and placed in the public domain by its primary author.

Responding to SAX Events in JDOM
Using SAX events with JDOM requires:
1. Creating a class that implements org.xml.sax.ContentHandler
2. Instantiating a SAXOutputter object (passing in our ContentHandler class)
3. Passing a JDOM Document to the SAXOutputter output method

Want to get the original free SAX parser?
The SAX control is an alternative ActiveX control for use in VB or ASP which you can use to access XML documents.

Programming the SAX2 interfaces from Visual Basic
This is a wrapper for the SAX2 implementation of the May 2000 MSXML release.

Information on James Clark's XT (SAX parser)
This article contains information on doing transformations using SAX. It describes where to get XT, a Java application. The XT download also comes with sax.jar, which holds James Clark's SAX parser. Then you can use the XT transformation class, com.jclark.xsl.sax.Driver.

The Joy of SAX
- The SAX control is an alternative to using the DOM to parse an XML file. It is faster and can handle very large files, but it is read-only. Martin explains how to use the event handling in Visual Basic when using the SAX implementation from the May 2000 MSXML release.
- The code for the article is a wrapper for the SAX implementation that you can use in VB. It could serve as your standard implementation for working with the SAX control.
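
The event handling these entries describe for Visual Basic follows the same pattern in any SAX binding. As a rough illustration only, here is a minimal sketch in Python using the standard library's xml.sax module rather than MSXML. The names EventLogger and log_events are hypothetical and do not come from any of the articles listed here.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler: records each SAX event instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def startElement(self, name, attrs):
        # attrs is read-only; copy it into a plain dict for later inspection.
        self.events.append(("startElement", name, dict(attrs)))

    def characters(self, content):
        # A parser may deliver text in several chunks; skip whitespace-only ones.
        if content.strip():
            self.events.append(("characters", content))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def endDocument(self):
        self.events.append(("endDocument",))


def log_events(xml_bytes):
    """Parse xml_bytes and return the list of events the parser fired."""
    handler = EventLogger()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


if __name__ == "__main__":
    for event in log_events(b'<order id="7"><item>pen</item></order>'):
        print(event)
```

The parser never hands back a document object. Anything the application wants to keep, it has to record inside the handler as the events arrive.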
Firing SAX events from JDOM
Our previous example used a SAX parser to create a document. From our point of view it was a single event, from text file to Document object. However, SAX parsers work by raising events when different parts of an XML stream are encountered. All of that was hidden by the call to SAXBuilder.build. Sometimes, though, you want to see those SAX events.

Integration with SAX
SAX (Simple API for XML) is an event-driven approach to XML processing. In contrast to the DOM, where an XML document is viewed as a node tree (or a directed graph, if you prefer), SAX looks at an XML document as a series of events, such as Start of Document, or Start of Element. A SAX parser hooks code to these events. JDOM and SAX may be used in two ways: to create a new JDOM Document from a SAX parser, and to use a JDOM Document as an XML source to drive a SAX parser.

How is XML Used in ASP (a comparison of approaches)
SAX takes a different approach to working with an XML document. It stays true to the parser concept by not building a data structure to represent the document. Instead, an application using the SAX API responds to events, such as the start and finish of an element, generated as the parser processes the incoming stream. SAX was developed when a group of interested programmers decided not to wait for the standards bodies to develop an event-driven XML API. Although it does not have the backing of a formal body, SAX is implemented in several parsers, including those from IBM. Since SAX-implementing parsers don't attempt to build a structure to represent the complete document, they can be highly efficient in their use of system resources. This makes them useful for working with very large documents or when the consumer is interested in only a small subset of the larger document. The burden of maintaining the larger picture, however, rests with the application.

A list of parsers
This is a collection of XML toolsets, parsers and XSLT processors. Most of them are free and come with the source code under public license. Some of them work under Windows and many of them are Java based.

Connecting E-commerce Systems with XML
Connecting systems together increases efficiency, reduces erroneous data, and speeds up the business process. It has become clear that XML is the way to connect these systems together.

Having good SAX with Java
Covers the Simple API for XML, or SAX, interface. It explains why you might use it instead of the DOM, and will get you writing simple applications with SAX, as well as explaining a little bit about where it came from, and where it's going.
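
Several entries in this list (the SAX2 chapter on "push" mode producers and the JDOM articles on SAXOutputter) describe firing SAX events from data already in memory rather than from XML text. The idea can be sketched in Python with the standard library, under the assumption that the source data is a plain list of dictionaries. The names emit_records and records_to_xml and the record layout are hypothetical, chosen only for this illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def emit_records(records, handler):
    """Fire SAX events for a list of dicts; no XML text is parsed."""
    handler.startDocument()
    handler.startElement("records", AttributesImpl({}))
    for record in records:
        attrs = AttributesImpl({"id": str(record["id"])})
        handler.startElement("record", attrs)
        handler.characters(record["name"])
        handler.endElement("record")
    handler.endElement("records")
    handler.endDocument()


def records_to_xml(records):
    """Drive XMLGenerator, a stock ContentHandler that writes events out as XML."""
    out = io.StringIO()
    emit_records(records, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()


if __name__ == "__main__":
    sample = [{"id": 1, "name": "pen"}, {"id": 2, "name": "ink & paper"}]
    print(records_to_xml(sample))
```

Because emit_records only calls handler methods, any ContentHandler can be plugged in where XMLGenerator is used here, which is what makes a push-mode producer interchangeable with a parser from the handler's point of view.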