Mark Wilson: I am the creator of TopXML. I am available for international and local (Australia) contracts. I am a Solution Architect/Business Analyst. I have worked in IT in several countries (NZ, Australia, South Africa, UK), building and training teams for government and very large non-governmental organizations. I am ex-Microsoft Consulting Services. I wrote the first book on Microsoft XML, XML Programming with VB and ASP, published in 2000. Most recently I have been building tools for the SEO industry. Ask me for a 37-point SEO health checkup for your website.
First posted: 03/24/2008
The javax.xml.parsers.SAXParser Class
Superclass: java.lang.Object
Since: JAXP 1.0
Public Members
Public methods
Type                              Signature
abstract Parser                   getParser()
abstract Object                   getProperty(String name)
abstract org.xml.sax.XMLReader    getXMLReader()
abstract boolean                  isNamespaceAware()
abstract boolean                  isValidating()
void                              parse(File f, DefaultHandler dh)
void                              parse(File f, HandlerBase hb)
void                              parse(InputSource is, DefaultHandler dh)
void                              parse(InputSource is, HandlerBase hb)
void                              parse(InputStream is, DefaultHandler dh)
void                              parse(InputStream is, DefaultHandler dh, String systemId)
void                              parse(InputStream is, HandlerBase hb)
void                              parse(InputStream is, HandlerBase hb, String systemId)
void                              parse(String uri, DefaultHandler dh)
void                              parse(String uri, HandlerBase hb)
abstract void                     setProperty(String name, Object value)
Overview
SAXParser is an abstract class that wraps an implementation of the
org.xml.sax.XMLReader interface. The first thing any SAX-based application
needs is an instance of a class that implements org.xml.sax.XMLReader. In
JAXP 1.0 this class wrapped the org.xml.sax.Parser interface, which has since
been replaced by org.xml.sax.XMLReader. An instance of SAXParser is obtained
from javax.xml.parsers.SAXParserFactory.newSAXParser(). Once you have one, XML
can be parsed from a variety of input sources: InputStreams, Files, URLs, and
SAX InputSources.
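As a minimal sketch of the step described above, the following obtains a SAXParser from the factory, looks at the XMLReader it wraps, and parses a small in-memory document. The class name ObtainParserSketch, the helper newParser, and the sample XML are illustrative assumptions, not part of the original article.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
public class ObtainParserSketch {

    // Asks the JAXP factory for a SAXParser; the concrete class is chosen by the runtime.
    public static SAXParser newParser(boolean namespaceAware) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        return factory.newSAXParser();
    }

    public static void main(String[] args) throws Exception {
        SAXParser parser = newParser(true);

        // The SAXParser wraps an XMLReader implementation.
        XMLReader reader = parser.getXMLReader();
        System.out.println("Wrapped XMLReader: " + reader.getClass().getName());
        System.out.println("Namespace aware:   " + parser.isNamespaceAware());
        System.out.println("Validating:        " + parser.isValidating());

        // Parse from an InputStream; a plain DefaultHandler ignores every event.
        InputStream in = new ByteArrayInputStream(
                "<greeting>hello</greeting>".getBytes(StandardCharsets.UTF_8));
        parser.parse(in, new DefaultHandler());
    }
}
```

The XMLReader class name that is printed depends on which JAXP implementation is on the classpath.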
SAX is an event-based parsing API (for more details, see the SAX
documentation).
As the underlying parser reads the content of the XML document, it calls
the methods of the supplied org.xml.sax.HandlerBase or
org.xml.sax.helpers.DefaultHandler.
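To make the event flow concrete, here is a small sketch that records the order in which the parser calls the handler methods. The class name EventOrderSketch, the events helper, and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
public class EventOrderSketch {

    // Parses the XML text and returns the handler callbacks in the order they fired.
    public static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<a><b/></a>")) {
            System.out.println(event);
        }
    }
}
```

Each callback fires while the parser is still reading the input, so the log is built up during the call to parse(), not after it.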
An implementation of SAXParser is NOT guaranteed to behave as specified
if it is used concurrently by two or more threads. It is recommended to
have one SAXParser instance per thread; otherwise it is up to the
application to coordinate access to a SAXParser shared between threads.
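One way to follow the one-instance-per-thread recommendation is to keep the parser in a ThreadLocal. This is only a sketch of that approach; the class name ParserPerThreadSketch and the countElements helper are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
public class ParserPerThreadSketch {

    // Each thread lazily creates, and then reuses, its own SAXParser.
    private static final ThreadLocal<SAXParser> PARSER = ThreadLocal.withInitial(() -> {
        try {
            return SAXParserFactory.newInstance().newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create SAXParser", e);
        }
    });

    // Counts the elements in the XML text using the calling thread's parser.
    public static int countElements(String xml) throws Exception {
        final int[] count = {0};
        PARSER.get().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        Runnable task = () -> {
            try {
                int n = countElements("<a><b/><c/></a>");
                System.out.println(Thread.currentThread().getName() + ": " + n);
            } catch (Exception e) {
                e.printStackTrace();
            }
        };
        Thread first = new Thread(task, "worker-1");
        Thread second = new Thread(task, "worker-2");
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

The handler is created per call, so no parsing state is shared between threads either.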
Method Overview
The most important methods of this class are parse(File f, DefaultHandler dh),
parse(File f, HandlerBase hb), parse(InputSource is, DefaultHandler dh), and
parse(InputStream is, HandlerBase hb).
To parse an XML document using SAX, you need a class that extends
DefaultHandler or HandlerBase. You generally override the methods provided by
DefaultHandler or HandlerBase and put your application logic in them.
1. parse(File f, DefaultHandler dh) parses the content of the specified file
as XML using the specified org.xml.sax.helpers.DefaultHandler.
2. parse(File f, HandlerBase hb) parses the content of the specified file as
XML using the specified org.xml.sax.HandlerBase. The DefaultHandler version of
this method is recommended, because the HandlerBase class has been deprecated
in SAX 2.0.
3. parse(InputSource is, DefaultHandler dh) parses the content of the given
org.xml.sax.InputSource as XML using the specified DefaultHandler.
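The DefaultHandler overloads listed above can be exercised side by side. The sketch below feeds the same document to the parser as a File, an InputStream, an InputSource, and a URI string; the class name ParseOverloadsSketch, the ElementCounter handler, and the temporary file are assumptions made for the example.

```java
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
public class ParseOverloadsSketch {

    // Minimal handler: counts how many elements start.
    static class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            count++;
        }
    }

    // Returns the element count seen via File, InputStream, InputSource and URI string.
    public static int[] parseAllWays(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        Path tmp = Files.createTempFile("sax-sketch", ".xml");
        try {
            Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));

            ElementCounter fromFile = new ElementCounter();
            parser.parse(tmp.toFile(), fromFile);                 // parse(File, DefaultHandler)

            ElementCounter fromStream = new ElementCounter();
            try (InputStream in = Files.newInputStream(tmp)) {
                parser.parse(in, fromStream);                     // parse(InputStream, DefaultHandler)
            }

            ElementCounter fromSource = new ElementCounter();
            parser.parse(new InputSource(new StringReader(xml)), fromSource); // parse(InputSource, DefaultHandler)

            ElementCounter fromUri = new ElementCounter();
            parser.parse(tmp.toUri().toString(), fromUri);        // parse(String uri, DefaultHandler)

            return new int[] {fromFile.count, fromStream.count, fromSource.count, fromUri.count};
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws Exception {
        for (int count : parseAllWays("<a><b/><c/></a>")) {
            System.out.println(count);
        }
    }
}
```

All four calls deliver the same events to the handler; only the way the input is located differs.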
Example of How to Create a SAX Parser
The first step in creating any SAX parser is to write a supporting class that
takes care of all the events the parser generates. These events are generated
during parsing, not after parsing is done, so the handler class has to be
registered with the parser before parsing starts. The handler is nothing more
than a set of callbacks that SAX defines to let us insert application code at
important events within a document’s parsing. This is one of the reasons SAX
is such a powerful interface: it allows documents to be handled sequentially,
without having to first read the entire document into memory.
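The point about sequential handling can be illustrated with a handler that keeps only a running total while the document streams past. This is a sketch under assumptions: the item element, its price attribute, and the class name StreamingTotalSketch are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
public class StreamingTotalSketch {

    // Keeps a single running total; the document itself is never stored.
    static class PriceTotalHandler extends DefaultHandler {
        double total;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            // "item" and "price" are assumed names for this example only.
            if ("item".equals(qName)) {
                String price = attributes.getValue("price");
                if (price != null) {
                    total += Double.parseDouble(price);
                }
            }
        }
    }

    // Sums the price attribute of every item element in the stream.
    public static double totalPrice(InputStream xml) throws Exception {
        PriceTotalHandler handler = new PriceTotalHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return handler.total;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><item price=\"2.5\"/><item price=\"4\"/><note/></order>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(totalPrice(in));
    }
}
```

Because the handler holds one number rather than a tree, memory use does not grow with the size of the input.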
Example to create a Handler class
In this example, I have overridden only the most commonly used methods. A
common problem when extending a SAX handler class is a mismatch in method
signatures: if a method you write does not exactly match the signature
declared in the handler class, it is an overload rather than an override, and
the parser will never call it during parsing. The other way is to implement
the org.xml.sax.ContentHandler interface directly.
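// DefaultHandlerImpl.java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class DefaultHandlerImpl extends DefaultHandler {

    public DefaultHandlerImpl() {
    }

    /**
     * Receive notification of the beginning of the document.
     * @throws SAXException
     */
    @Override
    public void startDocument() throws SAXException {
        // Write your application specific logic
    }

    /**
     * Receive notification of the end of the document.
     * @throws SAXException
     */
    @Override
    public void endDocument() throws SAXException {
        // Write your application specific logic
    }

    /**
     * Receive notification of the start of an element.
     * @param uri
     * @param localName
     * @param qName
     * @param attributes
     * @throws SAXException
     */
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes)
            throws SAXException {
        // Write your application specific logic
        System.out.println("URI :" + uri);
    }

    /**
     * Receive notification of the end of an element.
     * @param uri
     * @param localName
     * @param qName
     * @throws SAXException
     */
    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // Write your application specific logic
    }

    /**
     * Receive notification of character data inside an element.
     * @param ch
     * @param start
     * @param length
     * @throws SAXException
     */
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Write your application specific logic
    }
}

// MainClass.java
// NOTE: the listing is cut off between characters() above and the usage
// message below. The imports, the startParsing() body, and the argument
// check in main() are a minimal assumed reconstruction.
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class MainClass {

    // Assumed implementation: parse the named file with DefaultHandlerImpl.
    public void startParsing(String fileName) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new File(fileName), new DefaultHandlerImpl());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        String fileName = null;
        if (args.length != 1) {
            System.out.println("Usage is: java MainClass [xml file Name]");
            System.exit(0);
        } else {
            fileName = args[0];
        }
        MainClass mainClass = new MainClass();
        mainClass.startParsing(fileName);
    }
}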
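The signature problem mentioned earlier is easy to reproduce. In this sketch, one handler declares startElement without the Attributes parameter, so the parser never calls it, while the other matches the DefaultHandler signature exactly. The class and handler names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
public class SignatureMismatchSketch {

    static class WrongHandler extends DefaultHandler {
        int calls;

        // Missing the Attributes parameter: this is an overload, not an override,
        // so the parser keeps calling the empty method inherited from DefaultHandler.
        public void startElement(String uri, String localName, String qName) {
            calls++;
        }
    }

    static class RightHandler extends DefaultHandler {
        int calls;

        // @Override makes the compiler reject the method if the signature does not match.
        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            calls++;
        }
    }

    // Returns {calls seen by WrongHandler, calls seen by RightHandler}.
    public static int[] callCounts(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        WrongHandler wrong = new WrongHandler();
        RightHandler right = new RightHandler();
        parser.parse(new InputSource(new StringReader(xml)), wrong);
        parser.parse(new InputSource(new StringReader(xml)), right);
        return new int[] {wrong.calls, right.calls};
    }

    public static void main(String[] args) throws Exception {
        int[] counts = callCounts("<a><b/></a>");
        System.out.println("wrong signature: " + counts[0]);
        System.out.println("right signature: " + counts[1]);
    }
}
```

Marking every overridden callback with @Override turns this silent failure into a compile-time error.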
Example to create a SAX Parser from InputSource
Instead of using a full file name, the parse() method may also be invoked
with an org.xml.sax.InputSource as an argument. There is actually remarkably
little to comment on in regard to this class. It is used as a helper and
wrapper class more than anything else. An InputSource simply encapsulates
information about a single input. While this is not very helpful in our
example, in situations where a system identifier, a public identifier, and a
stream may all be tied to one URI, using an InputSource for encapsulation can
become very handy. The class has accessor and mutator methods for its system
ID and public ID, a character encoding, a byte stream (java.io.InputStream),
and a character stream (java.io.Reader). It is passed as an argument to the
parse() method. SAX also guarantees that the parser will never modify the
InputSource. This ensures that the original input to a parser is still
available unchanged after its use by a parser or XML-aware application.
While we don’t spend any further time looking at this utility class here, many
of the applications we look at later in the examples use the InputSource class
as input to SAX parsers rather than a specific URI.
The following code snippet creates an InputSource from a File.
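A minimal sketch of such a snippet, assuming the file is opened as a byte stream and its URI is used as the system ID. The class name InputSourceFromFileSketch and the parseFile helper are hypothetical names chosen for this illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
public class InputSourceFromFileSketch {

    // Wraps the file in an InputSource and hands it to the parser.
    public static void parseFile(File file, DefaultHandler handler) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try (InputStream in = new FileInputStream(file)) {
            InputSource source = new InputSource(in);
            // The system ID lets the parser resolve relative references
            // and report where an error occurred.
            source.setSystemId(file.toURI().toString());
            parser.parse(source, handler);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("Usage is: java InputSourceFromFileSketch [xml file Name]");
            return;
        }
        parseFile(new File(args[0]), new DefaultHandler());
    }
}
```

Opening the stream in a try-with-resources block keeps the application, rather than the parser, responsible for closing the file.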