An excerpt from Inside XML by Steven Holzner (New Riders).
This chapter is all about using XML with Java to create standalone
programs. In fact, I'll even create a few browsers in this chapter.
Here, the programs we write will be based on the XML DOM, and I'll
use the XML for Java (XML4J) packages from IBM alphaWorks (http://www.alphaworks.ibm.com/tech/xml4j).
This well-known XML parser adheres to the W3C standards and
implements W3C DOM Level 1 (and part of Level 2). It's the most
widely used standalone XML Java parser available.
As of this writing, the current version is
3.0.1, and it's based on the Apache Xerces XML Parser Version
1.0.3.
XML Parser for Java is a validating XML parser written in 100%
pure Java. The package (com.ibm.xml.parser) contains classes and
methods for parsing, generating, manipulating, and validating XML
documents. XML Parser for Java is believed to be the most robust XML
processor currently available and conforms most closely to the XML
1.0 Recommendation.
In fact, this points out one of the problems with working with
modern XML Java parsers: they're always in a state of flux. It turns
out that the com.ibm.xml.parser package mentioned here is now
deprecated, which in Java terms means that it's obsolete (although
still supported) and scheduled to be removed in a future release.
Instead, we'll use the org.apache.xerces.parsers package, which is
the successor to com.ibm.xml.parser.
This is an occupational hazard when working with third-party
parsers, which historically have been extremely volatile. For
example, when XML was still very young, I wrote a book based largely
on the Microsoft XML Java parser, which was the only commercial-grade
Java XML parser available at that time. And just before the book
appeared on shelves, Microsoft changed its parser utterly so that
virtually none of the code in the book worked. (The Microsoft XML
Java parser is not even available as a standalone package anymore.)
That's not an uncommon experience.
On the other hand, the alphaWorks parser has been changed so that
it's now based on the W3C DOM (the package we'll be using to support
nodes and elements in code will be alphaWorks' org.w3c.dom package),
which means that things have finally become standardized. However,
the package names and the actual parsers we'll use, such as
org.apache.xerces.parsers.DOMParser in this chapter, are still
subject to change. By the time you read this, the alphaWorks packages
may well have changed, something that's beyond our control here. In
that case, you should refer to the XML for Java documentation to see
what changes you need to make to your code. Now that the W3C DOM is
available, those changes should be minimized compared to what
happened in the past.
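Because the node and element types come from the standard org.w3c.dom package, most DOM code depends on a particular parser only at the point where the document is loaded. The sketch below illustrates that split. It is not the chapter's code: to run without extra JAR files, it obtains the parser through the JDK's built-in JAXP classes (javax.xml.parsers) rather than org.apache.xerces.parsers.DOMParser, and the names DomWalkSketch, load, and countElements are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch: only load() knows which parser is in use.
class DomWalkSketch {

    // Parser-specific part. Assumption: the JDK's JAXP parser stands in
    // for the Xerces DOMParser class that the chapter uses.
    static Document load(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // Parser-independent part: touches only org.w3c.dom interfaces.
    // Counts the element nodes at or below the given node.
    static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<DOCUMENT><GREETING>Hello From XML</GREETING>"
                   + "<MESSAGE>Welcome to the wild and woolly world of XML.</MESSAGE>"
                   + "</DOCUMENT>";
        Document doc = load(xml);
        System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
        System.out.println("Element count: " + countElements(doc));
    }
}
```

If the parser class or package name changes again, only the body of load() should need to be touched; countElements() is written purely against the W3C interfaces.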
This chapter and the next one provide you with a good introduction
to the XML for Java parser. However, there's enough material here to
take up a whole book; in fact, such books have been published, as
recently as last year. (Those books are now obsolete because of
changes in the parser. Surprise!) The XML for Java packages are
extensive and come with hundreds of pages of documentation, so if you
want to pursue XML for Java programming beyond the techniques that
you see in these chapters, dig into that documentation.
We saw XML for Java in this book as early as Chapter 1, "Essential
XML," where I used an example that comes with XML for Java named
DOMWriter that lets you validate XML documents based on DTDs. In
Chapter 1, we saw this document, greeting.xml:
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and
        woolly world of XML.
    </MESSAGE>
</DOCUMENT>
I tested this document using DOMWriter like this, where you can
see that it reports validation errors:
%java dom.DOMWriter greeting.xml
greeting.xml:
[Error] greeting.xml:2:11: Element type "DOCUMENT" must be declared.
[Error] greeting.xml:3:15: Element type "GREETING" must be declared.
[Error] greeting.xml:6:14: Element type "MESSAGE" must be declared.
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and
        woolly world of XML.
    </MESSAGE>
</DOCUMENT>
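A check of this kind can be reproduced in a few lines of Java. The sketch below is a stand-in for DOMWriter, not its source: it assumes the JDK's built-in JAXP API instead of the XML for Java classes, switches on DTD validation, and collects whatever the parser reports. The names ValidationSketch, validate, and format are hypothetical. Because greeting.xml has no DTD, a validating parser should report at least one error, but the wording and number of messages depend on the parser version and may not match the output shown above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical sketch of DTD validation with an error-collecting handler.
class ValidationSketch {

    // The greeting.xml document from Chapter 1 (it has no DTD).
    static final String GREETING_XML =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<DOCUMENT>\n"
        + "    <GREETING>\n"
        + "        Hello From XML\n"
        + "    </GREETING>\n"
        + "    <MESSAGE>\n"
        + "        Welcome to the wild and\n"
        + "        woolly world of XML.\n"
        + "    </MESSAGE>\n"
        + "</DOCUMENT>\n";

    // Formats one report as "[Level] line:column: message".
    static String format(String level, SAXParseException e) {
        return "[" + level + "] " + e.getLineNumber() + ":"
             + e.getColumnNumber() + ": " + e.getMessage();
    }

    // Parses the text with validation on and returns the warnings and
    // recoverable errors that the parser reported.
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // request DTD validation
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    problems.add(format("Warning", e));
                }
                @Override
                public void error(SAXParseException e) {
                    problems.add(format("Error", e));
                }
                @Override
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // the document is not well-formed; stop parsing
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        for (String problem : validate(GREETING_XML)) {
            System.out.println(problem);
        }
    }
}
```

Validation errors are recoverable, which is why the handler records them and lets parsing continue, much as DOMWriter goes on to print the document after listing its errors.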
In this chapter, we'll build our own Java programs using XML for
Java directly, including parsing and filtering XML documents, as well
as creating standalone browsers and even a specialized graphical
browser that uses XML documents not to display text, but to display
circles.
That's one advantage of being able to create your own
programs using parsers like the ones in XML for Java: You can create
your own specialized browsers.
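The chapter's own filtering programs are not part of this excerpt. As a rough sketch of what filtering a document can look like, the example below pulls out only the elements with a given tag name and returns their text. It again assumes the JDK's JAXP parser in place of the XML for Java classes, and FilterSketch and textOf are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: filter a document down to the elements of interest.
class FilterSketch {

    // Returns the trimmed leading text of every element named tagName.
    static List<String> textOf(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList matches = doc.getElementsByTagName(tagName);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < matches.getLength(); i++) {
                // Simplification: only the first child is inspected, which is
                // enough for elements that hold a single run of text.
                Node first = matches.item(i).getFirstChild();
                if (first != null && first.getNodeType() == Node.TEXT_NODE) {
                    result.add(first.getNodeValue().trim());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<DOCUMENT><GREETING>Hello From XML</GREETING>"
                   + "<MESSAGE>Welcome to the wild and woolly world of XML.</MESSAGE>"
                   + "</DOCUMENT>";
        for (String text : textOf(xml, "GREETING")) {
            System.out.println(text);
        }
    }
}
```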