You use XSLT to manipulate documents, changing and working with
their markup as you want. One of the most common transformations is
from XML documents to HTML documents, and that's the kind of
transformation we'll see in the examples in this chapter.
To create an XSLT transformation, you need two documents: the
document to transform and the style sheet that specifies the
transformation. Both documents are well-formed XML documents.
Here's an example: the document planets.xml is a well-formed
XML document that holds data about three planets: Mercury, Venus, and
Earth. Throughout this chapter, I'll transform this document to HTML
in various ways. For programs that understand it, you can use the
<?xml-stylesheet?> processing instruction to indicate which XSLT
style sheet to use. You set the type attribute to "text/xml"
and the href attribute to the URI of the XSLT style sheet, such as
planets.xsl in this example (XSLT style sheets usually have the
extension .xsl).
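To make the role of the processing instruction concrete, here is a minimal Java sketch of how a program might read its type and href values. It assumes the standard DOM parser bundled with the JDK; the class name StylesheetPiReader, the sample document, and its PLANETS/PLANET/NAME element names are illustrative assumptions, not taken from the original listing.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of XT, LotusXSL, or any other processor named here.
final class StylesheetPiReader {

    // Assumed sample document; the element names are illustrative only.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<?xml-stylesheet type=\"text/xml\" href=\"planets.xsl\"?>\n"
        + "<PLANETS><PLANET><NAME>Mercury</NAME></PLANET></PLANETS>";

    // Matches pseudo-attributes such as type="text/xml" in the instruction's data.
    private static final Pattern PSEUDO_ATTR =
        Pattern.compile("([A-Za-z][\\w-]*)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Returns the pseudo-attributes of the first xml-stylesheet instruction,
    // or an empty map if the document has none.
    static Map<String, String> readStylesheetPi(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        // Processing instructions in the prolog are direct children of the document node.
        for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.PROCESSING_INSTRUCTION_NODE) {
                continue;
            }
            ProcessingInstruction pi = (ProcessingInstruction) n;
            if ("xml-stylesheet".equals(pi.getTarget())) {
                Map<String, String> attrs = new LinkedHashMap<>();
                Matcher m = PSEUDO_ATTR.matcher(pi.getData());
                while (m.find()) {
                    // Group 3 holds a double-quoted value, group 4 a single-quoted one.
                    attrs.put(m.group(1), m.group(3) != null ? m.group(3) : m.group(4));
                }
                return attrs;
            }
        }
        return new LinkedHashMap<>();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readStylesheetPi(SAMPLE));
    }
}
```

A real processor would go on to resolve the href value against the document's own location before loading the style sheet; this sketch stops at reading the values.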
Here's what the style sheet planets.xsl might look like. In this
case, I'm converting planets.xml into HTML, stripping out the names
of the planets, and surrounding those names with HTML <P>
elements:
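The style sheet listing itself is not reproduced here, so what follows is only a sketch of what such a style sheet could look like, wrapped in a small Java program so that it can be run as is. It assumes a PLANETS root element with PLANET children that each hold a NAME element (only the three planet names come from the text), and it uses the standard JAXP API (javax.xml.transform) from the JDK rather than any of the processors discussed in this chapter. The class name PlanetsSketch is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the document structure below is an assumption.
final class PlanetsSketch {

    // Assumed shape of planets.xml: one NAME per PLANET.
    static final String PLANETS_XML =
        "<?xml version=\"1.0\"?>"
        + "<PLANETS>"
        + "<PLANET><NAME>Mercury</NAME></PLANET>"
        + "<PLANET><NAME>Venus</NAME></PLANET>"
        + "<PLANET><NAME>Earth</NAME></PLANET>"
        + "</PLANETS>";

    // Assumed shape of planets.xsl: wrap everything in <HTML>,
    // then emit each planet's NAME inside a <P> element.
    static final String PLANETS_XSL =
        "<?xml version=\"1.0\"?>"
        + "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\"/>"
        + "<xsl:template match=\"PLANETS\">"
        + "<HTML><xsl:apply-templates/></HTML>"
        + "</xsl:template>"
        + "<xsl:template match=\"PLANET\">"
        + "<P><xsl:value-of select=\"NAME\"/></P>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the style sheet text to the document text and returns the result.
    static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(PLANETS_XML, PLANETS_XSL));
    }
}
```

The two templates do all the work: the first matches the assumed root element and writes the enclosing HTML element, and the second matches each planet and writes only its name.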
All right, we have an XML document and the style sheet we'll use
to transform it. So, how exactly do you transform the document?
Making a Transformation Happen
You can transform documents in three ways:
- In the server. A server program, such as a Java
servlet, can use a style sheet to transform a document automatically
and serve it to the client. One such example is the XML Enabler,
which is a servlet that you'll find at the XML for Java Web site,
www.alphaworks.ibm.com/tech/xml4j.
- In the client. A client program, such as a
browser, can perform the transformation, reading in the style sheet
that you specify with the <?xml-stylesheet?> processing
instruction. Internet Explorer can handle transformations this way to
some extent.
- With a separate program. Several standalone
programs, usually based on Java, will perform XSLT transformations.
I'll use these programs primarily in this chapter.
In this chapter, I'll use standalone programs to perform
transformations because those programs offer by far the most complete
implementations of XSLT. I'll also take a look at using XSLT in
Internet Explorer.
Two popular programs will perform XSLT transformations: XT and XML
for Java.
James Clark's XT
You can get James Clark's XT at www.jclark.com/xml/xt.html.
Besides XT itself, you'll also need a SAX-compliant XML parser, such
as the one we used in the previous chapter that comes with the XML
for Java packages, or James Clark's own XP parser, which you can get
at www.jclark.com/xml/xp/index.html.
XT is a Java application. Included in the XT download is the JAR
file you'll need, xt.jar. The XT download also comes with sax.jar,
which holds James Clark's SAX parser. You can also use the XML for
Java parser with XT; to do that, you must include both xt.jar and
xerces.jar in your CLASSPATH, something like this:
Then you can use the XT transformation class,
com.jclark.xsl.sax.Driver. You supply the name of the SAX parser you
want to use, such as the XML for Java class
org.apache.xerces.parsers.SAXParser, by setting the
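com.jclark.xsl.sax.parser system property with the java -D switch. Here's
how I use XT to transform planets.xml, using planets.xsl, into
planets.html:
%java -Dcom.jclark.xsl.sax.parser=org.apache.xerces.parsers.SAXParser com.jclark.xsl.sax.Driver planets.xml planets.xsl planets.html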
XT is also packaged as a Win32 executable, xt.exe. To use xt.exe,
however, you need the Microsoft Java Virtual Machine (VM) installed
(it is included with Internet Explorer). Here's an example in Windows
that performs the same transformation as the previous command:
C:\>xt planets.xml planets.xsl planets.html
XML for Java
You can also use the IBM alphaWorks XML for Java XSLT package,
called LotusXSL. LotusXSL implements an XSLT processor in Java that
can be used from the command line, in an applet or a servlet, or as a
module in another program. By default, it uses the XML4J XML parser,
but it can interface to any XML parser that conforms to either
the DOM or the SAX specification.
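To illustrate what it means for a processor to accept input from either a DOM or a SAX parser, here is a small sketch. It does not use LotusXSL; it assumes the standard JAXP API from the JDK, which draws the same distinction through its DOMSource and SAXSource classes. The class name DomOrSaxInput and the one-planet sample document are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo: one style sheet, fed from a DOM tree and from a SAX stream.
final class DomOrSaxInput {

    // Assumed sample document and style sheet, kept small for illustration.
    static final String XML =
        "<PLANETS><PLANET><NAME>Venus</NAME></PLANET></PLANETS>";

    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\"/>"
        + "<xsl:template match=\"PLANETS\">"
        + "<HTML><xsl:apply-templates/></HTML>"
        + "</xsl:template>"
        + "<xsl:template match=\"PLANET\">"
        + "<P><xsl:value-of select=\"NAME\"/></P>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Input that has already been parsed into a DOM tree.
    static Source fromDom() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XSLT processing expects namespace-aware trees
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        return new DOMSource(doc);
    }

    // Input that the transformer will read through a SAX parser of its choosing.
    static Source fromSax() {
        return new SAXSource(new InputSource(new StringReader(XML)));
    }

    // Applies the same style sheet to whichever kind of input is supplied.
    static String apply(Source input) throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)))
            .transform(input, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(apply(fromDom()));
        System.out.println(apply(fromSax()));
    }
}
```

The style sheet never sees which parser produced the input; that independence is what lets a processor swap parsers freely.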
Here's what the XML for Java site says about LotusXSL: "LotusXSL
1.0.1 is a complete and a robust reference implementation of the W3C
Recommendations for XSL Transformations (XSLT) and the XML Path
Language (XPath)."
You can get LotusXSL at www.alphaworks.ibm.com/tech/xml4j; just
click the XML item in the frame at left, click LotusXSL, and then
click the Download button (or you can go directly to
www.alphaworks.ibm.com/tech/lotusxsl, although that URL may change).
The download includes xerces.jar, which includes the parsers that the
rest of the LotusXSL package uses (although you can use other
parsers), and xalan.jar, which is the LotusXSL JAR file. To use
LotusXSL, make sure that you have xalan.jar in your CLASSPATH; to use
the XML for Java SAX parser, make sure that you also have xerces.jar
in your CLASSPATH, something like this:
Unfortunately, the LotusXSL package does not have a built-in class
that takes a document name, a style sheet name, and an output
file name, as XT does. However, I'll create one named xslt, and you can
use this class quite generally for transformations. Here's what
xslt.java looks like:
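The original listing, which was written against the LotusXSL classes, is not reproduced here. As a stand-in, here is a minimal sketch of a three-argument driver with the same command-line shape. It assumes the standard JAXP API (javax.xml.transform) bundled with the JDK instead of the LotusXSL API, so it is an illustration of the idea rather than the class the text describes; the transform method name is my own.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Sketch of a command-line driver: java xslt document stylesheet output.
// Built on JAXP (an assumption), not on the LotusXSL classes.
class xslt {

    // Transforms the document with the style sheet and writes the result file.
    static void transform(File document, File styleSheet, File output) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(styleSheet));
        // Open the output explicitly so the file is closed even if the transform fails.
        try (OutputStream out = new FileOutputStream(output)) {
            transformer.transform(new StreamSource(document), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("Usage: java xslt document stylesheet output");
            System.exit(1);
        }
        transform(new File(args[0]), new File(args[1]), new File(args[2]));
    }
}
```

Because JAXP ships with the JDK, this version compiles and runs without xalan.jar or xerces.jar in the CLASSPATH; which XSLT engine actually does the work depends on the Java installation.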
After you've set the CLASSPATH as indicated, you can create
xslt.class with javac like this:
%javac xslt.java
The file xslt.class is all you need. With the CLASSPATH still set,
you can use xslt.class like this to transform
planets.xml, using the style sheet planets.xsl, into
planets.html:
%java xslt planets.xml planets.xsl planets.html
What does planets.html look like? In this case, I've set up
planets.xsl to simply place the names of the planets in <P>
HTML elements. Here are the results, in planets.html:
<HTML>
<P>Mercury</P>
<P>Venus</P>
<P>Earth</P>
</HTML>
That's the kind of transformation we'll see in this chapter.
There's another way to transform XML documents without a
standalone program: you can use a client program, such as a browser, to
transform documents.
Using Browsers to Transform XML Documents
Internet Explorer includes a partial implementation of XSLT; you
can read about Internet Explorer support at
http://msdn.microsoft.com/xml/XSLGuide/. That support is based on the
W3C XSL working draft of December 16, 1998 (which you can find at
www.w3.org/TR/1998/WD-xsl-19981216.html); as you can imagine, things
have changed considerably since then.
To use planets.xml with Internet Explorer, I have to make a few
modifications. For example, I have to convert the type attribute in
the <?xml-stylesheet?> processing instruction from "text/xml"
to "text/xsl":
I can also convert the style sheet planets.xsl for use in Internet
Explorer. A major difference between the W3C XSL recommendation and
the XSL implementation in Internet Explorer is that Internet Explorer
doesn't implement any default XSL rules (which I'll discuss in this
chapter). This means that I have to explicitly include an XSL rule
for the root of the document, which you specify with /. I also have
to use a different namespace in the style sheet,
http://www.w3.org/TR/WD-xsl, and omit the version attribute in the
<xsl:stylesheet> element: