This is an excerpt from Chapter 4 of the New Riders book Inside
XSLT, written by Steven Holzner.
The best way to understand patterns is by example. Suppose that
you want to transform planets.xml into planets.html, but retain
only the first planet, Mercury. You can do that with the predicate
[position() < 2], because the first planet's position is 1, the
next is 2, and so on. Note, however, that < is a sensitive
character for XSLT processors, because it's what you use to start
markup; inside the stylesheet, you should write &lt; rather than <.
And note that you have to strip the other <PLANET> elements out of
planets.xml by supplying an empty template for them, which I can do
with the predicate [position() >= 2]:
Listing 4.10 Retaining only Mercury
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/PLANETS">
<HTML>
<HEAD>
<TITLE>
The Planets Table
</TITLE>
</HEAD>
<BODY>
<H1>
The
Planets Table
</H1>
<TABLE BORDER="2">
<TR>
<TD>Name</TD>
<TD>Mass</TD>
<TD>Radius</TD>
<TD>Day</TD>
</TR>
<xsl:apply-templates/>
</TABLE>
</BODY>
</HTML>
</xsl:template>
<xsl:template match="PLANET[position() &lt; 2]">
<TR>
<TD><xsl:value-of select="NAME"/></TD>
<TD><xsl:apply-templates select="MASS"/></TD>
<TD><xsl:apply-templates
select="RADIUS"/></TD>
<TD><xsl:apply-templates select="DAY"/></TD>
</TR>
</xsl:template>
<xsl:template match="PLANET[position() >= 2]">
</xsl:template>
<xsl:template match="MASS">
<xsl:value-of select="."/>
<xsl:text> </xsl:text>
<xsl:value-of select="@UNITS"/>
</xsl:template>
<xsl:template match="RADIUS">
<xsl:value-of select="."/>
<xsl:text> </xsl:text>
<xsl:value-of select="@UNITS"/>
</xsl:template>
<xsl:template match="DAY">
<xsl:value-of select="."/>
<xsl:text> </xsl:text>
<xsl:value-of select="@UNITS"/>
</xsl:template>
</xsl:stylesheet>
Here's the resulting document; note that only the first planet,
Mercury, was retained:
<HTML>
<HEAD>
<TITLE>
The Planets Table
</TITLE>
</HEAD>
<BODY>
<H1>
The Planets Table
</H1>
<TABLE
BORDER="2">
<TR>
<TD>Name</TD>
<TD>Mass</TD>
<TD>Radius</TD>
<TD>Day</TD>
</TR>
<TR>
<TD>Mercury</TD>
<TD>.0553 (Earth = 1)</TD>
<TD>1516 miles</TD>
<TD>58.65 days</TD>
</TR>
</TABLE>
</BODY>
</HTML>
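As a sketch of an alternative (not one of the chapter's listings), you could also restrict which nodes get processed in the first place, so that no empty template is needed. This assumes the same planets.xml structure, uses PLANET[1] as shorthand for PLANET[position() = 1], and trims the table down to two columns for brevity:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/PLANETS">
    <HTML>
      <BODY>
        <TABLE BORDER="2">
          <!-- Only the first PLANET child is selected, so the other
               planets are never processed at all. -->
          <xsl:apply-templates select="PLANET[1]"/>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>

  <!-- No predicate needed here: only selected planets reach this rule. -->
  <xsl:template match="PLANET">
    <TR>
      <TD><xsl:value-of select="NAME"/></TD>
      <TD><xsl:value-of select="MASS"/></TD>
    </TR>
  </xsl:template>

</xsl:stylesheet>
```

The trade-off is that the filtering now lives in the select expression rather than in the match patterns, which is usually simpler but shows less of how predicates behave in patterns.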
In the following example, COLOR and POPULATED attributes have
been added to Earth's <PLANET> element:
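As a minimal sketch of what that change could look like (the attribute values BLUE and YES are assumptions for illustration, and Earth's other child elements are left out for brevity):

```xml
<PLANETS>
  <!-- Hypothetical values: only the presence of the COLOR and POPULATED
       attributes matters to the predicate [@COLOR and @POPULATED]. -->
  <PLANET COLOR="BLUE" POPULATED="YES">
    <NAME>Earth</NAME>
  </PLANET>
</PLANETS>
```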
How can you select only elements that have both COLOR and
POPULATED attributes? You can use the predicate "[@COLOR and
@POPULATED]". To strip out the other elements so the default rules
don't place their text into the result document, you can use a
predicate such as "[not(@COLOR) or not(@POPULATED)]", as shown in
Listing 4.11.
Listing 4.11 Selecting only elements that have both COLOR and
POPULATED attributes
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/PLANETS">
<HTML>
<HEAD>
<TITLE>
Colorful, Populated Planets
</TITLE>
</HEAD>
<BODY>
<H1>
Colorful, Populated Planets
</H1>
<TABLE BORDER="2">
<TR>
<TD>Name</TD>
<TD>Mass</TD>
<TD>Radius</TD>
<TD>Day</TD>
</TR>
<xsl:apply-templates/>
</TABLE>
</BODY>
</HTML>
</xsl:template>
<xsl:template match="PLANET[@COLOR and
@POPULATED]">
<TR>
<TD><xsl:value-of select="NAME"/></TD>
<TD><xsl:apply-templates select="MASS"/></TD>
<TD><xsl:apply-templates
select="RADIUS"/></TD>
<TD><xsl:apply-templates select="DAY"/></TD>
</TR>
</xsl:template>
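<xsl:template match="PLANET[not(@COLOR) or not(@POPULATED)]">
</xsl:template>
<!-- The MASS, RADIUS, and DAY templates from Listing 4.10 go here -->
</xsl:stylesheet>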
In the following example, I copy planets.xml to a new XML
document, and change the text in Venus's <NAME> element to
"The Planet of Love". To do that, I start by copying all nodes and
attributes to the result document:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates
select="@*|node()"/>
</xsl:copy>
</xsl:template>
.
.
.
Now I'll add a new rule that matches <NAME> elements that
have the text "Venus" with the pattern "NAME[text() = 'Venus']".
Even though <NAME> elements match both rules in this
stylesheet, the rule with the pattern "NAME[text() = 'Venus']" is a
more specific match, so the XSLT processor uses it for Venus's
<NAME> element:
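A sketch of that rule, added to the copying stylesheet above and written to mirror the "." version shown next:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>

  <!-- Copy every node and attribute to the result unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- More specific rule: replaces the NAME element whose text is "Venus". -->
  <xsl:template match="NAME[text() = 'Venus']">
    <NAME>The Planet of Love</NAME>
  </xsl:template>

</xsl:stylesheet>
```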
In fact, in XPath expressions, you can refer to the context node
as ".", and the default value of a node is its text, so the
following rule works exactly the same way:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates
select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="NAME[. =
'Venus']">
<NAME>
The Planet of
Love
</NAME>
</xsl:template>
</xsl:stylesheet>
It's worth packing in as many examples as possible; you can never
have too many match pattern or XPath examples. The following is a
good collection of match pattern examples:
PLANET matches the <PLANET> element children of the
context node.
/PLANETS matches the root element (PLANETS) of this
document.
* matches all element children of the context node.
PLANET[3] matches the third <PLANET> child of the context
node.
PLANET[last()] matches the last <PLANET> child of the
context node.
PLANET[NAME] matches the <PLANET> children of the context
node that have <NAME> children.
PLANET[DISTANCE]/NAME matches all <NAME> elements of
<PLANET> elements that contain at least one <DISTANCE>
element.
PLANET[DISTANCE]/PLANET[DAY] matches all <PLANET> elements that
have at least one <DAY> element and are children of a <PLANET>
element that has at least one <DISTANCE> element.
PLANETS[PLANET/DAY] matches all <PLANETS> elements that
have <PLANET> elements with at least one <DAY>
element.
PLANET[DISTANCE][NAME] matches all <PLANET> elements that
have <DISTANCE> and <NAME> elements.
PLANETS/PLANET[last()] matches the last <PLANET> in each
<PLANETS> element.
*/PLANET matches all <PLANET> grandchildren of the context
node.
/PLANETS/PLANET[3]/NAME[2] matches the second <NAME>
element of the third <PLANET> element of the <PLANETS>
element.
//PLANET matches all the <PLANET> descendants of the
document root.
PLANETS//PLANET matches the <PLANET> element descendants
of the <PLANETS> element children of the context node.
//PLANET/NAME matches all the <NAME> elements that are
children of a <PLANET> parent.
PLANETS//PLANET/DISTANCE//PERIHELION matches <PERIHELION>
elements anywhere inside a <PLANET> element's
<DISTANCE> element, anywhere inside a <PLANETS>
element.
@UNITS matches the UNITS attribute of the context node.
@* matches all the attributes of the context node.
*[@UNITS] matches all elements with the UNITS attribute.
DENSITY/@UNITS matches the UNITS attribute in <DENSITY>
elements.
PLANET[not(@COLOR) or not(@SIZE)] matches <PLANET>
elements that do not have both COLOR and SIZE attributes.
PLANETS[@STAR="Sun"]//DENSITY matches any <DENSITY>
element with a <PLANETS> ancestor element that has a STAR
attribute set to "Sun".
PLANET[NAME="Venus"] matches the <PLANET> children of the
context node that have <NAME> children with text equal to
"Venus".
PLANET[NAME[1] = "Venus"] matches all <PLANET> elements
where the text in the first <NAME> element is "Venus".
DAY[@UNITS != "million miles"] matches all <DAY> elements that
have a UNITS attribute whose value is not "million miles".
PLANET[@UNITS = "days"] matches all <PLANET> children of
the context node that have a UNITS attribute with value "days".
PLANET[6][@UNITS = "days"] matches the sixth <PLANET> child of the
context node, only if that child has a UNITS attribute with value
"days". Note that PLANET[@UNITS = "days"][6] is not equivalent: it
matches the sixth of the <PLANET> children that have a UNITS
attribute with value "days".
PLANET[@COLOR and @UNITS] matches all the <PLANET>
children of the context node that have both a COLOR attribute and a
UNITS attribute.
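*[1][NAME] matches any element that is the first element child of
its parent and has at least one <NAME> child. (To match a <NAME>
element that is the first child of its parent, use
*[1][self::NAME].)
*[position() < 5] matches the first four element children of the
context node.
*[position() < 5][@UNITS] matches those of the first four element
children of the context node that have a UNITS attribute.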
text() matches all text node children of the context node.
text()[starts-with(., "In the course of human events")] matches
all text nodes that are children of the context node and start with
"In the course of human events".
/PLANETS[@UNITS = "million miles"] matches the <PLANETS> root
element if the value of its UNITS attribute is equal to "million
miles".
PLANET[/PLANETS/@UNITS = @REFERENCE] matches all <PLANET>
elements where the value of the REFERENCE attribute is the same as
the value of the UNITS attribute of the PLANETS element at the root
of the document.
PLANET/* matches all element children of PLANET elements.
PLANET/*/DAY matches all DAY elements that are grandchildren of
PLANET elements that are children of the context node.
*/* matches the grandchildren elements of the current
element.
astrophysics:PLANET matches the PLANET element in the
"astrophysics" namespace.
astrophysics:* matches any elements in the "astrophysics"
namespace.
PLANET[DAY and DENSITY] matches all <PLANET> elements that
have at least one <DAY> and one <DENSITY> element.
PLANET[(DAY or DENSITY) and MASS] matches all <PLANET>
elements that have at least one <DAY> or <DENSITY>
element, and also have at least one <MASS> element.
PLANET[DAY and not(DISTANCE)] matches all <PLANET>
elements that have at least one <DAY> element and no
<DISTANCE> elements.
PLANET[MASS = /STANDARD/REFERENCE/MASS] matches all <PLANET>
elements where the <MASS> element's value is the same as the value
of the /STANDARD/REFERENCE/MASS element.
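To see one of these patterns at work, here is a sketch (not one of the chapter's listings) that uses *[@UNITS] to replace the separate MASS, RADIUS, and DAY templates from Listing 4.10 with a single rule. It assumes each of those elements carries a UNITS attribute, as they do in planets.xml:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/PLANETS">
    <HTML>
      <BODY>
        <TABLE BORDER="2">
          <xsl:apply-templates select="PLANET"/>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>

  <xsl:template match="PLANET">
    <TR>
      <TD><xsl:value-of select="NAME"/></TD>
      <TD><xsl:apply-templates select="MASS"/></TD>
      <TD><xsl:apply-templates select="RADIUS"/></TD>
      <TD><xsl:apply-templates select="DAY"/></TD>
    </TR>
  </xsl:template>

  <!-- One rule for any element that carries a UNITS attribute:
       output its value, a space, and then the units. -->
  <xsl:template match="*[@UNITS]">
    <xsl:value-of select="."/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="@UNITS"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the PLANET template selects MASS, RADIUS, and DAY explicitly, other elements with a UNITS attribute never reach the shared rule unless you select them too.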
That completes this coverage of match patterns for the moment;
you'll see related material in Chapter 7 on XPath expressions.
Chapter 5 starts looking at ways of working with the data in XML
documents by sorting that data and using it to make choices based
on data values.