To specify a node or set of nodes in XPath, you use a location
path. A location path, in turn, consists of one or more location
steps, separated by / or //. If you start the location path with /,
the location path is called an absolute location path because you're
specifying the path from the root node; otherwise, the location path
is relative, starting with the current node, which is called the
context node. Got all that? Good, because there's more.
A location step is made up of an axis, a node test, and zero or
more predicates. For example, in the expression
child::PLANET[position() = 5], child is the name of the axis, PLANET
is the node test, and [position() = 5] is a predicate. You can create
location paths with one or more location steps, such as
/descendant::PLANET/child::NAME, which selects all the <NAME>
elements that have a <PLANET> parent. The best way to
understand all this is by example, and we'll see plenty of them in a
few pages. In the meantime, I'll take a look at what kinds of axes,
node tests, and predicates XPath supports.
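To make the parts of a location step concrete, here's a minimal sketch of a stylesheet built around the two-step path from the text. It assumes a document whose <PLANET> elements each hold at least one <NAME> child, as in this chapter's examples; the text output method and the added [position() = 1] predicate are choices made only for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- Match the root node; the path below is absolute, so it would
         start from the root node no matter what the context node is. -->
    <xsl:template match="/">
        <!-- Step 1: axis "descendant", node test "PLANET", no predicate.
             Step 2: axis "child", node test "NAME",
                     predicate [position() = 1]. -->
        <xsl:for-each select="/descendant::PLANET/child::NAME[position() = 1]">
            <xsl:value-of select="."/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

Dropping the leading / would turn this into a relative location path, evaluated from whatever the context node happens to be.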
XPath Axes
In the location path child::NAME, which refers to a <NAME>
element that is a child of the current node, the child is called the
axis. XPath supports many different axes, and it's important to know
what they are. Here's the list:
Axis                  Description
ancestor              Holds the ancestors of the context node. The ancestors
                      of the context node are the parent of the context node
                      and the parent's parent and so forth, back to and
                      including the root node.
ancestor-or-self      Holds the context node and the ancestors of the context
                      node.
attribute             Holds the attributes of the context node.
child                 Holds the children of the context node.
descendant            Holds the descendants of the context node. A descendant
                      is a child or a child of a child, and so on.
descendant-or-self    Contains the context node and the descendants of the
                      context node.
following             Holds all nodes in the same document as the context
                      node that come after the context node in document
                      order, excluding its descendants and any attribute or
                      namespace nodes.
following-sibling     Holds all the following siblings of the context node. A
                      sibling is a node on the same level as the context node
                      (that is, a node with the same parent).
namespace             Holds the namespace nodes of the context node.
parent                Holds the parent of the context node.
preceding             Contains all nodes in the same document that come
                      before the context node in document order, excluding
                      its ancestors and any attribute or namespace nodes.
preceding-sibling     Contains all the preceding siblings of the context
                      node. A sibling is a node on the same level as the
                      context node (that is, a node with the same parent).
self                  Contains the context node.
You can use axes to specify a location step or path, as in this
example, where I'm using the child axis to indicate that I want to
match child nodes of the context node, which is a <PLANET>
element. (We'll see later that an abbreviated version lets you omit
the child:: part.)
<xsl:template match="PLANET">
    <HTML>
        <CENTER>
            <xsl:value-of select="child::NAME"/>
        </CENTER>
        <CENTER>
            <xsl:value-of select="child::MASS"/>
        </CENTER>
        <CENTER>
            <xsl:value-of select="child::DAY"/>
        </CENTER>
    </HTML>
</xsl:template>
In these expressions, child is the axis, and the element names
NAME, MASS, and DAY are node tests.
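The child axis is only one option. As a rough sketch of a few of the others, the stylesheet below makes each <NAME> element the context node and then looks sideways and upward from it. It assumes that each <NAME> has a following <MASS> sibling carrying a UNIT attribute; that attribute name is borrowed from the examples later in this section and may not match your document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:apply-templates select="descendant::NAME"/>
    </xsl:template>

    <!-- Inside this template the context node is a NAME element. -->
    <xsl:template match="NAME">
        <!-- self axis: the NAME element itself. -->
        <xsl:value-of select="self::NAME"/>
        <xsl:text>: parent element is </xsl:text>
        <!-- parent axis: name() reports the parent's element name. -->
        <xsl:value-of select="name(parent::*)"/>
        <xsl:text>, mass unit is </xsl:text>
        <!-- following-sibling axis, then attribute axis.
             UNIT is an assumed attribute name. -->
        <xsl:value-of select="following-sibling::MASS/attribute::UNIT"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```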
XPath Node Tests
You can use names of nodes as node tests, or you can use the wild
card * to select element nodes. For example, the expression
child::*/child::NAME selects all <NAME> elements that are
grandchildren of the context node. Besides node names and the wild
card character, you can also use these node tests:
Node Test                   Description
comment()                   Selects comment nodes.
node()                      Selects any type of node.
processing-instruction()    Selects a processing instruction node. You can
                            specify the name of the processing instruction
                            to select in the parentheses.
text()                      Selects a text node.
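Here's a small sketch that exercises each of these node tests by counting the matching nodes in a document. The processing instruction name xml-stylesheet is only an assumed example of a name you might pass; any document will do as input.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:text>Comments: </xsl:text>
        <xsl:value-of select="count(/descendant::comment())"/>
        <xsl:text>&#10;Text nodes: </xsl:text>
        <xsl:value-of select="count(/descendant::text())"/>
        <!-- A quoted name restricts the test to processing
             instructions with that target. -->
        <xsl:text>&#10;xml-stylesheet instructions: </xsl:text>
        <xsl:value-of
            select="count(/descendant::processing-instruction('xml-stylesheet'))"/>
        <!-- node() matches every node type found on the axis. -->
        <xsl:text>&#10;All nodes below the root: </xsl:text>
        <xsl:value-of select="count(/descendant::node())"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```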
XPath Predicates
The predicate part of an XPath step is perhaps its most intriguing
part because it gives you the most power. You can work with all kinds
of expressions in predicates; here are the possible types:
- Node sets
- Booleans
- Numbers
- Strings
- Result tree fragments
I'll take a look at these various types in turn.
XPath Node Sets
As its name implies, a node set is simply a set of nodes. An
expression such as child::PLANET returns a node set of all
<PLANET> elements. The expression child::PLANET/child::NAME
returns a node set of all <NAME> elements that are children of
<PLANET> elements. To select a node or nodes from a node set,
you can use various functions that work on node sets in
predicates.
Function                   Description
last()                     Returns the number of nodes in the context node
                           set.
position()                 Returns the position of the context node in the
                           context node set (starting with 1).
count(node-set)            Returns the number of nodes in node-set. The
                           argument is required.
id(string ID)              Returns a node set containing the element whose
                           ID matches the string passed to the function, or
                           returns an empty node set if no element has the
                           specified ID. You can list multiple IDs separated
                           by whitespace, and this function will return a
                           node set of the elements with those IDs.
local-name(node-set)       Returns the local name of the first node in the
                           node set. Omitting node-set makes this function
                           use the context node.
namespace-uri(node-set)    Returns the URI of the namespace of the first
                           node in the node set. Omitting node-set makes
                           this function use the context node.
name(node-set)             Returns the full, qualified name of the first
                           node in the node set. Omitting node-set makes
                           this function use the context node.
Here's an example; in this case, I'll number the elements in the
output document using the position() function:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="PLANETS">
        <HTML>
            <HEAD>
                <TITLE>
                    The Planets
                </TITLE>
            </HEAD>
            <BODY>
                <xsl:apply-templates select="PLANET"/>
            </BODY>
        </HTML>
    </xsl:template>
    <xsl:template match="PLANET">
        <P>
            <xsl:value-of select="position()"/>.
            <xsl:value-of select="NAME"/>
        </P>
    </xsl:template>
</xsl:stylesheet>
Here's the result, where you can see that the planets are
numbered:
<HTML>
<HEAD>
<TITLE>
The Planets
</TITLE>
</HEAD>
<BODY>
<P>1.
Mercury</P>
<P>2.
Venus</P>
<P>3.
Earth</P>
</BODY>
</HTML>
You can use functions that operate on node sets in predicates, as
in child::PLANET[position() = last()], which selects the last
<PLANET> child of the context node.
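As a further sketch, position(), last(), and count() can be combined to label each planet as "n of total". The stylesheet assumes the same <PLANETS>/<PLANET>/<NAME> layout used in the example above; the text output method is just for brevity.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="PLANETS">
        <!-- Inside for-each, the selected PLANET elements form the
             context node set, so last() is the number of planets. -->
        <xsl:for-each select="child::PLANET">
            <xsl:value-of select="child::NAME"/>
            <xsl:text> (</xsl:text>
            <xsl:value-of select="position()"/>
            <xsl:text> of </xsl:text>
            <xsl:value-of select="last()"/>
            <xsl:text>)&#10;</xsl:text>
        </xsl:for-each>
        <!-- count() takes the node set explicitly as its argument. -->
        <xsl:text>Total planets: </xsl:text>
        <xsl:value-of select="count(child::PLANET)"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

With the three-planet document shown earlier, the first line of output would read Mercury (1 of 3).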
XPath Booleans
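You can also use Boolean values in XPath expressions. Numbers are
considered false if they're zero or NaN (not a number) and are
considered true otherwise. An empty string ("") is also considered
false, and all other strings are considered true. Likewise, an empty
node set is considered false, and a node set that holds at least one
node is considered true.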
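You can check these conversion rules yourself with the XPath boolean() function. The sketch below needs no particular input document; the expected result of each call is noted in the comment beside it.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:text>boolean(0) = </xsl:text>
        <xsl:value-of select="boolean(0)"/>        <!-- false -->
        <xsl:text>&#10;boolean(5) = </xsl:text>
        <xsl:value-of select="boolean(5)"/>        <!-- true -->
        <xsl:text>&#10;boolean('') = </xsl:text>
        <xsl:value-of select="boolean('')"/>       <!-- false -->
        <!-- Any non-empty string is true, even the word "false". -->
        <xsl:text>&#10;boolean('false') = </xsl:text>
        <xsl:value-of select="boolean('false')"/>  <!-- true -->
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```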
You can use XPath logical operators to produce Boolean true/false
results; here are the logical operators:
Operator    Description
!=          Is not equal to.
<           Is less than. (Use &lt; in XML documents.)
<=          Is less than or equal to. (Use &lt;= in XML documents.)
=           Is equal to. (C, C++, Java, and JavaScript programmers take
            note: this operator is one = sign, not two.)
>           Is greater than.
>=          Is greater than or equal to.
You shouldn't use < directly in XML documents; use the entity
reference &lt; instead.
You can also use the keywords and and or to connect Boolean
clauses with a logical And or Or operation, as we've seen when
working with JavaScript and Java.
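Here's a sketch of and and or at work inside predicates, with the less-than sign written as &lt; so that the stylesheet stays well-formed. It assumes that <DAY> and <MASS> hold plain numbers; the thresholds 100 and 0.5 are arbitrary values picked for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- Both tests must be true for a planet to be selected. -->
        <xsl:for-each
            select="/descendant::PLANET[child::DAY &lt; 100 and child::MASS &gt; 0.5]">
            <xsl:value-of select="child::NAME"/>
            <xsl:text> passes both tests&#10;</xsl:text>
        </xsl:for-each>
        <!-- Either test being true is enough here. -->
        <xsl:for-each
            select="/descendant::PLANET[child::DAY &lt; 100 or child::MASS &gt; 0.5]">
            <xsl:value-of select="child::NAME"/>
            <xsl:text> passes at least one test&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```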
Here's an example using the logical operator >. This rule
applies to all <PLANET> elements after position 5:
<xsl:template match="PLANET[position() > 5]">
    <xsl:value-of select="."/>
</xsl:template>
There is also a true() function that always returns a value of
true, and a false() function that always returns a value of
false.
You can also use the not() function to reverse the logical sense
of an expression, as in this case, where I'm selecting all but the
last <PLANET> element:
<xsl:template match="PLANET[not(position() = last())]">
    <xsl:value-of select="."/>
</xsl:template>
Finally, the lang() function returns true or false, depending on
whether the language of the context node (which is given by xml:lang
attributes) is the same as the language you pass to this
function.
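As a quick sketch of lang(), the stylesheet below lists only the <NAME> elements whose language is English. It assumes that the <NAME> elements, or one of their ancestors, carry an xml:lang attribute; the planets document in this chapter may not include one.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- lang('en') is true for xml:lang="en" and also for
             sublanguages such as xml:lang="en-US". -->
        <xsl:for-each select="/descendant::NAME[lang('en')]">
            <xsl:value-of select="."/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```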
XPath Numbers
In XPath, numbers are actually stored in double-precision
floating-point format. (See Chapter 10, "Understanding Java," for
more details on doubles; technically speaking, all XPath numbers are
stored in 64-bit IEEE 754 floating-point double-precision format.)
All numbers are stored as doubles, even integers such as 5, as in the
example we just saw:
<xsl:template match="PLANET[position() > 5]">
    <xsl:value-of select="."/>
</xsl:template>
You can use several operators on numbers:
Operator    Action
+           Adds.
-           Subtracts.
*           Multiplies.
div         Divides. (The / character, which stands for division in other
            languages, is already heavily used in XML and XPath.)
mod         Returns the modulus of two numbers (the remainder after
            dividing the first by the second).
For example, the element <xsl:value-of select="180 + 420"/>
inserts the string "600" into the output document. This example
selects all planets whose day (measured in Earth days) divided by
their mass (where the mass of Earth = 1) is greater than 100:
<xsl:template match="PLANETS">
    <HTML>
        <BODY>
            <xsl:apply-templates select="PLANET[DAY div MASS > 100]"/>
        </BODY>
    </HTML>
</xsl:template>
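The mod operator is handy in predicates, too. As a small sketch, the stylesheet below prints the results of div and mod on two literal numbers and then uses mod to pick out every other planet. The <PLANETS>/<PLANET>/<NAME> layout is assumed from the earlier examples.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="PLANETS">
        <xsl:text>7 div 2 = </xsl:text>
        <xsl:value-of select="7 div 2"/>    <!-- 3.5 -->
        <xsl:text>&#10;7 mod 2 = </xsl:text>
        <xsl:value-of select="7 mod 2"/>    <!-- 1 -->
        <xsl:text>&#10;</xsl:text>
        <!-- Keep the planets at odd positions: 1, 3, 5, and so on. -->
        <xsl:for-each select="child::PLANET[position() mod 2 = 1]">
            <xsl:value-of select="child::NAME"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```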
XPath also supports these functions that operate on numbers:
Function     Description
ceiling()    Returns the smallest integer that is not less than the number
             that you pass it.
floor()      Returns the largest integer that is not greater than the
             number that you pass it.
round()      Rounds the number that you pass it to the nearest integer.
sum()        Returns the sum of the numeric values of the nodes in the
             node set that you pass it.
For example, here's how you can find the average mass of the
planets in planets.xml:
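<xsl:template match="PLANETS">
    <HTML>
        <BODY>
            The average planetary mass is:
            <!-- MASS elements are children of PLANET, so they are
                 descendants, not children, of PLANETS. -->
            <xsl:value-of select="sum(descendant::MASS)
                div count(descendant::MASS)"/>
        </BODY>
    </HTML>
</xsl:template>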
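The three rounding functions are easy to mix up, so here's a short sketch that applies them to literal numbers. No particular input document is needed, and the expected result of each call appears in the comment beside it. Note that round() sends a value exactly halfway between two integers toward positive infinity.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:text>ceiling(2.1) = </xsl:text>
        <xsl:value-of select="ceiling(2.1)"/>    <!-- 3 -->
        <xsl:text>&#10;floor(2.9) = </xsl:text>
        <xsl:value-of select="floor(2.9)"/>      <!-- 2 -->
        <xsl:text>&#10;round(2.5) = </xsl:text>
        <xsl:value-of select="round(2.5)"/>      <!-- 3 -->
        <xsl:text>&#10;round(-2.5) = </xsl:text>
        <xsl:value-of select="round(-2.5)"/>     <!-- -2 -->
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```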
XPath Strings
In XPath, strings are made up of Unicode characters. A number of
functions are specially designed to work on strings, as shown in this
table.
Function
    Description
starts-with(string string1, string string2)
    Returns true if the first string starts with the second string.
contains(string string1, string string2)
    Returns true if the first string contains the second one.
substring(string string1, number offset, number length)
    Returns length characters from the string, starting at offset (the
    first character is at position 1).
substring-before(string string1, string string2)
    Returns the part of string1 up to the first occurrence of string2.
substring-after(string string1, string string2)
    Returns the part of string1 after the first occurrence of string2.
string-length(string string1)
    Returns the number of characters in string1.
normalize-space(string string1)
    Returns string1 after leading and trailing whitespace is stripped and
    multiple consecutive whitespace is replaced with a single space.
translate(string string1, string string2, string string3)
    Returns string1 with all occurrences of the characters in string2
    replaced by the matching characters in string3.
concat(string string1, string string2, ...)
    Returns all strings concatenated (that is, joined) together.
format-number(number number1, string string2, string string3)
    Returns a string holding the formatted version of number1, using
    string2 as a formatting string (create formatting strings as you
    would for Java's java.text.DecimalFormat class), and string3 as the
    optional name of a decimal format declared with an
    <xsl:decimal-format> element. (This function is defined by XSLT
    rather than by XPath itself.)
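To see a few of these string functions in action, here's a sketch that applies them to literal strings, so it runs against any input document. The sample strings are made up for illustration; the expected result of each call is given in the comment beside it.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:value-of select="concat('Planet: ', 'Venus')"/>
        <!-- Planet: Venus -->
        <xsl:text>&#10;</xsl:text>
        <!-- Offsets count from 1, so this takes the first three
             characters. -->
        <xsl:value-of select="substring('Mercury', 1, 3)"/>
        <!-- Mer -->
        <xsl:text>&#10;</xsl:text>
        <xsl:value-of select="substring-before('116.75 days', ' ')"/>
        <!-- 116.75 -->
        <xsl:text>&#10;</xsl:text>
        <xsl:value-of select="string-length('Earth')"/>
        <!-- 5 -->
        <xsl:text>&#10;</xsl:text>
        <!-- translate() maps each character in the second string to
             the character at the same position in the third. -->
        <xsl:value-of select="translate('Venus',
            'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/>
        <!-- VENUS -->
        <xsl:text>&#10;</xsl:text>
        <xsl:value-of select="normalize-space('  The   Planets  ')"/>
        <!-- The Planets -->
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```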
XPath Result Tree Fragments
A result tree fragment is a piece of the result tree that is held in
a variable instead of being written straight to the output. You
create result tree fragments in XSLT by giving an <xsl:variable> or
<xsl:param> element content rather than a select attribute. (The
document() function, by contrast, returns an ordinary node set.)
You really can't do much with result tree fragments in XPath.
Actually, you can do only two things: use the string() or boolean()
functions to turn them into strings or Booleans.
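As a minimal sketch, a variable whose value comes from its content rather than from a select attribute holds a result tree fragment in XSLT 1.0. The variable name label and its content are made up for illustration, and any input document will do.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- The content of this variable is a result tree fragment. -->
    <xsl:variable name="label"><B>Planet data</B></xsl:variable>

    <xsl:template match="/">
        <!-- string() yields the fragment's text: Planet data -->
        <xsl:value-of select="string($label)"/>
        <xsl:text>&#10;</xsl:text>
        <!-- boolean() of a result tree fragment yields true. -->
        <xsl:value-of select="boolean($label)"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```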
XPath Examples
We've seen a lot of XPath in theory; how about some examples?
Here are a number of location path examples. Note that XPath enables
you to use and or or in predicates to apply logical tests using
multiple patterns.
Example
    Action
child::PLANET
    Returns the <PLANET> element children of the context node.
child::*
    Returns all element children (* only matches elements) of the context
    node.
child::text()
    Returns all text node children of the context node.
child::node()
    Returns all the children of the context node, no matter what their
    node type is.
attribute::UNIT
    Returns the UNIT attribute of the context node.
descendant::PLANET
    Returns the <PLANET> element descendants of the context node.
ancestor::PLANET
    Returns all <PLANET> ancestors of the context node.
ancestor-or-self::PLANET
    Returns the <PLANET> ancestors of the context node. If the context
    node is a <PLANET> as well, also returns the context node.
descendant-or-self::PLANET
    Returns the <PLANET> element descendants of the context node. If the
    context node is a <PLANET> as well, also returns the context node.
self::PLANET
    Returns the context node if it is a <PLANET> element.
child::NAME/descendant::PLANET
    Returns the <PLANET> element descendants of the child <NAME> elements
    of the context node.
child::*/child::PLANET
    Returns all <PLANET> grandchildren of the context node.
/
    Returns the document root (that is, the parent of the document
    element).
/descendant::PLANET
    Returns all the <PLANET> elements in the document.
/descendant::PLANET/child::NAME
    Returns all the <NAME> elements that have a <PLANET> parent.
child::PLANET[position() = 3]
    Returns the third <PLANET> child of the context node.
child::PLANET[position() = last()]
    Returns the last <PLANET> child of the context node.
/descendant::PLANET[position() = 3]
    Returns the third <PLANET> element in the document.
child::PLANETS/child::PLANET[position() = 4]/child::NAME[position() = 3]
    Returns the third <NAME> element of the fourth <PLANET> element of
    the <PLANETS> element.
child::PLANET[position() > 3]
    Returns all the <PLANET> children of the context node after the first
    three.
preceding-sibling::NAME[position() = 2]
    Returns the second previous <NAME> sibling element of the context
    node.
child::PLANET[attribute::COLOR = "RED"]
    Returns all <PLANET> children of the context node that have a COLOR
    attribute with value of RED.
child::PLANET[attribute::COLOR = "RED"][position() = 3]
    Returns the third <PLANET> child of the context node that has a COLOR
    attribute with value of RED.
child::PLANET[position() = 3][attribute::COLOR = "RED"]
    Returns the third <PLANET> child of the context node, only if that
    child has a COLOR attribute with value of RED.
child::MASS[child::NAME = "VENUS"]
    Returns the <MASS> children of the context node that have <NAME>
    children whose text is VENUS.
child::PLANET[child::NAME]
    Returns the <PLANET> children of the context node that have <NAME>
    children.
child::*[self::NAME or self::MASS]
    Returns both the <NAME> and <MASS> children of the context node.
child::*[self::NAME or self::MASS][position() = 1]
    Returns the first <NAME> or <MASS> child of the context node. (XPath
    has no first() function; test for position 1 instead.)
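The order of predicates matters, because each predicate filters the node set left by the one before it. Here's a sketch that contrasts the two COLOR examples from the table. It assumes that the <PLANET> elements carry a COLOR attribute and hold a <NAME> child, which may not be true of your document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <xsl:template match="PLANETS">
        <!-- Filter by color first, then take the third planet among
             the red ones. -->
        <xsl:text>Third red planet: </xsl:text>
        <xsl:value-of
            select="child::PLANET[attribute::COLOR = 'RED'][position() = 3]/child::NAME"/>
        <xsl:text>&#10;</xsl:text>
        <!-- Take the third planet first, then keep it only if it is
             red; otherwise the result is empty. -->
        <xsl:text>Third planet, if red: </xsl:text>
        <xsl:value-of
            select="child::PLANET[position() = 3][attribute::COLOR = 'RED']/child::NAME"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

If five planets have the colors RED, BLUE, RED, RED, and BLUE, the first expression selects the fourth planet, and the second selects the third.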
As you can see, some of this syntax is pretty involved and a
little lengthy to type. However, there is an abbreviated form of
XPath syntax.