Intro to this page
When I work as a contract programmer at different offices, I put
my copy of Joe Celko's SQL for Smarties
prominently on display on my desk. That's enough for some people to
think of me as a database expert. Fortunately, I have actually read
SQL for Smarties, so, I am a database
expert. Next to it is my copy of Jeffrey Friedl's
Mastering Regular Expressions. Few people
are impressed by it, but they should be. Whenever I was tempted to
think that Friedl's multi-page dissection of a given regex was more
detail than I really needed, some practical implication would pop
off the page and I was glad I had stuck to it.
SQL and regular expressions are declarative, not procedural. As such, mastering them
requires excursions outside the realm of how-to books. If even
bare competence with these tools is your goal, examination of the
theory is needed.
XPath and XSLT are also declarative. And they have some other
characteristics that will be odd to the Visual Basic programmer.
In this page I am going to try to look at "programming"
XSLT, with the perspective of a theoretical practitioner or a
practical theoretician. In
practice, looking at theory means I will mention a lot of stuff that
might be obvious, but maybe one of those obvious things will be a
revelation to you.
Disclaimer: If you haven't even taken a look
at XSLT, this page isn't for you. If you are an expert in
it, you already know much of this. If you have minimum standards in the area
of technical prose, this page is not for you. Otherwise, welcome.
InfoWorld on XML extensions
In the July 3, 2000 issue of InfoWorld there
is a "Spotlight" on XML Extensions.
We read in the TOC that "keeping up with XML extensions is a
time-consuming endeavor." Evidently so. There are several points
where the authors seem confused or give a confused summary. They
rate the different extensions (e.g. XLink, DTD, XPath, RDF, etc.) in
terms of Relevance, Acceptance and "Standards Prospects", but they
do not make clear which extensions have been implemented yet.
For example, both XPath and XPointer are listed as "High" in
terms of acceptance. But XPointer is only a candidate
recommendation, while XPath became an "official" recommendation last
November. MSXML3 almost completely implements XPath, but so far does
not implement the additional functionality of XPointer.
There is a somewhat confusing discussion about XSL and XSLT,
perhaps inevitably so, since they had such limited space to explain
the matter. When XSL is mentioned alone, it usually denotes XSLT,
XPath and XSL-fo (formatting objects) together. When XSLT and XSL
are mentioned together, XSL in that context denotes XSL-fo (and not
XSLT). It is like a mainframer calling a Mac a "PC", whereas a Mac
person uses "PC" to specifically denote something other than a Mac.
XSLT is an approved recommendation (i.e. a standard) and
implemented in MSXML3 and a number of other parsers. XSL-fo (just
called XSL on the w3 site) is close to being a release candidate.
Here are a few sentences from InfoWorld.
XSLT applies XSL formatting rules (style sheets) to produce
either a restructured XML document or an HTML page. Internet
Explorer has a built-in XSL style sheet that formats XML files for
display. XSLT is extremely powerful and can be used to translate
documents from one XML layout to another. Overall, XSL and XSLT are
two of the most successful XML spin-offs, so they should be
considered essential components of any XML toolkit.
Here are some comments.
| XSLT applies XSL formatting rules (style sheets) to produce either a restructured XML document or an HTML page. | XSLT applies XSLT rules to .... |
| Internet Explorer has a built-in XSL style sheet that formats XML files for display. | IE has a built-in stylesheet that is built around (IE 5) XSLT, CSS1 and DHTML. |
| XSLT is extremely powerful and can be used to translate documents from one XML layout to another. | Yes, correct! |
| Overall, XSL and XSLT are two of the most successful XML spin-offs, so they should be considered essential components of any XML toolkit. | XSL doesn't exist yet, so its "success" will catch many industry experts by surprise. |
Any article on this matter should also inform the
reader that Microsoft seems content to stay with CSS 1 and ignore XSL while
going full bore on XSLT. Netscape is intent on CSS 2 and XSL-fo
(formatting objects). (It doesn't help things that Microsoft's
version of XSLT in IE 5 was usually referred to as XSL.) Confused?
Under XML Query we read: "At present, the awkward XPath pattern-matching syntax is the only way
to search an XML document." Awkward? XPath is
a model of simplicity. How could XPath be any simpler without being
vastly less powerful? Have the authors seen the XML Query proposals
such as YATL, Lorel, XML-QL or Quilt?
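To give a feel for the simplicity claimed above, here is a minimal sketch of a few XPath 1.0 expressions used inside an XSLT 1.0 stylesheet. The element and attribute names (dad, kid, id, name) are hypothetical and exist only for illustration; nothing here comes from the InfoWorld article.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- How many kid elements exist anywhere in the document? -->
    <xsl:value-of select="count(//kid)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The name attribute of the first kid of the dad whose id is 'Mark'. -->
    <xsl:value-of select="//dad[@id='Mark']/kid[1]/@name"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The id of every dad that has more than two kid children. -->
    <xsl:for-each select="//dad[count(kid) &gt; 2]">
      <xsl:value-of select="@id"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each query is a single path with an optional filter in square brackets, which is about as compact as a tree query can get.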
Lastly, there seems to be a lack of perspective.
For example, next to the entry on XLink is an entry on XInclude, and both
are given the same rating (High) in the relevance category. Now
XInclude is going to be a really, really handy feature for people
who make web-sites, while XLink will (in time) fundamentally change
the way the world uses the world wide web.
The XSLT Programming Language?
The first time I encountered people writing about
(what we now call) XSLT as a true
programming language, my mind had trouble accepting
the notion. How could a stylesheet syntax be a
language?
Cascading Style Sheets hardly fired my
imagination, yet these writers certainly seemed excited about XSLT.
In November of 1998, I attended a seminar given by Dr. Steve DeRose
(co-editor of XPath, XPointer and XLink), and the
light was turned on. XSLT was about more than displaying XML on the
web. It was about.... well if you are reading this you probably know
the many things it will do.
So from that time I realized that XSLT was a
true language, but even I was surprised by some of the examples
in Michael Kay's book "XSLT".
I have read a couple of books on Artificial Intelligence that
work through the Knight's Tour problem.
This problem is how to move a Knight on a chess board so that every
square is visited and visited only once. I have seen this problem
worked out using the AI languages Prolog and Lisp. Trying to code
this in VB would definitely be "non-trivial".
So when I saw a section entitled Knight's Tour in Mr. Kay's book, I did a double take and thought "surely that can't
be the Knight's Tour" (as if there were others?). Indeed it is. Mr. Kay has implemented the Knight's Tour in
a 17k stylesheet (with no JavaScript).
Upon reflection, the ability to solve this in XSLT makes perfect sense, since XSLT is "powered" by recursion, which is
what powers Lisp and (in a different way) Prolog.
So the bottom line is that XSLT is a true language and one that gives the programmer the opportunity to be clever and elegant.
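As a small illustration of the recursion mentioned above (and nothing like Mr. Kay's Knight's Tour), here is a sketch of a named template that calls itself. The template name "factorial" and its parameter "n" are made up for this example. XSLT 1.0 has no loop counter to update, so a template calling itself on a smaller input is the usual way to repeat work.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical named template: computes n! by calling itself. -->
  <xsl:template name="factorial">
    <xsl:param name="n"/>
    <xsl:choose>
      <!-- Base case: stop recursing at 1 (or below). -->
      <xsl:when test="$n &lt;= 1">1</xsl:when>
      <xsl:otherwise>
        <!-- Recursive case: capture (n - 1)! in a variable first. -->
        <xsl:variable name="rest">
          <xsl:call-template name="factorial">
            <xsl:with-param name="n" select="$n - 1"/>
          </xsl:call-template>
        </xsl:variable>
        <xsl:value-of select="$n * $rest"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Entry point: works against any source document. -->
  <xsl:template match="/">
    <xsl:call-template name="factorial">
      <xsl:with-param name="n" select="5"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML document, this should write 120 to the output. A VB programmer would reach for a For loop here; in XSLT the variable "rest" is bound once per call and never reassigned.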
VB and Real Programmers
I once worked next to a C++ programmer who
could hardly hide his contempt for VB. One day
our boss told him to write a utility and since he
wanted it fast, he told this C++
programmer to write in VB. Oh the insult! I
helped him get a start showing him this and that
and then let him go at it. A couple times he
asked me some questions and then on the third day
he came over to my cube exclaiming "It's done!
It's done! This program does real work and I did
it in three days." I told him that it is a fairly common thing for VB
programmers to complete several programs within
their lifetimes.
Books on the Declarative Tools available to VBers
Two Smart Guys Overheard
If you have been programming with
the MS XML parser for a while, you were no doubt surprised to
discover that transformNode() is not part of the w3's XSLT standard.
Being able to invoke transformNode() off of any XML node is an
extremely handy way of processing a small sub-branch. The trick was
to have a template that matched a period (.) instead of having a
template that matched on the root (/) as you can see in the
following code:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match='.'>
    <xsl:apply-templates select="*"/>
  </xsl:template>
  <xsl:template match='fooKids'>
    <h2><xsl:value-of/></h2>
  </xsl:template>
</xsl:stylesheet>
Set xNode = xmlSession.selectSingleNode("fooDads[@id='Mark']")
sAnswer = xNode.transformNode(xmlSession.documentElement)
So with the code above, the instruction <xsl:apply-templates select="*"/> in the first template
would have selected the children of <fooDads id="Mark">.
If you try the same thing with MSXML3 it complains:
. may not appear to the right of / or // or be used with |.
This was incomprehensible to me since no slash was near there. As
it turns out, the dot match is simply illegal in XSLT. The
explanation from Jonathan Marsh, MS's XML main man: "In IE5 XSL,
match="." literally means "true if the context node exists". Since
XPath and XSLT both require a context node, this is always true."
Several people complained about this and there were some interesting
posts from Michael Kay, creator of SAXON, and Mr. Marsh.
From Mike Kay
XSLT 1.0 only allows a stylesheet to be applied to the root node of the
source document. The Microsoft transformNode() is a proprietary extension
that allows the transformation to be applied starting at any node (or
possibly just an element? I'm not sure). But there is no XSLT match pattern
that matches "the starting node of the transformation", because it isn't
needed as (in the standard) this is always the root.
If the stylesheet needs to know what the starting node was, I'd suggest
passing the information in as a parameter. In fact it's probably good advice
not to use transformNode() on any node other than the root, to keep your
stylesheet portable: I think you can always achieve the same effect using
parameters.
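(The following sketch is not part of Mr. Kay's post.) One way to read the parameter suggestion, assuming the fooDads/fooKids example from earlier on this page, is a standard XSLT 1.0 stylesheet that always starts at the root and uses a top-level parameter to pick the sub-branch. The parameter name "startId" and its default value are assumptions for illustration; how the caller sets it depends on the processor's own API.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed parameter: the id of the element to start from.
       The default 'Mark' is used when the caller passes nothing. -->
  <xsl:param name="startId" select="'Mark'"/>

  <!-- Begin at the root, as XSLT 1.0 expects, then jump straight to
       the children of the sub-branch named by the parameter. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//fooDads[@id = $startId]/*"/>
  </xsl:template>

  <!-- XSLT 1.0 requires a select attribute on xsl:value-of. -->
  <xsl:template match="fooKids">
    <h2><xsl:value-of select="."/></h2>
  </xsl:template>
</xsl:stylesheet>
```

Children of the selected fooDads that are not fooKids fall through to the built-in template rules, which simply copy their text to the output.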
From Jonathan Marsh
Sorry, Michael, I don't agree with your analysis here at all.
XSLT is designed as a whole to enable incremental regeneration. To suggest
that an interface that exposes incremental execution to the programmer is
not allowed goes against both the letter and the spirit of the XSLT
recommendation. To suggest that an API to XSLT processing is proprietary
might lead one to surmise that there IS a standard API, which is also not
the case. The transformNode method offers significant benefits not
available in other APIs, and enables applications that would be impossible
with a more limited API. As such it provides a differentiating factor for
MSXML.
There is nothing wrong with a template that matches all nodes, and thus is
sensitive to the context it gets executed on. Such a stylesheet is
completely portable, although the limitations of other APIs may make it
impossible to get the desired results. I'm not sure how parameters can be
made to solve this without proprietary extensions like saxon:evaluate()
anyway. Even if it is possible, passing the parameter into the stylesheet
must be done through proprietary interfaces.
Back to Mark Bosley
If you have read Michael Kay's XSLT, you know he is
brilliant and a visionary. But in this case I have to side with Mr.
Marsh. transformNode() is simply great. For a client-server
programmer moving to XML, transformNode() allows using XML in all sorts
of small things, as well as displaying web pages.