Summary
This snippet shows how to construct a single XPath expression that serves as the sort key when the strings to be sorted are built dynamically.
Categories: XPath, XSLT
This snippet shows:
- How to construct a string with an (if-like) XPath conditional expression.
- How to get rid of unwanted characters in a string.
At the beginning of February 2001 the following problem was raised on the xsl-list (http://sources.redhat.com/ml/xsl-list/2001-02/msg00178.html):
I have the following problem when sorting a list of words. My xml is:
<root>
  <line lineID="1">
    <word wordID="1">ABC-</word>
    <word wordID="2">ABCD</word>
  </line>
  <line lineID="2">
    <word wordID="1" type="end">DEF</word>
    <word wordID="2">XYZ</word>
  </line>
</root>
I have a stylesheet to create an alphabetical list of words. If a word contains a dash, it means it has to be joined with the following word that has an attribute type=end. The result is: ABCDEF ABCD XYZ
which is not what I want because the alphabetical order is wrong. I want the output to be: ABCD ABCDEF XYZ
My solution is contained in the XSL Code below.
The main difficulty here is specifying exactly what is to be sorted. The nodes to sort are the regular words, that is, every word that is not an ending (designated by @type='end'). There are two cases: either the word does not contain a hyphen, or it does. In the first case we must sort on the value of the current node (.). In the second case we have to concatenate the value of the current node with the value of the first following word node that is an ending (@type='end').
How can both cases be covered by one single XPath expression? The following expression comes quite close: concat(.,self::*[contains(.,'-')]/following::word[@type='end'])
If the current node does not contain a hyphen, the predicate filters it out, so the second argument of concat() is an empty node-set. An empty node-set converts to the empty string, and the result is just the value of the current node.
If the current node does contain a hyphen, the second argument of concat() is the node-set of all following word nodes that are endings. When a node-set is converted to a string, XPath 1.0 uses the string value of its first node in document order, so the result is the current word followed by the nearest following ending.
Just one little problem remains: the hyphen in the current node is still part of the concatenation. This will generally make the result of the sort incorrect.
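To make the remaining problem visible, the expression can be printed for every regular word. The stylesheet below is only an illustrative sketch, not the original code; it assumes the sample input from the question with its attribute values quoted so that it is well-formed.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Visit every regular word and skip the endings -->
    <xsl:for-each select="//word[not(@type = 'end')]">
      <!-- concat() converts its node-set argument to a string:
           the first node in document order, or '' if the set is empty -->
      <xsl:value-of
          select="concat(., self::*[contains(., '-')]/following::word[@type = 'end'])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For that input the three values should be ABC-DEF, ABCD and XYZ, one per line. The first value still carries the hyphen.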
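How do we fix this? Here the translate() function comes to help. Whenever every occurrence of a particular character has to be removed from a string, this can be achieved as follows: translate(aString, singleChar, ''). A character listed in the second argument that has no counterpart at the same position in the third argument is simply dropped from the result.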
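As a small standalone illustration of this idiom (a sketch with a literal string chosen for the example, independent of the input document):

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The third argument is empty, so each hyphen in the
         first argument has no replacement and is removed -->
    <xsl:value-of select="translate('ABC-DEF', '-', '')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to any XML document, this outputs ABCDEF.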
Therefore the complete solution is to have the select attribute of the xsl:sort as follows: select="translate(concat(.,self::*[contains(.,'-')]/following::word[@type='end']), '-', '')"
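One way this sort key might be used in a complete stylesheet is sketched below. The surrounding structure (a single template matching the root, text output, one word per line) is an assumption made for illustration and not necessarily the original solution; the input is assumed to be the sample document with quoted attribute values.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Only regular words are sorted; endings are pulled in by the key -->
    <xsl:for-each select="//word[not(@type = 'end')]">
      <!-- Sort key: the word, joined with the nearest following ending
           if it contains a hyphen, with all hyphens removed -->
      <xsl:sort select="translate(
                          concat(., self::*[contains(., '-')]/following::word[@type = 'end']),
                          '-', '')"/>
      <!-- xsl:sort must come first inside xsl:for-each, so the key cannot
           be stored in a variable beforehand and is repeated for output -->
      <xsl:value-of select="translate(
                              concat(., self::*[contains(., '-')]/following::word[@type = 'end']),
                              '-', '')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input this should produce ABCD, ABCDEF and XYZ, in that order, which is the ordering asked for in the question.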