Summary
Attribute normal or Element normal from an XSLT perspective?
There occasionally rages a debate over which is 'best': Attribute Normal or Element Normal XML. Without getting drawn into that debate on the architectural front, I'd like to examine the 'facts' on the performance front, especially with regard to XSLT performance.
Let's take two pieces of XML that contain essentially the same content data:-
Example A XML (Attribute Normal):-

    <?xml version="1.0"?>
    <root>
      <data status="1" value="abc"/>
      <data status="1" value="abc"/>
      <data status="1" value="abc"/>
      <!-- repeated many times -->
      <data status="1" value="abc"/>
    </root>

Example B XML (Element Normal):-

    <?xml version="1.0"?>
    <root>
      <data><status>1</status><value>abc</value></data>
      <data><status>1</status><value>abc</value></data>
      <data><status>1</status><value>abc</value></data>
      <!-- repeated many times -->
      <data><status>1</status><value>abc</value></data>
    </root>

And produce two stylesheets that process this data, filtering by status and displaying the value (attribute or element):-
Example A XSLT (to handle Attribute Normal):-

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <xsl:for-each select="root/data[@status = '1']">
          <xsl:value-of select="@value"/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

Example B XSLT (to handle Element Normal):-

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <xsl:for-each select="root/data[status = '1']">
          <xsl:value-of select="value"/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

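To make it concrete that the two stylesheets do the same job, here is a rough Python sketch that mirrors their selection logic. It is not XSLT and not one of the processors tested; it uses the standard library's ElementTree, whose limited XPath support covers both predicates. The three-record documents, the status="0" record and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny stand-ins for Example A and Example B. The status="0" record is an
# illustrative addition so that the filter actually has something to reject.
ATTRIBUTE_NORMAL = """<root>
  <data status="1" value="abc"/>
  <data status="0" value="skipped"/>
  <data status="1" value="def"/>
</root>"""

ELEMENT_NORMAL = """<root>
  <data><status>1</status><value>abc</value></data>
  <data><status>0</status><value>skipped</value></data>
  <data><status>1</status><value>def</value></data>
</root>"""


def select_attribute_normal(xml_text):
    """Mirror root/data[@status = '1'] followed by value-of @value."""
    root = ET.fromstring(xml_text)
    return "".join(d.get("value") for d in root.findall("data[@status='1']"))


def select_element_normal(xml_text):
    """Mirror root/data[status = '1'] followed by value-of value."""
    root = ET.fromstring(xml_text)
    return "".join(d.findtext("value") for d in root.findall("data[status='1']"))


result_a = select_attribute_normal(ATTRIBUTE_NORMAL)
result_b = select_element_normal(ELEMENT_NORMAL)
print(result_a, result_b)
```

Both calls return the same string, so any timing difference between the two approaches comes from the document structure rather than from the work being asked for.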
There are three phases involved in any transformation as a whole: i) parsing the XML; ii) parsing the XSL; and iii) performing the actual transformation. Let's start by examining the relative performance of just the XML parsing phase:-
MSXML3 - Attribute Normal (A) is 33% faster than Element Normal (B)
MSXML4 - Attribute Normal (A) is 15% faster than Element Normal (B)
Xalan-C - Attribute Normal (A) is 22% faster than Element Normal (B)
These results should come as no great surprise, since the Element Normal document is obviously larger than the Attribute Normal one. So, just as an aside, let's adjust these comparative figures for the relative document size. On a crude parsing-time-per-character basis the figures come out as:-
MSXML3 - Element Normal (B) is 26% faster than Attribute Normal (A)
MSXML4 - Element Normal (B) is 42% faster than Attribute Normal (A)
Xalan-C - Element Normal (B) is 36% faster than Attribute Normal (A)
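For anyone wanting to follow the per-character adjustment, here is a rough Python sketch of the arithmetic. It rests on two assumptions that are not stated in the original figures: that "A is p% faster than B" means parsing B takes (1 + p) times as long as parsing A, and that the size ratio of the two documents is about the same as the length ratio of one record of each shape.

```python
# One record of each shape, as in Example A and Example B.
RECORD_A = '<data status="1" value="abc"/>'
RECORD_B = "<data><status>1</status><value>abc</value></data>"

# Raw parsing advantage of Attribute Normal as quoted above, read here as
# "parsing B takes (1 + p) times as long as parsing A" (an assumption).
RAW_ADVANTAGE_A = {"MSXML3": 0.33, "MSXML4": 0.15, "Xalan-C": 0.22}


def per_char_advantage_b(raw_advantage_a, size_a, size_b):
    """Return B's per-character speed advantage over A as a fraction."""
    time_a, time_b = 1.0, 1.0 + raw_advantage_a
    per_char_a = time_a / size_a  # parse time per character of A
    per_char_b = time_b / size_b  # parse time per character of B
    return per_char_a / per_char_b - 1.0


adjusted = {
    name: per_char_advantage_b(p, len(RECORD_A), len(RECORD_B))
    for name, p in RAW_ADVANTAGE_A.items()
}
for name, value in adjusted.items():
    print(f"{name}: Element Normal about {value:.0%} faster per character")
```

With these assumed sizes (30 and 49 characters per record) the sketch gives roughly 23%, 42% and 34%. That is close to the figures quoted but not identical, which is to be expected since the real file sizes are not given.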
So it would appear that although the overall parsing time for Attribute Normal is faster than for Element Normal, the process of actually mapping nodes is faster for Element Normal. This seems quite logical when you consider that, amongst other things, attributes take more processing to ensure the well-formedness of the XML (e.g. to check that attribute names are not repeated within an element).
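The duplicate-name rule itself is easy to see in action. The sketch below uses Python's standard ElementTree parser (not MSXML or Xalan-C) and a made-up helper, is_well_formed. It says nothing about how much the check costs, only that a parser has to make it for attributes and not for child elements.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A repeated attribute name on one element is a well-formedness error...
duplicate_attribute = '<data status="1" status="2"/>'
# ...whereas repeated child elements are perfectly legal.
duplicate_element = "<data><status>1</status><status>2</status></data>"

print(is_well_formed(duplicate_attribute))  # False
print(is_well_formed(duplicate_element))    # True
```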
But this is an aside; let's get back to how this affects XSLT performance in handling the two different XML structures. In tests with extremely large input XML documents, the comparative figures for processing Attribute Normal and Element Normal come out identical. Even in repeated tests the differences were negligible and inconsistent: there was no conclusive or sizeable advantage for either XML document type (any differences were fractions of a percent and swung in favour of one, then the other).
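If you want to try this kind of comparison yourself, the sketch below shows one way to time the parsing phase and the selection phase separately, keeping the best of several runs. It is only a stand-in: it uses Python's standard ElementTree rather than an XSLT processor, the record count and function names are arbitrary choices of mine, and whatever numbers it prints say nothing about MSXML or Xalan-C.

```python
import timeit
import xml.etree.ElementTree as ET


def build_documents(count):
    """Build Attribute Normal and Element Normal documents with `count` records."""
    attr = "<root>" + '<data status="1" value="abc"/>' * count + "</root>"
    elem = ("<root>"
            + "<data><status>1</status><value>abc</value></data>" * count
            + "</root>")
    return attr, elem


def best_time(func, repeats=5):
    """Run func several times and keep the fastest run to reduce noise."""
    return min(timeit.repeat(func, number=1, repeat=repeats))


def measure(count=20000):
    """Time parsing and selection separately for both document shapes."""
    attr_xml, elem_xml = build_documents(count)
    attr_root = ET.fromstring(attr_xml)
    elem_root = ET.fromstring(elem_xml)
    return {
        "parse A": best_time(lambda: ET.fromstring(attr_xml)),
        "parse B": best_time(lambda: ET.fromstring(elem_xml)),
        # Selection runs on an already-parsed tree, so parse cost is excluded.
        "select A": best_time(
            lambda: [d.get("value") for d in attr_root.findall("data[@status='1']")]
        ),
        "select B": best_time(
            lambda: [d.findtext("value") for d in elem_root.findall("data[status='1']")]
        ),
    }


timings = measure()
for label, seconds in timings.items():
    print(f"{label}: {seconds * 1000:.2f} ms")
```

Taking the minimum over several runs is a deliberate choice here: it filters out runs that were slowed down by other activity on the machine, which matters when the differences being looked for are fractions of a percent.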
The conclusion: parsing of Attribute Normal is faster than Element Normal, but this should never be taken into consideration without also acknowledging the fundamental advantages and limitations of each architecture. From an XSLT transformation performance stance, however, there is no difference either way!
Cheers - Marrow http://www.MarrowSoft.com - home of Xselerator (XSLT IDE and debugger) http://www.TopXML.com/Xselerator