Mark Wilson I am the creator of TopXML. I am available for international and local (Australia) contracts. I am a Solution Architect/Business Analyst. I have worked in IT in several countries (NZ, Australia, South Africa, UK) building and training teams for government and very large non-governmental organizations. I am ex-Microsoft Consulting Services. I wrote the first book on Microsoft XML published in 2000 called XML Programming with VB and ASP. Most recently I have been building tools for the SEO industry. Ask me for a 37 point SEO health-checkup for your website.
First posted :
03/24/2008
Times viewed :
287
SQLXML - Raw mode
RAW mode queries result in a flat table-like
XML format. The format does not preserve any information about the origin of
the data or hierarchical relationships. SQL Server simply transforms each row
of the result set into an XML element with the name row, very similar to
the output format of the persist-to-XML option of classic ADO Recordsets. Every
column that is not NULL is mapped to an attribute of the column’s name. The
example SQL statement below illustrates using XML RAW:
The element does not reflect that CompanyName and ProductName came from
different tables. In fact it does not even indicate anything about either table
the results came from. It only lists the columns we selected.
AUTO mode returns the results in a more descriptive XML format. Each
selected row results in an element named after the table from which it was
selected. The selected columns result in attributes of the elements by default.
If the SELECT statement joins multiple tables the results from the joined table
are represented as child elements. AUTO mode also recognizes the ELEMENTS
option. If we append “, ELEMENTS” to the query statement columns in the
returned rowset are also mapped to child elements rather than attributes. The
nesting of elements and their children is determined by the order of the tables
in the SELECT clause. Thus the order of the columns in the SELECT clause is
significant. We can see the difference if we replace RAW with AUTO in the above
query:
from Products JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
Where CompanyName="Exotic Liquids"
FOR XML AUTO
The results do show each column as attributes on elements named after the
table from which they were selected. The result also reflects the parent-child
relationship of the joined tables “Suppliers” and “Products” by nesting the
Product children inside Suppliers parents:
Note: You can change element and attribute names by defining aliases for
tables and columns in the SQL query.
EXPLICIT Mode
The EXPLICIT mode is, well, the most explicit. This mode allows you to
control every aspect of the XML format returned by the query. In EXPLICIT mode,
the SQL query must be written in a very specific way. Within the query we need
to provide not only information about the data to query, but also the XML
format in which to return the results. We are fully responsible for ensuring
the XML is well formed and valid. The key is to setup the SQL statement as to
result in a very specific rowset format called a “universal table”. EXPLICIT
mode then transforms the universal table to an XML document. The easiest way to
understand a universal table is to build one. Once you have mastered the first
one, the next one is much easier, so let’s go ahead and look at the FOR XML
EXPLICIT query below that returns the same format as the previous FOR XML AUTO
query and examine how it works.
1 A FOR XML
EXPLICIT query returning XML from SQL Server.
SELECT 1 As Tag, NULL As Parent,
CompanyName As [Suppliers!1!CompanyName],
NULL As [Products!2!ProductName],
NULL As [Products!2!UnitsInStock],
NULL As [Products!2!UnitsOnOrder]
FROM Suppliers
UNION ALL
SELECT 2 As Tag, 1 As Parent,
CompanyName,
ProductName,
UnitsInStock,
UnitsOnOrder
from Products JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
ORDER BY [Suppliers!1!CompanyName], [Products!2!ProductName]
FOR XML EXPLICIT
Did you notice that the statement actually combines two individual SELECT
statements to produce the universal table. The first column in each of the
SELECT statements is a named column and its type is a number. This column is
represented as the Tag column. We will see in just a minute what it does. The
second column in the SELECT clause is also a named column and its type is also
a number, I’ll refer to this as the Parent column. These two columns together
specify the parent-child relationship between the results of the two SELECT
queries in the resulting XML document. The Tag column stores the tag number by
which children can identify their parents in the parent column. Thus, if the
Parent column is 0 or NULL, then the row is placed at the highest level of the
XML tree. In our example query the Supplier data is the highest level because
we selected NULL into the Parent column – not because its Tag is 1! According
to this rule, the results from the second SELECT in our example make up the
children of the first one because the parent column references the tag value
from the first statement. Which children appear under which parent is
determined by the JOIN clause of the second statement.
Even though you will never see or use a universal table directly, it is
easier to work out EXPLICIT queries if you think in terms of the universal
table concept. The table 8.13 below shows part of the universal table from our
query for one supplier.
1.3 An
example universal table
Tag
Parent
Supplier!1!CompanyName
Products!2!ProductName
Products!2!UnitsInStock
1
NULL
ExoticLiquids
NULL
NULL
2
1
ExoticLIquids
Aniseed Syrup
13
2
1
ExoticLiquids
Chai
39
2
1
ExoticLiquids
Chang
17
The output of the query above will produce results in the same XML format
returned by the FOR XML AUTO query in the previous section:
Despite the tedious work setting up the universal table, we have not gained
anything over querying with FOR XML AUTO yet. Now let’s see why FOR XML
EXPLICIT is much more powerful than XML AUTO. In the introduction to this
article promised that EXPLICIT gives us full control over the returned XML
format and here is how.
You probably noticed the strange-looking column names specified in the query
statement. These names actually encode all the information SQL Server needs to
format the query results. Each name contains up to four data fields separated
by a “!”:
ElementName!TagNumber!AttributeName!Directive
The meaning of the individual fields is:
The ElementName field specifies the name of the XML to create in the
XML document.
The TagNumber identifies the nesting level of this column in the
output document. The value must match a Tag value in the universal table.
The AttributeName is the name of an XML attribute to create in the
XML document for the specified column. If the AttributeName is missing the
format will be determined by the Directive portion of the format.
Directive is an optional directive that can be used to encode XML
ID, IDREF, and IDREFS or it can indicate how to map string data to XML byusing
the keywords hide, element, xml, xmltext, and cdata. If Directive is not
specified, the AttributeName must be specified. Table 15.4 lists the directives
recognize by Sql Server 2000.
1.4 The
Directive of an FOR XML EXPLICIT query controls the XML format returned for a
particular column in the rowset.
Directive
Description
ID, IDREF, and IDREFS
Identities XML ID columns and the IDREF columns referencing them. This
clause has no effect on the returned format if XMLDATA is not specified
anywhere in the FOR XML clause.
hide
Omits the field from the result XML document
element
Formats the field as an XML element rather than an attribute. If the
field’s data contains characters restricted in XML document, the characters
are automatically converted to the corresponding entity references.
xml
Same as “element” directive without replacing restricted characters.
xmltext
Embeds an XML fragment from a column directly in the returned XML without
a containing element. The column has to be of a text type.
cdata
Wraps the field data in a CDATA section. The field’s data type must be
character based. As of release 3.0 SP 1 FOR XML EXPLICIT cannot create CDATA
sections inside a child element. In those cases you have to manually create
the CDATA markers in the SQL.
I think by now you realize that EXPLICIT mode queries are very powerful. You
probably also realize that they also can be quite cumbersome to develop,
especially when your schema nests types more than two levels deep. Fortunately
the SQLXML web releases provide a much easier-to-use wrapper around EXPLICIT
mode queries. If you like the idea of returning arbitrarily formatted XML from
SQL Server, be sure to read the section on mapping schemas and the SQLXML
managed classes.
XMLDATA
The last option we need to discuss with FOR XML is the XMLDATA option.
XMLDATA’s purpose is to generate an XDR that describes the format of the XML
document returned for a query. The schema is generated and returned inline with
the rest of the XML document. Care should be taken when using the XMLDATA
option. Generating the schema takes server resources that could be used for
other more “SQL-like” things. Since the XML that is returned is character data,
you may just want to use the XML as it is without the XDR. This is especially true
if the query is to be run hundreds or thousands of times a second.