By Mark Wilson
I am the creator of TopXML. I am available for international and local (Australia) contracts. I am a Solution Architect/Business Analyst. I have worked in IT in several countries (NZ, Australia, South Africa, UK) building and training teams for government and very large non-governmental organizations. I am ex-Microsoft Consulting Services. I wrote the first book on Microsoft XML published in 2000 called XML Programming with VB and ASP. Most recently I have been building tools for the SEO industry. Ask me for a 37 point SEO health-checkup for your website.
First Posted 11/04/2002

Learn XQuery 1.0 fundamentals


Summary: XQuery 1.0 is a new query language for XML. It builds on XPath 2.0 and goes beyond it with iteration, node construction, and user-defined functions. We look at XQuery syntax and learn it through tested, real-world examples.


An Introduction to XQuery

As increasing amounts of information are stored, exchanged, and presented using XML, the ability to intelligently query XML data sources becomes increasingly important. One of the great strengths of XML is its flexibility in representing many different kinds of information from diverse sources. To exploit this flexibility, an XML query language must provide features for retrieving and interpreting information from these diverse sources.

What is XQuery?

XQuery version 1.0 is a public W3C Working Draft, published on 16 August 2002 for review by W3C Members and other interested parties. It is available at <http://www.w3c.org/XML/Query>. The W3C has also released XQueryX <http://www.w3.org/TR/xqueryx> version 1.0, which is an alternative XML representation of an XQuery. A discussion of XQueryX is beyond the scope of this article.

XQuery is derived from an XML query language called Quilt. It is often said that XQuery is to XML documents what SQL is to tables. Its queries consist of expressions, and it shares its data model with XPath version 2.0 <http://www.w3.org/TR/xpath20>. XQuery's type system is based on XML Schema <http://www.w3c.org/XML/Schema>. It provides around 200 built-in functions and operators, which are defined in XQuery 1.0 and XPath 2.0 Functions and Operators <http://www.w3.org/TR/xquery-operators>.

Using XQuery, we can select information from XML documents. We can also construct XML structures dynamically and fill them with the results of expressions. Because expressions can perform many kinds of computation, much of the business logic of an application can be written in XQuery. I think the language is not yet complete, however, because we cannot update or delete information in an XML document using XQuery.

Let's take a detailed look at the syntax of XQuery.

Expressions

XQuery is an expression driven language. Its version 1.0 is an extension of XPath Version 2.0. Any expression that is syntactically valid and executes successfully in both XPath 2.0 and XQuery 1.0 will return the same result in both languages.

Inside element content, an XQuery expression is enclosed in braces ( {} ), for example:

{ XQuery expression(s) }

The XQuery processor evaluates only the expression that is enclosed in braces. Text in element content that lies outside the braces is treated as a plain string and is not evaluated.

All query examples in this article are based on the following XML document, named bib.xml. All queries were tested with the X-Hive <http://www.x-hive.com/xquery> online demo XQuery engine.

<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
  <book year="1999">
    <title>The Economics of Technology and Content for Digital TV</title>
    <editor>
      <last>Gerbarg</last><first>Darcy</first>
      <affiliation>CITI</affiliation>
    </editor>
    <publisher>Kluwer Academic Publishers</publisher>
    <price>129.95</price>
  </book>
</bib>

The following sections introduce each of the basic kinds of XQuery expression.

Primary Expressions

Primary expressions are the basic primitives of the language. They include literals, variables, function calls, comments, and the use of parentheses to control precedence of operators. The following sections take a detailed look at each type:

1. Literals

A literal is a direct syntactic representation of an atomic value i.e. it doesn’t contain another value. XQuery supports two kinds of literals: string literals and numeric literals.

Here are some examples of literal expressions:

  • "12.5" denotes the string containing the characters '1', '2', '.', and '5'.
  • 12 denotes the integer value twelve.
  • 12.5 denotes the decimal value twelve and one half.
  • 125E2 denotes the double value twelve thousand, five hundred.

2. Variables

A variable evaluates to the value to which its name is bound in the evaluation context. Variables can be bound by clauses in for expressions and quantified expressions, and by function calls. For example:

for $var in document("bib.xml")/bib/book return $var

In this query, each book node in turn is bound to the variable $var. The dollar sign ($) is part of XQuery syntax and marks a variable.

3. Parenthesized Expressions

Parentheses may be used to enforce a particular evaluation order in expressions that contain multiple operators. For example, the expression (2 + 4) * 5 evaluates to thirty, since the parenthesized expression (2 + 4) is evaluated first and its result is multiplied by five. Without parentheses, the expression 2 + 4 * 5 evaluates to twenty-two, because the multiplication operator has higher precedence than the addition operator.

4. Function Calls

A function call works as it does in other languages such as C++ and Java. It consists of the function name followed by a parenthesized argument list. For example:

three-argument-function(1, 2, 3)

The call above passes three arguments to the function. XQuery also supports user-defined functions, which we will discuss later in the article.

5. Comments

XQuery comments can be used to provide informative explanation. For example

{-- This expression calculates prices of books --}

Comments may be used before and after major tokens within expressions and within element content. XQuery ignores the comments and doesn’t evaluate them.

Path Expressions

A path expression locates nodes within an XML tree and extracts information from XML documents in the form of nodes, much as a SELECT statement does in SQL. We discuss path expressions only briefly in this section. As I mentioned above, XQuery shares the XPath 2.0 data model, so if you want to learn more about path expressions, see the XPath 2.0 specification <http://www.w3.org/TR/xpath20>.

For example:

/bib/book

The expression above returns all book elements that are children of the bib element.

/bib/book[@year=2000]/title

The second expression returns the title element of the book that was released in the year 2000.
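For readers who want to experiment without an XQuery engine, the two path expressions can be approximated in Python with the standard xml.etree.ElementTree module. This is only a sketch: ElementTree supports a small subset of XPath, the inline document is an abbreviated copy of bib.xml, and the variable names are my own.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of bib.xml: only the year attribute and title of each book.
BIB_XML = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title></book>
  <book year="2000"><title>Data on the Web</title></book>
  <book year="1999"><title>The Economics of Technology and Content for Digital TV</title></book>
</bib>
"""

bib = ET.fromstring(BIB_XML)  # the root <bib> element

# /bib/book : every book child of the root element
books = bib.findall("book")

# /bib/book[@year=2000]/title : ElementTree compares attribute values as strings
titles_2000 = [title.text for title in bib.findall("book[@year='2000']/title")]

print(len(books))
print(titles_2000)
```

Because the search starts at the root element, the Python paths are written relative to bib rather than starting with /bib.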

Arithmetic Expressions

Arithmetic expressions can be used for calculations like addition, subtraction and multiplication in XQuery.

Here is an example of arithmetic expressions:

let $book := document("bib.xml")/bib/book
return
<result>
{ $book[@year=1992]/price - $book[@year=2000]/price }
</result>

The result is as follows:

<result>

26.0

</result>
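The same subtraction can be sketched in Python to check the arithmetic by hand. The sketch assumes an abbreviated copy of bib.xml and uses a helper, price_of, that is my own invention; Decimal is used so that the prices are not distorted by binary floating point.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Abbreviated copy of bib.xml: only the two books used in the query.
BIB_XML = """
<bib>
  <book year="1992"><price>65.95</price></book>
  <book year="2000"><price>39.95</price></book>
</bib>
"""

bib = ET.fromstring(BIB_XML)

def price_of(year):
    """Return the price of the book published in the given year as a Decimal."""
    text = bib.findtext(f"book[@year='{year}']/price")
    return Decimal(text.strip())

# $book[@year=1992]/price - $book[@year=2000]/price
difference = price_of("1992") - price_of("2000")
print(f"<result>{difference}</result>")
```

Decimal keeps both decimal places, so the sketch prints 26.00 where the XQuery engine printed 26.0; the value is the same.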

Comparison Expressions

Comparison expressions allow two values to be compared. The result of a comparison is always true or false. XQuery provides the following kinds of comparison:

1. Value Comparison

Value comparisons are made with the following operators: eq, ne, lt, le, gt, ge. The operator names are abbreviations:

  • eq = equals
  • ne = not equals
  • lt = less than
  • le = less than or equals
  • gt = greater than
  • ge = greater than or equals

For example:

Result of the following query is false:

{ 1 eq 2 }

Result of the following query is true:

{ 1 ne 2}

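The result of the following query is true if the book has an author whose last name is Stevens. A value comparison expects a single value on each side, so here $book must be bound to one book that has one author:

{ $book/author/last eq "Stevens" }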

2. General Comparison

General comparisons are made with the operators =, !=, <, <=, >, >=. Here is an example of a general comparison:

{ $book/author/last = "Stevens" }

The result is true if any author of the book has the last name Stevens.
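The difference between a value comparison and a general comparison is easiest to see on a book with several authors. The Python sketch below mimics the two behaviours as I understand the XQuery rules: a general comparison is true if any item matches, while a value comparison accepts at most one item on each side. The function names general_eq and value_eq are hypothetical.

```python
import xml.etree.ElementTree as ET

# One book from bib.xml that has three authors.
BOOK_XML = """
<book year="2000">
  <title>Data on the Web</title>
  <author><last>Abiteboul</last><first>Serge</first></author>
  <author><last>Buneman</last><first>Peter</first></author>
  <author><last>Suciu</last><first>Dan</first></author>
</book>
"""

book = ET.fromstring(BOOK_XML)
last_names = [last.text for last in book.findall("author/last")]

def general_eq(values, target):
    """Mimic '=': true if at least one item in the sequence equals the target."""
    return any(value == target for value in values)

def value_eq(values, target):
    """Mimic 'eq': more than one item is an error, an empty sequence gives no result."""
    if len(values) > 1:
        raise TypeError("eq expects a single value, got a sequence")
    if not values:
        return None
    return values[0] == target

print(general_eq(last_names, "Buneman"))  # one of the three authors matches
```

Calling value_eq(last_names, "Buneman") raises TypeError here, which is why eq is only safe when the path is known to select a single node.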

3. Nodes Comparison

To compare nodes, XQuery provides two operators, is and isnot. For example:

{ $book[@year=2000]/price is $book[@year=1999]/price}

The result of query is false because both are different nodes.

4. Order Comparison

To check the order in which nodes occur, we can use the << and >> operators of XQuery. For example:

{$book/author << $book/price}

The result of the query is true because the author element occurs before the price element within the book element.
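Node identity and document order can both be illustrated with a small Python sketch. It assumes a shortened book element and relies on two properties of ElementTree: each element is a distinct Python object, and iter() visits elements in document order. The position table is my own helper, not part of any XQuery API.

```python
import xml.etree.ElementTree as ET

# A shortened book element with one author and a price.
BOOK_XML = """
<book year="2000">
  <title>Data on the Web</title>
  <author><last>Abiteboul</last><first>Serge</first></author>
  <price>39.95</price>
</book>
"""

book = ET.fromstring(BOOK_XML)

# iter() walks the tree in document order, so the index gives each node's position.
position = {id(node): index for index, node in enumerate(book.iter())}

author = book.find("author")
price = book.find("price")

same_node = author is book.find("author")   # like 'is' on the same node
different_nodes = author is price           # like 'is' on two different nodes
author_first = position[id(author)] < position[id(price)]  # like author << price

print(same_node, different_nodes, author_first)
```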

Logical Expressions

A logical expression is either an and-expression or an or-expression. Logical expressions are used to test more than one condition. The result of a logical expression is always true or false.

Here are some examples of logical expressions:

{ 1 eq 1 and 2 eq 2 }

The operator 'and' returns true if all of its operands (conditions) return true. The result of the expression above is true because both operands return true.

{ 1 eq 1 or 2 eq 3}

The operator 'or' returns true if at least one operand (condition) returns true. The result of the expression above is true because the first operand returns true.

Constructors

XQuery provides constructors that can create XML structures at run time within a query. There are constructors for elements, attributes, CDATA sections, processing instructions, and comments. The constructed XML structures have to obey the rules of XML, i.e. they must be well-formed. Let us take a detailed look at each kind of constructor:

1. Element Constructor

Elements containing every kind of structure (attributes, text, further child elements, and so on) can be created in a query. For example:

<book year="2002">
  <title> Web Service using Java </title>
  <author> Qurban Ali </author>
  <publisher> Gill publications </publisher>
  <price> 40 </price>
</book>

You can also add expressions in elements to insert values at run time. For Example:

<book year="2002">
--------------------
<price> {20 + 20 } </price>
</book>

The XQuery processor evaluates only the expression enclosed in braces; the remaining content is treated as a simple string. You can also place an expression in an attribute value so that the attribute receives a dynamic value at run time.

Namespace Handling

We can also use namespaces in an element constructor, declaring them just as we do in XML documents. For example:

<book xmlns:book="http://www.abc.com" year="2002">
  <book:title> Web Service using Java </book:title>
  <book:author> Qurban Ali </book:author>
  <book:publisher> Gill publications </book:publisher>
  <book:price> 40 </book:price>
</book>

2. Attribute Constructor

The result of an attribute constructor is a new attribute node. For example:

<book year={2002}>

-----------------------

</book>

The value of year attribute is: 2002.

3. Other Constructors

The syntax for a CDATA section constructor, a processing instruction constructor, or an XML comment constructor is the same as the syntax of the equivalent XML construct.

<?format role="output" ?>

<!-- Tags are ignored in the following section -->

<![CDATA[ <Books qty="3"/> ]]>

A special form of constructor called a computed constructor can be used to create an element or attribute with a computed name or to create a document node.

Computed Constructors

An alternative way to create nodes is by using a computed constructor. A computed constructor begins with a keyword that identifies the type of node to be created: element, attribute, or document. The keyword element or attribute is followed by the name of the node to be created (document nodes have no name). Here we use computed constructors to create the same book element that we created earlier with an element constructor:

element book
{
  attribute year {2002},
  element title {"Web Service using Java"},
  element author {"Qurban Ali"},
  element publisher {"Gill publications"},
  element price {40}
}

The result is the same as in the earlier element constructor example.
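The idea behind a computed constructor, building a node from a name and content that are only known at run time, can be sketched in Python with ElementTree. The helper build_element is hypothetical, and the values mirror the element constructor example shown earlier (price 40).

```python
import xml.etree.ElementTree as ET

def build_element(name, attributes, children):
    """Create an element whose name, attributes and child elements are computed at run time."""
    element = ET.Element(name, attributes)
    for child_name, text in children:
        ET.SubElement(element, child_name).text = text
    return element

book = build_element(
    "book",
    {"year": "2002"},
    [
        ("title", "Web Service using Java"),
        ("author", "Qurban Ali"),
        ("publisher", "Gill publications"),
        ("price", "40"),
    ],
)

xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)
```

Because the element names are ordinary strings, they could just as well come from a variable or another query result, which is the main reason to prefer a computed constructor over a literal one.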

Creating a document node with a computed constructor is useful when we want to build a new document containing elements of another XML document. For example:

document {
  <BookTitleList>
    { document("bib.xml")/bib/book/title }
  </BookTitleList>
}

The resulting XML document contains the titles of all books retrieved from bib.xml. The BookTitleList element is the root element of the resulting document.

FLWR Expressions

XQuery provides a FLWR expression for iteration and for binding variables to the results returned by expressions at run time. The name FLWR, pronounced flower, stands for the keywords for, let, where, and return. It plays a role similar to the SELECT statement of SQL.

Although for and let both bind variables, the manner in which variables are bound is quite different. In a let clause, each variable is bound directly to the result of an expression, as a whole. Consider the following query:

let $author := distinct-values(document("bib.xml")/bib/book/author)
return
<result> { $author } </result>

All author nodes returned by the expression are bound to the $author variable. distinct-values is a built-in function that removes duplicates from the result. The XQuery processor inserts the whole result at once into the result element. The result of this query is as follows:

<result>
  <author>
    <last>Stevens</last>
    <first>W.</first>
  </author>
  <author>
    <last>Abiteboul</last>
    <first>Serge</first>
  </author>
  <author>
    <last>Buneman</last>
    <first>Peter</first>
  </author>
  <author>
    <last>Suciu</last>
    <first>Dan</first>
  </author>
</result>

Now consider a similar query which contains a for clause instead of a let clause:

for $author in distinct-values(document("bib.xml")/bib/book/author)
return
<result> { $author } </result>

The result of the query is as follows:

<result>
  <author>
    <last>Stevens</last><first>W.</first>
  </author>
</result>
<result>
  <author>
    <last>Abiteboul</last><first>Serge</first>
  </author>
</result>
<result>
  <author>
    <last>Buneman</last><first>Peter</first>
  </author>
</result>
<result>
  <author>
    <last>Suciu</last><first>Dan</first>
  </author>
</result>

Now we can see the difference between the results produced by the two keywords. The for keyword iterates over the resulting nodes one by one, so the return clause receives one node at a time. The number of iterations equals the number of resulting nodes. A FLWR expression may contain multiple for clauses.
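The contrast between let and for can be sketched in Python: let corresponds to binding a name to the whole list once, and for corresponds to looping over the list. The sketch assumes an abbreviated copy of bib.xml, represents each author as a "Last, First" string for brevity, and uses dict.fromkeys as a stand-in for distinct-values.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of bib.xml: only the author and editor names.
BIB_XML = """
<bib>
  <book year="1994">
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book year="1992">
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book year="2000">
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
  </book>
  <book year="1999">
    <editor><last>Gerbarg</last><first>Darcy</first></editor>
  </book>
</bib>
"""

bib = ET.fromstring(BIB_XML)

# distinct-values(...): keep the first occurrence of each author, in document order
authors = list(dict.fromkeys(
    f"{a.findtext('last')}, {a.findtext('first')}"
    for a in bib.findall("book/author")
))

# let: the variable holds the whole sequence, so return runs once
let_results = ["<result>" + "; ".join(authors) + "</result>"]

# for: the variable holds one item at a time, so return runs once per item
for_results = [f"<result>{author}</result>" for author in authors]

print(len(let_results), len(for_results))
```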

XQuery also provides the where keyword, which can be used with both let and for. It places a condition on the query so that only the desired nodes are returned. The following example uses the general comparison = rather than eq because a book may have several authors:

<result>
{
  for $books in document("bib.xml")/bib/book
  where $books/author/last = "Abiteboul"
  return $books/title
}
</result>

The result of the query is as follows:

<result>

<title>Data on the Web</title>

</result>
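A for/where/return query maps quite naturally onto a Python list comprehension, which may help readers coming from that language. The sketch below assumes an abbreviated copy of bib.xml and treats the where condition as "any author has this last name"; the variable names are my own.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of bib.xml: titles and author last names only.
BIB_XML = """
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last></author>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last></author>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last></author>
    <author><last>Buneman</last></author>
    <author><last>Suciu</last></author>
  </book>
</bib>
"""

bib = ET.fromstring(BIB_XML)

titles = [
    book.findtext("title")                      # return $books/title
    for book in bib.findall("book")             # for $books in .../bib/book
    if any(last.text == "Abiteboul"             # where some author's last name matches
           for last in book.findall("author/last"))
]

print(titles)
```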

Conditional Expressions

XQuery supports a conditional expression based on the keywords if, then, and else. The expression following the if keyword is called the test expression, and the expressions following the then and else keywords are called the then-expression and else-expression, respectively.

If the test expression returns true, the then-expression is evaluated and its value is returned; if the test expression returns false, the else-expression is evaluated and its value is returned.

For example:

let $book := document("bib.xml")//bib/book
return
<result>
{ if ($book/author/last = "Stevens")
  then $book[@year=1992 or @year=1994]/title
  else ()
}
</result>

The result of the query is as follows:

<result>

<title>TCP/IP Illustrated</title>

<title>Advanced Programming in the Unix environment</title>

</result>

Quantified Expressions

Quantified expressions check whether nodes in an XML document satisfy a condition, using the keywords some and every. These expressions always return true or false.

The keyword every returns true if all nodes returned by an expression satisfy the given condition. In the following query the result is true because every book element has a year attribute (regardless of the values of these attributes):

let $bib := document("bib.xml")//bib
return
<result> { every $book in $bib/book satisfies $book/@year } </result>

The keyword some returns true if at least one of the resulting nodes satisfies the given condition. In the following query the result is true because one book element contains an editor element:

let $bib := document("bib.xml")//bib
return
<result> { some $book in $bib/book satisfies $book/editor } </result>
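The every and some keywords behave much like the built-in all() and any() functions of Python. The following sketch checks the same two conditions against an abbreviated copy of bib.xml; the variable names are my own, and the third check is added only to show a quantifier that fails.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of bib.xml: year attributes and the single editor element.
BIB_XML = """
<bib>
  <book year="1994"><author><last>Stevens</last></author></book>
  <book year="1992"><author><last>Stevens</last></author></book>
  <book year="2000"><author><last>Abiteboul</last></author></book>
  <book year="1999"><editor><last>Gerbarg</last></editor></book>
</bib>
"""

books = ET.fromstring(BIB_XML).findall("book")

# every $book in $bib/book satisfies $book/@year
every_has_year = all(book.get("year") is not None for book in books)

# some $book in $bib/book satisfies $book/editor
some_has_editor = any(book.find("editor") is not None for book in books)

# every $book in $bib/book satisfies $book/editor (false: only one book has an editor)
every_has_editor = all(book.find("editor") is not None for book in books)

print(every_has_year, some_has_editor, every_has_editor)
```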

The Query Prolog Expression

A query consists of a Query Prolog, followed by a Query Body.

The Query Prolog is a series of declarations and definitions that create the environment for query processing. The Query Prolog may contain namespace declarations, schema imports, an xmlspace declaration, a default collation, and function definitions. The Query Body contains the expressions that retrieve results and/or construct XML structures dynamically.

Namespace Declaration

Namespaces are one of the key features of XML. They identify elements uniquely, resolving the ambiguity that arises when elements from diverse sources share the same name. XQuery also supports namespaces, but the syntax of a namespace declaration is a little different from that of XML. The following query illustrates a namespace declaration:

declare namespace book = "www.abc.com"

<book:price>200</book:price>

The namespace declaration is in scope throughout the query in which it is declared, unless it is overridden by a namespace declaration attribute in an element constructor.

Schema Imports

A Schema Import imports the element and attribute declarations and type definitions from a schema. A notation is provided to allow a schema to be imported, and a namespace prefix to be bound to the target namespace of the schema, in a single step.

This notation is illustrated by the following example:

import schema namespace book = "www.abc.com"
at "http://www.abc.com/schemas/books.xsd"

document("bib.xml")/book:bib

Xmlspace Declaration

The xmlspace declaration in a Query Prolog controls whether whitespace is preserved by element and attribute constructors during execution of the query. If xmlspace = preserve is specified, whitespace is preserved. If xmlspace = strip is specified or if no xmlspace declaration is present, whitespace is stripped (deleted).

The following example illustrates an xmlspace declaration:

declare xmlspace=preserve

Default collation

A Query Prolog may declare a default collation, which is the name of the collation to be used by all functions and operators that require a collation if no other collation is specified. For example, the gt operator on strings is defined by a call to the xf:compare function, which takes an optional collation parameter. Since the gt operator does not specify a collation, the xf:compare function implements gt by using the default collation specified in the Query Prolog. The default collation is identified by a literal string containing a URI.

The following example illustrates a declaration of a default collation:

default collation = "http://example.org/languages/Icelandic"

If a Query Prolog specifies no default collation, the Unicode codepoint collation (http://www.w3.org/2002/08/query-operators/collation/codepoint) is used. If a Query Prolog specifies more than one default collation, or if the value specified for a default collation is not a string literal, a static error is raised.

Function Definitions

XQuery contains built in functions but it allows users to define functions of their own. A function definition specifies the name of the function, the names and datatypes of the parameters, and the datatype of the result. These user defined functions can contain business logic of an application.

For example:

define function maxPrice (element $book)
returns string
{
  string(max($book/price))
}

let $book := document("bib.xml")/bib/book
return
<price>
{
  maxPrice($book)
}
</price>

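string and max are built-in functions of XQuery. The engine used for this article returned the following result:

<price> 65.95 </price>

Note that 65.95 is the greatest price only when the values are compared as strings. bib.xml also contains a book priced at 129.95, so a processor that compares the prices as numbers would return 129.95 instead.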

In XQuery 1.0, user-defined functions may not be overloaded. Only one function definition may have a given name. Although it does not allow overloading of user-defined functions, some of the built-in functions in the XQuery core library are overloaded.
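When porting a function such as maxPrice, it is worth checking whether the maximum is taken over numbers or over strings, because the two orderings disagree for the prices in bib.xml. The Python sketch below shows both; max_price is a hypothetical counterpart of the function above, and the price list is copied by hand from bib.xml.

```python
from decimal import Decimal

# Price values as they appear in bib.xml.
PRICES = ["65.95", "65.95", "39.95", "129.95"]

def max_price(price_texts):
    """Return the largest price as a string, comparing the values as decimal numbers."""
    return str(max(Decimal(text) for text in price_texts))

numeric_max = max_price(PRICES)  # numeric comparison
lexical_max = max(PRICES)        # plain string comparison, character by character

print(numeric_max, lexical_max)
```

The string comparison picks 65.95 because the character '6' sorts after '1' and '3', whereas the numeric comparison picks 129.95.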

Conclusion

XQuery version 1.0 is a W3C Working Draft, most recently published on 16 Aug 2002. It is a query language for XML documents in the way that SQL is for database tables. It is an expression-driven language that shares the data model of XPath 2.0 but is more comprehensive than XPath. It offers a wide variety of built-in functions and operators for different tasks. XQuery supports data retrieval and the dynamic construction of XML structures, but it does not yet support updating or deleting data.

Resources

  • XML Query Requirements <http://www.w3.org/TR/xmlquery-req>. The planning document for the working group: a list of XQuery desiderata.
  • XML Query Use Cases <http://www.w3.org/TR/xmlquery-use-cases>. A number of real-world scenarios and XQuery snippets solving specific problems.
  • XQuery 1.0: An XML Query Language <http://www.w3.org/TR/xquery/>. The central document, detailing the surface language and an overview of everything else.
  • XQuery 1.0 and XPath 2.0 Data Model <http://www.w3.org/TR/query-datamodel/>. An extension of the XML infoset. It describes the data items that a query implementation needs to deal with.
  • XQuery 1.0 Formal Semantics <http://www.w3.org/TR/query-semantics/>. The underlying algebra formally defining query engine behavior.
  • XML Syntax for XQuery 1.0 (XQueryX) <http://www.w3.org/TR/xqueryx>. An alternative XML syntax for queries, preferred by machines.
  • XML Path Language (XPath) 2.0 <http://www.w3.org/TR/xpath20/>. The XPath 2.0 specification, broken out separately from XQuery.
  • XQuery 1.0 and XPath 2.0 Functions and Operators <http://www.w3.org/TR/xquery-operators>. New as of August 16, 2002.
  • X-Hive <http://www.x-hive.com/xquery>. Another nice-looking online demo.
  • XML Query Language Demo <http://131.107.228.20/>. Microsoft's online XQuery demo.
