Page 2 of 2

 

Table of Contents

• The Purpose of Schemas
• Understanding the XSD
• Datatypes and Simple Types
• Structures
• The Future of Schemas


XSD Schema Tricks and Tips

Understanding the XSD

W3C Schema Definition Language

As I write this, the W3C XML Schema Definition Language (XSD) has just entered Candidate Recommendation status. The XSD standard is in fact two separate documents, accompanied by a "tutorial" document. All three are listed below and covered in more depth later.
  • XML Schema Part 1: Structures - This document provides the grammar for the language, defining the notion of a schema in an abstract fashion and then discussing specific implementation details.
  • XML Schema Part 2: Datatypes - This document gives the datatype primitives that are used within schemas, and covers some of the constraining rules and subtypes, such as regular expressions.
  • XML Schema Part 0: Primer - This 'primer' is not for the faint of heart. As a measure of the complexity of schemas, even the primer can be difficult to follow.

XSD Structures

XML Schema Structures form the meat of the schema specification. Contained within its pages are the keys to describing the structure of documents, the distinction between Simple and Complex element types, Abstraction and Inheritance, and considerably more. The document itself is divided into the following:
  • Conceptual Framework - This describes the principles used to abstract the process of defining schemas. It is useful from a terms definition standpoint, but it doesn't explicitly define the XML implementation.
  • Schema Components - Not there yet. Schema components describe each of the elements and attributes that are used in creating formal schemas from a strictly formalistic standpoint.
  • XML Implementation - This part contains the actual implementation details, covering the XML representation of the formal grammar.
  • Constraints - This section deals with ways that a given element or attribute can be constrained to work within a subset of its original domain.
  • Access and Composition - This section covers how schemas work across a distributed system, including the thorny issues of namespace design and schema location and retrieval.
  • Validation - This section looks at validation mechanisms and how they should be addressed by the parser vendors.
  • Location is http://www.w3.org/TR/xmlschema-1/

XSD Datatypes

The Datatypes document is simpler, dealing as it does with the specific implementation of type rather than conceptual frameworks.
  • Datatype Concepts - This section looks at data types in an abstract fashion, setting up the distinctions between differing archetypes of data, and introducing the notions of value and lexical spaces.
  • Built-in Datatypes - This section lists the various core datatypes, including those for string, numeric, and date representation.
  • Datatype Components - This section deals primarily with constraints that limit the scope of given datatypes, including patterns (i.e., regular expressions), enumerations, precision, and minimum and maximum constraints.
  • Location is http://www.w3.org/TR/xmlschema-2/

XSD Primer

In order to understand the XSD specifications, read the Primer. It can be a little cryptic, but Document 0 is still probably the best place to see Schema code in action. Just a few notes.
  • Four Separate Examples - The primer looks at four different examples of schemas defined within the language, working its way up in complexity from specifying simple documents to creating abstract types and inheritance.
  • Location is http://www.w3.org/TR/xmlschema-0/

Datatypes and Simple Types

Built-In Primitive Datatypes

XSD contains a number of built-in datatypes. Some of these types are primitives (they are not made of any other types), while others are derived from more primitive types (or from derived types that are themselves built from primitives). The following list of primitive types is far from exhaustive, but it covers most of the more common ones.
NAME | DEFINITION | EXAMPLE | DERIVED
string | A sequence of Unicode characters. | "This is a sample string." | no
boolean | One of either true (1) or false (0). | true | no
float | A single precision 32-bit floating point type | -1E4, 2442, 342.34, 0, INF, NaN | no
double | A double precision 64-bit floating point type | -1E4, 2442, 342.34, 0, INF, NaN | no
decimal | A decimal number of arbitrary precision | 3.14159265358979323846264338327950288419716939937510... | no
timeDuration | A specific period of time, in the format PnYnMnDTnHnMnS. Only the relevant fields need be shown. | P1Y2M13DT4H represents one year, two months, thirteen days and four hours. | no
URI | A Uniform Resource Identifier | http://www.topxml.com/cagle | no
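
To make the timeDuration format concrete, here is a minimal Python sketch that splits a duration string into its fields. The function name parse_duration and the field names are illustrative assumptions, not part of XSD, and the sketch ignores negative durations and any other details the specification may add.

```python
import re

# Lexical form described in the table: P nY nM nD T nH nM nS,
# where any field that is not needed may be left out.
DURATION = re.compile(
    r"P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?"
)


def parse_duration(text):
    """Return the fields present in a duration string as a dict of numbers."""
    match = DURATION.fullmatch(text)
    # A trailing "T" with no time fields after it is treated as malformed.
    if match is None or text.endswith("T"):
        raise ValueError(f"not a duration: {text!r}")
    fields = {}
    for name, value in match.groupdict().items():
        if value is not None:
            fields[name] = float(value) if "." in value else int(value)
    if not fields:
        # "P" on its own carries no information.
        raise ValueError(f"empty duration: {text!r}")
    return fields


print(parse_duration("P1Y2M13DT4H"))
```

Note that the letter M means months before the T separator and minutes after it, which is why the T cannot be dropped when time fields are present.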

Built-in Derived Datatypes

Derived types were added by the W3C to cut down on developers rolling their own for many of the more common data formats such as integer or time.
NAME | DEFINITION | EXAMPLE | DERIVED
integer | A decimal value in which the scale (the number of digits after the decimal point) is 0. | ..., -2, -1, 0, 1, 2, ... | from decimal
nonPositiveInteger | All integers less than or equal to 0 | ..., -3, -2, -1, 0 | from integer
long | Value derived from integer within -9223372036854775808 to 9223372036854775807 | 2214433234, 12, -32551 | from integer
int | Value derived from long within -2147483648 to 2147483647 | 32768, 12, -32551 | from long
short | Value derived from int within -32768 to 32767 | 32765, 12, -32551 | from int
byte | Value derived from short within -128 to 127 | 78, 12, -114 | from short
unsignedInt | Value derived from unsignedLong within 0 to 4294967295 | 3248321, 52, 215534 | from unsignedLong
time | Time represents an instant of time that recurs every day, and is given in the format HH:MM:SS-ZZ:YY, where ZZ:YY represents the time zone offset relative to Greenwich Mean Time. | 21:15:00-08:00, which is 9:15 at night in Seattle (8 hours behind Greenwich Mean Time) | from recurringDuration
timeInstant | timeInstant combines date and time format, with the time separated from the date by a "T". | 2000-08-15T21:15:00-08:00 is August 15, 2000 at 9:15 PM in Seattle. | from recurringDuration
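
The chain of integer types narrows the permitted range at each step. The following Python sketch shows one way to check a value against those ranges; the RANGES table and the fits function are illustrative names, and the bounds are computed on the assumption that the types correspond to two's-complement 64-, 32-, 16- and 8-bit integers (and an unsigned 32-bit integer).

```python
# Bounds computed from bit widths (an assumption, see the note above).
RANGES = {
    "long": (-2**63, 2**63 - 1),
    "int": (-2**31, 2**31 - 1),
    "short": (-2**15, 2**15 - 1),
    "byte": (-2**7, 2**7 - 1),
    "unsignedInt": (0, 2**32 - 1),
}


def fits(type_name, lexical):
    """Return True if the lexical integer lies within the named type's range."""
    value = int(lexical)
    low, high = RANGES[type_name]
    return low <= value <= high


for type_name, (low, high) in RANGES.items():
    print(f"{type_name}: {low} to {high}")
```

Because each type is derived from the one before it, any value that fits in byte also fits in short, int and long.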

DTD Specific Datatypes

In addition to string, numeric, and date types, the schema reference also covers more traditional XML types that pertain to atomic units - name tokens, entities, and so forth. These are largely included for backward compatibility with DTDs.
NAME | DEFINITION | EXAMPLE | DERIVED
ID | A unique name token that identifies a given element. | id="a1924" | no
IDREF | A reference to an existing ID for a given element. | idref="a1924" | no
NMTOKEN | A token made up of name characters (letters, digits, and the characters ".", "-", "_" and ":"), used within attributes. | the value 'red' in the attribute color="red" | no
NMTOKENS | A list of NMTOKEN items, typically as options within a given attribute, separated by white space. | the value 'red blue green' in the attribute colors="red blue green" | from NMTOKEN
ENTITY | A reference to a specific entity object defined within a DTD. | &myDocument; | no

Creating A Simple Type By Constraint

Simple types form the foundation of datatypes, and are created in one of two ways: either they are constrained from existing datatypes, or they are aggregated from simpler datatypes. As examples of the former, consider a phoneNumber type (essentially a string that follows a very specific order, including some optional characters), a purchase order quantity (limited to 1000 units), and an enumeration of specific holidays.
<simpleType name="phoneNumber" base="string">
    <pattern value="\(?\d{3}\)?-?\d{3}-?\d{4}"/>
</simpleType>
<simpleType name="poQuantity" base="int">
    <minInclusive value="0"/>
    <maxInclusive value="1000"/>
</simpleType>
<simpleType name="holiday" base="recurringDate">
    <annotation>
        <documentation>Holiday Values</documentation>
    </annotation>
    <enumeration value="--01-01"/>
    <enumeration value="--07-04"/>
    <enumeration value="--10-31"/>
    <enumeration value="--12-25"/>
</simpleType>
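
As a rough illustration of what these three constraints accept and reject, here is a Python sketch that mimics them outside of any schema processor. The function names are hypothetical, and the phone pattern is written with its parentheses escaped, which is an assumption about the intended expression. Schema patterns apply to the whole value, so the sketch uses fullmatch rather than search.

```python
import re

# Assumed intent of the phoneNumber pattern: optional parentheses around the
# area code and optional hyphens between the groups of digits.
PHONE = re.compile(r"\(?\d{3}\)?-?\d{3}-?\d{4}")

# The four enumerated holiday values, in --MM-DD form.
HOLIDAYS = {"--01-01", "--07-04", "--10-31", "--12-25"}


def is_phone_number(text):
    """Pattern constraint: the whole string must match."""
    return PHONE.fullmatch(text) is not None


def is_po_quantity(text):
    """Range constraint: an integer from 0 to 1000 inclusive."""
    try:
        value = int(text)
    except ValueError:
        return False
    return 0 <= value <= 1000


def is_holiday(text):
    """Enumeration constraint: the value must be one of the listed items."""
    return text in HOLIDAYS


print(is_phone_number("(206)555-1212"), is_po_quantity("1001"), is_holiday("--12-25"))
```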

Creating A Simple Type By List

You can also create a simple type by aggregating a list of other simple types. The following, for instance, creates list types from the phoneNumber and poQuantity types.
<simpleType name="phoneNumbers" base="phoneNumber" derivedBy="list"/>
<simpleType name="poQuantities" base="poQuantity" derivedBy="list">
    <minLength value="0"/>
    <maxLength value="3">
        <annotation>
            <documentation>No more than three poQuantities can be given</documentation>
        </annotation>
    </maxLength>
</simpleType>     
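
A list type is written in an instance as a series of values separated by white space. The Python sketch below shows how a poQuantities value might be split and checked; parse_po_quantities is a hypothetical helper, and the 0 to 1000 range and the limit of three items are taken from the definitions above.

```python
def parse_po_quantities(text, max_items=3):
    """Split a white-space separated list and check each item and the length."""
    items = text.split()
    if len(items) > max_items:
        raise ValueError(f"no more than {max_items} poQuantities can be given")
    values = [int(item) for item in items]
    for value in values:
        # Each item must itself be a valid poQuantity.
        if not 0 <= value <= 1000:
            raise ValueError(f"poQuantity out of range: {value}")
    return values


print(parse_po_quantities("10 250 1000"))
```

An empty value is accepted here because the list declares a minLength of 0.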

Structures

A Basic Schema

A schema consists of a collection of <element> tags, along with a collection of <simpleType> and <complexType> elements that aggregate other elements into cohesive blocks. In the simplest cases, the arrangement of elements is straightforward, mapping easily to a known instance document. For example, the schema for the slideshow that you are currently viewing is quite simple (at least at first blush) and is illustrated below.
<schema xmlns="http://www.w3.org/1999/XMLSchema">
    <element name="slides" type="slideList"/>

    <complexType name="slideList">
        <element name="head" type="slideListHead"/>
        <element name="group" type="slideGroup" minOccurs="0" maxOccurs="unbounded"/>
    </complexType>

    <complexType name="slideListHead">
        <element name="title" type="string"/>
        <element name="author" type="string"/>
        <element name="date" type="date"/>
    </complexType>


A Basic Schema (Continued)

The rest of the schema defines the group and slide information.
    <complexType name="slideGroup">
        <element name="title" type="string"/>
        <element name="slide" type="Slide" minOccurs="0" maxOccurs="unbounded"/>
    </complexType>
    
    <complexType name="Slide">
        <element name="title" type="string"/>
        <element name="para" type="string" minOccurs="0"/>
        <element name="point" minOccurs="0">
            <complexType content="mixed">
                <element name="key" type="string" minOccurs="0" maxOccurs="1"/>
            </complexType>
        </element>
        <element name="code" type="string" minOccurs="0"/>
    </complexType>
</schema>
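
To see how the schema maps to a document, it can help to look at an instance. The Python sketch below holds a small, hypothetical slides document that follows the structure declared above (the author name, date and slide text are invented for illustration) and walks it with the standard ElementTree parser. It does not validate against the schema; it only shows the shape the schema describes.

```python
import xml.etree.ElementTree as ET

# A hypothetical instance: one head, then any number of groups of slides.
SAMPLE = """\
<slides>
  <head>
    <title>XSD Schema Tricks and Tips</title>
    <author>A. Author</author>
    <date>2000-08-15</date>
  </head>
  <group>
    <title>Structures</title>
    <slide>
      <title>A Basic Schema</title>
      <para>A schema consists of element and type declarations.</para>
      <point>Types can be <key>simple</key> or complex.</point>
      <code>&lt;element name="slides" type="slideList"/&gt;</code>
    </slide>
  </group>
</slides>
"""


def outline(xml_text):
    """Return (group title, [slide titles]) pairs for a slides document."""
    root = ET.fromstring(xml_text)
    result = []
    for group in root.findall("group"):
        slide_titles = [slide.findtext("title") for slide in group.findall("slide")]
        result.append((group.findtext("title"), slide_titles))
    return result


print(outline(SAMPLE))
```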

Complex Types

Simple Types define atomic characteristics, and constrain an existing type. Complex Types, on the other hand, aggregate collections of elements together into a single cohesive unit. For example, a Slide element pulls a number of string-based elements into the contents of a primary Slide object, which can in turn be referenced through a type attribute by some other element definition in the schema.
    <complexType name="Slide">
        <attribute name="id" type="ID"/>
        <element name="title" type="string"/>
        <element name="para" type="string" minOccurs="0"/>
        <element name="point" minOccurs="0">
            <complexType content="mixed">
                <element name="key" type="string" minOccurs="0" maxOccurs="1"/>
            </complexType>
        </element>
        <element name="code" type="string" minOccurs="0"/>
    </complexType>

Mixed Content

Mixed content (elements containing both child elements and text) can be especially difficult to specify. The content="mixed" attribute indicates that the enclosed elements will likely occur in the company of one or more unspecified text nodes.
<complexType content="mixed">
    <element name="key" type="string" minOccurs="0" maxOccurs="1"/>
</complexType>
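
The following Python sketch shows what mixed content looks like to a program reading an instance. The sample point element and the mixed_parts helper are illustrative assumptions; the sketch uses the standard ElementTree parser, which stores text before the first child in text and text after each child in that child's tail.

```python
import xml.etree.ElementTree as ET


def mixed_parts(xml_text):
    """List the text runs and child elements of an element in document order."""
    element = ET.fromstring(xml_text)
    parts = []
    if element.text:
        parts.append(("text", element.text))
    for child in element:
        parts.append((child.tag, child.text or ""))
        if child.tail:
            # Text that follows a child element belongs to the parent's content.
            parts.append(("text", child.tail))
    return parts


print(mixed_parts("<point>Use <key>schemas</key> to validate data.</point>"))
```

The schema only declares the key element; the surrounding text runs are the "unspecified text nodes" that content="mixed" permits.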

Attributes

You can define an attribute within an XSD schema using the <attribute> tag, analogous to the way that elements are defined. For example, to create an id attribute associated with the Slide element, you would include the tag:
    <complexType name="Slide">
        <attribute name="id" type="ID"/>
        <!-- element declarations as in the Slide type shown earlier -->
    </complexType>
Note that an attribute's type must be a simple type; an attribute cannot be declared with a complex type.

Fixed Types and Default Values

You can assign a specific value to a given element or attribute. The XSD use attribute determines the role that the value plays:
  • fixed, in which the element or attribute will always have that value (and as such can typically be left out of the instance):
        <attribute name="output" use="fixed" value="text/xml" type="string"/>
  • default, where the value of the attribute or element is automatically set to the default value if it is not explicitly specified:
        <attribute name="country" use="default" value="United States" type="string"/>
  • optional, in which the attribute or element does not need to be explicitly specified within a calling element:
        <attribute name="street2" use="optional" type="string"/>
  • required, in which the element or attribute must be included within the parent element:
        <attribute name="name" use="required" type="string"/>
  • prohibited, in which the element or attribute cannot be included in the parent element:
        <attribute name="data" use="prohibited" type="string"/>
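
One way to picture the five use values is as rules that a processor applies to the attributes it finds on an element. The Python sketch below is an illustration only: the DECLARATIONS table mirrors the five example attributes above, apply_declarations is a hypothetical function, and the behaviour of filling in fixed and default values when the attribute is absent follows the descriptions in this section rather than any particular parser.

```python
# The five example attribute declarations, reduced to their use and value.
DECLARATIONS = {
    "output": {"use": "fixed", "value": "text/xml"},
    "country": {"use": "default", "value": "United States"},
    "street2": {"use": "optional"},
    "name": {"use": "required"},
    "data": {"use": "prohibited"},
}


def apply_declarations(attributes, declarations=DECLARATIONS):
    """Check the attributes against the declarations and fill in values."""
    result = dict(attributes)
    for attr, decl in declarations.items():
        use = decl["use"]
        present = attr in result
        if use == "required" and not present:
            raise ValueError(f"missing required attribute: {attr}")
        if use == "prohibited" and present:
            raise ValueError(f"prohibited attribute present: {attr}")
        if use == "fixed":
            if present and result[attr] != decl["value"]:
                raise ValueError(f"{attr} must be {decl['value']!r}")
            result[attr] = decl["value"]
        if use == "default" and not present:
            result[attr] = decl["value"]
    return result


print(apply_declarations({"name": "Example Co."}))
```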
    

Boundaries

You can also specify the minimum and maximum number of occurrences of a given element (attributes are not counted this way because, by definition, an XML element cannot have more than one attribute of the same name).
  • maxOccurs gives the maximum number of times a given element can appear, and ranges from 0 (only for use="prohibited") to any integer value, or the text value "unbounded" (no limit on the number of elements).
  • minOccurs gives the minimum number of times a given element can appear, and can take on the values 0, 1, or any larger integer that is less than or equal to the maxOccurs of the same declaration.
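
The two bounds can be read as a simple counting rule. In the Python sketch below, occurs_ok is a hypothetical helper, and the default of 1 for both bounds is an assumption about what applies when neither attribute is given.

```python
def occurs_ok(count, min_occurs=1, max_occurs=1):
    """Return True if an element appearing `count` times satisfies the bounds."""
    if count < min_occurs:
        return False
    # "unbounded" places no upper limit on the number of occurrences.
    return max_occurs == "unbounded" or count <= int(max_occurs)


# The "group" element in the slides schema: minOccurs="0" maxOccurs="unbounded".
print(occurs_ok(0, 0, "unbounded"), occurs_ok(25, 0, "unbounded"))
```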

Anonymous Types

In certain cases, a type is only specifically needed once within a complex data type definition, and as such there is no explicit need to create a formally named element type. In such cases the type element is said to be an anonymous type. Anonymous types are otherwise identical to their non-anonymous counterparts:
<xsd:complexType name="Items">
  <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
    <xsd:complexType>
      <xsd:element name="productName" type="xsd:string"/>
      <xsd:element name="quantity">
        <xsd:simpleType base="xsd:positiveInteger">
          <xsd:maxExclusive value="100"/>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="price" type="xsd:decimal"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
      <xsd:attribute name="partNum" type="Sku"/>
    </xsd:complexType>
  </xsd:element>
</xsd:complexType>

Annotations

Schemas describe an object structure, but that description goes beyond simply providing encapsulation and data type information. Annotations let you provide additional information about the element or attribute in question:
  • Application Information (appInfo) can be used to pass specific information about the element. For example, information about how a schema-to-instance generator might render a given element would be contained here.
  • Documentation (documentation) provides a way of describing the element in human-readable terms, and could in turn contain specific information in different languages.
  • Labels. Either <documentation> or <appInfo> can be used to provide labels for tables and other interface elements. This can move the onus of generating output labels from XSLT or DOM into the schema itself.

Grouping and Ordering

XSD differs from earlier schema models in that it assumes, by default, that elements appear in the sequence specified. There are, however, three models for containing data:
  • sequence. This, the default model, requires that all elements appear in the order given.
      <group name="shipAndBill">
        <sequence>
          <element name="shipTo" type="Address" />
          <element name="billTo" type="Address" />
        </sequence>
      </group>
  • choice. The choice model presents the elements as mutually exclusive options - only one element among the choices given can be a child of the indicated parent element.
      <group name="addressInfo">
        <choice>
          <element name="address" type="AddressUS" />
          <element name="address" type="AddressUK" />
          <element name="address" type="AddressDE" />
        </choice>
      </group>
    

Grouping and Ordering II

  • all. The all model waives the ordering requirement of the sequence model and lets elements appear in any order. The one caveat is that minOccurs must be either 0 or 1 and maxOccurs must be 1 (there should be no more than one element of any given type in an all set). For any given group, the all element must contain all elements in that group.
      <complexType name="PurchaseOrderType">
        <all>
          <element name="shipTo" type="Address"/>
          <element name="billTo" type="Address"/>
          <element ref="comment" minOccurs="0"/>
          <element name="items"  type="Items" />
        </all>
        <attribute name="orderDate" type="date" />
      </complexType>
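
The difference between the sequence, choice and all models can be summarised as three small checks on the names of an element's children. The Python sketch below is a simplification under stated assumptions: the function names are hypothetical, every sequence member is taken to occur exactly once, and nested groups are not handled.

```python
def matches_sequence(children, names):
    """sequence: the children must appear exactly in the declared order."""
    return list(children) == list(names)


def matches_choice(children, options):
    """choice: exactly one child, and it must be one of the options."""
    return len(children) == 1 and children[0] in options


def matches_all(children, required, optional=()):
    """all: any order, no repeats, every required name present."""
    allowed = set(required) | set(optional)
    no_repeats = len(children) == len(set(children))
    return no_repeats and set(required) <= set(children) <= allowed


# PurchaseOrderType uses the all model, so this ordering is acceptable.
order = ["items", "billTo", "shipTo"]
print(matches_all(order, ["shipTo", "billTo", "items"], ["comment"]))
print(matches_sequence(["billTo", "shipTo"], ["shipTo", "billTo"]))
```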
    

Attribute Groups

XSD lets you define a group of related attributes as a single entity that can be referenced by an element. For example, suppose that a catalog of books of different genres still shared a number of common attributes: cover image (href), list price (price), and ISBN number (isbn). You could define an attribute group that captures this information, as follows:
<attributeGroup name="bookInfo">
    <attribute name="href" type="URI"/>
    <attribute name="price" type="float" pattern="\d*(\.\d{2})+"/>
    <attribute name="isbn" type="string" pattern="\d{10}"/>
</attributeGroup>

<complexType name="book">
    <attributeGroup ref="bookInfo"/>
</complexType>

Advanced Features

This is sufficient to create basic XSD schemas, though the specification itself is considerably more complex. The XSD specification also permits the following:
  • Include. You can break a schema into a set of smaller component schemas.
  • Class Inheritance. It is possible to create a type that derives from other types in a very straightforward manner. This makes inheritance possible.
  • Equivalence Classes make it possible to create polymorphic classes, where the same general interface can have separate implementations based upon internal data.
  • Abstract Elements provide a jumping-off point for the implementation of inheritance in XML terms.
  • Preventing Derivations - this guarantees that a given type cannot be used as the base for further derived types.

The Future of Schemas

Object Oriented XML

One of the most immediate impacts that schemas will have upon XML-based programming is that they mix the declarative structures characteristic of XML with an object orientation that makes it possible to deal with identity, and conceivably uniqueness, in the programming sphere. Specifically:
  • Distributed Objects. With the combination of XML listings of data code and either embedded or referenced schema objects, it becomes easier to maintain the integrity of a given XML "object" across a broad network.
  • Primitives Repositories. If you look upon XML as a means of creating a model that describes any object in the virtual sphere, then the schema enables the concept of primitives libraries that could reside anywhere on the net.
  • Instance Generation. An XSD schema, with its associated annotation resources, could very easily be used to generate instances of objects through XSLT, without explicit need for customized constructor code. The same approach could generate forms and similar resources that depend upon the data types of the schema to determine the valid content of each field.

Will There Ever Be Universal Schemas?

One of the greatest conundrums with schemas is that it is so easy to create a schema, and so hard to get anyone else to use it. The babel of schemas that currently exists, while bewildering, speaks to some of the strengths of schemas in general.
  • Schemas as Contracts. Although a schema is a technical specification, its characteristics make it likely that schemas will become part of future legal contracts, defining the data characteristics of the common data interchange.
  • Schemas and XSLT. A comprehensive schema definition language can be used to better clarify XSLT transformations, moving XSLT from a text manipulation language to a much more complex data manipulation language.
  • Schema Variants. The XSD specification contains a number of provisions for building conditional schemas that change in response to the internal data within the XML document the schema represents. By providing a rigorous mechanism for doing so, XSD makes it possible to have multiple "similar" transformations that bring some flexibility to the B2B sphere.
  • Universal IDL. Interface Description Languages, or IDLs, are ways of describing programmatic interfaces. It is likely that an object-oriented XML schema would be a major component of a universal IDL, perhaps one mediated by SOAP.