What Is XML and Why Should I Care?

By Tony Stewart
Senior Consultant, Flash Creative Management, Inc.

Introduction

A lot has been written about XML, the Extensible Markup Language, but much of it contains misinformation and hype, and relatively few developers actually understand the core concepts of this family of standards.

In this presentation I walk through the key components, explaining each one and showing how they are relevant to us as developers and information application designers. By the end of the session, participants will have a basic understanding of the key standards - XML, XSL, XML-Schema, DOM, SAX, Namespaces, XLink - and their related technologies, and will know how these standards are used in the different categories of XML applications that are being developed today.

Defining Our Terms

To begin, let's take a look at the meaning of the words that comprise XML: "Extensible", "Markup" and "Language", though not in that order.

Markup is a term for metadata, that is, information about information. It originated long before computers, in the field of publishing, where publishing markup referred to the tags inserted into an edited text to tell a processor (human or machine) what to do with the information.

In this sense HTML is a classic markup language. In the phrase "This word is <b>bold</b>", for example, the HTML <b> tags tell the browser how to display the information between them.

Language refers to agreement within a community on the meaning and use of a set of words. For example: English or Spanish.

Putting these two definitions together, we find that a markup language refers to agreement within a community on the meaning and use of a set of tags. Again, HTML is a perfect example of this. The community of people writing and using HTML know pretty much what to expect when they insert an <em> or <H1> element into their text.

An extensible language is a language that includes a mechanism to add words in such a way that others understand them. We don't really have a counterpart to this in the real world: adding words to English is an evolutionary process, not something any one of us can do on our own.

Following the logic of these definitions, an extensible markup language should be a markup language that provides a mechanism for adding more words to it. Unfortunately, that's not what the phrase means, and this has confused a lot of people.

In practice, "Extensible Markup Language" actually means: A system for defining entire markup languages, including the ability to extend existing ones.

SGML, HTML and XML

SGML (Standard Generalized Markup Language) was the original extensible markup language. It was standardized in 1986 (as ISO 8879).

HTML (HyperText Markup Language) is a language that was defined using SGML. HTML is a "markup language" as you would expect: a set of tags to control formatting and behavior in on-line documents.

So here we have two acronyms, both containing "Markup Language" in their names, which actually are different things. One is a language; the other is a system for defining languages. What does this imply about XML?

XML is not a markup language like HTML. Instead, it is a subset of SGML: a mechanism for defining markup languages that provides 80% of SGML's power for approximately 20% of the pain. So, like SGML it is an "extensible markup language", but unlike SGML it is optimized for use over the Web. This has turned out to be an incredibly powerful combination.

The Core Concept

To repeat the key point, XML is not a markup language. Instead, it's a mechanism for creating your own markup languages, and a set of standards to ensure their interoperability, stability and longevity.

But if XML is not a markup language, neither is it a programming language. We use XML to represent data, but having done so, we still have to write applications to process that data. As Jon Bosak of Sun Microsystems, one of the inventors of XML, said early on: "XML gives Java something to do."

The primary uses of XML are:

  • Exchanging information between heterogeneous applications, enterprises, databases, etc.
  • Enabling styling and presentation of the same information on multiple output devices and/or for different purposes and audiences
  • As a storage format for long-lived or structurally rigorous document-centric information, such as aircraft manuals or enterprise information models.

The XML Family of Standards

While the name "XML" applies to a specific international standard, it is also commonly used to refer to the entire family of XML-related standards.

Most of us don't deal directly with the XML standards. They are primarily used by developers who build software tools that generate or process XML documents, in order to ensure that the tools work properly and will be interoperable. But it's important to know which standards are available to us as resources, and where to turn when we have questions.

One way to organize the standards is according to the goals they help accomplish. Here are the primary XML standards that have already been finished (or nearly so) as of this writing:

Goal: Define an XML language
Primary applicable standards:

  • XML
  • Namespaces
  • XML-Schema

Goal: Format and display XML documents
Primary applicable standards:

  • CSS (Cascading Style Sheets)
  • XSL (Extensible Stylesheet Language)
  • XSLT (XSL Transformations)

Goal: Develop processing applications
Primary applicable standards:

  • DOM (Document Object Model)
  • SAX (Simple API for XML)
  • XSLT

Goal: Exchange information between systems
Primary applicable standards: purpose-specific standardized XML languages such as:

  • SOAP (Simple Object Access Protocol)
  • SVG (Scalable Vector Graphics)
  • WML (Wireless Markup Language)
  • XCBL (XML Common Business Library)

In the rest of this presentation I'll take a closer look at how these standards fit together.

Defining XML Languages

Document Types and Instances

(NOTE: In this section I use the term "document" in the XML sense: a series of characters conforming to the XML rules. This can be confusing, since later we will talk about "documents" of the usual sort: information formatted for presentation to human beings.)

XML lets you define the markup to be used in a set of documents that share similar characteristics: for example, a set of software manuals or a set of e-commerce messages. These are called document types. (An XML document type is roughly analogous to a class in object-oriented design.)

A document instance is a particular document: a specific software manual or a specific invoice. (This is roughly analogous to an instantiated object in the OO world.)

In a typical XML project we first design document types for the information we want to work with, then we write software to create and process instances of those documents.

Markup Building Blocks

XML provides a powerful set of low-level building blocks that we can use when designing our document types. Let's take a look at a sample instance, in this case a fragment of data from a personnel database.

<personnel-data>
    <person ID="PE1">
        <name>
            <first-name>Tony</first-name>
            <last-name>Stewart</last-name>
        </name>
        <working-location office-id="OF1"/>
        <title>Senior Consultant</title>
    </person>
    <office ID="OF1">
        <name>Head Office</name>
        <address>433 Hackensack Avenue</address>
    </office>
</personnel-data>

This fragment illustrates several of the key building blocks of XML:

  • The document consists of elements (the <tags> in brackets), which are roughly analogous to objects in an OO system, or to fields in a relational database. An element begins with the opening bracket of its start tag, ends with the closing bracket of its end tag, and includes everything in between.
  • An element can have content, which is the text between the opening and closing tags. For example, "Tony" and "Stewart" are both element contents.
  • Some elements contain attributes, which are additional information stored inside the opening tag of the element in the form of name="value" pairs. ID and office-id in this example are both attributes, and their contents ("PE1" and "OF1") are referred to as attribute values.
  • An element can be empty. In this example, <working-location/> is an empty element. Usually, empty elements serve as placeholders for attributes.
  • Elements can contain other elements. This is referred to as nesting or containership. Containership can be used to represent serialized collections of objects or rows of data, or for any other appropriate information.
  • Attributes, however, cannot contain other attributes or elements.
  • Finally, element content or attribute values can serve as pointers to other items in the document. XML offers a number of ways to do this, one of which is shown in this example: the office-id attribute "OF1" inside <working-location/> matches (points to) the ID attribute of the <office> element below it. This pair of matching values indicates that person "PE1"'s office location is office "OF1", the Head Office.
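
To make these building blocks concrete, here is a minimal sketch of how a program might read the personnel fragment. It assumes Python and its standard-library xml.etree.ElementTree module, neither of which this presentation prescribes; the helper name describe_person is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The personnel fragment from the text, inlined so the sketch is self-contained.
PERSONNEL_XML = """
<personnel-data>
  <person ID="PE1">
    <name>
      <first-name>Tony</first-name>
      <last-name>Stewart</last-name>
    </name>
    <working-location office-id="OF1"/>
    <title>Senior Consultant</title>
  </person>
  <office ID="OF1">
    <name>Head Office</name>
    <address>433 Hackensack Avenue</address>
  </office>
</personnel-data>
"""

root = ET.fromstring(PERSONNEL_XML)

# Index every <office> element by its ID attribute so pointers can be resolved.
offices_by_id = {office.get("ID"): office for office in root.findall("office")}

def describe_person(person):
    """Return (full name, title, office name) for one <person> element.

    Hypothetical helper, written only for this illustration.
    """
    first = person.findtext("name/first-name")   # element content, via nesting
    last = person.findtext("name/last-name")
    title = person.findtext("title")
    # The empty <working-location/> element carries only an attribute.
    office_id = person.find("working-location").get("office-id")
    office = offices_by_id[office_id]            # follow the pointer
    return f"{first} {last}", title, office.findtext("name")

summary = [describe_person(p) for p in root.findall("person")]
print(summary)
```

The sketch touches each building block in turn: nested elements, element content, attributes on an empty element, and a pointer resolved by matching attribute values.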

Select the Appropriate Modeling Style

These building blocks are simple, yet flexible and powerful enough to support many different information-modeling styles: object, networked, hierarchical, relational, etc. You can select the style that best suits the type of information you want to represent as XML.

For example, here is a fragment of an XML document containing similar personnel-related information. This time the data has been modeled using a different style, one that relies much more heavily on pointers than containership:

<document>
    <people>
        <person ID="PE1"/>
        <person ID="TS23"/>
        <person ID="SS9"/>
    </people>
    <offices>
        <office ID="OF1"/>
        <office ID="OF2"/>
    </offices>
    <notes>You can mix <glref term="structured">structured</glref> and <glref term="unstructured">unstructured</glref> text in the same document.
    </notes>
</document>

The point here is that there is no "right" way to design an XML document: it all depends on the information you are dealing with and the processing you want to apply to it.

Well-Formed and Valid

XML allows you to work formally or informally. For small projects or when prototyping, you can quickly develop well-formed documents. On larger projects or projects involving multiple systems, you will usually go further and create valid documents.

Well-formed XML conforms to a set of built-in structural rules, including:

  • One unique "root" element
  • Every non-empty element has matching start and end tags
  • All elements neatly nested, with no overlaps
  • Various character and name restrictions

Valid XML is well-formed and:

  • References or includes a schema or DTD (Document Type Definition)
  • Conforms to the rules in that schema
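
The well-formedness rules are enforced by every XML parser, so checking them takes very little code. The sketch below assumes Python's standard-library xml.etree.ElementTree; the function name is_well_formed and the sample strings are illustrative. Note that this parser is non-validating: it checks well-formedness only, and checking validity would require a separate validating processor, which is not shown here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if `text` parses as well-formed XML, else False."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample per rule: the first obeys all of them, the rest break one each.
samples = {
    "ok": "<a><b>text</b><c/></a>",
    "two roots": "<a/><b/>",
    "missing end tag": "<a><b>text</a>",
    "overlapping": "<a><b><c></b></c></a>",
}
results = {label: is_well_formed(text) for label, text in samples.items()}
print(results)
```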

Schemas Provide Validity

The word "schema" refers to the rules applying to a set of similarly structured documents. This is not an XML-specific concept: we use the same word in many other disciplines.

In the case of XML, these rules include:

  • What elements and attributes may occur?
  • In what sequences and nestings?
  • What kind of data can they contain (for example, datatypes, ranges, character masks, etc.)?

XML provides two schema languages: DTD and XML-Schema.

DTD (or Document Type Definition) is the schema mechanism invented originally for SGML and inherited by XML. DTDs are relatively document-centric, so they don't include a lot of useful features such as datatyping, ranges and picture masks. Also, they are written in a syntax all their own, and there are relatively few tools that can process them.

XML-Schema is a new schema standard that has been designed specifically for XML. It uses XML syntax, it addresses most of the shortcomings of the DTD format, and the major tools vendors are already shipping technology to support it. As a result, people just arriving in the XML world are advised to ignore the DTD syntax if possible and adopt the XML-Schema standard for their work.

When Is Validity Necessary?

Well-formed documents are quick to prototype and easy to use. I've seen entire applications built around well-formed XML documents. So when is it worth the trouble to go further and set up a formal schema?

The short answer is that schemas empower data-driven processing. They provide the missing information that, when processed by a schema processor, can be converted into automated:

  • Content validation
  • Default values for elements
  • Assistance when editing an XML document
  • Translation from one XML format to another

Of course you can write code that will do all these things, but your code will usually be specific to a particular document type. The information in a schema allows you to purchase or write a generic schema processor (or a tool containing a schema processor) that will run against many different document types with no further programming required from you.
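
To illustrate what "specific to a particular document type" means, here is a sketch of hand-written validation for the personnel example, assuming Python's standard-library xml.etree.ElementTree. The rules it enforces (every person needs a name and a working location that points to a known office) are my own assumptions, not a published schema, and check_personnel is a made-up name.

```python
import xml.etree.ElementTree as ET

def check_personnel(root):
    """Hand-coded rules for the <personnel-data> document type only.

    The rules are illustrative assumptions; a schema would state them as data
    that a generic schema processor could apply to any document type.
    """
    problems = []
    office_ids = {office.get("ID") for office in root.findall("office")}
    for person in root.findall("person"):
        pid = person.get("ID", "(no ID)")
        if person.find("name") is None:
            problems.append(f"{pid}: missing <name>")
        loc = person.find("working-location")
        if loc is None:
            problems.append(f"{pid}: missing <working-location>")
        elif loc.get("office-id") not in office_ids:
            problems.append(f"{pid}: unknown office {loc.get('office-id')}")
    return problems

good = ET.fromstring(
    '<personnel-data><person ID="PE1"><name/>'
    '<working-location office-id="OF1"/></person>'
    '<office ID="OF1"/></personnel-data>'
)
bad = ET.fromstring(
    '<personnel-data><person ID="PE2">'
    '<working-location office-id="OF9"/></person></personnel-data>'
)
print(check_personnel(good))
print(check_personnel(bad))
```

None of this code can be reused for a different document type, which is exactly the cost that a schema and a generic schema processor are meant to remove.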

And, of course, schemas document your information design in a standardized format that is easily shared.

Namespaces Resolve Conflicts

As you work with XML documents and applications, you frequently find yourself combining portions of what were originally separate documents into one document. As soon as you do this you start running into name conflicts: elements with the same tag names that actually have different meanings.

For example, you might want to merge a document in which <title> refers to a person's job title with another document in which <title> refers to movie titles. This doesn't matter when they're in separate documents, and it doesn't even matter if they're in the same document but you don't intend to process these tags automatically. The problem occurs when you want to write code that will automatically do some kind of processing on all the <title> elements, and it needs to know which ones are which. (In fact this is a more common requirement than this trivial example suggests.)

The solution is to use XML Namespaces. A namespace is a mechanism by which you can declare within a document that a set of elements are conceptually related, usually by identifying the "space" from which they originally came. You can declare a short namespace identifier for each namespace, and then you use that identifier as a prefix on the element names in your document.

For example, once you have declared a namespace such as xmlns:flash="www.flashcreative.com/widgets", then each time you reference <flash:widget> in the document, you (and your processing software) will know that it comes from the flashcreative/widget namespace, and therefore is not the same as <ibm:widget> or even plain-vanilla <widget>, should either of these appear in the document.
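
Here is a sketch of how processing software can tell the three kinds of widget apart, assuming Python's standard-library xml.etree.ElementTree. The flash namespace name is the one used above; the ibm namespace name and the <catalog> wrapper element are hypothetical, added only so the example is a complete document.

```python
import xml.etree.ElementTree as ET

FLASH_NS = "www.flashcreative.com/widgets"   # namespace name from the text
IBM_NS = "www.ibm.com/widgets"               # hypothetical, for illustration only

# <catalog> is a made-up wrapper element so the fragment has a single root.
DOC = f"""
<catalog xmlns:flash="{FLASH_NS}" xmlns:ibm="{IBM_NS}">
  <flash:widget>Flash widget</flash:widget>
  <ibm:widget>IBM widget</ibm:widget>
  <widget>Plain widget</widget>
</catalog>
"""

root = ET.fromstring(DOC)
prefixes = {"flash": FLASH_NS, "ibm": IBM_NS}

# Each query matches only the elements in the requested namespace.
flash_widgets = [w.text for w in root.findall("flash:widget", prefixes)]
ibm_widgets = [w.text for w in root.findall("ibm:widget", prefixes)]
plain_widgets = [w.text for w in root.findall("widget")]
print(flash_widgets, ibm_widgets, plain_widgets)
```

The prefix is only shorthand. What the software actually compares is the namespace name that the prefix was declared to stand for.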

Summary: Defining XML languages

The three standards that you rely on when defining XML document types are XML itself, Namespaces and XML-Schema (or the DTD language in some cases).

The normal procedure is:

  • Model your information
  • Prototype using well-formed XML
  • Optionally define a formal schema
  • Use namespaces to combine information from different sources

Publishing XML

NOTE: In this section of the presentation I use the term "publish" to refer to the process of formatting XML for display to human beings, and "document" to refer to the output of the publishing process.

The Problem with HTML

I find it easiest to explore the concepts involved in XML publishing by looking at the problem it was originally intended to solve: implementing a better alternative to HTML. To illustrate the issues, here is some representative HTML, in this case the title page of the book Captain Corelli's Mandolin:

<H1 ALIGN="CENTER">Captain Corelli&#146;s Mandolin</H1>
<P ALIGN="CENTER"><I>Louis de Berni&egrave;res</I></P>
<P ALIGN="RIGHT"><FONT SIZE="-1">&copy; 1994 by <A HREF=mailto:ldb@lb.com>Louis de Berni&egrave;res</A> </FONT></P>

<HR><P ALIGN="CENTER">To my mother and father</P><HR>

<H2>Dr Iannis Commences His History and is Frustrated </H2>

<P>Dr Iannis had enjoyed a satisfactory day...

If we want to understand how a browser sees this information, we have to remove the text and look at the tags (which are the only parts of the document that the browser pays any attention to):

<H1 ALIGN="...">...</H1>
<P ALIGN="..."><I>...</I></P>
<P ALIGN="..."><FONT SIZE="..."><A HREF="...">...</A></FONT></P>
<HR><P ALIGN="...">...</P><HR>
<H2>...</H2>
<P>...

As you can see, HTML documents like this contain tags such as <H1> and <P> to convey structure and <I> and <FONT> to convey formatting, but there are no tags at all to indicate what information the document contains. A computer looking at these tags can't easily tell the difference between (for example) the name of the author, the name of the book, and the dedication.

The result is that a computer can interpret these tags well enough to display the information, but it can't process it in any other, more interesting way.

Enter XML

Here is the same information converted into an XML document:

<book title="Captain Corelli's Mandolin" author="Louis de Bernières" crdate="1994" crby="author">
    <dedication ID="c0p1">To my mother and father</dedication>
    <chapter ID="c1" title="Dr Iannis Commences His History and is Frustrated">
        <para ID="c1p1">Dr Iannis had enjoyed a satisfactory day…</para>
        ...
    </chapter>
</book>

This time, if we look just at the tags, we get a much richer understanding of the document:

<book title="..." author="..." crdate="..." crby="...">
    <dedication ID="...">...</dedication>
    <chapter ID="..." title="...">
        <para ID="...">...</para>
    </chapter>
</book>

These tags clearly convey both the structure and contents of the information. We can find the book's title, the author's name, and the dedication with no trouble. We can break it down into chapters and paragraphs. As a result, we can easily write software to manipulate this information any way we want.
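
As a rough illustration of how little code that takes, the sketch below pulls the title, author, dedication and chapter structure out of the book document. It assumes Python's standard-library xml.etree.ElementTree, and it inlines the document without the elided "..." so that it parses.

```python
import xml.etree.ElementTree as ET

# The book document from the text, with the elided portion left out.
BOOK_XML = """
<book title="Captain Corelli's Mandolin" author="Louis de Bernières" crdate="1994" crby="author">
  <dedication ID="c0p1">To my mother and father</dedication>
  <chapter ID="c1" title="Dr Iannis Commences His History and is Frustrated">
    <para ID="c1p1">Dr Iannis had enjoyed a satisfactory day…</para>
  </chapter>
</book>
"""

book = ET.fromstring(BOOK_XML)

# The markup names the information, so each item is a direct lookup.
title = book.get("title")
author = book.get("author")
dedication = book.findtext("dedication")
chapter_titles = [chapter.get("title") for chapter in book.findall("chapter")]
paras_per_chapter = {c.get("ID"): len(c.findall("para")) for c in book.findall("chapter")}

print(title, "by", author)
print(dedication)
print(chapter_titles, paras_per_chapter)
```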

There's just one problem: nothing in these tags tells the computer how to format the book so people can read it. We have lost exactly the information that HTML provided. We need something more.

XML Style Languages

The solution is to store the formatting in a separate document called a stylesheet, then to use a stylesheet processor to merge the XML information with the formatting in order to publish a new, human-readable document.

Stylesheets are written in style languages. You could develop your own style language and style processor, but XML comes with two standardized style languages, which are supported by a growing number of free or inexpensive tools.

CSS (Cascading Style Sheets)

CSS is a style language invented for HTML that works very well with XML. CSS is an excellent mechanism for displaying an XML document in a browser. Most of the major Web design tools allow you to draw the page and then generate a CSS script to achieve that design, so it's easy to use, too.

However, CSS has two significant drawbacks. First, it cannot produce good-quality printed output. Second, it can only "decorate" your document; it cannot change the sequence of the information in the document. Often the contents of an XML document are in a different order from the way you want them to appear. Before you can use CSS to publish such a document, you need to transform it into a sequence that matches the desired output. This adds another step to the process.

XSL (Extensible Stylesheet Language)

XSL is an XML-specific style language that overcomes the limitations of CSS and provides a lot more. In the long run this is going to be great. But as I write this (August 2000), it has a major drawback: there are no tools available that will generate the XSL stylesheets for you. So unlike CSS, you actually have to write your XSL stylesheets by hand. The result is a cost-benefit tradeoff: CSS is easier to use but less flexible; XSL is much more powerful, but for now, a lot harder to use.

XSL provides three main features, each of which is associated with a separate standard. Two of these are ready now, and one is in the standardization home stretch:

  • Transformation (XSLT). This feature lets us transform an XML document into another format, either another XML document or a DHTML or XHTML document (well-formed HTML that will work in current browsers, complete with embedded JavaScript or VBScript, if appropriate).
  • Pointing (XPath) allows us to unambiguously identify any location or type of location in an XML document. This is a core requirement of style sheet processing, since it provides the mechanism by which style rules can be applied to the information in the XML document without having to actually embed style tags in the document.
  • Formatting (XSL) is the process of applying formatting to information without having to write instructions that are specific to a particular output device. For example, using XSL, designers will be able to create rules like "all titles should be formatted in Arial 24 boldface and centered in a double-line box that is n units wide", and have the rule apply with appropriate adjustments whether printing on paper or displaying in a browser. This will be a great improvement over current mechanisms, which require that you write different rules (in different style languages, such as HTML) for each device you want to support. Unfortunately, the XSL standard is still being developed, and as a result vendors have not yet provided tools for it.

How Stylesheets Work

The main principle of all XML style languages is that the designer creates rules that bind formatting and processing instructions to types of information in the document. The rules are placed in a stylesheet in the form of templates that reference (or point to) elements or patterns of information that can be found in the document.

At runtime, a piece of software called a stylesheet processor receives the XML document and an associated stylesheet as input. (Stylesheet processors can be found inside any application that applies stylesheets to XML, such as browsers and Web page design tools.) The processor acts on the instructions contained in the stylesheet, applies the templates to the types of information they point to, and generates a new output document as a result.

This technique, which is called "declarative" rather than "procedural" programming (because you declare style rules rather than writing procedural instructions in source code), is powerful, efficient, and scales well across large information sets.
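
The declarative idea can be sketched in a few lines without any real style language. The toy processor below is plain Python (standard library only), not CSS or XSL: the RULES table plays the part of the stylesheet, apply_rules plays the part of the stylesheet processor, and the particular tag-to-HTML bindings are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The "stylesheet": rules that bind output markup to a type of element.
# These bindings are assumptions chosen for illustration.
RULES = {
    "title": ("<H1>", "</H1>"),
    "para": ("<P>", "</P>"),
}

def apply_rules(element, rules):
    """Toy stylesheet processor: walk the tree, wrapping each element's
    content in whatever the matching rule declares."""
    before, after = rules.get(element.tag, ("", ""))  # no rule: pass content through
    parts = [before, escape(element.text or "")]
    for child in element:
        parts.append(apply_rules(child, rules))
        parts.append(escape(child.tail or ""))        # text that follows the child
    parts.append(after)
    return "".join(parts)

source = ET.fromstring(
    "<doc><title>Captain Corelli's Mandolin</title>"
    "<para>Dr Iannis had enjoyed a satisfactory day</para></doc>"
)
html = apply_rules(source, RULES)
print(html)
```

Changing the output means editing the rule table, not the walking code, and the same rules apply however many titles and paragraphs the document contains.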

XSLT Example

Here is a simple example: an XSLT template that renders <title> elements as top-level headings when generating HTML:

<xsl:template match="title">
    <H1>
        <xsl:apply-templates/>
    </H1>
</xsl:template>

In this template, the match attribute value "title" indicates that the rule should be applied to any <title> element that the stylesheet processor encounters in the source XML document. The template rule here indicates that the contents of the <title> element in the XML document should be included in the HTML file that is being created, but preceded by an <H1> tag and followed by an </H1> tag. Assuming that the source XML looks like this:

<title>Captain Corelli's Mandolin</title>

then the resulting HTML will look like this:

<H1>Captain Corelli's Mandolin</H1>

This is an extremely simple example, which could easily have been accomplished using a CSS stylesheet. In practice XSLT pattern matching can be much more complicated and powerful than this, and you can do more with XSLT templates than merely apply formatting.

Stylesheets Enable Flexibility

By separating your formatting instructions from the information contents and then using the stylesheet mechanism to merge them back together, you open up a world of possibilities:

  • You can create different stylesheets for different devices (browsers, PDAs, telephones, etc.), different media (on-line, print, CD, etc.) or different purposes (manager's view, technical view, etc.).
  • You can create one stylesheet for many documents.
  • You can edit the information and the stylesheet separately from each other.
  • You can republish the entire information set at the touch of a button.

Summary: Publishing XML

The core standards that you rely on when publishing XML documents are CSS and XSL/XSLT.

The key concepts are:

  • Separate content from formatting
  • Bind styles to structure
  • Use CSS to "decorate" your information
  • Use XSLT when transformations or more powerful pattern matching is required

Processing XML

The previous sections have described ways to use the XML family of standards and tools to perform data-driven activities such as validating XML (against a schema) or publishing it (using a stylesheet and off-the-shelf tools). This section describes the standards you would use when writing custom code or scripts to process XML information in other ways.

Developers tend to work with XML at one of two levels. Low-level processing involves reading the raw XML document, breaking it into pieces and doing interesting things with the pieces. In high-level processing you rely on off-the-shelf tools to do the low-level work for you, but you still have to write code or script to customize those low-level tasks and chain them together. For low-level development you need to focus on the XML parsing standards, while for high-level development you should focus on the XML transformation standard.

Parsers

Every XML processor (including browsers, schema processors, editors, stylesheet processors, etc.) contains a parser.

The parser reads the XML document and breaks it into pieces in memory, usually corresponding to the elements and attributes of the original document. Once this is done, the processor can manipulate the pieces directly, just like data in a database:

  • Transforming it to/from other formats
  • Reassembling it in a different sequence
  • Applying formatting for printing or display

Only a small subset of XML developers needs to worry about parsers and parsing standards. Most of us rely on the parsers that are built into our favorite development environments, and leave the low-level details to the tool vendors.

The XML parsing standards help all of us, however, because they ensure that the available parsers contain the same core features and language bindings, and are therefore relatively interchangeable. This means that a developer can switch parsers without breaking his application, or can move to another development environment or programming language without losing functionality. Since different parsers and development environments have different strengths and weaknesses, this allows us to select the best tools for the job at hand.

Two Parsing Standards

XML comes with two standardized APIs (Application Programming Interfaces) for working with parsers: DOM and SAX.

Despite its name, the DOM (Document Object Model) is not really a "model" at all. Rather, it is an API for writing code to manipulate the pieces of a document after that document has been loaded into memory. The DOM standard treats the information as if it were stored in a tree structure, with commands to walk the tree, retrieve collections of information, etc. These commands certainly imply the existence of an underlying object model, but for practical purposes the benefit of DOM-compliant parsers is that they provide standardized API calls to manipulate the information.
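
For a feel of what DOM calls look like in practice, here is a small sketch that assumes Python's standard-library xml.dom.minidom, one implementation of the DOM API among many. The <order> document and its id values are invented for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical document; DOM loads all of it into memory as a tree.
doc = parseString(
    "<order><widget id='w1'>Sprocket</widget><widget id='w2'>Gear</widget></order>"
)

# Retrieve a collection of elements, then read their attributes and text.
widgets = doc.getElementsByTagName("widget")
names = [(w.getAttribute("id"), w.firstChild.data) for w in widgets]

# Walk the tree: from the root element down through its child nodes.
root = doc.documentElement
child_tags = [node.tagName for node in root.childNodes
              if node.nodeType == node.ELEMENT_NODE]

print(root.tagName, names, child_tags)
```

Calls such as getElementsByTagName and getAttribute come from the DOM standard itself, which is why code like this ports with little change to other DOM-compliant parsers and languages.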

SAX (Simple API for XML) does not pretend to be an object model. Instead, it's a set of events that can be fired while a document is being read by a SAX-compliant parser. SAX was created because DOM commands really can't be used until the entire document has been loaded into memory. But what if the document is very large, and all you care about are the contents of a <widget> element somewhere in the middle? You don't want to waste time and memory loading the whole document into memory just to access one element. Instead, you can quickly stream the document through a SAX-compliant parser to find the elements you care about and grab their contents.
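
The <widget> scenario can be sketched with Python's standard-library xml.sax module, assumed here purely for illustration. The handler class name, the <order> wrapper and the <note> elements are invented for the example.

```python
import xml.sax

class WidgetHandler(xml.sax.ContentHandler):
    """Collect the text of every <widget> element as the document streams past."""

    def __init__(self):
        super().__init__()
        self.in_widget = False
        self.buffer = []
        self.widgets = []

    def startElement(self, name, attrs):
        # Event fired when the parser reads a start tag.
        if name == "widget":
            self.in_widget = True
            self.buffer = []

    def characters(self, content):
        # Event fired for text; it may arrive in several chunks.
        if self.in_widget:
            self.buffer.append(content)

    def endElement(self, name):
        # Event fired when the parser reads an end tag.
        if name == "widget":
            self.widgets.append("".join(self.buffer))
            self.in_widget = False

handler = WidgetHandler()
xml.sax.parseString(
    b"<order><note>ignored</note><widget>Sprocket</widget><note>ignored</note></order>",
    handler,
)
print(handler.widgets)
```

Nothing outside the <widget> element is kept, so memory use stays small however large the surrounding document is.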

In a sense, the SAX functionality is a prerequisite for a DOM-compliant parser, because a DOM parser must first read the document and use SAX-like events to identify the elements and build its in-memory representation of the document. As a result, many parsers are both DOM and SAX compliant, so that you can use the same parser in either processing mode.

Transformations (XSLT)

A transformation is the process of using XSLT to convert an XML document to another format, either XML or non-XML.

Transformations are the building blocks of XML applications, because they allow us to work with information in an optimized form. As Simon Phipps of IBM said at a conference last year:  "The lifeblood of all business to business in XML is going to be transformation."

This has certainly been true in my own work. In a project I recently worked on, the heart of the application was a series of XSLT transformations between three formats: an "external" format that was used for interchange between organizations, an "internal" format that was used to support processing, and an HTML format that was used to display the information. Once these formats and transformation rules had been worked out, the rest of the development process consisted of creating generic, reusable routines to route the information and chain the transformations.

XSLT Concepts

An XSLT stylesheet consists of templates and patterns. Templates define the string/structure transformations that need to occur, while patterns bind the templates to structures in the document.

The XSLT processing model is declarative and recursive. The stylesheet processor walks through the parsed XML document, and at each node it applies the templates whose patterns match the current location. The instructions in the templates can rearrange or sort the information, wrap other character strings around it, skip it entirely, or do whatever else is necessary to transform it into the desired output.

Here is a simple example of a template to output an HTML document with "last names first". Assume that the source XML contains two <author> elements like this:

<author>
    <firstname>Alan</firstname><lastname>Griver</lastname>
</author>
<author>
    <firstname>Tony</firstname><lastname>Stewart</lastname>
</author>

And assume that the stylesheet contains a template with a pattern match for <author> elements, like this:

<xsl:template match="author">
    <xsl:value-of select="lastname"/>, <xsl:value-of select="firstname"/>
    <BR/>
</xsl:template>

As the stylesheet processor walks through the document, each time it encounters a new element it will look to see whether there is a template whose pattern matches the point it has reached in the document. At each <author> element the processor will see that there is a match, so it will evaluate the appropriate style template and add the results of that evaluation to the output that it is building. In this case the combination of the expressions in the template (the two <xsl:value-of> elements) and the literals (the comma, the space and the <BR/>) will generate a text string that looks like this: Griver, Alan<BR/>Stewart, Tony<BR/>.
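
For comparison, here is roughly the same result produced procedurally. This is not XSLT: Python's standard library does not include an XSLT processor, so the sketch uses xml.etree.ElementTree to mimic what the template computes. The <authors> wrapper element and the function name last_name_first are assumptions made so the fragment is a complete, runnable document.

```python
import xml.etree.ElementTree as ET

# <authors> is a hypothetical root element wrapping the two <author> elements.
AUTHORS_XML = """
<authors>
  <author><firstname>Alan</firstname><lastname>Griver</lastname></author>
  <author><firstname>Tony</firstname><lastname>Stewart</lastname></author>
</authors>
"""

def last_name_first(author):
    """Mimic the template body: lastname, comma, space, firstname, <BR/>."""
    return f"{author.findtext('lastname')}, {author.findtext('firstname')}<BR/>"

root = ET.fromstring(AUTHORS_XML)
# The loop stands in for the processor's walk; in XSLT this is implicit.
output = "".join(last_name_first(author) for author in root.iter("author"))
print(output)
```

The procedural version has to spell out the traversal itself, whereas the XSLT template only declares what to emit when an <author> is encountered.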

A Glimpse of the Future

The XSLT processing model is incredibly powerful and scales very well. Many XML B2B applications are based on multiple transformations from one XML format to another, and XSLT is seen as the obvious way to accomplish this.

The problem up until now is that we've had to design and write the XSLT instructions by hand. But a new generation of schema processing tools that automate the process of designing schema-to-schema transformations is about to be released.

Summary

The core standards for parsing and processing XML documents are DOM, SAX and XSLT.

The key concepts are:

  • Standardized parser APIs enable interchangeable XML processors
  • Transformations are the building blocks of XML applications
  • Schema mapping tools will generate the necessary XSLT

Why Should I Care?

Hopefully by now you understand what XML is and how the main XML standards fit together. In this section I want to suggest four reasons why XML is achieving such widespread adoption, and why I believe it will change the way we design and develop information architectures.

Loose Coupling

Loose coupling is the process of communicating between systems or components using predefined message formats, without knowing the details of each other's systems. The basic concept is: "If you send a message like this to my address, I'll send a message like that to your address."

Loose coupling addresses many of the difficulties we have experienced over the years when we tried to integrate systems using data exchange formats that were tightly bound to the systems they connected.

XML adoption enables loose coupling on a large scale. Because the XML message format is independent of the processing software, you can add more systems to a network, and change the nature of those systems, without changing the message mechanism. And because XML allows you to "extend" the message format without breaking existing processing routines, you can improve your format incrementally over time without disrupting existing participants' systems.
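A small sketch may make the point about extending a format concrete. The <order> message, its element names and the read_order function are all hypothetical, chosen only for illustration. The receiver asks for the elements it knows by name, so an element added later does not disturb it.

```python
import xml.etree.ElementTree as ET

def read_order(message):
    """Pick out only the elements this receiver knows about."""
    order = ET.fromstring(message)
    return {
        "item": order.findtext("item"),
        "quantity": int(order.findtext("quantity")),
    }

# The message format as originally agreed.
ORIGINAL = "<order><item>Widget</item><quantity>3</quantity></order>"

# A later version adds an element the receiver has never seen.
EXTENDED = (
    "<order><item>Widget</item><quantity>3</quantity>"
    "<giftwrap>yes</giftwrap></order>"
)

print(read_order(ORIGINAL))
print(read_order(EXTENDED))
```

Both calls return the same result, because the receiver never looks at <giftwrap>. A receiver that depended on element positions rather than names would not have this property.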

The result is that XML is being widely and rapidly adopted as the medium of choice for both business-to-business (B2B) and object-to-object messaging. Around the world, organizations that need to exchange data with each other are joining together to create business communities, defining XML languages that represent the data they need to exchange, and publishing schemas for those languages at central Web sites.

You can learn more about these concepts and the communities that are adopting them at two Web sites that are focal points for this activity: www.biztalk.org and www.xml.org.

Self-Describing Data

An XML document always contains its element/column/object names, and optionally includes or references its entire schema. This means it is "self-describing" in a way that no widely used data format has been before.

This in turn enables client-side processing of the XML. When you send XML to a device, software on that device can both display and process the information.
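To illustrate what "self-describing" means in practice, here is a minimal sketch in which the receiving code has no prior knowledge of the field names and recovers them from the document itself. The describe function is a name invented for this example, and it handles only one level of child elements.

```python
import xml.etree.ElementTree as ET

DOCUMENT = (
    "<author>"
    "<firstname>Tony</firstname><lastname>Stewart</lastname>"
    "</author>"
)

def describe(xml_text):
    """Return the record name and its (field, value) pairs, read from the data."""
    root = ET.fromstring(xml_text)
    return root.tag, [(child.tag, child.text) for child in root]

record, fields = describe(DOCUMENT)
print(record)
for name, value in fields:
    print(f"  {name}: {value}")
```

A comma-separated line such as "Tony,Stewart" carries the same values but none of the names, so the receiver would need the layout from somewhere else.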

This combination of self-describing data and client-side processing is already leading to incredible browser applications and a new generation of "smart" documents, while requiring significantly less programming effort than before. (Yes, that's hype, but it is also my direct experience.)

Standards-Based

The fact that XML is based on international standards makes it stable, interoperable and extensible. You can:

  • Easily exchange information between heterogeneous systems
  • Choose between competing tools
  • Build and reuse generic processing routines
  • Add features without creating legacy compatibility problems

And because the standards groups are protecting backwards-compatibility, we can expect these benefits to remain stable for years to come.

A Lingua Franca for Information

The final and most compelling reason for the widespread adoption of XML is that it works for so many different kinds of information and purposes.

Originally invented to provide the benefits of SGML on the Web (and, its founders hoped, to replace HTML in the process), XML has turned out to be equally useful for both Web and non-Web applications in a variety of domains including:

  • Information exchange
  • System integration
  • Distributed computing
  • Local metadata storage

All This, and Unicode Too!

XML uses the Unicode character set, whose code space has room for more than a million characters, accommodating most of the world's languages and character sets.
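As a brief sketch of what this means in practice, the following parses a single document that mixes scripts. The <words> document and its lang attribute values are made up for this example; the text is encoded as UTF-8 bytes, matching the encoding named in the XML declaration.

```python
import xml.etree.ElementTree as ET

# One document, several scripts, one encoding.
TEXT = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<words>'
    '<word lang="en">word</word>'
    '<word lang="el">λόγος</word>'
    '<word lang="ja">言葉</word>'
    '</words>'
)

data = TEXT.encode("utf-8")   # bytes as they would travel on the wire
root = ET.fromstring(data)    # the parser honors the declared encoding

words = {w.get("lang"): w.text for w in root.iter("word")}
for lang, text in words.items():
    print(lang, text)
```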

Conclusions

What is XML?

  • A set of standards and tools to define, exchange and publish information

Why should I care?

  • Loose coupling
  • Separation of format from content
  • Self-describing data
  • Client-side processing
  • Standards-based
  • Lingua Franca

XML Resources

For more information, check out these Web sites.

Articles and discussions about XML:

Standards and related information:

Schemas and business communities:

 

  
 
