ebXML: Building XML Messages from Processes to Data
An excerpt from ebXML: The New Global Standard for Doing Business
on the Internet, written by Alan Kotok and David R. Webber and
published by New Riders.

This section looks at the process for building business messages
with XML. As we discussed earlier in this chapter, XML lets trading
partners define their own elements and tags, taking advantage of
XML's extensible nature (the X in XML). But XML messages also
represent the structure of those elements, following their
prescribed relationships in the hierarchy. The message schema DTD
captures the names of the elements as represented by the tags and
their hierarchical structure. Messages exchanged among trading
partners therefore must represent the rules and practices of a
business or industry, as captured in the schema DTD.
For example, in Chapter 3, "ebXML at Work," the Marathoner
running store case study points out how a retailer and manufacturers
can exchange product identifiers and precise inventory levels, so
that manufacturers can compare inventory levels to predefined
reorder points and decide whether they need to ship more product.
Before any of these exchanges can happen, however, the retailer and
the manufacturers (or, better yet, the entire industry) need to agree
on common terminology and structure of the messages. With this
common set of rules, shoe manufacturers and retailers can use the
same basic set of messages, which promotes the use of packaged
software and makes it possible for the parties to develop their
systems faster and for less money.
We call this common set of rules a data model because,
like a schematic drawing, it offers a skeleton view of the
messages, specifies the order of the elements in a message, and
shows how the various elements relate to one another in a
hierarchy. The term comes from the database world, where database
design needs to meet the users' business requirements as
efficiently as possible, yet still allow for future growth. The
logical model defines the information fields and their
relationships in a database (much like a schema DTD in an XML
message), while the physical model details field sizes and
datatypes, such as alphanumeric or date formats.30 In
fact, defining an XML schema of information is analogous to
creating traditional row-and-column layouts for a database design
system.
The XML syntax is not just about interpreting the content. The
business process is a vital component of the content and is helped
along by XML.
As shown in the case studies in Chapter 3, the parties identify
business processes or actions taken by the companies to achieve
their business goals. For example, the travel agency case proposes
a process to decide on a tour package. This process has
contingencies built in for continued bids and best-and-final offers
if the customers don't want to accept one of the first offers. By
working out these larger processes, the trading partners can agree
on the overall conduct of the business, before trying to determine
the individual messages.
A tool called use cases can help identify these
processes. Use cases describe scenarios in which users interact
with each other and the systems under development. Each scenario
describes the accomplishment of a specific task or achievement of a
goal. They also identify the players, steps in the process, and the
messages or even the data exchanged. By describing these situations
in a storytelling mode, use cases often uncover the processes
underlying business practices.31
One of the ebXML development activities involves identifying
similarities in business processes across industries. While each
industry has its own language and culture, using these common
processes helps speed the work and improves the chances for
interoperability among industries.32
Each process contains a set of individual messages exchanged
among the trading partners. In Chapter 3, the running store case
listed a series of messages in the process of reporting inventory
levels and replenishing the stock:
- Periodic inventory report sent from the store to the manufacturer
- Ship notice sent from the manufacturer to the store with the
shipment details
- Receiving report sent from the store to the manufacturer once
inventory is accepted
Industries defining their processes can identify the individual
messages contained in those processes, as well as how and when the
companies send and receive the messages. These messages may
resemble EDI transaction sets (see Chapter 5, "The Road Toward
ebXML," for a discussion of EDI), as in the running store case, or
look nothing like EDI transactions, as in the travel agency
case.
Identify Data in the Messages
Once industries identify the messages, they next need to
identify the sets of business data that go into those messages.
Industry organizations that have previously developed EDI
transactions can use this work as the basis for identifying data
for XML messages. Newer business processes must rely on information
analysis between companies to determine the content required, often
replacing older, paper-based documents. But the objective is to
improve the way companies do business, not necessarily to follow the
current EDI transactions or old paper-process documents. Industries
sometimes use this exercise to test traditional assumptions and
practices, which can cut out captured or exchanged data that's no
longer needed. On the other hand, this process can generate more
pieces of data needed by trading partners to meet their business
requirements.
When applying this process to XML, industry groups develop XML
vocabularies that put these groups of data into definable messages,
also identifying the structure of the data in the messages. To aid
understanding and reuse, the XML structure should link related and
most-used pieces of information together as logical blocks. The
messages thus embody the rules and practices of doing business in a
particular industry, defined in terms of XML. In this way,
industries can design common groups of data with common structures
as industry-wide rules for processing XML messages.
XML vocabularies can represent more than vertical industries.
Vocabularies can also define business functions found in multiple
industries, or entire frameworks that provide interoperability
across industries and functions. One of these frameworks is ebXML
itself, which provides the underpinning for global business, not
just an industry sector.33
As discussed earlier, DTDs, as specified for XML, contain the
rules for both constructing and structurally validating XML
messages. We'll now describe schemas in more detail to give you an
understanding of how this key piece of the XML technology is used
to enable consistent electronic business.
DTDs assemble information into elements with connected
attributes. Elements are the basic building blocks of XML messages,
and therefore the basic components of DTDs. Elements can contain
other elements expressed in a hierarchy (compound elements),
or they can stand alone as simple containers for character data.
Compound elements for parent/child blocks can be referenced
together. When the modeling process identifies the data in proposed
XML messages, most of these data items will become elements,
identified as such in DTDs. In XML messages themselves, elements
are marked up as tags within the now familiar angle brackets
(<>). Element definitions can indicate the frequency with
which the elements occur (once or more than once) and whether they're
required or optional.34
Then attributes provide additional description or
qualification for elements. Using the language metaphor often
applied to XML, one can think of elements as nouns and attributes
as adjectives. The XML document example presented earlier and the
following DTD fragment identify the PostalCode as an
element, with the codetype and its use as an attribute of
that element:
<PostalCode codetype='ZIP'>96045</PostalCode>
<!-- DTD definition for element and attribute -->
<!ELEMENT PostalCode (#PCDATA) >
<!ATTLIST PostalCode
    codetype CDATA #IMPLIED >
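To show how an application might read the PostalCode markup above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module. It is an illustration added for this excerpt, not part of the ebXML specifications; the variable names are assumptions, and this parser reads the element without validating it against the DTD.

```python
import xml.etree.ElementTree as ET

# The PostalCode element from the example above.
doc = "<PostalCode codetype='ZIP'>96045</PostalCode>"

element = ET.fromstring(doc)

# The element (the "noun") carries the data itself.
postal_code = element.text

# The attribute (the "adjective") qualifies that data. Because codetype is
# declared #IMPLIED, it may be absent; get() then returns None.
code_type = element.get("codetype")

print(element.tag, postal_code, code_type)
```

Keeping the code type in an attribute lets the same PostalCode element carry postal codes from different countries without changing the element name.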
With the schema DTD syntax, the attributes also provide a
limited form of data typing, which means that they describe the
kind of data allowed for that element. Attributes can contain
strings (character data), enumerated lists, or references to other
components in the document called tokens.
Enumerated lists restrict the attribute to only permitted
character strings. For example, an attribute to identify smoking
preferences for hotel reservations would have the following as its
enumerated listing: SMOKING or NONSMOKING. Attributes can
likewise indicate a default response, used routinely unless the
customer requests otherwise. Returning to the hotel example, the
NONSMOKING response could serve as the default, unless the
customer specifically requests SMOKING.35 Schema DTD datatyping is
deliberately simple and therefore easy to understand, whereas the
new W3C extended schema datatyping is extensive and
sophisticated.36
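The hotel example can be sketched in Python with the standard library's expat bindings. The Room element and its smoking attribute are hypothetical names chosen for illustration. The sketch assumes the DTD sits in the document's internal subset, where expat supplies declared default values; expat does not validate, so the enumerated list is checked by hand.

```python
import xml.parsers.expat

# Hypothetical DTD fragment: a Room element whose smoking attribute is an
# enumerated list with NONSMOKING as the default value.
DTD = """<!DOCTYPE Room [
  <!ELEMENT Room EMPTY>
  <!ATTLIST Room smoking (SMOKING | NONSMOKING) "NONSMOKING">
]>"""

ALLOWED = {"SMOKING", "NONSMOKING"}


def smoking_preference(xml_body):
    """Return the smoking attribute of the first Room element."""
    seen = []

    def on_start(name, attrs):
        if name == "Room":
            seen.append(attrs.get("smoking"))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(DTD + xml_body, True)
    value = seen[0]
    # expat is a non-validating parser, so enforce the enumeration here.
    if value not in ALLOWED:
        raise ValueError("smoking must be one of %s" % sorted(ALLOWED))
    return value


print(smoking_preference("<Room/>"))
print(smoking_preference('<Room smoking="SMOKING"/>'))
```

A validating parser would reject a value outside the list on its own; the manual check stands in for that step.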
Entities are rather misnamed. They're really aliases or
substitution strings, intended to identify the reusable objects in
a schema DTD, providing handy shortcuts and helping to ensure
consistency in the rules expressed by the DTD. These reusable
objects can consist of text strings, such as legal boilerplate, or
more complex data element and attribute combinations, defined in
advance and recalled when needed. Entities can be internal to the
DTD or stored as fragments externally.37
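As a small illustration of an internal entity used as reusable boilerplate, the following Python sketch declares an entity once and references it twice. The Offer document, its element names, and the entity name legal are all hypothetical; the sketch relies on the standard library parser expanding entities declared in the internal DTD subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares an entity named
# "legal" once, and the body reuses it wherever the boilerplate is needed.
doc = """<!DOCTYPE Offer [
  <!ENTITY legal "Prices subject to change without notice.">
]>
<Offer>
  <Terms>&legal;</Terms>
  <Footer>&legal;</Footer>
</Offer>"""

root = ET.fromstring(doc)

# Both elements receive the same expanded text.
terms = root.findtext("Terms")
footer = root.findtext("Footer")

print(terms)
print(footer)
```

Changing the wording in the single entity declaration changes it everywhere the entity is referenced, which is the consistency benefit described above.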
Entities also help when placing a character inside the character
data or an attribute value of an XML document that would otherwise
be confused with markup. The predefined entities &amp;, &lt;, &gt;,
and &quot; stand for the characters &, <, >, and ".
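A short Python sketch shows the round trip for these special characters, using escape() from the standard library's xml.sax.saxutils module. The agency name is a made-up value chosen because it contains characters that XML treats as markup.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical company name containing characters that XML treats as markup.
raw_name = "Peabody & Beebe <Travel>"

# escape() replaces &, <, and > with &amp;, &lt;, and &gt;.
escaped = escape(raw_name)
doc = "<AgencyName>" + escaped + "</AgencyName>"

# Parsing turns the entity references back into the original characters.
round_trip = ET.fromstring(doc).text

print(escaped)
print(round_trip)
```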
Consider the telephone number in the following example. The
<Telephone> element is a substitution string
declared as an entity in the schema DTD telephone-usa.xml,
and then included as needed in XML documents based on that DTD. The
OpenTravel Alliance uses this technique in its customer profile,
which specifies several telephone numbers (customer, emergency
contact, travel agency, and so on). The use of this technique
simplifies the schema DTD and guarantees that all telephone numbers
in the valid messages are defined consistently.38
<?xml version="1.0"?>
<!DOCTYPE Cust.Telephone SYSTEM
  'http://xml.org/telephone-usa.xml' []>
<Cust.Telephone PhoneTech="Voice" PhoneUse="Home">
  <Telephone CountryAccessCode="1">
    <Phone.AreaCityCode>703</Phone.AreaCityCode>
    <Phone.Number>555-9999</Phone.Number>
  </Telephone>
</Cust.Telephone>
Using a traveler's customer profile, we can show an example of a
DTD and how it helps build and validate an XML message.
Table 4.1 shows the pieces of information in a scaled-down
traveler profile database. It covers three levels in the data
hierarchy and, for each item, the kind of content (element, text,
or attribute), single or multiple occurrences, whether the item is
required, and the allowable options.
The control information identifies the creator of the profile (a
travel agency, for the purpose of this exercise), whether it's a
new record or an update, whether the customer has given permission
to share the data in the profile, and a date/time stamp that most
systems can generate routinely.
Table 4.1: Traveler Profile Database Structure
| Data level 1       | Data level 2      | Data level 3      | Content   | Occurs   | Required? | Options |
|--------------------|-------------------|-------------------|-----------|----------|-----------|---------|
| Control info       |                   |                   | Element   | Single   | Yes       |         |
|                    | Share permission? |                   | Attribute |          |           | Yes, No |
|                    | Agency            |                   | Element   | Single   | Yes       |         |
|                    |                   | Agency name       | Text      | Single   | Yes       |         |
|                    |                   | Agency ID         | Text      | Single   |           |         |
|                    | New/Update        |                   | Text      | Single   | Yes       | New, Update |
|                    | Date-time         |                   | Text      | Single   | Yes       |         |
| Traveler ID        |                   |                   | Element   | Multiple | Yes       |         |
|                    | Traveler name     |                   | Element   | Single   | Yes       |         |
|                    |                   | Title             | Text      | Multiple |           |         |
|                    |                   | Family name       | Text      | Single   | Yes       |         |
|                    |                   | Given names       | Text      | Multiple |           |         |
|                    | Address           |                   | Element   | Multiple | Yes       |         |
|                    |                   | Address type      | Attribute |          |           | Mailing, Delivery |
|                    |                   | Number/street     | Text      | Single   | Yes       |         |
|                    |                   | Room/floor        | Text      | Multiple |           |         |
|                    |                   | City name         | Text      | Single   | Yes       |         |
|                    |                   | Postal code       | Text      | Single   | Yes       |         |
|                    |                   | State/Province    | Text      | Multiple |           |         |
|                    |                   | Country           | Text      | Single   |           |         |
|                    | Telephone         |                   | Element   | Multiple | Yes       |         |
|                    |                   | Telephone use     | Attribute |          |           | Work, Home |
|                    |                   | Country access    | Text      | Single   |           |         |
|                    |                   | Area/city code    | Text      | Single   | Yes       |         |
|                    |                   | Tel. number       | Text      | Single   | Yes       |         |
|                    | Email             |                   | Element   | Multiple |           |         |
|                    |                   | Email type        | Attribute |          |           | Work, Personal |
|                    |                   | Email address     | Text      | Single   |           |         |
| Form of payment    |                   |                   | Element   | Multiple | Yes       |         |
|                    | Payment type      |                   | Attribute |          |           | Credit card, Debit card |
|                    | Payment detail    |                   | Element   | Multiple | Yes       |         |
|                    |                   | Card number       | Text      | Single   | Yes       |         |
|                    |                   | Exp. date         | Text      | Single   | Yes       |         |
|                    |                   | Name on card      | Text      | Single   | Yes       |         |
| Travel preferences |                   |                   | Element   | Multiple |           |         |
|                    | General           |                   | Element   | Multiple |           |         |
|                    |                   | Smoking section   | Text      | Single   |           | Smoking, Non-smoking |
|                    |                   | Meal preferences  | Text      | Multiple |           |         |
|                    |                   | Special needs     |           | Multiple |           |         |
|                    | Loyalty programs  |                   | Element   | Multiple |           |         |
|                    |                   | Program type      | Attribute |          |           | General, Airline, Hotel, Rental car |
|                    |                   | Program name      | Text      | Single   |           |         |
|                    |                   | Program ID        | Text      | Single   |           |         |
|                    | Airline           |                   | Element   | Multiple |           |         |
|                    |                   | Departure airport | Text      |          |           |         |
|                    |                   | Seat selection    | Text      |          |           | Aisle, Center, Window |
|                    | Hotel             |                   | Element   | Multiple |           |         |
|                    |                   | City section      | Text      |          |           | Downtown, Suburbs, Airport |
|                    |                   | Room type         | Text      |          |           | Single, Double |
|                    | Car rental        |                   | Element   | Multiple |           |         |
|                    |                   | Car type          | Text      |          |           | Compact, Midsize, Full, SUV, Truck |
|                    |                   | Child seat        | Text      | Single   |           | Yes, No |
The DTD for this database structure (Traveler.dtd) is
found on this book's web site (www.ebxmlbooks.com). Please
note that this DTD example is meant only to illustrate how a DTD
works, and should not be used for normal business messages.
From this database structure, a travel agency wants to create a
traveler profile record for a traveler, with the following specific
data and preferences:
Administrative control data
- Agency name: GoGo Travel
- Agency ID code: ZZY98234
- Purpose of record: new
- Date/time: 21 June 2001, 3:55 pm
- Permission to share data in profile? No
Traveler identification
- Traveler's name: Ms. Phoebe P. Peabody-Beebe
- Address (delivery): 312 Sycamore St., Buffalo, NY 14204
- Telephone (work): 716-555-9999
- Email: Phoebe@PeabodyBeebe.com
Payment data
- Type of payment: Credit card
- Card number: 0000111122223333
- Expiration date: 12/2002
- Name on card: Phoebe P Peabody-Beebe
Preferences
- Nonsmoking
- Meal type: Vegetarian
- Loyalty program (airlines): US Airways, no. 24680
- Loyalty program (car rental): National Car Rental, no. 54321
- Loyalty program (general): AmEx Membership Miles, no. 09876
- Departure airport (IATA code): BUF
- Airline seat preference: Aisle
- Hotel, city section preference: downtown
- Hotel room preference: single
- Car type preference: Compact
Listing 4.4 gives a validated XML document for these entries
based on the rules presented in Traveler.dtd.
Listing 4.4: Sample XML Document Based on Traveler.dtd
<Traveler>
  <Control>
    <Agency>
      <AgencyName>Go-Go Travel</AgencyName>
      <AgencyID>ZZY98234</AgencyID>
    </Agency>
    <Purpose>New</Purpose>
    <DateTime>20010621t15:55:00</DateTime>
  </Control>
  <TravelerID Share="No">
    <TravelerName>
      <Title>Ms</Title>
      <Family>Peabody-Beebe</Family>
      <Given>Phoebe</Given>
      <Given>P.</Given>
    </TravelerName>
    <Address AddressType="Deliver">
      <NumberStreet>312 Sycamore St</NumberStreet>
      <City>Buffalo</City>
      <PostalCode>14204</PostalCode>
      <StateProv>NY</StateProv>
    </Address>
    <Telephone PhoneUse="Work">
      <AreaCity>716</AreaCity>
      <PhoneNumber>555-9999</PhoneNumber>
    </Telephone>
    <Email>
      <EmailAddress>Phoebe@PeabodyBeebe.Com</EmailAddress>
    </Email>
  </TravelerID>
  <Payment>
    <PayDetail>
      <CardNumber>0000111122223333</CardNumber>
      <ExpDate>12/2002</ExpDate>
      <NameOnCard>Phoebe P Peabody Beebe</NameOnCard>
    </PayDetail>
  </Payment>
  <Preferences>
    <General>
      <Smoking>Non-smoking</Smoking>
      <MealPref>Vegetarian</MealPref>
    </General>
    <Loyalty LoyalType="Airline">
      <LoyalName>US Airways</LoyalName>
      <LoyalID>24680</LoyalID>
    </Loyalty>
    <Loyalty LoyalType="Car Rental">
      <LoyalName>National Car Rental</LoyalName>
      <LoyalID>54321</LoyalID>
    </Loyalty>
    <Loyalty LoyalType="General">
      <LoyalName>Amex Member Miles</LoyalName>
      <LoyalID>09876</LoyalID>
    </Loyalty>
    <Airline>
      <DepartAirport>BUF</DepartAirport>
      <SeatSelect>Aisle</SeatSelect>
    </Airline>
    <Hotel>
      <CitySection>Downtown</CitySection>
      <RoomType>Single</RoomType>
    </Hotel>
    <CarRent>
      <CarType>Compact</CarType>
    </CarRent>
  </Preferences>
</Traveler>
This message referencing the Traveler.dtd contains all of
the required data, uses tags that match the element names in the
DTD, presents the elements and tags in the order prescribed by the
DTD, and therefore conforms as a valid structure to that DTD.
Notice that the example doesn't have any data for child seat
preferences listed under the XML car rentals section, but does have
three different loyalty programs listed. The rules expressed in the
DTD allow for such variations. However, if a message left out the
traveler's name, a validating parser would return an error message
accordingly.
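The missing-name case can be approximated in a few lines of Python. Python's standard library does not validate against DTDs, so this sketch hard-codes one rule from the "Required?" column of Table 4.1 (a traveler name needs a family name) rather than reading Traveler.dtd. The function name and rule dictionary are assumptions made for illustration, not how a validating parser is implemented.

```python
import xml.etree.ElementTree as ET

# Required child elements, taken from the "Required?" column of Table 4.1
# for the traveler name. A real validating parser would read these rules
# from Traveler.dtd; this dictionary is a hand-written stand-in.
REQUIRED_CHILDREN = {"TravelerName": ["Family"]}


def missing_required(xml_text):
    """List required child elements that are absent from the fragment."""
    root = ET.fromstring(xml_text)
    problems = []
    for parent_tag, children in REQUIRED_CHILDREN.items():
        # iter() visits the root itself as well as its descendants.
        for parent in root.iter(parent_tag):
            for child_tag in children:
                if parent.find(child_tag) is None:
                    problems.append(parent_tag + "/" + child_tag)
    return problems


complete = ("<TravelerName><Title>Ms</Title><Family>Peabody-Beebe</Family>"
            "<Given>Phoebe</Given><Given>P.</Given></TravelerName>")
incomplete = "<TravelerName><Title>Ms</Title></TravelerName>"

print(missing_required(complete))
print(missing_required(incomplete))
```

Optional items such as the title or the child seat preference are simply left out of the rule dictionary, which mirrors how the DTD tolerates their absence.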
The generic name for DTDs is schemas, a term borrowed
from the database world. DTDs represent data only in a hierarchy,
which works fine for documentation; remember that the W3C borrowed
DTDs from SGML, designed for electronic documentation and the
predecessor to XML.
However, many business databases use other kinds of structures
(such as relational databases or object-oriented classes and
properties), some of which don't always lend themselves to a
hierarchical model. In some cases, particularly when working with a
simple data structure, data architects have been able to adapt
object-oriented structures or relational data models to the kind of
hierarchies represented in DTDs. But business doesn't always deal a
simple hand, and technologists need more robust and flexible tools
than the DTD to be prepared for these more complex conditions.
The W3C has developed XML Schema, a major enhancement to
XML that offers extended tools for representing information
structures and objects, as well as providing extended datatypes
beyond those in DTDs. In May 2001, XML Schema reached full
recommendation status.39
XML Schema provides more power for defining the structure,
content, and semantics of XML documents. The W3C specifications
document has three parts:
- Methods for describing the structure of data
- Definition of datatypes
- A primer, explaining its features40
The first part of the specification deals with structures,
documenting the meaning, use, and relationships of the components
of an XML document, such as elements, attributes, and entities. It
provides the rules for validating XML documents, based on the rules
described in the schemas. It also allows for referencing partial or
multiple schemas, thus providing a great deal more flexibility and
power than DTDs.41
The second part of XML Schema covers datatypes and addresses the
need for defining more kinds of data in the rules used to validate
XML documents. This part of the specification identifies a group of
basic (or primitive) datatypes such as strings, integers,
dates, and sequences. The specification describes features of a
datatype system, including acceptable ranges of values and valid
representations of the data (such as whole numbers or scientific
notation).
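To give a feel for what datatype checking adds, the following Python sketch tests whether a string is a real calendar date in YYYY-MM-DD form, the general shape of a schema date value. It is a stand-in written with Python's datetime module, not an XML Schema processor; the function name is an assumption, and details such as optional time zone suffixes are ignored.

```python
from datetime import datetime


def is_valid_date(text):
    """Return True if text is a real calendar date written as YYYY-MM-DD."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(is_valid_date("2001-06-21"))    # a well-formed, real date
print(is_valid_date("2001-02-30"))    # right shape, but no such day
print(is_valid_date("21 June 2001"))  # wrong representation
```

A DTD can only say that such a field holds character data, so all three strings would pass DTD validation.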
The specification identifies datatypes derived from those built
into the basic XML recommendations, such as character data (CDATA),
tokens, and entities. And it defines various components of
datatypes to allow for the development of unanticipated
datatypes.42
This greater flexibility comes with a price, however. While it's
tempting to use many of these new features, many business
applications require just a few of them at any time. For example,
being able to validate dates and times will be a significant
addition to XML's ability to support business. Few businesses,
however, will need the ability to create entirely new datatypes.
Software and systems supporting XML Schema will need to resist the
temptation to cover all of the bells and whistles, since they build
in more complexity and cost than is needed.43
As an alternative, an OASIS Technical Committee is developing
RELAX NG, with eventual submission to ISO planned. RELAX NG is
designed as a simpler and more accessible approach to providing
schema functionality for XML documents.
XML Schema incorporates one of the first enhancements to the XML
specification, called XML Namespaces. With XML Namespaces, schemas
can address multiple XML vocabularies in a single document.
Namespaces provide for uniqueness in element names by combining a
namespace prefix (mapped to a uniform resource identifier, such as
a web address) with the local part, that is, the element or
attribute name.45
Put simply, XML Namespaces allow different companies or
industries to avoid name clashes where they both use the same word
with different meanings or contexts, but with the same tag name.
The word stock, for instance, has at least six possible meanings.
Prefixed names such as billing:address and supplier:address make
clear that address is being used in two different contexts.
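The billing:address and supplier:address case can be sketched in Python with xml.etree.ElementTree. The two namespace URIs, the order element, and the address values are hypothetical placeholders; any unique URIs would serve.

```python
import xml.etree.ElementTree as ET

# The URIs and address values below are hypothetical placeholders.
doc = """<order xmlns:billing="http://example.com/billing"
       xmlns:supplier="http://example.com/supplier">
  <billing:address>312 Sycamore St., Buffalo, NY</billing:address>
  <supplier:address>1 Factory Rd., Rochester, NY</supplier:address>
</order>"""

root = ET.fromstring(doc)
ns = {"billing": "http://example.com/billing",
      "supplier": "http://example.com/supplier"}

billing_address = root.find("billing:address", ns)
supplier_address = root.find("supplier:address", ns)

# ElementTree expands each prefix to its URI, so the two tags differ
# even though the local name "address" is the same.
print(billing_address.tag)
print(supplier_address.tag)
```

The prefix itself is only shorthand; the URI it maps to is what makes each name unique.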
While the XML family of technologies goes a long way to build up
the features needed for exchanging business messages, XML markup
technology by itself can't do all that's needed. The major
inhibitor to the use of XML for business is its inability to
provide for interoperability among the various vocabularies written
for exchanging business messages. The number of these vocabularies
is expanding rapidly; a survey in 2000 showed their number
doubling between February and August of that year.46 Unless a
way is found to allow businesses using these vocabularies to
understand messages from other vocabularies, the promise of XML as
a data exchange technology will go unfulfilled.
To achieve this interoperability, companies using XML need to
have a common set of methods with translation among the different
industry syntaxes. Also needed is a way of relating the XML
messages to overall business processes that give context to the
messages and the data contained within them. XML by itself also has
no inherent provisions for security and privacy, although the W3C
has undertaken important initiatives in these areas, notably with
digital signatures and privacy preferences using extensions to the
base XML specifications.
At the same time, no solution can just pile on all of these
requirements without keeping an eye on the impact it will have on
achieving the desired objectives, namely keeping within the scale and price
range of the millions of smaller businesses that so far are left
out of the data-exchange experience. This is the challenge laid at
the feet of ebXML.