Mark Wilson
I am the creator of TopXML. I am available for international and local (Australia) contracts. I am a Solution Architect/Business Analyst. I have worked in IT in several countries (NZ, Australia, South Africa, UK) building and training teams for government and very large non-governmental organizations. I am ex-Microsoft Consulting Services. I wrote the first book on Microsoft XML published in 2000 called XML Programming with VB and ASP. Most recently I have been building tools for the SEO industry. Ask me for a 37-point SEO health-checkup for your website.
Versioning data is an ancient practice in databases. A
timestamp field is added to the database tables to record
when each update happened. Many developers use the timestamp to
ensure that an update makes sense: is this user submitting
older data that would wipe out newer data? (I use a record-version
"counter" field instead and compare it with the incoming data's record
version. A counter does not depend on clock resolution or time zones,
which makes it perfect for the sub-millisecond, multi-time zone
services we want to support.)
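A minimal sketch of the record-version counter check, in Python. The `RecordStore` class, its method names and `StaleUpdateError` are hypothetical names made up for illustration, and an in-memory dictionary stands in for a real database table.

```python
class StaleUpdateError(Exception):
    """Raised when incoming data is based on an older record version."""


class RecordStore:
    """In-memory stand-in for a table that has a record-version counter column."""

    def __init__(self):
        self._rows = {}  # key -> (version, data)

    def insert(self, key, data):
        # New records start at version 1.
        self._rows[key] = (1, data)
        return 1

    def read(self, key):
        # Returns (version, data); the caller keeps the version it read.
        return self._rows[key]

    def update(self, key, data, based_on_version):
        current_version, _ = self._rows[key]
        # Reject the update if someone else changed the record in between.
        if based_on_version != current_version:
            raise StaleUpdateError(
                f"record {key!r} is at version {current_version}, "
                f"update was based on version {based_on_version}"
            )
        new_version = current_version + 1
        self._rows[key] = (new_version, data)
        return new_version
```

A caller reads a record, remembers the version it saw, and passes that version back with the update. In a real database the comparison and the increment would have to happen atomically, for example in a single `UPDATE ... WHERE version = ?` statement.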
We commonly version our software tools to ensure the proper
files are updated. Versioning our software lets anyone determine if
they need an update.
Versioning XML data without changing the tools that
process the documents is not difficult. All it takes is some extra
thought at the design stage. Done well, older software can
even process XML structures that were invented after that
software shipped.
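One way older software can cope with newer structures is to read only the elements it knows and skip the rest. The sketch below illustrates that idea with Python's standard `xml.etree.ElementTree` module. The `product` document, its field names and the `read_product` function are assumptions invented for this example, not a format described in this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical fields that were known when this reader shipped.
KNOWN_FIELDS = ("name", "price")


def read_product(xml_text):
    """Collect the known child elements and ignore anything unfamiliar."""
    root = ET.fromstring(xml_text)
    product = {}
    for child in root:
        if child.tag in KNOWN_FIELDS:
            product[child.tag] = (child.text or "").strip()
        # Unknown elements are skipped rather than treated as errors.
    return product


# A document in the structure the reader was written for.
OLD_DOC = "<product><name>Widget</name><price>9.95</price></product>"

# A later structure with an element invented after the reader shipped.
NEW_DOC = (
    '<product version="2"><name>Widget</name><price>9.95</price>'
    '<warranty months="12"/></product>'
)
```

With this design, `read_product(OLD_DOC)` and `read_product(NEW_DOC)` return the same dictionary, because the reader never looks at `<warranty>`. The trade-off is that a strict, validating reader would catch mistakes that this tolerant one lets through.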
The Expanding XML Universe
Many people were surprised when Microsoft announced that Excel
spreadsheets held more data than all other databases combined. In a
few years, XML documents could take over that position.
In its most basic incarnation, XML is only a container for data.
An XML document is meant for distribution and will probably be
stored too. You cannot be sure of who will pick it up or when.
Star Trek's Spock explained: "Entropy increases in an
expanding universe." (Entropy means randomness or chaos -
the common foe of IT departments everywhere.) Universal access to
data is expanding. Hence, chaos could increase. Here are some
examples of the expanding data access:
B2B services,
BizTalk,
Application Service Providers,
Browsers that work off-line with saved pages,
Cool new web services with SOAP.
Those services were not common a few years ago. They are
becoming more common. That means:
Old XML documents will be floating around out there.
The potential problem of old data being used is
increasing.
Some Idle Tuesday at 3:45 pm
Like some old radio message from deep space, your software
receives some XML data that is formatted to an old version. It
could be minutes or months old. You could simply reject the message
because it is malformed (by current standards). One co-worker used
to say, "Apply swift justice and use the error message cannons!" He
was a likeable nut case. His point was that the senders should keep their
software up to date. Right?
Well, in a vastly distributed world, upgrading every computer is
not always possible. For example, some large corporations have
frozen software versions because it makes helpdesk support easier.
Heck, some web servers aren't even getting security patches
installed! These are not our computers and we cannot control
them. Of course, my old co-worker might start up a side occupation
of hacking into machines, upgrading their software and purging old
XML documents. As for me, I'm going past jail and sticking with my
day job.
You could try to freeze your DTDs (Document Type
Definitions) and other schemas to avoid version
problems. But we all know that won't last forever because we have
to accommodate the changes that happen outside of our sphere of
influence.
If you want to be open to all, for as long as possible, you need
to handle old XML data. Old data will be in use, especially in the
short period after a new release. Nobody really wants to cause a
failure because of a version change in the XML document structure.
That failure could be a cascading failure once web services become
commonplace.
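As a rough sketch of handling old data instead of rejecting it, the Python below reads a version number from the root element and upgrades older documents step by step before processing them. Everything specific here is an assumption for illustration: the `version` attribute, the `order` document, the rename of `<cost>` to `<price>`, and the function names.

```python
import xml.etree.ElementTree as ET

CURRENT_VERSION = 2  # the newest structure this software understands


def upgrade_v1_to_v2(root):
    # Hypothetical change between versions: <cost> was renamed to <price>.
    for elem in list(root.iter("cost")):
        elem.tag = "price"
    root.set("version", "2")


# One upgrade step per old version, applied in sequence.
UPGRADERS = {1: upgrade_v1_to_v2}


def load_order(xml_text):
    """Parse an order and bring old structures up to the current version."""
    root = ET.fromstring(xml_text)
    # Assumption: a document without a version attribute is version 1.
    version = int(root.get("version", "1"))
    if version > CURRENT_VERSION:
        raise ValueError(f"order version {version} is newer than this software")
    while version < CURRENT_VERSION:
        UPGRADERS[version](root)
        version += 1
    return root
```

Chaining small upgrade steps means each release only has to add one new function, and a document several versions old still passes through every step in order.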
The key here is that a sharp company designs things so that its
users and customers are as successful as possible in accessing
its application. Designing a system that simply rejects service
requests is a dull idea in a cutting-edge world.
The goal is to make a versioning system that survives over a
long time. I cannot point to any long-term stable XML examples, so
let's turn to OLE 2 as an example of a good approach.
Please, ignore for the moment that OLE 2 is an interface, and not a
data document.
OLE 2 (now called COM, COM+, DCOM) has been a stable interface
for about 8 years. It's from the days when the DOSisaur ruled the
desktop; before browsers were invented; when Windows was new.
That's millions of CPU minutes ago, making it a good example of
stability. The OLE 2 interface survived the changes in the
environment because it supported an interface tools could rely on.
The OLE 2 interface is used to reveal information about things you
do not know about. That's really useful. Effectively,
you can ask it questions about some "thing" that will be used, and OLE
2 returns useful answers. Naturally, if you already know the
"thing", you can directly access it without asking questions first.
(Note: OLE 2 only works with executables that support the OLE
interface. XML files do not support that interface. The answer is
not there.) Kudos go to the OLE team.
If we apply this concept to XML data, we could potentially talk
to any XML document more intelligently. Naturally, anyone can walk
through all the nodes of an XML document now. Walking through the
nodes will expose the raw data. We know there is data; we
just don't know things about the data. If there were so