In addition to tags and elements, XML
documents can include attributes.
Attributes are simple name/value pairs associated with an
element.
They are attached to the start-tag, as shown below, but not to the
end-tag:
<name nickname='Shiny John'>
  <first>John</first>
  <middle>Fitzgerald Johansen</middle>
  <last>Doe</last>
</name>
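To see what a parser makes of this, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. Neither Python nor this module is part of the original example; they are assumed here purely for illustration. The sketch reads the nickname attribute from the start-tag and, separately, the text of a child element.

```python
import xml.etree.ElementTree as ET

doc = """<name nickname='Shiny John'>
  <first>John</first>
  <middle>Fitzgerald Johansen</middle>
  <last>Doe</last>
</name>"""

name = ET.fromstring(doc)

# Attributes come back as a dictionary of name/value strings.
attributes = name.attrib
nickname = name.get("nickname")

# Child elements are reached through a different call than attributes.
first = name.findtext("first")

print(attributes, nickname, first)
```

Notice that the attribute belongs to the element as a whole: there is nothing to read from the end-tag.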
Attributes must have values - even if that value is just an empty
string (like "") - and those values must be in quotes. So the
following, which is part of a common HTML tag, is not legal in
XML:
<input checked>
and neither is this:
<input checked=true>
Either single quotes or double quotes are fine, but they
have to match. For example, to make this well-formed XML, you can use
one of these:
<input checked='true'>
<input checked="true">
but you can't use:
<input checked="true'>
Because either single or double quotes are allowed, it's easy to
include quote characters in your attribute values, like "John's
nickname" or 'I said "hi" to him'. You just have to be careful not to
accidentally close your attribute, like 'John's nickname'; if an XML
parser sees an attribute value like this, it will think you're
closing the value at the second single quote, and will raise an error
when it sees the "s" which comes right after it.
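The quoting rules above can be checked with a short Python sketch, assuming the standard library's xml.etree.ElementTree parser (our choice, not something the original text uses). The helper attribute_value and the attribute name note are hypothetical, introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

def attribute_value(xml_text):
    """Return the 'note' attribute, or None if the text is not well-formed."""
    try:
        return ET.fromstring(xml_text).get("note")
    except ET.ParseError:
        return None

# A single quote inside a double-quoted value is fine.
double_outside = attribute_value('''<name note="John's nickname"/>''')

# Double quotes inside a single-quoted value are fine too.
single_outside = attribute_value("""<name note='I said "hi" to him'/>""")

# Here the second single quote closes the value early, so parsing fails.
closed_early = attribute_value("""<name note='John's nickname'/>""")

print(double_outside, single_outside, closed_early)
```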
The same rules apply to naming attributes as apply to naming
elements: names are case sensitive, can't start with "xml", and so
on. Also, you can't have more than one attribute with the same name
on an element. So if we create an XML document like this:
<bad att="1" att="2"></bad>
IE5 will report an error, because the att attribute appears twice on the same element.
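Other parsers refuse this document as well. As a minimal sketch, assuming Python's standard xml.etree.ElementTree module (not IE5, and not part of the original example), we can confirm that the duplicate attribute stops the parse:

```python
import xml.etree.ElementTree as ET

try:
    ET.fromstring('<bad att="1" att="2"></bad>')
    duplicate_rejected = False
except ET.ParseError as err:
    # The exact wording of the message depends on the parser.
    duplicate_rejected = True
    print("Parser refused the document:", err)
```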
Try It Out - Adding Attributes to Al's CD
With all of the information we recorded about our CD in our
earlier Try It Out, we forgot to include the CD's serial number, or
the length of the disc. Let's add some attributes, so that our
hypothetical CD Player application can easily find this information
out.
Open your cd.xml file created earlier, and resave it to your hard
drive as cd2.xml.
With our new-found attributes knowledge, add two attributes to the
<CD> element, like this:
<CD serial=B6B41B
    disc-length='36:55'>
  <artist>"Weird Al" Yankovic</artist>
  <title>Dare to be Stupid</title>
  <genre>parody</genre>
  <date-released>1990</date-released>
  <song>
    <title>Like A Surgeon</title>
    <length>
      <minutes>3</minutes>
      <seconds>33</seconds>
    </length>
    <parody>
      <title>Like A Virgin</title>
      <artist>Madonna</artist>
    </parody>
  </song>
  <song>
    <title>Dare to be Stupid</title>
    <length>
      <minutes>3</minutes>
      <seconds>25</seconds>
    </length>
    <parody></parody>
  </song>
</CD>
If you typed in exactly what's written above, IE5 won't display the
document when you open it; it will report an error instead.
Now edit the first attribute, like this:
<CD serial='B6B41B'
disc-length='36:55'>
Re-save the file, and view it in IE5. This time the browser should
display the XML correctly.
How It Works
Using attributes, we added some information about the CD's serial
number and length to our document:
<CD serial=B6B41B
disc-length='36:55'>
When the XML parser got to the "=" character after the serial
attribute, it expected an opening quotation mark, but instead it
found a B. This is an error, and it caused the parser to stop and
raise the error to the user.
So we changed our serial attribute declaration:
<CD serial='B6B41B'
disc-length='36:55'>
and this time the browser displayed our XML correctly.
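The same before-and-after check can be reproduced outside the browser. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module in place of IE5; the helper is_well_formed is a hypothetical name, and the CD element is cut down to its start-tag and end-tag to keep the example short.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The serial value is missing its quotes, as in our first attempt.
unquoted = "<CD serial=B6B41B disc-length='36:55'></CD>"

# The corrected version, with both values quoted.
quoted = "<CD serial='B6B41B' disc-length='36:55'></CD>"

print(is_well_formed(unquoted), is_well_formed(quoted))
```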
The information we added might be useful, for example, in the CD
Player application we considered earlier. We could write our CD
Player to use the serial number of a CD to load any settings
the user may have saved earlier (such as a custom play list).
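As a rough sketch of how such a CD Player might use the two attributes, here is some Python built on the standard xml.etree.ElementTree module. Everything in it is hypothetical: the SAVED_SETTINGS store, the load_cd and disc_length_seconds functions, and the idea that disc-length always holds minutes and seconds separated by a colon are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical store of settings saved earlier, keyed by CD serial number.
SAVED_SETTINGS = {
    "B6B41B": {"playlist": ["Dare to be Stupid", "Like A Surgeon"]},
}

def disc_length_seconds(text):
    """Convert a 'minutes:seconds' string (assumed format) to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)

def load_cd(xml_text):
    """Read the attributes of the CD element and look up any saved settings."""
    cd = ET.fromstring(xml_text)
    serial = cd.get("serial")
    return {
        "serial": serial,
        "length_seconds": disc_length_seconds(cd.get("disc-length")),
        "settings": SAVED_SETTINGS.get(serial, {}),
    }

info = load_cd(
    "<CD serial='B6B41B' disc-length='36:55'>"
    "<title>Dare to be Stupid</title></CD>"
)
print(info)
```

An application that doesn't care about serial numbers can simply ignore the attributes and read the child elements as before.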
Why Use Attributes?
There have been many debates in the XML community about whether
attributes are really necessary, and if so, where they should be
used. Here are some of the main points in that debate:
Attributes Can Provide Metadata that May Not be Relevant to Most
Applications Dealing with Our XML
For example, if we know that some applications may care
about a CD's serial number, but most won't, it may make sense to make
it an attribute. This logically separates the data most applications
will need from the data that most applications won't need.
In reality, there is no such thing as "pure metadata" - all
information is "data" to some application. Think about HTML;
you could break the information in HTML into two types of data: the
data to be shown to a human, and the data to be used by the web
browser to format the human-readable data. From one standpoint, the
data used to format the data would be metadata, but to the browser or
the person writing the HTML, the metadata is the data.
Therefore, attributes can make sense when we're separating one type
of information from another.
What Do Attributes Buy Me that Elements Don't?
Can't elements do anything attributes can
do?
In other words, on the face of it there's really no difference
between:
<name nickname='Shiny John'></name>
and:
<name>
  <nickname>Shiny John</nickname>
</name>
So why bother to pollute the language with two ways of doing the
same thing?
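From a program's point of view the two forms carry the same information, and only the call used to reach it differs. A minimal sketch, assuming Python's standard xml.etree.ElementTree module (our choice for illustration, not part of the original text):

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring("<name nickname='Shiny John'></name>")
as_element = ET.fromstring("<name><nickname>Shiny John</nickname></name>")

# The attribute is read from the element itself...
from_attribute = as_attribute.get("nickname")

# ...while the child element's text is found by searching below it.
from_element = as_element.findtext("nickname")

print(from_attribute == from_element)
```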
The main reason that XML was invented was that SGML could do
some great things, but it was far too difficult to use without
a fully-fledged SGML expert on hand. So one concept behind XML is a
simpler, kinder, gentler SGML. For this reason, many people don't
like attributes, because attributes add complexity to the language
that these people feel isn't needed.
On the other hand, some people find attributes easier to use - for
example, they don't require nesting and you don't have to worry about
crossed tags.
Why Use Elements, if Attributes Take Up So Much Less Space?
Wouldn't it save bandwidth to use attributes instead?
For example, if we were to rewrite our <name> document to
use only attributes, it might look like this:
<name nickname='Shiny John' first='John'
middle='Fitzgerald Johansen' last='Doe'></name>
This takes up much less space than our earlier code using
elements.
However, in systems where size is really an issue, it turns out
that simple compression techniques would work much better than trying
to optimize the XML. And because of the way compression works,
you end up with almost the same file sizes regardless of whether
attributes or elements are used.
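We can get a rough feel for this with a small experiment. The sketch below, in Python, builds the same made-up records in both styles and compresses each with the standard zlib module. The record contents, the record count, and the choice of zlib are all assumptions made for illustration; this is an informal measurement, not a benchmark, and real documents will give different numbers.

```python
import zlib

def element_form(i):
    """One record written with child elements (made-up data)."""
    return (f"<name><nickname>Nick{i}</nickname><first>First{i}</first>"
            f"<middle>Middle{i}</middle><last>Last{i}</last></name>")

def attribute_form(i):
    """The same record written with attributes only."""
    return (f"<name nickname='Nick{i}' first='First{i}' "
            f"middle='Middle{i}' last='Last{i}'></name>")

def sizes(make_record, count=1000):
    """Return (raw size, compressed size) in bytes for a document of records."""
    body = "".join(make_record(i) for i in range(count))
    raw = ("<names>" + body + "</names>").encode("utf-8")
    return len(raw), len(zlib.compress(raw))

element_raw, element_packed = sizes(element_form)
attribute_raw, attribute_packed = sizes(attribute_form)

# How many bytes the element style costs us, before and after compression.
raw_gap = element_raw - attribute_raw
packed_gap = element_packed - attribute_packed

print("raw sizes:       ", element_raw, attribute_raw, "gap", raw_gap)
print("compressed sizes:", element_packed, attribute_packed, "gap", packed_gap)
```

The repeated tag names are exactly the kind of redundancy a compressor removes, which is why the gap between the two styles shrinks once the documents are compressed.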
Besides, when you try to optimize XML this way, you lose many of
the benefits XML offers, such as readability and descriptive tag
names. And there are cases where using elements allows more
flexibility and scope for extension. For example, if we decided that
first needed additional metadata in the future, it would be much
simpler to modify our code if we'd used elements rather than
attributes.
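To make the extension argument concrete, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The source attribute added to the first element is a hypothetical piece of metadata invented for this example. Code written against the original document keeps working on the extended one, because the element was already there to hang the new information on.

```python
import xml.etree.ElementTree as ET

original = ET.fromstring(
    "<name><first>John</first><last>Doe</last></name>"
)

# Later we decide that <first> needs metadata of its own (hypothetical).
extended = ET.fromstring(
    "<name><first source='passport'>John</first><last>Doe</last></name>"
)

# Existing code that reads the first name is unaffected by the change.
first_before = original.findtext("first")
first_after = extended.findtext("first")

# New code can pick up the extra metadata where it is present.
new_metadata = extended.find("first").get("source")

print(first_before, first_after, new_metadata)
```

Had first been an attribute of name, there would be nowhere to attach the extra information without restructuring the document.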
Why Use Attributes when Elements Look So Much Better? I Mean, Why
Use Elements when Attributes Look So Much Better?
People have different opinions as to whether attributes or
child elements "look better". In this case, it comes down to a matter
of personal preference and style.
In fact, much of the attributes versus elements debate
comes from personal preference. Many, but not all, of the arguments
boil down to "I like one better than the other". But since XML
has both elements and attributes, and neither one is going to go
away, you're free to use both. Choose whichever works best for your
application, whichever looks better to you, or whichever you're most
comfortable with.
©1999 Wrox Press Limited,
US and UK.