Summary
This handbook covers the basic concepts of XML and is aimed at readers who are new to XML.
The XML Basics Handbook
Authored by: Itech Consulting
©www.itechconsulting.co.in
Contents:
Introduction
XML Rules
Elements and Attribute values
Root Element
Empty Element
Nesting Elements
Comments in XML
XML Browsers
Escape Characters
XML Namespaces
XML Data-Binding
Introduction:
This handbook covers the basics of XML. To try the examples, type the source
into a text editor such as Notepad, Altova's XML Spy or EditPlus, save the
file with the ".xml" extension and open it in a browser such as IE6. The
handbook focuses on basic concepts and is aimed at readers who are new to
XML, although some portions are better suited to intermediate XML
programmers. If you find that this document needs any corrections, please
let us know and we will incorporate the necessary changes.
Please send all feedback to training@itechconsulting.co.in
1) An XML Document:
Sample XML document:
|
Introduction to RSS
RSS Syntax
RSS Channel
RSS Item
Comments in RSS
Learn how to put comments in an RSS document
Optional Elements in RSS Channel and Item
Validate your RSS
|
Example 1
The XML document above is simple and self-explanatory. It lists the topics
covered under RSS.
The first line of the document is the XML declaration, which defines the XML
version and the character encoding in use. It states that the document
conforms to version 1.0 of the XML specification and uses the ISO-8859-1
(Latin-1/West European) character set.
The next line opens the root element of the document. Here, "rss" is the
root element, and it contains the child elements (topic1-topic7) that
describe the topics covered under RSS.
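The structure described above can be checked with a parser. The following is a minimal sketch in Python using the standard-library module xml.etree.ElementTree. The XML text is a reconstruction of Example 1 from its description; the attribute names and values ("version", "level") are assumptions made only for illustration, not the original ones.

```python
import xml.etree.ElementTree as ET

# Reconstruction of Example 1. Element names follow the text; the
# attributes (version="2.0", level="basic") are hypothetical.
EXAMPLE_1 = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
  <topic1>Introduction to RSS</topic1>
  <topic2>RSS Syntax</topic2>
  <topic3>RSS Channel</topic3>
  <topic4>RSS Item</topic4>
  <topic5>
    <name>Comments in RSS</name>
    <description level="basic">Learn how to put comments in an RSS document</description>
  </topic5>
  <topic6>Optional Elements in RSS Channel and Item</topic6>
  <topic7>Validate your RSS</topic7>
</rss>"""

# Parse the document; the XML declaration tells the parser the encoding.
root = ET.fromstring(EXAMPLE_1)

# The root element and the names of its direct children.
print("root element:", root.tag)
print("child elements:", [child.tag for child in root])

# A sub-child element is reached through its parent.
print("topic5 description:", root.find("topic5/description").text)
```

The parser only accepts the text if it is well-formed, so a successful parse already confirms that the declaration, the root element and the child elements are written correctly.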
2) Rules:
XML documents must follow certain syntax rules. Taking example 1 as a reference:
1. All XML documents must have a root element. In example 1, "rss" is the root element.
2. All XML elements must have a closing tag. In example 1, the root (rss) and
all the child elements (topic1-topic7) have both a start tag and an end tag.
Omitting the end tag results in an error.
3. XML tags are case-sensitive, i.e. the start tag and the end tag must be written
in the same case. As a best practice, write all the tags in an XML document
in the same case so that the document looks consistent.
4. XML elements must be properly nested. Refer to the child element "topic5"
and its sub-child elements "name" and "description".
5. Attribute values must always be in quotation marks. Refer to the attributes
in the root element and in the sub-child element "description" under the
child element "topic5". Violating this rule results in an error.
6. As in HTML, comments in an XML document are written between <!-- and -->.
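Several of these rules can be observed directly by feeding small documents to a parser. The sketch below uses Python's standard-library xml.etree.ElementTree; the helper name is_well_formed and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule.
samples = {
    "well-formed document": '<rss version="2.0"><topic1>RSS Syntax</topic1></rss>',
    "rule 2, missing closing tag": '<rss><topic1>RSS Syntax</rss>',
    "rule 3, start and end tag differ in case": '<rss><Topic1>RSS Syntax</topic1></rss>',
    "rule 5, attribute value without quotes": '<rss version=2.0></rss>',
}

for label, text in samples.items():
    print(label, "->", "accepted" if is_well_formed(text) else "error")
```

Only the first sample is accepted; each of the others breaks exactly one rule and is rejected by the parser.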
3) Elements:
An XML document consists of elements. The elements describe the data that the
document contains. Elements can contain other elements and can carry attributes.
The root element is mandatory; it can contain child elements, which can in turn
contain sub-child elements, and so on.
An element name cannot start with a digit, a hyphen or a period, and names
beginning with the letters "xml" (in any case) are reserved. A name may start
with a letter or an underscore. An element name cannot contain spaces. If you
need an element name that combines two words, either put an underscore between
the two words or capitalize the first letter of the second word, as in the
element "myForumId" in example 2.
|
todd
USA
Optional
10
Dec
1970
|
Example 2
Elements in XML can have different content types. The content of an element
can be element, simple, mixed or empty.
In example 2,
· Element "myForumId" has element content because it contains other elements.
· Element "address" has simple content because it contains only plain text.
· Element "forumRecog" has no content, only attributes. It is called an empty
element or empty tag. An empty element does not need a separate end tag;
instead, its start tag can end with a space and a forward slash (/>).
· Element "birthday" has mixed content because it contains text as well
as other elements.
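The four content types can be told apart programmatically. The sketch below, in Python with the standard-library xml.etree.ElementTree, uses a reconstruction of Example 2. Only the element names myForumId, name, address, forumRecog and birthday and the attributes age and gender come from the text; the attribute values, the "level" attribute and the day/month/year element names are assumptions.

```python
import xml.etree.ElementTree as ET

# Reconstruction of Example 2; attribute values and the names
# level, day, month and year are hypothetical.
EXAMPLE_2 = """<myForumId>
  <name age="36" gender="male">todd</name>
  <address>USA</address>
  <forumRecog level="beginner" />
  <birthday>Optional <day>10</day><month>Dec</month><year>1970</year></birthday>
</myForumId>"""


def content_type(element):
    """Classify an element as 'element', 'simple', 'mixed' or 'empty'."""
    has_children = len(element) > 0
    # Text can sit before the first child (text) or after any child (tail).
    has_text = bool((element.text or "").strip()) or any(
        (child.tail or "").strip() for child in element
    )
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"


root = ET.fromstring(EXAMPLE_2)
for element in [root] + list(root):
    print(element.tag, "->", content_type(element))
```

Whitespace used only for indentation is ignored by the helper, which is why "myForumId" counts as element content rather than mixed content.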
4) Attributes:
Attributes are simple name/value pairs associated with an element. They provide
additional information about the element. Attributes are placed in the start
tag of the element.
Every attribute must have a value, and the value must be enclosed in either
single or double quotes. If the value itself contains double quotes, enclose
the attribute value in single quotes; if the value contains single quotes,
enclose it in double quotes.
Example 3
Example 3A
The naming rules for an attribute are the same as those for an element. Also,
an element cannot have two attributes with the same name.
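The quoting and uniqueness rules for attributes can be illustrated as follows. This is a minimal sketch in Python using the standard-library xml.etree.ElementTree; the element name "note", the attribute name "text" and the attribute values are made up for illustration and are not the lost Examples 3 and 3A.

```python
import xml.etree.ElementTree as ET

# A value containing double quotes is enclosed in single quotes ...
double_inside = ET.fromstring("<note text='He said \"hello\" to me'/>")
# ... and a value containing a single quote is enclosed in double quotes.
single_inside = ET.fromstring('<note text="It\'s a markup language"/>')

print(double_inside.get("text"))
print(single_inside.get("text"))

# Two attributes with the same name in one element are rejected.
try:
    ET.fromstring('<name age="27" age="28">todd</name>')
    duplicate_accepted = True
except ET.ParseError as err:
    duplicate_accepted = False
    print("duplicate attribute rejected:", err)
```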
Take a look at example 2 again,
|
todd
USA
Optional
10
Dec
1970
|
Here the "name" element has two attributes, age and gender, which represent
the user's personal details.
Guidelines on using attributes:
1. Use attributes if your data is simple, default or fixed.
2. If the size of your XML file becomes a concern, you can convert some of the
child elements to attributes.
3. Otherwise, prefer child elements to attributes. Child elements make it
easier to extend and manage your data.
5) Root Element:
Every XML document has exactly one root element. A document with more than one
root element is not well-formed and results in an error. Likewise, a document
with no root element generates an error. The following examples illustrate
these rules.
Example 4: A well-formed XML document with the single element "hello".
Example 4A: A well-formed XML document containing one root element, "employee",
and its child elements.
Example 4B: An erroneous XML document containing no root element.
Example 4C: An erroneous XML document containing more than one root element.
Hi,
this is xml
I am a markup language
I am used for data exchange
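The root element rules can be tried out with a parser. The sketch below uses Python's standard-library xml.etree.ElementTree. The element "hello" comes from the description of Example 4; the names "first" and "second" and the helper check are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def check(xml_text):
    """Return 'ok' if xml_text is well-formed, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
        return "ok"
    except ET.ParseError as err:
        return str(err)


# Exactly one root element: accepted.
one_root = "<hello>Hi, this is xml</hello>"
# Plain text with no root element: rejected.
no_root = "Hi, this is xml"
# Two top-level elements (hypothetical names): rejected.
two_roots = "<first>I am a markup language</first><second>I am used for data exchange</second>"

for label, text in [("one root", one_root), ("no root", no_root), ("two roots", two_roots)]:
    print(label, "->", check(text))
```

A document with several top-level items can be repaired by wrapping them in a single enclosing element, which then becomes the root.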
6) Empty Elements:
An element that has no content is called an empty element or empty tag. An
empty element can still carry attributes, which typically hold some kind of
fixed-value data. An XML document can contain zero or more empty elements.
Example 5
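As a rough illustration of what an empty element looks like, the sketch below parses the two equivalent ways of writing one, using Python's standard-library xml.etree.ElementTree. The element name "forumRecog" is borrowed from example 2; the attribute "level" and its value are assumptions.

```python
import xml.etree.ElementTree as ET

# The same empty element written in its short and long forms.
short_form = ET.fromstring('<forumRecog level="beginner" />')
long_form = ET.fromstring('<forumRecog level="beginner"></forumRecog>')

for element in (short_form, long_form):
    # An empty element has no text and no children, but keeps its attributes.
    print(element.tag, element.text, len(element), element.attrib)

# Both forms serialize to the same bytes, so a parser treats them alike.
same = ET.tostring(short_form) == ET.tostring(long_form)
print("equivalent:", same)
```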
7) Nesting Elements:
Elements in an XML document must be properly nested. If the start tag of a
child element is inside the content of its parent element, then the end tag of
the child must also be inside the content of that parent. Improperly nested
elements result in an error.
An XML document can contain many elements representing blocks of data. The
elements must be properly nested to avoid errors, and indenting them
consistently helps the reader understand the structure of the document.
Example 6: A well-formed document with elements nested properly
Example 6A: Elements not properly nested, which generates an error
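The difference between proper and improper nesting might look like the following. This is a sketch in Python with the standard-library xml.etree.ElementTree; the element names employee, name and first and the helper parse_error are made up and are not the original Examples 6 and 6A.

```python
import xml.etree.ElementTree as ET

# Each child is closed before its parent is closed.
WELL_NESTED = """<employee>
  <name>
    <first>todd</first>
  </name>
</employee>"""

# "name" is closed while its child "first" is still open.
BADLY_NESTED = """<employee>
  <name>
    <first>todd</name>
  </first>
</employee>"""


def parse_error(xml_text):
    """Return None if xml_text parses, else the (line, column) where parsing failed."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as err:
        return err.position


print("well nested:", parse_error(WELL_NESTED))
print("badly nested:", parse_error(BADLY_NESTED))
```

The reported position points at the end tag that does not match the most recently opened element, which is usually the quickest way to locate a nesting mistake.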
8) Comments in XML:
Comments can be placed almost anywhere in an XML document: before the start
tag of the root element, inside the root element, inside child elements,
sub-child elements and so on. A comment cannot appear inside a tag, and it
cannot appear before the XML declaration. Take care not to use "--" (double
hyphen) inside the content of a comment, as this results in an error.
Example 7
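A short sketch of valid and invalid comments, using Python's standard-library xml.etree.ElementTree. The documents and the helper is_well_formed are hypothetical and do not reproduce the original Example 7.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Comments before the root element and inside elements are allowed.
GOOD = """<!-- list of RSS topics -->
<rss>
  <!-- first topic -->
  <topic1>Introduction to RSS<!-- inside a child element --></topic1>
</rss>"""

# A double hyphen inside the comment content is not allowed.
BAD = """<rss>
  <!-- first topic -- introduction -->
  <topic1>Introduction to RSS</topic1>
</rss>"""

print("valid comments accepted:", is_well_formed(GOOD))
print("double hyphen accepted:", is_well_formed(BAD))
```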
9) XML Browsers:
Besides supporting XML itself, an XML browser must support a style language
such as CSS or XSL and a scripting language such as VBScript or JavaScript.
A few Internet browsers provide this level of XML support.
Some of them are listed below:
1. Internet Explorer 6: Among Internet browsers, Internet Explorer 6 (IE6)
offers the most powerful XML support. IE6 can display XML documents, supports
scripting languages such as JavaScript and VBScript, and provides full support
for style sheet languages.
2. Netscape Navigator 6: Netscape also supports XML to a good extent and has
good style sheet support.
3. Mozilla: supports XML, XSLT and CSS.
4. Firefox: has full XML support along with XSLT and CSS.
5. Opera: supports XML and CSS on MS Windows and Linux.
10) Escape Characters:
The elements and data in an XML file are parsed by an XML parser and then
displayed in the browser. However, certain characters are illegal in element
content; the parser does not accept them and generates an error.
The characters "<" and "&" are illegal when they appear as plain text in XML.
See the following example:
|
check for conditions one & two
age < 20
age < 30
|
In the above example, the parser generates an error for the content of the
first element because of the illegal character "&", and for the two elements
that follow because of the illegal character "<". Replacing "&" with "&amp;"
and "<" with "&lt;" eliminates the error.
Content that would need too many escape characters can instead be written
within a CDATA section, which starts with <![CDATA[ and ends with ]]>.
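Both approaches, escaping and CDATA, can be sketched in Python with the standard library (xml.sax.saxutils.escape and xml.etree.ElementTree). The element names rules, note and limit are assumptions, since the original tag names are not available.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_note = "check for conditions one & two"
raw_limit = "age < 30"

# Unescaped text makes the document ill-formed.
try:
    ET.fromstring("<rules><note>" + raw_note + "</note></rules>")
    unescaped_accepted = True
except ET.ParseError:
    unescaped_accepted = False

# escape() replaces & with &amp; and < with &lt; (and > with &gt;).
escaped_doc = "<rules><note>{}</note><limit>{}</limit></rules>".format(
    escape(raw_note), escape(raw_limit)
)
print(escaped_doc)

# After parsing, the original characters are restored.
root = ET.fromstring(escaped_doc)
print(root.find("note").text)
print(root.find("limit").text)

# A CDATA section lets the same characters appear without escaping.
cdata_doc = "<rules><note><![CDATA[age < 20 & age < 30]]></note></rules>"
cdata_text = ET.fromstring(cdata_doc).find("note").text
print(cdata_text)
```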
11) XML Namespaces:
Namespaces in XML are used to group related information and to avoid element
name conflicts. They provide a mechanism for naming elements and attributes
uniquely, so that different vocabularies can be mixed in one XML document
without name conflicts.
Syntax:
xmlns:namespace-prefix="namespaceURI"
Where,
xmlns is a special attribute placed in the start tag of an element.
namespace-prefix is a short name that stands for the namespace URI.
namespaceURI is a Uniform Resource Identifier that identifies the namespace.
Namespace URIs are required to be unique. Elements that belong to the same
namespace must use a prefix bound to that namespace. A namespace declared in
the start tag of a parent element applies to that element and its descendants,
and all the child elements with the same prefix are associated with the same
namespace.
In the above example,
element names with a prefix are qualified names.
Default Namespace:
A default namespace is declared with the xmlns attribute without a prefix
(xmlns="namespaceURI"). In the above example, the element names are not
prefixed; hence they fall under the default namespace that is declared. Here,
element names without a prefix are called local names.
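The effect of a prefixed and a default namespace declaration can be seen by parsing a small document. This sketch uses Python's standard-library xml.etree.ElementTree; the prefix "f", the element names and the example.com URIs are placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

# A namespace bound to the prefix "f" (hypothetical URI).
PREFIXED = """<f:forum xmlns:f="http://example.com/forum">
  <f:name>todd</f:name>
</f:forum>"""

# The same kind of document using a default namespace instead.
DEFAULT = """<forum xmlns="http://example.com/forum">
  <name>todd</name>
</forum>"""

prefixed_root = ET.fromstring(PREFIXED)
default_root = ET.fromstring(DEFAULT)

# ElementTree expands every name to the form {namespaceURI}localname.
print(prefixed_root.tag)
print(default_root.tag)

# Elements are looked up through a prefix-to-URI mapping.
ns = {"f": "http://example.com/forum"}
print(prefixed_root.find("f:name", ns).text)
print(default_root.find("f:name", ns).text)
```

The prefix itself carries no meaning: both documents resolve to the same expanded names because they point at the same namespace URI.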
Scenario:
Suppose your XML document needs to refer to an element with the same name that
is defined in two separate DTDs with different meanings. Namespaces let you
identify which meaning you are referring to for that element at any point.
For example, using namespaces in your XML document helps you tell apart which
"state" element you are referring to at all times.
|
three.dtd:
%onedtd;
%twodtd;
xml document:
Maharashtra
gas
|
In the above example, the two "state" elements are ambiguous. To resolve this
conflict, use namespaces as below:
|
one.dtd:
two.dtd:
three.dtd:
%onedtd;
%twodtd;
xml document:
Maharashtra
gas
|
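The "state" conflict can be sketched as follows in Python with the standard-library xml.etree.ElementTree. The DTD declarations are left out; the prefixes "one" and "two", the root element "record" and the example.com URIs are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Two "state" elements with different meanings, kept apart by namespaces.
DOC = """<record xmlns:one="http://example.com/one"
        xmlns:two="http://example.com/two">
  <one:state>Maharashtra</one:state>
  <two:state>gas</two:state>
</record>"""

root = ET.fromstring(DOC)
ns = {"one": "http://example.com/one", "two": "http://example.com/two"}

# Each lookup names the vocabulary it wants, so there is no ambiguity.
region = root.find("one:state", ns).text
matter = root.find("two:state", ns).text
print("state as a region:", region)
print("state of matter:", matter)

# The expanded names differ even though the local name is the same.
print([child.tag for child in root])
```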
12) XML Data-binding:
XML data binding is used to draw XML data from an XML data island and present
it in the browser. It is the process of binding a web page to an XML data
island (XML data). The data island can be internal or external. An internal
data island is part of the HTML web page. An external data island separates
the data from its presentation and thereby eases maintenance. This process
does not involve any scripting and differs from the transformation process.
To create a data island in an HTML document, the HTML element "xml" is used
to embed XML in HTML. This is achieved as follows:
I) Implementing External Data Island:
1. External XML document: contacts.xml, a sample XML document.
todd
27
male
<_address>texas
7777777777
|
2. Create an external data island in the HTML page using the "xml" HTML tag:
The xml tag in HTML creates an XML data island in the page. The "id" attribute
names the data island ("contacts" in our example). The "src" attribute
specifies the location of the XML document file.
3. Embed the XML into HTML. HTML file: contacts.html
The HTML page demonstrates XML data binding. Because contacts.xml and
contacts.html are assumed to reside in the same directory/folder, the "src"
attribute of the xml tag does not include a path to the .xml file. The XML
data is presented in an HTML table using the xml tag, the datasrc (data
source) attribute pointing to the data island, and the dataFld attribute
specifying the element name in the XML document.
| Name | Age | Sex | Address | Contact No. |
II) Implementing Internal Data Island:
|
todd
27
male
<_address>texas
7777777777
| Name | Age | Sex | Address | Contact No. |
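Data islands are a browser feature, so they cannot be reproduced outside the browser. As a rough stand-in (a different technique, not the data island mechanism itself), the sketch below shows the same idea of binding XML fields to table columns, using Python's standard library. Only the element name "_address" and the column titles come from this handbook; the other element names (contacts, contact, name, age, sex, contactNo) and the function render_table are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Stand-in for contacts.xml; element names other than _address are hypothetical.
CONTACTS_XML = """<contacts>
  <contact>
    <name>todd</name>
    <age>27</age>
    <sex>male</sex>
    <_address>texas</_address>
    <contactNo>7777777777</contactNo>
  </contact>
</contacts>"""

# Column title -> element name, playing the role of the dataFld attribute.
COLUMNS = [
    ("Name", "name"),
    ("Age", "age"),
    ("Sex", "sex"),
    ("Address", "_address"),
    ("Contact No.", "contactNo"),
]


def render_table(xml_text):
    """Build an HTML table with one row per contact element."""
    root = ET.fromstring(xml_text)
    header = "".join("<th>{}</th>".format(escape(title)) for title, _ in COLUMNS)
    rows = []
    for contact in root.findall("contact"):
        cells = "".join(
            "<td>{}</td>".format(escape(contact.findtext(field, default="")))
            for _, field in COLUMNS
        )
        rows.append("<tr>{}</tr>".format(cells))
    return "<table><tr>{}</tr>{}</table>".format(header, "".join(rows))


html_table = render_table(CONTACTS_XML)
print(html_table)
```

As with the data island, the data stays in its own XML document and the table layout is defined separately, so either can change without touching the other.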