By itech
First Posted 01/23/2006

XML Basics Handbook


Summary: This handbook covers the basic concepts of XML and is aimed at readers who are new to XML.

The XML Basics Handbook Authored by: Itech Consulting ©www.itechconsulting.co.in

Contents:

Introduction
XML Rules
Elements and Attribute values
Root Element
Empty Element
Nesting Elements
Comments in XML
XML Browsers
Escape Characters
XML Namespaces
XML Data-Binding



Introduction: This handbook covers the basics of XML. The examples can be written in Notepad, Altova's XML Spy or EditPlus, saved with the ".xml" extension and viewed in a browser such as IE6. The handbook focuses on basic concepts and targets readers new to XML, although some portions are aimed at intermediate XML programmers. If you find that this document requires any modification, please let us know and we will incorporate the necessary changes. Please send all feedback to training@itechconsulting.co.in



 

1) An XML Document:

Sample xml document:

 Introduction to RSS  RSS Syntax  RSS Channel  RSS Item        Comments in RSS           Learn how to put comments in an RSS document        Optional Elements in RSS Channel and Item  Validate your RSS

Example 1

The above XML document is simple and self-explanatory. It lists the topics covered under RSS.

The first line of the document is the XML declaration, which defines the XML version and the character encoding in use. It states that the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/Western European) character set. The next line is the root element of the document. Here, “rss” is the root element, followed by the child elements (topic1-topic7) that describe the topics covered under RSS.
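As a rough illustration of the structure just described, the sketch below rebuilds the sample as a Python string and parses it with the standard library. The element names and text come from the description; the attribute names and values (`version`, `level`) are assumptions made for this sketch, since the handbook only says that the root element and the “description” element carry attributes.

```python
import xml.etree.ElementTree as ET

# Reconstruction of the structure described in the text. The attribute
# names and values (version, level) are hypothetical placeholders.
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
  <topic1>Introduction to RSS</topic1>
  <topic2>RSS Syntax</topic2>
  <topic3>RSS Channel</topic3>
  <topic4>RSS Item</topic4>
  <topic5>
    <name>Comments in RSS</name>
    <description level="basic">Learn how to put comments in an RSS document</description>
  </topic5>
  <topic6>Optional Elements in RSS Channel and Item</topic6>
  <topic7>Validate your RSS</topic7>
</rss>
"""

# Passing bytes lets the parser honor the encoding named in the declaration.
root = ET.fromstring(SAMPLE)
child_names = [child.tag for child in root]
print(root.tag, child_names)
```

The parser exposes the root element first; its children are then reachable by iterating over it.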



2) Rules:

XML documents must follow certain syntax rules. Taking example 1 as reference:

1. All XML documents must have a root element. In example 1, “rss” is the root element.
2. All XML elements must have a closing tag. In example 1, the root (rss) and all the child elements (topic1-topic7) have a start tag and an end tag. Omitting the end tag results in an error.
3. XML tags are case-sensitive, i.e. the start tag and the end tag must be written in the same case. A good practice is to write all the tags in a document in the same case for consistency.
4. XML elements must be properly nested. Refer to child element “topic5” and its sub-child elements “name” and “description”.
5. Attribute values must always be in quotation marks. Refer to the attributes in the root element and in the sub-child element “description” under child element “topic5”. Violating this rule results in an error.
6. As in HTML, comments in an XML document are written within <!-- and -->.
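One way to see these rules in action is to hand small documents to a parser and note which ones it rejects. The following is a minimal sketch using Python's standard library; the helper name `is_well_formed` and the tiny test documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One well-formed document, then one violation per rule.
checks = {
    "ok":             '<rss version="2.0"><topic1>RSS</topic1></rss>',
    "no closing tag": '<rss><topic1>RSS</rss>',
    "case mismatch":  '<rss><Topic1>RSS</topic1></rss>',
    "bad nesting":    '<rss><name><description></name></description></rss>',
    "unquoted value": '<rss version=2.0></rss>',
}

results = {label: is_well_formed(doc) for label, doc in checks.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'error'}")
```

Only the first document should be accepted; each of the others breaks exactly one of the rules listed above.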



3) Elements:

An XML document consists of elements, which describe the data the document contains. Elements can contain other elements and carry attributes. The root element is mandatory; it is followed by child elements, which may in turn contain sub-child elements, and so on.

An element name may start with a letter or an underscore, but not with a digit, a hyphen or a period, and names beginning with the letters ‘xml’ (in any case) are reserved. An element name cannot contain spaces. To build a name from two words, either join them with an underscore or capitalize the first letter of the second word, as in the element ‘myForumId’ in example 2.

 todd  

USA
 

   Optional    10    Dec    1970  

Example 2

Elements in XML can have different content types: element, simple, mixed or empty.

In example 2:

· Element ‘myForumId’ has element content because it contains other elements.
· Element ‘address’ has simple content because it contains only plain text.
· Element ‘forumRecog’ has empty content: it carries attributes but no text or child elements. Such an element is also called an empty tag. It does not need a separate end tag; instead, its start tag can end with a forward slash, optionally preceded by a space.
· Element ‘birthday’ has mixed content because it contains text as well as other elements.
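The four content types can be told apart programmatically. The sketch below is one possible classifier in Python; the function `content_type` is hypothetical, and the document it runs on only echoes the element names of example 2 (the `level` attribute and the exact nesting are assumptions).

```python
import xml.etree.ElementTree as ET

def content_type(elem):
    """Classify an element as 'element', 'simple', 'mixed' or 'empty' content."""
    has_children = len(elem) > 0
    # Text can sit before the first child (elem.text) or after any child (child.tail).
    chunks = [elem.text] + [child.tail for child in elem]
    has_text = any(chunk and chunk.strip() for chunk in chunks)
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"

# Hypothetical document: names echo example 2, attribute and nesting are assumed.
doc = ET.fromstring(
    '<myForumId>'
    '<address>USA</address>'
    '<forumRecog level="gold" />'
    '<birthday>Optional <day>10</day><month>Dec</month><year>1970</year></birthday>'
    '</myForumId>'
)

kinds = {elem.tag: content_type(elem) for elem in doc.iter()}
print(kinds)
```

Whitespace-only text is ignored here, so that indentation between child elements does not turn element content into mixed content.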



4) Attributes:

Attributes are simple name/value pairs associated with an element. They provide additional information about the element and are attached to its start tag.

Attributes must have values, and each value must be enclosed in either single or double quotes. If the value itself contains double quotes, enclose it in single quotes; if it contains single quotes, enclose it in double quotes.

Example 3

Example 3A

The naming rules for attributes are similar to those for elements. In addition, an element cannot have two attributes with the same name. Take another look at example 2:

 todd  

USA
 

   Optional    10    Dec    1970  

Here the “name” element has two attributes, age and gender, which represent the user's personal details.
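The attribute rules above can be tried out with a short Python sketch. Only the attribute names age and gender come from the text; the values, and the `book` element with its `title` and `note` attributes, are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical values; only the attribute names age and gender come from the text.
name = ET.fromstring('<name age="27" gender="male">todd</name>')
print(name.attrib)

# A value containing double quotes is wrapped in single quotes, and vice versa.
# The element and attribute names here are made up for illustration.
quoted = ET.fromstring("""<book title='The "XML" Handbook' note="author's copy" />""")
print(quoted.get("title"), "|", quoted.get("note"))

# Repeating an attribute name within one start tag is rejected by the parser.
try:
    ET.fromstring('<name age="27" age="28">todd</name>')
    duplicate_ok = True
except ET.ParseError:
    duplicate_ok = False
print("duplicate attribute accepted:", duplicate_ok)
```

Note that the parser returns attribute values without their surrounding quotes; the quote style is only a matter of how the document is written.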

Guidelines on using attributes:

1. Use attributes if your data is simple, default or fixed.
2. If the size of your XML file becomes a concern, you can convert some child elements to attributes.
3. Otherwise, prefer child elements to attributes. Child elements make it easier to expand and manage your data.



5) Root Element:

Every XML document has exactly one root element. A well-formed document cannot contain more than one root element, and a document with no root element is likewise an error. The following examples illustrate these rules.

Example 4 – Represents a well-formed xml document with element ‘hello’.

<hello>Hi, this is xml</hello>

Example 4A – Represents a well-formed xml document containing one root element – ‘employee’ and its child elements.

 001  todd  55555

Example 4B – Erroneous xml document containing no root element.

Hi, this is xml

Example 4C – Erroneous XML document containing more than one root element

Hi, this is xml I am a markup language I am used for data exchange
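The three cases can be checked against a parser. This is a minimal Python sketch; the helper `parse_error` and the second element name `about` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def parse_error(text):
    """Return None if text is well-formed, else the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as exc:
        return str(exc)

one_root = "<hello>Hi, this is xml</hello>"
no_root = "Hi, this is xml"
# The element name 'about' is hypothetical.
two_roots = "<hello>Hi, this is xml</hello><about>I am a markup language</about>"

for label, text in [("one root", one_root), ("no root", no_root), ("two roots", two_roots)]:
    print(label, "->", parse_error(text) or "well-formed")
```

Only the document with a single root element parses; the other two are rejected before any data can be read from them.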



6) Empty Elements:

An element that has no content is called an empty element or empty tag. An empty element can still carry attributes, typically to provide some fixed-value data. An XML document can contain zero or more empty elements.
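A small Python sketch can show that the short form of an empty element and the start-tag/end-tag form are read the same way. The element name `logo` and its `src` attribute are hypothetical.

```python
import xml.etree.ElementTree as ET

# The element and attribute names are hypothetical.
short_form = ET.fromstring('<logo src="logo.gif" />')
long_form = ET.fromstring('<logo src="logo.gif"></logo>')

for elem in (short_form, long_form):
    # No text and no children, but the attribute is still available.
    print(elem.tag, elem.text, len(elem), elem.attrib)
```

Both forms yield an element with no text and no children, so the choice between them is one of style.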

Example 5

           


7) Nesting Elements:

Elements in an XML document must be properly nested: if a child's start tag is inside the parent's content, the child's end tag must also be inside the parent's content. Improperly nested elements result in an error. A document can contain many elements representing blocks of data; nesting them correctly avoids errors, and indenting them consistently makes the document structure easier to read.

Example 6 - A well-formed document with elements nested properly

     xml        xsl    xsd        rss  

Example 6A – Elements not properly nested, generates error

       xsl        xsd  
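To make the nesting rule concrete, the sketch below parses one properly nested document and one improperly nested document in Python. The element names are hypothetical; only the text values (xml, xsl, xsd, rss) echo example 6.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; the text values echo example 6.
good = """<languages>
  <markup>xml</markup>
  <styles>
    <item>xsl</item>
    <item>xsd</item>
  </styles>
  <feeds>rss</feeds>
</languages>"""

# Here <item> is opened inside <styles> but closed after it.
bad = """<languages>
  <styles>
    <item>xsl
  </styles>
    </item>
</languages>"""

def outline(elem, depth=0):
    """Return one indented line per element, mirroring the nesting."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(ET.fromstring(good))))

try:
    ET.fromstring(bad)
    bad_position = None
except ET.ParseError as exc:
    bad_position = exc.position   # (line, column) where the parser gave up
print("nesting error at", bad_position)
```

The outline printed for the first document mirrors its indentation, which is why consistent indentation makes nesting mistakes easier to spot by eye.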

 



8) Comments in XML:

Comments can be placed almost anywhere in an XML document: before the start tag of the root element, inside the root element, inside child elements, sub-child elements and so on. They cannot appear inside a tag. Take care not to include “--” (a double hyphen) within the comment text, as this results in an error.

Example 7

          

   

=>not valid
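The comment rules can be checked with a short Python sketch. The helper `well_formed` and the sample documents are assumptions made for illustration; they are not the handbook's original example 7.

```python
import xml.etree.ElementTree as ET

def well_formed(text):
    """Return True if the parser accepts text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

valid = """<!-- before the root element -->
<rss>
  <!-- inside the root element -->
  <topic1>Introduction <!-- inside a child element --></topic1>
</rss>"""

# A double hyphen inside the comment text is not allowed.
invalid = "<rss><!-- a double hyphen -- inside a comment --></rss>"

print(well_formed(valid), well_formed(invalid))
```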



9) XML Browsers:

An XML browser must support style sheet languages such as CSS or XSL in addition to XML itself, and must also support a scripting language such as VBScript or JavaScript. Few Internet browsers provide full XML support.

Some of them are listed below:

1. Internet Explorer 6: The most powerful Internet browser supporting XML is Internet Explorer 6 (IE6). IE6 can display XML documents, supports scripting languages such as JavaScript and VBScript, and provides full support for style sheet languages.

2. Netscape Navigator 6: Netscape also supports XML to a good extent and has good style sheet support.

3. Mozilla: Supports XML, XSLT and CSS.

4. Firefox: Has full XML support along with XSLT and CSS.

5. Opera: Supports XML and CSS on MS Windows and Linux.

10) Escape Characters:

The elements and data in an XML file are parsed by an XML parser and displayed in the browser. However, certain characters are illegal in element content: the parser does not accept them and reports an error. The characters “<” and “&” are treated this way. See the following example:

 check for conditions one & two  age < 20  age < 30

In the above example, the first element causes an error because its content contains the illegal character “&”, and the other two elements cause errors because of the illegal character “<”. Replacing “&” with “&amp;” and “<” with “&lt;” eliminates the errors.

Text containing many such characters can instead be written within a CDATA section, which starts with “<![CDATA[” and ends with “]]>”.
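The sketch below, in Python, shows the three situations side by side: raw special characters, escaped characters, and a CDATA section. The element name `condition` is hypothetical; `escape` is the helper from the standard `xml.sax.saxutils` module.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def well_formed(text):
    """Return True if the parser accepts text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

raw_text = "age < 20 & age < 30"

# The element name 'condition' is made up for this sketch.
broken = f"<condition>{raw_text}</condition>"             # raw & and < in content
fixed = f"<condition>{escape(raw_text)}</condition>"      # & becomes &amp;, < becomes &lt;
cdata = f"<condition><![CDATA[{raw_text}]]></condition>"  # no escaping needed

print(well_formed(broken))
print(fixed)
print(ET.fromstring(fixed).text)
print(ET.fromstring(cdata).text)
```

After parsing, both the escaped form and the CDATA form give back the original text, so the choice between them mainly affects how readable the source document is.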



11) XML Namespaces:

Namespaces in XML are used to group related information and to avoid element name conflicts. They provide a mechanism for naming elements and attributes uniquely, so that different vocabularies can be mixed in one XML document without name conflicts.

Syntax: xmlns:namespace-prefix="namespaceURI"

Where xmlns is a special attribute placed in the start tag of an element, namespace-prefix is the short name that stands for the namespace URI, and namespaceURI is a Uniform Resource Identifier (URI) that identifies the namespace.

Namespace URIs must be unique. Elements belonging to the same namespace are written with the same prefix. A namespace declared in the start tag of a parent element also applies to all child elements that use the same prefix.

 todd  27  texas  7777777777

In the above example, element names with prefix are qualified names.

Default Namespace:

 todd  
texas
 27

In the above example, the element names are not prefixed; hence they come under the default namespace mentioned. Here, element names without prefix are called local names.
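The difference between prefixed names and a default namespace can be seen by parsing both forms in Python. In this sketch the prefix `c`, the element names and the namespace URI are all hypothetical.

```python
import xml.etree.ElementTree as ET

# The prefix, element names and namespace URI are hypothetical.
NS = "http://example.com/contacts"

prefixed = ET.fromstring(
    f'<c:contact xmlns:c="{NS}"><c:name>todd</c:name><c:age>27</c:age></c:contact>'
)
defaulted = ET.fromstring(
    f'<contact xmlns="{NS}"><name>todd</name><age>27</age></contact>'
)

# ElementTree expands every name to {namespaceURI}localname, so both
# spellings produce the same element names.
print([elem.tag for elem in prefixed.iter()])
print([elem.tag for elem in defaulted.iter()])

# Lookups can use a prefix map; the prefix chosen here need not match the document.
print(defaulted.find("x:name", {"x": NS}).text)
```

The prefix itself carries no meaning; what identifies an element is the combination of the namespace URI and the local name.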

Scenario: Suppose your XML document needs to use an element name that is defined in two separate DTDs with different meanings. Namespaces let you identify which meaning you are referring to each time the element appears.

For example:

one.dtd:

two.dtd:

Using namespaces in your xml document will help you differentiate which state element you refer to at all times.

three.dtd: %onedtd; %twodtd;

xml document:

 Maharashtra      gas  

In the above example, the two ‘state’ elements are ambiguous. To resolve this conflict, use namespaces as below:

one.dtd:

two.dtd:

three.dtd: %onedtd; %twodtd;

xml document:

     Maharashtra  

           gas    
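As a sketch of how the two ‘state’ elements stay distinguishable once they are namespaced, the Python snippet below parses a small document and looks each one up separately. The prefixes, URIs and wrapper elements (`record`, `matter`) are assumptions; only the two ‘state’ elements and their values come from the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefixes, URIs and wrapper elements; only the two 'state'
# elements and their values come from the example.
doc = ET.fromstring("""
<record xmlns:geo="http://example.com/geography"
        xmlns:phy="http://example.com/physics">
  <geo:state>Maharashtra</geo:state>
  <phy:matter>
    <phy:state>gas</phy:state>
  </phy:matter>
</record>
""")

ns = {"geo": "http://example.com/geography", "phy": "http://example.com/physics"}
region = doc.find("geo:state", ns).text       # the geographical state
phase = doc.find(".//phy:state", ns).text     # the state of matter
print(region, phase)
```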

 



12) XML Data-binding:

XML data-binding is used to draw XML data from an XML data island and present it in the browser. It is the process of binding a web page to an XML data island (the XML data). The data island can be internal or external. An internal data island is part of the HTML page itself. An external data island separates the data from its presentation, which eases maintenance. Unlike the transformation process, data binding does not involve any scripting.

To create a data island in an HTML document, use the HTML element ‘xml’, which embeds XML in HTML. This is done as follows:

I) Implementing External Data Island:

1. External xml document: contacts.xml – sample xml document.

 todd  27  male  <_address>texas  7777777777

2. Add an external data island to the HTML page using the ‘xml’ HTML tag:

The ‘xml’ tag creates an XML data island in the HTML page. The ‘id’ attribute names the data island (“contacts” in our example). The ‘src’ attribute specifies the location of the XML document.

3. Embedding xml into html. Html: contacts.html

The HTML page demonstrates XML data binding. Because contacts.xml and contacts.html are assumed to reside in the same directory, the ‘src’ attribute of the ‘xml’ tag gives no path for the .xml file. The XML data is presented in an HTML table: the ‘datasrc’ (data source) attribute points to the data island, and the ‘dataFld’ attribute specifies the element name in the XML document.

   

                                                                        
Name   Age   Sex   Address   Contact No.
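Data islands are a feature of Internet Explorer, so the binding itself cannot be reproduced outside that browser. As a rough analogue only (not the data island mechanism), the Python sketch below reads contact data and renders one table row per record, which is the result the binding produces. The XML content, the element names and the function `to_html_table` are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical contacts.xml content; the element names are assumed.
CONTACTS = """<contacts>
  <contact>
    <name>todd</name><age>27</age><sex>male</sex>
    <address>texas</address><phone>7777777777</phone>
  </contact>
</contacts>"""

FIELDS = ["name", "age", "sex", "address", "phone"]
HEADERS = ["Name", "Age", "Sex", "Address", "Contact No."]

def to_html_table(xml_text):
    """Render one table row per <contact>, one cell per field."""
    root = ET.fromstring(xml_text)
    rows = ["<tr>" + "".join(f"<th>{h}</th>" for h in HEADERS) + "</tr>"]
    for contact in root.findall("contact"):
        cells = "".join(
            f"<td>{escape(contact.findtext(field, default=''))}</td>" for field in FIELDS
        )
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"

table_html = to_html_table(CONTACTS)
print(table_html)
```

Each field name plays the role that ‘dataFld’ plays in the bound table: it selects which child element of a record fills a given cell.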
 

 

II) Implementing Internal Data Island:

 

               todd     27     male     <_address>texas     7777777777       

                                                             
Name   Age   Sex   Address   Contact No.
  

 



