Page 4 of 7

 


Converting HTML to XHTML, cont.

Mechanical Translation from HTML to XHTML

Translation from HTML to XHTML is truly a breeze. Instead of thinking of it as a full-blown conversion process, think of it as cleaning up your HTML document. XHTML uses the same elements; the only differences are in syntax and rules. We break them down for you in the following sections.

XML Syntax Rules

All XML documents have something in common: syntax. There are a few easy rules to remember when you work with XML documents. Here they are, one by one:

      •  All elements must be contained by a root element (also called a document element). For XHTML, the html element is the root element: <html>...</html> contains all other elements.

      •  If your document adheres to a Document Type Definition (DTD), the document must include a document type declaration. For XHTML, the document type declaration is formed as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

        The DOCTYPE declaration is not an element; it's a declaration and has its own syntax. All DOCTYPE declarations begin with an exclamation point and an uppercase keyword (for example, <!DOCTYPE>). This is not negotiable. If you're new to the term DTD, please read Chapter 2.

      •  All element and attribute names must be lowercase (for example, <head> not <HEAD>). XML is case-sensitive, and XHTML defines its element and attribute names in lowercase.

      •  All nonempty elements must have a closing tag (for example, <p>...</p>).

      •  All empty elements must use the following syntax: <img />. (The space between the element name and /> is not required, but be sure to include it for backward compatibility.)

      •  Elements must be nested correctly. For example, <p><b>This is correct</b></p> is properly nested, whereas <p><b>This is not correct</p></b> is not.

      •  All attributes must have values, and those values must be contained by single or double quotation marks. This means that the standalone attributes used in HTML (such as <td nowrap>) are no longer valid. The correct form is <td nowrap="nowrap">.
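Several of the rules above (closing tags, correct nesting, quoted attribute values) are well-formedness rules, so any XML parser will reject markup that breaks them. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser; the is_well_formed helper and the sample fragments (including the logo.png file name) are illustrative and not part of the original text. A well-formedness check of this kind does not enforce the lowercase rule or validate against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments: each pair shows a form that follows a rule
# and a form that breaks it.
samples = {
    "nested correctly": "<p><b>This is correct</b></p>",
    "nested incorrectly": "<p><b>This is not correct</p></b>",
    "empty element closed": '<p><img src="logo.png" alt="Logo" /></p>',
    "empty element left open": '<p><img src="logo.png" alt="Logo"></p>',
    "attribute with quoted value": '<td nowrap="nowrap">cell</td>',
    "standalone attribute": "<td nowrap>cell</td>",
    "unquoted attribute value": "<td colspan=2>cell</td>",
}

for label, markup in samples.items():
    print(f"{label}: {is_well_formed(markup)}")
```

Running a check like this over an HTML file you are cleaning up is a quick way to find the first spot where the syntax rules are broken.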

The main syntax rules end there. However, there are a few XHTML-specific rules you have to follow as you convert your documents.

XHTML-Specific Rules

To become familiar with the XHTML-specific rules, let's go through an HTML document from top to bottom and identify the necessary changes and additions to required elements.

The first item on an HTML page is normally the html element. In XHTML, this is no longer the case. There are a few pieces of markup that must come at the beginning of any XHTML document.

First, you must include a DOCTYPE declaration (also known as a document type declaration). You may also include XML version and encoding information (which is optional) in the XML declaration right before the DOCTYPE declaration. We strongly recommend that you include the XML declaration. For more on the XML declaration, see Chapter 2. The correct markup takes the form:

<?xml version="1.0"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
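To confirm that a document really starts with the expected prolog, you can parse it and read back the DOCTYPE information. The following is a minimal sketch, assuming Python's standard xml.dom.minidom module; the describe_doctype helper and the sample document body are hypothetical. The parser only reports the identifiers here; it does not download the DTD or validate the document against it.

```python
from xml.dom import minidom


def describe_doctype(markup: str):
    """Return (name, public id, system id) of the DOCTYPE, or None if absent."""
    doctype = minidom.parseString(markup).doctype
    if doctype is None:
        return None
    return (doctype.name, doctype.publicId, doctype.systemId)


# Illustrative document: XML declaration first, then the DOCTYPE declaration,
# then the root element.
document = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example</title></head>"
    "<body><p>Hello</p></body>"
    "</html>"
)

print(describe_doctype(document))
print(describe_doctype("<html><head><title>No prolog</title></head></html>"))
```

The XML declaration, when present, must be the very first thing in the document, which is why the sample string has nothing before it.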

The next item on the list is the html element. In XHTML, this element is called the root element. The root element contains all other elements. To read more on including a root element, see Chapter 3, "Overview of Element Structures." Not only is the html element required, but it must also contain a predefined attribute-value pair. This required attribute declares an XML namespace. XML namespaces are commonly used in XML documents. The specific namespace used for XHTML documents is often referred to as the XHTML namespace and is a way to uniquely identify the set of elements as XHTML elements. The correct XHTML namespace is declared on the html element as follows:

<html xmlns="http://www.w3.org/1999/xhtml">

Imagine that you're in the world of XML and you want to combine your XHTML document with some elements you made from scratch. In the list of elements you named, you decided to use a title element type to represent the title of a book. This means that your combined document now has two title element types with two completely different meanings.

How do you tell the XML processor (a browser is one type of processor) that the XHTML title element is the title of the document and your title element represents the title of a book? You use a namespace to uniquely separate the two.

Namespaces are like surnames. Defining a namespace in the root element means that all elements contained by the html element belong to the XHTML element set. Any elements that are not contained by the html element or that have their own namespaces attached will belong to a different set of elements. The namespace debate is not yet settled at the W3C; for a while, there was an argument about how and when to use namespaces. To learn more about namespaces, see Chapter 2 or check out the W3C site at www.w3.org/TR/REC-xml-names/.
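The two-title scenario can be sketched in code to show how a namespace-aware processor keeps the elements apart. This is a minimal illustration, assuming Python's standard xml.etree.ElementTree parser, which reports each element name together with its namespace. The book namespace URI, the bk prefix, the sample titles, and the titles_by_namespace helper are all hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
BOOK_NS = "http://example.com/ns/books"  # hypothetical namespace for the custom elements

# An XHTML document combined with custom elements that reuse the name "title".
document = f"""<html xmlns="{XHTML_NS}">
  <head><title>My Reading List</title></head>
  <body>
    <bk:book xmlns:bk="{BOOK_NS}">
      <bk:title>A Hypothetical Book</bk:title>
    </bk:book>
  </body>
</html>"""


def titles_by_namespace(markup: str) -> dict:
    """Map each namespace URI to the text of the title element it contains."""
    result = {}
    for element in ET.fromstring(markup).iter():
        # ElementTree writes qualified names as "{namespace-uri}local-name".
        namespace, _, local_name = element.tag.rpartition("}")
        if local_name == "title":
            result[namespace.lstrip("{")] = element.text
    return result


for namespace, text in titles_by_namespace(document).items():
    print(f"{namespace}: {text}")
```

Both elements have the local name title, but the processor sees two different qualified names, so the document title and the book title never collide.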

Next, you must include the head, title, and body structural elements, which are required in XHTML. The one exception is if you're creating a frameset document. In this case, you have to replace the body element with the frameset element.
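A quick structural check for these required elements might look like the following sketch. It assumes Python's standard xml.etree.ElementTree parser and a document that declares the XHTML namespace on the html element; the missing_structure helper is hypothetical. It only looks for the presence of the elements and is not a substitute for validating against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def qualified(name: str) -> str:
    """Return the element name qualified with the XHTML namespace."""
    return f"{{{XHTML_NS}}}{name}"


def missing_structure(markup: str) -> list:
    """List the required structural elements that the document lacks."""
    root = ET.fromstring(markup)
    missing = []
    if root.tag != qualified("html"):
        missing.append("html")
    head = root.find(qualified("head"))
    if head is None:
        missing.append("head")
    if head is None or head.find(qualified("title")) is None:
        missing.append("title")
    # A frameset document uses frameset in place of body.
    if root.find(qualified("body")) is None and root.find(qualified("frameset")) is None:
        missing.append("body or frameset")
    return missing


complete = (
    f'<html xmlns="{XHTML_NS}">'
    "<head><title>Example</title></head><body><p>Hello</p></body></html>"
)
incomplete = f'<html xmlns="{XHTML_NS}"><head></head></html>'

print(missing_structure(complete))
print(missing_structure(incomplete))
```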

Now that you know the differences between XHTML and HTML, it's time to convert an HTML document to an XHTML document.

 
