Page 7 of 13

 


Java and the XML DOM, cont.

Displaying an Entire Document

In this next example, I'm going to write a program that will parse and display an entire document, indenting each element, processing instruction, and so on, as well as displaying attributes and their values. For example, if you pass customer.xml to this program, which I'll call IndentingParser.java, that program will display the whole document properly indented.

I start by letting the user specify what document to parse and then parsing that document as before. To actually parse the document, I'll call a new method, displayDocument, from the main method:

public static void main(String args[])

{

    displayDocument(args[0]);

    .

    .

    .

}

In the displayDocument method, I'll parse the document and get an object corresponding to that document:

import org.w3c.dom.*;

import org.apache.xerces.parsers.DOMParser;

 

public class IndentingParser

{

    public static void displayDocument(String uri)

    {

        try {

            DOMParser parser = new DOMParser();

            parser.parse(uri);

            Document document = parser.getDocument();

            .

            .

            .

        } catch (Exception e) {

            e.printStackTrace(System.err);

        }

    .

    .

    .

The method that actually walks the parsed document and displays it, display, will be recursive, as we saw when working with JavaScript. I'll pass the node to display to that method, as well as the current indentation string (which will grow by four spaces for every successive level of recursion):

import org.w3c.dom.*;

import org.apache.xerces.parsers.DOMParser;

 

public class IndentingParser

{

    public static void displayDocument(String uri)

    {

        try {

            DOMParser parser = new DOMParser();

            parser.parse(uri);

            Document document = parser.getDocument();

 

            display(document, "");

 

        } catch (Exception e) {

            e.printStackTrace(System.err);

        }

    }

    .

    .

    .
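The listing above depends on the Xerces DOMParser class. As a minimal sketch for readers who don't have Xerces on the classpath, the same two steps (parse, then get the Document object) can be tried with the JAXP classes bundled with the JDK. Note that this swaps DocumentBuilderFactory in for DOMParser, and that the names ParseSketch and parse are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of IndentingParser.java.
class ParseSketch {
    // Parses XML text with the JDK's built-in JAXP parser (not Xerces' DOMParser).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrap the checked parser exceptions so callers stay simple.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document document = parse("<DOCUMENT><CUSTOMER/></DOCUMENT>");
        // The root element is where a recursive display would start.
        System.out.println(document.getDocumentElement().getNodeName()); // prints DOCUMENT
    }
}
```

Parsing from a string rather than a URI keeps the sketch self-contained; DocumentBuilder also has a parse overload that accepts a URI string.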

In the display method, I'll check to see whether the node passed to us is really a node; if not, I return from the method. The next job is to display the node, and how we do that depends on the type of node we're working with. To get the type of node, you can use the node's getNodeType method; I'll set up a long switch statement to handle the different types:

import org.w3c.dom.*;

import org.apache.xerces.parsers.DOMParser;

 

public class IndentingParser

{

    public static void displayDocument(String uri)

    {

    .

    .

    .

    }

 

    public static void display(Node node, String indent)

    {

        if (node == null) {

            return;

        }

 

        int type = node.getNodeType();

 

        switch (type) {

    .

    .

    .
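To make the idea of switching on getNodeType concrete, here is a small sketch that maps the type constants used in this section to readable names. The class name NodeTypeSketch and the method describe are hypothetical; the constants themselves come from the org.w3c.dom.Node interface:

```java
import org.w3c.dom.Node;

// Hypothetical helper; shows the shape of a switch over Node type constants.
class NodeTypeSketch {
    // getNodeType returns a short, so the parameter is a short as well.
    static String describe(short type) {
        switch (type) {
            case Node.DOCUMENT_NODE:               return "document";
            case Node.ELEMENT_NODE:                return "element";
            case Node.CDATA_SECTION_NODE:          return "CDATA section";
            case Node.TEXT_NODE:                   return "text";
            case Node.PROCESSING_INSTRUCTION_NODE: return "processing instruction";
            default:                               return "other (" + type + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Node.ELEMENT_NODE)); // prints element
        System.out.println(describe(Node.COMMENT_NODE)); // prints other (8)
    }
}
```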

To handle output from this program, I'll create an array of strings, displayStrings, placing each line of the output into one of those strings. I'll also store our current location in that array in an integer named numberDisplayLines:

public class IndentingParser

{

    static String displayStrings[] = new String[1000];

    static int numberDisplayLines = 0;

    .

    .

    .
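One thing to keep in mind with a fixed array of 1,000 strings is that a document producing more than 1,000 lines of output would overrun it. As a sketch of an alternative (not what the program in this chapter does), the lines could be collected in a growable list instead. The class name DisplayLines and its methods are invented for this example:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical replacement for the displayStrings array and numberDisplayLines counter.
class DisplayLines {
    private final List<String> lines = new ArrayList<>();

    // Appends one finished line of output.
    void add(String line) {
        lines.add(line);
    }

    // Plays the role of numberDisplayLines.
    int count() {
        return lines.size();
    }

    String get(int index) {
        return lines.get(index);
    }

    public static void main(String[] args) {
        DisplayLines output = new DisplayLines();
        for (int i = 0; i < 1500; i++) {
            output.add("line " + i); // no fixed upper limit to overrun
        }
        System.out.println(output.count()); // prints 1500
    }
}
```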

I'll start handling various types of nodes in this switch statement now.

Handling Document Nodes

The node at the top of the tree is the document node, and its type matches the constant Node.DOCUMENT_NODE defined in the Node interface (see Table 11.4). The DOM doesn't hand us the XML declaration as a node of its own, so when I see the document node, I'll write a default XML declaration myself. This declaration takes up one line of output, so I'll start the first line of output with the current indent string, followed by the declaration.

The next step is to get the document element of the document we're parsing (the root element), and you do that with the getDocumentElement method. The root element contains all other elements, so I pass that element to the display method, which will display all those elements:

public static void display(Node node, String indent)

{

    if (node == null) {

        return;

    }

 

    int type = node.getNodeType();

 

    switch (type) {

        case Node.DOCUMENT_NODE: {

            displayStrings[numberDisplayLines] = indent;

            displayStrings[numberDisplayLines] +=

              "<?xml version=\"1.0\" encoding=\""+

              "UTF-8" + "\"?>";

            numberDisplayLines++;

            display(((Document)node).getDocumentElement(), "");

            break;

         }

.

.

.
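The code above always writes the same default declaration. Where the DOM implementation supports DOM Level 3 (an assumption; the parser version used in this chapter may predate it), the Document methods getXmlVersion and getXmlEncoding report what the parsed document actually declared. The following sketch uses the JDK's JAXP parser rather than Xerces' DOMParser, and the names DeclarationSketch and declarationOf are hypothetical:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; builds the declaration line from the parsed document.
class DeclarationSketch {
    static String declarationOf(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String version = document.getXmlVersion();
            String encoding = document.getXmlEncoding();
            // Fall back to the same defaults the chapter's code writes.
            if (version == null) {
                version = "1.0";
            }
            if (encoding == null) {
                encoding = "UTF-8";
            }
            return "<?xml version=\"" + version + "\" encoding=\"" + encoding + "\"?>";
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // This document has no declaration, so the defaults are used.
        System.out.println(declarationOf("<DOCUMENT/>"));
    }
}
```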

Handling Element Nodes

To handle an element node, we should display the name of the element, as well as any attributes the element has. I start by checking whether the current node type is Node.ELEMENT_NODE; if so, I place the current indent string into a display string, followed by a < and the element's name, which I can get with the getNodeName method:

switch (type) {

    .

    .

    .

     case Node.ELEMENT_NODE: {

         displayStrings[numberDisplayLines] = indent;

         displayStrings[numberDisplayLines] += "<";

         displayStrings[numberDisplayLines] += node.getNodeName();

         .

         .

         .

Handling Attributes

Now we've got to handle the attributes of this element, if it has any. Because the current node is an element node, you can use the method getAttributes to get a NamedNodeMap object holding all its attributes, which are stored as Attr objects. I'll convert that map to an array of Attr objects, attributes, like this. Note that I first create the attributes array after finding the number of items in the NamedNodeMap object with the getLength method:

switch (type) {

    .

    .

    .

     case Node.ELEMENT_NODE: {

         displayStrings[numberDisplayLines] = indent;

         displayStrings[numberDisplayLines] += "<";

         displayStrings[numberDisplayLines] += node.getNodeName();

 

         int length = (node.getAttributes() != null) ?

             node.getAttributes().getLength() : 0;

         Attr attributes[] = new Attr[length];

         for (int loopIndex = 0; loopIndex < length; loopIndex++) {

             attributes[loopIndex] =

             (Attr)node.getAttributes().item(loopIndex);

         }

         .

         .

         .

You can find the methods of the Attr interface in Table 11.6.

Attr Interface Methods

Method                         Description
java.lang.String getName()     Gets the name of this attribute
Element getOwnerElement()      Gets the Element node to which this attribute is attached
boolean getSpecified()         Returns true if this attribute was explicitly given a value in the original document
java.lang.String getValue()    Gets the value of the attribute as a string

Because the Attr interface is built on the Node interface, you can get an attribute's name and value either with the Node methods getNodeName and getNodeValue or with the Attr methods getName and getValue. I'll use getNodeName and getNodeValue here. In this case, I'm going to loop over all the attributes in the attributes array, adding each one to the current display line in the form AttrName="AttrValue". (Note that I escape the quotation marks around the attribute values as \" so that Java doesn't interpret them as the end of the string.)

switch (type) {

    .

    .

    .

     case Node.ELEMENT_NODE: {

         displayStrings[numberDisplayLines] = indent;

         displayStrings[numberDisplayLines] += "<";

         displayStrings[numberDisplayLines] += node.getNodeName();

 

         int length = (node.getAttributes() != null) ?

             node.getAttributes().getLength() : 0;

         Attr attributes[] = new Attr[length];

         for (int loopIndex = 0; loopIndex < length; loopIndex++) {

             attributes[loopIndex] =

             (Attr)node.getAttributes().item(loopIndex);

         }

 

         for (int loopIndex = 0; loopIndex < attributes.length; loopIndex++) {

             Attr attribute = attributes[loopIndex];

             displayStrings[numberDisplayLines] += " ";

             displayStrings[numberDisplayLines] += attribute.getNodeName();

             displayStrings[numberDisplayLines] += "=\"";

             displayStrings[numberDisplayLines] += attribute.getNodeValue();

             displayStrings[numberDisplayLines] += "\"";

         }

         displayStrings[numberDisplayLines] += ">";

 

         numberDisplayLines++;

         .

         .

         .
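As a compact sketch of the same attribute loop, the following reads the attributes straight from the NamedNodeMap without copying them into an array first. It uses the JDK's JAXP parser instead of Xerces' DOMParser, and the names OpeningTagSketch, openingTag, and openingTagOfRoot are hypothetical. Like the code above, it does not escape special characters in attribute values:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; builds an opening tag with its attributes.
class OpeningTagSketch {
    static String openingTag(Element element) {
        StringBuilder tag = new StringBuilder("<").append(element.getNodeName());
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            tag.append(' ').append(attribute.getNodeName())
               .append("=\"").append(attribute.getNodeValue()).append('"');
        }
        return tag.append('>').toString();
    }

    // Parses the text and returns the opening tag of its root element.
    static String openingTagOfRoot(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return openingTag(document.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // prints <CUSTOMER STATUS="Good credit">
        System.out.println(openingTagOfRoot("<CUSTOMER STATUS='Good credit'/>"));
    }
}
```

The example uses a single attribute on purpose: the DOM does not promise any particular order for the items in a NamedNodeMap.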

This element may have child elements, of course, and we have to handle them as well. I do that by storing all the child nodes in a NodeList object with the getChildNodes method. If there are any child nodes, I add four spaces to the indent string and loop over those child nodes, calling display to display each of them:

switch (type) {

    .

    .

    .

     case Node.ELEMENT_NODE: {

         displayStrings[numberDisplayLines] = indent;

         displayStrings[numberDisplayLines] += "<";

         displayStrings[numberDisplayLines] += node.getNodeName();

 

         int length = (node.getAttributes() != null) ?

             node.getAttributes().getLength() : 0;

         Attr attributes[] = new Attr[length];

         for (int loopIndex = 0; loopIndex < length; loopIndex++) {

             attributes[loopIndex] =

             (Attr)node.getAttributes().item(loopIndex);

         }

 

         for (int loopIndex = 0; loopIndex < attributes.length; loopIndex++) {

             Attr attribute = attributes[loopIndex];

             displayStrings[numberDisplayLines] += " ";

             displayStrings[numberDisplayLines] += attribute.getNodeName();

             displayStrings[numberDisplayLines] += "=\"";

             displayStrings[numberDisplayLines] += attribute.getNodeValue();

             displayStrings[numberDisplayLines] += "\"";

         }

         displayStrings[numberDisplayLines] += ">";

 

         numberDisplayLines++;

 

         NodeList childNodes = node.getChildNodes();

         if (childNodes != null) {

             length = childNodes.getLength();

             indent += "    ";

             for (int loopIndex = 0; loopIndex < length; loopIndex++ ) {

                display(childNodes.item(loopIndex), indent);

             }

         }

         break;

     }

     .

     .

     .

That's it for handling elements; I'll handle CDATA sections next.

Handling CDATA Section Nodes

Handling CDATA sections is particularly easy. All I have to do here is to enclose the value of the CDATA section's node inside "<![CDATA[" and "]]>":

case Node.CDATA_SECTION_NODE: {

    displayStrings[numberDisplayLines] = indent;

    displayStrings[numberDisplayLines] += "<![CDATA[";

    displayStrings[numberDisplayLines] += node.getNodeValue();

    displayStrings[numberDisplayLines] += "]]>";

    numberDisplayLines++;

    break;

}

.

.

.
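To see a CDATA section node in isolation, here is a small sketch that parses a one-element document and wraps the node's value the same way. It uses the JDK's JAXP parser rather than Xerces' DOMParser and turns coalescing off explicitly so that CDATA sections are kept as their own nodes. The names CdataSketch and renderFirstChild are hypothetical:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; renders the root element's first child as a CDATA section.
class CdataSketch {
    static String renderFirstChild(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Keep CDATA sections as separate nodes instead of merging them into text.
            factory.setCoalescing(false);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node child = document.getDocumentElement().getFirstChild();
            if (child == null || child.getNodeType() != Node.CDATA_SECTION_NODE) {
                throw new IllegalArgumentException("First child is not a CDATA section");
            }
            return "<![CDATA[" + child.getNodeValue() + "]]>";
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // prints <![CDATA[if (a < b) run();]]>
        System.out.println(renderFirstChild("<SCRIPT><![CDATA[if (a < b) run();]]></SCRIPT>"));
    }
}
```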

Handling Text Nodes

The W3C DOM specifies that the text in elements must be stored in text nodes, and those nodes have the type Node.TEXT_NODE. For these nodes, I'll add the current indent string to the display string, and then I'll trim off leading and trailing whitespace from the node's value with the Java String object's trim method:

case Node.TEXT_NODE: {

    displayStrings[numberDisplayLines] = indent;

    String newText = node.getNodeValue().trim();

.

.

.

The XML for Java parser treats all text as text nodes, including the spaces used for indenting elements in customer.xml. I'll filter out the text nodes corresponding to indentation spacing; if a text node contains only displayable text, however, I'll add that text to the strings in the displayStrings array:

case Node.TEXT_NODE: {

    displayStrings[numberDisplayLines] = indent;

    String newText = node.getNodeValue().trim();

    if(newText.indexOf("\n") < 0 && newText.length() > 0) {

        displayStrings[numberDisplayLines] += newText;

        numberDisplayLines++;

    }

    break;

}

.

.

.
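The test in the if statement above is easy to try on its own. This sketch pulls the same condition into a hypothetical method, isDisplayable, so you can see which text node values pass. Note that the condition also rejects text that still contains a newline after trimming, so multi-line text is filtered out along with indentation:

```java
// Hypothetical helper; isolates the text-node filter used in the switch statement.
class TextFilterSketch {
    // True when the trimmed value is non-empty and contains no newline.
    static boolean isDisplayable(String nodeValue) {
        String trimmed = nodeValue.trim();
        return trimmed.indexOf("\n") < 0 && trimmed.length() > 0;
    }

    public static void main(String[] args) {
        System.out.println(isDisplayable("\n        "));          // prints false
        System.out.println(isDisplayable("  Sam  "));             // prints true
        System.out.println(isDisplayable("line one\nline two"));  // prints false
    }
}
```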

Handling Processing Instruction Nodes

The W3C DOM also lets you handle processing instructions. Here, the node type is Node.PROCESSING_INSTRUCTION_NODE. The node name is the processing instruction's target, and the node value is its data, which is everything that follows the target. For example, let's say that this is the processing instruction:

<?xml-stylesheet type="text/css" href="style.css"?>

Then the name of the associated processing instruction node is xml-stylesheet, and this is its value:

type="text/css" href="style.css"

That means all we have to do is to write <?, the node name, a space, the node value, and ?>. Here's what the code looks like:

     case Node.PROCESSING_INSTRUCTION_NODE: {

         displayStrings[numberDisplayLines] = indent;

         displayStrings[numberDisplayLines] += "<?";

         // The node name is the target, such as xml-stylesheet.

         displayStrings[numberDisplayLines] += node.getNodeName();

         // The node value is the data that follows the target.

         String text = node.getNodeValue();

         if (text != null && text.length() > 0) {

             displayStrings[numberDisplayLines] += " " + text;

         }

         displayStrings[numberDisplayLines] += "?>";

         numberDisplayLines++;

         break;

    }

}

.

.

.
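In the W3C DOM, a processing instruction node's getNodeName method returns the target (here, xml-stylesheet) and getNodeValue returns only the data after the target, so both are needed to rebuild the instruction. The following sketch checks that with the JDK's JAXP parser (used here in place of Xerces' DOMParser); the names ProcessingInstructionSketch, render, and renderFirstNode are hypothetical:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; rebuilds a processing instruction from its node.
class ProcessingInstructionSketch {
    static String render(Node instruction) {
        String target = instruction.getNodeName();   // for example, xml-stylesheet
        String data = instruction.getNodeValue();    // everything after the target
        if (data == null || data.length() == 0) {
            return "<?" + target + "?>";
        }
        return "<?" + target + " " + data + "?>";
    }

    // Parses the text and renders the first child of the document node.
    static String renderFirstNode(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return render(document.getFirstChild());
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml-stylesheet type=\"text/css\" href=\"style.css\"?><DOCUMENT/>";
        // prints <?xml-stylesheet type="text/css" href="style.css"?>
        System.out.println(renderFirstNode(xml));
    }
}
```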

And that finishes the switch statement that handles the various types of nodes. There's only one more point to cover.

Closing Element Tags

Displaying element nodes takes a little more thought than displaying other types of nodes. In addition to displaying <, the name of the element, and >, you also must display a closing tag, </, the name of the element, and >, at the end of the element.

For that reason, I'll place some code after the switch statement to add closing tags to elements after all their children have been displayed. (Note that I'm also subtracting four spaces from the indent string, using the Java String substring method, so that the closing tag lines up vertically with the opening tag.)

    if (type == Node.ELEMENT_NODE) {

        displayStrings[numberDisplayLines] = indent.substring(0,

            indent.length() - 4);

        displayStrings[numberDisplayLines] += "</";

        displayStrings[numberDisplayLines] += node.getNodeName();

        displayStrings[numberDisplayLines] += ">";

        numberDisplayLines++;

        indent += "    ";

    }

}
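To pull the pieces together, here is a condensed sketch of the same recursive idea, limited to element and text nodes. It is not the chapter's program: it uses the JDK's JAXP parser instead of Xerces' DOMParser, collects lines in a list, and passes a longer indent string down to the children instead of changing indent in place, so the closing tag can reuse the element's own indent. It also keeps any non-empty trimmed text, which is a simpler filter than the one shown earlier. The names IndentSketch, render, and walk are hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical condensed version; handles element and text nodes only.
class IndentSketch {
    static List<String> render(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            walk(document.getDocumentElement(), "", lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    static void walk(Node node, String indent, List<String> lines) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                StringBuilder tag = new StringBuilder(indent).append('<').append(node.getNodeName());
                NamedNodeMap attributes = node.getAttributes();
                for (int i = 0; i < attributes.getLength(); i++) {
                    Node attribute = attributes.item(i);
                    tag.append(' ').append(attribute.getNodeName())
                       .append("=\"").append(attribute.getNodeValue()).append('"');
                }
                lines.add(tag.append('>').toString());
                // Children get four more spaces; this element's indent is unchanged.
                NodeList children = node.getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    walk(children.item(i), indent + "    ", lines);
                }
                // The closing tag lines up with the opening tag.
                lines.add(indent + "</" + node.getNodeName() + ">");
                break;
            }
            case Node.TEXT_NODE: {
                String text = node.getNodeValue().trim();
                if (text.length() > 0) {
                    lines.add(indent + text);
                }
                break;
            }
            default:
                break; // other node types are ignored in this sketch
        }
    }

    public static void main(String[] args) {
        for (String line : render("<a x=\"1\">\n  <b>hi</b>\n</a>")) {
            System.out.println(line);
        }
    }
}
```

Because indent is never modified inside walk, no substring call is needed when the closing tag is written.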

And that's it. I parse and display customer.xml like this after compiling IndentingParser.java. In this case, I'll pipe the output through the more filter to stop it scrolling off the screen. (The more filter is available in MS-DOS and certain UNIX ports; it displays one screenful of information and waits for you to press a key before displaying the next screenful.)

 
