   
The 12 Node types
The following table lists the 12 Node types.
Table: The 12 Node types
| Name | Enumeration number |
| NODE_ELEMENT | (1) |
| NODE_ATTRIBUTE | (2) |
| NODE_TEXT | (3) |
| NODE_CDATA_SECTION | (4) |
| NODE_ENTITY_REFERENCE | (5) |
| NODE_ENTITY | (6) |
| NODE_PROCESSING_INSTRUCTION | (7) |
| NODE_COMMENT | (8) |
| NODE_DOCUMENT | (9) |
| NODE_DOCUMENT_TYPE | (10) |
| NODE_DOCUMENT_FRAGMENT | (11) |
| NODE_NOTATION | (12) |
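When debugging, it can help to turn an enumeration number back into its constant name. The following is a minimal sketch, assuming a project reference to the Microsoft XML library (which defines the NODE_* constants); the function name NodeTypeName is our own invention and is not part of the DOM.

```vb
' Sketch only: assumes a project reference to Microsoft XML (for example v3.0),
' which supplies the NODE_* constants. NodeTypeName is a hypothetical helper.
Public Function NodeTypeName(ByVal lngType As Long) As String
    Select Case lngType
        Case NODE_ELEMENT: NodeTypeName = "NODE_ELEMENT"
        Case NODE_ATTRIBUTE: NodeTypeName = "NODE_ATTRIBUTE"
        Case NODE_TEXT: NodeTypeName = "NODE_TEXT"
        Case NODE_CDATA_SECTION: NodeTypeName = "NODE_CDATA_SECTION"
        Case NODE_ENTITY_REFERENCE: NodeTypeName = "NODE_ENTITY_REFERENCE"
        Case NODE_ENTITY: NodeTypeName = "NODE_ENTITY"
        Case NODE_PROCESSING_INSTRUCTION: NodeTypeName = "NODE_PROCESSING_INSTRUCTION"
        Case NODE_COMMENT: NodeTypeName = "NODE_COMMENT"
        Case NODE_DOCUMENT: NodeTypeName = "NODE_DOCUMENT"
        Case NODE_DOCUMENT_TYPE: NodeTypeName = "NODE_DOCUMENT_TYPE"
        Case NODE_DOCUMENT_FRAGMENT: NodeTypeName = "NODE_DOCUMENT_FRAGMENT"
        Case NODE_NOTATION: NodeTypeName = "NODE_NOTATION"
        Case Else: NodeTypeName = "Unknown (" & lngType & ")"
    End Select
End Function
```

Every Node also exposes a nodeTypeString property, which returns a lowercase name such as "element" or "text", so a helper like this is only needed if you want the constant names themselves.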
Now let's look at each type in a bit more
detail.
NODE_ELEMENT
| Has an interface type of: | IXMLDOMElement |
| Can have the following child types: | Element, Text, Comment, ProcessingInstruction, CDATASection, EntityReference |
This nodeType specifies an element Node in the XML
file.
Here is an example of an element in the XML
code:
<PEOPLE> ... </PEOPLE>
This is also an example of an element in the XML
code:
<TEL>(++) 61 2 12345</TEL>
The following XML code includes all the elements you can
find in our DOMDocument:
<PEOPLE>
<PERSON id="p1">
<NAME>Mark Wilson</NAME>
<ADDRESS>911 Somewhere Circle, Canberra, Australia</ADDRESS>
<TEL>(++612) 12345</TEL>
<FAX>(++612) 12345</FAX>
<EMAIL>markwilson@somewhere.com</EMAIL>
</PERSON>
...
</PEOPLE>
The elements in this listing relate to each other as follows:
PEOPLE: This element is returned by the documentElement property.
PERSON: This element is a childNode of PEOPLE.
NAME: This element is the first childNode of PERSON.
ADDRESS: Second childNode of PERSON.
TEL: Third childNode of PERSON.
FAX: Fourth childNode of PERSON.
EMAIL: Fifth childNode of PERSON.
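To see these relationships from VB, we can walk from the documentElement down to the children of PERSON. This is a minimal sketch, assuming a project reference to Microsoft XML v3.0; the function name ListElementNames and the shortened XML string are ours, chosen only for illustration.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' ListElementNames is a hypothetical helper, not part of the DOM.
Public Function ListElementNames() As String
    Dim objDoc As DOMDocument
    Dim objPerson As IXMLDOMNode
    Dim objNode As IXMLDOMNode
    Dim strNames As String

    Set objDoc = New DOMDocument
    objDoc.async = False
    objDoc.loadXML "<PEOPLE><PERSON id=""p1"">" & _
                   "<NAME>Mark Wilson</NAME>" & _
                   "<TEL>(++612) 12345</TEL>" & _
                   "</PERSON></PEOPLE>"

    ' documentElement returns the root element, PEOPLE
    strNames = objDoc.documentElement.nodeName

    ' PERSON is the first child Node of PEOPLE
    Set objPerson = objDoc.documentElement.firstChild
    strNames = strNames & "," & objPerson.nodeName

    ' Collect only the element children of PERSON
    For Each objNode In objPerson.childNodes
        If objNode.nodeType = NODE_ELEMENT Then
            strNames = strNames & "," & objNode.nodeName
        End If
    Next objNode

    ListElementNames = strNames
End Function
```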
NODE_ATTRIBUTE
| Has an interface type of: | IXMLDOMAttribute |
| Can have the following child types: | Text, EntityReference |
This nodeType specifies an attribute Node in the XML
file. For more information on attributes, see the attributes property of this
section, as well as chapter 4, "Programming with XML."
Our XML will read as follows:
<PERSON PERSONID="p1">
PERSONID="p1" is the attribute.
In order to work with the XML file, these attributes need to be declared in the DTD file or in the DTD section of the XML file, which looks like:
<!ATTLIST PERSON PERSONID ID #REQUIRED>
This example declares PERSONID as an attribute of type ID. Because it is declared as #REQUIRED, the attribute must exist for each PERSON element in the XML file. Note that the value of an ID type attribute must be a valid XML name, so it cannot start with a digit; this is why the value is p1 and not 1.
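Once the document is loaded, the attribute can be fetched as a Node in its own right. Here is a minimal sketch, assuming a project reference to Microsoft XML v3.0; the function name DescribePersonId and the attribute value p1 are illustrative assumptions, and no DTD is attached, so nothing is validated.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' DescribePersonId is a hypothetical helper; no DTD is used here.
Public Function DescribePersonId() As String
    Dim objDoc As DOMDocument
    Dim objPerson As IXMLDOMElement
    Dim objAttr As IXMLDOMAttribute

    Set objDoc = New DOMDocument
    objDoc.async = False
    objDoc.loadXML "<PERSON PERSONID=""p1""><NAME>Mark Wilson</NAME></PERSON>"

    Set objPerson = objDoc.documentElement

    ' getAttributeNode returns the attribute as a Node of type NODE_ATTRIBUTE (2)
    Set objAttr = objPerson.getAttributeNode("PERSONID")

    DescribePersonId = objAttr.nodeName & "=" & objAttr.nodeValue & _
                       " (type " & objAttr.nodeType & ")"
End Function
```

If you only need the value, objPerson.getAttribute("PERSONID") returns it directly without going through the attribute Node.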
NODE_TEXT
| Has an interface type of: | IXMLDOMText |
| Can have the following child types: | None |
This nodeType specifies a text Node in the XML file. A
text Node can appear as the child Node of Attribute, DocumentFragment, Element,
and EntityReference Nodes. It never has any child Nodes.
In the following XML example, the text between the start and end tags will be the value in a text Node:
<ADDRESS>911 Somewhere Circle, Canberra, Australia</ADDRESS>
Here, 911 Somewhere Circle, Canberra, Australia is the Node text.
If you are working with an element Node, then you actually don't need to iterate down to its child Node to get the text of the element. You can just use the nodeTypedValue property of the element Node. This will give you its text value, whether it contains a CDATA section, entities, or anything else.
However, there are a few tricks. The next Node type we will explain is the CDATA section. If an element's text is wrapped in a CDATA section, it will be represented as a CDATA section Node and not a text Node, even though it looks like a text Node. So, if you are looking for the text Node of an element Node, don't forget to look for the CDATA section Node as well.
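The two routes to an element's text can be compared side by side. This is a minimal sketch, assuming a project reference to Microsoft XML v3.0; the function name TextBothWaysMatch is our own and exists only for this illustration.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' TextBothWaysMatch is a hypothetical helper.
Public Function TextBothWaysMatch() As Boolean
    Dim objDoc As DOMDocument
    Dim objElement As IXMLDOMElement
    Dim strLongWay As String
    Dim strShortWay As String

    Set objDoc = New DOMDocument
    objDoc.async = False
    objDoc.loadXML "<ADDRESS>911 Somewhere Circle, Canberra, Australia</ADDRESS>"
    Set objElement = objDoc.documentElement

    ' Long way: step down to the child Node, which is of type NODE_TEXT
    If objElement.firstChild.nodeType = NODE_TEXT Then
        strLongWay = objElement.firstChild.nodeValue
    End If

    ' Short way: ask the element Node itself
    strShortWay = objElement.nodeTypedValue

    TextBothWaysMatch = (strLongWay = strShortWay)
End Function
```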
NODE_CDATA_SECTION
| Has an interface type of: | IXMLDOMCDATASection |
| Can have the following child types: | None |
Sometimes you will want to insert unusual characters in your XML Nodes. To do this, you need to use a CDATA section (also known as a Marked section).
Yes, we know that you can also use built-in entities instead of a CDATA section, but there are currently only five of these built-in entities that the DOMDocument will parse. (See the XMLDOMEntity object section earlier in this chapter.)
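But built-in entities don't sort out all our problems. Text that contains many markup characters, such as a code fragment full of < and & characters, becomes hard to read and write when every one of them has to be replaced by an entity. A CDATA section lets you enter such text exactly as it is. (The hash character (#) used in the examples below is in fact legal in ordinary XML text; it is wrapped in a CDATA section here only to show the syntax.)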
Note
Built-in entities are like VB constants, but they are
built into XML. Because XML has certain characters that are used specifically
for XML (like a < character), you can use the entity to represent your
character.
Here we talk specifically about what the DOMDocument can
parse. Many browsers support numerous entities that they recognize, but the
DOMDocument seems to support only these five built-in entities.
To use a CDATA section in our XML code, instead of declaring an element in the DTD as a CDATA element, we can use the following in our XML code:
<![CDATA[ ]]>
Here is an example of an element that uses a CDATA
section with a # in it:
<ADDRESS><![CDATA[#911 Somewhere Circle, Canberra, Australia]]></ADDRESS>
Note
If you do a Select Case in VB to find the enumeration of
a Node, using the NODE_TEXT you will miss the text that is put into CDATA
sections to protect reserved characters. Therefore, don't forget to look for
NODE_CDATA_SECTION in your Select Case.
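A Select Case that handles both Node types might look like the following. This is a minimal sketch, assuming a project reference to Microsoft XML v3.0; CollectText is a hypothetical helper name, not a DOM method.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' CollectText is a hypothetical helper that gathers the text held
' directly under an element, whether in text Nodes or CDATA sections.
Public Function CollectText(ByVal objElement As IXMLDOMElement) As String
    Dim objNode As IXMLDOMNode
    Dim strText As String

    For Each objNode In objElement.childNodes
        Select Case objNode.nodeType
            Case NODE_TEXT, NODE_CDATA_SECTION
                ' Both Node types carry their content in nodeValue
                strText = strText & objNode.nodeValue
            Case Else
                ' Elements, comments, PIs and so on are skipped here
        End Select
    Next objNode

    CollectText = strText
End Function
```

Listing both constants in one Case keeps the handling identical, so text is not lost just because someone wrapped it in a CDATA section.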
A CDATA section Node can be the child Node of Elements,
EntityReferences, and DocumentFragments.
If you try to use an entity in a CDATA section, the entity will not be parsed, and therefore not interpreted. For example, the following XML code embeds an entity in a CDATA section:
<ADDRESS><![CDATA[#911 O&apos;Hara Circle, Canberra, Australia]]></ADDRESS>
Did you note the &apos; entity? The &apos; entity represents an ' (apostrophe), and we would expect the output to display O'Hara Circle, wouldn't we? However, it will not be interpreted as O'Hara Circle, because the text is contained in a CDATA section and won't be parsed. When we examine the above XML code from VB, using the nodeTypedValue property to get the value of the element, we observe the following:
strValue = objXMLElement.nodeTypedValue
strValue returns: #911 O&apos;Hara Circle, Canberra, Australia.
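The difference is easy to check by loading the same text once as ordinary element content and once inside a CDATA section. This is a minimal sketch, assuming a project reference to Microsoft XML v3.0; ReadRootValue is a hypothetical helper name.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' ReadRootValue is a hypothetical helper that loads an XML string and
' returns the value of its root element.
Public Function ReadRootValue(ByVal strXml As String) As String
    Dim objDoc As DOMDocument

    Set objDoc = New DOMDocument
    objDoc.async = False
    If objDoc.loadXML(strXml) Then
        ReadRootValue = objDoc.documentElement.nodeTypedValue
    Else
        ' Return an empty string if the XML could not be parsed
        ReadRootValue = ""
    End If
End Function
```

Called with "<ADDRESS>O&apos;Hara Circle</ADDRESS>", the entity is parsed and the apostrophe appears in the result; called with the same text wrapped in <![CDATA[ ]]>, the characters &apos; come back untouched.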
Note
When you allow users to add data to a DOMDocument, don't forget to check for any unparsable characters. Embed these unparsable characters in a CDATA section before saving the DOMDocument. If you do not do this, when someone has typed in an apostrophe in one of the fields (e.g., O'Mally), the DOMDocument will save this and return no errors. However, when you try to load this saved XML file into the DOMDocument, you will receive an error because of this unparsable character.
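One way to embed user input in a CDATA section is the createCDATASection method of the DOMDocument. The following is a minimal sketch, assuming a project reference to Microsoft XML v3.0; the function name BuildAddressXml and the ADDRESS element are illustrative choices of ours.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' BuildAddressXml is a hypothetical helper. Note that the input itself
' must not contain the sequence ]]>, which would end the CDATA section.
Public Function BuildAddressXml(ByVal strUserInput As String) As String
    Dim objDoc As DOMDocument
    Dim objAddress As IXMLDOMElement

    Set objDoc = New DOMDocument

    ' Create the element and make it the root of the document
    Set objAddress = objDoc.createElement("ADDRESS")
    objDoc.appendChild objAddress

    ' Wrap the raw user text in a CDATA section Node
    objAddress.appendChild objDoc.createCDATASection(strUserInput)

    ' Return the markup for the element
    BuildAddressXml = objAddress.xml
End Function
```

For example, BuildAddressXml("O'Mally & Sons") produces an ADDRESS element whose content is held verbatim in a CDATA section.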
NODE_ENTITY_REFERENCE
| Has an interface type of: | IXMLDOMEntityReference |
| Can have the following child types: | Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference |
This Node specifies a reference to an entity (see the
next section on the Entity Node).
Entities are specified in the DTD.
In the DTD file, we specify the entity as:
<!ENTITY USA "United States of America">
In the XML file, the following code in an element Node will have two child Nodes, one a text Node and the other an EntityReference:
<ADDRESS>30 Animal Road, New York, &USA;</ADDRESS>
In the DOMDocument, the XML example above can be extracted, for the element, as follows (remember that the childNodes collection is zero-based, so the first child Node has index 0):
First child Node
strValue = objXMLElement.childNodes(0).nodeTypedValue
enumType = objXMLElement.childNodes(0).nodeType
Returns: strValue = "30 Animal Road, New York,"
Returns: enumType = NODE_TEXT
Second child Node
strValue = objXMLElement.childNodes(1).Text
enumType = objXMLElement.childNodes(1).nodeType
Returns: strValue = "United States of America"
Returns: enumType = NODE_ENTITY_REFERENCE
However, why have we changed from using the nodeTypedValue property in the first child to using the text property in the second child? It's because we are dealing with two different Node types. Depending on the Node type, you need to access different properties to get the data you want. See the nodeTypedValue and nodeValue properties in this section for more information on this.
Note
If you only want to get the value of the element Node,
you don't have to go through this long procedure to get the value. You only have
to do this if you specifically want to analyze the EntityReference Node, for
example.
The following XML code specifies an entity "&USA;":
<ADDRESS>30 Animal Road, New York, &USA;</ADDRESS>
In the DOMDocument, you can get straight to finding its
parsed value from the element Node:
strValue = objXMLElement.nodeTypedValue
Returns: strValue = "30 Animal Road, New York, United
States of America."
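To try this without a separate DTD file, the entity can be declared in an internal DTD section of the XML string. The following is a minimal sketch, assuming a project reference to Microsoft XML v3.0 (later versions may need DTD processing to be enabled first); AddressChildTypes is a hypothetical helper name, and validation is switched off because the inline DTD declares only the entity.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' AddressChildTypes is a hypothetical helper that lists the Node types
' found directly under the ADDRESS element.
Public Function AddressChildTypes() As String
    Dim objDoc As DOMDocument
    Dim objNode As IXMLDOMNode
    Dim strTypes As String

    Set objDoc = New DOMDocument
    objDoc.async = False
    ' The inline DTD declares no elements, so do not validate against it
    objDoc.validateOnParse = False
    objDoc.loadXML "<!DOCTYPE ADDRESS [" & _
                   "<!ENTITY USA ""United States of America"">]>" & _
                   "<ADDRESS>30 Animal Road, New York, &USA;</ADDRESS>"

    ' nodeTypeString gives a readable name for each child's Node type
    For Each objNode In objDoc.documentElement.childNodes
        strTypes = strTypes & objNode.nodeTypeString & ";"
    Next objNode

    AddressChildTypes = strTypes
End Function
```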
NODE_ENTITY
| Has an interface type of: | IXMLDOMEntity |
| Can have the following child types: | Text, EntityReference |
As mentioned earlier, an entity can only be specified in
the DTD section or file, and not in the XML file. Therefore, entities will
always be child Nodes of the DocumentType Node (see the docType property in this
section).
An example of entities in the DTD is:
<!ENTITY USA "United States of America">
<!ENTITY UK "United Kingdom">
As you can see in figure , there are two child
Nodes (Item 1 and Item 2) for the docType property of this object.
From our DTD sample code above, the DOMDocument's docType childNodes property has two child Nodes. The nodeName property for the first child Node, which is of nodeType NODE_ENTITY, is USA. This Node consists of one child Node, which is of the type NODE_TEXT. However, as you might have noticed, you can reference the text property, which will give you the same value as the child Node without having to iterate through the child Nodes.
Although entities are referenced later in the XML file, among the elements (as NODE_ENTITY_REFERENCE Node types), the parser will automatically handle expanding them, unless the reference is in a CDATA section, as explained under NODE_CDATA_SECTION in this section.
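The declared entities can also be listed from VB through the entities collection of the docType. This is a minimal sketch, assuming a project reference to Microsoft XML v3.0 (later versions may need DTD processing to be enabled first); ListEntities is a hypothetical helper name, and the DTD is supplied inline rather than as a file.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' ListEntities is a hypothetical helper that returns each declared
' entity as name=text, separated by semicolons.
Public Function ListEntities() As String
    Dim objDoc As DOMDocument
    Dim objEntity As IXMLDOMNode
    Dim strResult As String

    Set objDoc = New DOMDocument
    objDoc.async = False
    ' The inline DTD declares no elements, so do not validate against it
    objDoc.validateOnParse = False
    objDoc.loadXML "<!DOCTYPE PEOPLE [" & _
                   "<!ENTITY USA ""United States of America"">" & _
                   "<!ENTITY UK ""United Kingdom"">]>" & _
                   "<PEOPLE/>"

    ' Each item in the entities collection is a Node of type NODE_ENTITY
    For Each objEntity In objDoc.doctype.entities
        If objEntity.nodeType = NODE_ENTITY Then
            strResult = strResult & objEntity.nodeName & "=" & objEntity.Text & ";"
        End If
    Next objEntity

    ListEntities = strResult
End Function
```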
NODE_PROCESSING_INSTRUCTION
| Has an interface type of: | IXMLDOMProcessingInstruction |
| Can have the following child types: | None |
This type of Node specifies a PI in the XML document. Processing Instructions are
declared in the XML file as:
<?xml version="1.0" ?>
PIs can be found as part of the childNodes property in Document, DocumentFragment, Element, and EntityReference Nodes. Therefore, don't be deceived: they are not only found as the PI at the header of a document.
We could easily put the following XML code, <?realaudio version="4.0"?>, among our elements, which would cause no harm to the DOMDocument. This will return six child Nodes for the PERSON element. The realaudio PI is returned as a PI Node (and in case you were wondering, the parser will ignore it completely).
<PERSON id="p1">
<?realaudio version="4.0"?>
<NAME>Mark Wilson</NAME>
<ADDRESS>911 Somewhere Circle, Canberra, Australia</ADDRESS>
<TEL>(++612) 12345</TEL>
<FAX>(++612) 12345</FAX>
<EMAIL>markwilson@somewhere.com</EMAIL>
</PERSON>
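The count of six child Nodes can be confirmed from VB. This is a minimal sketch, assuming a project reference to Microsoft XML v3.0 and the default setting in which white space between elements is not preserved; PersonChildSummary is a hypothetical helper name.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0,
' with preserveWhiteSpace left at its default of False.
' PersonChildSummary is a hypothetical helper.
Public Function PersonChildSummary() As String
    Dim objDoc As DOMDocument
    Dim objPerson As IXMLDOMElement

    Set objDoc = New DOMDocument
    objDoc.async = False
    objDoc.loadXML "<PERSON id=""p1"">" & _
                   "<?realaudio version=""4.0""?>" & _
                   "<NAME>Mark Wilson</NAME>" & _
                   "<ADDRESS>911 Somewhere Circle, Canberra, Australia</ADDRESS>" & _
                   "<TEL>(++612) 12345</TEL>" & _
                   "<FAX>(++612) 12345</FAX>" & _
                   "<EMAIL>markwilson@somewhere.com</EMAIL>" & _
                   "</PERSON>"
    Set objPerson = objDoc.documentElement

    ' The PI counts as a child Node alongside the five elements;
    ' for a PI Node, nodeName holds the target (realaudio)
    PersonChildSummary = objPerson.childNodes.length & " children, first is " & _
                         objPerson.firstChild.nodeTypeString & " " & _
                         objPerson.firstChild.nodeName
End Function
```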
NODE_COMMENT
| Has an interface type of: | IXMLDOMComment |
| Can have the following child types: | None |
This Node is a comment. Comments can be inserted anywhere, so this is another Node to be aware of when iterating through a Node collection. It's good practice to check the type of Node that you are working with, in case PIs or comments pop up. The following XML example includes a comment about this person's availability that someone inserted.
<PERSON id="p1">
<!-- Person not available for another contract until September -->
<NAME>Mark Wilson</NAME>
<ADDRESS>911 Somewhere Circle, Canberra, Australia</ADDRESS>
<TEL>(++612) 12345</TEL>
<FAX>(++612) 12345</FAX>
<EMAIL>markwilson@somewhere.com</EMAIL>
</PERSON>
In the following VB example, we iterate through our
child Nodes for an element. For this example, we only want to check for Element,
Comments, or PI Nodes. The nodeType property enables us to do this
check.
For Each objNode In objElement.childNodes
    Select Case objNode.nodeType
        Case NODE_COMMENT
            'Do something to handle comments
        Case NODE_ELEMENT
            'Do something to handle elements
        Case NODE_PROCESSING_INSTRUCTION
            'Do something to handle PIs
    End Select
Next objNode
NODE_DOCUMENT
| Has an interface type of: | IXMLDOMDocument |
| Can have the following child types: | Element, ProcessingInstruction, Comment, DocumentType |
This Node type specifies that this Node is the document
object, which is the root of the whole document. There can only be one document
Node per XML file. In the following example, we load a DOMDocument from an XML
file:
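Dim objDOMDocument As DOMDocument
Set objDOMDocument = New DOMDocument
'Load synchronously so the document is ready when Load returns
objDOMDocument.async = False
'The URL must be passed as a string and must point to the XML file
objDOMDocument.Load "http://localhost/xmlcode/people2.xml"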
In this example, if you look at the objDOMDocument's
nodeType property, you will see that it's a NODE_DOCUMENT type Node.
Among its childNodes there can be only one Node of the Element type (which itself has child Nodes), as there is only one root element (see the documentElement property for a detailed explanation and example of this Node).
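Because loading can fail, it is worth checking the result before using the document Node. The following is a minimal sketch, assuming a project reference to Microsoft XML v3.0; LoadDocument is a hypothetical helper name, and it takes an XML string so that it does not depend on a server being available.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' LoadDocument is a hypothetical helper that returns the document Node,
' or Nothing if the XML could not be parsed.
Public Function LoadDocument(ByVal strXml As String) As DOMDocument
    Dim objDoc As DOMDocument

    Set objDoc = New DOMDocument
    objDoc.async = False

    ' loadXML returns False when parsing fails
    If objDoc.loadXML(strXml) Then
        Set LoadDocument = objDoc
    Else
        ' parseError describes what went wrong
        Debug.Print "Parse error: " & objDoc.parseError.reason
        Set LoadDocument = Nothing
    End If
End Function
```

The same check applies to the Load method, which also returns False when the file cannot be loaded or parsed.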
NODE_DOCUMENT_TYPE
| Has an interface type of: | IXMLDOMDocumentType |
| Can have the following child types: | Notation, Entity |
This Node type specifies that this is the DTD Node,
which is represented in the XML code as:
<!DOCTYPE PEOPLE SYSTEM "http://localhost/xmlcode/people.dtd">
For a more detailed explanation of the DocumentType
Node, see the docType property in this section.
NODE_NOTATION
| Has an interface type of: | IXMLDOMNotation |
| Can have the following child types: | None |
This is a notation Node. These notations can only be
declared in the document type declaration, just like the Entity type Node.
Therefore, it will only be found as a child of the Document type
Node.
An example of a notation is when we want to tell the
browser how to include an image or realaudio or whichever type of file; we can
then specify the following:
<img src="http://localhost/xmlcode/vbdev.gif">    
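In the DTD itself, a notation is declared with a NOTATION declaration, and the declared notations can then be read from the notations collection of the docType. This is a minimal sketch, assuming a project reference to Microsoft XML v3.0 (later versions may need DTD processing to be enabled first); the notation name gif, its system identifier image/gif, and the helper name ListNotations are all illustrative assumptions of ours.

```vb
' Sketch only: assumes a project reference to Microsoft XML, v3.0.
' ListNotations is a hypothetical helper; the gif notation below is an
' invented example, not taken from the people DTD.
Public Function ListNotations() As String
    Dim objDoc As DOMDocument
    Dim objNotation As IXMLDOMNode
    Dim strResult As String

    Set objDoc = New DOMDocument
    objDoc.async = False
    ' The inline DTD declares no elements, so do not validate against it
    objDoc.validateOnParse = False
    objDoc.loadXML "<!DOCTYPE PEOPLE [" & _
                   "<!NOTATION gif SYSTEM ""image/gif"">]>" & _
                   "<PEOPLE/>"

    ' Each item in the notations collection is a Node of type NODE_NOTATION
    For Each objNotation In objDoc.doctype.notations
        strResult = strResult & objNotation.nodeName & _
                    " (" & objNotation.nodeTypeString & ");"
    Next objNotation

    ListNotations = strResult
End Function
```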
This manuscript is an abridged version of a chapter from the Manning Publications book XML Programming with VB and ASP. This chapter looks at the Microsoft DOM objects. NOTE: Most images have been removed to increase speed and many of the code comments have also been removed for presentation. Please purchase the book to enjoy the full experience of all the chapters with images and code comments!