By: Mark Wilson
I am the creator of TopXML. I am available for international and local (Australia) contracts. I am a Solution Architect/Business Analyst. I have worked in IT in several countries (NZ, Australia, South Africa, UK) building and training teams for government and very large non-governmental organizations. I am ex-Microsoft Consulting Services. I wrote the first book on Microsoft XML published in 2000 called XML Programming with VB and ASP. Most recently I have been building tools for the SEO industry. Ask me for a 37 point SEO health-checkup for your website.
First posted: 01/19/2004
Page 1 of 2

The Joy of SAX

Martin Naughton

Download the source code for this article.

One of the key features of the May 2000 MSXML Technology Preview is an implementation of SAX2 (Simple API for XML, version 2). Developers may also browse or download a SAX2 "Jumpstart" application, written in C++. In this article, I outline an approach to programming the SAX2 interfaces from Visual Basic.

Hello World?

The typical approach to experimentation with a COM component, such as MSXML3.DLL, from VB is to create a new Standard EXE project, then go to the Project/References menu to add a reference to the MSXML3.DLL Type Library ("Microsoft XML, v3.0"). At this point, the VB Object Browser can be used to examine the properties, methods and events for the interface of your choice.

I did this with MSXML3, then went looking (in vain) for the SAX2 interfaces. There was nothing remotely SAX-y in this Type Library. I checked the registry to verify that I had installed the correct version of MSXML3.DLL. Sure enough, there were two promising-looking ProgIds under HKEY_CLASSES_ROOT ("Msxml2.SAXXMLReader" and "Msxml2.SAXXMLReader.3.0"). I fired off one of those "Am I missing something?" questions to a couple of related newsgroups. Quick as a flash, a response came back (thanks to Jason Loader), indicating that, indeed, I was missing something.

Use the Source, Luke

The SAX2 interfaces are defined in a file called "xmlsax.idl". If you installed MSXML3 into the default folder, it can be found in the folder called "C:\Program Files\Microsoft XML Parser SDK\inc". A couple of minutes later, I had used the MIDL compiler to compile "xmlsax.idl" into a Type Library file called, appropriately enough, "xmlsax.tlb".

Back in the VB IDE, I returned to the "Project/References" menu, this time selecting "Browse…" from the References dialogue, to locate the new "xmlsax.tlb" file. As a side effect of selecting a Type Library by this method, the VB IDE will first register the Type Library file. This was much more encouraging. The SAX2 interfaces were now showing up in the Object Browser.

Character-Building Stuff

An inspection of the parameters to some of the methods confirmed what my newsgroup response had suggested: the interface was a little VB-unfriendly. (Remember, the documentation describes it as "Microsoft's COM/C++ implementation of SAX2".) To summarise, wherever a VB developer might expect to see a single "String" parameter, there were, in fact, two parameters. The first of these has a "pwch" Hungarian prefix ("pointer to an array of wide char"?) and the second has a "cch" prefix ("count of char"?).

Here's what the ISAXContentHandler IDL looks like for the "StartElement" method:

HRESULT StartElement(
  [in] const wchar_t * pwchNamespaceUri, 
  [in] int cchNamespaceUri, 
  [in] const wchar_t * pwchLocalName, 
  [in] int cchLocalName, 
  [in] const wchar_t * pwchQName, 
  [in] int cchQName, 
  [in] ISAXAttributes * pAttributes); 

Luckily the newsgroup response provided a snippet of VB code to process these parameters. A simplified version of this function is given here:

      
Public Function UnicodeArrayToString( _
  pwchArray As Integer, _
  ByVal lCharCount As Long) As String

  Dim sText As String

  ' size the string to the correct no. of chars
  sText = String$(lCharCount, 0)

  ' copy over the values. note the size is twice the no. of chars
  ' because the string is Unicode (two bytes per char)
  Call CopyMemory(ByVal StrPtr(sText), _
    pwchArray, _
    lCharCount * 2)

  UnicodeArrayToString = sText

End Function

From VB's point of view, what the XMLSAX interface passes is a reference to the first element of an array of (VB) Integers, plus the number of items in the array. The "UnicodeArrayToString" function uses the undocumented VB "StrPtr" function, plus a call to the "CopyMemory" Windows API, to convert these two values to a VB String. ("CopyMemory" is an alias for the Windows API "RtlMoveMemory".)
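One way to try the helper without involving the parser at all is to fake the pointer/count pair yourself. The sketch below is only an illustration: "TestRoundTrip" and the sample text are made up for this example, and the guard against a zero count is my own addition rather than part of the original newsgroup snippet. It copies a VB String into an Integer array, then hands the first element and the character count to the function, which is roughly what a SAX2 handler sees.

```vb
' Standard module (.bas). A sketch for experimenting with the helper.
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
  Alias "RtlMoveMemory" _
  (pDest As Any, _
  pSource As Any, _
  ByVal ByteLen As Long)

Public Function UnicodeArrayToString( _
  pwchArray As Integer, _
  ByVal lCharCount As Long) As String

  Dim sText As String

  ' assumption: treat a zero (or negative) count as an empty string
  If lCharCount <= 0 Then Exit Function

  ' size the string, then copy two bytes per character into it
  sText = String$(lCharCount, 0)
  CopyMemory ByVal StrPtr(sText), pwchArray, lCharCount * 2

  UnicodeArrayToString = sText

End Function

' Hypothetical test routine: run it from the Immediate window
Public Sub TestRoundTrip()

  Dim sOriginal As String
  Dim aChars() As Integer

  sOriginal = "element"

  ' one Integer per character
  ReDim aChars(0 To Len(sOriginal) - 1)
  CopyMemory aChars(0), ByVal StrPtr(sOriginal), Len(sOriginal) * 2

  ' pass the first array entry plus the count, as SAX2 would
  Debug.Print UnicodeArrayToString(aChars(0), Len(sOriginal))

End Sub
```

If the copy works in both directions, the Immediate window should show the original text again.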

SAX2 "Jumpstart" for VB

We're now ready to write a version of the SAX2 "Jumpstart" in VB. Follow this recipe to get a similar result to the C++ "Jumpstart" application. Note that a reference to MSXML is not required.

  1. Create a VB Standard EXE project
  2. Add a reference to the XMLSAX.TLB (included in the download)
  3. Copy/Paste the following code into a VB Form code window
  4. Add a CommandButton (Command1) to the form
  5. Save the Project
  6. Place a valid XML file, called "test.xml", in the folder where you saved the project
  7. Run the project in Debug mode
  8. Click the Command1 button
  9. Watch the Immediate window

Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
  Alias "RtlMoveMemory" _
  (pDest As Any, _
  pSource As Any, _
  ByVal ByteLen As Long)

Implements XMLSAX.ISAXContentHandler

Private Sub Command1_Click()

  Call Parse

End Sub

Private Function Parse() As Boolean

  Dim i() As Integer
  Dim str As String

  Dim sax As SAXXMLReader30

  Set sax = New SAXXMLReader30

  Call sax.PutContentHandler(Me)

  str = App.Path & "\" & "test.xml"

  ' size the array to the number of characters
  ReDim i(Len(str))

  ' copy the memory; note that the size is twice the no. of chars
  ' as it is a Unicode (double byte) string
  CopyMemory i(0), ByVal StrPtr(str), Len(str) * 2

  ' pass the first array entry in
  sax.ParseURL i(0), Len(str)

End Function

Private Sub ISAXContentHandler_Characters _
  (pwchChars As Integer, _
  ByVal cchChars As Long)

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_EndDocument()

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_EndElement _
  (pwchNamespaceUri As Integer, _
  ByVal cchNamespaceUri As Long, _
  pwchLocalName As Integer, _
  ByVal cchLocalName As Long, _
  pwchQName As Integer, _
  ByVal cchQName As Long)

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_EndPrefixMapping _
  (pwchPrefix As Integer, _
  ByVal cchPrefix As Long)

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_IgnorableWhitespace _
  (pwchChars As Integer, _
  ByVal cchChars As Long)

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_ProcessingInstruction _
  (pwchTarget As Integer, _
  ByVal cchTarget As Long, _
  pwchData As Integer, _
  ByVal cchData As Long)

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_PutDocumentLocator _
  (ByVal pLocator As XMLSAX.ISAXLocator)

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_SkippedEntity _
  (pwchName As Integer, _
  ByVal cchName As Long)

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_StartDocument()

  'Do Nothing

End Sub

Private Sub ISAXContentHandler_StartElement _
  (pwchNamespaceUri As Integer, _
  ByVal cchNamespaceUri As Long, _
  pwchLocalName As Integer, _
  ByVal cchLocalName As Long, _
  pwchQName As Integer, _
  ByVal cchQName As Long,  _
  ByVal pAttributes As XMLSAX.ISAXAttributes)

  Dim str As String

  ' size the string to the correct no of chars
  str = String$(cchLocalName, 0)

  ' copy over the values. note the size is twice no of chars 
  ' because the string is unicode (double byte)
  CopyMemory ByVal StrPtr(str), pwchLocalName, cchLocalName * 2

  Debug.Print str

End Sub

Private Sub ISAXContentHandler_StartPrefixMapping _
  (pwchPrefix As Integer, _
  ByVal cchPrefix As Long,  _
  pwchUri As Integer, ByVal cchUri As Long)

  'Do Nothing

End Sub

Wrap it up

Once I got over the initial hurdles of programming SAX2 from VB, I decided that the best thing to do long-term was to encapsulate all the tricky stuff inside VB wrapper classes for all the SAX2 interfaces.

The result of this work can be seen in the download available with this article. It's not complete but it demonstrates all the important points. The example code can be compiled into an ActiveX component (VBXMLSAX), then re-used (in VB, VBScript, JScript or even C++) without knowledge of the implementation details. For those of you who wish to understand some of those implementation issues, read on…
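To give a flavour of the wrapper idea, here is a minimal sketch of one possible shape for such a class. It is not the code from the download: the class name "CSaxEvents", the helper "ToStr" and the three events are all hypothetical, and the method signatures are simply those of the Jumpstart listing above. The class implements ISAXContentHandler, converts each pointer/count pair to a String, and re-raises the interesting callbacks as ordinary VB events.

```vb
' Class module, hypothetically named CSaxEvents (a sketch, not VBXMLSAX)
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
  Alias "RtlMoveMemory" _
  (pDest As Any, _
  pSource As Any, _
  ByVal ByteLen As Long)

Implements XMLSAX.ISAXContentHandler

' VB-friendly events: plain Strings instead of pointer/count pairs
Public Event StartElement(ByVal sNamespaceUri As String, _
  ByVal sLocalName As String, ByVal sQName As String)
Public Event EndElement(ByVal sNamespaceUri As String, _
  ByVal sLocalName As String, ByVal sQName As String)
Public Event Characters(ByVal sText As String)

' convert a pointer/count pair to a String (same trick as above)
Private Function ToStr(pwch As Integer, ByVal cch As Long) As String
  Dim sText As String
  If cch <= 0 Then Exit Function
  sText = String$(cch, 0)
  CopyMemory ByVal StrPtr(sText), pwch, cch * 2
  ToStr = sText
End Function

Private Sub ISAXContentHandler_Characters _
  (pwchChars As Integer, ByVal cchChars As Long)
  RaiseEvent Characters(ToStr(pwchChars, cchChars))
End Sub

Private Sub ISAXContentHandler_EndDocument()
End Sub

Private Sub ISAXContentHandler_EndElement _
  (pwchNamespaceUri As Integer, ByVal cchNamespaceUri As Long, _
  pwchLocalName As Integer, ByVal cchLocalName As Long, _
  pwchQName As Integer, ByVal cchQName As Long)
  RaiseEvent EndElement(ToStr(pwchNamespaceUri, cchNamespaceUri), _
    ToStr(pwchLocalName, cchLocalName), ToStr(pwchQName, cchQName))
End Sub

Private Sub ISAXContentHandler_EndPrefixMapping _
  (pwchPrefix As Integer, ByVal cchPrefix As Long)
End Sub

Private Sub ISAXContentHandler_IgnorableWhitespace _
  (pwchChars As Integer, ByVal cchChars As Long)
End Sub

Private Sub ISAXContentHandler_ProcessingInstruction _
  (pwchTarget As Integer, ByVal cchTarget As Long, _
  pwchData As Integer, ByVal cchData As Long)
End Sub

Private Sub ISAXContentHandler_PutDocumentLocator _
  (ByVal pLocator As XMLSAX.ISAXLocator)
End Sub

Private Sub ISAXContentHandler_SkippedEntity _
  (pwchName As Integer, ByVal cchName As Long)
End Sub

Private Sub ISAXContentHandler_StartDocument()
End Sub

Private Sub ISAXContentHandler_StartElement _
  (pwchNamespaceUri As Integer, ByVal cchNamespaceUri As Long, _
  pwchLocalName As Integer, ByVal cchLocalName As Long, _
  pwchQName As Integer, ByVal cchQName As Long, _
  ByVal pAttributes As XMLSAX.ISAXAttributes)
  RaiseEvent StartElement(ToStr(pwchNamespaceUri, cchNamespaceUri), _
    ToStr(pwchLocalName, cchLocalName), ToStr(pwchQName, cchQName))
End Sub

Private Sub ISAXContentHandler_StartPrefixMapping _
  (pwchPrefix As Integer, ByVal cchPrefix As Long, _
  pwchUri As Integer, ByVal cchUri As Long)
End Sub
```

With a class along these lines, a form could declare the object "WithEvents", pass it to "PutContentHandler" in place of "Me", and handle the StartElement event with ordinary String arguments. All the "CopyMemory" plumbing then lives in one place.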
