
Querying a NIEM XML Document Using LINQ

Blogger : Geekswithblogs.net
All posts : All posts by Geekswithblogs.net
Category : XML
Blogged date : 2007 Dec 06

If you are not familiar with the new LINQ syntax in .NET 3.5, I recommend you read up on that first. Language Integrated Query (LINQ) is the new foundation for querying anything (I mean anything) in .NET, and a lot of effort has been put into LINQ and XML integration. I did some research on it a few months back, but only recently realized how much Microsoft put into LINQ to XML.

I am a huge fan of intellisense. That is the best thing that MS ever perfected in Studio. I spend a lot of time making sure things show up properly via intellisense, and it really irks me if something isn't there. Studio 2008 (I've used Professional and now Team Development edition) has the best intellisense support of any version yet. MS pulled out all the stops, and you can get intellisense darn near everywhere in VB, XML, and other types of files.

Before you go on, I highly recommend spending 15 minutes watching this video from Beth Massi of the VB Team. She covers how to add intellisense support for XML schemas and documents inside of LINQ queries.

Prerequisites:
1) A NIEM conformant XML schema loaded into your Visual Studio Solution.  Be sure to have it properly seated in its folders.  Here is a screenshot of mine:

If you generated your schemas from the Schema Subset Generator on the NIEM tools page just copy the folder structure from the zip file into your project.  Studio should immediately start parsing it.

2) Import your XML namespaces into the class file you are working in. This is done via the new Imports <xmlns> statement like so:

[The blacked out line is because I covered over one that I didn't want you to see :)]
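
Since the screenshot does not survive in text form, here is a rough sketch of what such imports can look like. The prefixes match the ones used later in this post, but the namespace URIs are my assumption based on NIEM 2.0; copy the real values from the targetNamespace of your own schemas.

```vb
' XML namespace imports go at the top of the class file, next to the normal Imports.
' The URIs below are assumptions (NIEM 2.0 style); use the ones from your own schemas.
Imports <xmlns:nc="http://niem.gov/niem/niem-core/2.0">
Imports <xmlns:j="http://niem.gov/niem/domains/jxdm/4.0">
Imports <xmlns:s="http://niem.gov/niem/structures/2.0">
```

Once the prefixes are imported, they can be used in XML literals and XML axis properties anywhere in that file.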

3) Load a valid XML document. If you have one on disk you can load it by creating an XDocument, calling the Load method, and passing the file path like so:
Dim Doc As XDocument = XDocument.Load("C:\myfile.xml")

If you have a string of XML in a variable you can load it by calling the Parse method of XDocument. The string must be well formed or Parse will throw an exception:

Dim MyXml As String = "<root><tag1>test</tag1></root>"

Dim Doc As XDocument = XDocument.Parse(MyXml)

Once you have those things set up you are ready to begin. As with any type of query language, you must know the structure. Studio will give you intellisense during the queries, but you have to look closely for a check mark indicator to know that your selection is valid for where you are in your structure. Example:

Note the small green circle with the check. That indicates that my choice is valid for where I was in the path: nc:Person.nc:PersonName.nc:PersonGivenName. I could have typed something else in there and it would have accepted it; however, it would fail at runtime.

In the query above I am selecting a Person element from the document based on the Reference Id.  From there I am drilling down to their name.  When calling that method I am passing the string Id I want to find.

Example Document:
<j:CitationSubject>
  <nc:RoleOfPersonReference s:ref="P1"/>
</j:CitationSubject>

<nc:Person s:id="P1">
  <nc:PersonName>
    <nc:PersonGivenName>Chris</nc:PersonGivenName>
  </nc:PersonName>
</nc:Person>

The query would return "Chris", as it is the value of the PersonGivenName element. Because FirstNameQuery is an IEnumerable of XElement, I use the index to return the first result because that is the one I want (and when matching by Id like I am, there should only ever be one). [NB: Because I was demoing the intellisense I had to cut off the .Value at the end of the select. The full select should look like: Select person...<nc:PersonName>...<nc:PersonGivenName>.Value]
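
Because the query itself only appeared in a screenshot, here is a minimal sketch of what it might look like in text form. The module and function names (NameLookup, GetGivenName), the Where clause matching on s:id, and the namespace URIs are all assumptions on my part; only FirstNameQuery and the Select expression come from the post.

```vb
Imports System.Linq
Imports System.Xml.Linq
' Assumed NIEM 2.0 namespace URIs; replace them with the ones from your schemas.
Imports <xmlns:nc="http://niem.gov/niem/niem-core/2.0">
Imports <xmlns:s="http://niem.gov/niem/structures/2.0">

Module NameLookup
    ' Hypothetical helper: returns the given name of the nc:Person whose s:id
    ' equals personId, or Nothing when no such person exists.
    Function GetGivenName(ByVal doc As XDocument, ByVal personId As String) As String
        ' With .Value on the end, each result is a String rather than an XElement.
        Dim FirstNameQuery = From person In doc...<nc:Person> _
                             Where person.@s:id = personId _
                             Select person...<nc:PersonName>...<nc:PersonGivenName>.Value

        ' Matching by Id should give at most one result, so take the first.
        Return FirstNameQuery.FirstOrDefault()
    End Function
End Module
```

Calling GetGivenName(Doc, "P1") against a document shaped like the example above (with matching namespace declarations) is intended to give back the given name. FirstOrDefault is used here instead of an index so that a missing Id yields Nothing rather than an exception.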

The only caveat is that LINQ's intellisense does not support substitution groups. In the normal XML document intellisense in Studio 2008 you will get a list of valid substitutions (nc:AddressRepresentation anyone?). It is not a big deal; you just have to know which tag to put in. For example, under nc:Location.nc:LocationAddress there is a substitution called "AddressRepresentation". I substitute nc:LocationStreet at that point. With LINQ, I have to manually type that in, but after I get to nc:LocationStreet the intellisense picks back up and I get choices like nc:StreetName, etc. I know, they can't spoil us, but if they added that I would fly to Seattle to buy someone coffee!
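
As a rough illustration of typing the substituted element by hand, the path described above could be written like this. The module and function names are hypothetical, the namespace URI is assumed, and the element nesting simply follows the description in this post, so check it against your own schema subset.

```vb
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq
' Assumed NIEM 2.0 namespace URI; replace it with the one from your schemas.
Imports <xmlns:nc="http://niem.gov/niem/niem-core/2.0">

Module StreetLookup
    ' Hypothetical helper: collects one street name per nc:Location element.
    Function GetStreetNames(ByVal doc As XDocument) As List(Of String)
        ' nc:LocationStreet is typed in manually where the schema declares the
        ' substitution group head; intellisense resumes at nc:StreetName.
        Dim StreetQuery = From location In doc...<nc:Location> _
                          Select location.<nc:LocationAddress>.<nc:LocationStreet>.<nc:StreetName>.Value

        Return StreetQuery.ToList()
    End Function
End Module
```

A location without a street name contributes Nothing to the list, because .Value on an empty element sequence returns Nothing.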

So how does this fare versus serialization? I ran some very rudimentary benchmarks. For loading a document with 50 of my citations in it (for reference, I inherited from j:CitationType and extended it...a bunch) it took LINQ 00:00:00.0624724 (about 62 ms) to load the document.

LINQ:
Start: 21:17:59.2930220
Document Loaded: 21:17:59.3554944
Load Time 00:00:00.0624724

Loading the same document by invoking a serializer to deserialize the XML into the matching class structure took 00:00:00.7184326 (about 718 ms):

Deserializer:
Start: 21:20:38.8631497
Serialize Complete: 21:20:39.5815823
Load Time 00:00:00.7184326

That is quite a substantial difference to me (the deserializer took roughly 11 times as long), considering the parser I'm developing may need to chew through thousands of citations at a time.
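
If you want to repeat this kind of rough measurement yourself, one way is to wrap the load in a Stopwatch. This is only a sketch: the numbers above were taken from start and end timestamps, not with this code, and the module name, function name, and file path are placeholders.

```vb
Imports System
Imports System.Diagnostics
Imports System.Xml.Linq

Module LoadTimer
    ' Hypothetical helper: times a single XDocument.Load call.
    Function TimeLoad(ByVal path As String) As TimeSpan
        Dim timer As Stopwatch = Stopwatch.StartNew()
        Dim doc As XDocument = XDocument.Load(path)
        timer.Stop()
        Return timer.Elapsed
    End Function

    Sub Main()
        ' Placeholder path; point it at your own document.
        Dim elapsed As TimeSpan = TimeLoad("C:\myfile.xml")
        Console.WriteLine("Load Time " & elapsed.ToString())
    End Sub
End Module
```

A single run like this is as rudimentary as the figures above; averaging several runs gives a fairer comparison, because the first call also pays for JIT compilation.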

I see a lot of potential for using LINQ to query NIEM conformant XML.  On the XML generation side I will still use the serializer for the time being.  But when it comes to parsing, I'm going to push using LINQ because of its speed and flexibility.

