First posted: 05/18/2004
A VB, VBScript, JavaScript string library based on XPath Functions
At first glance, the string functions of XPath (starts-with, contains, substring-before, substring-after, substring, string-length, normalize-space)
might not seem to be that different from those of
VB or JavaScript. They are, however, a step up on the abstraction
ladder.
For example, substring-after(strURL,'://') means "give
me everything after ://."
In VB you would use
Mid$(strURL, InStr(strURL, "://") + Len("://"))
which means "take the index of the start of ://, add the
length of :// and use that value with a Mid$ to get everything after
it."
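As a rough illustration of the same idea in JavaScript, here is a minimal sketch of a substring-after helper. The name substringAfter and its signature are assumptions made for this example, not necessarily what the downloadable module uses. It follows the XPath rule that a missing search string yields an empty string.

```javascript
// Hypothetical helper; the name and signature are assumptions for illustration.
// Mimics XPath substring-after: everything after the first occurrence of marker.
function substringAfter(text, marker) {
  const index = text.indexOf(marker);
  // XPath returns the empty string when the marker does not occur.
  if (index === -1) {
    return "";
  }
  return text.slice(index + marker.length);
}

console.log(substringAfter("http://www.topxml.com", "://")); // www.topxml.com
```

Note that the Mid$ one-liner behaves differently when :// is absent: InStr returns 0, so Mid$ starts at character 3 instead of returning an empty string.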
The advantage with the previous example is clear. Let's take an example
that's less so.
<xsl:if test="contains(URL,'://')">
means "does the text in URL contain ://?"
In VB, we would write
If InStr(strURL, "://") > 0 Then
Is there any advantage to the XPath here? No? OK, what about JavaScript? Simple, you
say, it is as follows
if (strURL.indexOf('://') > 0)
Right?
No, wrong. JavaScript strings are zero-based, so indexOf returns 0 for a match
at the very start of the string and -1 for no match. The correct code is
if (strURL.indexOf('://') > -1)
I find myself jumping back and forth from VB to JavaScript
quite a bit. It is irksome that, when I want to know if a string
"contains" a given substring, I have to remember whether the string is zero or
one-based.
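A small wrapper removes the need to remember the indexing rule at each call site. The following is a minimal sketch only; the function name contains is an assumption for this example and may not match the module's actual signature.

```javascript
// Hypothetical helper; hides the zero-based indexOf detail behind a yes/no answer.
function contains(text, search) {
  // indexOf returns -1 only when search does not occur anywhere in text.
  return text.indexOf(search) !== -1;
}

// A match at position 0 is the case that "> 0" gets wrong.
console.log(contains("://example", "://"));        // true
console.log("://example".indexOf("://") > 0);      // false
```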
In order to take advantage of the higher-level functions in XPath in
all my programming, I have created, with Fred Baptiste, XPath-based modules for VB,
VBScript and JavaScript.
Part of the difficulty in creating general reusable functions
is figuring out what to do with boundary conditions. For example,
should StartsWith return true if you pass in a zero-width match string? Well,
you could debate this for a long, long time. If I were a sophomore in
college--a time when my parents subsidized such debates--I would happily join in. But now, since I have a job...
So, does the string "VBXML" begin with a zero-length
string? The XPath spec says yes. The XPath expression
starts-with('VBXML','')
returns true. Huh? Doesn't make sense to me. But if I followed my
preference and then distributed this code to people who know XPath, I
would receive a lot of "bug" reports. So I have just tried to mimic
XPath, using Kay as my reference.
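To make the boundary case concrete, here is a minimal JavaScript sketch of a starts-with check that agrees with XPath on the empty string. The name startsWithXPath is made up for this example so that it does not clash with any existing method; it is not taken from the module.

```javascript
// Hypothetical helper; compares the leading characters of text with prefix.
function startsWithXPath(text, prefix) {
  // When prefix is "", slice(0, 0) is also "", so the comparison is true,
  // which matches what XPath does for starts-with('VBXML','').
  return text.slice(0, prefix.length) === prefix;
}

console.log(startsWithXPath("VBXML", ""));    // true
console.log(startsWithXPath("VBXML", "VB"));  // true
console.log(startsWithXPath("VBXML", "XML")); // false
```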
Notes: you can't include a hyphen in VB function names,
so, following VB convention, I changed starts-with to
StartsWith, etc.
Also, XPath is case-sensitive. That is
definitely a big step backward for the VB
programmer, so I added a TextCompare enumeration as the last argument to
each function, defaulting it to case-sensitive.
Following JavaScript convention, the methods in the JavaScript module
start with a lowercase letter. So,
instead of StartsWith, we have startsWith.
BTW, you might be tempted to think that the JavaScript code
strTemp.normalize();
changes the variable strTemp. It doesn't. JavaScript strings are
immutable. strTemp.normalize() returns a new string.
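The immutability point can be seen with a short sketch. The standalone function normalizeSpace below is a stand-in written for this example (it is not the module's normalize method); it assumes XPath's definition of whitespace as space, tab, carriage return and line feed.

```javascript
// Hypothetical stand-in for the module's normalize method.
// Collapses runs of XPath whitespace to one space, then trims both ends.
function normalizeSpace(text) {
  return text.replace(/[ \t\r\n]+/g, " ").replace(/^ | $/g, "");
}

let strTemp = "  VB \t  XML \n";

normalizeSpace(strTemp);                 // result discarded, strTemp untouched
console.log(JSON.stringify(strTemp));    // still the original, unnormalized string

strTemp = normalizeSpace(strTemp);       // keep the returned string instead
console.log(JSON.stringify(strTemp));    // "VB XML"
```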
The JavaScript module, like the others, has
a boolean flag as the last argument. If you don't specify it, the functions are case-sensitive.
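One way such a trailing flag could work is sketched below for a substring-before style function. The function name, the parameter name ignoreCase and the lowercase-based comparison are all assumptions for this example. The lowercase approach is adequate for plain ASCII text but is not a full Unicode case-insensitive match.

```javascript
// Hypothetical helper; ignoreCase is optional and defaults to case-sensitive.
function substringBefore(text, marker, ignoreCase) {
  // Search in lowercased copies when asked, but slice the original text
  // so the returned string keeps its original casing.
  const haystack = ignoreCase ? text.toLowerCase() : text;
  const needle = ignoreCase ? marker.toLowerCase() : marker;
  const index = haystack.indexOf(needle);
  return index === -1 ? "" : text.slice(0, index);
}

console.log(substringBefore("Mark@LightCC.com", "@lightcc"));       // "" (no case-sensitive match)
console.log(substringBefore("Mark@LightCC.com", "@lightcc", true)); // Mark
```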
AfterRev - the same as substring-after, but searches backwards, analogous to InStrRev
Before - like substring-before
BeforeRev - see AfterRev
Contains - this just replaces InStr() > 0. Why do it? It makes programs more understandable
EndsWith - not a part of XPath, but it is useful
Normalize - (normalize-space) can be done much more quickly with a regex, but the code was already done
Translate - thanks to Fred Baptiste for fixing this
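For readers who want a feel for what a few of these functions do, here is a minimal JavaScript sketch. The lowercase names, signatures and edge-case behavior are assumptions based on the descriptions above and on the XPath definition of translate; they are not copied from the module.

```javascript
// Hypothetical sketches; not the module's actual code.

// Everything after the LAST occurrence of marker ("" if absent).
function afterRev(text, marker) {
  const index = text.lastIndexOf(marker);
  return index === -1 ? "" : text.slice(index + marker.length);
}

// Everything before the LAST occurrence of marker ("" if absent).
function beforeRev(text, marker) {
  const index = text.lastIndexOf(marker);
  return index === -1 ? "" : text.slice(0, index);
}

// True when text finishes with suffix.
function endsWith(text, suffix) {
  return text.slice(text.length - suffix.length) === suffix;
}

// XPath translate: map each character found in `from` to the character at
// the same position in `to`; characters with no counterpart are dropped.
function translate(text, from, to) {
  let result = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    const index = from.indexOf(ch);
    if (index === -1) {
      result += ch;               // not listed in `from`: keep as-is
    } else if (index < to.length) {
      result += to.charAt(index); // listed: replace with its counterpart
    }                             // listed but no counterpart: remove
  }
  return result;
}

const url = "http://www.topxml.com/code/default.asp";
console.log(afterRev(url, "/"));               // default.asp
console.log(beforeRev(url, "/"));              // http://www.topxml.com/code
console.log(endsWith(url, ".asp"));            // true
console.log(translate("bar", "abc", "ABC"));   // BAr
```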
Mark Bosley is a programmer who lives in Shorewood, Wisconsin. His laziness leads him
to investigate all sorts of declarative programming tools because he's confident the computer
is smarter at figuring out how to do things than he is. He works for
Northwoods Software Development, Inc. You can
reach him at mark@lightcc.com