It seems Jonathan Marsh has joined the blogosphere with his new blog Design
By Committee. If you don't know Jonathan Marsh, he's been one of Microsoft's representatives
at the W3C for several years and has been an editor of a variety of W3C specifications
including XML Base, XPointer Framework, and XInclude.
In his post XML Base and open content models, Jonathan writes:
There is a current controversy about XInclude adding
xml:base attributes whenever an inclusion is done. If your schema doesn't
allow those attributes to appear, your document won't validate. This surprises some
people, since the invalid attributes were added by a previous step in the processing
chain (in this case XInclude), rather than by hand. As if that makes a difference
to the validator!
Norm Walsh,
after a false start, correctly points out this behavior was intentional. But he doesn't
go the next step to say that this behavior is vital! The reason xml:base attributes
are inserted is to keep references and links from breaking. If the included content
has a relative URI, and the xml:base attribute is omitted, the link will no longer
resolve - or worse, it will resolve to the wrong thing. Can you say "security hole"?
Sure it's inconvenient to fail validation when xml:base attributes are added,
especially when there are no relative URIs in the included content (and thus the xml:base
attributes are unnecessary). But hey, if you wanted people or processes to add attributes
to your content model, you should have allowed them in the schema!
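To make the concern in the quoted passage concrete, here is a minimal sketch of how the same relative reference resolves with and without a recorded base URI. It uses Python's standard urljoin as a stand-in for the base URI resolution an XML processor would perform; the example.com and example.org locations and the variable names are hypothetical, chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical locations, used only for illustration.
including_doc = "http://example.com/books/catalog.xml"  # document containing the xi:include
included_doc = "http://example.org/chapters/ch1.xml"    # document being pulled in
relative_ref = "images/cover.png"                       # relative URI inside the included content

# With xml:base recording where the included content came from,
# the reference still resolves against its original location.
with_xml_base = urljoin(included_doc, relative_ref)

# Without xml:base, the base URI silently becomes that of the including
# document, so the reference now points somewhere else entirely.
without_xml_base = urljoin(including_doc, relative_ref)

print(with_xml_base)
print(without_xml_base)
```

The two results differ in both host and path, which is the "resolve to the wrong thing" case described in the quote.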
I agree that the working group tried to address a valid concern. But this seems to
me to be a case of the solution being worse than the problem. To handle a situation
for which workarounds will exist in practice (i.e. document authors should use absolute
URIs instead of relative URIs in documents) the XInclude working group handicapped
using XInclude as part of the processing chain for documents that will be validated
by XML Schema.
Since the problem they were trying to solve exists in instance documents, even if
document authors don't follow a general guideline of favoring absolute URIs over
relative URIs, these URIs can be expanded in a single pass using XSLT before being
processed up the chain by XInclude. On the other hand, if a schema doesn't allow xml:base
attributes everywhere (basically every XML format in existence) then one cannot use
XInclude as part of the pipeline that creates the document if the final document will
undergo schema validation.
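The single-pass expansion mentioned above could look roughly like the following. This is a Python stand-in for the XSLT pass, not the XSLT itself, and it rests on assumptions: the function name absolutize is hypothetical, only attributes named href and src are treated as URI-bearing (a real vocabulary would need its own list), and any xml:base attributes already present in the document are ignored.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Assumption: these are the only attributes that carry URIs in the
# vocabulary being processed. Adjust the tuple for a real format.
URI_ATTRIBUTES = ("href", "src")


def absolutize(xml_text, base_uri):
    """Return xml_text with relative URIs expanded against base_uri."""
    root = ET.fromstring(xml_text)
    # One walk over every element in the document.
    for element in root.iter():
        for name in URI_ATTRIBUTES:
            if name in element.attrib:
                # urljoin leaves already-absolute URIs unchanged.
                element.set(name, urljoin(base_uri, element.get(name)))
    return ET.tostring(root, encoding="unicode")


# Hypothetical fragment and location, for illustration only.
fragment = (
    '<chapter>'
    '<img src="images/cover.png"/>'
    '<a href="http://example.net/x">x</a>'
    '</chapter>'
)
result = absolutize(fragment, "http://example.org/chapters/ch1.xml")
print(result)
```

Once every reference in the to-be-included document is absolute, including it elsewhere no longer changes what the references point to, which is the workaround argued for here.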
I think the working group optimized for an edge case but ended up breaking a major
scenario. Unfortunately this happens a lot more than it should in W3C specifications.