Michael Rys

Musings on XML, XQuery and more...










Thursday, February 05, 2004 - Posts

The value of XML 1.1

Yesterday was a day of mourning: XML 1.1 became a W3C Recommendation...

Many of you may wonder why I say this. The short answer is the benefit/cost ratio of XML 1.1: the benefit of XML 1.1 over XML 1.0 in the interoperability space is far outweighed by the cost of bifurcating the XML format and losing backwards compatibility.

If XML were used only as a document markup format, the changes would be minor. But many interop specifications are based on XML as the transport format. In that context, the cost of somebody producing XML 1.1 (for IBM's NEL as whitespace or, I have to admit, our own desire to map binary range code points to their entitized representation) and expecting every other interop component to consume it is too high for the benefits.

So let me quickly review some of the benefits of XML 1.1:

1. NEL as whitespace/EOL character. This is an EBCDIC (read: IBM mainframe) issue and could have been solved on the IBM mainframe side by mapping NEL to an existing whitespace character.

2. Entitization of binary range code points (except U+0000). This is useful when generating XML from less restricted string types such as the relational database string types. The only problem with the current standard is that, because of some arcane C API issues, U+0000 is not allowed, which unfortunately makes the solution less appealing, since one still cannot transport arbitrary strings. Bottom line: this feature could help web services transport data that may contain the rare binary value. But because it is incomplete, it is less useful and does not seem worth the cost of supporting two XML standards.

3. Redefinition of allowed name characters and general decoupling from Unicode version 2.0. This is indeed useful for some languages and allows XML documents to evolve with later versions of Unicode. However, with different versions of Unicode now being acceptable, interop is further hampered. And however inconvenient the current name character limits are for some languages, I think the bifurcation of the XML standard is more inconvenient.

4. Undeclaration of namespace prefixes (as part of the Namespaces 1.1 specification). Again, nothing is wrong with it at the purely technical level, but the “namespace pollution of payload” issue that it tries to address is much smaller in my opinion than some people make it out to be.

5. Normalization. Normalization is fortunately only a SHOULD and not a MUST. Otherwise we would not even be able to take an XML 1.0 document (without the version declaration) and parse it as a 1.1 document. With the current wording, it becomes less of a problem, but again: no reason for a new version.
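
To make point 2 concrete, here is a minimal Python sketch of the character ranges as I read them in the XML 1.0 and XML 1.1 specifications. It is not taken from any parser, and the function names are made up for illustration. It shows that the control characters below U+0020 become transportable in XML 1.1 only as character references, while U+0000 stays excluded in both versions.

```python
# Illustrative sketch only: helper names are hypothetical, not part of any library.

def xml10_allows(cp):
    """Char production of XML 1.0: TAB, LF, CR, and everything from U+0020 up
    (minus surrogates, U+FFFE and U+FFFF)."""
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)

def xml11_allows(cp):
    """Char production of XML 1.1: starts at U+0001, so U+0000 is still excluded."""
    return (0x1 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def xml11_restricted(cp):
    """RestrictedChar of XML 1.1: legal only when written as a character reference."""
    return (0x1 <= cp <= 0x8 or cp in (0xB, 0xC) or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84 or 0x86 <= cp <= 0x9F)

def versions_that_can_carry(text):
    """Return the XML versions whose character data can represent `text`."""
    code_points = [ord(ch) for ch in text]
    versions = []
    if all(xml10_allows(cp) for cp in code_points):
        versions.append("1.0")
    if all(xml11_allows(cp) for cp in code_points):
        versions.append("1.1")
    return versions

def escape_for_xml11(text):
    """Escape markup characters and entitize restricted characters for XML 1.1."""
    out = []
    for ch in text:
        cp = ord(ch)
        if not xml11_allows(cp):
            # Character references must also match Char, so U+0000 has no escape.
            raise ValueError(f"U+{cp:04X} cannot appear in XML 1.1, even as a reference")
        if ch in "&<>":
            out.append({"&": "&amp;", "<": "&lt;", ">": "&gt;"}[ch])
        elif xml11_restricted(cp):
            out.append(f"&#x{cp:X};")
        else:
            out.append(ch)
    return "".join(out)

print(versions_that_can_carry("a\x01b"))  # only XML 1.1 can carry U+0001
print(versions_that_can_carry("a\x00b"))  # neither version can carry U+0000
print(escape_for_xml11("a\x01<b"))
```

A database string containing U+0001 therefore forces the producer to XML 1.1, and one containing U+0000 still needs an out-of-band encoding such as base64 either way.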

Given that the minimal benefits do not justify the cost of changing all XML-consuming applications to accept XML 1.1, I find the release of XML 1.1 somewhat disappointing. I hope that all parties will stay away from producing XML 1.1 with the expectation of interoperability, and will follow at least this W3C recommendation:

“Programs which generate XML SHOULD generate XML 1.0, unless one of the specific features of XML 1.1 is required.”

Let me add that even then, they should think again and generate XML 1.0...
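
As a rough sketch of what that advice could look like in a producer, the Python below defaults to XML 1.0 and only falls back to 1.1 when the payload contains a control character that 1.0 cannot carry. The helper names `prepare_payload` and `xml_declaration` are hypothetical, and the NEL-to-LF mapping is just the producer-side workaround suggested in point 1, not something either specification mandates. For brevity it ignores surrogates and U+FFFE/U+FFFF.

```python
# Hypothetical producer-side policy; not an existing API.

# C0 controls that XML 1.0 cannot carry at all but XML 1.1 accepts
# when written as character references.
XML11_ONLY = set(range(0x01, 0x09)) | {0x0B, 0x0C} | set(range(0x0E, 0x20))

NEL = "\x85"  # EBCDIC-style "next line" character

def prepare_payload(text, map_nel=True):
    """Return (version, text), preferring XML 1.0 whenever the payload allows it."""
    if "\x00" in text:
        raise ValueError("U+0000 cannot be transported in XML 1.0 or XML 1.1")
    if map_nel:
        # Map NEL to an ordinary line feed so XML 1.0 consumers see a line end.
        text = text.replace(NEL, "\n")
    needs_11 = any(ord(ch) in XML11_ONLY for ch in text)
    return ("1.1" if needs_11 else "1.0"), text

def xml_declaration(version):
    """Build the XML declaration for the chosen version."""
    return f'<?xml version="{version}"?>'

version, payload = prepare_payload("line one\x85line two")
print(xml_declaration(version))
```

Even in the 1.1 branch, a producer that cares about interop could instead encode the offending value (for example as base64) and stay on XML 1.0.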

posted Thursday, February 05, 2004 12:02 PM by mrys with 6 Comments



