The value of XML 1.1
Yesterday was a day of mourning: XML 1.1 became a recommendation....
Many of you may wonder why I say this. The short answer is the benefit/cost ratio of XML 1.1: in the interoperability space, the overall benefit of XML 1.1 over XML 1.0 is by far outweighed by the cost of bifurcating the XML format and losing backwards compatibility.
If XML were used only as a document mark-up format, the changes would be minor. But many interop specifications are based on XML as the transport format. In that context, I find the cost too high for the benefits when somebody produces XML 1.1 (for IBM's NEL as whitespace or, I have to admit, for our own desire to map binary-range code points to their entitized representation) and expects every other interop component to be able to consume it.
So let me quickly review some of the benefits of XML 1.1:
1. NEL as whitespace/EOL character. This is an EBCDIC (read: IBM mainframe) issue and could have been solved on the mainframe side by mapping NEL to an existing whitespace character.
2. Entitization of binary-range code points (except U+0000). This is useful when generating XML from less restricted string types, such as the string types of a relational database. The only problem with the current standard is that, because of some arcane C API issues, U+0000 is still not allowed. That makes the solution less appealing, since one still cannot transport an arbitrary string. Bottom line: this feature could help web services transport data that may contain the rare binary value. But because it is incomplete, it is less useful and does not seem worth the cost of having to support two XML standards.
3. Redefinition of allowed name characters and general decoupling from Unicode version 2.0. This is indeed useful for some languages and allows XML documents to evolve with later versions of Unicode. However, with different versions of Unicode now being acceptable, interop is further hampered. And however inconvenient the current name character limits are for some languages, I think the bifurcation of the XML standard is more inconvenient.
4. Undeclaration of namespace prefixes (as part of the Namespaces in XML 1.1 specification). Again, there is nothing wrong with it at the purely technical level, but the “namespace pollution of payload” issue that it tries to address is, in my opinion, much smaller than some people make it out to be.
5. Normalization. Normalization is fortunately only a SHOULD and not a MUST. Otherwise we would not even be able to take an XML 1.0 document (without the version declaration) and parse it as a 1.1 document. With the current wording it becomes less of a problem, but again: this is no reason for a new version.
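To make item 2 concrete, here is a minimal sketch in Python of what "entitizing" control characters for XML 1.1 could look like. The function name `xml11_text` is made up for this illustration; the character ranges follow the RestrictedChar production of the XML 1.1 Recommendation. The sketch deliberately ignores line-end normalization of literal CR and NEL and does not check for surrogates or U+FFFE/U+FFFF.

```python
# Code points that XML 1.1 allows only as character references
# (the RestrictedChar production): C0 controls except TAB/LF/CR,
# plus DEL and the C1 controls except NEL (U+0085).
RESTRICTED_RANGES = [(0x01, 0x08), (0x0B, 0x0C), (0x0E, 0x1F),
                     (0x7F, 0x84), (0x86, 0x9F)]


def is_restricted(cp: int) -> bool:
    """Return True if the code point must be written as a character reference."""
    return any(lo <= cp <= hi for lo, hi in RESTRICTED_RANGES)


def xml11_text(text: str) -> str:
    """Serialize a string as XML 1.1 character data (hypothetical helper)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # The gap criticized above: U+0000 cannot be represented at all,
            # not even as a character reference.
            raise ValueError("U+0000 is not allowed in XML 1.0 or XML 1.1")
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif is_restricted(cp):
            out.append(f"&#x{cp:X};")
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, a string containing U+0001 comes out as `&#x1;`, which an XML 1.0 parser must reject, while a string containing U+0000 cannot be serialized under either version.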
Given that the minimal benefits do not justify the cost of changing all XML-consuming applications to accept XML 1.1, I find the release of XML 1.1 somewhat disappointing. I hope that all parties will stay away from producing XML 1.1 in the expectation of getting interoperability, and will follow at least this W3C recommendation:
“Programs which generate XML SHOULD generate XML 1.0, unless one of the specific features of XML 1.1 is required.”
Let me add that even then, they should think again and generate XML 1.0...
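As a rough sketch of what following that recommendation could look like in a producer, the Python function below picks the version for the XML declaration from the text content alone and defaults to 1.0. The names `needed_xml_version` and `xml_declaration` are hypothetical. The check only looks at C0 control characters in character data; it does not examine element or attribute names, NEL handling, or namespace undeclaration.

```python
def needed_xml_version(text: str) -> str:
    """Return "1.0" unless the text contains a character only XML 1.1 can carry."""
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # Neither version can transport U+0000.
            raise ValueError("U+0000 cannot be represented in any XML version")
        # TAB, LF and CR are the only C0 controls XML 1.0 allows.
        if cp < 0x20 and cp not in (0x09, 0x0A, 0x0D):
            return "1.1"
    return "1.0"


def xml_declaration(text: str) -> str:
    """Build the XML declaration for a document carrying the given text."""
    return f'<?xml version="{needed_xml_version(text)}"?>'
```

In the spirit of the closing remark, a producer that gets "1.1" back might still prefer to encode the offending data differently (for example as base64) and stay on XML 1.0.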