I was recently involved in some discussions on XML-DEV about what XML is good for. Interestingly enough, I found myself (as so often) sitting between two chairs and being attacked from both sides.
On the one hand, we have the XML traditionalists (see Simon's mail for a prime example) who see XML as just syntax, with the main benefit of being able to define markup languages: basically a simpler SGML that provides the ability to define more meaningful markup. For many in this camp, the advent of the Infoset, W3C XML Schema, Web services and XQuery 1.0 seems to be the work of the devil.
On the other hand, we have the data-centrists, for whom XML is just one serialization of the underlying data model. Many in this camp try to divide (and hope to conquer) the space by advocating “better” formats for this task, such as ASN.1, their favorite “binary XML/Infoset” and, to some extent, RDF (see my earlier post for some references).
I think that XML transcends and combines these two areas (for the Hegel fans: “das Eine hebt das Andere auf und generiert etwas Neues”) and by doing so will enable new application scenarios and information integration. Yes, it is syntax and is a good markup definition language. But it is also a really decent serialization format as long as it is the primary interoperability format (again, see my binary XML: why not posting). Being able to represent structured and semi-structured data together with marked-up data, all in the same document and using the same syntax, enables some very interesting scenarios for integrating information from different sources. It gets us closer to the holy grail of information management: the integration of data and document management using a single set of tools and an integrated query language that sees both structured data and markup as first-class citizens.
Does XML have warts? Yes it does. The inability to represent certain code-point ranges as element content makes serializing non-XML data hard. XML also has only weak support for representing graph structures. The W3C XML Schema language could have been much simpler. Many people in the document community argue that it is designed too much for typing data and not enough as a validation language. Actually, many members of the schema working group I spoke to said that their main design goal was to define a validation language. And the resulting design certainly is not a very good type language either (look at built-in types such as xs:duration, at xsi:nil, at derivation by extension, at pattern restrictions on types whose value space does not provide a 1-to-1 mapping to the lexical space, etc.). Or just look at the “namespace in unmarked string” fiasco.
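To make the code-point wart concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser. The `payload` element, the sample string and the helper `is_well_formed` are made up for illustration, and Base64 is only one possible workaround, not something the specification prescribes. XML 1.0 excludes most C0 control characters (U+0001, for instance) from its character range, so they can appear neither literally nor as a character reference.

```python
import base64
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical non-XML data containing the control character U+0001.
raw = "a\x01b"

# 1. Putting the character directly into element content.
direct = "<payload>" + raw + "</payload>"

# 2. Using a numeric character reference instead.
escaped = "<payload>a&#1;b</payload>"

# 3. One common workaround (an assumption of this sketch): Base64-encode
#    the data so that only plain ASCII ends up in the document.
encoded_text = base64.b64encode(raw.encode("utf-8")).decode("ascii")
encoded = "<payload>" + encoded_text + "</payload>"

for label, doc in [("direct", direct), ("escaped", escaped), ("base64", encoded)]:
    print(label, is_well_formed(doc))

# The Base64 variant round-trips back to the original data.
decoded = base64.b64decode(ET.fromstring(encoded).text).decode("utf-8")
```

The price of the workaround is that the payload is no longer readable or queryable as text, which is exactly why this restriction is a wart for serialization.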
However, most of these warts do not really seem to outweigh the benefits of having a single, widely available, interoperable data representation and markup format. Many of the warts can be overcome by following best practices, or they burden only the implementation's cost and performance in a small area that can often be avoided. Non-hierarchical relationships can be represented using a variety of mechanisms (ID/IDREF, XLink, RDF encodings). XML Schemata are being written, often at a much lower complexity level, sometimes generated from other languages such as Relax NG or from tools that incorporate these best practices. Even in cases where the generated markup is not as readable as one might hope, it can still be queried or transformed with XSLT, XPath and XQuery.
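As a rough sketch of the ID/IDREF idea, the following uses Python's standard `xml.etree.ElementTree`. The document, the attribute names `id`, `reportsTo` and `backup`, and the helper `follow` are all hypothetical. Without a DTD or schema the attributes are ID/IDREF only by convention, so the lookup table is built by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tree is flat, the graph lives in the attributes.
# Note the cycle ann -> bob -> cat -> ann, which pure nesting cannot express.
DOC = """
<org>
  <person id="ann" reportsTo="bob"><name>Ann</name></person>
  <person id="bob" reportsTo="cat"><name>Bob</name></person>
  <person id="cat" backup="ann"><name>Cat</name></person>
</org>
"""

root = ET.fromstring(DOC)

# Index every element that carries an "id" attribute (ID by convention only).
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def follow(element, ref_attribute):
    """Resolve an IDREF-style attribute to the element it points at, or None."""
    target_id = element.get(ref_attribute)
    return by_id.get(target_id) if target_id is not None else None


ann = by_id["ann"]
boss = follow(ann, "reportsTo")
top = follow(boss, "reportsTo")
back_to_start = follow(top, "backup")

print(boss.findtext("name"), top.findtext("name"), back_to_start.findtext("name"))

# The same lookup expressed as a path query (ElementTree's XPath subset).
bob_via_xpath = root.find(".//person[@id='bob']")
```

A validating parser with a DTD that declares the attributes as ID and IDREF would additionally check that every reference resolves; this sketch leaves that check out.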
While XQuery 1.0 currently lacks the full-text/information retrieval capabilities needed for combined querying of structured data and markup, I can say that there are some interesting proposals being investigated for the full-text language part that will address this shortcoming. For a review of some of the design decisions, I refer to my chapter in Intelligent Search on XML Data.
So yes, many of the detractors of either side of the XML debate raise valid points.
However, they nitpick on the warts, lament bad usage (a valid complaint, but not a problem of the technology per se) and often have agendas of their own.
Yes, if I had the chance to design my own XML, XML schema language, and even XML query language with the current knowledge and hindsight, they would look different from what we currently have (and probably have different warts :-)).
However, XML would still be mainly syntax that can be used for markup and serialization of non-XML data, the schema language would still be usable for structural validation and typing and the query language would still be based on a type-augmented Infoset, provide a similar mix of strong and weak typing with the option to statically type the language, and be a functional, sequence-oriented language.
The benefits are a unifying model for mapping markup to objects (see XAML, XUL etc.), a powerful markup language, the mixing of markup and data for information integration, and an interoperable format with widespread, cheap (and some expensive) tools. Because these benefits far outweigh the costs of the warts, XML and its family of technologies (whether specified by the W3C or by some individual) will survive and prosper (often hidden from most users' direct view).
Whoo-hoo! The local ski resort is going to open this week! Much better than last year :-). This will be the first season of over thirty in which I will wear a helmet. Since it comes with a built-in headset, I am now looking for a good MP3 player so I can carve to some good music. I am currently looking at the Rio Nitrus, the Creative Nomad Jukebox Zen Xtra 30GB or the Apple iPod. Or are there others? My wife wants the Rio S30 since it has FM radio and is more rugged, but I find it does not have enough capacity for a day of skiing...
Anyway, let me know if you have any recommendations.