Adam Bosworth has pointed out an interesting set of requirements for database systems in his post titled “Where have all the good databases gone”:
- Dynamic Schema support
- Dynamic partitioning
- Better Indexing of non relational data
I think all of them are indeed important and, based on my insight into the database industry, the major database vendors are working on addressing them. The recent addition of the XML datatype to SQL (and to the major vendors' products) to store data with more dynamic schemas, together with XQuery to query that data in a (hopefully) efficient way, is certainly one of the main technologies addressing some of the dynamic schema scenarios. Others are dynamic, online schema changes (ALTER TABLE), among other features.
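As a small illustration of the online schema change (ALTER TABLE) mentioned above, here is a minimal sketch in Python. It uses SQLite from the standard library purely as a stand-in, not as one of the engines discussed here; the `customer` table and the helper names `add_column_online` and `column_names` are made up for the example.

```python
import sqlite3


def add_column_online(conn, table, column, decl_type):
    """Add a column to an existing table while its rows stay in place."""
    # Identifiers cannot be bound as SQL parameters, so this sketch
    # only accepts plain names to keep the statement safe.
    for name in (table, column, decl_type):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl_type}")


def column_names(conn, table):
    """Return the current column names of a table, in declaration order."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (name) VALUES (?)", ("Ada",))

# The schema changes after data has already been stored.
add_column_online(conn, "customer", "email", "TEXT")
conn.execute(
    "INSERT INTO customer (name, email) VALUES (?, ?)",
    ("Grace", "grace@example.com"),
)

# Rows written before the change simply have no value for the new column.
rows = conn.execute("SELECT name, email FROM customer ORDER BY id").fetchall()
```

The older row comes back with an empty `email`, which is the simplest form of the problem described here: the engine has to keep answering queries over data that was written under an earlier schema.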
There are also efforts underway to improve dynamic partitioning and indexing. For example, look at the investments in improving the full-text search capabilities of the database engines (SQL Server 2005's full-text search is considerably more performant than the previous version, and more work is being done for future releases) and in adding full-text support to XQuery.
However, adding such support has to be done carefully if one wants to continue to provide the performance, scalability and maintainability that database systems normally provide (or should provide). For example, it is much harder to compile a query into an efficient query plan if the compiler and optimizer have less (reliable) schema information available. Frequent dynamic recompiles of a query due to frequent schema changes may be too costly if done wrong. And for better or worse, database engines have become complex systems in which many components have to be touched to integrate a new feature properly.
So doing this right takes time, but the vendors have started doing it over the last five years.
He then also asked the open source community to build it, which led to an interesting post by Kris (and a reply post by Adam and a comment by Dare). Interesting to note that all three of them have worked with me at Microsoft at some time :-). And note the interesting parallel between Mr. Egelund's recent remarks and this exchange about contributing to the open-source world :-). My take: I think Kris and Dare have a point in that “web services” businesses prefer software to be free, because it lowers their cost and increases the number of people getting access (although, they should ask/provide for free hardware too, like the cell phone providers :-)). All that will be paid for will be services and servicing. OTOH, I don't think this part was the important aspect of Adam's post. These payment models are part of a set of competing business models with different trade-offs, which will look different based on what you are working on :-).
I recently posted a reply to an XML-DEV enquiry explaining why XQuery and XSLT are declarative languages and providing some advantages of declarative over procedural languages (and some of the costs). M. David Peterson liked it and posted it on his blog (so I don't have to do it myself :-)).
Note that indeed I am not Mike Champion, and I am sure he is glad that I am not :-). But I feel honored to have my postings be put into the same category as his.
Also, Edd Dumbill posted an article called XQuery's Niche on XML.com that summarized another (related) XML-DEV perma-thread this month. While I was too busy during that discussion to participate (I am building systems after all :-)), I would like to take this opportunity to point out the following:
- Both XQuery and XSLT are declarative, functional languages of equal capabilities for doing XML query and transforms (and both provide recursion).
- XSLT 2.0 has some additional capabilities such as non-XML results and multiple outputs; it provides some functionality in an easier syntax (grouping), at least when we compare these releases; it uses an XML syntax; and it provides an event-driven processing model.
- XQuery provides a more concise syntax targeted towards declarative, prescriptive formulations of queries. The XML syntax of XQueryX is not an end-user syntax.
- The XSLT event-driven model is well-suited for data-driven transformations (Norm's DocBook transformation example)
- XQuery is easier to statically analyse and optimize.
The distinction is to some extent similar to that between Datalog and SQL.
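To make the contrast between the query style and the event-driven template style from the list above a bit more tangible, here is a rough sketch in Python. It is neither XQuery nor XSLT, only an analogy built on the standard library's `xml.etree.ElementTree`; the sample document, the `TEMPLATES` table and the function names are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny DocBook-like sample document (made up for this sketch).
DOC = ET.fromstring(
    "<article><title>Tools</title>"
    "<para>Use <emphasis>the right</emphasis> tool.</para>"
    "<para>Know when.</para></article>"
)


def query_emphasis(root):
    """Query style: one path expression states what is wanted."""
    return [e.text for e in root.findall("./para/emphasis")]


# Template style: one rule per element name, chosen by the data itself.
TEMPLATES = {
    "article": lambda body: f"<html>{body}</html>",
    "title": lambda body: f"<h1>{body}</h1>",
    "para": lambda body: f"<p>{body}</p>",
    "emphasis": lambda body: f"<em>{body}</em>",
}


def apply_templates(node):
    """Walk the tree and let each element's tag pick the rule to apply."""
    parts = [node.text or ""]
    for child in node:
        parts.append(apply_templates(child))
        parts.append(child.tail or "")  # text that follows the child
    body = "".join(parts)
    # Elements without a rule pass their content through unchanged.
    # Text is not escaped here; this sketch is not a real serializer.
    template = TEMPLATES.get(node.tag, lambda b: b)
    return template(body)


emphasized = query_emphasis(DOC)
html = apply_templates(DOC)
```

In the first function the shape of the result is fixed by the expression, which is what makes that style easier to analyse and optimize; in the second the input document drives which rule fires next, which suits mixed-content transformations like the DocBook example.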
As to Dare's comment: “XQuery got so complex that now I personally would rather stick to XSLT. For the average developer I'd suggest using something like X#/Xen/C-Omega or E4X before I'd suggest XQuery.”
Based on that quote I wonder how many XQuery expressions Dare actually wrote. I don't think XQuery is more complex to use than XSLT if it is used for querying XML and reshaping the structure into summaries and slightly changed structures. And in its current form it is not intended to take the place of XSLT in reshaping a DocBook document into an XHTML rendering; there I prefer XSLT. And as a language, it is hardly more complex than XSLT, since XQuery is basically XPath 2.0 + order by + validate + construction - sibling/ancestor/namespace axes. And XPath 2.0 is part of XSLT 2.0, which provides all of the parts that XQuery provides...
I also would not recommend C-Omega to any average developer - at least not yet. For the simple reason that this kind of technology is still a research project and will have to deal with some of the same issues as XSLT and XQuery when it wants to support the XML data model in its fullness and has to bridge the procedural/declarative chasm. But I think it will be an interesting tool in a couple of years.
The take-away should be: Both XSLT and XQuery (and C-Omega and E4X) are tools in your toolbox. Use the right tool for the right job and understand when to use which one.