Mike Rorke, one of our testers and our representative to the XQuery testing taskforce (the one that is busy building the test cases the working group needs in order to leave the Candidate Recommendation phase of XQuery), has started to blog. He will focus on - what else? :-) - XQuery in SQL Server 2005 and has posted a first article on how to write some of the XQuery use case solutions in SQL Server 2005. Please give him a warm welcome and subscribe!
And yes, MichaelroSoft also hires people that are not called Michael :-).
My apologies for the week-long delay. You can get the WebCast recording and copies of the slides from the MSDN website (registration required). I would consider the language of my presentation to be Swiss-American-English and not, as the site indicates, American-English :-).
Here are the code snippets I used during my WebCast demo (please use in order):
And as always, the general small print disclaimer:
THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE.
In the following I would like to point out some interesting postings about XQuery.
First, my co-PM Shankar has posted some good advice about how to avoid inefficient multiple executions of value() methods in SQL Server.
Second, Kent has been warning about temp tables and schema collections. The reason for the observed behaviour is that temp tables live in the tempdb database. Since XML Schema Collections in SQL Server 2005 are scoped to a database (and cannot be referenced across database boundaries), one needs to make sure that the XML Schema Collection for the temp tables is created in tempdb.
Third, Jon Udell has been doing some interesting stuff using XQuery (although not using SQL Server 2005). Check it out.
Finally, Roger Jennings has written a good intro to the XML support in SQL Server 2005 (free registration is required).
It also looks like more from our team will start blogging soon. Stay tuned. :-)
A week ago (on April 4th), the W3C announced that XQuery 1.0, XPath 2.0, and XSLT 2.0 went into Last Call. Since this is the second Last Call, let me assure you that the intent of the working group is to make this the final one. The goal is still to get to Recommendation within a year. One way to keep this going is that we request that people submit their comments by May 13th (and another one is that we discourage members of the WG from submitting too many substantial comments).
Oleg has posted about the comparison of XQuery with PERL: Note that this was not sanctioned by the working group and the person who coined that comparison has received a fair number of email complaints from within and outside the working group. XQuery is really not comparable to PERL or vice versa :-).
In the following, I will list the documents, explain how to make comments, and identify a few issues in the drafts on which I personally would like to see people provide feedback.
List of relevant documents
The following are the documents that are relevant for the final recommendation and are in Last Call (I left out some less important documents such as the XML Query Use Cases or How to build a Tokenizer for XPath or XQuery):
In addition, new Full-Text documents have been published as well. They are not going into Last Call for now and are expected to trail after the XQuery 1.0/XPath 2.0 Recommendations:
How to file comments
This time around, the working groups decided to utilize the W3C's Bugzilla bug tracking database. Here is the official comment invite from the XQuery document:
Public Last Call comments on this document are invited. Comments on this document are due by 13 May 2005. Comments should be entered into the last-call issue tracking system for this specification (instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery mailing list, public-qt-comments@w3.org (archived at http://lists.w3.org/Archives/Public/public-qt-comments/), with “[XQuery]” at the beginning of the subject field.
Some issues to look out for
There has been quite a bit of simplification in XQuery since the last Last Call: we have simplified the construction/validation story (no more implicit validation), the sequence type syntax has been refactored and simplified along the lines I discussed in an earlier posting, the definition of effective Boolean value has been simplified and made easier to implement, etc. However, I think we need more feedback on the following features, which are new or, in my opinion, not well balanced:
- Specification of encoding on an XQuery document:
The latest XQuery draft allows you to specify the encoding of the query expression. While this sounds like a good idea, I fear that it will wreak havoc with how you embed XQuery into your programming environment. Most of the time, XQuery expressions are written in your favorite editing tool for your query language or programming language. These tools normally provide some default code page for the program code. I don't think that there is a need in such environments to specify or provide a choice of encodings for different XQuery expressions. Au contraire, allowing this will most likely lead to inconsistencies between what the programmer writes and what the programming environment uses. For example, if your programming environment lets you enter your code in UTF-16 and you say that the XQuery expression uses UTF-8 encoding, I am pretty sure that the query parser will get confused. In the worst case, the query may not even raise an error but produce garbage.
If you transport your XQuery over a network, having encoding information is useful. But that information should be provided by the transport level protocol and not inline in the query expression. What happens if a gateway switches the encodings? Does it have to go and change the XQuery expression to fix up the encoding indicator?
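To make the UTF-16 versus UTF-8 mismatch concrete, here is a minimal Python sketch. It is only a model of the situation, not of any real XQuery parser: the query text, the variable names, and the assumption that the host environment stores source text as UTF-16 (little endian) are all made up for illustration.

```python
# Hypothetical query whose inline declaration claims UTF-8.
query = 'xquery version "1.0" encoding "utf-8"; //city[name = "Zurich"]'

# Assumption for this sketch: the host environment hands over the source as UTF-16.
stored_bytes = query.encode("utf-16-le")

# A parser that trusts the inline declaration decodes those bytes as UTF-8.
# For this ASCII-only text the strict decode succeeds, so no error is raised.
misread = stored_bytes.decode("utf-8")

# A parser that uses the encoding supplied by the environment recovers the text.
correct = stored_bytes.decode("utf-16-le")

print(repr(misread[:12]))   # NUL characters interleaved with the real ones
print(misread == query)
print(correct == query)
```

In this sketch the wrongly decoded text is twice as long as the original and riddled with NUL characters, yet decoding never complains, which is the "no error but garbage" case.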
Hearing from users and implementers about this should be valuable feedback.
- UntypedAtomic handling of value comparisons (eq, le et al.):
XQuery provides two sets of atomic value comparison operators: the so-called general comparisons such as =, < etc., which provide existential semantics over sequences with Boolean logic, and the value comparisons such as eq, lt etc., which provide singleton-only comparisons with three-valued logic. The W3C working group (against my advice) has chosen to define different type casting rules for the two sets when untyped atomic values are compared against typed values.
For example, if I compare the untyped value “12” against the decimal value 42 using the general comparison operator <, the untyped value will be cast to xs:decimal and the comparison will occur and return true. However, if I use the value comparison lt, the untyped value will be cast to string regardless of its counterpart and the comparison will result in a type error!
The reason the working group cites is that the general comparison type casting rules for untyped values lead to non-transitive semantics. While this is correct, the value comparisons are not transitive on numeric comparisons anyway (due to the imprecision of binary floating point comparisons). And after normalization of the type cast, you can still claim that the operation is transitive, as long as the untyped value is cast to the same type for every comparison. My opinion is that forcing users to remember two sets of rules will be more confusing and will make the language more difficult to use. Thus, so far, SQL Server 2005 implements the same rules for value comparisons as for general comparisons regarding type casting of untyped values.
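The two casting rules can be modelled in a few lines of Python. This is a rough sketch of the rules as described above, not an XQuery implementation: the `Untyped` class and the `general_lt` and `value_lt` functions are hypothetical names, `Decimal` stands in for xs:decimal, and only the single-item case is covered.

```python
from decimal import Decimal


class Untyped(str):
    """Stand-in for an untyped atomic value: it carries only its lexical form."""


def general_lt(left, right):
    """Model of '<' on two single items: untyped is cast to the other operand's type."""
    if isinstance(left, Untyped) and isinstance(right, Decimal):
        left = Decimal(left)
    elif isinstance(right, Untyped) and isinstance(left, Decimal):
        right = Decimal(right)
    return left < right


def value_lt(left, right):
    """Model of 'lt': untyped is cast to string regardless of the other operand."""
    if isinstance(left, Untyped):
        left = str(left)
    if isinstance(right, Untyped):
        right = str(right)
    if type(left) is not type(right):
        # Mirrors the type error the draft prescribes for string versus decimal.
        raise TypeError(f"cannot compare {type(left).__name__} with {type(right).__name__}")
    return left < right


print(general_lt(Untyped("12"), Decimal(42)))
try:
    value_lt(Untyped("12"), Decimal(42))
except TypeError as err:
    print("type error:", err)
```

Under these assumptions the general comparison returns true for “12” against 42, while the value comparison ends in a type error because a string is being compared with a decimal.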
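The non-transitivity argument, and the normalization counter-argument, can be sketched in Python as well. Again this is only a model under assumptions: `Untyped` and `general_eq` are hypothetical names, `Decimal` stands in for a numeric type, and the rule that two untyped values are compared as strings is taken from the general comparison rules.

```python
from decimal import Decimal


class Untyped(str):
    """Stand-in for an untyped atomic value: it carries only its lexical form."""


def general_eq(a, b):
    """Model of '=' on two single items using the general comparison casts."""
    if isinstance(a, Untyped) and isinstance(b, Decimal):
        a = Decimal(a)          # untyped versus numeric: cast to the numeric type
    elif isinstance(b, Untyped) and isinstance(a, Decimal):
        b = Decimal(b)
    return a == b               # untyped versus untyped: compared as strings


x, y, z = Untyped("1.0"), Decimal(1), Untyped("1")

print(general_eq(x, y))   # compared as numbers
print(general_eq(y, z))   # compared as numbers
print(general_eq(x, z))   # compared as strings, so the chain breaks

# Normalizing both untyped values to the same type first restores the chain.
print(Decimal(x) == Decimal(z))
```

Here x equals y and y equals z, yet x does not equal z, because the last pair is compared as strings. Once every untyped value is cast to the same type before comparing, the three values are mutually equal again.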
I think feedback on this topic to the W3C (and myself regarding the SQL Server behaviour) will be useful.
- Copy namespace functionality:
XML namespace (and base URI) processing when copying subtrees and creating new trees is one of the more complex aspects of XQuery to understand and often, you need different options geared towards different expectations and performance requirements. The specification has added new functionality that I think needs to be carefully reviewed, so that we are sure that we have a usable solution.