November 2004 - Posts

Do NOT install SQL 2000 or MSDE on Windows XP via Remote Desktop

Not a real BI topic, but today I discovered that you CAN NOT install SQL 2000 or MSDE on Windows XP via Remote Desktop.

If you do that, setup hangs at a certain point, when it starts to execute some installation script on SQL Server. The MSDE setup display a timeout message after 2 minutes, while SQL Server setup hangs apparently forever.

If you find these two lines in ERRORLOG:

Warning: override, autoexec procedures skipped.
SQL Server is terminating due to 'stop' request from Service Control Manager.

and you are installing SQL Server 2000 or MSDE via Remote Desktop, probably you will be successful installing the software moving yourself in front of the incriminated machine using the real keyboard and a local session!

What have annoyed me is that there is a lack of diagnostics for that, expecially on MSDE2000A setup (who could be used to deploy MSDE with a db-based custom application). And, of course, no kb article on that. Fortunately, this thread on newsgroup helped me but I think it's an important scenario to address (at least in documentation!).

DISTINCT COUNT measures and NULL value

After several years of using Analysis Services and SQL Server 2000, today I stumbled in this problem.

I have a DISTINCT COUNT measure that have to measure how many different documents are present in a certain aggregation. Sometimes valid cells has no documents and the document field is NULL on the fact table (in reality it's a view that generate this degenerate dimension, anyway this doesn't really matter). The problem is that the NULL value is counted as a valid different value... so when you see 2 in the cube it could be 1 or 2 if you want to ignore NULL values.

Solution is not so light. Following this kb article (only available in korean language, I suppose - no english translation available but keywords are still in latin character so I guessed right...) I've found that a solution is to define two other measure in the cube with "regular" measure (as a best practice it's best to put distinct count measures each in a different cube, for performance reasons - then merge all measures in a virtual cube):

  • Rows_number - defined as SUM of expression "1"
  • NotNullRows_number - defined as COUNT of expression "field", where field is the same field you use as DISTINCT COUNT expression

In the cube with DISTINCT COUNT measure I defined only this measure:

  • Documents_null - defined as DISTINCT COUNT of expression "field", where field is the same as the COUNT measure

Then in the virtual cube you put all together creating a single calculated measure that evaluate the right number:

  • Documents - defined as calculated measure with expression "IIF( Rows_number > NotNullRows_number, Documents_null - 1, Documents_null )"

It works, but you are wasting three times the space originally needed for the DISTINCT COUNT measure.

Distinct count measures are a really bad beast in Analysis Services 2000. Unfortunately it seems that even the Yukon release will suffer from some actual limitations, like the performance penalty caused by an ORDER BY (with the field you used as DISTINCT COUNT measure) in the SELECT sent by Analysis Services to the source database when you have a DISTINCT COUNT measure. Try it on a (virtual) cube with several DISTINCT COUNT measures and a large fact table... a lot of RDBMS resource are needed while processing this cube.

UPDATE: thanks to a valuable feedback, it has to be noted that Analysis Services handle a NULL value like a 0 value in a DISTINCT COUNT measure. In my real-world scenario I had a DISTINCT COUNT measure on a dimension key that's an int identity, so it never has 0 as a valid value. Anyway you have to take care of this behaviour if you want to use this technique.

Snapshot Isolation Level

I just atteended a session where Ron Talmage showed the new Snapshot Isolation feature of SQL Server 2005. While it can be used to improve scalability of a traditional LOB application, it's really important to BI applications. Making a long story short, one of the available features is that you can take a "snapshot" of the database at a certain point in time and, without any lock (neither exclusive nor sharing), you can query the database viewing the same snapshot for a whole transaction (i.e. comprising several SELECT statement).

There are many examples of BI applications which can take advantage from this new feature:

  • DTS that feeds a DataMart while users are modifying the OLTP database
  • Several SQL queries for a single report that has to show consistent data

The only concern is that tempdb stores versioned rows, so it can become a bottleneck and grow in size if database update activity is very large while you are working on a snapshot. Anyway, there are a lot of scenario where snapshot isolation could be a great feature to improve scalability.

Reporting Services and Sharepoint: WebPart coming soon

Tom Rizzo yesterday at SQL Server Connections announced that Reporting Services SP2 is coming soom and it will bring us a SharePoint WebPart to display reports.

Great news.