Part I of IV: Two Examples Show that the New Drillthrough May Not Always Behave Like the Old Drillthrough
As noted in a previous post, there was usually no compelling reason for a fact table to have a primary key since there was no need for row uniqueness in the fact table. Since there may be readers who erroneously dispute this, an example is in order. Suppose a customer buys three identical hammers at a big orange home improvement retailer. The cashier swipes them one at a time and places them in the bag. The fact table that records these occurrences has a grain of day, store, cashier, and product, with a measure of quantity and price. Is it okay to write three rows into the fact table? Yes. Could ETL aggregate these identical facts into a single row? Also, yes. Does Analysis Services care? Not one whit, aside from some additional processing time to load three rows instead of one. But keeping three rows is a convenient way to count the number of scanner swipes, and these three rows will be perfectly identical.
Drillthrough on a cell might return all three identical rows to the client and this occurrence was easily factored into the data warehouse design. But this does not work for Analysis Services 2005 (“SSAS”) because drillthrough has been rearchitected. Two examples will show how fundamental these changes are; in fact, the changes are so pervasive that (a) MSAS cubes using DRILLTHROUGH will lose their DRILLTHROUGH upon migration to SSAS, and (b) the “new” DRILLTHROUGH returns a completely different answer in SSAS than it did in MSAS.
The first example provides evidence that drillthrough in SSAS can provide the same answer as the old drillthrough. It uses the AdventureWorksDW sample database for SQL Server 2005, and the Adventure Works DW sample database for SSAS. It focuses solely upon the Reseller Sales measure group, which has the following characteristics:
The first QA any good Analysis Services architect/developer performs is a rowcount check and top cell summary check. This first step in the check is to know what the “right” answer is. In this case, the right answer is 100% of the rows in FactResellerSales and the sum of the SalesAmount column. The following query provides the control totals:
SELECT
count(*) as ResellerSalesRowcount,
sum(SalesAmount) as ResellerSalesTotalSales
FROM
dbo.FactResellerSales
ResellerSalesRowcount = 60,855
ResellerSalesTotalSales = $80,450,596.9823
Any drillthrough operation should return 60,855 rows and total dollars of $80,450,596.98. The following MDX query executes a DRILLTHROUGH statement on the Adventure Works cube:
DRILLTHROUGH MAXROWS 100000
SELECT
[Measures].[Reseller Transaction Count] ON COLUMNS
FROM
[Adventure Works]
(Note: MAXROWS is set to 10,000 by default. Since this fact table has far more than 10,000 rows, MAXROWS is set to 100,000.)
This query also returns 60,855 rows, which is the correct answer. Do not be fooled into thinking that this one example proves that drillthrough in SSAS will always return the right answer.
The second example, which uses the aforementioned cash report, shows that drillthrough may not always return the correct answer. The cash report fact table has 20,473 rows in the fact table. But a drillthrough query, identical in form to the query used in the first example, sent back only 18,585 rows. The sum of the dollars returned by the drillthrough set was correct. So the dollars were right, but the rowcount was wrong.
After adding a rowcount measure to the measure group and running the query again, the results were plain. In SSAS, drillthrough does not return every row that went into the cell being drilled. In fact, it does not even query the data source. (http://msdn2.microsoft.com/en-us/library/ms345125.aspx) It returns one row for each detail cell in the cube that contributed to the cell upon which drillthrough occurs. So the 18,585 rows returned by the DRILLTHROUGH statement represent 18,585 distinct combinations of member values for the non-empty detail cells comprising the aggregate cell. Examination of the rowcount measure showed many fact rows with duplicate dimensionality, but not amounts (48 in one case, 41 in another, etc.) In all cases the dollar value returned for the drillthrough row was the sum of the fact rows that went into the detail cell represented by the drillthrough row. In other words, drillthrough reported that there were 48 fact rows that populated that detail cell, but it refused to return the individual rows. This was not acceptable since the cash report requires one row for each transaction, even if they are dimensionally identical.
The difference between MSAS and SSAS can be summarized with these two points:
- In MSAS, drillthrough queries the MSAS data source with an SQL statement and returns the fact rows underneath the cell upon which drillthrough acts.
- In SSAS, drillthrough does not query the SSAS data source. It queries the cube and returns one row for each detail cell underneath the cell upon which drillthrough acts. An obvious implication of this new behavior is that all information returned by drillthrough must reside solely with SSAS as UDM attributes.
When asked about this behavior, Microsoft’s response was, “This is by design. You can write all drillthrough queries as MDX now.” Darren Gosbel wrote a bug report about it. (http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=126175)
Is the latter part really true? Can you really write all drillthrough queries as MDX? Yes, but only if you modify your design to accommodate the “new” drillthrough, because Microsoft changed the rules of the game from MSAS to SSAS.
The next installment of this series will address the changes required in fact and dimension design to accommodate the new DRILLTHROUGH.