posted on Saturday, December 02, 2006 4:03 PM
by
drodabaugh
Integrating Analysis Services 2005 and Reporting Services 2005 with DRILLTHROUGH: Part II of IV
Part II of IV: Design Implications of the New Drillthrough
Part I of this series showed that using drillthrough in SSAS is hit-or-miss. Drilling through the top cell in the Adventure Works cube returned a rowcount identical to that of the FactResellerSales table underpinning the Reseller Sales measure group. Another example, from a real-world project, showed that this is not a reliable indicator of drillthrough’s behavior. In that example, drillthrough returned fewer rows than the fact table contained, even though the sum of dollars was correct. Microsoft has intentionally changed the behavior of drillthrough to return one row per nonempty detail cell that feeds the cell upon which drillthrough acts.
This is decidedly unsatisfactory if the goal is to show the actual fact rows that contributed to a cell, not a rowset imitated by detail cells. There are three possible ways to address this change.
-
Tell the users such detail is no longer available. This will surely go over like the proverbial lead zeppelin and does not show a servant's heart.
-
Force drillthrough behavior by using parameterized relational queries. This is not a satisfactory approach since it adds complexity to report authoring and assumes an available and capable relational data source.
-
Alter multidimenional design principles to accommodate the new reality. Use an MDX query with crossjoins or with the DRILLTHROUGH keyword to provide drillthrough.
The third option is the only viable option. Since drillthrough seems to return one row for each detail cell that comprises the target cell, there must be a way to ensure that each cell contains one and only one fact row. This suggestion will rankle many. As noted earlier, the point of using OLAP is to have a single cell for a value, regardless of how many fact rows correspond to that tuple. Stated another way, it was legitimate to have many fact rows per single detail cell. But for the “new” drillthrough to work properly, this relationship must be changed to one cube detail cell per one and only one fact row. If each detail cell has one and only one fact row, then drillthrough has no choice but to return one row per fact row. This can be codified with the following rules.
- For drillthrough in SSAS to return fact rows based only on UDM information, each detail cell must have one and only one fact row.
- For each detail cell to have one and only one fact row, each fact row must be dimensionally unique. No two rows may have the same combination of surrogate keys.
SSAS has two features to enable this design change.
- SSAS permits an optional primary key in the DSV fact table. This key can be a combination of existing dimensional columns, or it can be a value manufactured by ETL to ensure uniqueness.
- SSAS permits fact dimensions. Christian Wade has an excellent explanation of this nifty new feature at http://blogs.conchango.com/christianwade/archive/2005/04/07/1255.aspx.
This fact table primary key becomes a dimension attribute in the fact dimension, which allows a UDM attribute to enforce the one-to-one relationship between a detail cell and a fact row. It can be invisible, to be used only when called by a drillthrough query. Beware though, as it could exact a hefty toll on SSAS. Imagine having one attribute member for every single row in the fact table! Such an attribute would be prodigious. The multidimensional lattice could grow from “huge” to “unimaginably monstrous.” For those running on 32-bit Windows with the ongoing 3GB RAM limitation, this could be a huge problem even with the slick new dimension memory management in SSAS. Making the fact dimension a ROLAP dimension may be a good idea, and it may be possible to disable the attribute hierarchy for that attribute as well.
Part I of this series used Adventure Works DW as an example. This example is carried forward now, since it illustrates how these techniques are implemented in the Reseller Sales measure group. Inspection reveals the following:
- The FactResellerSales table in the AdventureWorksDW SQL database has a unique constraint on the SalesOrderNumber and SalesOrderLineNumber columns.
- The FactResellerSales table in the DSV reveals a compound primary key on the SalesOrderNumber and SalesOrderLineNumber columns.
- UDM contains a ROLAP dimension called Reseller Sales Order Details. The key attribute, called Reseller Sales Order, is a combination of SalesOrderNumber and SalesOrderLineNumber.
- The Reseller Sales measure group ties to the Reseller Sales Order Details dimension with a fact dimension relationship to the Reseller Sales Order attribute.
The upshot of all this is that each detail cell in the Reseller Sales measure group can have one and only one fact row.
Clearly, SSAS drillthrough creates design ripples all the way back through the DW/BI manufacturing pipeline. The following guidelines specify practices if drillthrough is desired.
-
Facts must have primary keys. ETL and the data pipeline must be able to manufacture meaningful unique identifiers for each fact row. The preferred unique identifier is a combination of meaningful attributes. Identity columns are less desirable since these keys need to be semi-permanent; that is, if the fact table is dropped and recreated with identical data, all of the values in the table primary keys should be as they were. Failure to enforce this will require complete reprocessing of the dimension holding the fact table’s primary key as an attribute, and every measure group using that dimension.
-
A PK must be designated on the fact table in the SSAS DSV.
-
A dimension and attribute must be created to pull this key into UDM. This dimension ideally has real business meaning (like an order number and detail line number that could never be duplicated). And regardless of its meaning, it could be huge. Suppose a fact table contains one billion rows. There will be literally one billion members populating the attribute. The new fact dimension in SSAS is tailor-made for this role. The dimension can be populated directly from the fact table without the need to build and populate separate dimension tables in the data source, or carry those objects forward in the DSV.
-
The storage mode must be carefully considered when building this dimension. As Christian Wade points out, SSAS now permits MOLAP partition storage with ROLAP dimensions. If this dimension is not used for anything other than drillthrough imitation, then ROLAP could be an ideal choice. Dimension processing times will be greatly reduced, and queries that do not directly use the dimension can ignore this dimension.
-
A decision must be made about how to author drillthrough queries. The DRILLTHROUGH keyword in MDX will still work. The alternative is to write a hefty crossjoin MDX query. The next part of this series will compare these two methods.
-
Limit drillthrough queries. This limitation takes two forms. First, account for drillthrough during design, by designating which measure groups will be use for drillthrough. Not all measure groups will be used for drillthrough, and in many cases, the data available to SSAS will not have a natural primary key for the fact data. If drillthrough is not required, then there is no mandate to create a unique fact key. Second, carefully control where and how drillthrough is executed. For small cubes, the query response time for any type of drillthrough query is immaterial. For cubes with huge data (e.g. – a billion fact rows), drillthrough on high-level cube cells would be disastrous, regardless of the method by which the query is answered. Ensure that MAXROWS is set properly.