August 2004 - Posts

Query Builder


One of the new things I noticed in Yukon’s Reporting Services is that the new data provider for Analysis Services. This data provider comes with its own query builder which is handy if you’re not comfortable writing your own MDX query or prefer using a UI to build a query. You can drag and drop objects (levels, members, measures, etc) into a results pane where you can automatically see the results of the change to the query. How fast you see the query results depends, of course, on the cube. If this auto-execute behavior slows you down too much, you can disable this feature and execute the query when you’ve finished building the query. For hard-core MDX developers, there is a query view which lets you enter an MDX query directly but you need to make sure that your query returns a flattened rowset. Reporting Services isn’t geared to work with multidimensional data in the reports.

When using the Analysis Services data source, the query gets added to the RDL both as a command text (like you saw in RS2000 for any query) and is parsed into an XML format called a Query Definition. Its purpose seems to be related to the query designer in BIDS. When I removed the Query Definition from the XML, I could still view the report in preview mode, but when I switched to the Data tab, my query was completely gone and I could no longer preview the report. This means that if you’re developing applications that generate RDL, you don’t need to worry about generating the Query Definition unless you plan to work with the generated report in BIDS.

One thing I really like – at least conceptually – is the ability to set up a report parameter from the MDX query builder, but beware this is still a work in progress. Essentially, you can specify a dimension as a parameter, which in turn sets up the report parameter and the dataset for you as well as modifies the MDX query to use parameter in the query. I’ve experience some odd behavior with this feature, but my understanding is that there will still be some changes to this in future builds. I’m looking forward to seeing how this winds up working.

--Stacia


Conditional branching in DTS


DTS 2005 adds a For Loop and a For Each Loop task. Very nice not to have to do spagetti hacks on precedence constraints. But I didn't see any kind of If task or Select Case task, and wasn't very happy about using Success as one branch and Failure as another branch of control flow. It took a little while (and some help from my art-school son) to track down how to do it, but it's pretty cool.

Now, in addition to a simple Success/Failure/Completion options, you can use AND or OR logic to add an expression . An easy expression is to test the value of a variable. So you can have one flow that is "Success AND @BranchVariable = 1" and another that is "Success AND @BranchVariable = 2" and so forth. You can still have a Failure track too. It's not really a conditional branch, because the expressions are independent, but that ultimately adds to the flexibility.

--Reed


Where’s the beef? (eh, permissions?), Part 2


This entry continues the discussion started in “Where’s the beef? (eh, permissions?), Part 1”.

Instead of removing the permissions from each object’s XML file, I decided I would just recreate the roles. I created a new Sales role and a new HR role. When I clicked on the Cube Access tab in the Role Designer, I found out that the Sales role had Read permission on the HR cube and no permission on the Sales cube. The HR role had Read permission on the Sales cube and no permission on the HR cube.

How did this happen? In my original database I first created the HR role and AS 2005 behind the scenes assigned it Object ID = Role. I then created the Sales role and it was assigned Object ID = Role 1. Then when permissions were granted to objects in the original database, the permissions referenced the roles by their Object IDs. You guessed it, in my new database I created the Sales role first and then the HR role, so when AS 2005 assigned the Object IDs they were exactly opposite to the IDs that were assigned in my original database.

Lesson learned? Objects are portable between databases, just be REALLY careful about the permissions granted/denied to each object.

- Scott


Where’s the beef? (eh, permissions?), Part 1


I created roles, added users and groups to the roles, and then using the Role Designer granted permissions to the roles. For curiosity’s sake, I then decided to look at the roles’ XML. When I viewed the .role XML files I found the users and groups, but no permissions!

Where did they go? In AS 2005 a permission is a property of the object to which the permission is granted. My role named HR was granted Read permission to the HRCube. The code granting this permission is contained in the cube’s XML file. My role named Sales was denied permission to certain attribute members in a dimension. The code denying access to these members is contained in the dimension’s .dim XML file. 

This makes objects less portable. I created a new database and then added all the objects from the database described above, except the roles. When I tried to build my new database I got the following error: “The Role with ID = ‘Role 1’ doesn’t exist in the collection.” 

Why did I get this error? Because the XML code for the objects I added grant permissions to roles that do not exist in the new database. No problem, I thought, I’ll just remove the permissions from the objects. Unfortunately, because I have no roles in the database, I have no access to the role designer. I have to directly edit each object’s XML file in order to remove the permissions. 

Lesson learned? Objects are portable between databases, just be careful about the permissions granted/denied to each object.

- Scott


Introducing New Blog Posters


This is a Hitachi Consulting blog, not a Reed Jacobson blog. Even though I’ve been posting all the blogs, but we have other people here who are doing Yukon work and will start posting cool things they encounter as well. Here are some of the people you may be hearing from:

- Dave DuVarney was one of the authors of the Wrox Press book on Reporting Services.

- Stacia Misner has just finished authoring the new Reporting Services Step By Step book for Microsoft Press.

- Scott Cameron is working on a Yukon implementation project at a large “real world” client.

- Mark Dreessen is doing the “in the trenches” C# work on the sample apps for the Ascend course.

Watch for great posts from them.

- Reed


Management and Design Environment Oddities


As they say, One man’s fish is another man’s poisson, so my oddity might be your delight, but here are a few things about the SQL Management Studio (SMS) and BI Development Studio (BIDS) environment separation that seem a bit odd to me.

Analysis Services. SMS prevents you from changing the design of a live Analysis Services database. (You could it Beta 1, but now SMS is a “management” environment.) But you can still make changes to a live database, you just have to connect to it through BIDS. So BIDS now has three ways to open an existing project: 1) open the project files (a true development environment), 2) import to a project from a deployed database (useful if you haven’t been using source control and screw up your working copy), or 3) attach to a live database (defeating the separation of production and development).

Reporting Services. Maybe I just haven’t yet discovered how to do it, but I can’t find a way in SMS to move a report to a new folder or to export a report to RDL. Also, even though Reporting Services tries to let you script actions, sometimes the script ends up empty, and sometimes it’s unusable—for example the script of a report upload includes a byte array of the report, rather than the file location.

DTS Server. In order to register the DTS Server, you must first start the DTS Service which, by default, is disabled. The philosophy is that potentially optional services should only be activated by people who can use them. By that logic why not disable the Analysis Service, or even the SQL services. Also, you can’t execute a package from SMS (although you can from BIDS); to execute a package, you have to run a separate utility, and DTS doesn’t even try to let you script actions from SMS.

General. And, of course, there’s the issue of having BIDS as a separate environment from Visual Studio—if you’ve got a development solution, why shouldn’t it include the .Net assemblies as well.

It will be interesting to watch the environments continue to evolve as we move toward product release.

- Reed


Combined but Separate Environments


One aspect of Yukon that has been interesting to watch is the evolution of the development and management tools. As interim builds and betas have progressed, the vision has become more fully materialized. In Beta 2, it’s possible to see all the pieces together, and it looks good.

For the BI components, there are two environments—SQL Management Studio for managing the production environment, and BI Development Studio for supporting the development environment. In a single development solution, you can include DTS projects, Analysis Services projects, and Reporting Services projects. The solution actually starts to look like a solution. In SQL Management Studio, the Report Server is now just another server along with the SQL Server and the Analysis Server. All the management options for Reporting Services that used to require Report Manager (or custom code) are now right there next to the management options for SQL Server and Analysis Server. There’s even a DTS Server that helps you monitor (and stop!) running jobs, as well as manage the storage of packages.

The bottom line is that there are now two separate and distinct environments for development and management, but all the BI components are available in a nicely parallel manner in each.

Lest I sound like I’m delivering a eulogy, there are still some quirks, rough edges, and confusing places. Tomorrow I’ll mention a few places I have found odd.


Multiple Grains in a Hierarchy


In AS2K, when you process a multi-level dimension, the query includes the columns needed to support all the levels. This means that even if the dimension comes from multiple, snowflaked dimension tables, the result will only include higher-level members where lower level members exist. For example, if you have a Time dimension with a Year table and a Month table, but have months for only 2004, then even though the Year table has values from 1994 to 2004, only the members that make it through the join appear in the dimension.

AS2005 is very different. Each attribute gets its own query. This means that you can easily have 10 years of Year members, but only 1 year of Month members. This doesn’t violate relational integrity, because you can always have extra members on the Primary Key side of a relationship.

When I first noticed this behavior, I thought it was very strange—if not an outright bug. But after more thought, it is a very nice capability. Suppose you want to show the current year orders with daily detail, but want to shrink your data warehouse by aggregating all prior years to the month level. Just create two fact tables, one at the daily grain for the current year, and one at the monthly grain for prior years. Create two measure groups, one for each grain. Then create a calculated member that adds the two together. (Because the data is partitioned over time, a simple sum doesn’t double-count anything.) Now you can seamlessly display the whole ten years of data at the year or month level, but the only the current year at the daily level.

In AS2K, you could always assign the first-day-of-the-month ID to represent months, but it’s very difficult to create a calculated member that properly nulls out the daily values when the first day of the month is not a valid day. With the new multi-grain approach, the values only get stored where they really belong.


Real Hierarchies Interact in Unusual Ways


In an earlier post, I got effusive about real hierarchies in AS2005. Multiple hierarchies in a dimension have a lot of very good consequences, but there are also some implications that are, at least in the current build—how shall we say it?—counter-intuitive.

Imagine a customer dimension, with a Geography multi-level hierarchy consisting of Country, State, and City levels, and also with separate hierarchies for Country, State, and City. Now imagine a report that has the Geography hierarchy on rows—expanded to show all levels—and the Country hierarchy as a slicer (page filter). What do you expect to have happen to the rows for the Mexico portion of the Geography hierarchy when you slice by Country = USA?

I half-expected all the Mexico rows to disappear (because of the new interaction between hierarchies in a dimension), but they don’t. OK. I can live with that. But I really expected all the Mexico rows to be empty, because the slicer limits the data to USA. They are not empty. In fact, the USA slicer is completely ignored. In fact, even if the rows axis shows only the City level of the Geography hierarchy (so that you can’t see the country on the rows, the USA slicer is still ignored.)

I do understand the logic behind this. When a row shows Guadalajara, that member implies that the country is Mexico. When MDX figures out the value for a cell, it needs one member from each hierarchy. By the time it gets to the slicer, the Country has already been filled in.

The MDX behavior hasn’t changed from previous versions—and it shouldn’t. In MDX, an “inner” member always overrides an “outer” member. It has to do this, or set functions wouldn’t work. The only thing that has changed is that you can now put members from the same dimension on two different axes, provided that they are in different hierarchies. (You could always put different hierarchies of the same dimension on different axes, but in AS2K different hierarchies were really different dimensions, so that doesn’t count.)

My vote is for AS2005 to disallow slicers with attributes that are implied by levels on another axis. That would be consistent with AS2K behavior and would avoid the counter-intuitive results. But maybe the brain trust in Redmond will come up with a better solution by the time the product ships.


Hierarchies Become Real


In Analysis Services 2000, there was no real distinction between a dimension and a hierarchy. The notion of "hierarchy" was simply a syntactic convention to group dimensions by a prefix. The two "hierarchies" Time.Calendar and Time.Fiscal, were really two independent dimensions.

is completely different. A dimension consists of a primary (granularity) key, plus as many other "attributes" as you want to add. Each attribute can (and, by default, does) form its own little mini-hierarchy. For example, a customer dimension with Age, Gender, and Nationality attributes would, by default, contain separate hierarchies for Age, Gender, and Nationality. In AS2K, you could manually create these three "hierarchies", but they would really be separate dimensions. The difference becomes obvious if you nest the hierarchies on a single axis.

In AS2K, if you nest Age, Gender, and Nationality, you would get a list with all nationalities repeated for all genders (uh, I mean, for both genders), and that combination repeated for all age brackets. Even if you didn't have any 83-year-old female customers in Lithuania, that combination would still appear; the row would just be empty. You could use NON EMPTY on the rows to get rid of the non-existent combination, but for large, sparse, combinations, assembling all the possibilities and then discarding the ones with no data is a very slow operation.

In AS2005, if you nest Age, Gender, and Nationality—which now really are hierarchies of a single dimension—you get a list of only those combinations for which a real customer exists. So if you still don't have any 83-year-old female Lithuanian customers, that combination will not even show up on the axis—whether you use NON EMPTY or not. Conversely, if you do have a 14-year-old Lithuanian boy as a customer who has never actually bought anything, that combination does show up, but the data cells are empty.

In AS2K, an empty row can appear for two different reasons: 1) the combination of attributes does not exist in the dimension, or 2) the combination of attributes exists in the dimension, but does not appear in the current slice of the facts. In AS2005, only the latter will produce an empty row; the former will never happen. The implication may seem subtle, but it can have a profound impact on query performance—or, more precisely, on the ability to create more flexible reporting structures that still give superb performance.


Reporting Services Self-Contained Data Source


I saw an interesting question/answer exchange on Reporting Services that I'd like to pass along

Question: Is there a way to have a self-contained data source in a report--in other words, to write the RDL in such a way that the data itself is serialized into the report definition?

Answer: Not out of the box, but because Reporting Services supports custom data processing extensions, you could write a data processing extension that reads the Data Source property as an XML string, parsing the string and passing it directly to the report.

Commentary: I could see using this approach as a way of distributing specialized reports to remote sites. The RDL would still need to be deployed to a server, but the report could be rendered by the recipient in a variety of formats, which would not be possible if the finished report were simply sent as a PDF document or even as an XML file.

One of the samples that ships with Reporting Services (2000, and presumably with 2005 edition as well) is a data processing extension that reads an XML file. You could adapt that sample to read the XML directly from the connection string.


“New” Descendants functionality is really Old


This is a bit embarrassing, but I just found a cool “new” feature in Yukon that turns out to have been in AS2K all along. (The fact that it was not documented is no excuse for not knowing it.)

The Descendants function can take a Set (as well as a Member) as its first argument. So, for example, if you want all the leaf level Road Bikes and Touring Bikes, you can just use Descendants({[Road],[Touring]},,LEAVES). It’s very handy.

And you don’t even have to wait for Yukon to get it. (You just have to wait for Yukon to get it in BOL.)


Subcubes and Scripts


The concept of a “subcube” exists in AS2K. It appears in the definition of a Calculated Cell. There are probably a couple dozen people who have created Calculated Cells.

In an effort to make the extremely useful functionality of Calculated Cells more widely used, AS2005 introduces the concept of a script. A script is essentially a series of Calculated Cell definitions that look like procedural instructions. So scripts use subcubes a lot.

A subcube is basically a way of specifying one or more cells in the cube. In a script, you assign a value or a formula to a subcube. If the subcube is a single cell, you assign the value or formula to that one cell--sort of like Excel. But a subcube can be a lot bigger than a single cell: one popular subcube might include one member of one dimension (for example, a single measure), and all members of all other dimensions. When you assign a formula to a large subcube, it simply means that whenever a query asks for a cell that is in that subcube, it uses that formula to calculate the value. Again, it's like Excel, except that instead of copying a formula into 4000 cells, you specify one big range (the subcube) and assign the formula once to the whole thing.