OLAP (RSS)

Analysis Services and all about multidimensional databases, client and server side

Wrong calculation for unary operator with many-to-many relationships

I just posted this bug on connect.microsoft.com - there is an issue when you use unary operator in a parent-child dimension combined with a many-to-many relationship.

The problem is that the '+' operator on non-leaf nodes seems to not consider the children value and bypass the intermediate measure group. Everything works well with the '-' operator. It sounds like an optimization that has a side-effect...

I found the bug thanks to a post on www.sqlbi.eu forum, that is starting to get feedback from people who are using the models described in the Many-to-Many Revolution paper.

SQL Server 2005 SP2 - CTP available

Here is available the SP2 CTP for SQL Server 2005.

The what's new document explain that there is an improvement in many-to-many dimension relationships calculations. I will try to evaluate the performance improvement, but unfortunately I'm in the middle of a busy timeframe (next week I will be in Seattle for SQL Pass - if someone want to exchange some opinion face-to-face here, let's contact me directly).

If someone get hands-on experience on many-to-many performance changes, please let me know.

SSAS suggestions for Katmai: perspectives and drillthrough

I just added two suggestion items to my feedback for SSAS2005.

The first suggestion is about the perspectives: it would be useful to select member filters and default members on each single perspective. Another important thing could be to be able to rename a dimension/attribute/measure into a perspectve (then you need to add perspective to translation system, so it would be good to translate only the changed descriptions).

The second suggestion is about drillthrough: today you can drillthrough only the real measures. It would be better to drillthrough also the calculated measures: to do that, it would be useful an MDX expression that defines a subcube (of "real" cells and not calculated ones) to be used for drillthrough in place of the original selection/cell.

If you agree, vote and give feedback to these suggestions!

Using MSOLAP as a linked server

I just faced the problem to connect an Analysis Services (2000 or 2005) to a SQL Server 2005 database using a OPENROWSET statement. If you face an "access denied" error (precisely, "The OLE DB provider "MSOLAP" for linked server "SERVERNAME" reported an error. Access denied.") the solution is to enable the "Allow inprocess" flag in MSOLAP Provider Options.

You can modify provider option through SQL Server Management Studio, Server Objects / Linked Servers / Providers / MSOLAP, right-click and edit Properties.

When you get a result for an MDX query, the resulting column names are composed by dimension+levels+property name, so if you would like a cleaner naming you can use the WITH statement to define column names in a fast way. E.G.

WITH A (Customer, Product, Year, Sales )
SELECT * FROM OPENROWSET( 'MSOLAP', 'DATASOURCE=localhost; Initial Catalog=FoodMart 2000;',
'...[place here your MDX query]...' )

DSV and Views: how to decouple the physical tables from UDM

I just answered to a newsgroup question about the use of views as fact tables to populate cubes.

I am a strong supporter of the need to build a star schema before to build a UDM, while I understand that in some circumstances (few data, fast prototypes) you can think to build a similar model through views on the source database, without an ETL in the middle.

That said, when you have a regular and well defined star schema, you still may have doubts about the use of views (in the database) and named queries (in the DSV) to decouple the UDM model from physical tables. The reason for doing that is to simplify the maintainance of the solution. After years of experience (AS2000 had similar scenario) and tests in many combinations of the factors, I have these guidelines.

  1. Unless you have only a few Mb of database, it is ALWAYS BETTER to create your data mart to feed dimensions and cubes
  2. When you present dimension and fact tables to SSAS, it is better to create a VIEW on the database that is easy to maintain if you have to implement some logic on the query to present data to SSAS (for example, a join to decode some fields - it should not be the standard case, but it happens in the lifetime of a project when you need to add some attribute and temporarily want to show data that are still not implemented well in the denormalized star schema) - the ideal case is that all these views are SELECT * FROM table, because you only decouple the physical fact table from the logical view for SSAS
  3. Use named query on DSV only to apply some cosmetic change to data just to correctly populate dimensions and cubes. For example, a lastname + ' ' + firstname expression is right to use here, while I can't see many reasons to use a WHERE or a JOIN, because it imply some transformation logic that is better to place into the VIEW (that could be used by  other SQL queries, for example for Reporting Services).

What is your opinion? Feedback and comments would be very appreciated.

 

The many-to-many revolution: paper finally released!

After a month of editing, I finally released my paper titled "The many-to-many revolution".

I copied in this post the introduction of the 84-page paper that is published (for free!) on a dedicated dimensional modeling page of SQLBI.EU web site.

I would like to receive feedbacks about the paper. You can use mail, comments or the support forum on SQLBI.EU.

The many-to-many revolution - Introduction

Analysis Services 2005 (SSAS 2005) introduced the capability to handle many-to-many relationships between dimensions. At a first glance, you may tend to underscore the importance of this feature: after all, Analysis Services 2000 and many other OLAP engines do not offer many-to-many relationships. Yet, its lack did not limit their adoption and, apparently, only a few businesses really require it. However, as this paper shows, the UDM (Unified Dimensional Model) can leverage many-to-many relationships helping you to present data from different perspectives that are not feasible with a traditional star schema. This opens a brand new world of opportunities that transcends the limits of traditional OLAP.

We will explore many different uses of many-to-many relationships that give us more choices to model effectively business needs, including:

  • Classical many-to-many
  • Cascading many-to-many
  • Survey
  • Distinct Count
  • Multiple Groups
  • Cross-Time
  • Transition Matrix
  • Multiple Hierarchies

Although you do not have to do so, I recommend you to read the models in the order presented above, because often each one builds upon the previous models.

Each model has a brief introduction, followed by a business scenario that may benefit of its use and an explanation of its implementation. Each model uses only the minimal set of dimensions that are necessary to explain the concept behind it and a small dataset that demonstrates the underlying behavior.

Only the Distinct Count scenario contains a section discussing the impact on performance. Since the considerations presented there may be applied to other many-to-many relationship uses, I recommend you read this scenario if you are interested in performance evaluations.

An important warning has to be made if you are going to use VisualTotals MDX function (directly or through an OLAP browser): visual totals apply only to one level at a time with many-to-many dimensions. In the Links section, you will find a link to a document written by Richard Tkachuk that explains this limitation.

Download the complete paper and demo here.

Refresh of a linked dimension in SSAS

I started to evaluate the use of linked dimension across different database on the same server.

I encountered the first issue after a few minutes: when you link a dimension, VS editor create a dimension file that copies the structure of the dimension, without details on binding to data source (which are useless in this case). If I add an attribute to the original dimension, I am unable to use the linked dimension: the worst part is that I need to remove and to recreate the linked dimension on the project, losing all references to the dimension in all measure groups. At this point I need to recreate dimensions in cubes and dimension relationships for all measure groups.

It would really useful a "refresh" function that do automatically what I do to workaround the problem (I use the Customer dimension as an example):

  1. Preparation. Create the dimension Customer on database A (solution A)
  2. Deploy and process database A
  3. Create the linked dimension Customer on database B (that points to A.Customer) (solution B)
  4. Create a cube on B that use Customer
  5. Deploy and process database B
  6. Create a copy of solution B and name it database C (and solution C) - we will use it as a tool without deploying it to SSAS
  7. Change the original dimension. Add an attribute (for example Country) to the dimension Customer on database A (solution A)
  8. Deploy and process database A
  9. Make the change on linked dimension. Delete dimension Customer from database C
  10. Recreate linked dimension Customer on solution C that points to database A
  11. Open Customer.dim on solution C with the View Code function, Select All and Copy to Clipboard
  12. Open Customer.dim on solution B with the View Code function, Select All and Paste from Clipboard
  13. (as an alternative to 11 and 12, you can copy the Customer.dim file from solution C to solution B)
  14. Deploy and process database B

This workaround works if you only changed attributes that are not directly referred in the cube file: it happens when you use an attribute as a reference to another dimension or to a measure group. Most of the time dimension modifications does not touch those attributes and this workaround allows you to avoid the tedious and error-prone work of recreating dimension references to measure groups in the cube.

I would like to get feedback from people that already user linked dimension in a working (and changing!) environment.

Many-to-many dimensional modeling paper: looking for reviews

I'm approaching the review phase for a paper I wrote about advanced dimensional modeling with Analysis Services 2005 using many-to-many dimension relationships.

Currently the paper has around 80 page (with many figures!) and covers 2 base techniques (base and cascading many-to-many relationship) and 6 design patterns to model different business scenarios:

  • Survey
  • Distinct Count
  • Multiple Groups
  • Cross-Time
  • Transition Matrix
  • Multiple Hierarchies

I hope to publish the paper on SQLBI.EU in a few weeks. In the meantime I am looking for peer reviewers that already have some knowledge and experience of the use of many-to-many relationships in a non-traditional way. If you are interested, please contact me using the contact link in this page.

UPDATE: PDF and demo are now available here.

MDX quality improvement in Excel 2007

I just had to profile MDX queries sent to SSAS from some client. Office Web Components (that are used in the cube browser of Management Studio) in many circumstances send a lot of MDX queries, increasing roundtrip between client and server, resulting in bad performance in tables with hundreds of rows (crossjoin between 3 attributes on rows and one measure on column).

I was very worried, until I saw the clean and beautiful MDX query that Excel 2007 send to Analysis Services 2005 to get the same results: great performance, a hand-written MDX query has the same performance (WARNING: this is not a benchmark but a small test on a very limited set of queries).

This week the Beta 2 of Excel 2007 could help me to make some customer happy...

SSAS Dimension process fails with unexpected error MDDiscretizer::GetBucketForValue

The use of DiscretizationMethod=Clusters can cause a process error that is not so clear:

Internal error: Unexpected error. File 'mddiscretizer.cpp', row 1532, function 'MDDiscretizer::GetBucketForValue'.

This happens because you have an attribute with few distinct values to do a discretization in required clusters. To workaround the problem you can disable discretization of the incriminated attribute (you could have only one in the dimension, otherwise some SELECT COUNT( DISTINCT fieldname ) FROM Dimension could help you in the search).

Even if it would be preferable to process the dimension with a smaller number of clusters without blocking errors, it would be necessary at least to have a better error message that indicate the problem in discretization.

The problem affects both SSAS RTM and SSAS SP1. I already filed a bug in MSDN Product Feedback Center, but blogging it could help other unlucky developers.

Memory configuration (no pitfalls!) in SSAS

I recently had some problem with memory configuration of SQL Server Analysis Services 2005 and I'd like to share lessons I got. This post is a replacement for my previous post, that I striked to keep history of some bad assumptions I initially made.

SSAS have some setting that controls the way memory is managed by SSAS2005. Default values probably doesn't take care of concurrent applications and you could want to change them on a server running other services. Settings I'm talking about are defined as parameters of a single instance of SSAS: please note that MS suggest to not touch these settings...

Memory / LowMemoryLimit: it defines the value of physical memory that defines the point at which server is low on memory. Values greater than 100 are absolute values (scientific notation is supported in the config file, msmdsrv.ini). Values under 100 are calculated as a percentage of memory: reference value of memory (that corresponds to 100) is calculated as the minimum between the physical memory and the virtual memory available for a process. We have several cases (for semplicity I show the memory limit calculation for an hypotetic 100 value of the parameter): a 8Gb RAM 32bit Windows has a 2Gb limit (or 3Gb if you have /3Gb option enabled); a 8Gb RAM 64bit Windows has a 8Gb limit; a 1Gb RAM 32bit Windows has a 1Gb limit.

Default is 75 (75%) that is good most of the time, but if you have other processes on the same machine, you could prefer to lower this setting. If you want to know what the actual and running absolute limit is, you can get the exact running value reading the MSAS 2005 Memory / Memory Limit High KB performance counter.

Memory / TotalMemoryLimit: it's a percentage of physical memory that the server will use at maximum. Behaviour and setting considerations are analogous to LowMemoryLimi..

I had notice that in some cases PSS suggested to lower these settings to avoid an Out Of Memory error of SSAS 2005. After my previous wrong assumptions, Eric J of Microsoft helped me to understand what BOL still doesn't explain well. Eric wrote some other hints that we can share and index for future references.

Eric: You should assume they AS2000 and SSAS2005 are completely different products.  There are some similarities in settings but many differences.

Eric: Out of memory error could have been caused by many things, but basically memory could not be allocated from the OS.  Changing config values also impacts the point when the cleaner thread removes "shrinkable" memory.  Watch the perfmon counter "MSAS 2005:Memory\Cleaner Memory Shrinkable KB", and nonshrinkable.

Eric: To observe effects while you change LowMemoryLimit settings you can:

1. Open msmdsrv.ini in notepad, change value, and save file.
2. Open perfmon, report setting, view "MSAS 2005:Memory\Memory Limit High KB"
3. Observe the above listed value.

I already suggested to Eric that BOL could be improved on this topic (existing information here are not so clear).

Memory configuration pitfalls in SSAS

I recently discovered a few pitfalls on memory configuration of SQL Server Analysis Services 2005: the main problem is that the default configuration (written by setup) doesn't consider the actual physical resources (mainly RAM) of your server and this can cause some trouble. Settings I'm talking about are defined as parameters of a single instance of SSAS: please note that MS suggest to not touch these settings...

UPDATE: I completely changed this post due to incorrect information I wrote. Please read correct information here.

Memory / LowMemoryLimit: it's a percentage of physical memory that defines the point at which server is low on memory. Default is 75 (75%) that's is not good in two cases. First, if you have other processes on the same machine, you could prefer to lower this setting. Second, really important: if you have 8Gb of RAM and a 32-bit operating system, than the limit is defined by default at 6Gb, a measure that a 32-bit process can't reach.. so SSAS will continue to allocate memory until an out of memory is raised by the operating system. A big problem! So if you have a similar configuration, consider to lower this setting to 18 (around 1.4Gb) or a bit higher if you enable the /3Gb option. To find the right number try to calculate the percentage of RAM memory necessary to have 1.4Gb as result.

Memory / TotalMemoryLimit: it's a percentage of physical memory that the server will use at maximum. Problems are analogous to LowMemoryLimit: in a 32bit server with more than 2Gb of RAM you would definitely lower this number. In a 8Gb server I put the value of 20 (1.6Gb). To find the right number try to calculate the percentage of RAM memory necessary to have 1.4Gb as result.

I've been conservative in my calculations. A /3Gb option should allow higher numbers than mine. A 64bit server is immune of this kind of miscalculations...

UPDATE: I received this comment directly from Eric J. of Microsoft. Your recent post regarding AS2005 memory usage is incorrect.  Memory values are calculated as percentage of min( virtual, physical ), or if greater than 100.0 an absolute value is used (scientific notation is supported in the config file).  The defaults have been found to work well in a variety of scenarios, but there are cases where lower or higher values would work better.  AS2005 administrators may want to monitor the perfmon counters in MSAS2005 Memory.

I need to further investigate on it. More info coming soon, I hope.

Excel 2003 OLAP customization with VSTO

Thanks to Romeo Pruno, I just discovered a sample of Excel 2003 customization to access OLAP data through task panes. While this article deserves to be read, I think that client customization is a sort of "last resort", to be applied only for particular customization. I prefer to use clients that are functional-complete by itself and that allows user to customize reports and queries without writing code. For this reason I think this is a good introduction for people who want to build a customized solution, but is not a way to solve the lack of functionality of Excel 2003 (like Search members), many of them will be present into Excel 12.

Post SP4 hotfixes for Analysis Services 2000

I just discovered that some weeks ago Microsoft released a cumulative upgrades of hotfixes for Analysis Services 2000 released after SP4. I have to say that I noted AS2000 is less stable after SP4 intallation (I can say that after months of use) and I randomly got event ID 132 that appears to be resolved by SP4 but I've never seen before SP4 (btw, it seems that a pretty invasive backup program lock a file that should be deleted when this happens, so it should be not a SP4 responsibility). I have no time to open a call to Microsoft Support (it's a very expensive process) - I hope Microsoft will improve the way a user can submit a bug (service shutdown seems to be a good candidate) in a production system...

An equivalent post-SP4 cumulative upgrades of hotfixes is also available for the relational part of SQL Server 2000.

SSAS and role deployment

A few days ago I have encountered a system error deploying a SSAS project that involves role member. The error message is too generic and it hides the real cause of the problem that is very hard to find.

The problem itself is really simple: if you try to deploy a SSAS project with a role object which contains a member name (in collection members) that doesn't exists (or can't be resolved) by SSAS server.

As you can imagine, the common case that really could bring you made is when you successfully deploy a project into a development server but you fail to deploy the same database into a production server. I'd like to share the repro case given by a real-world situation that produced this error.

  1. Create a SSAS project and name it BugRoles (we will deploy it into BugRoles database)
  2. Create a role TestRole into a SSAS cube
  3. Assign a user (DOMAIN\UserA) to role TestRole
  4. Deploy SSAS project to the SSAS server
  5. Rename DOMAIN\UserA into DOMAIN\UserB on domain controller
  6. If you modify dimensions or cubes deploy and then deploy the project overwriting the existing database you created at step 3, it still works
  7. Now if you delete BugRoles database from SSAS server and then deploy the project, you get a generic system error that doesn't help you to understand that the wrong name DOMAIN\UserA into role TestRole is the real problem.

It's interesting to note that furter deployments at step 6 works well because when the role is deployed, the member user name is converted into a SID (security ID) that is invariant from the login name: when you rename a user into Windows, his SID remains the same. A role (as any other SSAS object) is deployed to a server only if it has been changed from the previous deployment. For this reason only a change to the role object into the project or a deletion of the role (or of the whole database) from the server cause a new deployment of the role, that fails because the role member name can't be resolved by SSAS server.

I already posted this bug through official channels, but since I lost 2 hours to identify the reason of a deployment error in this situaton, I think that sharing this knowledge could help someone else until MS will change the error message for this condition.

Christmas gift: Excel AddIn 1.5 for Analysis Services 2005

Microsoft released a new release of Excel AddIn for SQL Server Analysis Services: it's the version 1.5 which is compatible with Analysis Services 2005, compatible in the sense that it fully navigates attributes and hierarchies in a meaningful way. Unfortunately, no KPI support (at least I can't find it).

KpiUtil

Microsoft release KpiUtil, a tool that should facilitate the move of KPIs between Business Scorecard Manager 2005 and Analysis Services 2005. It could be useful...

Report Builder Model with UDM

Today I lost a lot of time digging on this problem so a I hope post about this will save time of other developers!

I wanted to import the UDM into a model for Report Builder. You have to do these steps:

  • go into SQL Server Management Studio
  • connect to Reporting Services
  • define a data source to your Analysis Services database (remember, use "Data Source=SERVERNAME\INSTANCE;Catalog=DATABASENAME" connection string using "Microsoft SQL Server Analysis Services" as a Data Source Type)
  • right-click the data source and choose "Generate Model..." menu item

At this point if you live in an english-language country, probably you're done. But if you live in other countries and your regional settings is not an English one, you many be in trouble.

My italian setting generates a rsModelGeneration error. You can workaround this bug (I suppose it is...):

  • close SQL Server Management Studio process
  • go in Control Panel / Regional Settings and choose English (United States) in Regional Options tab (other settings as Location aren't significative for this workaround)
  • open SQL Server Management Studio
  • connect to Reporting Services
  • right-click the data source and choose "Generate Model..." menu item

Now you can revert to your ori