29,99 €
MDX is the BI industry standard for multidimensional calculations and queries. Proficiency with this language is essential for the realization of your Analysis Services' full potential. MDX is an elegant and powerful language, and also has a steep learning curve.SQL Server 2012 Analysis Services has introduced a new BISM tabular model and a new formula language, Data Analysis Expressions (DAX). However, for the multi-dimensional model, MDX is still the only query and expression language. For many product developers and report developers, MDX is the preferred language for both the tabular model and multi-dimensional model.
MDX with SSAS 2012 Cookbook is a must-have book for anyone who wants to be proficient in the MDX language and to enhance their business intelligence solutions.MDX with SSAS 2012 Cookbook is packed with immediately usable, practical solutions. It starts with elementary techniques that lay the foundation for designing advanced MDX calculations and queries. The discussions after each solution will provide you with a solid foundation and best practices. It covers a broad range of real-world topics and solutions and provides you with learning materials to become proficient in the language.This book will guide you through the hands-on and practical MDX solutions, best practices, and many intricacies that hide within the MDX calculations and queries.
We will start by working with sets, creating time-aware, context-aware calculations, and business analytics solutions, through to the techniques of enhancing the cube design when MDX is not enough. We will then move on to capturing MDX generated by SSAS front-ends and using SSAS stored procedures, and we will explore the whole range of MDX solutions for real-world BI projects.
Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:
Seitenzahl: 563
Veröffentlichungsjahr: 2013
Copyright © 2013 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: August 2011
Second edition: August 2013
Production Reference: 1200813
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-84968-960-1
www.packtpub.com
Cover Image by Žarko Piljak (<[email protected]>)
Authors
Sherry Li
Tomislav Piasevoli
Reviewers
Prakash Chatla
Jeremy Kashel
Acquisition Editor
Akram Hussain
Lead Technical Editor
Arun Nadar
Technical Editors
Shashank Desai
Iram Malik
Project Coordinator
Anugya Khurana
Proofreaders
Judith Bill
Lesley Harrison
Indexer
Monica Ajmera Mehta
Production Coordinator
Nilesh R. Mohite
Cover Work
Nilesh R. Mohite
Sherry Li has worked with Microsoft SQL Server Business Intelligence stacks for many years. She lives in Phoenix, Arizona where she works for a major financial organization with responsibilities in implementing data warehousing and BI solutions. She specializes in ETL, dimensional modeling, cube design, and reporting in MDX and T-SQL. Sherry Li maintains her blog at bisherryli.com.
I owe tremendous thanks to Packt Publishing for giving me this opportunity to write this book. Readers who want to become proficient in MDX and have been asking for more.
I would like to thank Tomislav Piasevoli, for starting this book in SQL Server Analysis Servies 2008, and for his deep knowledge in MDX and cube design. I have learned so much from him.
In addition, I would like to thank Prakash Chatla and Jeremy Kashel, for their insight, helpful questioning ("I don't think the point you're trying to make comes across."), enthusiastic responses ("I think every BI consultant should read this book!").
I would also like to thank my friends, my past co-workers, for their earnest encouragement and my current co-workers for putting up with my late morning starts at the office.
Also I thank my father, for keeping me company late at night on Skype from the other side of the globe.
My daughter Shasha, for showing me funny videos on YouTube at night.
My husband, Jim, for loving me unconditionally.
My dog Atari, for always sitting by my feet, while I write late at night.
Tomislav Piasevoli (<[email protected]>) is a Business Intelligence Specialist with years of experience in Microsoft SQL Server Analysis Services (SSAS). He lives in Croatia and works for SoftPro Tetral d.o.o., a company specializing in development of SSAS frontends and implementation of BI solutions.
His main interests are dimensional modeling, cube design, and MDX about which he blogs at http://tomislav.piasevoli.com. Tomislav also writes articles and speaks at regional conferences and user groups. For his contribution to the community, Microsoft awarded him with the Microsoft SQL Server MVP title.
I wish to thank everyone who contributed to this book, in one way or another. First and foremost, a big thank you goes to my wife Kristina and our three children—one of them was born while I was beginning to write this book—for having enormous patience throughout this period.
Secondly, I'd like to thank Chris Webb for vouching for me to the publisher and providing valuable information as one of the reviewers of this book. Greg Galloway, Marco Russo, Darren Gosbell, and Deepak Puri were the other precious reviewers of this book who corrected my mistakes and provided additional information relevant to the chapter topics.
Next, I'd like to thank all the people at Packt Publishing who worked on this book, to help make it better by assisting me during the book-writing process: Kerry George, Zainab Bagasrawala, Chris Rodrigues, and Merwine Machado.
There were also people who knowingly and unknowingly helped me with their ideas, articles, and helpful tips. My younger brother Hrvoje Piasevoli and Willfried Färber are representatives of the first group. The other group consists of bloggers and subject-matter experts whose articles I found useful for many recipes of this book. Here they are: Mosha Pasumansky, Teo Lachev, Jeffrey Wang, Jeremy Kashel, Vidas Matelis, Thomas Kejser (and other bloggers at SQLCAT site), Jeff Moden, Michael Coles, Itzik Ben-Gan, Irina Gorbach, Vincent Rainardi and in particular my reviewers again.
Additionally, I acknowledge countless contributors to MSDN SQL Server Analysis Services forum—whom I had the luck to meet—for the solutions they shared with the rest of us there.
Last but not least, a thank you goes to my current and former colleagues at SoftPro Tetral company who played a part by exchanging ideas with me during the time spent together all these years.
Prakash Chatla is a Business Intelligence Solution Architect and Software Developer who has over ten years of practical experience in building and implementing data warehousing and business intelligence solutions and products on the Microsoft platform in sectors, such as commercial distribution, financial services, online retailing, system integration services, software development, and the agricultural sector.
He used all his experience together with passion for programming in creating Microsoft BI client named PowerPanels available at http://www.powerpanels.eu. Currently, he and his team are building startup around PowerPanels.
My sincere thanks to Tomislav Piasevoli, Anugya Khurana and my PowerPanels team.
Jeremy Kashel is a principal consultant with Adatis, a UK-based Microsoft Gold Partner, specializing in the Microsoft Business Intelligence stack.
Jeremy has over 10 years' experience in delivering SQL Server data warehouse and business intelligence projects for a variety of UK and international clients. When time permits, he blogs at blogs.adatis.co.uk and has also spoken at a number of Microsoft events. Jeremy is the lead author of the book Microsoft SQL Server 2008 R2 Master Data Services, PacktPublishing.
You might want to visit www.PacktPub.com for support files and downloads related to your book.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
http://PacktLib.PacktPub.com
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.
Get notified! Find out when new books are published by following @PacktEnterprise on Twitter, or the Packt Enterprise Facebook page.
Microsoft SQL Server Analysis is one of the keystones of Microsoft's Business Intelligence (BI) product strategy. It is the most widely deployed OLAP server around the world. Many organizations, both large and small, have adopted it to provide secure and high-performance access to complex analytics.
MDX (for Multi-Dimensional expressions) is the BI industry standard for multidimensional calculations and queries, and is the most widely accepted software language in multidimensional data warehouse. Proficiency with MDX is essential for any professional who works with multidimensional cubes. MDX is an elegant and powerful language, and also has a steep learning curve.
SQL Server 2012 Analysis Services has introduced a new BISM tabular model and a new formula language, Data Analysis Expressions (DAX). However, for the multi-dimensional model, MDX is still the only query and expression language. For many product developers and report developers, MDX still is and always will be the preferred language for both the tabular model and multi-dimensional model.
Despite its popularity, very few books are dedicated to MDX. MDX-related books often limit their content to explaining the concepts of multidimensional cubes, the MDX language concept and its functions, and other specifics related to working with Analysis Services.
This book presents MDX solutions for business requirements that can be found in the real world of business. You will find best practices, explanations on advanced subjects in full detail, and deep knowledge in every topic. Organized around practical MDX solutions, this book provides a full, in-depth treatment of each topic, sequenced in a logical progression from elementary to advanced techniques.
This book is written in a cookbook format. You can browse through the contents and look for solutions to a particular problem. Each recipe is relatively short and grouped by relevancy, so you can find solutions to related issues in one place. Related recipes are sequenced in a logical progression; you will be able to build up your understanding of the topic incrementally.
This book is designed for both beginners and experts in the MDX language. If you are a beginner, this book is a good place to start. Each recipe provides you with best practices and their underlying rationale, detailed sample scripts, and options you need to know to make good choices. If you are an expert, you will be able to use this book as a reference. Whenever you face a particular challenge, you will be able to find a chapter that is dedicated to the topic.
We hope that you will become confident not only in using the sample MDX queries, but also in creating your own solutions. The moment you start creating your own solutions by combining techniques presented in this book, our goal of teaching through examples is accomplished. We want to hear from you about your journey to MDX proficiency. Feel free to contact me.
Chapter 1, Elementary MDX Techniques, uses some simple examples to demonstrate the fundamental MDX concepts, features, and techniques that are the foundations for our further explorations of the MDX language.
Chapter 2, Working with Sets, focuses on the challenges and solutions of performing the logic operations, NOT, OR, and AND, on Sets.
Chapter 3, Working with Time, presents various time-related functions in MDX language that are designed to work with a special type of dimension called the Time and its typed attributes.
Chapter 4, Concise Reporting, focuses on techniques that you can employ in your project to make analytical reports more compact and concise, and therefore, more efficient.
Chapter 5, Navigation, shows common tasks and techniques related to navigation and data retrieval relative to the current context.
Chapter 6, Business Analytics, focuses on how to perform some of the typical business analysis, such as forecasting, allocating values, and calculating the number of days from last sale date.
Chapter 7, When MDX is Not Enough, discovers that MDX calculations are not always the place to look for solutions. It illustrates several techniques to optimize the query response times with a relatively simple change in cube structure.
Chapter 8, Advanced MDX Topics, contains more advanced MDX topics, such as dealing with parent-child hierarchies and unbalanced hierarchies, getting random samples from a random hierarchy, and complex sorting and iterations.
Chapter 9, On the Edge, presents topics that will expand your horizons, such as clearing cache for performance tuning, executing MDX queries in T-SQL environment, using SSAS Dynamic Management Views (DMVs), drillthrough, and capturing MDX queries using SQL Server Profiler.
To run the examples in the book the following software will be required:
A full installation of Microsoft SQL Server 2012, or at least the following components are required:
We recommend the Developer, Enterprise, or Trial Edition of Microsoft SQL Server. A few examples might not work using the Standard Edition.
To download Microsoft SQL Server 2012 Trial Edition, go to Microsoft SQL Server site http://tinyurl.com/TrialSQLServer.
You can find Microsoft SQL Server 2012 Developer Edition on amazon.com, using the link http://tinyurl.com/DeveloperSQLServer2012.
Both the relational database file and the multidimensional Adventure Works project files are required:
We recommend the Enterprise Edition of the Adventure Works cube project. To download the installation files, use the following link to go to CodePlex: http://tinyurl.com/AdventureWorks2012
Microsoft Excel 2007 (or newer) with PivotTable is required.
Most of the examples will work with older versions of Microsoft SQL Server (2005 or 2008 or 2008 R2). However, some of them will need adjustments because the Date dimension in the older versions of the Adventure Works database has a different set of years. To solve that problem simply shift the date-specific parts of the queries few years back in time, for example, turn the year 2006 into the year 2002 and Q3 of the year 2007 to Q3 of 2003.
This book is for multidimensional cube developers or multidimensional database administrators. It is also for report developers who write MDX queries to access multidimensional cube. If you are a power cube user or an experienced business analyst, you will also find this book invaluable.
In other words, this book is for anyone who has been involved with multidimensional cube. This book is for you if you have found yourself in situations where it is difficult to deliver what your users want and you are interested in getting more information out of your multidimensional cubes.
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Alternatively, we could have used the Aggregate() function instead."
A block of code is set as follows:
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "We can verify this by browsing the Geography user hierarchy in the Geography dimension in SQL Server Management Studio".
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title via the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the erratasubmissionform link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.
Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors, and our ability to bring you valuable content.
You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.
In this chapter, we will cover:
MDX is an elegant and powerful language, and also has a steep learning curve.
The goal of this chapter is to use some simple examples to demonstrate the fundamental MDX concepts, features and techniques that are the foundations for further explorations of the MDX language.
The chapter begins with several basic techniques: putting multi-dimensional data onto query axes, cube space restriction, empty cell removal, and the important concept of unique names for members, tuples, and sets. From there, we shall turn our attention to a few more advanced features, such as using the MDX functions, creating calculations in the cube space, manipulating strings, writing parameterized queries, and conditionally formatting cell properties. This will form the basis for the rest of the chapters in this book.
SSAS 2012 provides a sample Analysis Services database, the Multidimensional Adventure Works DW. All the MDX queries and scripts in this book have been updated for Analysis Services 2012, and verified against the 2012 Enterprise Edition of the Adventure Works DW Analysis Services database. Majority of the MDX queries and scripts should also run and have been tested in SSAS 2008 R2.
The Query Editor in SQL Server Management Studio (SSMS) is our choice of writing and testing MDX queries. The SQL Server 2012 comes with a free tool, SQL Server Data Tools (SSDT) for cube developers. Just as the Business Intelligence Development Studio (BIDS) was the tool that we used for cube design and MDX scripting in SSAS 2008, SSDT is the tool we will use in this cookbook for cube design and MDX scripting for SSAS 2012.
Cube space in SSAS is multi-dimensional. MDX allows you to display results on axes from 0, 1, and 2 up to 128. The first five axes have aliases: COLUMNS, ROWS, PAGES, SECTIONS, and CHAPTERS. However, the frontend tools such as SQL Server Management Studio (SSMS) or other application that you can use for writing and executing MDX queries only have two axes, x and y axis, or COLUMNS and ROWS.
As a result, we have two tasks to do when trying to fit the multi-dimensional data onto the limited axes in our frontend tool:
It's fair to say that your job of writing the MDX queries is mostly trying to figure out how to project multi-dimensional data onto only two axes, namely, x and y. We will start by putting only one hierarchy on COLUMNS, and one on ROWS. Then we will use the CROSSJOIN function to "combine" more than one hierarchy into COLUMNS and ROWS.
Making a two by eight table below in a spreadsheet is quite simple. Writing a MDX query to do that can also be very simple. Putting data on the x and y axes is a matter of finding the right expressions for each axis.
Internet Sales Amount
Australia
$9,061,000.58
Canada
$1,977,844.86
France
$2,644,017.71
Germany
$2,894,312.34
NA
(null)
United Kingdom
$3,391,712.21
United States
$9,389,789.51
All we need are three things from our cube:
Once, we have the preceding three things, we are ready to plug them into the following MDX query, and the cube will give us back the two by eight table:
The MDX engine will understand it perfectly, if we replace columns by 0 and rows by 1. Throughout this book, we will use the number 0 for columns that is the x axis, and 1 for rows that is the y axis.
We are going to use the Adventure Works 2012 Multidimensional Analysis Service database enterprise edition in our cookbook. If you open the Adventure Works cube, and hover your cursor over the measure Internet Sales Amount, you will see the fully qualified expression, [Measures].[Internet Sales Amount]. This is a long expression. Drag-and-drop in SQL Server Management Studio works perfectly for us in this situation.
Long expression is a fact of life in MDX. Although the case does not matter, correct spelling is required and fully qualified and unique expressions are recommended for MDX queries to work properly.
Follow these two steps to open the Query Editor in SSMS:
Follow these steps to save the time spent for typing the long expressions:
This should be your final query:
Downloading the example code
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
When you execute the query, you should get a two by eight table, same as the following screenshot:
We have chosen to put Internet Sales Amount on the Axis(0), and all members of Sales Territory Country on the Axis(1). We have fully qualified the measure with the special dimension [Measures], and the sales territory members with dimension [Sales Territory] and hierarchy [Sales Territory Country].
You might have expected an aggregate function such as SUM somewhere in the query. We do not need to have any aggregate function here because the cube understands that when we ask for the sales amount for Canada, we would expect the sales amount to come from all the provinces and territories in Canada.
SSAS cubes are perfectly capable of storing data in more than two dimensions. In MDX, we can use the technique called "cross join" to "combine" multiple hierarchies into one query axis.
In MDX query, we can specify how multi-dimensions from our SSAS cube layout onto only two x and y axes. Cross joining allows us in both SQL and MDX to get every possible combination of two lists.
We wish to write an MDX query to produce the following table. On the columns axis, we want to see both Internet Sales Amount and Internet Gross Profit. On the rows axis, we want to see all the sales territory countries, and all the products sold in each country.
Internet Sales Amount
Internet Gross Profit
Australia
Accessories
$138,690.63
$86,820.10
Australia
Bikes
$8,852,050.00
$3,572,267.29
Australia
Clothing
$70,259.95
$26,767.68
Australia
Components
(null)
(null)
Canada
Accessories
$103,377.85
$64,714.37
Canada
Bikes
$1,821,302.39
$741,451.22
Canada
Clothing
$53,164.62
$23,755.91
Canada
Components
(null)
(null)
This query lays two measures on columns (from the same dimension and hierarchy [Measures]), and two different hierarchies [Sales Territory Country] and [Product Categories] on rows.
To return the cross product of two sets, we can use either of the following two syntaxes:
We have chosen to use the alternate syntax for its convenience. The result from the previous query is shown as follows:
There are situations when we want to display just a list of members with no data associated with them. Naturally, we expect to get that list in rows, so that we can scroll through them vertically instead of horizontally. However, the rules of MDX say that we can't skip the axes. If we want something on rows (which is AXIS(1) by the way), we must use all previous axes as well (columns in this case, which is also known as AXIS(0)).
The reason why we want the list to appear on axis 1 and not axis 0 is because a horizontal list is not as easy to read as a vertical one.
Is there a way to display those members on rows and have nothing on columns? Sure! This recipe shows how.
Notation for empty set is this: {}. So for the axis 0, we would simply do this:
Follow these steps to open the Query Editor in SQL Server Management Studio (SSMS):
Follow these steps to get a one-dimensional query result with members on rows:
Although we can't skip axes, we are allowed to provide an empty set on them. This trick allows us to get what we need – nothing on columns and a set of members on rows.
Skipping the Axis(0) is a common technique to create a list for report parameters. If we want to create a list of customers whose name contains "John", we can modify the preceding base query to use two functions to get only those customers whose name contains the phrase John. These two functions are Filter() and InStr():
In the final result, you will notice the "John" phrase in various positions in member names:
Instead of skipping the Axis(0), if you put a cube measure or a calculated measure with a non-constant expression on axis 0, you'll slow down the query. The slower query time can be noticeable, if there are a large number of members from the specified hierarchy. For example, if you put the Sales Amount measure on axis 0, the Sales Amount will have to be evaluated for each member in the rows. Do we need the Sales Amount? No, we don't. The only thing we need is a list of members; hence we've used an empty set {} on axis 0. That way, the SSAS engine doesn't have to go into cube space to evaluate the sales amount for every customer. The SSAS engine will only reside in dimension space, which is much smaller, and the query is therefore more efficient.
Some client applications might have issues with the MDX statement skipping axes because they expect something on columns, and will not work with an empty set on axis 0. In this case, we can define a constant measure (a measure returning null, 0, 1 or any other constant) and place it on columns. In MDX's terms, this constant measure is a calculated measure. It will act as a dummy column. It might not be as efficient as an empty set, but it is a much better solution than the one with a regular (non-constant) cube measure like the Sales Amount measure.
This query creates a dummy value NULL on columns:
A WHERE clause in MDX works in a similar way as the other query languages. It acts as a filter and restricts the data returned in the result set.
Not surprisingly, however, the WHERE clause in MDX does more than just restricting the result set. It also establishes the "query context".
The MDX WHERE clause points to a specific intersection of cube space. We use tuple expressions to represent cells in cube space. Each tuple is made of one member, and only one member, from each hierarchy.
The following tuple points to one year, 2008 and one measure, the [Internet Sales Amount]:
Using a tuple in an MDX WHERE clause is called "slicing" the cube. This feature gives the WHERE clause another name, slicer. If we put the previous tuple in the WHERE clause, in MDX terms, we are saying, "show me some data from the cube sliced by sales and the year 2008".
That is what we are going to do next.
Open the Query Editor in SSMS, and then follow these steps to write a query with a slicer and test it:
We can slice the data by pointing to a specific intersection of cube space. We can achieve this by putting a tuple in the WHERE clause.
In the preceding example, the cube space is sliced by sales and year 2008. The cell values are the Internet Sales Amount for each country and each product category, sliced by year 2008.
Notice that the data returned on the query axes can be completely different from the tuple in the WHERE clause. The tuples in the slicer will only affect the cell values in the intersection of rows and columns, not what are on the columns or rows axes.
If you need to display sales and year 2008 on the query axes, you would need to move them to the query axes, and not in the WHERE clause.
This query has moved the sales to the columns axis, and the year 2008 to the rows axis. They both are "crossjoined" to the original hierarchies on the two query axes:
Run the query and you will get the following result. The call values are the same as before, but now we have the year 2008 on the rows axis, and the Internet Sales Amount on the columns axis.
TheNonEmpty() function is a very powerful MDX function. It is primarily used to improve query performance by reducing sets before the result is returned.
Both Customer and Date dimensions are relatively large in the Adventure Works DW 2012 database. Putting the cross product of these two dimensions on the query axis can take a long time. In this recipe, we'll show how the NonEmpty() function can be used on the Customer and Date dimensions to improve the query performance.
Start a new query in SSMS and make sure that you're working on the Adventure Works DW 2012 database. Then write the following query and execute it:
The query shows the sales per customer and dates of their purchases, and isolates only those combinations where the purchase was over 1000 USD.
On a typical server, it will take more than a minute before the query will return the results.
Now let's see how to improve the execution time by using the NonEmpty() function.
Follow these steps to improve the query performance by adding the NonEmpty() function:
Both the Customer and Date dimensions are medium-sized dimensions. The cross product of these two dimensions contains several million combinations. We know that typically, the cube space is sparse; therefore, many of these combinations are indeed empty. The Filter() operation is not optimized to work in block mode, which means a lot of calculations will have to be performed by the engine to evaluate the set on rows, whether the combinations are empty or not.
Fortunately, the NonEmpty() function exists. This function can be used to reduce any set, especially multidimensional sets that are the result of a cross join operation. It removes the empty combinations of the two sets before the engine starts to evaluate the sets on rows. A reduced set has fewer cells to be calculated, and therefore the query runs much faster.
Regardless of the benefits that were shown in this recipe, NonEmpty() should be used with caution. Here are some good practices regarding the NonEmpty() function:
Both the NonEmpty() function and the NON EMPTY keyword can reduce sets, but they do it in a different way.
TheNON EMPTY keyword removes empty rows, columns, or both, depending on the axis on which that keyword is used in the query. Therefore, the NON EMPTY operator tries to push the evaluation of cells to an early stage whenever possible. This way the set on axis becomes already reduced and the final result is faster.
Take a look at the initial query in this recipe, remove the Filter() function, run the query, and notice how quickly the results come, although the multidimensional set again counts millions of tuples. The trick is that the NON EMPTY operator uses the set on the opposite axis, the columns, to reduce the set on rows. Therefore, it can be said that NON EMPTY is highly dependent on members on axes and their values in columns and rows.
Contrary to the NON EMPTY operator found only on axes, the NonEmpty() function can be used anywhere in the query.
The NonEmpty() function removes all the members from its first set, where the value of one or more measures in the second set is empty. If no measure is specified, the function is evaluated in the context of the current member.
In other words, the NonEmpty() function is highly dependent on members in the second set, the slicer, or the current coordinate, in general.
If a second set in the NonEmpty() function is not provided, the expression is evaluated in the context of the current measure in the moment of evaluation, and current members of attribute hierarchies, also in the time of evaluation. In other words, if you're defining a calculated measure and you forget to include a measure in the second set, the expression is evaluated for that same measure which leads to null, a default initial value of every measure. If you're simply evaluating the set on the axis, it will be evaluated in the context of the current measure, the default measure in the cube or the one provided in the slicer. Again, this is perhaps not something you expected. In order to prevent these problems, always include a measure in the second set.
NonEmpty() reduces sets, just like a few other functions, namely Filter() and Existing() do. But what's special about NonEmpty() is that it reduces sets extremely efficiently and quickly. Because of that, there are some rules about where to position NonEmpty() in calculations made by the composition of MDX functions (one function wrapping the other). If we're trying to detect multi-select, that is, multiple members in the slicer, NonEmpty() should go inside with the EXISTING function/keyword outside. The reason is that although they both shrink sets efficiently, NonEmpty() works great if the set is intact. EXISTING is not affected by the order of members or compactness of the set. Therefore, NonEmpty() should be applied earlier.
You may get System.OutOfMemory errors if you use the CrossJoin() operation on many large hierarchies because the cross join generates a Cartesian product of those hierarchies. In that case, consider using NonEmpty() to reduce the space to a smaller subcube. Also, don't forget to group the hierarchies by their dimension inside the cross join.
Attribute relationships define hierarchical dependencies between attributes. A good example is the relationship between attribute City and attribute State. If we know the current city is Phoenix, we know the state must be Arizona. This knowledge of the relationship, City | State, can be used by the Analysis Services engine to optimize performance.
Analysis Services provides the Properties() function to allow us to retrieve data based on attribute relationships.
We will start from a classic Top 10 query that shows the Top 10 Customers. Then we will use the Properties() function to retrieve each top 10 customer's yearly income.
This table shows what our query result should be like:
Internet Sales Amount
Yearly Income
Nichole Nara
$13,295.38
100000 - 120000
Kaitlyn J. Henderson
$13,294.27
100000 - 120000
Margaret He
$13,269.27
100000 - 120000
Randall M. Dominguez
$13,265.99
80000 - 90000
Adriana L. Gonzalez
$13,242.70
80000 - 90000
Rosa K. Hu
$13,215.65
40000 - 70000
Brandi D. Gill
$13,195.64
100000 - 120000
Brad She
$13,173.19
80000 - 90000
Francisco A. Sara
$13,164.64
40000 - 70000
Maurice M. Shan
$12,909.67
80000 - 90000
Once we get only the top 10 customers, it's easy enough to place the customer on the rows, and the Internet sales amount on the columns. What about each customer's yearly income?
Customer geography is a user-defined hierarchy in the customer dimension. In the SSMS, if you start a new query against the Adventure Works DW 2012 database, and navigate to Customer | Customer Geography | Customer | Member Properties, you will see that the yearly income is one of the member properties for the attribute Customer. This is a good news, because now we can surely get the Yearly Income for each top 10 customer using the PROPERTIES() function:
In SSMS, let us write the following query in a new Query Editor against the Adventure Works DW 2012 database:
Attributes correspond to columns in the dimension tables in our data warehouse. Although we don't normally define the relationship between them, in the relationship database, we do so in the multidimensional space. This knowledge of attribute relationships can be used by the Analysis Services engine to optimize the performance. MDX has provided us the Properties() function to allow us to get from members of one attribute to members of another attribute.
In this recipe, we only focus on one type of member properties, that is, the user-defined member property. Member properties can also be the member properties that are defined by Analysis Services itself, such as NAME, ID, KEY, or CAPTION; they are the intrinsic member properties.
The Properties() function can take another optional parameter, that is the TYPED flag. When the TYPED flag is used, the return value has the original type of the member.
The preceding example does not use the TYPED flag. Without the TYPED flag, the return value is always a string.
In many business analysis, we perform arithmetical operations numerically. In the next example, we will include the TYPED flag in the Properties() function to make sure that the [Total Children] for the top 10 customers are numeric.
Attributes can be simply referenced as an attribute hierarchy, that is, when the attribute is enabled as an Attribute Hierarchy.
In SSAS, there is one situation where the attribute relationship can be explored only by using the PROPERTIES() function, that is when its property AttributeHierarchyEnabled is set to False.