MDX by Example

The most frequent request that I receive from readers is for more information about MDX. They particularly want more MDX examples. In this column, I offer 15 MDX queries that you can test in SQL Server 2000 Analysis Services' FoodMart 2000 sample cubes. The following real business questions focus on the problem of sales analysis.

1. What product brands does the company sell in each state in the United States? The query in Listing 1 creates a set called SoldInUSA, which determines the product brands sold in the United States by eliminating brands that have an empty Unit Sales value. The query defines a calculated member that determines whether the current product is sold in the current state. This member returns Yes if the product is sold in the state and No if the product isn't sold in the state. The query displays the states on the cube's columns and the products on the rows. The cell values are either Yes or No, depending on the corresponding product-state combination.

You might notice that this query's result returns the same brand name more than once. The Sales cube divides product categories into brands. Therefore, if one brand makes products in more than one category, that brand will appear multiple times in the hierarchy. At first glance, the repetition might appear to be a problem, but you probably want to treat multiple occurrences of the same brand as different brands. For example, suppose one company produces luggage and clothes. As an analyst, you'd treat these product categories as different brands because you wouldn't want the luggage brand's failure to affect your decision to carry the clothes brand.

2. What are the top product categories across all stores? Listing 2's straightforward query uses the TopCount() function to determine the top 10 product categories based on unit sales.

3. What are all the food and beverage brands that were sold in the United States during any of the past three quarters? The query in Listing 3 demonstrates how to define time-dynamic sets—a valuable technique. A time-dynamic set moves forward in time as the cube fills with data. The LastQuarter set determines the time dimension's most recent quarter that contains data. The Last3Quarters set builds on LastQuarter by using the Range() function—denoted with a colon (:)—to select the three consecutive quarters that end with LastQuarter. I didn't use the Tail() function in the LastQuarter definition to retrieve the past three quarters because doing so might return three nonconsecutive quarters. An empty quarter might occur between full quarters, and the Filter() function would eliminate the empty quarter. The Lag() function, combined with the Range() function, guarantees that the quarters are consecutive.

In this query, the item(0).item(0) function retrieves a set's first member. Because a set is technically a group of tuples (i.e., lists of members from different dimensions), you must use the first Item() function to select a tuple in the set and the second Item() function to select a member in the tuple.

4. What are the recent sales trends for the 10 best-selling product brands? The query in Listing 4 uses TopCount() to find the top-selling product brands, then uses Listing 3's time-dynamic set technique to determine unit sales for the past 6 months. The query then places the top brands in the rows, along with their unit sales for the past 6 months. You might use this query in a line chart for monitoring product-brand performance.

5. Which product brands make up the top 80 percent of the company's sales? TopPercent() is similar to the TopCount() function but returns the smallest number of items that make up 80 percent of the total unit sales. (In other words, these are the items that have the largest numbers of unit sales.) Listing 5's query displays the product brands on the query result's rows, along with their total Unit Sales value.

6. Which product brands make up the bottom 20 percent of volume? The query in Listing 6 uses BottomPercent() to find the largest number of items that make up only 20 percent of the total unit sales. (In other words, these are the items that have the smallest numbers of unit sales.) The query uses the Non Empty keyword to eliminate any brands that have no unit sales. Using Non Empty is slightly different from using the combined Filter() and IsEmpty() functions (which you see in Listing 3) because Non Empty works on all items on an axis (e.g., rows).

7. Which are the top five stores, and who are each store's top five customers? The query in Listing 7 demonstrates the useful but complex Generate() function. If you have programming experience, you'll find the Generate() function similar to the For Each statement in Visual Basic (VB) or C#. The Generate() function calculates the second parameter for each item in the first parameter.

In the following example

Generate( \{Miami, Atlanta\}, Customers.CurrentMember.Parent)

Generate() calculates the second parameter, in this case Customers.CurrentMember.Parent, for each item in the first parameter—\{Miami, Atlanta\}. In this example, the second parameter's MDX expression returns the parent of the current item, so the final result of this Generate() function—\{Florida, Georgia\}—would be a set of the parent of each item in the first parameter.

This query also uses the Generate() function to perform a nested rank. The function determines the top five customers for each of the top five stores. Then, Generate() unions the sets of customers together to create a list of the 25 store-customer combinations.

8. For each product brand, what are the two top-selling products and what percentage of total sales do they make up? To answer this question, Listing 8's relatively complicated query uses a combination of a calculated member and the Generate() function. The calculated member determines the percentage of the brand's total unit sales that a brand's top two products make up. The Generate() function searches the list of all brands and returns a set of each brand's top two products and the percent of total value that those products account for.

9. Show all the product brands for the past four quarters, highlighting brands that sold in the bottom 10 percent for that quarter. Cell properties are a convenient way to perform exception highlighting (i.e., changing the font or color to draw the user's attention to important information) in a query. In Listing 9, I added the cell property FONT_FLAGS to the calculated member HLUnit Sales to boldface the unit sales numbers in the bottom 10 percent of all product brands for that quarter. Because a cell property's value can be an MDX expression, you can perform conditional logic to determine whether the font will be roman or boldface. In this case, the condition logic determines whether the current brand is in the bottom 10 percent by doing a set intersect with the full list of brands in the bottom 10 percent. If the intersect yields a count of 0, the brand isn't among the bottom members and will appear in a roman font. If the count is 1, the brand is among the bottom 10 percent, and the value will appear in boldface.

10. Show the additional sales growth per month for a given promotion, and compare sales with those of parallel months in the previous quarter. This example shows how to find the range of monthly sales that a sales promotion affects. The PromoMonths set finds the precise months that a promotion affected. To ensure that the list of months is consecutive, use the Range() function to include mid-promotion months whose sales were unaffected.

For each month of the promotion, Listing 10's query displays the total unit sales, the total unit sales for the parallel month in the previous quarter, the growth (in absolute terms, rather than a percentage) from quarter to quarter, and the amount of the growth that the promotion caused. The ParallelPeriod() function finds the parallel month in the previous quarter. (Typically, you'd look for the same month in the previous year, but the Sales cube doesn't contain enough months to go back that far.) For example, if the current month is the second month in the current quarter, ParallelPeriod() will return the second month in the previous quarter.

11. Which product brands have store sales that exceed store cost by at least 160 percent? The query in Listing 11 demonstrates a technique to perform advanced filtering of the product-brand members. First, the query creates a SalesRatio measure to determine the percent by which store sales exceed store cost. Then, the query uses this ratio—combined with the Filter() function—to determine which brands exceed 160 percent. This example also demonstrates the use of the FORMAT_STRING cell property, which controls the format of a number for display. In this case, FORMAT_STRING displays the number as a percentage.

12. Which product brands' sales have grown from one quarter to the next by more than 50 percent? To make Listing 12's query more readable, I created two new measures, CurrQSales and PrevQSales, which return the current quarter's sales and the previous quarter's sales, respectively. The Growth measure uses these two new measures to determine the growth of the current product's sales from the previous quarter to the current quarter. The query then uses this growth value to filter all the product brands and isolate those with greater than 50 percent sales growth. The resulting display shows the previous quarter's sales, the current quarter's sales, and the percent growth from quarter to quarter for each brand that meets the growth criteria.

13. Show the top 10 product brands and the bottom 10 product brands, along with their unit sales and ranking numbers. Ranking is a common task in analysis because the largest and smallest items in a group are often the ones most relevant to decision making. The query in Listing 13 creates an ordered list of all product brands and displays the top 10 and bottom 10 from this list. The query also uses the Rank() function to display each brand's position in the ordered list. These ranking numbers are useful because the numbers convey the total number of product brands.

14. Compare a particular product trend with the average of all other products of the same brand and with all products of all brands. This example demonstrates a baseline analysis, in which you compare a trend in your business with some other known trend. An example in stock analysis is to compare a particular company's performance with the S&P 500, the Dow Jones industrial average, or the average of all companies in the same marketplace. The query in Listing 14 displays the trend of the past four quarters of the Ebony Plums product, along with the average of all Ebony products and all products that FoodMart carries.

15. List the top 10 middle-tier brands (i.e., the brands with between 500 and 3000 unit sales for the past 12 months). This example demonstrates a nested filter. The query in Listing 15 uses a filter to determine which brands make up the middle tier, then uses the TopCount() filter function to find the top 10 brands of that group.

These examples should equip you with more tools for using MDX to conquer complex business problems. Each month, I've provided at least one example in the form of the MDX Puzzle. Also, you can find more detailed information about most of the concepts I've presented this month. If you have a good example, send it my way, and I'll try to use it in a future column.

Learn more: MDX for the MDX Phobic

Comments

Plain text