SSAS Tabular-Part 4: Deep Dive DAX

This is the fourh article in the four part series about designing SSAS Tabular models.  In this article you will learn advanced DAX calculations.

This is the fourth article in the four part series about designing SSAS Tabular models.  This article is about advanced DAX calculations used to define measures and KPIs.

Part 1 - Getting Started with SSAS Tabular
Part 2 - Easy DAX: Getting Started with Data Analysis Expressions
Part 3 - Tabular Model Administration
Part 4 - Deep Dive DAX – Solving Complex Business Problems with Data Analysis Expressions

 

Starting Your Modeling Career – with Analysis Services Tabular Models

Paul Turley, SolidQ Mentor & SQL Server MVP

Part 4 - Deep Dive DAX

The heart and soul of Tabular models is the DAX language.  In part 2 of this series you learned to use some of the basic DAX functions like SUM and CALCULATE.  You also learned about row and filter context and how functions like ALL() and ALLEXCEPT() can be used to override natural row context to change the scope of a calculation.  In simple terms, when you use aggregate functions like SUM(), AVERAGE(), MIN() or MAX() with a numeric measure value, the calculation is applied to all the rows in the context of current filters and groups.  This functionality is much like the behavior of a T-SQL query using a SUM() function and GROUP BY clause but that’s about where the similarities end.

The power of DAX goes well beyond emulating simple grouping and filtering logic.  It can be used to apply advanced business logic to perform complex calculations that would be difficult to duplicate with the same performance in relational tools.  For this final exercise, I’m going to queue some important topics you are likely to encounter, I’ll describe a reporting scenario and the calculations needed to address requirements and then I’ll step you through the exercise using the project we completed in the previous article.

User Interaction and Data Browsing

Analytical reporting is pretty simple in concept.  Typically every measure is going to be grouped and/or filtered by something so when designing calculations, measures and KPIs; we must anticipate different ways users might interact with a model.  The easiest way to think about this is in the context of an Excel Pivot Table.  Connected to the model, a user could place any of the table attributes on rows or columns. They could place any measure or KPI in the intersecting data cells to perform calculations grouped by the rows or column attribute values; and they can apply filters and slicers.

Conditional Calculations

Some calculations only make sense when performed in a certain context.  For example, if financial budgets are calculated for the end of each fiscal year and actual expenses are recorded for each month period, the logic for reporting on a daily, monthly or annual basis will be different and in some cases, may just not make any sense. 

The IF() and SWITCH() functions apply conditional logic.  In its simplest form, the IF function accepts a Boolean expression that resolves to True or False.  If True, it returns a value or the result of an expression you provide in the second argument.  If False, a BLANK value is returned unless you provide something else in the third argument.  The SWITCH function is essentially an unlimited set of expression pairs that you could pass to multiple nested IF functions, just easier to read and debug.

One or More Values?

Some calculations only work in the context of one grouped value.  For example, in a monthly budget vs actual financial report, the data must be filtered or group by month.  If not, a calculation must include exception logic to deal with this possibility.  DAX functions that require only one value will throw an error if you send them nothing or more than one value.  I’ll show you an example of the HASONEVALUE() function that serves this purpose.

Date and Time Intelligence

Nearly every data model has a date dimension and there are several DAX functions that simplify common date and time calculations.  Most of these are wrapper functions that handle low-level range and comparison logic with built-in code that is aware of natural date and time characteristics.  These functions are useful only when you have designed a date or time dimension into your model with the appropriate metadata and data types.

Exercise Objectives

Let’s get started.  Our goal is to create three measures and a KPI to report accumulative and comparative sales totals across months and years.  We already have a measure named “Reseller Sales Amount” and we’ll take this calculation to the next level.

The first new measure I’ll add to the Reseller Sales table is “YTD Sales” which will calculate the sum of the Sales Amount from the beginning of the current year through the end of the currently selected period.  There are several different ways to write this measure depending on how hard you want to work.  One method looks like this:

YTD Sales:=CALCULATE(
        SUM([Sales Amount]),
        DATESYTD('Order Date'[Date])
     )

The DATESYTD() function is magical and handles a lot of logic that would otherwise require conditional branching and exception handling.  It simply figures out what attribute of the Order Date table is current (which could be the Quarter, Month, Week or Day) and returns a range of dates from the beginning of the year through the end of that period.  Another method to do the same thing is to use the TOTALYTD() function like this.  These are functionally the same.

YTD Sales 2:=TOTALYTD(
        SUM([Sales Amount]), 'Order Date'[Date]
   )

File this lesson under “as soon as I figure out one way to solve a problem in DAX, there is always an easier way”.

The next thing I want to do is compare the current accumulative YTD sales with the same total for the previous year.  This is can be performed by using the SAMEPERIODSLASTYEAR() function which takes a date column and returns a table of dates from the beginning of the prior year through a date that is one year earlier than the end of the provided calendar period.

Prior Year YTD Sales:=CALCULATE(
        SUM([Sales Amount]),
        DATESYTD( 
               SAMEPERIODLASTYEAR( 'Order Date'[Date] )
        )
    )

To add a KPI, create another measure for the “Actual” value and just reference the base measure like this:

YTD Sales to Prior Year KPI:=[YTD Sales]

…then format it and right-click and convert the measure to a KPI.  Set the target to use the Prior Year YTD measure and set the threshold percentages.  KPIs are very simple in tabular model design.

Testing, Debugging and Exploring

This is the part where I teach you to fish rather than just giving you the answers.  How did I know that the SAMEPERIODLASTYEAR() function would solve my problem?  Sure, I’ve done this before and I’ve gone to the online help and community posts for help a few times but I’m forgetful and I needed to explore this and other functions to make sure it does what I need.  To do this, I create a testing and debugging measure.  In a given project I may write a dozen of these, hide them until I validate all the results and then eventually remove the test measures from the model.  This test measure returns the actual prior year date value that I can drop into an Excel Pivot Table:

Test PreviousYear:=IF( HASONEVALUE( 'Order Date'[Date] ),
             SAMEPERIODLASTYEAR('Order Date'[Date])
        )

One of my learnings was that in the context I’m calling the SAMEPERIODLSTYEAR function here, if I don’t have a single date selected, the function will blow up so I used the HASONEVALUE() function wrapped in an IF() function to eliminate the error.  HASONEVALUE() returns True when the user has grouped or filtered data so the measure is being called for a single attribute value.  Keep this technique in your top ten list because you’ll use it often.

The ISFILTERED() function isn’t quite as common but it works in a similar way, which you’ll see in the following example.

Calculating Values at Different Attribute Levels

This is where things get interesting.  The next measure will only be called by another measure so I would normally hide it.  Creating utility measures like this breaks down calculations into logical steps for debugging and reuse. 

Measures naturally roll-up to whatever level they’re grouped by but if I need to explicitly control the group level, I need to be a little more creative.  I only want the following calculation to apply when grouping on the Month Name so I use the previous technique with IF() and ISFILTERED().  If the query is grouped by anything but Month Name, nothing gets returned.  The VALUES() function reduces Calendar Year to a distinct list of values and since this calculation runs in the context of a single Month Name, Calendar Year will only have one value.  The CALCULATE() and FILTER() functions, along with the ALL() function resets the row context and applies the calculation to all rows in the current year rather than just the current row or group value.

Current Year Sales for Month:=IF(
              ISFILTERED( 'Order Date'[Month Name] ),
              CALCULATE( SUM( [Sales Amount] ),
                   FILTER( ALL( 'Order Date' ), VALUES( 'Order Date'[Calendar Year] ) = [Calendar Year] )
              )
          )

Finally, here’s the last measure.  It’s fairly simple because the hard work was done in the previous step.  Since the calculation will only be performed if data is grouped by month, there is no need to do another level-specific roll-up for the month.

Monthly Sales Ratio of Year:=IF(
           ISFILTERED( 'Order Date'[Month Name] ),
                 DIVIDE( SUM( [Sales Amount] ),
                      [Current Year Sales for Month]
                 )
            )

I used the DIVIDE() function to avoid divide-by-zero errors if the denominator happens to be blank or zero.  Browsing the model in an Excel Pivot Table, I’ve placed the Calendar hierarchy on rows and each of the measures on Values. 

Figure 1 - Results in an Excel Pivot Table

That concludes this series of articles.  If you haven’t yet downloaded the sample projects for each article, I encourage you to do that.  You’ll find all of the working code.  Each project is finished and can be used as the starting point for the next article in the series.

Resources

Download sample project for this article

Check the resource links at the end of the three previous articles for recommended books, code samples and online resources that support all four articles in this series.  You can also find additional projects and samples, presentations and video tutorials on my blog.  You can contact me through my blog if you have questions or if you need help with a solution.

http://SqlServerBiBlog.com

Maybe we’ll meet at a community event!  One of the most valuable resources for learning and career development in the SQL Server community are SQL Saturdays, virtual chapters and events led by your local PASS chapter.  My peers and I speak regularly at these events.  SQL Saturdays are free training and community events held annually in major cities throughout the world.  There’s no catch… it’s just a great way to learn and connect.

http://SqlSaturday.com

http://SqlPASS.org