DAX_Mobi-12

7- The “Golden Rules” of DAX Measures

How Does the DAX Engine Arrive at Those Numbers?

In the previous chapter I showed you a bunch of examples of measures, displayed in various-shaped pivots. And of course, the numbers displayed in all of those cases are accurate.

Since we’re writing some pretty interesting formulas in pivots now, we need to take a quick step back and reflect, just a little bit, about how pivots work behind the scenes.

On an instinctive level, I’m pretty sure you already understand everything I’m going to explain in this chapter, but your understanding is informal and “loose.” What we need to do is take your informal understanding and make it crisper. We need to put it into words.

For instance, if I asked you what the highlighted cell in this pivot “means,” I’m pretty sure you will immediately have an answer.

Figure 74 Question: Can you explain what the $98,600 “means?”

Let’s make this multiple choice. Choose Answer A or Answer B:

• Answer A: “$98,600 worth of product 344 was sold in the year 2001.”

• Answer B: “When you filter the Sales table to just the rows where Year=2001 and ProductKey=344, then sum up the SalesAmt column over those remaining rows, you get $98,600.”

I bet you chose A. Am I right? Yeah, I’m right. Don’t lie to me. Unless you have actually merged with Excel over the years to form a cyborg calculator, you still think more like a person than a machine. And people think like Answer A.

But Answer B is exactly how the DAX engine arrived at the $98,600 number. So learning to think that way, just a little bit, is a goal of this chapter.

It’s important for you to get comfortable thinking about measures the way the DAX engine thinks about them - like Answer B. Thinking like a human (Answer A) is still important, too, and even when writing measures it’s going to be ok most of the time. That’s because most of the time, your measure formula just works the first time you write it. But when your measure formula doesn’t do what you expect, you usually have to think “the DAX way” (Answer B) in order to fix it.

Teaching you to “think like DAX” is essentially the point of this chapter. Don’t worry if you haven’t grasped this yet, I’m going to break it down a few ways for you.

Stepping Through That Example

Let’s step through that same “98,600” example from above, this time in the PowerPivot window so that we have a picture at each step.

Here’s the Sales table:

Figure 75 Sales table with all filters cleared

There are three elements of this window I’d like to call out.

1. The “Clear All Filters” Button on the Ribbon. Highlighted in the picture above. When this is greyed out like this, you know there are no filters applied on the current table.

2. The row count readout. Pictured here, it shows there are 60,398 rows in the Sales table when all filters are cleared.

Figure 76 Row Count Readout: 60,398 rows are currently being displayed in the Sale table.

3. The measure grid (the three cells at the bottom of the table). Let’s widen the first column so we can see what those were.

Figure 77 Our three measures from the pivot also appear here, in the Measure Grid.

This area at the bottom of the table is called the Measure Grid. This feature is new in v2, and I’m not sure I like it very much yet. But it’s superb for demonstrating “the DAX way,” so we’re gonna use it here to great effect.

All right, let’s filter Year to be 2001:

Figure 78 Filtering to Year=2001

After the filter is applied, let’s check out the measure grid and row readout:

Figure 79 Sales have dropped from $27M to $3.2M, row count dropped from 60k to 1k

OK, now let’s apply the ProductKey=344 filter and then check the same stuff:

Figure 80 With both filters applied, we get the $98,600 number (the sum of SalesAmt from 29 rows)

Hey hey! It matches the pivot!

Notice that [Profit] is displaying as $47,462 and [Profit Pct] as 48.1%? Those were the numbers in the pivot as well:

Figure 81 [Profit] and [Profit Pct] in the pivot also match up to what we see in the filtered Measure Grid.

Hey, where are our other measures? If we make the measure grid taller, we see that they are here too:

Figure 82 All of our measures are here. Note that [Transactions] = 29, which is also what the row readout tells us.

Do you think the [Days Selling] = 18 number is correct? Of course it is, but double checking it is a good excuse to show you another trick I use a lot. I dropdown the OrderDate filter:

Figure 83 Scroll through this list and count how many dates show up. (Hint: there are 18).

Dropping down the filters in the PowerPivot window is a very helpful trick. It will only show you the values that are “legal” in the context of the filters applied to all other columns at the moment, just like in normal Excel Autofilter. This trick is especially useful for seeing whether there are any Blank values in this column once the other column filters are respected. (Even when there are too many values in the column, and you see the “Not all items shown” warning, the Blanks checkbox will show up if there are blanks, and if it’s missing, you know there are none).

Enough examples. I promised you some Golden Rules, and Golden Rules I will deliver.

Translating the Examples Into Three Golden Rules

I’ve been teaching these, that I call the Golden Rules of DAX measures, for a few years now. They serve as the foundation – once you understand these, most everything that follows will be simple and incremental.

When you are reading these rules, I encourage you to reference back to the examples above to help clarify what the rules mean.

Rule A: DAX Measures Are Evaluated Against the Source Tables, NOT the Pivot

It is very tempting to think that the Grand Total cell at the bottom of a pivot is the sum of the cells above it, but that is NOT the way it is calculated. As far as DAX is concerned, the fact that the Grand Total matches the sum of the numbers above it borders on coincidence.

So when you are thinking about how to construct a measure formula, or are debugging one that isn’t quite working, visualize the underlying table in the PowerPivot window, because the DAX engine is doing its work in that context.

For an example of this, we need look no further than the age-old problem of “the average of averages is meaningless.”

Figure 84 The six selected cells’ Average is 45.5% but the pivot Grand Total is 44.0% - only a calculation against the individual rows in the Sales table will yield the right result.

Rule B: Each Measure Cell is Calculated Independently

When thinking about how your measure is calculated, it is best to think “one cell at a time.”

Figure 85 The DAX engine may not calculate in precisely this 1-4 order, but you should think that it does

So, pick a cell and visualize how it was calculated, as if it were an island.

The value in one measure cell NEVER impacts the value in another measure cell. The measures are calculated independently, and calculated against the source table(s). See Rule 1 :-)

Rule C: DAX Measures Are Evaluated in Three Phases

Phase One: Detect Filters

Before the DAX engine even looks at your formula, it detects the “coordinates” of the current measure cell in the pivot.

To illustrate this, let’s use a slightly “richer” pivot:

Figure 86 The selected measure cell has three “coordinates” – MonthNum=8, Year=2001, and ProductKey=313.

A measure cell’s set of filter coordinates is often referred to as its Filter Context

Phase Two: Apply Those Filter Coordinates to the Underlying Tables

Those coordinates (the filters in the filter context) are then applied to the underlying tables. (You never see this filtering of course- because it happens behind the scenes).

Phase Two: Evaluate the Arithmetic

Once the filter context of a measure cell (determined by its coordinates in the pivot) has been used to filter the underlying table(s), ONLY THEN is the arithmetic in your formula evaluated.

In other words, your SUM() or COUNTROWS() function doesn’t run until the filter context has been applied to the source table(s).

To summarize Rule C, each measure cell in the pivot is evaluated in three phases: Detect Filters, apply those filters, then run the arithmetic.

The Three Phases of Rule C "Visualized"

Figure 87 Measure evaluation process illustrated for a single measure cell in the pivot: Occurs in 3 steps, numbered 1-3.

A Few More Tips

No “Naked Columns” in Measure Formulas

When you reference a column in a measure formula, it always has to be “wrapped” in some sort of function. A “naked” reference to a column will yield an error in a measure. Let’s take a look at an example:

[My New Measure] =

Sales[Margin]

Figure 88 I enter a “naked” column reference into the measure editor, then click Check Formula…

Figure 89 …leading to a relatively cryptic error message.

Let’s look at that error message:

“Calculation error in measure ‘Sales’[My New Measure]: The value for column ‘Margin’ in table ‘Sales’ cannot be determined in the current context. Check that all columns referenced in the calculation expression exist, and that there are no circular dependencies.”

Not a great error message. It really should be more helpful. But when you see this error, in your head you should translate this to be “I have a naked column reference somewhere.”

“Cannot be determined in the current context” should become a trigger phrase for you to think “I have a naked column reference somewhere in my measure formula.”

But all of the following would be valid:

=SUM(Sales[Margin])

=AVERAGE(Sales[Margin])

=MIN(Sales[Margin])

=MAX(Sales[Margin])

…

Any aggregation function will do. Think of it this way: pivots are, by their nature, aggregation devices. They take sets of rows and turn them into more compact numerical results. Referencing “naked columns” is what calculated column formulas do. Measure are aggregations, and they don’t accept naked column references on their own.

Remember, naked column references are OK in calculated columns. This rule only applies to measures.

Best Practice: Reference Columns and Measures Differently

Whenever I am writing a measure formula,

• To reference a column, I include the table name: TableName[ColumnName]

• To reference a measure, I omit the table name: [MeasureName]

I do this so that my formulas are more readable. If I see a reference with a table name preceding it, I know immediately that it’s a column, and if I see a reference that lacks a table name, I know it’s a measure.

Additionally, there are many situations in which omitting the table name on a column reference will return an error. Following this best practice avoids that issue as well.

Best Practice: Assign Measures to the Right Tables

The “Table name” box in the measure editor controls which table the measure will be assigned to in the field list.

Figure 90 If you set this dropdown to the Sales table…

Figure 91 …the measure will be “parented” to the Sales table in the field list.

Simple Rule: I assign my measures to the tables that contain the numeric columns used in the formula.

This is merely good hygiene so that your model is easier to understand later (by you or by someone else). If a measure is returning numbers from a column in the Sales table, I assign that measure to the Sales table. Assigning it to the Customers table would confuse me later on – it would make me think this somehow evaluated number of customers rather than amount of sales. (I used to think that which table you assigned a measure to actually impacted the results of measures, but I have found no case where that is true.)