DAX_Mobi-23

18- Time Intelligence with Custom Calendars: Advanced Use of FILTER()

Perhaps My Favorite Thing in DAX

Working with custom calendars in DAX has become something that I’d almost do for free, it’s so much fun. Specifically, it just feels powerful, like you can do just about anything.

That said, it took me a little while to discover the magic formula. It took some experimentation. But you won’t have to do any of that – I will give you the secret, and explain how it works.

It also provides a platform to explain a few more things about FILTER() that I left unaddressed in that chapter.

The Periods Table

A “4/4/5” Example

OK, let’s say my company operates on a “4/4/5” calendar, which is very common in retail. “4/4/5” refers to the number of weeks in each period, where a period is roughly a month. These calendars rotate through four quarters in a year, each consisting 13 total weeks.

Here’s an example – a Periods table imported into PowerPivot:

Figure 275 Periods table – 39 rows spanning from 7/1/2001 to 9/25/2004 – note the repeating 28/28/35 pattern, which is 4/4/5 weeks

How This Changes Things: We Need to “Write” Our Own Time Intelligence Functions

The critical point is not merely the existence of this table. The “wrinkle” here is that all “sales for period X” reports, as well as all comparisons of growth – versus last year, versus prior periods – must be performed according to the periods defined in this table. Likewise, all “year to date” and similar calculations must respect this table.

The “smarter” time intelligence functions like DATESYTD(), DATEADD(), and SAMEPERIODLASTYEAR() – the ones with built-in knowledge of the standard calendar – will not work properly in this regard.

So we will need to essentially write our own versions of those functions from scratch, using other more primitive functions like FILTER(), ALL(), and DATESBETWEEN().

Simple “Sales in Period” Measure

Let’s start with the basics. We want a pivot that shows something like this:

Figure 276 Simplest pivot: just display sales data according to the custom Periods table (the 4/4/5 calendar)

This is pretty straightforward actually, it’s just another example of a disconnected table.

Let’s start with two measures on the Periods table that report the Start/End dates for each period:

[PeriodStartDate] =

FIRSTDATE(Periods[Start])

And:

[PeriodEndDate] =

LASTDATE(Periods[End])

With results:

Figure 277 Start and End date measures defined against the Periods table

Note that I added a couple of text columns via formula (QtrLabel and PeriodLabel), to make the pivot display a little nicer than just a jumble of numbers on Rows:

Figure 278 QtrLabel and PeriodLabel – better label fields for the pivot, defined using simple concatenation formulas

Then let’s define a [Sales in Period] measure on the Sales table, one that respects [PeriodStartDate] and [PeriodEndDate]:

[Sales in Period] =

CALCULATE([Total Sales],
DATESBETWEEN(Calendar[Date],
[PeriodStartDate],
[PeriodEndDate]
)
)

Displayed with conditional formatting, and filtered to just 2003 and 2004 to highlight differences:

Figure 279 [Sales in Period] – note the highlighted larger values for Year (2003) and Qtr (Q2) relative to Period. Note also the “bump” in the third period of each quarter, due to the 4/4/5 structure.

Visualizing the Relationships

So far this is absolutely the same as what we covered in the chapter on disconnected tables. But just to anchor our understanding, let’s take a look at the table diagram:

Figure 280 Table diagram updated. Periods table has “dotted line” relationship to Calendar, which has real relationship to Sales. Therefore with certain measures like [Sales in Period], Periods table acts like a Lookup table to Sales. (Several tables from prior chapters removed for clarity).

Another Familiar Concept: Sales per Day

Since our periods are of varying sizes, in order to fairly compare “apples to apples” we should have a measure that compares sales per day (or per week).

First let’s write a measure that calculates how many days are currently selected:

[Days in Period] =

SUM(Periods[Days])

Figure 281 [Days in Period] measure (Start and End date measures removed from pivot)

OK, now we can write a measure that calculates sales per day in each period:

[Sales per Day in Period] =

[Sales in Period]/[Days in Period]

Figure 282 [Sales per Day in Period] – note how the length of each period does not determine the size of its value. We can now compare “apples to apples” – for “4 vs. 4 vs. 5” but also Period vs. Quarter and Year.

First New Concept: Sales per Day in Prior Period

Getting Organized First

First let’s add a PeriodYear column to the Periods table, so that we have a unique label for each period regardless of what year it is in:

Figure 283 Unique label for periods, across years

And use that on rows instead:

Figure 284 PeriodYear on Rows, [Sales per Day in Period] on Values

Desired Results

In the context of that pivot, here is what we want to end up with:

Figure 285 This is what we want: a measure that returns the [Sales per Day in Period] value for the immediately prior period. (#NUM makes sense for now since that was our first-ever period).

Revealing the Formula First, Then Explaining

Let’s take the “work backwards” theme one step further and just reveal the formula for that measure first:

[Prior Period Sales per Day] =

IF(HASONEVALUE(Periods[PeriodID]),
CALCULATE([Sales per Day in Period],
ALL(Periods),
FILTER(ALL(Periods),
Periods[PeriodID]=
VALUES(Periods[PeriodID])-1
)
),
BLANK()
)

The Greatest Formula in the World

Ignore the HASONEVALUE() test for a moment and just focus on the “meat” of the formula:

CALCULATE([Sales per Day in Period],
ALL(Periods),
FILTER(ALL(Periods),
Periods[PeriodID]=
VALUES(Periods[PeriodID])-1
)
)

On my blog, I only half-jokingly refer to this as the Greatest Formula in the World, or GFITW. It is by far the #1 pattern you need to know when dealing with custom calendars.

As you get comfortable with the GFITW, some of you may prefer to treat it as just that: a pattern that you can adapt to your needs. It’s not strictly necessary that you understand in depth why it works, at least not immediately. (I certainly have copied a few normal Excel formulas and macros off the web in my day that I didn’t fully understand at the time, no shame in it right?)

So for a moment let’s just boil it down to the pattern itself:

The GFITW Pattern – adapt and reuse this for all your custom calendar needs!

CALCULATE(<base measure>,
ALL(<custom periods table>),
FILTER(ALL(<custom periods table>),
<row test with navigation arithmetic>
)
)

Got it? Now to explain how it works, starting from a high level and then getting progressively more detailed.

“Clear Filters Then Re-filter” – Another Name for GFITW

At a high level, here’s the way to understand GFITW: you clear all existing time-related filter context, then filter it back down to a new filter context. That new filter context is one that you control, typically using math that navigates backward in your custom calendar table. For this reason, you can also think of GFITW as “Clear filters, then re-filter.”

Another way to say it: first you clear all time-related filter context, yielding a completely “blank slate.” Once that is done, you can reconstruct a brand new filter context, from scratch, without worrying about interference from the original filter context.

With that understanding, it’s not hard to “parse” the GFITW into its component parts:

• ALL(<custom periods table>) – this first ALL(), the one outside of the FILTER(), is the first part of the “Clear” phase. (Not shocking given that ALL() is all about clearing filter context right?)

• FILTER(ALL(<custom periods table>)… - the second ALL(), the one insider the FILTER(), also contributes to the “Clear” phase. (Why we need to “clear” twice is explained in the next section)

• FILTER(… <row test with navigation arithmetic>) – this is the “re-filter” phase, the part where you build up a new filter context from scratch, using whatever logic is required.

To get to the next level of detail, we need to revisit the FILTER() function, and form a slightly deeper understanding of it than we did before.

FILTER() – The Ins and Outs

When I first introduced FILTER(), I explained it as “use this function when you need to use an expression on the right side of a CALCULATE() <filter> argument.” In a nutshell, that remains true – that’s a fair way to describe when to use it.

But that doesn’t capture every difference between the FILTER() function and the “raw” <filter> arguments to CALCULATE(). There are several key points to understand about the FILTER() function before we can truly understand the GFITW. Let’s go one by one.

First Key Characteristic of FILTER() – It Always Subtracts from Filter Context, Never Overrides

Let’s go way back to the original discussion of CALCULATE(). Remember this first example?

[2002 Sales] =

CALCULATE([Total Sales], Sales[Year]=2002)

Figure 286 Normal CALCULATE() <filter> behavior is an override: we get 2002’s Sales even for 2001

Let’s try that same thing with FILTER() instead:

[2002 Sales] =

CALCULATE([Total Sales], FILTER(Sales, Sales[Year]=2002))

Figure 287 FILTER() gives us blanks for years other than 2002, because FILTER() always subtracts from filter context rather than overriding it

Note that using FILTER() in this example violates two rules of using FILTER(): one is to never use FILTER() when a simple <filter> will do. The other is to only use FILTER() against Lookup tables. This example is only provided to drive home the particular difference, that FILTER() subtracts from filter context.

Now let’s go one step further, and add an additional raw <filter> to CALCULATE() that sets Year=2003:

[2002 FILTER with 2003 raw filter] =

CALCULATE([Total Sales],
FILTER(Sales, Sales[Year]=2002),
Sales[Year]=2003
)

And the results:

Figure 288 FILTER() ALWAYS subtracts from the rest of the filter context, and it does not matter where the rest of the filter context comes from – it can come from the original pivot OR other arguments to CALCULATE(), and FILTER() just keeps subtracting from it. It’s like a black hole.

In the following illustration, the left side represents the filter context applied to a source table as determined by the original filter context from the pivot, plus all the other <filter> arguments to CALCULATE(). (Where “other” means “all <filter> arguments other than the FILTER we’re currently looking at – technically the other arguments could be FILTER’s as well).

And then the right side represents the result of the FILTER() we’re focusing on. (Ignore the numbers, they are just random data – focus on the rows).

Figure 289 Same source table (such as Sales or Periods) – original filter context on the left, results of FILTER() on the right. Each “says” that a different set of rows should be active. When CALCULATE moves into the arithmetic phase, which rows will remain active?

Because FILTER() subtracts from filter context rather than overriding it, only rows that are left active by both the original context and the FILTER() will “survive” and remain active:

Figure 290 Only the four circled rows were active in the original filter context AND in the result set from FILTER(), so only those four will remain active “when the dust settles”

So this is the final filter context:

Figure 291 Clean view of the final filter context in the example above

This Explains Why We Needed the ALL() Outside the FILTER() in GFITW

OK, so FILTER() always subtracts from filter context. Another way to say that is “FILTER() can NEVER add back to filter context something that was removed by something else.”

So this explains why the GFITW starts out with a “Clear filters” phase – if we start out with a filter context of “Period 1 of 2003” and we go backward to “Period 12 of 2002” in our FILTER(), we will end up with nothing in our resulting filter context.

CALCULATE(...,ALL(<custom periods table>),…)

So that first ALL() in GFITW starts us with a clean slate, so that whatever our FILTER() says can become “the last word” on filter context.

Second Key Characteristic of FILTER(): Its <table> Argument Respects the Existing Filter Context of the Pivot

FILTER() does not start out with an unfiltered table! Instead, it starts with what the pivot says in terms of filter context, at least by default, and ignores all other arguments to CALCULATE().

Let’s return to the function definition:

FILTER(<table>, <single “rich” filter>)

If you provide just a table name for that first argument, FILTER() will start out with whatever filter context was specified by the pivot – nothing more or less!

Let’s step through an example to see what I mean. Pretend for a moment that we had used the following formula instead:

CALCULATE([Sales per Day in Period],
ALL(Periods),
FILTER(Periods,
Periods[PeriodID]=
VALUES(Periods[PeriodID])-1
)
)

The only difference from the original formula is that I removed the ALL() around Periods in that first argument to FILTER().

Now with that formula in mind, consider the following pivot, and the filter context of the highlighted measure cell:

Figure 292 Let’s focus on the highlighted measure cell and step through an example

In that cell, the pivot is saying PeriodYear=”2002-P1.” So here’s what the Periods table looks like according to the pivot:

Figure 293 This is the PIVOT’s “opinion” of the Periods table filter context

If we have an ALL(Periods) for the first argument to CALCULATE(), that means this is the “combined” filter context from the pivot plus the non-FILTER() arguments to CALCULATE():

Figure 294 Pivot filter context plus the ALL(Periods) argument yields ALL rows active

BUT if we used FILTER(Periods, …) as our FILTER(), FILTER() will start out with the original pivot filter context – it ignores all other <filter> arguments to CALCULATE()!

Figure 295 FILTER() ignores all other <filter> arguments to CALCULATE() and starts from the pivot’s original “opinion” of filter context!

Now, of those active rows (just one row), which ones meet the requirement of [PeriodID]=6, which is what the arithmetic test boils down to? None of them do! So this is the result of the FILTER():

Figure 296 Results of the FILTER() – No rows left active

OK, so that means we have the original pivot context plus the ALL() <filter> param resulting in all rows active, and the FILTER() resulting in no rows active:

Figure 297 When these two “opinions” of filter context get merged…

Figure 298 The resulting filter context is NO rows active, which is not what we intend in GFITW

That Explains Why We Need the ALL() Inside the FILTER()

Returning to the GFITW pattern, we apply an ALL() to the Periods table outside the FILTER() function and insider the FILTER() function:

CALCULATE(<base measure>,
ALL(<custom periods table>),
FILTER(ALL(<custom periods table>),
<row test with navigation arithmetic>
)
)

We already saw that the “outside” ALL(), the one provided as a raw <filter> argument, is there because of that first key characteristic of FILTER(), which is that it always subtracts from filter context.

But now we can also understand why there’s an ALL() inside FILTER() as well – we want FILTER() itself to start from a clean slate!

OK, let’s walk through it again, this time with both ALL()’s in place:

1. The original pivot filter context:

Figure 299 Original filter context from the pivot

2. The starting point for FILTER(), after the “inside” ALL is applied:

Figure 300 FILTER(ALL(Periods)…) starting point – after that inside ALL() is applied to original pivot filter context

3. The results of FILTER(), after the arithmetic boils down to PeriodID=6:

Figure 301 Results of the FILTER() function – one row active

4. Now let’s see what the rest of the CALCULATE() <filter> arguments result in:

Figure 302 After the “outside” ALL() is applied, all rows are active

5. Lastly, we see what rows are left active after FILTER()’s “opinion” is merged with the rest of the <filter> arguments:

Figure 303 The only row that “survived” pivot filter context plus the other <filter> arguments (the outside ALL), AND the FILTER(), was just this one. Which is what we want.

Note that the order of arguments to CALCULATE() does not matter at all. If we had the FILTER() as the first <filter>, and the “outside” ALL() as the second, the results would be the same.

Summing Up: FILTER()’s Characteristics Are Why We Need Two ALL()’s in GFITW

• FILTER() results in a set of “active” rows, and a row will only remain active “when the dust settles” if it was active according to FILTER() and active according to the pivot filter context (after the other <filter> arguments are applied to it)

• The “outside” ALL() is needed because FILTER() can therefore only subtract from filter context.

• FILTER()’s <table> argument starts out only paying attention to pivot filter context, and ignores all other <filter> arguments to CALCULATE(), so the “outside” ALL() has no impact on the FILTER() itself

• That last point is why the “inside” ALL() is also required, so that FILTER() can start from a truly clean slate.

The Navigation Arithmetic

With all of that understood, the navigation arithmetic really is the simple part.

Let’s look at the entire FILTER() that I used in my [Prior Period Sales per Day] GFITW measure:

FILTER(ALL(Periods),
Periods[PeriodID]=
VALUES(Periods[PeriodID])-1
)

The part of the GFITW pattern that I call “navigation arithmetic is the second argument:

Periods[PeriodID]=VALUES(Periods[PeriodID])-1

How does that work?

At first, it seems kinda strange: how can there be a row where PeriodID equals itself minus one? The answer lies in another important detail of the FILTER() function.

The Second Argument to FILTER() Respects Original Pivot Filter Context

That’s right – even though that “inner” ALL() forces FILTER() to look at all the rows in Periods, when it’s inspecting a row via the filter test in the second argument, the original filter context from the pivot still applies. It doesn’t care about other <filter> arguments to CALCULATE(), and it doesn’t care about the first argument to FILTER() either. So VALUES() works, but so would MIN() and MAX().

And since, in the full measure formula, the GFITW was “guarded” by a HASONEVALUE() test, VALUES(Periods[PeriodID]) in this formula always returns a single value for PeriodID. (If not, subtracting 1 from it would yield an error).

Pretty nifty.

Subtracting 1 from the PeriodID in this case yields the previous period, because my Periods have consecutive PeriodID values.

In Your Periods Table, You Always Need a Numeric PeriodID Column or Equivalent

Since our navigation always comes down to some sort of math, you absolutely need a PeriodID column – one that:

• Contains a unique number for each row

• Increases as time goes on

• Has consecutive numbers for periods that are consecutive in time

Pretty simple – if you don’t have a column on which you can perform sensible arithmetic, you aren’t going to be able to navigate.

More GFITW Measures – YOY, YTD, and “Multi-granularity”

Let’s do a few more custom calendar measures.

YOY (Year over Year) Custom Calendar Measure

This one looks very much the same as our first GFITW example, and only differs in terms of the navigation arithmetic.

[YOY Period Sales] =

IF(HASONEVALUE(Periods[PeriodID]),
CALCULATE([Sales in Period],
ALL(Periods),
FILTER(ALL(Periods),
Periods[PeriodID]=
VALUES(Periods[PeriodID])-12
)
),
BLANK()
)

Really the only thing changed here is that we are subtracting 12 rather than 1.

Let’s look at the results:

Figure 304 [YOY Period Sales] – the $426,286 is great, but I definitely did NOT expect the grand total to be returned for the first 11 periods – I expected those to be blank.

Fixing That Weird Grand Total Problem for the “Should Be Blank” Cells

Why is this happening? No seriously, why the heck is this happening, please tell me :-)

I’m kidding – I know why. But I don’t like it. Folks, I personally believe this is a design mistake in the DAX engine. Not a “red alert” kind of mistake, because it’s easily corrected with a formula change, but it’s one that I wish we didn’t have to work around. (Note that other DAX pros are divided on this controversial topic).

Here’s the crux of the problem: DATESBETWEEN(Calendar[Date], BLANK(), BLANK()) returns ALL dates in the Calendar table, rather than no dates!

In the filter context for the measure cells returning the $29M number, we end up with no rows active in our Periods table. Why? Check out what VALUES(Periods[PeriodID])-12 returns as a measure:

Figure 305 There are NO rows in the Periods table with PeriodID=0. Or -1, -2, etc. So the $29M cells have NO rows active in the Periods table.

And when there are no “active” Periods rows in our filter context, our [PeriodStartDate] and [PeriodEndDate] measures return blanks, because they are just FIRSTDATE()/LASTDATE() measures against the Periods table.

And that is why we’re seeing the grand total (the $29M number) for [YOY Sales in Period] – because the FILTER() in our GFITW returns no rows in the Periods table for those cells, that leads to blanks for the start/end date measures, and then DATESBETWEEN() interprets that as “select everything.”

The fix is to return to our original [Sales in Period] measure and substitute the FILTER() function for DATESBETWEEN().

Original measure:

[Sales in Period] =

CALCULATE([Total Sales],
DATESBETWEEN(Calendar[Date],
[PeriodStartDate],
[PeriodEndDate]
)
)

New version that uses FILTER():

[Sales in Period] =

CALCULATE([Total Sales],
FILTER(Calendar,
Calendar[Date]>=[PeriodStartDate] &&
Calendar[Date]<=[PeriodSEndDate]
)
)

And all is right with the world:

Figure 306 Ah, MUCH better.

Year to Date (YTD) Measure with Custom Calendar

Let’s get right to it. First, I add a new column to my Periods table:

Figure 307 I will explain this calculated column formula in the chapter on advanced calculated columns. For now, just focus on how it is used in the measure.

Again, if you have a database as your data source, and the skill (or assistance) to manipulate it, this is the sort of column that I highly recommend be implemented in the database rather than in a DAX column.

And now, the measure:

[YTD Period Sales] =

CALCULATE([Sales in Period],
ALL(Periods),
FILTER(ALL(Periods),
Periods[PeriodID]<=MAX(Periods[PeriodID]) &&
Periods[PeriodID]>=MAX(Periods[FirstPeriodInYear])
)
)

And the results:

Figure 308 YTD Period Sales measure with custom calendar – good stuff.

Fixing YOY Period to Work on Totals, Too

We have three measures on the pivot now. Notice that two of them work at the totals level, and one does not:

Figure 309 Can we make [YOY Period Sales] work at the totals level?

There are two reasons why [YOY Period Sales] does not work at the totals level.

1. First, of course, is that we have a IF(HASONEVALUE()) test in the measure, and return BLANK() for totals.

2. More importantly, though, we have a VALUES(Periods[PeriodID])-12 in the formula, which will outright fail in the context of a total cell.

So in order to remove the HASONEVALUE() “guard,” we need to fix that VALUES()-12 part.

The best way to do that is to subtract 12 from the beginning of the current time frame and 12 from the end of it. So we subtract 12 from the MIN() and 12 from the MAX() of the PeriodID, and select only Periods that fall between that range.

In the single-period case, the beginning and the end are the same PeriodID, so nothing actually changes there. Only when multiple periods are selected, as is the case in a total cell, does this change in formula make a difference.

[YOY Period Sales] =

CALCULATE([Sales in Period],
ALL(Periods),
FILTER(ALL(Periods),
Periods[PeriodID]>=MIN(Periods[PeriodID])-12 &&
Periods[PeriodID]<=MAX(Periods[PeriodID])-12
)
)

That also allows us to remove the
HASONEVALUE() guard.

Results:

Figure 310 YOY now works on totals, too – Year and Qtr!

I can now also copy/paste that formula above and change the base measure to create a YOY version of [Sales per Day in Period]:

Figure 311 Sales per day version of the YOY measure, more useful for “apples to apples” comparisons

Fixing Prior Period to work on totals, too

This is trickier than YOY – if the current total cell is a Quarter, we need to shift back 3 periods. But if the current total cell is a Year, we need to shift back 12.

[Prior Period Sales per Day] =

CALCULATE([Sales per Day in Period],
ALL(Periods),
FILTER(ALL(Periods),
Periods[PeriodID]>=MIN(Periods[PeriodID])
- COUNTROWS(Periods)
&&
Periods[PeriodID]<=MAX(Periods[PeriodID])
- COUNTROWS(Periods)
)
)

So rather than subtracting a fixed number like in YOY, we subtract COUNTROWS(Periods), which is the number of currently selected periods – the “size” of the current time selection, in other words.

Results:

Figure 312 Prior Period: Now matches at Period, Qtr, and Year levels. No more blank totals.

The Usual “Percent Growth” Formulas

Now you can do the usual “new minus old, divided by old” trick to get the pct growth.

For example,

[Sales per Day Growth vs Same Period Last Year] =

IF([YOY Period Sales per Day] = 0, BLANK(),
([Sales per Day in Period] -
[YOY Period Sales per Day])/
[YOY Period Sales per Day]
)

And results:

Figure 313 Percent growth is as simple a calc as it always was, even though its component measures are quite sophisticated