5- Intro to Calculated Columns

Two Kinds of PowerPivot Formulas

When we talk about DAX (the PowerPivot formula language, which you should think of as “Excel Formulas+”), there are two different places where you can write formulas: Calculated Columns and Measures.

Calculated Columns are the less “revolutionary” of the two, so let’s start there. In this chapter I will introduce the basics of calculated columns, and then return to the topic later for some more advanced coverage.

Adding Your First Calculated Column

You cannot add calculated columns until you have loaded some data. So let’s start with a few tables of data loaded into the PowerPivot window:

DAXCh05-1.png

Figure 20 Three tables loaded into PowerPivot, with the Sales table active

Starting a Formula

You see that blank column on the right with the header “Add Column?” Select any cell in that blank column and press the “=” key to start writing a formula:

DAXCh05-2.png

Figure 21 Select any cell in the “Add Column”, press the “=” key, and the formula bar goes active

Referencing a Column via the Mouse

Using the mouse, click any cell in the SalesAmt column:

DAXCh05-3.png

Figure 22 Clicking on a column while in formula edit mode adds a column reference into your formula

Referencing a Column by Typing and Autocomplete

I am going to subtract the ProductCost column from the SalesAmt column, so I type a “-“ sign.

Now, to reference the ProductCost column, I type “[“ (an open square bracket). See what happens:

DAXCh05-4.png

Figure 23 Typing “[“ in formula edit mode triggers column name autocomplete

I can now type a “P” to further limit the list of columns:

DAXCh05-5.png

Figure 24 Typing the first character of your desired column name filters the autocomplete list

Now I can use the up/down arrow keys to select the column name that I want:

DAXCh05-6.png

Figure 25 Pressing the down arrow on the keyboard selects the next column down

And then pressing the up arrow also does what you’d expect:

DAXCh05-5.png

Figure 26 The up arrow selects the next column up

Once the desired column is highlighted, the <TAB> key finishes entering the name of that column in my formula:

DAXCh05-7.png

Figure 27 <TAB> key enters the selected column name in the formula and dismisses autocomplete

Now press <ENTER> to finish the formula, just like in Excel, and the column calculates:

DAXCh05-8.png

Figure 28 Pressing <ENTER> commits the formula. Note the entire column fills down, and the column gets a generic name.

Notice the slightly darker color of the calculated column? This is a really nice feature that is new in v2, and helps you recognize columns that are calculated rather than imported.

Just like Excel Tables!

If that whole experience feels familiar, it is. The Tables feature in “normal” Excel has behaved just like that since Excel 2007. Here is an example:

DAXCh05-9.png

Figure 29 PowerPivot Autocomplete and column reference follows the precedent set by Excel Tables

OK, the Excel feature looks a bit snazzier – it can appear “in cell” and not just in the formula bar for instance – but otherwise it’s the same sort of thing.

Rename the New Column

Notice how the new column was given a placeholder name? It’s a good idea to immediately rename that to something more sensible, just like we do immediately after importing data. Right-click the column header of the new column, choose Rename:

DAXCh05-10.png

Figure 30 Right click column header to rename column

Reference the New Column in Another Calculation

Calculated columns are referenced precisely the same way as imported columns. Let’s add another calculated column with the following formula:

=[Margin] / [SalesAmt]

And here is the result:

DAXCh05-11.png

Figure 31 A second calculated column, again using a simple Excel-style formula and [ColumnName]-style references

Notice how we referenced the [Margin] column using its new (post-rename) name, as opposed to its original name of [CalculatedColumn1]? In PowerPivot, the column names are not just labels. They also serve the role of named ranges. There isn’t one name used for display and another for reference; they are one and the same. This is a good thing, because you don’t have to spend any additional time maintaining separate named ranges.

Properties of Calculated Columns

No Exceptions!

Every row in a calculated column shares the same formula. Unlike Excel Tables, you cannot create exceptions to a calculated column. One formula for the whole column. So if you want a single row’s formula to behave differently, you have to use an IF().

No “A1” Style Reference

PowerPivot always uses named references like [SalesAmt]. There is no A1-style reference in PowerPivot, ever. This is good news, as formulas are much more readable as a result.

Columns are referenced via [ColumnName]. And yes, that means column names can have spaces in them.

Columns can also be referenced via ‘TableName’[ColumnName]. This becomes important later, but for simple calculated columns within a single table, it is fine to omit the table name.

Tables are referenced via ‘TableName’. Single quotes are used around table names. But the single quotes can be omitted if there are no spaces in the table name (meaning that TableName[ColumnName] is also legal, without single quotes, in the event of a “spaceless” table name).

Slightly More Advanced Calculations

Let’s try a few more things before moving on to measures.

Function Names Also Autocomplete

Let’s write a third calc column, and this time start the formula off with “=SU”…

DAXCh05-12.png

Figure 32 The names of functions also autocomplete. Note the presence of two familiar functions – SUM() and SUBSTITUTE() – as well as two new ones – SUMMARIZE() and SUMX()

We’ll get to SUMMARIZE() and SUMX() later in the book. For now, let’s stick with functions we already know from Excel, and write a simple SUM:

=SUM([ProductCost])

DAXCh05-13.png

Figure 33 SUM formula summed the entire column

Aggregation Functions Implicitly Reference the Entire Column

Notice how SUM applied to the entire [ProductCost] column rather than just the current row? Get used to that – aggregation functions like SUM(), AVERAGE(), COUNT(), etc. will always “expand” and apply to the entire column.

Quite a Few “Traditional” Excel Functions are Available

Many familiar faces have made the jump from normal Excel into PowerPivot. Let’s try a couple more.

=MONTH([OrderDate])

and

=YEAR([OrderDate])

To receive the following results:

DAXCh05-14.png

Figure 34 MONTH() and YEAR() functions also work just like they do in Excel

If you’d like to take a quick tour through the function list in PowerPivot, you can do so by clicking the little “fx” button, just like in Excel:

DAXCh05-15.png

Figure 35 PowerPivot also has a function picker dialog. Note the presence of many familiar functions.

Excel functions are identical in PowerPivot

If you see a familiar function, one that you know from normal Excel, you already know how to use it. It will have the same parameters and behavior as the original function from Excel.

OK, before anyone calls me a liar, I’ll qualify the above and say that it’s true 99.9% of the time. The keen eye of Bill Jelen has found one or two places where things diverge in small ways, but PowerPivot has done a frankly amazing job of duplicating Excel’s behavior, in no small part due to the Excel team helping them out. In most cases, PowerPivot uses exactly the same programming “under the hood” as Excel.

Enough Calculated Columns for Now

There is nothing inherently novel or game changing about calculated columns really. If that were the only calculation type offered by PowerPivot, it would definitely not be analogous to a “Biplane to jetplane” upgrade for Excel Pros.

We will come back to calculated columns a few more times during the course of the book, but first I want to introduce measures, the real game changer.