5- Intro to Calculated Columns
Two Kinds of PowerPivot Formulas
When we talk about DAX (the PowerPivot formula language, which you should think of as “Excel Formulas+”), there are two different places where you can write formulas: Calculated Columns and Measures.
Calculated Columns are the less “revolutionary” of the two, so let’s start there. In this chapter I will introduce the basics of calculated columns, and then return to the topic later for some more advanced coverage.
Adding Your First Calculated Column
You cannot add calculated columns until you have loaded some data. So let’s start with a few tables of data loaded into the PowerPivot window:
Figure 20 Three tables loaded into PowerPivot, with the Sales table active
Starting a Formula
You see that blank column on the right with the header “Add Column?” Select any cell in that blank column and press the “=” key to start writing a formula:
Figure 21 Select any cell in the “Add Column”, press the “=” key, and the formula bar goes active
Referencing a Column via the Mouse
Using the mouse, click any cell in the SalesAmt column:
Figure 22 Clicking on a column while in formula edit mode adds a column reference into your formula
Referencing a Column by Typing and Autocomplete
I am going to subtract the ProductCost column from the SalesAmt column, so I type a “-“ sign.
Now, to reference the ProductCost column, I type “[“ (an open square bracket). See what happens:
Figure 23 Typing “[“ in formula edit mode triggers column name autocomplete
I can now type a “P” to further limit the list of columns:
Figure 24 Typing the first character of your desired column name filters the autocomplete list
Now I can use the up/down arrow keys to select the column name that I want:
Figure 25 Pressing the down arrow on the keyboard selects the next column down
And then pressing the up arrow also does what you’d expect:
Figure 26 The up arrow selects the next column up
Once the desired column is highlighted, the <TAB> key finishes entering the name of that column in my formula:
Figure 27 <TAB> key enters the selected column name in the formula and dismisses autocomplete
Now press <ENTER> to finish the formula, just like in Excel, and the column calculates:
Figure 28 Pressing <ENTER> commits the formula. Note the entire column fills down, and the column gets a generic name.
Notice the slightly darker color of the calculated column? This is a really nice feature that is new in v2, and helps you recognize columns that are calculated rather than imported.
Just like Excel Tables!
If that whole experience feels familiar, it is. The Tables feature in “normal” Excel has behaved just like that since Excel 2007. Here is an example:
Figure 29 PowerPivot Autocomplete and column reference follows the precedent set by Excel Tables
OK, the Excel feature looks a bit snazzier – it can appear “in cell” and not just in the formula bar for instance – but otherwise it’s the same sort of thing.
Rename the New Column
Notice how the new column was given a placeholder name? It’s a good idea to immediately rename that to something more sensible, just like we do immediately after importing data. Right-click the column header of the new column, choose Rename:
Figure 30 Right click column header to rename column
Reference the New Column in Another Calculation
Calculated columns are referenced precisely the same way as imported columns. Let’s add another calculated column with the following formula:
=[Margin] / [SalesAmt]
And here is the result:
Figure 31 A second calculated column, again using a simple Excel-style formula and [ColumnName]-style references
Notice how we referenced the [Margin] column using its new (post-rename) name, as opposed to its original name of [CalculatedColumn1]? In PowerPivot, the column names are not just labels. They also serve the role of named ranges. There isn’t one name used for display and another for reference; they are one and the same. This is a good thing, because you don’t have to spend any additional time maintaining separate named ranges.
Properties of Calculated Columns
No Exceptions!
Every row in a calculated column shares the same formula. Unlike Excel Tables, you cannot create exceptions to a calculated column. One formula for the whole column. So if you want a single row’s formula to behave differently, you have to use an IF().
No “A1” Style Reference
PowerPivot always uses named references like [SalesAmt]. There is no A1-style reference in PowerPivot, ever. This is good news, as formulas are much more readable as a result.
Columns are referenced via [ColumnName]. And yes, that means column names can have spaces in them.
Columns can also be referenced via ‘TableName’[ColumnName]. This becomes important later, but for simple calculated columns within a single table, it is fine to omit the table name.
Tables are referenced via ‘TableName’. Single quotes are used around table names. But the single quotes can be omitted if there are no spaces in the table name (meaning that TableName[ColumnName] is also legal, without single quotes, in the event of a “spaceless” table name).
Slightly More Advanced Calculations
Let’s try a few more things before moving on to measures.
Function Names Also Autocomplete
Let’s write a third calc column, and this time start the formula off with “=SU”…
Figure 32 The names of functions also autocomplete. Note the presence of two familiar functions – SUM() and SUBSTITUTE() – as well as two new ones – SUMMARIZE() and SUMX()
We’ll get to SUMMARIZE() and SUMX() later in the book. For now, let’s stick with functions we already know from Excel, and write a simple SUM:
=SUM([ProductCost])
Figure 33 SUM formula summed the entire column
Aggregation Functions Implicitly Reference the Entire Column
Notice how SUM applied to the entire [ProductCost] column rather than just the current row? Get used to that – aggregation functions like SUM(), AVERAGE(), COUNT(), etc. will always “expand” and apply to the entire column.
Quite a Few “Traditional” Excel Functions are Available
Many familiar faces have made the jump from normal Excel into PowerPivot. Let’s try a couple more.
=MONTH([OrderDate])
and
=YEAR([OrderDate])
To receive the following results:
Figure 34 MONTH() and YEAR() functions also work just like they do in Excel
If you’d like to take a quick tour through the function list in PowerPivot, you can do so by clicking the little “fx” button, just like in Excel:
Figure 35 PowerPivot also has a function picker dialog. Note the presence of many familiar functions.
Excel functions are identical in PowerPivot
If you see a familiar function, one that you know from normal Excel, you already know how to use it. It will have the same parameters and behavior as the original function from Excel.
OK, before anyone calls me a liar, I’ll qualify the above and say that it’s true 99.9% of the time. The keen eye of Bill Jelen has found one or two places where things diverge in small ways, but PowerPivot has done a frankly amazing job of duplicating Excel’s behavior, in no small part due to the Excel team helping them out. In most cases, PowerPivot uses exactly the same programming “under the hood” as Excel.
Enough Calculated Columns for Now
There is nothing inherently novel or game changing about calculated columns really. If that were the only calculation type offered by PowerPivot, it would definitely not be analogous to a “Biplane to jetplane” upgrade for Excel Pros.
We will come back to calculated columns a few more times during the course of the book, but first I want to introduce measures, the real game changer.