Chapter 6 Unpivoting Data

One of the big issues we Excel pros face is that no matter where we get our data, it doesn’t always arrive in a useful state. So not only do we waste time getting the data into a workbook to begin with, we then have to spend even more time cleaning it up and changing its layout in order to work with it.

The Curse of Pivoted Data

Say that a user has started tracking his sales on a daily basis, and he sends his data to you in the format shown below:

Figure 76 The dreaded pivoted data set.

Naturally, after tracking his sales in this way for days or weeks, he brings it to you and asks you to build a variety of different reports from it. The answer to this dilemma is, of course, to build PivotTables against the data source. But the issue is that this data set is already pivoted.

This problem comes up for Excel pros all the time. PivotTables were built to quickly turn tables of data into reports that users can more easily consume. The challenge is that users think in this kind of output format, not in tabular format, so they tend to build their data in the format that a PivotTable produces, not in a format that a PivotTable consumes.

Many users think that a simple transposing of the data set will work, but you know that this only changes the look of the data—it doesn’t truly convert it into a format that PivotTables are ready to consume.

Figure 77 Transposed data (on the left) vs properly unpivoted data (on the right).

The worst part about this issue is that in the past, there was no tool to easily convert the data back from pivoted to unpivoted, which meant a huge amount of labor was needed to pull this off. But here’s another place where Power Query comes to the rescue.

Unpivoting with Ease

This example shows how your life truly changes with Power Query. Open the Ch06 Examples\UnPivot.xlsx file and get ready to unpivot the data set within:

Figure 78 Raw data build in a pivoted format.

Preparing the Data

As you can see, the data is contained in a worksheet, but it has no table associated with it. You already know that you can fix that, so click somewhere in the data and create a new query:

The data lands in Power Query, and you now have the ability to make any transformations or manipulations you need. The overall goal here is to unpivot the data, but there is a column you really don’t need. The Total column can be removed because you can simply rebuild it with a PivotTable. So right-click the Total column → Remove. You’re now left with just the key data: the Sales Category column and a column for each day.

Unpivoting Columns

To unpivot the data set in UnPivot.xlsx, follow these steps:

The results are simply astounding: You’re already done.

Figure 79 Unpivoting magic in action.

You have only a couple more changes to make here before you finalize your data set:

Can you believe how easy this is?

Note: Notice that there is no need to use Change Type Using Locale in this instance. Since the data already resides inside Excel, Power Query will recognize this data correctly no matter what your regional settings are.

Repivoting

Next you can build a couple of PivotTables from this data. First, you rebuild the user’s existing data:

Now you can build an alternate view from the same data set:

Now you have two completely different sets created very easily from a set of unpivoted data:

Figure 80 Two PivotTables built from an unpivoted data set.

Surviving an Update

At this point, you’d probably be fairly comfortable saving the file, returning it to the user, and letting him continue to update it. After all, Power Query solutions can be refreshed at any time.

But say that you do so and the user makes updates and sends it back to you. Upon opening the file, you see that the user has done things that only an end user could think of as acceptable:

Figure 81 The table, as returned by the end user.

Looking through the changes, you’re astounded to see the following issues:

How will the refresh fare, given these changes? To find out, go to the Data tab and click Refresh All two times (once for the Power Query and once for the PivotTables).

The results are nothing short of amazing, as you can see below:

Figure 82 Your PivotTables still work, despite your end user’s actions.

Every issue that your user threw at you was handled. The totals are there, the data is in the right order, and the historical values have been updated.

Understanding the Unpivot Functions

There are actually two unpivot functions on the right-click menu in Power Query: Unpivot Columns and Unpivot Other Columns.

How Unpivot Other Columns Works

To be fair, we had you make some unnecessary steps in the example above. Instead of selecting the 1/1/2014 through 1/7/2014 columns, you could have just right-clicked the Sales Category column and chosen the Unpivot Other Columns command. What this command does is unpivot every column other than the one(s) you selected. From a logical point of view, it would have made complete sense when the solution was updated that it just worked as well.

Having said this, the demo was set up this way for a reason—to explain what really happens when you use the Unpivot Columns command.

How Unpivot Columns Actually Works

Based on the terminology of the user interface, you would have expected that when you recorded the step above to unpivot the 1/1/2014 through 1/7/2014 columns, Power Query would hard code those specific columns. It should be plainly obvious now that this is not the case.

Here’s what Power Query actually did: It looked at all the columns in the data set and determined that there was one column you did not select. Rather than build you a specific “unpivot these columns” command, it actually recorded code that says “unpivot everything except the column(s) that weren’t selected.” In this case, that was “unpivot everything except Sales Category.”

While this change seems subtle, it actually has some fairly large impacts, both good and bad.

The good news is that it is very difficult to make a mistake and build a scenario that blows up when new daily data columns are added. Essentially, it future-proofs your solution for new dates, as they will always be pulled in.

The bad news is that there is no way through the user interface to lock in a specific “unpivot this column only” command. This means that if you have an Unpivot Columns command in your query, and you add a new column to the data table, it will be unpivoted as well.

Locking Columns to Unpivot

While this need should be unusual, there may come a time when you need the ability to lock in a specific column or columns that you want to unpivot and ensure that new columns added to your source data are not unpivoted by default. Doing so takes some manipulation of the formula in the formula bar.

Assume that you only want to unpivot the Total column, for example. You would take the following steps:

Figure 83 The Total column has been unpivoted.

Here’s how this code breaks down:

If you wanted to lock your code to make sure that no other columns were ever unpivoted, you would need to change this from the Table.UnpivotOtherColumns function to use the Table.Unpivot function.

Like Table.UnpivotOtherColumns, the Table.Unpivot function takes four parameters:

So the overall formula would read as follows:

=Table.Unpivot(#"Changed Type",{"Total"},"Attribute","Value")

And the effect would look exactly the same as the version that was produced above with the original code. The difference is that any other column added to the source data table after the fact would not also get unpivoted upon refresh.

Note: In order to lock the unpivot of multiple columns, you just provide the full list of the columns to unpivot between the {} characters, with each column header surrounded in quotes and separated by commas.