Chapter 10 Query Loading Destinations

While the majority of the examples in this book to this point have focused on landing the Power Query output into Excel tables, that is certainly not the only option for where to place Power Query data.

In fact, you have three different loading destination options, some of which can be used in combination:

Note: The ability to load directly to the Power Pivot Data Model is not supported in Excel 2010. Having said that, just because it isn’t supported doesn’t mean that it can’t be done. You’ll find out how to do it in this chapter.

Query Loading Methods

Each of the different loading options has different benefits, drawbacks, and caveats associated with it, which we will explore here.

Excel Tables

Excel tables are the default loading experience for Power Query queries. When you simply click Close & Load in the Power Query interface, Excel creates a new table on a new worksheet to hold the query’s output. The newly created table inherits the name of the query, with a few modifications:

Warning: One caveat to be aware of here is that you should never give a query the same name as an existing Excel function. Excel tries to interpret named ranges before functions, so having a table named something like ROUND or LEFT will cause all cells using these functions to return #N/A errors.

If you are simply reading data from a table and making no manipulations to it, using the default option does give you a connection to the data in Power Query, but it also duplicates the data on another worksheet, increasing the file size and adding memory overhead to your file.

Connection-Only Queries

As you saw in Chapter 9, connection-only queries are set up to avoid landing data into a worksheet, but they still allow you to connect to the data via other Power Query queries.

This option is a fantastic one that gets used frequently in practice, as the queries set up in this fashion are loaded on demand only when called by a subsequent query. Since they don’t load any data directly to the worksheet, they also don’t increase the file size or memory overhead required to store the query output.

Note: Loading to a table and creating a connection-only query are mutually exclusive options. You can choose only one or the other.

For example, when you open Ch10 Examples\Load Destinations.xlsx, you’ll find that the workbook contains two tables of sales items for a pet store. The analyst wants to merge these tables together for use in a business intelligence solution. How does she do it?

She could load each table into a query, load the queries into worksheets, and then create an append query that would also be loaded into a worksheet. She would essentially create three copies of the data, which seems a bit ridiculous.

A better alternative is to use connection-only queries to create pointers to the original data tables and then create an append query against those connections and land the output in a table. The final table would act as the source for the analyst’s PivotTables, and there would not be an extra duplication step in the middle.

To set up this solution, you would follow these steps:

This process gives you two connection-only queries, without duplicating any data in the workbook:

Figure 128 Two connection-only queries, ready for use.

You can now create an append query to merge the two connection-only queries together:

Power Query creates a new table that holds all the data and is ready to be pivoted:

Figure 129 An append query created from two connection-only queries.

Note: When creating a new query against a connection-only query, it is not necessary to load the output to an Excel table. You could just as easily load it to another connection-only query if needed.

Loading to Power Pivot (Excel 2010)

There are actually two ways to load Power Query data to Power Pivot in Excel 2010. But only one is officially supported: loading via linked tables.

To link this data into the Power Pivot Data Model, you select anywhere inside the Sales table, go to the Power Pivot tab, and click Create Linked Table. The end result is a data flow that follows this process:

Excel table → connection-only query → append query → Excel table → Power Pivot

Note: You might think it would be easier to just go directly from your monthly sales tables, link both tables to Power Pivot, and create a relationship—avoiding the necessity of creating the append query. The issue here is that Power Pivot is great at creating relationships horizontally (replicating VLOOKUP() functionality) but it’s lousy at stacking two tables on top of each other (appending data).

The challenges with this approach are two-fold:

So while this works and is the supported method, using Power Query’s supported loading methods definitely has some drawbacks for serious modelers.

Loading to the Data Model (Excel 2013+)

In Excel 2013 and higher, the process of linking to the Data Model is actually much easier: Excel gives you a little box that you can check to have Excel do the job for you. To see this in action, you can modify the existing Sales table connection:

Figure 130 Changing the query loading options.

Making these changes triggers a warning about possible data loss. This happens because changing from Table to Only Create Connection tells Excel that you would like to remove the table that Power Query landed in the worksheet. Because that is to be expected, you can simply acknowledge the warning by clicking Continue.

Note: Should you need to, you are able to load the data to both a table in a worksheet as well as the Data Model.

After Power Query takes a little bit of time to remove the table and set things up with the Data Model, you can head in to Power Pivot and go to the Power Pivot tab → Manage.

Figure 131 The data is comfortably loaded into the Power Pivot Data Model.

You’ll also find that updates to the Power Query tables flow seamlessly into Power Pivot. Add a new column to the Sales table in Power Query to see how Power Pivot handles it:

The data is automatically updated for you, and you don’t even need to trigger a refresh.

Figure 132 A query update pushed directly into the Data Model.

Changing the Default Query Loading Settings

If you find that you’re doing a lot of modifications to the default loading experience, you can make changes to your default experience. Where you make these changes depends on your version of Excel:

Following either of these sets of instructions opens the Query Options dialog, where you can change your default loading behavior:

Figure 133 Changing the default loading settings.

To overrule the default settings, select Specify Custom Default Load Settings and then configure these settings as you wish. The trick here is that by unchecking Load to Worksheet, you create a connection-only query. You can then optionally select the option Load to Data Model (in Excel 2013 or higher.)

Loading Directly to Power Pivot in Excel 2010

Let’s be fair: Having to load data into the Excel 2010 Data Model via a table seems crazy. And, in fact, there is a way to go directly to the Data Model without going through a linked table first. A full caveat on this, however, is that this is not supported by Microsoft, which means you do this at your own risk.

Connecting to the Data Model

Connecting directly to the Data Model in Excel 2013 or higher is very easy, but in Excel 2010 you need to do a bit more work to pull it off. To see how it works, you’ll use the Ch10 Examples\Load Destinations – Pre DataModel.xlsx workbook, which contains the Sales query you created earlier, loaded to a worksheet but not yet linked to Power Pivot.

As with working with Excel 2013 or higher, you want this to be a connection-only query, as the entire point is to avoid loading to a worksheet first. So you need to convert it:

Now you need to link the Sales query directly to Power Pivot’s Data Model. This is done inside the Power Pivot window:

This is perfect: The queries are all listed in Power Pivot already!

Figure 134 The Power Query queries are existing connections!

You now see the following screen, which allows you to specify the SQL query to read from your query. Strangely, however, you can’t edit the SQL statement.

Figure 135 The query with un-editable SQL.

Warning: This is your only chance to set the Power Pivot table name . . . ever. Update the name of the query to something more logical here and make a mistake at your peril as it is un-editable after this point.

The query is now loaded to the Data Model:

Figure 136 A query loaded directly to the Excel Data Model.

Adding Columns to the Query

Now, what about adding that Month column? Go back to the Sales query and add it:

Now, take a look at how it updates:

Things do not go well:

You get an error immediately. And clicking on Error Details yields a message that says this (and more):

The query ‘Sales’ or one of its inputs was modified in Power Query after this connection was added. Please disable and re-enable loading to the Data Model for this query.

This is obviously not good. So how do you fix it?

The issue is that Power Query queries are compiled into a complicated connection string. And every time a source query is modified, the connection string is changed. Because you linked to the connection string when setting up this query, you need to update that string, and this is how you do it:

But what’s this? While the connection now refreshes, the new Month column isn’t present?

As if the initial process weren’t long enough, you need to re-confirm the table query as well by going to Design → Table Properties → Save.

Now you’re done. A simple open and save, and you finally see your new column:

Figure 137 The new month column finally arrives in Power Pivot.

Removing Columns from a Query

How do you remove columns from a query? A similar update process is required:

This process may seem incredibly painful, but if you follow these steps, you can easily update an Excel 2010 Power Pivot table when changes are made to your Power Query queries. The process is tedious, but at least it’s always the same.

Note: The problem with loading directly to Power Pivot in Excel 2010 is that both Power Query and Power Pivot were separate add-ins for Excel 2010. Power Pivot was integrated into Excel 2013 and Power Query in 2016. So from Excel 2013 on, Power Query and Power Pivot were able to start talking to each other properly, and Excel can handle this process without manual intervention.

Data Model Dangers in Excel 2010 and 2013

Because Power Query is a separate add-in from Excel 2010 and 2013, there are some very dangerous idiosyncrasies that you need to be aware of when working with Power Query. Fortunately, these have been solved with Excel 2016 and for some people using Excel 2013, but if you are developing in earlier versions, you either need to go “all in” with Power Query or avoid using it.

How to Corrupt Your Model

Corrupting your model is deadly easy, and the worst part is that you may not realize for months that you’ve done it. Here is a very easy way to do it (and it’s safe to try it here, so go for it!):

Your model is broken, but it certainly doesn’t look like it, does it? In fact, it continues to refresh whether you go to Home → Refresh in Power Pivot or Data → Refresh All in Excel. So what’s the big deal?

Figure 138 The model still refreshes, so what’s the problem?

The model is damaged, but your workbook will continue to refresh, possibly for months, before you figure it out. What will trigger the issue to actually rear its ugly head? Any change to the underlying query.

To see what we mean, go back and add that month column again, using the steps outlined earlier in this chapter. Whether you use Excel 2010 or 2013, once you’ve followed the normal steps to update your table, the data will fail to load to the Power Pivot Data Model:

Figure 139 An error? But it refreshed yesterday!

What’s even worse is that the error message provided only indicates that there is something wrong with the connection—but not what or how to fix it.

Can you even check a model to see if it’s damaged? Yes, you can:

If the connection string is grayed out and you get a message in the bottom-left corner that reads “Some properties cannot be changed because this connection was modified using the PowerPivot Add-in,” you’re too late. The damage has been done, and it’s irreversible.

Figure 140 Power Query’s game over screen.

The most awful news about this is that there is only one fix. You must:

1. Delete the table, including any measures and calculated fields that live on it, and then

2. Rebuild the table, measures, and calculated fields from scratch.

It’s horrendous, it’s ugly, and it’s feared. It’s also totally avoidable if you are disciplined.

Critical Do’s and Don’ts to Avoid Corrupting Your Model

Using Power Query to source data for your Power Pivot Data Model is completely stable, provided that you follow certain rules.

If your data has been loaded to the Power Pivot Data Model from Power Query, then you should never do any of the following:

Any of these actions will immediately corrupt the model, setting the table into a non-refreshable state the next time the Power Query query is modified.

Keep in mind that it is safe to perform the following:

The basic rule of thumb is this: If you loaded your data from Power Query, make all your modifications to that table in Power Query. If you follow this rule, you won’t ever experience the problem of corrupting your model.