Chapter 19 Power Query Objects
Before we take a closer look at programming with Power Query’s M language, we first need to take a detailed look at the different objects and functions that are available. Having a sound understanding of how to create these objects, sometimes from each other, will be useful when working with more advanced programming concepts.
Be aware that the focus on this chapter is not how to use these new-found powers in anything extremely practical. This chapter focuses on how to actually talk to these objects, providing a reference for you to return to when you need to talk to these objects (or portions thereof) when working in more complex scenarios.
If you’re following along on your own computer, you’ll notice that all of the key objects in this chapter (tables, lists, records, values, binaries, and errors) show up in a green font when included in a column. In addition, each can be previewed by clicking the whitespace beside that key word.
Table objects can show up in numerous places in Power Query, and we always appreciate it when they do, as they are very easy to work with. Follow these steps to see for yourself:
=Excel.CurrentWorkbook()
You can see that there is one table in the workbook:
Figure 226 The sole table in the workbook being previewed.
The benefits of tables are numerous in Power Query:
Tables do not, of course, only mean Excel workbook tables—far from it, in fact. You can find tables in many data sources, including those extracted using formulas like Excel.CurrentWorkbook(), Csv.Document(), database tables, and more, as you’ve seen in earlier chapters.
You should finalize this query before you move on:
Power Query lists are incredibly robust and useful, and in many cases they are required in order to use some of Power Query’s most powerful formulas.
The main way that lists differ from tables is that, when viewed, a list only ever has a single column of data. If you picture a list as being equivalent to your grocery list, you would simply list the names of the items you want to purchase. As soon as you start adding price comparisons between different stores, you have moved from a list to a table.
Syntax
When working with Power Query, a list can be identified by the presence of curly braces, with the list items separated by commas. In addition, textual items must be surrounded by quotes, just as they would need to be in an Excel formula. Here are two examples of lists:
={1,2,3,4,5,6,7,8,9,10}
={"A","B","C","D","E","F","G","H","I","J"}
Lists are not restricted to containing only numeric values, however. They can mix any data types at all, including other lists:
={1,465,"M","Data Monkey",{999,234}}
The key items to remember here are that the individual list items are separated by commas, and each individual list is surrounded by curly braces.
Creating Lists from Scratch
You will now create some lists from scratch to see how they work, starting with a list of individual numbers:
={1,2,3}
You now have a nice list of numbers from 1 through 3:
Figure 227 Creating a list from scratch.
In addition to the fact that you’ve created a list of values, you’ll also notice that you’re now working with the List Tools → Transform contextual tab active. Virtually all of the commands on the other tabs will be inactive at this point, making this feel like a very limited experience. Despite this, you’ll see that you still have access to keeping/removing rows, sorting, de-duplicating, and even performing some basic statistical operations on your data.
Now, as great as it is to be able to create a list from scratch like this, creating a list of numbers from 1 to 365 would be a bit of a pain. For this reason, you also have the ability to use a coding shortcut to create a consecutive list from one number through another. Change the formula in the formula bar to read as follows:
={1..365}
You get a nice consecutive list from 1 to 365:
Figure 228 Using the .. characters to create a consecutive list.
Note: You can create consecutive alphabetical lists in this manner as well, provided that you wrap the characters in quotes and use only a single character. For example, ={“A”..”J”} will work, but ={“AA”..”ZZ”} will not.
You can also use commas inside lists, provided that they are inside the quotes. Replace the formula in the formula bar with the following:
= {"Puls,Ken","Escobar,Miguel"}
Upon committing it, you get two list items, showing the names of this book’s authors:
Figure 229 You can still use commas in list output.
Converting Lists into Tables
Say that you really want to split your data into two columns, but that is not possible with a list as lists are restricted to a single column of data. In this case, you really need the richer transformational toolset that tables offer.
Not to worry: It is super easy to transform a list into a table. Just click the big To Table button in the top left of the List Tools contextual tab. This interesting dialog appears:
Figure 230 What’s this about delimiters?
You can set the delimiter setting to Comma and click OK, and your data will load nicely into a two-column table:
Figure 231 Data loaded from a comma-separated list.
Note: This dialog shows up whether there are delimiters in your data or not. If you don’t have delimiters, just click OK, and the dialog will go away.
Finalize this query:
Creating Lists from Table Columns
Sometimes you’ll want to extract the data from a single column of a query into a list. To see how this works, connect to the Sales table:
You now have the full table showing:
Figure 232 The raw table of data.
Now, what would you do if you wanted to get a unique list of the inventory items? If you were comfortable leaving it in the form of a table, you could simply remove all the other columns and then go to the Transform tab and remove duplicates. The challenge here is that it would still be in table form, and you wouldn’t be able to feed it into a function if you needed to. Instead, you’d like to get those unique items, but as a list, which gives you that flexibility. Follow these steps:
You now see a list of all the inventory items:
Figure 233 The column extracted to a list.
Before you move on, look at the formula bar. In there, you see this line of code:
=Source[Inventory Item]
This line of code refers to the Inventory Item column, as it was calculated during the Source step of this query. This gives you the M code shortcut to extract all column values into a list without using the user interface commands—something you’ll find very useful later.
With the column contents extracted to a list, you are able to perform further list operations on it, such as de-duplicating the data, like this:
You now have a list of unique items that you could feed into a different function.
It’s time to finalize this query:
We mentioned earlier that it is possible to create lists of lists. This might seem like an odd concept, so let’s explore this scenario.
Say that you have four employee IDs (from 1 through 4) involved in your sales table. These values represent Fred, John, Jane, and Mary, in that order. Wouldn’t it be nice to be able to convert those values to their names, without having to create a separate table? Follow these steps to see if you can use a list to do this:
= {{1,"Fred"},{2,"John"},{3,"Jane"},{4,"Mary"}}
Notice that here you have four separate lists, each surrounded by curly braces and separated by commas. These four lists are in turn surrounded by a single set of master curly braces, defining a master list made up of four sublists. When you commit this formula to the formula bar, Power Query gives you a list containing four lists:
Figure 234 A list of lists.
As you can see, previewing the first list shows that it is a list that holds the values 1 and Fred. This is interesting, but can you use this?
Converting this list to a table returns a single column that still contains lists, but it has an expansion arrow in the top left. Click that and look at the results:
Figure 235 The lists have expanded vertically, not horizontally!
Plainly, aggregating the lists combines them by stacking the rows, not treating each as an individual row. While you could transform this by using the index/modulo/pivot approach from Chapter 15, it is a bunch of extra work that you should be able to avoid.
Note: In order to make this work, you would have needed to define your list as is done in the author example above—not as a list of lists but rather as a list of items with the commas inside the quotes.
Finish this example with the following steps:
There are two key things to recognize here when working with lists:
Let’s look at an alternative.
While a list can be described as a single vertical column of data, a record is a list’s horizontal, multi-column counterpart. A record can be visualized as one row of a database, containing all the pertinent information related to a particular customer or transaction.
In Power Query, records can appear in table columns or in lists as you retrieve data. They can also be created on-the-fly if needed.
Syntax
Records are slightly more complex than lists in that they need to have a value of some kind, and you must also define the column names, like this:
=[Name="Ken Puls", Country="Canada", Languages Spoken=2]
Observe these key syntax points:
Note: Field (column) names do not need any punctuation surrounding them, whether they include spaces or not.
What happens when you need to create multiple records at once? The answer is that you reach to a list:
={[Name="Ken Puls", Country="Canada", Languages Spoken=2],
[Name="Miguel Escobar", Country="Panama", Languages Spoken=2]}
Creating a Record from Scratch
It’s time to circle back to your prior attempt at building the records for an employee ID table:
Here you need to create a single record:
=[EmployeeID=1,EmployeeName="Fred"]
Power Query returns your record:
Figure 236 Your first record.
As you can see, the record’s field names are listed down the left, with the corresponding data down the right. Interestingly, the data is arranged vertically, not horizontally, as you’d expect. This isn’t an issue, just something to get used to.
Also notice that you have a new Record Tools → Convert contextual tab and, if you explore the rest of the ribbon tabs, you’ll find out that they are all grayed out.
Converting a Record into a Table
Since there’s obviously not a lot that you can do with records, you can just go ahead and turn this one into a table to see what happens:
The result is probably not exactly what you expected:
Figure 237 A single record converted to a table.
You may have expected that this would show up with the field names across the top and the values in the first row. Although it doesn’t work this way, it’s easy to fix since it’s now a table:
The result looks more like what you would have originally expected:
Figure 238 The record now looks like a proper table.
In this case, this is fine, but what is going to happen if you have a bunch of records that you need to convert to a table? Finalize the query so you can move on to the next section and find out:
Creating Multiple Records from Scratch
Now say that you want to build your table out so that it encompasses all the employees. You need to build a list of records to do that:
= {[EmployeeID=1,EmployeeName="Fred"], [EmployeeID=2,EmployeeName="John"],
[EmployeeID=3,EmployeeName="Jane"],
[EmployeeID=4,EmployeeName="Mary"]}
Notice that you are still using the same format you use for a single record, but you’ve separated the records with commas and surrounded them in the curly braces needed to indicate that they are part of a list.
When you commit the formula above, it returns a list of records:
Figure 239 A list of records, with the preview showing that you got it right.
Note: The arrow in the top right of the formula bar allows you to expand it to show multiple lines at once.
Converting Multiple Records into a Table
Now you can convert this list of records to a table and see what kind of a disaster you have on your hands:
The result is a column of records that can be expanded. Interestingly, clicking that expand icon indicates that there are columns to be expanded:
Figure 240 This is looking better!
Clicking OK returns a nice set of columns and rows, exactly as you were looking for with the single record!
Figure 241 You just built a table from scratch!
It goes against conventional wisdom, but creating multiple records actually feels like it unwinds into a table more logically than a single record. The true difference is that you convert a list of records into a table in the second instance, not a single record into a table. With the records in a table column, Power Query then reads the record information correctly in order to expand it into the requisite column and rows.
You’re now at a stage where this table can be saved—and even merged into other queries if you like. Save it like this:
Creating a Record from a Table Row
When you were working with lists, you saw how you can convert a column into a list if you ever need to do this for any reason. You can also convert a row into a record. To do this, you can start with a new query:
As with earlier examples, you now have the full table showing. To extract the first record, you need to create a blank query step. To do this:
Figure 242 Creating a blank query step.
You now get a new step with the following formula in the formula bar:
=Source
=Source{0}
The result is your first record:
Figure 243 {0} = Record 1?
What just happened here?
When treated in this way, the Source step returns a list of records. Because Power Query is base 0, record 0 returns the first value in the list. (If you’d made this =Source{1}, you would have retrieved the record for the Talkative Parrot.)
Even more interestingly, you can drill in even further by appending the field name in square brackets. Try modifying the query to the following:
=Source{0}[Price]
As you can see, you just drilled right in and extracted the price for the first record in the table:
Figure 244 Drilling into record 0’s price.
To understand the relevance of this, consider a situation where you need to drill into a specific record in order to control filters. In Chapter 20 you’ll see where this technique allows you to do exactly that.
You can now finalize the query:
Creating Records from Each Table Row
To convert each row in a table into records, you need to use a little trick. Start like this:
Next, you want to convert each row in the table to a record. The challenge is that you need the index number of each row to do that. So you can reach to the Index column:
Now you are going to rename this step (not the column) in the Applied Steps box:
The query now looks as follows:
Figure 245 The Index column has been added, and the step has been renamed without the space.
Now you can reach to a custom column to convert your rows to records. The trick in this is creating an Index column, as you now have the value you need to extract your records. Why do you need this trick? You’re not going to operate on the current row but rather on the AddedIndex step’s output. This way, rather than getting a specific value (such as the first row), you can dynamically feed it into the query to get each row. Follow these steps:
=AddedIndex{[Index]}
The result is that a new column is created, and it contains the rows as records:
Figure 246 A column of records.
Note: Strictly speaking, you didn’t need to rename the Added Index column to remove the space. Doing that just makes things a lot easier in the user interface.
At this point, you could remove all other columns, and you’d simply have a column of records:
If you are working with databases, you will occasionally see columns containing values:
Figure 247 The elusive value object.
This particular object shows up only in certain cases. In order for it to appear, you must be working with a database that has a primary key and foreign key relationship set up between the tables. What’s really strange is that a value is just the way a database returns a record.
Note: This particular table can be located in the AdvertureWorks2012 database that you connected to in Chapter 8. Use the same method to connect as described in Chapter 8, connect to the Sales.SalesTerritory table, and remove the columns other than those shown.
Once you know what values you have, working with them is the same as working with the other data types that they are related to.
Note: The direction of the relationship determines what is returned to a related column when you’re working with a database. If you are in a fact table, and the link is going to the dimension table, you’ll receive a value (record). If you are in a dimension table and the link is going to a fact table, you’ll receive a table.
Binaries are essentially files. Some can be combined and read by using functions like Csv.Document(), whereas others, like Excel workbooks, cannot be combined and must be read using the Excel.Workbook() function.
The process of extracting data from these workbooks is covered extensively in Chapters 3 and 4, so we don’t explore them further here, other than to mention that they are a Power Query object that you will encounter.
There are two types of error messages that you can encounter in Power Query: step-level errors and row-level errors.
Row-level errors typically occur when you’re trying to convert data to the wrong data type or trying to operate on data before it has been converted to the correct type. You’ve seen several examples of these types of errors throughout this book. Here’s one:
Figure 248 A row-level error triggered by trying to convert countries to a Date data type.
These errors generally aren’t show-stoppers, and they can even be very useful when you’re cleaning data, as they can be treated in two ways:
Despite the fact that there is no debugging engine in Power Query, these errors are usually identifiable and often (although not always) are related to incorrect data types.
Step-level errors are a bit more serious to deal with than row-level errors. These messages block Power Query from showing anything in the output window except the error, like these examples:
Figure 249 An expression syntax error, triggered by a missing ) at the end of the line.
Figure 250 A general expression error, triggered by referring to SQL rather than Sql.
Unfortunately, Power Query’s debugging tools are particularly weak, as you can see with the following issue:
Figure 251 A syntax error caused by a missing } character but with the error message asking for a comma.
The error message debugging offering is presented on a single row (though it had to be cut and wrapped for this image). At the very end of the string is a helpful ^ character which indicates where Power Query thinks you need to place your comma. The issue, however, is that a curly brace was not provided to close the YTD Sales list, as indicated by the upward-pointing red arrow.
These issues today are challenges. It’s a shame that we have no built-in formula indenter, coloring, or IntelliSense today to help with such issues. Power Query is frequently being updated, though, and we hope to see changes in this area in the future. Until that happens, however, debugging must be done the painful way: reviewing the line and watching for key opening and closing syntax marks, commas, and the like.
The last type of object that you will encounter in Power Query is a function. Functions can occur in two places:
1. They can be inside a database, where they indicate a stored procedure.
2. They can be returned in a list from Power Query.
You’ll learn more about using and invoking functions later in the book, but here you’ll learn a trick you can use to see how functions manifest and also to discover the Power Query function list. Try this:
=#shared
Power Query generates a table of all Power Query tables in the current workbook, but more importantly, it follows the list of tables with a list of all functions included in the product. You can access the documentation for any function from this list:
Figure 252 A table of functions.
How do you use this? To see, filter the first column for Max (with an uppercase M):
The search turns up four offers:
Figure 253 All functions containing Max.
If you click on Function beside Table.Max, you see two things happen—documentation pops up behind and an invocation dialog pops up in front:
Figure 254 An invocation dialog appears in front of the documentation.
This box allows you to test the function out, but clicking Cancel makes it go away. Behind it you’ll find the full documentation for the function, so you can determine whether it does what you need.
Note that you’re not restricted to using the #shared command from a blank query. Any time you want to check Power Query’s formula documentation, you can do the following:
1. Click the fx on the formula bar to add a new step
2. Replace the code in the new step with =#Shared
3. Convert the records to a table
4. Drill into the function you want to explore
You can then step back into the earlier steps in the Applied Steps box to implement the function, and you can delete all the #Shared steps when you are finished.