Chapter 21 Creating Custom Functions

If you think way back to Chapters 3 and 4, you might remember a couple of issues with the way that Excel pros have historically imported and appended data sets. This was the classic method to consolidate these files:

1. Import and transform file #1.

2. Import and transform file #2.

3. Copy file #2’s data and paste it at the end of file #1.

4. Save file #1 as a consolidated file.

And then, as each new month came along, a similar import, clean, copy and paste workflow would follow.

This, of course, led to several dangers in both the transformation steps and the copy and paste. The more complicated the transformations required, the longer the process ran and the more the job turned over into new hands, the more likely something would go wrong. Steps would be missed or performed incorrectly, resulting in a compromised data set.

While you can solve many of these issues by creating append queries, and with enough foresight you can import all files in a folder, what happens in the following situations?

Either of these scenarios creates problems, but each can be solved by creating a custom function. These functions can be used to take an existing set of Power Query code, package it up, and allow it to be executed multiple times. In the case of the first scenario, you can reuse what you’ve already built and apply it consistently to each new file (reducing the human error factor). In the case of the latter, you can apply the process to each file, landing the result of each preprocessed set of data into a column of tables that can then be combined.

And the best part? It’s actually not that difficult!

Building a Custom Function

In order to build a custom function, you basically follow a three-step process:

1. Build a single-use scenario first.

2. Convert the single-use scenario into a function.

3. Call the function from another query.

This sounds easy enough in practice. Let’s take a look how to put it all together.

Rather than build a new use scenario, you will fall back on the scenario from Chapter 18 on conditional logic, where you imported the 2015-03-14.txt timesheet file. In that chapter you already built the code to perform the import you needed, targeted at a specific timesheet.

Not surprisingly, you’d now like to apply the same logic to a different file. Rather than just change the source, however, you’d also like to consolidate those files and any others that are later added to the directory. To do this, you can use a custom function.

Building a Single-Use Scenario

You already built the single-use scenario previously, and you can just load it to begin your work here. Open the Ch21 Examples\Custom Functions.xlsx file to get a copy of the routine that you built previously. This file contains code identical to what was built in Chapter 18, with one exception: The file path is pointed to a subfolder called Source Files, which now contains three separate files (including the original 2015-03-14.txt file).

When you go to Edit Query, you get an error message because the Source step refers to a location that is different from where you have copied the sample files. To resolve this:

1. Click the Go to Error button at the top right of the yellow error message bar. Power Query shows the path that is currently being used.

2. Click the Edit Settings button, and in the dialog box that opens in the middle of your screen, click Browse.

3. Navigate to the folder where you have copied the source files, select 2015-03-14.txt and click OK, and then click OK again.

Remember that the query built loads the records from the text file, extracts the employees into a new column, and fills them down before removing helper column and extraneous rows:

Figure 277 The query to reformat the timesheet.

Converting the Query into a Function

Next, you need to convert the query into a function. This involves three steps:

1. Come up with a name for the variable that will hold the data you wish to replace.

2. Edit the query and place the following text at the beginning:

(variable_name)=>

3. Scan your query for the data you wish to replace and overwrite it with the variable name.

It’s a good idea to come up with a variable name that is somewhat descriptive of the data it will hold, as this helps self-document your M code. The goal here is to convert the single-use scenario into a function where you can dynamically update the file path, so you should use something similar to filepath to describe what it holds. (Ultimately, you get to choose the name. Just make sure it doesn’t contain any spaces or special characters.)

Warning: Windows File Explorer has a set of rules about which characters are allowed for file names and paths. Power Query uses slightly different rules, however. For example, if you set up a folder such as D:\!!Data Monkey, it can be read by Windows, but Power Query will choke on it, as Power Query will not allow a file path to start with the ! character.

Now that you’ve determined your variable name, edit the query to turn it into a function:

Your code should now start like this:

Figure 278 The filepath variable is now in place.

Note: At this point, you’ve already converted your query into a function. Because you haven’t subbed the variable name into the code, however, the function won’t actually change anything.

The next step is to scan the code, find the existing file path, and replace it (and any quotes that surround it) with the variable name:

Your code should now look like this:

Figure 279 The filepath variable subbed into the code in place of the full file path.

Warning: When you go to select the file path using the mouse, Power Query selects extra characters as well, including the parentheses. It’s a better idea to click in front of the first quote, hold down the Shift key and arrow to the right to select the code you want to replace. Selecting text in this way doesn’t automatically grab extra characters.

At this point you can click OK, and your query changes drastically:

Figure 280 What happened to my query??

These changes are, unfortunately, entirely to be expected. While it’s great that it is obviously now a function and not a query, and it plainly states the variable that needs to be input, you’ll notice that you’ve lost all items in the Applied Steps box. Don’t worry: They’re still all there, but you just can’t see them right now.

Testing the Function

It’s a good idea to test that everything still works at this point. To do that:

Note: If you included the quotes when you copied the file path earlier, you need to remove them from both ends of the pasted value, as Power Query can’t read the path properly when it’s surrounded by quotes.

When you click OK, the query loads the information from the file and runs it through all the steps that you originally wrote:

Figure 281 The data loaded from a dynamic file path.

Notice also that you now have a new step in the Applied Steps box. This step shows that you invoked the function. While this is great for testing the function, it essentially breaks the function’s ability to be called from anywhere else. Now that you’ve tested your function, you need to remove this step:

Notice that the Timesheet table that resided in the Timesheet worksheet disappears. This is because, despite just clicking Close & Load, the query has been changed from a query to a function. Functions, by their very makeup, can only be created in a Connection Only format:

Figure 282 The query is now a function.

Calling the Function

Now that you’ve done the hard work, you need to call this query from another function. Since your end goal is to consolidate all the timesheets in a folder, you can kick off a new query to import all the timesheets at once:

Even though you’ve restricted the data to just the folder path and filename, you have everything you need to add a custom column and call the function. So do exactly that:

The trick here is to remember the name of your function and get the case right. Once you’ve done that, it’s as easy as filling it with the appropriate data. In this case, you simply concatenated the file path and file name together using the & character, just as you can do in an Excel formula. That gave you the full file path, which was then passed into the function. The function then executed all the steps contained in the function and returned a table to the column representing the processed data set for each file, as you can see here:

Figure 283 Previewing the new data import.

You can now remove the Folder Path and Name columns and import your data, thereby consolidating the files together:

There’s one last thing to check before you commit this query: the data types. This is a bit of a shame, but even though the original query you built defined the data types, you’ll find that none of those settings persisted when you turned it into a function and combined the files in this way:

Figure 284 Beware the evil Any datatype!

To safeguard against poor interpretations by Excel or Power Pivot, it’s a good idea to set those data types now. Set them as follows:

With that done, you can finally load the query:

Solution Impact

The ramifications of the process described here are rather extensive. Even if you build a single-use scenario, you can now convert it to be consistently applied across multiple files. That is a huge benefit that was previously unavailable to Excel pros, at least in easy-to-deploy fashion.

You can deal with situations where your data footprint expands across multiple files and where transformations would be too complex if the files were all combined into one before being processed. By building a single-use case, you can perfect the transformation on a smaller scale and then append the end results.

Debugging Custom Functions

One of the painful pieces of working with custom functions is that you lose the ability to step through them easily. That makes debugging custom functions a bit of a challenge.

While it’s not ideal, there is a way to convert a function back into a query so that you can test it. The unfortunate part of this process is that it is a temporary state because converting the function into a debuggable state converts it out of function mode, breaking any subsequent queries during the debugging process. However, it is the only way to accomplish the goal, so you’ll learn about it next.

Restoring Query Steps to the Applied Steps Box

In order to restore the Applied Steps box so that you can debug a function, you actually need to turn it back into a query. To do this you need to do two things:

1. Comment out the line that turns the query into a function

2. Duplicate the variable used in the initial line and assign it a value

Failure to do either of these steps will cause you to end up with a query or function that returns the wrong results at best or errors at worst.

To comment out a line in M code, you insert the characters // at the beginning of the line. This tells the Power Query engine that the remaining characters on the line should not be executed.

To duplicate the variable, you need to set up a new step after the initial let line to create and assign a value to the variable. That line must be built using the following syntax:

Variable_name = assign_value_here ,

The variable name must be the variable that is currently enclosed in the opening parentheses for the function, and the line must end with a comma. Take a look:

Wait . . . something is different than the last time you looked at this code:

Figure 285 New lines of code injected into the solution.

Where did those new lines come from?

They were injected when you clicked the Invoke button. They’re entirely unnecessary for the function to operate and can just be removed so that the function starts with (filepath) again.

Once you’ve removed those portions, modify the first three lines to read:

//(filepath)=>

let

filepath = "C:\yourfilepath\Source Files\2015-03-14.txt",

Warning: Don’t forget the final comma at the end of the line, or your code won’t work!

When you’re done, the code should look similar to this:

Figure 286 The modified code to convert this back into a query.

When you click OK, you can step through and verify what is happening in your query:

Figure 287 The steps are back.

The nice thing here is that you can even click the filepath step to see the path you set, and then you can review each step to see how it is reacting. If you find errors in the way your data is handled, you can correct them and then turn your query back into a function again.

Warning: While your function is in debug mode, any subsequent queries that refer to it will not function!

Restoring “Function”ality

To turn the query back into a function, you again need to edit the M code to do two things:

Once these things are done, your function will resume its normal operation method, and all queries using this function will be able to use it again.

Warning: Forgetting to comment out the temporary variable line will result in that line overwriting any variable passed into the function. You don’t want to forget to comment that line!