Chapter 17 Power Query Formulas

Power Query’s user interface is amazing, and many of the things you need to do are already built in. Having said that, there are sure to be times when you need to do things that aren’t built in to the user interface. Even without easy-to-click buttons, there are ways to get such jobs done. You just have to reach in to Power Query’s programming language: M.

Getting Started with M

While aspects of M can get quite technical, the easiest way to get some exposure to the language is to start with custom columns and add formulas to them.

Because Power Query was built for Excel pros, you might expect that its formula language would be just like Excel’s—much like Power Pivot’s DAX formulas are identical in name to those from Excel. Unfortunately, this is not the case, as you’ll see, and you’ll have to temper your initial instincts to use those formulas as you learn a new formula syntax.

Creating Custom Columns

It is relatively straightforward to create a custom column. While inside your query:

You see this dialog, which allows you to create a new formula:

Figure 200 The Add Custom Column dialog.

Three portions to this dialog that are important:

You can easily do simple aggregations in the formula area, using syntax that you’re used to in Excel. For example, to join the two fields listed above as text, you would build a formula in the following manner:

From this, Power Query would build the following formula:

=[Column1]&[#"Column #2"]

The great thing about using the double-click interface is that, at this point, it doesn’t matter to you that the syntax of Column1 and Column #2 must be handled differently. The interface will get it right, and you can get on with completing the job.

Note: This specific syntax is explained in Chapters 19 and 20.

As you might expect, you can perform regular mathematical equations as well, including the following:

Operation

Power Query Formula Equivalent

Addition

=[Column1]+[#"Column #2"]

Subtraction

=[Column1]-[#"Column #2"]

Multiplication

=[Column1]*[#"Column #2"]

Division

=[Column1]/[#"Column #2"]

Interestingly, however, the exponent operator ^ that you are used to in Excel will not work. Exponentiation requires a custom formula:

=Number.Power([Column1],[#"Column #2"])

Discovering New Formulas

Currently, Power Query doesn’t have any IntelliSense, so it can be difficult to determine which formulas exist. While this is something that we hope will change in future, it leaves us hunting when we need to work out which formulas exist.

For this reason, in the Add Custom Column dialog, there is a hyperlink just underneath the Custom Column Formula box.

Figure 201 How to find more Power Query formulas.

Click this link to go to a web page that promises to help you learn about Power Query formulas. If you scroll down that page, you’ll find a link to Power Query Formula Categories, which then links you through to the MSDN documentation site. This site lists all formula categories and allows you to browse and find the ones you need.

Note: We highly recommend bookmarking the landing page for the MSDN site. This will prevent you from having to open the Add Custom Column dialog in order to access this documentation.

Formula Gotchas

Power Query and Excel have some significant differences in terms of how they handle inputs. The following table summarizes some of the issues that will certainly frustrate you in the beginning:

Excel

Power Query

Formulas are not case sensitive

Formulas are case sensitive

Counts using base 1

Counts using base 0

Data type conversions are implicit

Observes strict data typing

Case Sensitivity

The case sensitivity aspect is a headache but something you get used to. The trick to remember here is that in 99% of cases, the first letter of each word in a Power Query formula is capitalized, and the rest are lowercase. Whereas Excel doesn’t care which case you use and converts formulas to uppercase by default, Power Query just returns an error.

Base 0 versus Base 1

The difference between base 0 and base 1 is where the number line starts. Consider the following word: Excel. If you were to ask yourself the position of the x character, you’d probably say 2. Character E is 1, and x is 2. Does that sound correct?

Counting in this fashion follows a base 1 rule, where the first character has an index position of 1, the second character 2, and so on. It’s the way you count, and it’s the way Excel counts.

If you were to ask Power Query the same question, it would return an answer of 1. That’s a bit of a head-scratcher at first, especially since Excel counts in a base 1 fashion. You’d kind of expect Power Query to follow the same rules, but it doesn’t. Power Query starts counting at 0, not 1. So in the word Excel, the E is character 0, and the x is character 1.

Ultimately, this doesn’t generate issues that can’t be dealt with, but it does mean that when you’re building formulas in Power Query, you can end up off by one position very easily.

Data Type Conversions

Excel formulas are very forgiving with data types, using implicit conversion, unlike Power Query explicit Conversion. To understand what that means, consider the following scenarios.

In Excel you can add a value to a date and increment it to the next day. Because all dates are really just serial numbers anyway, adding 1 to the date will work perfectly.

In Power Query, if the date is formatted as a date type, you must use a specific formula to add days to it. And if you try to use the same formula to add days to a number, Power Query will give you an error as it’s not a date. This means you need to explicitly convert your fields to the data type before using them in formulas.

In Excel you can join two cells together by using the & function. Whether the cells contain text or values is completely irrelevant. Excel will implicitly convert them both to text and then join them together:

Figure 202 Implicit conversion in action: Number and text converted to text.

Observe what happens when you pull this data into Power Query and create a new column using the following formula:

=[Column1]&[Column2]

As you can see, you get a completely different result:

Figure 203 Power Query can’t join numbers and text together.

In order to fix this issue, you have to tell Power Query explicitly to convert the number to text before it tries to join the two text strings together. You do that by wrapping Column1’s input in a conversion function:

=Text.From([Column1])&[Column2]

When you explicitly convert the data in Column1 to a text value, the concatenation will work as you originally intended:

Figure 204 Explicit conversion of data types ensures that the formula works.

There are actually two ways to deal with data types in Power Query:

Useful Data Type Conversion Functions

There are several data type conversion functions in Power Query. Some of the most important of them are described in the following sections.

Converting to Text

For the most part, if you need to convert the values in a column to text, you just use the Text.From() function. There are also some additional functions, however, you can use to keep your data typing even more explicit. The following table shows how to convert different data types into a Text data type:

To Convert

Formula

Example

Anything

Text.From()

Text.From([Column1])

A date

Date.ToText()

Date.ToText([Column1])

A time

Time.ToText()

Time.ToText([Column1])

A number

Number.ToText()

Number.ToText([Column1])

Keep in mind that Text.From() will do the job of all the others, whereas Time.ToText() will not convert a number to a text value.

Dates

There are two types of dates that you need to be concerned with: those that are based on numbers, and those that come as textual dates. Different conversion functions are provided for each data type in order to turn them into a Date data type:

To Convert

Formula

Example

Numeric dates

Date.From()

Date.From([Column1])

Date.From(42986)

Text dates

Date.FromText()

Date.FromText([Column1])

Date.FromText(“Jan 31, 2015”)

Again, Date.From() actually performs the function of Date.FromText, although the reverse case is not true.

Times

Like dates, time values can arrive as either numeric time values or textual ones. Again, there are two functions for these:

To Convert

Formula

Example

Numeric times

Time.From()

Time.From([Column1])

Time.From(0.586)

Text times

Time.FromText()

Time.FromText([Column1])

Time.FromText(“2:03 PM”)

And, just as with dates, Time.From()performs the function of Time.FromText(), but again, the reverse case is not true.

Durations

A duration is the difference between two date/time values, and it allows you to work out the days, hours, minutes, and seconds to complete a task. Two functions exist, where the .From() variant can perform the job of the other:

To Convert

Formula

Example

Numeric durations

Duration.From()

Duration.From([Column1])

Duration.From(2.525)

Text durations

Duration.FromText()

Duration.FromText([Column1])

Duration.FromText(“15:35”)

Values

There are actually a large variety of numeric conversion functions. Again, while Number.From() will perform the job of all of them, there are others included as well, in case you want to force your numbers to a specific numeric type:

Convert Value To

Formula

Example

Decimal number

Number.From()

Decimal.From([Column1])

Number.From(2.525)

Decimal number (from text)

Number.FromText()

Number.FromText([Column1])

Number.FromText(“15.35”)

Decimal number

Decimal.From()

Number.From([Column1])

Decimal.From(15)

Whole number

Int64.From()

Int64.From([Column1])

Int64.From(2)

Currency

Currency.From()

Currency.From([Column1])

Currency.From(2.52)

Comparing Excel and Power Query Text Functions

When we Excel pros are trying to land and clean data, we often need to clean and split text.

If you’ve worked with Excel’s text functions for a long time, you’ll find that you are naturally inclined to try to use them to extract components from your data. And you’ll find that they just don’t work. For that reason, in this section you’ll explore how to replicate Excel’s five most commonly used text functions.

Each of the examples in this section can be found in Ch17 Examples\5 Useful Text Functions.xlsx. Each of the examples in this section begins with a set of data like the one below:

Figure 205 A sample of the data you’ll work with.

There is a set of words down the left column, and there are some extractions in the right column. Each of these extractions was performed with Excel formulas, using the function listed in the header of the second column.

The following sections compare how to get Power Query to accomplish the same thing using different functions.

Note: In August 2015, the Power Query team added the ability to extract the first character, the last character, and a range of characters to the Transform tab. Despite this, the following sections walk through the process of replicating and comparing these formulas to those of Excel as this not only provides an understanding of how to work with the M code formulas but also allows you to build more robust solutions than you can with the user interface commands alone.

Replicating Excel’s LEFT() Function

Follow these steps to pull the data into Power Query:

This should work, shouldn’t it? Power Query certainly isn’t indicating that it won’t:

Figure 206 Power Query seems comfortable with the formula.

Click OK, and despite the green checkmark, things are not as good as you’d hoped:

Figure 207 You’re pretty sure you spelled left correctly.

So despite the tool being built for Excel pros, the design team elected to use a completely different formula term to refer to the x left characters of a cell. Here is a direct comparison of the syntax between the two programs:

Syntax

Example

Result

Excel

=LEFT(text,num_chars)

=LEFT(“Excel”,2)

Ex

Power Query

=Text.Start(text,num_chars)

=Text.Start(“Excel”,2)

Ex

This means you need to edit the formula:

Warning: Don’t forget that the formula is case sensitive. Text.start, TEXT.START, and other variants will return errors!

This works, and better yet, it delivers results consistent with the Excel formula set:

Figure 208 Replicating Excel’s LEFT() function with Text.Start().

You can now finalize the query:

Warning: Never use the name of an existing function as your Power Query table name. If you had called this table LEFT, you would experience #N/A errors in the original table’s Excel formulas, as table names are evaluated before functions.

Replicating Excel’s RIGHT() Function

Based on what you know about LEFT(), you’ve probably guessed that using =RIGHT() won’t work in Power Query, and you’re correct. Again, the Power Query team chose a different function name, as shown below:

Syntax

Example

Result

Excel

=RIGHT(text,num_chars)

=RIGHT(“Excel”,2)

el

Power Query

=Text.End(text,num_chars)

=Text.End(“Excel”,2)

el

Follow these steps to see how the results stack up:

The results are shown below:

Figure 209 Replicating Excel’s RIGHT() function with Text.End().

Once again, the results are consistent with Excel’s, which is a good thing. You can now finalize this query as well:

Replicating Excel’s LEN() Function

This isn’t looking difficult so far, is it? Now try the LEN() function, whose syntax is listed below:

Syntax

Example

Result

Excel

=LEN(text)

=LEN(“Excel”)

5

Power Query

=Text.Length(text)

=Text.Length(“Excel”)

5

Based on this, the results should be exactly the same as Excel’s. Follow these steps to see what happens:

The results are shown below:

Figure 210 Replicating Excel’s LEN() function with Text.Length().

The results line up perfectly again. You’re ready to load them to a connection as well:

Replicating Excel’s FIND() Function

By now you’re wondering what the big deal is. It’s obvious that the Power Query team just changed the way we reference formulas. In fact, the team has even injected some logic to group all the text formulas together, as they all start with Text.FunctionName.

But the changes are actually a bit bigger. Let’s take a look by replicating Excel’s FIND() function.

The following base table is using the FIND() function to find a lowercase o in the words. Naturally, that works in some cases, and it returns errors in the case of words that don’t contain the specified character.

Figure 211 The FIND() function you want to replicate.

Follow these steps to see how Power Query handles this:

Before you add the custom column, you should know that the new name for FIND in Power Query is Text.PositionOf(). Based on the syntax of the FIND() function, that means that you should be able to go with this:

=Text.PositionOf("o",[Word])

Try it:

The results are far from what you’d predict:

Figure 212 What the heck is going on here?

In Power Query, the -1 result for the function informs you that Text.PositionOf() found no match. But how is that possible? The letter o plainly appears in a bunch of those words.

To find the answer, examine the full syntax for this function, as compared to Excel’s:

Syntax

Example

Result

Excel

=FIND(find_text,within_text)

=FIND(“xc”,”Excel”)

2

Power Query

=Text.PositionOf(text, find_text)

=Text.PositionOf(“Excel”,”xc”)

1

Do you see the big difference? Not only is the FIND() function masquerading under a new name, but the order of the parameters has flipped! That will certainly cause an issue. Now you need to follow these steps:

The results are better but still not quite consistent with Excel:

Figure 213 Every result is off by one.

It looks like the values returned here follow the base 0 rule. (Counting F as 0, the first o in Football would be character 1, not character 2.)

This is not a huge issue; it just means that you need to modify the formula and add 1 to the result in order to make it consistent with Excel. Follow these steps to do so:

And now things are looking . . . almost the same:

Figure 214 The numeric results line up nicely, but the errors don’t.

This case is actually an interesting one, as Power Query actually gives a nicer result than Excel when a value isn’t found. Wouldn’t it be great if you didn’t have to wrap your FIND() function in an error handler in case the character wasn’t located? You can now finalize this query:

Note: One thing you cannot do is replicate the #VALUE! error. Trying to replace 0 with #VALUE! will fail unless you convert the column to text first. Even if you did so, you would then land a column of textual numbers into Excel, which would obviously impact your ability to use the values in formulas.

Replicating Excel’s MID() Function

The last function you will attempt to replicate is Excel’s MID() function. This is a tricky one, as it actually throws an additional wrinkle into the mix as well.

In order to replicate the MID() function, you need to use Power Query’s Text.Range() function, which has the following syntax:

Syntax

Example

Result

Excel

=MID(text,start,num_chars)

=MID(“Excel”,2,2)

ex

Power Query

=Text.Range(text, start,num_chars)

=Text.Range(“Excel”,2,2)

ce

As you can see, the function has a different name, but the parameters are listed in the same order as the Excel equivalent. That is good news, at least. And yet, as you can see, the end result seems to be a character off. This is due to the fact that Power Query counts letters using base 0 instead of base 1, as Excel does. But armed with this information, you should be able to make this work. To start the process:

The results are, once again, underwhelming:

Figure 215 A couple of these worked, but why not all?

This is a bit of a shock. You were expecting the word to be off by a character but not to have errors everywhere. What happened?

Before dealing with the errors, you need to correct the position of the starting character. You can see that Bookkeeper should be returning “keep,” and it’s coming in late by one letter, returning “eepe.” That’s doesn’t require a huge fix; you just need to adjust the formula to subtract 1 from the starting parameter:

Interestingly, the results look much better:

Figure 216 Why are these formulas now returning consistent results?

One of the great features of Excel’s MID() function is that you never have to worry about how many characters are remaining in the text string. If the final parameter you provide is higher than the number of characters remaining, it will simply return all remaining characters without triggering an error. Not so in Power Query.

In order to fix this, you obviously have to make a modification to how many characters are being returned, as you don’t want to provide a number higher than the number of characters remaining. In other words, you want to return the minimum of four characters, or the number of characters still remaining in the text string.

You’re going to learn a lot more about required functions in Chapter 20. Right now, however, just know that you need to use a List function to do this, specifically List.Min().

Rather than just try to build this into your function, follow these steps to see if you can build this in a separate column:

You can see that you have a table that clearly indicates that Truck has only one extra character:

Figure 217 You have determined the number of characters remaining.

You’ll get a lot more exposure to lists in Chapter 19, but here’s a quick breakdown of the formula:

With this working, you can simply edit your original column to use this formula in place of the final 4:

When you preview the Added Custom step, your MID function equivalent should now be working:

Figure 218 The query is working for all but the last line.

Once you are comfortable that the query is working, you can take the following steps:

Despite the error, you can call this done and load the query to an Excel table. Why? Because errors just show up as empty cells anyway:

Figure 219 Errors disappear when loaded to a table, giving the output a consistent feel.

Observations

It is a real shame that not all the functions in Power Query have full equivalency with those in Excel, especially since Power Query is a tool for Excel users. Hopefully this will change in future versions; we’d like to see a new library of functions added that allow you to port your existing Excel formula knowledge directly into Power Query without needing to learn new formula names and syntaxes. Until that time, however, it is important that you test the results from the Power Query functions that seem to be Excel’s equivalents and make sure they match what you expect, as you will occasionally have to tinker with the output to get it to line up correctly.