Chapter 22 Advanced Conditional Logic
In Chapter 18, you learned how to replicate the functionality of Excel’s IF() and IFERROR() functions. Of course, the IF() and IFERROR() functions are not the only functions in Excel that you use to implement conditional outputs or matches.
Unlike the formulas examined previously, replicating other formulas in Excel’s conditional logic library involves a bit more complicated techniques, including using lists and custom functions.
Multicolumn Logic
Occasionally, as you are trying to filter down raw data, you may need to filter based on multiple criteria. Consider the following data set, which is included in the Ch22 Examples\Multi-Column Logic.xlsx file:
Figure 288 A raw data set.
Replicating Excel’s OR() Function
What if your boss asked you to list only data where you sold a Talkative Parrot or the item was sold by Fred? If this were an and scenario, it would be easy: Just filter the Inventory Item column to Talkative Parrot and then filter the Sold By column to Fred. But that won’t work in this case because you’d lose any Talkative Parrots sold by John or Jane, and you’d also lose any other items sold by Fred.
If the data is stored in an Excel table, you could solve this by using Advanced Filter or by adding a column using the following formula and filtering to only results that are listed as true:
=OR([@[Inventory Item]]="Talkative Parrot",[@[Sold By]]="Fred")
But what if the data doesn’t originate in an Excel table? What if it is sourced from a database, a web page, or a text file? Plainly, that won’t work.
Pull the data into Power Query to see how you should approach this:
Since you can’t filter your data without losing required records, you’re going to need to add a custom column and apply a formula to examine each row:
You know that you’re going to need to test whether a certain criterion is true or false, using a basic formula framework along the lines of the following:
= if logic_test then "Meets Criteria!" else "No Match"
The challenge here is coming up with that logic test. Power Query doesn’t have an OR() function, so what is the equivalent?
When you’re looking for text comparisons, it is helpful to quickly scan the list of list functions contained in the Power Query formula categories.
Note: Remember that you can access the list of Power Query formulas by clicking the Learn About Power Query formulas link at the bottom of the Custom Column dialog and then clicking the Power Query formula categories link partway down the page. You end up at the https://msdn.microsoft.com/en-us/library/mt296612.aspx web page.
Within the list functions, you’ll find that there is a List.AnyTrue function, which sounds somewhat promising. Selecting that item in the documentation reveals the following example:
List.AnyTrue({2=0, false, 1 < 0 }) equals false
Based on this, you can see that the function contains a list of values, as indicated by the curly braces within the parentheses. It also shows that the function will return false if none of the items are true.
Try to nest this formula in place of the logic test:
=if List.AnyTrue(
{[Inventory Item]="Talkative Parrot",[Sold By]="Fred"}
)
then "Meets Criteria!"
else "No Match"
Note: Remember that you need to separate the criteria with commas and then surround the entire list of criteria with curly braces because the List.AnyTrue() function requires a list as a parameter.
Upon clicking OK, you can see that the formula returns your message in any case where the sales item equals Talkative Parrot or the Sold By field holds Fred:
Figure 289 Replicating Excel’s OR() function using List.AnyTrue().
Since the function returns true if any of the criteria are true, any instances where Fred sold a Talkative Parrot would also display the result Meets Criteria! in the Match? column.
You can now finalize this query by taking the following steps:
Power Query returns a table containing a total of 88 rows out of the original 332.
Replicating Excel’s AND() Function
While it’s true that you can reduce records by filtering a table column-by-column, what if you only want to tag records where a condition exists in multiple columns? For this you need to replicate Excel’s AND() function.
The trick to this is essentially the same as the trick with replicating the OR() function with List.AnyTrue(), except that for AND() you need to use the List.AllTrue() function. This function returns a true value only if every logical test provided returns a true value—just like Excel’s AND() function.
Take a look at how it differs from the previous function:
As you can see, the results are quite different than the results you get by using the original List.AnyTrue() function:
Figure 290 Using the List.AllTrue() function to tag records based on multiple-column criteria.
While this example obviously filtered the data based on the results from the formula output, the great thing about this function is that you can easily tag records without filtering first. This allows you more flexibility in building more complicated logic while still preserving your original data—something that can’t be done if you filter columns to drill into the end values.
Replicating Excel’s VLOOKUP() Function
As much as some people try to avoid VLOOKUP(), it is an incredibly useful function for Excel pros. Those who love it will certainly want to replicate its functionality in Power Query at some point. However, depending on which version of VLOOKUP() you need, it can be quite tricky to implement.
VLOOKUP() Using an Exact Match
In truth, you don’t need to do anything special to emulate VLOOKUP’s exact match, as this functionality can be replicated by simply merging two tables together, as described in Chapter 9.
VLOOKUP() Using an Approximate Match
Replicating VLOOKUP()’s approximate match is a totally different case than the exact match scenario. It requires some logic to emulate those steps because you’re not trying to match records against each other but actually trying to find the closest record to our request without going over. While you won’t create the function from scratch here, you will see the function and how it works.
To get started, open Ch22 Examples\Emulating VLOOKUP.xlsx. In this file you’ll find two tables: BandingLevels and DataTable.
Figure 291 The BandingLevels table.
Figure 292 The DataTable table.
If you review the functions in columns B:D of the DataTable table, you’ll see that they contain VLOOKUP() functions, as shown in the column headers. Each column is looking up the value shown in column A for that row against the BandingLevels table. Columns B and D are returning the value from column 2 of the BandingLevels table, and column C is returning the value from the Alt Band column of the BandingLevels table.
In addition, notice that columns B and C are returning approximate matches because the fourth parameter has either been set to True or omitted. Column D, however, is asking for an exact match (as the fourth parameter has been set to False), resulting in all records returning #N/A except for the very last one.
You should set up the Power Query function we need now, and then you’ll see how it replicates Excel’s version of VLOOKUP():
With the function created, you need a pointer to the BandingLevels table:
We are now ready to see how it works. Pull in the DataTable, and remove all of the Excel versions of the functions.
You are now down to a single column of data:
Figure 293 Ready to try the pqVLOOKUP function.
To see if the pqVLOOKUP function works for you, you can try to replicate the following formula:
=VLOOKUP([Values],BandingLevels,2,true)
To do that, you can take the following steps:
The results are identical to what Excel would show:
Figure 294 Replicating VLOOKUP() with an explicitly defined approximate match.
This looks good. Now try leaving the ,true off the end and returning the third column from the lookup table instead of the second:
=pqVLOOKUP([Values],BandingLevels,3)
The results are again identical to what Excel would show:
Figure 295 Replicating VLOOKUP() with an implicit approximate match.
Try one more. What if you wanted to define an exact match against the second column of the lookup table? To do this:
=pqVLOOKUP([Values],BandingLevels,2,false)
Once again, the results are spot on with what Excel delivers:
Figure 296 Replicating VLOOKUP() with an exact match.
Warning: Even though you can use this function to emulate VLOOKUP()’s exact match, you shouldn’t. The reason is that you can accomplish an exact match effect by merging tables together—a method that will be much faster. If you need the approximate match functionality, however, this is a viable method.
At this point, you should be aware of one minor difference between Excel’s VLOOKUP() and the pqVLOOKUP function: the #N/A value returned by pqVLOOKUP is actually text, not a true error, as you can see below.
Figure 297 pqVLOOKUP’s #N/A “errors” are actually text.
Returning text is as close as you could get when returning error, as there is no way to output a true #N/A error in Power Query.
Understanding the pqVLOOKUP Function
So how does the pqVLOOKUP function work? Take a look at the code:
(lookup_value as any, table_array as table, col_index_number as number, optional approximate_match as logical ) as any =>
let
/*Provide optional match if user didn’t */
matchtype =
if approximate_match = null
then true
else approximate_match,
/*Get name of return column */
Cols = Table.ColumnNames(table_array),
ColTable = Table.FromList(Cols, Splitter.SplitByNothing(), null,
null, ExtraValues.Error),
ColName_match = Record.Field(ColTable{0},"Column1"),
ColName_return = Record.Field(ColTable{col_index_number - 1},
"Column1"),
/*Find closest match */
SortData = Table.Sort(table_array,
{{ColName_match, Order.Descending}}),
RenameLookupCol =
Table.RenameColumns(SortData,{{ColName_match, "Lookup"}}),
RemoveExcess = Table.SelectRows(
RenameLookupCol, each [Lookup] <= lookup_value),
ClosestMatch=
if Table.IsEmpty(RemoveExcess)=true
then "#N/A"
else Record.Field(RemoveExcess{0},"Lookup"),
/*What should be returned in case of approximate match? */
ClosestReturn=
if Table.IsEmpty(RemoveExcess)=true
then "#N/A"
else Record.Field(RemoveExcess{0},ColName_return),
/*Modify result if we need an exact match */
Return =
if matchtype=true
then ClosestReturn
else
if lookup_value = ClosestMatch
then ClosestReturn
else "#N/A"
in Return
The code is fairly long and complex, and it uses a variety of tricks, but the basic methodology is this:
1. Pull in the data table.
2. Sort it descending by the first column.
3. Remove all records greater than the value being searched for.
4. Return the value in the requested column for the first remaining record unless an exact match was specified.
5. If an exact match was specified, test to see if the return is a match. If it is, return the value. If it is not, return #N/A.
Note the following in the code:
It should go without saying that this is not a function you’ll knock out in a few minutes. It is long and complicated, and it took several hours of development and debugging in order to get it correct. It is, however, a fairly robust function in the way it works, and it showcases how to build complex functions using Power Query.
Replicating Power Pivot’s SWITCH() Function
Power Pivot has a function called SWITCH() that allows you to perform multi-condition logic by declaring a table of index values and results and then passing in a variable index value. The function then looks up the provided index value and returns the matching result. This is easier to maintain than several levels of nested IF() statements, so it sometimes makes sense to replicate the SWITCH() function in Power Query.
The syntax for this function in Power Pivot is as follows:
=SWITCH(expression,value_1,result_1,[value_2,result_2],…,[Else])
One example of where this can be really useful is when breaking down encoded patterns like customer billing codes, where each character represents something specific. Take a code like the MP010450SP, where the ninth character could be one of the following:
E = Employee, S = Yacht Club, N = Non-Taxable, R = Restricted,
I = Inactive, L = Social, M = Medical, U = Regular
To break this apart in Excel, you could build a function with many nested IF statements and build a VLOOKUP() based on the letter options. In Power Pivot, though, it’s much easier with the SWITCH() function, as follows:
=SWITCH([Column],"E","Employee","S","Yacht Club",
"N","Non-Taxable","R","Restricted","I","Inactive",
"L","Social","M","Medical","U","Regular","Undefined")
Note: There are several ways to accomplish this goal. You could, for example, extract just the ninth letter and merge the results against a table. The purpose of this section is to give you yet another alternative.
Building a Power Query SWITCH() Function
Building the function isn’t overly difficult once you know the basic structure. Here’s how you get started:
(input) =>
let
values = {
{result_1, return_value_1},
{input, "Undefined"}
},
Result = List.First(List.Select(values, each _{0}=input)){1}
in
Result
This code is the basic framework for any SWITCH() function. These are the key parts to recognize here:
Using this structure, you can modify the fnSWITCH function for your scenario as follows:
(input) =>
let
values = {
{"E", "Employee"},
{"S", "SCYC"},
{"N", "Non-Taxable"},
{"R", "Restricted"},
{"I", "Inactive"},
{"L", "Social"},
{"M", "Medical"},
{"U", "Regular"},
{input, "Undefined"}
},
Result = List.First(List.Select(values, each _{0}=input)){1}
in
Result
The changes you made here were simply to replace value_1 with “E” and return_value_1 with “Employee” and then to add more list pairs of potential inputs and desired values to return. Note that you’re not restricted to looking up single characters. You can look up values or longer text strings just as easily; just make sure that your options are always entered in pairs between curly braces and have a comma at the end of the line.
When you’re done making the modifications:
Now that the fnSWITCH() function has been created, you can use it to extract the billing type from each customer record in this file.
The data loads into Power Query as follows:
Figure 298 The raw data table.
The fnSWITCH() function is designed to convert the ninth character from the BillingCode into the corresponding customer type. In order to use it, you need to extract that character:
=fnSWITCH(Text.Range([BillingCode],8,1))
Note: Remember that you need to start at character 8 in order to get the ninth character because Power Query uses base 0 indexing for this parameter of the Text.Range function, as shown in Chapter 17.
The results are perfect:
Figure 299 Results of the fnSWITCH function.
You’ll find that all the sample codes in the data work just fine and that nothing comes back as undefined. Follow these steps to see how the query reacts to different items:
When you refresh the table, it evaluates the new character and returns the appropriate result. Here you can see what happens when the first record’s billing code is updated to MP010450XP:
Figure 300 X is not a valid character for the billing code.
Note that the function as written above is case sensitive, meaning that the code MP010450uP would also return Undefined, even though MP010450UP is a valid code. In the case of the original source of the data, this was entirely expected because valid codes are made up of uppercase letters.
If you wanted to accept either uppercase or lowercase, you would need to modify the Billing query (not the function) and force the results of the Text.Range function to uppercase:
=fnSWITCH(Text.Upper(Text.Range([BillingCode],8,1)))
As you can see below, this adjustment allows you to pass lowercase values into the function and still get a positive result:
Figure 301 The customer type is calculated consistently for uppercase and lowercase letters.