Understanding Lookup and Reference Functions

image with no caption

Lookup and reference functions help you use your own worksheet tables as sources of information to be used elsewhere in formulas. You can use three primary functions to look up information stored in a list or a table or to manipulate references: LOOKUP, VLOOKUP, and HLOOKUP. Some powerful lookup and reference functions in addition to these three are available; we describe many of them in the following sections. You can find a list of all these functions by clicking the Lookup & Reference button on the Formulas tab on the ribbon.

VLOOKUP and HLOOKUP are nearly identical functions that look up information stored in tables you have constructed. VLOOKUP and HLOOKUP operate in either vertical or horizontal orientation (respectively), but LOOKUP works either way.

When you look up information in a table, you usually use a row index and a column index to locate a particular cell. Excel derives the first index by finding the largest value in the first column or row that is less than or equal to a lookup value you supply and then uses a row number or column number argument as the other index. It is important to make sure the table is sorted by the row or column containing the lookup values first, or you will get unexpected results.

These functions take the following forms:

=VLOOKUP(lookup_value, table_array, col_index_num, range_lookup)
=HLOOKUP(lookup_value, table_array, row_index_num, range_lookup)

Table 14-1 lists LOOKUP function arguments and their descriptions. The LOOKUP function takes two forms; the first is called the vector form, and the second is called the array form:

=LOOKUP(lookup_value, lookup_vector, result_vector)
=LOOKUP(lookup_value, array)

The difference between the lookup functions is the type of table each function uses: VLOOKUP works only with vertical tables (tables arranged in columns); HLOOKUP works only with horizontal tables (tables arranged in rows). You can use the array form of LOOKUP with either horizontal tables or vertical tables, and you can use the vector form with single rows or columns of data.

The array form of LOOKUP determines whether to search horizontally or vertically based on the shape of the table defined in the array argument. If the table has more columns than rows, LOOKUP searches the first row for lookup_value; if the table has more rows than columns, LOOKUP searches the first column for lookup_value. LOOKUP always returns the last value in the row or column containing the lookup_value argument, or you can specify a row or column number using VLOOKUP or HLOOKUP.

For the VLOOKUP and HLOOKUP functions, whether Excel considers a lookup table to be vertical or horizontal depends on where the comparison values (the first index) are located. If the values are in the leftmost column of the table, the table is vertical; if they are in the first row of the table, the table is horizontal. (In contrast, LOOKUP uses the shape of the table to determine whether to use the first row or column as the comparison values.) The comparison values can be numbers or text, but it is essential that they be sorted in ascending order. No comparison value should be used more than once in a table.

The index_num argument (sometimes called the offset) provides the second index and tells the lookup function which column or row of the table to look in for the function’s result. The first column or row in the table has an index number of 1; therefore, the index_num argument must be greater than or equal to 1 and must never be greater than the number of rows or columns in the table. For example, if a vertical table is three columns wide, the index number can’t be greater than 3. If any value does not meet these rules, the function returns an error value.

You can use the VLOOKUP function to retrieve information from the table in Figure14-2.

Remember that these lookup functions usually search for the greatest comparison value that is less than or equal to the lookup value, not for an exact match between the comparison values and the lookup value. If all the comparison values in the first row or column of the table range are greater than the lookup value, the function returns the #N/A error value. If all the comparison values are less than the lookup value, however, the function returns the value that corresponds to the last (largest) comparison value in the table, which might not be what you want. If you require an exact match, type FALSE as the range_lookup argument.

The worksheet in Figure 14-3 shows an example of a horizontal lookup table using the HLOOKUP function.

The LOOKUP function is similar to VLOOKUP and HLOOKUP, follows the same rules, and is available in the same two forms, vector and array, whose arguments are described in Table 14-1.

Like HLOOKUP and VLOOKUP, the vector form of LOOKUP searches for the largest comparison value that isn’t greater than the lookup value. It then selects the result from the corresponding position in the specified result range. The lookup_vector and result_vector arguments are often adjacent ranges, but they don’t have to be when you use LOOKUP. They can be in separate areas of the worksheet, and one range can be horizontal and the other vertical. The only requirement is that they must have the same number of elements.

For example, consider the worksheet in Figure 14-4, where the ranges are not parallel. Both the lookup_vector argument, A1:A5, and the result_vector argument, D6:H6, have five elements. The lookup_value argument, 3, matches the entry in the third cell of the lookup_vector argument, making the result of the formula the entry in the third cell of the result range: 300.

The array form of LOOKUP is similar to VLOOKUP and HLOOKUP, but it works with either a horizontal table or a vertical table, using the dimensions of the table to figure out the location of the comparison values. If the table is taller than it is wide or the table is square, the function treats it as a vertical table and assumes that the comparison values are in the leftmost column. If the table is wider than it is tall, the function views the table as horizontal and assumes that the comparison values are in the first row of the table. The result is always in the last row or column of the specified table; you can’t specify column or row numbers.

Because HLOOKUP and VLOOKUP are more predictable and controllable, you’ll generally find using them preferable to using LOOKUP.

The INDEX function has two forms: an array form, which returns a value, and a reference form, which returns a cell reference. The forms of these functions are as follows:

=INDEX(array, row_num, column_num)
=INDEX(reference, row_num, column_num, area_num)

The array form works only with an array argument; it returns the value of the result, not the cell reference. The result is the value at the position in array indicated by row_num and column_num. For example, the formula

INDEX({10,20,30;40,50,60} , 1, 2)

returns the value 20, because 20 is the value in the cell in the second column and first row of the array.

The reference form returns a cell address instead of a value and is useful when you want to perform operations on a cell (such as changing the cell’s width) rather than on its value. This function can be confusing, however, because if an INDEX function is nested in another function, that function can use the value in the cell whose address is returned by INDEX. Furthermore, the reference form of INDEX doesn’t display its result as an address; it displays the value(s) at that cell address. Remember that the result is an address, even if it doesn’t look like one.

Here are a few guidelines to keep in mind when using the INDEX function:

Let’s consider some examples to see how all this works. Figure 14-5 shows an example of an INDEX function. The formula in cell A1 uses the row coordinate in cell A2 and the column coordinate in cell A3 to return the contents of the cell in the third row and second column of the specified range.

The following example is a bit trickier: Using the same worksheet as in Figure 14-5, the formula =INDEX(C3:E6, 0, 2) displays the #VALUE! error value because the row_num argument of 0 returns a reference to the entire column specified by the column_num argument of 2, or the range D3:D6. Excel can’t display a range as the result. However, try nesting this formula in another function as follows: =SUM(INDEX(C3:E6, 0, 2)). The result is 2600, the sum of the values in D3:D6. This illustrates the utility of obtaining a reference as a result.

Now we’ll show how the INDEX function works with multiple ranges in the reference argument. (When you’re using more than one range, you must enclose the argument in parentheses.) For example, in the formula =INDEX((A1:C5,D6:F10), 1, 1, 2), the reference range comprises two areas: A1:C5 and D6:F10. The area_num argument (2) tells INDEX to work on the second of these areas. This formula returns the address D6, which is the cell in the first column and first row of the range D6:F10. The displayed result is the value in that cell.