4 Table Formulas

This chapter covers referencing Tables in formulas, both internal (within the Table) and external (outside the Table). It also covers some methods for dynamically referencing Table ranges.

Referencing Tables

Structured referencing is only as good as the naming conventions used to name a Table and the columns in the Table. It makes range references more readable—as long as both the Table names and Table column names are meaningful. If a Table is named "Table1" and the columns are named "Column 1", "Column 2", etc., then a structured reference using those names will not be much better than traditional A1 or R1C1 referencing. The best names are meaningful and of a reasonable length—not too short and not too long. Name length is a matter of preference, and your preference will develop as you learn more about the syntax of structured referencing.

Understanding how structured references work starts with understanding their syntax, which is shown in the next figure.

Structured reference syntax.

The following descriptions correspond to the numbers in the figure:

1. This is a single structured reference, representing a range of cells.

2. This is the Table name. If the structured reference is in a cell inside the Table, then the Table name is not required. If you don't enter the Table name when specifying multiple columns, Excel adds it automatically when you enter the structured reference. The Table name references the entire data body range, excluding header and total rows.

3. The Table reference specifies the range in the Table by specifying the column name(s) and, optionally, a special identifier that modifies the column name reference.

4. The special identifier modifies the column name(s) reference to refer to specific predefined areas of the Table:

[#All] refers to the entire Table, including the header and total rows, if they're enabled.

[#Headers] refers to the header row, if it's enabled.

[#Data] refers to the data body range if any data rows exist.

[#Totals] refers to the total row, if it's enabled.

[#This Row] refers to the same row in which the structured reference is entered. In Excel 2010, this special identifier was deprecated and the @ symbol used instead. (This is discussed in more detail later in this chapter.)

5. The column name refers to a specific Table column or a range of columns.

Implicit Intersections

When working with Tables and structured references, understanding implicit intersections is an important part of understanding how structured references function. Sometimes referred to as relative intersections, implicit intersections have been a feature of Excel for many years.

An implicit intersection occurs when a reference in a cell refers to a single column of cells in another column, with the cell containing the reference in a row that intersects the referenced column of cells. With an implicit intersection, Excel resolves the reference to the single cell in the referenced column of cells that is in the same row as the cell containing the reference.

Excel does not resolve the reference down to a single cell if the cell containing the reference to the column of cells is not in a row that intersects the referenced column of cells. It also does not resolve the reference if the reference is being passed to a function that expects an array of cells. In the former case, Excel returns a #VALUE! error.

The next figure illustrates an implicit intersection with a reference to a column in a Table that uses structured referencing.

An example of an implicit intersection.

The advantage with implicit intersections is that the same formula is used in multiple cells, but the formula takes the value from a single cell in the same row where it intersects the referenced column of cells.

Advantage of Structured References

Besides encouraging the use of implicit intersections, which eases formula maintenance, structured references have the distinct advantage of being easier to read. Before structured references, Excel users relied on named ranges to make formulas easier to read and understand. With Tables, Excel is more diligent about encouraging the use of names versus column letters and row numbers. Consider this formula using regular referencing:

=A2-B2

The strings A2 and B2 don't tell you what the formula really refers to, and you need to look further to understand the purpose of the formula. When structured references are used with reasonably well-named Tables and Table columns, the purpose of the formula is clearer, without additional discovery. For example, these more meaningful names make the formula easier to understand and maintain:

=[@Income]-[@Expenses]

NOTE

When you’re using a structured reference, if a Table column name includes manual line breaks (ALT+ENTER) and other special characters, you must include those special characters in the structured reference. (Use of special characters in column names is covered later in this chapter.) When you change column headers, Excel automatically ensures that all structured references that refer to those column names are also changed.

Creating Formulas

Clicking on cells when entering formulas is an easy way to get Excel to enter cell references instead of typing them. You can also do this with Tables, and Excel creates the structured reference for you. If Excel cannot represent the selected cells with a structured reference, it enters a standard reference. For example, in the next figure, the referenced cell is one row below the cell into which the formula is being entered.

Referencing a Table cell that cannot be represented as a structured reference.

In contrast, if the referenced cell is in the same row as the cell in which the formula is being entered, Excel creates a structured reference, as shown in the next figure.

Referencing a Table cell that can be represented as a structured reference.

CAUTION

When you enter a formula with multiple cells selected, Excel inserts standard references instead of structured references. The workaround is to enter a formula with only one cell selected.

CAUTION

If the SUBTOTAL function is used in a calculated column, each time the Table is filtered, additional rows are excluded from the filter for no reason, even though they are included in the filter criteria. This is a confirmed bug in Excel 2013. The workaround is to not use the SUBTOTAL function in a calculated column.

Referencing Header Rows

You reference a header row by using the [#Headers] special identifier. For example, this reference resolves to the header cell in the column "Category":

=tblRegister[[#Headers],[Category]]

When you reference the header row in formulas, the header row must be enabled, or the formula will return #REF!. This might not seem obvious at first because a Table always has headers, even when the header row is disabled. But when the header row is disabled, there is no range to reference, and thus Excel resolves any reference to it as a reference error.

A formula referencing a visible Table header in the Table named "Web_Table".

A formula referencing a disabled Table header in the Table named "Web_Table".

Referencing Total Rows

You reference the total row in a Table by using the [#Totals] special identifier. For example, this reference resolves to the total cell in the column "Category":

=tblRegister[[#Totals],[Category]]

When you reference the total row in a formula, the total row must be enabled, or the formula will return #REF!.

Unlike a header row, a total row can contain functions. Excel provides some built-in functions that work well with Tables and take into consideration rows hidden with the Table filter function. In addition to using these built-in functions, you can enter any formula or constant into a total row cell, and it can reference any Table part or, for that matter, any cell outside the Table.

NOTE

You cannot use data validation in a total row.

Excel shows a list of built-in functions if you select a total row cell and click the drop-down menu button. These built-in functions utilize the SUBTOTAL function. When you select one of these aggregation functions, Excel places the SUBTOTAL function in the total row cell with the appropriate parameters to produce the desired result. The table below lists the available aggregation functions and their respective SUBTOTAL function numbers.

Total Row Function

SUBTOTAL Function Number

None

Clears the cell

Average

101

Count

103

Count Numbers

102

Max

104

Min

105

Total Row Function

SUBTOTAL Function Number

None

Clears the Cell

Sum

109

StdDev

107

Var

110

More Functions…

Brings up the Function Wizard dialog

The SUBTOTAL function is not specific to Tables, but because you will see it used in the total row by default to calculate all the aggregation functions, we include a description below. The following is the syntax for the SUBTOTAL function:

SUBTOTAL(function_num,ref1,[ref2]…)

These are the various parts of the syntax:

In addition:

Referencing Columns

Each column reference is enclosed in square brackets:

tblRegister[Category]

The default behavior of this reference is to use implicit intersections to resolve the reference. When you pass it to a function that expects an array, Excel resolves it to the Table's column.

When you reference multiple columns, you use a second set of square brackets, with a colon separating the two column references, just as you do with regular references:

tblRegister[[Description]:[Category]]

This reference is always resolved to entire data area columns. To constrain the reference to just a row, you use an @ symbol in front of the first reference:

tblRegister[@[Description]:[Category]]

When you AutoFill formulas across columns, Excel treats structured references as relative references and shifts them across columns. However, it does not treat multiple-column references as relative references; it treats them as absolute references. Therefore, to prevent Excel from shifting single-column references, you can construct a single-column structured reference as if it were a multiple-column reference:

tblRegister[[Category]:[Category]]

We cover absolute referencing in structured formulas in more detail later in this chapter.

The [#This Row] or @ Special Identifier

In Excel 2007, to constrain a column reference to the row in which the reference resides, you precede that column reference with [#This Row]:

tblRegister[[#This Row][Description]]

Excel 2010 replaced [#This Row] with @:

tblRegister[@[Description]]

In the context of implicit intersections, the @ symbol is taking away some of the implicit nature of the syntax and stating explicitly "use the value from the column in this row."

Using IntelliSense

IntelliSense ("intelligent sense") is a helpful feature that makes formula entry easier. When you enter text in the formula bar, Excel presents a pop-up list of possible objects or functions you might want to use. You can select an object or function from this pop-up list to insert it into the formula. If you're using the keyboard rather than the mouse to enter a formula, you can use arrow keys to select the desired object or function and then press the TAB key to enter the selection into the formula.

When you enter a formula into a Table's cell, IntelliSense presents all the Table parts that are legal, given the syntax entered to that point.

Structured reference IntelliSense.

Using Absolute Structured Referencing

Absolute referencing and relative referencing are concepts that describe how Excel modifies references to cells as a reference is moved to other locations on a worksheet. With an absolute reference, the target does not move as the reference is moved. With a relative reference, the target moves in the same way the reference moves.

For example, with traditional A1 referencing, placing a dollar sign ($) in front of both of the column letter and the row number establishes that reference as an absolute reference (for example, $A$1). No matter where the reference is moved, the reference will always refer to A1. Without the $ signs, the target shifts with the reference.

A reference can be a mixed reference, with a $ sign in front of the column but not the row, in which case the column is not changed, and the row shifts. If the $ sign is in front of the row but not the column, the row is not changed, but the column shifts.

Structured references are naturally relative references, and there is no obvious way to make them absolute references. With regard to rows, structured references—and Tables in general—do not support the notion of specific rows, nor are they of much concern because, most of the time, formulas are concerned only about other cells in the same row. How Excel manipulates these references when you move or copy formulas depends on the method used. You can move or copy formulas from one column to another by using four different methods:

Not only does Excel shift a structured reference's target in some cases, when it shifts a target, it does so in a way that is not always useful. Excel doesn't always shift the target in the same direction as the reference; when it reaches the edge of the Table, it moves to the other edge and then continues across the Table again, like a loop. Remembering the circumstances under which references' targets are shifted and how they are shifted can be confusing.

Excel does not provide an obvious way to make a structured reference an absolute reference in order to avoid shifting reference targets. However, you can use a couple tricks to force a structured reference to be an absolute reference so that Excel does not move the target when the reference is moved. One trick is to use OFFSET or INDIRECT, but there are some drawbacks to using these functions. There is a way to create absolute structured references without additional formulas.

To create an absolute structured reference, you can use a multiple-column structured reference that targets a single column:

Table1[[Column1]:[Column1]]

This reference's target ("Column1") will not move, regardless of how the formula is copied. It is equivalent to the traditional A1 reference with absolute column referencing:

$A1

Unlike with standard cell referencing, you cannot toggle the absolute/relative syntax by using the F4 key.

The following figure shows a Table named "tblData" that is used in the following examples.

Sample data and method provided courtesy of Jon Acampora, of www.excelcampus.com.

This example shows how to accomplish a simple summary of a Table by using formulas. It demonstrates why absolute referencing is advantageous when you're developing and copying formulas using structured referencing to reference data in a Table.

In this case, you want to summarize all units by quarter and color, using the layout shown in the next figure, for the Table named "tblSum".

Formula destination range: B2:E4.

You want to build a formula that can be copied from column to column without some of the targets changing so that you won't have to build multiple formulas. Using the SUMIF function, the formula is:

=SUMIF(tblData[[Color]:[Color]],tblSum[[Color]:[Color]],tblData[Q1 Units])

NOTE

The SUMIF formula’s syntax is:

SUMIF(range, criteria, [sum_range])

where range is the range of cells to compare against the criteria, criteria is the criteria to use to select which cells to sum, and sum_range is the range of cells (optional) to sum if the criteria are satisfied in that row.

You use these references in the formula:

Placing the formula in cell B2 in the "tblSum" Table and using the fill handle to copy it across to the other three columns yields these formulas:

=SUMIF(tblData[[Color]:[Color]],tblSum[[Color]:[Color]],tblData[Q1 Units])

=SUMIF(tblData[[Color]:[Color]],tblSum[[Color]:[Color]],tblData[Q2 Units])

=SUMIF(tblData[[Color]:[Color]],tblSum[[Color]:[Color]],tblData[Q3 Units])

=SUMIF(tblData[[Color]:[Color]],tblSum[[Color]:[Color]],tblData[Q4 Units])

Note that the first two references did not shift but the last reference, sum_range, did shift one column to the right for each column to the right the formula was copied.

NOTE

The technique described here requires the use of the fill handle to copy the formulas across columns. Using copy and paste will not shift the last parameter’s target.

Disabling Structured Referencing

To avoid structured references altogether, you can select FILE | Options | Formulas | Use Table names in formulas and turn off the Use table names in formulas option.

Disabling structured references by turning off the Use table names in formulas option.

Structured Reference Operators

As with standard cell referencing, several operators work identically with structured references:

The following table provides some examples using the three different reference operators and the equivalent standard reference. The Table name is "Table1", the Table header is in row 1, and the data body range starts in row 2 and ends in row 20. Columns are in this order, beginning in column A: "OrderDate", "Region", "Rep", "Item", "Cost".

Reference

Refers to

Type

=Table1[[OrderDate]:[Rep]]

A2:C20

Range

=Table1[Region],Table1[Item]

B2:B20,D2:D20

Union

=Table1[[Rep]:[Units]] Table1[[Item]:[Cost]]

D2:E20

Intersection

Special Characters Used in Structured References

When a column name contains any of the following special characters, the name must be enclosed in an extra set of square brackets:

The characters marked in this list with a † have special meaning and so require an escape character in order to be properly recognized. The escape character is an apostrophe ( ' ):

[Order '#]

When a column name contains any of the special characters listed above and the @ symbol is used to keep the target of the reference on the same row as the reference, an added layer of square brackets surrounding the text is required, as shown below with the column name "$ Amount":

[@[$ Amount]]

Dynamic Referencing

You use dynamic referencing to create named ranges that are dynamic (that is, change depending on conditions) and that can be used in conditional formatting and validation lists. Here are some examples:

Named ranges, which are also called named formulas, are useful even when they're not dynamic because Excel does not allow structured referencing in conditional formatting rules or validation lists. The following example demonstrates the steps for creating a simple named dynamic range that is used in a validation list, based on a column in a Table:

1. Create a new Table and name it "tblValues". Name one of the columns "Value" and enter two or more unique values in the column.

2. Select the data area range portion of the "Value" column. If the header or total rows are showing, do not select them; select just the data body range of the column.

TIP

You can move the mouse over the top edge of the Table’s column header (not the worksheet header) to turn the cursor into a bold down arrow. You can click there to select the entire Table data area range column (excluding the header and total rows). You can click again to select the Table’s column, including the header row and total row. Subsequent clicks toggle between these two selections.

3. Select FORMULAS | Defined Names, Name Manager (or use the keyboard shortcut CTRL+F3) to open the Name Manager dialog box.

Name Manager dialog box.

4. Click the New button to open the New Name dialog box.

New Name dialog box.

5. Enter "ValidValues" in the Name text box. Note that the structured reference to the Table's "Value" column is already entered into the Refers to text box. Click OK.

6. Select the cell to which the validation list is to be applied and select DATA | Data Tools | Data Validation | Data Validation. Select "List" from the Allow combo box, enter "=ValidValues" in the Source text box, and click OK.

Creating a validation list using a named dynamic reference.

A validation list that contains the values from the Table's "Value" column is now in place in the selected cell.

Using the Current Row Number

The following formula displays the row number in a Table where the row number is relative to the first row in the Table and not the worksheet's row number:

=ROW()-ROW(Table1[[#Headers],[Column1]])

The row number can also be useful when you're referencing the same row in another Table. The formula works by subtracting the worksheet's row number in which the Table header resides from the worksheet's row number in which the formula resides. The ROW formula returns the worksheet row number of the first row of the range reference passed to it, or the worksheet row number of the row in which the formula resides if no range reference is passed to it.

If the first data row is always in a specific worksheet row, a more direct approach is to use a constant instead of passing the Table's header row to the ROW function:

=ROW()-1

Calculating a Running Total

The following formula displays a simple running total of another column in the same Table:

=SUM(tblRegister[[#Headers],[Amount]]:[@Amount])

This formula assumes that the header row is enabled; the SUM function ignores the text value in the header. If the header row of the referenced column is not enabled, the above example returns #REF!. To avoid this, reference the first cell of the Table column by using the INDEX function:

=SUM(INDEX([Amount],1):[@Amount])

CAUTION

You can implement this example by using standard referencing as =SUM(D$2:D2), assuming that Amount is in column D. You use absolute row referencing to keep the first part of the range absolute, while the second part shifts as the formula is copied down rows. However, there is a known issue when using absolute standard row referencing (A$1) in a Table: When new rows are added, Excel mangles the reference and the column formatting, rendering the formula in the new row and preceding rows in error. The workaround is to find an alternative approach that avoids absolute standard row referencing, as demonstrated in the preceding example.

If the amounts used to calculate the running total are in two columns, one for debits and one for credits, then the formula is:

=SUM(INDEX([Credit],1):[@Credit])-SUM(INDEX([Debit],1):[@Debit])

Using structured references can have a negative impact on calculation performance on large data sets, as in the example above. In the context of these examples, a large data set is anything over around 100,000 rows of Table data on most modern computers. The problem lies in how structured references are used: Each row sums all the rows above, including the same row, to get the running total for that row.

The next formula example utilizes standard cell referencing to improve calculation performance by working with only three cells in the row above and in the same row; this avoids repeated summing of thousands of rows. From the previous formula, it is assumed that Credit is in column C, Debit is in column D, the running total calculation is in column E, and the headers are in row 1. The formula in cell E2 (the running total in the first data row) is:

=N(E1)+C2-D2

When copied down the length of the column, the formula takes the running total from the previous row, adds the credit, and subtracts the debit on the same row to get a new running total for that row. This assumes that debits are entered as positive values, as they are on an accounting ledger.

Why use the N function? In the first row, the formula uses the header value in a numeric calculation; if the header is a text value, the formula produces a #VALUE! error. The N function solves this problem by taking the parameter passed to it and returning the same value if the parameter is numeric or zero if the parameter is text.

Another way to solve this problem is to use the SUM function, which ignores text values (effectively treating them as zero values):

=SUM(E1,C2,-D2)