FUN WITH AVERAGES

An average can be a helpful summary statistic, even easier to digest than a pie chart, allowing us to characterize a very large amount of information with a single number. We might want to know the average wealth of the people in a room to know whether our fund-raisers or sales managers will benefit from meeting with them. Or we might want to know the average price of gas to estimate how much it will cost to drive from Vancouver to Banff. But averages can be deceptively complex.

There are three ways of calculating an average, and they often yield different numbers, so people with statistical acumen usually avoid the word average in favor of the more precise terms mean, median, and mode. We don’t say “mean average” or “median average” or just “average”; we say mean, median, or mode. In some cases, these will be identical, but in many they are not. If you see the word average all by itself, it’s usually indicating the mean, but you can’t be certain.

The mean is the most commonly used of the three and is calculated by adding up all the observations or reports you have and dividing by the number of observations or reports. For example, the average wealth of the people in a room is simply the total wealth divided by the number of people. If the room has ten people whose net worth is $100,000 each, the room has a total net worth of $1 million, and you can figure the mean without having to pull out a calculator: It is $100,000. If a different room has ten people whose net worth varies from $50,000 to $150,000 each, but totals $1 million, the mean is still $100,000 (because we simply take the total $1 million and divide by the ten people, regardless of what any individual is worth).

The median is the middle number in a set of numbers (statisticians call this set a “distribution”): Half the observations are above it and half are below. Remember, the point of an average is to be able to represent a whole lot of data with a single number. The median does a better job of this when some of your observations are very, very different from the majority of them, what statisticians call outliers.

If we visit a room with nine people, suppose eight of them have a net worth of near $100,000 and one person is on the verge of bankruptcy with a net worth of negative $500,000, owing to his debts. Here’s the makeup of the room:

Person 1: −$500,000

Person 2:  $96,000

Person 3:  $97,000

Person 4:  $99,000

Person 5:  $100,000

Person 6:  $101,000

Person 7:  $101,000

Person 8:  $101,000

Person 9:  $104,000

Now we take the sum and obtain a total of $299,000. Divide by the total number of observations, nine, and the mean is $33,222 per person. But the mean doesn’t seem to do a very good job of characterizing the room. It suggests that your fund-raiser might not want to visit these people, when it’s really only one odd person, one outlier, bringing down the average. This is the problem with the mean: It is sensitive to outliers.

The median here would be $100,000: Four people have less than that amount, and four people have more. The mode is $101,000, the number that appears more often than the others. Both the median and the mode are more helpful in this particular example.
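
The three figures for this room can be checked with a few lines of Python. This is only a sketch using the standard `statistics` module; the variable names are made up for illustration, and the nine net-worth values are copied from the list above.

```python
import statistics

# Net worth of the nine people in the room, copied from the list above.
net_worth = [-500_000, 96_000, 97_000, 99_000, 100_000,
             101_000, 101_000, 101_000, 104_000]

total_worth = sum(net_worth)
mean_worth = statistics.mean(net_worth)      # total divided by the head count
median_worth = statistics.median(net_worth)  # middle value once sorted
mode_worth = statistics.mode(net_worth)      # value that appears most often

print(f"Total:  ${total_worth:,}")
print(f"Mean:   ${mean_worth:,.0f}")
print(f"Median: ${median_worth:,}")
print(f"Mode:   ${mode_worth:,}")
```

Dropping the single negative value from the list and rerunning shows how much one outlier moves the mean while barely touching the median or the mode.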

There are many ways that averages can be used to manipulate what you want others to see in your data.

Let’s suppose that you and two friends founded a small start-up company with five employees. It’s the end of the year and you want to report your finances to your employees, so that they can feel good about all the long hours they’ve worked and cold pizzas they’ve eaten, and so that you can attract investors. Let’s say that four employees (programmers) each earned $70,000 per year, and one employee (a receptionist/office manager) earned $50,000 per year. That’s an average (mean) employee salary of $66,000 per year: (4 × $70,000) + (1 × $50,000) = $330,000, divided by 5. You and your two friends each took home $100,000 per year in salary. Your payroll costs were therefore (4 × $70,000) + (1 × $50,000) + (3 × $100,000) = $630,000. Now, let’s say your company brought in $210,000 in profits and you divided it equally among you and your co-founders as bonuses, giving you $100,000 + $70,000 = $170,000 each. How are you going to report this?

You could say:

Average salary of employees: $66,000

Average salary + profits of owners: $170,000

This is true but probably doesn’t look good to anyone except you and your mom. If your employees get wind of this, they may feel undercompensated. Potential investors may feel that the founders are overcompensated. So instead, you could report this:

Average salary of employees: $66,000

Average salary of owners: $100,000

Profits: $210,000

That looks better to potential investors. And you can just leave out the fact that you divided the profits among the owners, and leave out that last line—that part about the profits—when reporting things to your employees. The four programmers are each going to think they’re very highly valued, because they’re making more than the average. Your poor receptionist won’t be so happy, but she no doubt knew already that the programmers make more than she does.

Now suppose you are feeling overworked and want to persuade your two partners, who don’t know much about critical thinking, that you need to hire more employees. You could do what many companies do, and report the “profits per employee” by dividing the $210,000 profit among the five employees:

Average salary of employees: $66,000

Average salary of owners: $100,000

Annual profits per employee: $42,000

Now you can claim that 64 percent of the salaries you pay to employees (42,000/66,000) comes back to you in profits, meaning you end up only having to pay 36 percent of their salaries after all those profits roll in. Of course, there is nothing in these figures to suggest that adding an employee will increase the profits—your profits may not be at all a function of how many employees there are—but for someone who is not thinking critically, this sounds like a compelling reason to hire more employees.
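
The arithmetic behind these reports is simple enough to lay out in code. The sketch below uses only the figures from the start-up example; the variable names are illustrative, not part of any real accounting tool.

```python
# Figures from the start-up example above; names are illustrative only.
employee_salaries = [70_000] * 4 + [50_000]  # four programmers, one office manager
owner_salaries = [100_000] * 3               # you and your two co-founders
profits = 210_000

mean_employee_salary = sum(employee_salaries) / len(employee_salaries)
payroll = sum(employee_salaries) + sum(owner_salaries)

# First report: profits split three ways and added to each owner's salary.
bonus_per_owner = profits / len(owner_salaries)
owner_take_home = owner_salaries[0] + bonus_per_owner

# Third report: the same profits divided by the number of employees instead.
profit_per_employee = profits / len(employee_salaries)
share_of_salary = profit_per_employee / mean_employee_salary

print(f"Mean employee salary: ${mean_employee_salary:,.0f}")
print(f"Payroll:              ${payroll:,}")
print(f"Owner take-home:      ${owner_take_home:,.0f}")
print(f"Profit per employee:  ${profit_per_employee:,.0f}")
print(f"Share of salary:      {share_of_salary:.0%}")
```

Nothing about the company changes between the reports. The only thing that changes is which total gets divided by which head count.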

Finally, what if you want to claim that you are an unusually just and fair employer and that the difference between what you take in profits and what your employees earn is actually quite reasonable? Take the $210,000 in profits and distribute $150,000 of it as salary bonuses to you and your partners, saving the other $60,000 to report as “profits.” This time, compute the average salary but include you and your partners in it with the salary bonuses.

Average salary: $97,500

Average profit of owners: $20,000

Now for some real fun:

Total salary costs plus profits: $840,000

Salaries: $780,000

Profits: $60,000

That looks quite reasonable now, doesn’t it? Of the $840,000 available for salaries and profits, only $60,000 or 7 percent went into owners’ profits. Your employees will think you above reproach. Who would begrudge a company owner 7 percent? And it’s actually not even that high: the 7 percent is divided among the three company owners, which comes to about 2.4 percent each. Hardly worth complaining about!
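
To see what this framing hides, it helps to run the numbers both ways. The sketch below reproduces the reported figures and then adds one line the report leaves out: the owners' combined share of the $840,000. The names are illustrative, and the last calculation is simple arithmetic on the example's own figures rather than anything stated in the reports.

```python
# Figures from the start-up example above; names are illustrative only.
employee_salaries = [70_000] * 4 + [50_000]
owner_salaries = [100_000] * 3
profits = 210_000
bonus_pool = 150_000  # portion of profits relabeled as owners' salary bonuses

owner_pay = [s + bonus_pool / len(owner_salaries) for s in owner_salaries]
all_pay = employee_salaries + owner_pay

total_salaries = sum(all_pay)
mean_salary = total_salaries / len(all_pay)
reported_profits = profits - bonus_pool
profit_per_owner = reported_profits / len(owner_salaries)
total_available = total_salaries + reported_profits
profit_share = reported_profits / total_available

# What the report leaves out: everything that actually ends up with the owners.
owners_total = sum(owner_pay) + reported_profits
owners_share = owners_total / total_available

print(f"Average salary:       ${mean_salary:,.0f}")
print(f"Profit per owner:     ${profit_per_owner:,.0f}")
print(f"Reported profit share: {profit_share:.0%}")
print(f"Owners' actual share:  {owners_share:.0%}")
```

The reported profit share is small only because most of the owners' money was moved into the salary column first.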

You can do even better than this. Suppose in your first year of operation, you had only part-time employees, earning $40,000 per year. By year two, you had only full-time employees, earning the $66,000 mentioned above. You can honestly claim that average employee earnings went up 65 percent. What a great employer you are! But here you are glossing over the fact that you are comparing part-time with full-time. You would not be the first: U.S. Steel did it back in the 1940s.

•   •   •

In criminal trials, the way the information is presented—the framing—profoundly affects jurors’ conclusions about guilt. Although they are mathematically equivalent, testifying that “the probability the suspect would match the blood drops if he were not their source is only 0.1 percent” (one in a thousand) turns out to be far more persuasive than saying “one in a thousand people in Houston would also match the blood drops.”

Averages are often used to express outcomes, such as “one in X marriages ends in divorce.” But that doesn’t mean that statistic will apply on your street, in your bridge club, or to anyone you know. It might or might not—it’s a nationwide average, and there might be certain vulnerability factors that help to predict who will and who will not divorce.

Similarly, you may read that one out of every five children born is Chinese. You note that the Swedish family down the street already has four children and the mother is expecting another child. This does not mean she’s about to give birth to a Chinese baby—the one out of five children is on average, across all births in the world, not the births restricted to a particular house or particular neighborhood or even particular country.

Be careful of averages and how they’re applied. One way that they can fool you is if the average combines samples from disparate populations. This can lead to absurd observations such as:

On average, humans have one testicle.

This example illustrates the difference between mean, median, and mode. Because there are slightly more women than men in the world, the median and mode are both zero, while the mean is close to one (perhaps 0.98 or so).

Also be careful to remember that the average doesn’t tell you anything about the range. The average annual temperature in Death Valley, California, is a comfortable 77 degrees F (25 degrees C). But the range can kill you, with temperatures ranging from 15 degrees to 134 degrees on record.

Or . . . I could tell you that the average wealth of a hundred people in a room is a whopping $350 million. You might think this is the place to unleash a hundred of your best salespeople. But the room could have Mark Zuckerberg (net worth $35 billion) and ninety-nine people who are indigent. The average can smear across differences that are important.
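
A quick check of this room, as a sketch: it assumes, for the sake of illustration, that each of the ninety-nine indigent people has a net worth of exactly zero, which the example does not actually specify.

```python
import statistics

# One billionaire plus ninety-nine people assumed (hypothetically) to have $0.
room = [35_000_000_000] + [0] * 99

mean_wealth = statistics.mean(room)      # pulled up by a single person
median_wealth = statistics.median(room)  # what the typical person has

print(f"Mean:   ${mean_wealth:,.0f}")
print(f"Median: ${median_wealth:,.0f}")
```

Reporting the median alongside the mean would have warned your salespeople off immediately.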

Another thing to watch out for in averages is the bimodal distribution. Remember, the mode is the value that occurs most often. In many biological, physical, and social datasets, the distribution has two or more peaks—that is, two or more values that appear more than the others.

For example, a graph like this might show the amount of money spent on lunches in a week (x-axis) and how many people spent that amount (y-axis). Imagine that you’ve got two different groups of people in your survey, children (left hump—they’re buying school lunches) and business executives (right hump—they’re going to fancy restaurants). The mean and median here could be a number somewhere right between the two, and would not tell us very much about what’s really going on—in fact, the mean and median in many cases are amounts that nobody spends. A graph like this is often a clue that there is heterogeneity in your sample, or that you are comparing apples and oranges. Better here is to report that it’s a bimodal distribution and report the two modes. Better yet, subdivide the group into two groups and provide statistics for each.

But be careful drawing conclusions about individuals and groups based on averages. The pitfalls here are so common that they have names: the ecological fallacy and the exception fallacy. The ecological fallacy occurs when we make inferences about an individual based on aggregate data (such as a group mean), and the exception fallacy occurs when we make inferences about a group based on knowledge of a few exceptional individuals.

For example, imagine two small towns, each with only one hundred people. Town A has ninety-nine people earning $80,000 a year, and one super-wealthy person who struck oil on her property, earning $5,000,000 a year. Town B has fifty people earning $100,000 a year and fifty people earning $140,000. The mean income of Town A is $129,200 and the mean income of Town B is $120,000. Although Town A has a higher mean income, in ninety-nine out of one hundred cases, any individual you select randomly from Town B will have a higher income than an individual selected randomly from Town A. The ecological fallacy is thinking that if you select someone at random from the group with the higher mean, that individual is likely to have a higher income. The neat thing is, in the example above, that while the mean is higher in Town A, the median and the mode are higher in Town B. (It doesn’t always work out that way.)
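
The two towns are small enough to compare exhaustively. The sketch below builds both income lists from the figures above, computes the mean, median, and mode(s) of each, and then counts every possible pairing of one Town A resident with one Town B resident. The names are illustrative.

```python
import statistics

# Incomes from the two-town example above.
town_a = [80_000] * 99 + [5_000_000]
town_b = [100_000] * 50 + [140_000] * 50

mean_a, mean_b = statistics.mean(town_a), statistics.mean(town_b)
median_a, median_b = statistics.median(town_a), statistics.median(town_b)
modes_a, modes_b = statistics.multimode(town_a), statistics.multimode(town_b)

# Compare every Town A resident with every Town B resident and count
# how often the Town B resident has the higher income.
b_wins = sum(b > a for a in town_a for b in town_b)
p_b_higher = b_wins / (len(town_a) * len(town_b))

print(f"Mean:   A ${mean_a:,.0f}  vs  B ${mean_b:,.0f}")
print(f"Median: A ${median_a:,.0f}  vs  B ${median_b:,.0f}")
print(f"Modes:  A {modes_a}  vs  B {modes_b}")
print(f"Chance a random B resident out-earns a random A resident: {p_b_higher:.0%}")
```

Town B has two modes because its two income levels are tied at fifty people each, and both are above Town A's single mode.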

As another example, it has been suggested that wealthy individuals are more likely to vote Republican, but evidence shows that the wealthier states tend to vote Democratic. The wealth of those wealthier states may be skewed by a small percentage of super-wealthy individuals. During the 2004 U.S. presidential election, the Republican candidate, George W. Bush, won the fifteen poorest states, and the Democratic candidate, John Kerry, won nine of the eleven wealthiest states. However, 62 percent of those with annual incomes over $200,000 voted for Bush, whereas only 36 percent of voters with annual incomes of $15,000 or less voted for Bush.

As an example of the exception fallacy, you may have read that Volvos are among the most reliable automobiles and so you decide to buy one. On your way to the dealership, you pass a Volvo mechanic and find a parking lot full of Volvos in need of repair. If you change your mind about buying a Volvo based on seeing this, you’re using a relatively small number of exceptional cases to form an inference about the entire group. No one was claiming that Volvos never need repair, only that they’re less likely to in the aggregate. (Hence the ubiquitous cautionary note in advertising that “individual performance may vary.”) Note also that you’re being unduly influenced by this in another way: A Volvo mechanic’s lot is precisely where Volvos needing repair end up. Your “base rate” has shifted, and you cannot consider this a random sample.

Now that you’re an expert on averages, you shouldn’t fall for the famous misunderstanding that people tended not to live as long a hundred years ago as they do today. You’ve probably read that life expectancy has steadily increased in modern times. For those born in 1850, the average life expectancy for males and females was thirty-eight and forty years respectively, and for those born in 1990 it is seventy-two and seventy-nine. There’s a tendency to think, then, that in the 1800s there just weren’t that many fifty- and sixty-year-olds walking around because people didn’t live that long. But in fact, people did live that long—it’s just that infant and childhood mortality was so high that it skewed the average. If you could make it past twenty, you could live a long life back then. Indeed, in 1850 a fifty-year-old white female could expect to live to be 73.5, and a sixty-year-old could expect to live to be seventy-seven. Life expectancy has certainly increased for fifty- and sixty-year-olds today, by about ten years compared to 1850, largely due to better health care. But as with the examples above of a room full of people with wildly different incomes, the changing averages for life expectancy at birth over the last 175 years reflect significant differences in the two samples: There were many more infant deaths back then pulling down the average.

Here is a brain-twister: The average child usually doesn’t come from the average family. Why? Because of shifting baselines. (I’m using “average” in this discussion instead of “mean” out of respect for a wonderful paper on this topic by James Jenkins and Terrell Tuten, who used it in their title.)

Now, suppose you read that the average number of children per family in a suburban community is three. You might conclude then that the average child must have two siblings. But this would be wrong. This same logical problem applies if we ask whether the average college student attends the average-sized college, if the average employee earns the average salary, or if the average tree comes from the average forest. What?

All these cases involve a shift of the baseline, or sample group we’re studying. When we calculate the average number of children per family, we’re sampling families. A very large family and a small family each count as one family, of course. When we calculate the average (mean) number of siblings, we’re sampling children. Each child in the large family gets counted once, so that the number of siblings each of them has weighs heavily on the average for sibling number. In other words, a family with ten children counts only one time in the average family statistic, but counts ten times in the average number of siblings statistic.

Suppose in one neighborhood of this hypothetical community there are thirty families. Four families have no children, six families have one child, nine families have two children, and eleven families have six children. The average number of children per family is three, because ninety (the total number of children) gets divided by thirty (the total number of families).

But let’s look at the average number of siblings. The mistake people make is thinking that if the average family has three children, then each child must have two siblings on average. But in the one-child families, each of the six children has zero siblings. In the two-child families, each of the eighteen children has one sibling. In the six-child families each of the sixty-six children has five siblings. Among the 90 children, there are 348 siblings. So although the average child comes from a family with three children, there are 348 siblings divided among 90 children, or an average of nearly four siblings per child.

Families      # Children/Family     Total # Children     Siblings
4             0                     0                    0
6             1                     6                    0
9             2                     18                   18
11            6                     66                   330
Totals: 30                          90                   348

Average children per family: 3.0

Average siblings per child: 3.9
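
The shift of baseline can be made explicit in code. In this sketch, the same neighborhood is averaged twice: once with each family counted once, and once with each child counted once. The data are the four family sizes from the example; the names are illustrative.

```python
# (number of families, children per family) from the neighborhood example.
family_counts = [(4, 0), (6, 1), (9, 2), (11, 6)]

total_families = sum(n for n, _ in family_counts)
total_children = sum(n * k for n, k in family_counts)

# Each of the k children in a family has k - 1 siblings.
total_siblings = sum(n * k * (k - 1) for n, k in family_counts)

# Baseline 1: sample families. Every family counts once.
children_per_family = total_children / total_families

# Baseline 2: sample children. A family of k children is counted k times.
siblings_per_child = total_siblings / total_children
family_size_seen_by_child = sum(n * k * k for n, k in family_counts) / total_children

print(f"Average children per family:       {children_per_family:.1f}")
print(f"Average siblings per child:        {siblings_per_child:.1f}")
print(f"Average family size, per child:    {family_size_seen_by_child:.1f}")
```

The last figure is just the sibling average plus one (the child herself), which is another way of saying that the average child in this neighborhood lives in a family of nearly five children, not three.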

Consider now college size. There are many very large colleges in the United States (such as Ohio State and Arizona State) with student enrollment of more than 50,000. There are also many small colleges, with student enrollment under 3,000 (such as Kenyon College and Williams College). If we count up schools, we might find that the average-sized college has 10,000 students. But if we count up students, we’ll find that the average student goes to a college with more than 30,000 students. This is because, when counting students, we’ll get many more data points from the large schools. Similarly, the average person doesn’t live in the average city, and the average golfer doesn’t shoot the average round (the total strokes over eighteen holes).

These examples involve a shift of baseline, or denominator. Consider another involving the kind of skewed distribution we looked at earlier with child mortality: The average investor does not earn the average return. In one study, the average return on a $100 investment held for thirty years was $760, or 7 percent per year. But 9 percent of the investors lost money, and a whopping 69 percent failed to reach the average return. This is because the average was skewed by a few people who made much greater than the average—in the figure below, the mean is pulled to the right by those lucky investors who made a fortune.

Payoff outcomes for return on a $100 investment over thirty years. Note that most people make less than the mean return, and a lucky few make more than five times the mean return.
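
The link between the $760 figure and "7 percent per year" is compound growth. The sketch below assumes that $760 is the ending value of the original $100 after thirty years (the text does not say whether it is the ending value or the gain) and that returns compound once a year.

```python
# Assumption: $760 is the ending value of a $100 investment after 30 years,
# with returns compounding annually.
initial = 100.0
final = 760.0
years = 30

# Annual rate that turns `initial` into `final` over `years` years.
annual_rate = (final / initial) ** (1 / years) - 1

# Going the other way: what $100 becomes at exactly 7 percent per year.
compounded = initial * 1.07 ** years

print(f"Implied annual rate: {annual_rate:.2%}")
print(f"$100 at 7% for 30 years: ${compounded:,.2f}")
```

This describes only the mean outcome. As the study's figures show, most individual investors landed below it.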