You’re trying to decide whether to buy stock in a new soft drink and you come across this graph of the company’s sales figures in their annual report:
This looks promising—Peachy Cola is steadily increasing its sales. So far, so good. But a little bit of world knowledge can be applied here to good effect. The soft-drink market is very competitive. Peachy Cola’s sales are increasing, but maybe not as quickly as a competitor’s. As a potential investor, what you really want to see is how Peachy’s sales compare to those of other companies, or to see their sales as a function of market share—Peachy’s sales could go up only slightly while the market is growing enormously, and competitors are benefiting more than Peachy is. And, as this example of a useful double y-axis graph demonstrates, this may not bode well for their future:
Although unscrupulous graph makers can monkey with the scaling of the right-hand axis to make the graph appear to show anything they want, this kind of double-y-axis graph isn’t scandalous, because the two y-axes represent different things, quantities that couldn’t share an axis. This was not the case with the Planned Parenthood graph here, which reported the same quantity, the number of procedures performed, on two different axes. That graph was distorted by scaling the two axes differently, even though they measured the same thing, in order to manipulate perception.
It would also be useful to see Peachy’s profits: Through manufacturing and distribution efficiencies, it may well be that they’re making more money on a lower sales volume. Just because someone quotes you a statistic or shows you a graph, it doesn’t mean it’s relevant to the point they’re trying to make. It’s the job of all of us to make sure we get the information that matters, and to ignore the information that doesn’t.
Let’s say that you work in the public-affairs office for a company that manufactures some kind of device—frabezoids. For the last several years, the public’s appetite for frabezoids has been high, and sales have increased. The company expanded by building new facilities, hiring new employees, and giving everyone a raise. Your boss comes into your cubicle with a somber-looking expression and explains that the newest sales results are in, and frabezoid sales have dropped 12 percent from the previous quarter. Your company’s president is about to hold a big press conference to talk about the future of the company. As is his custom, he’ll display a large graph on the stage behind him showing how frabezoids are doing. If word gets out about the lower sales figures, the public may think that frabezoids are no longer desirable things to have, which could then lead to an even further decline in sales.
What do you do? If you graph the sales figures honestly for the past four years, your graph would look like this:
That downward trend in the curve is the problem. If only there were a way to make that curve go up.
Well, there is! The cumulative sales graph. Instead of graphing sales per quarter, graph the cumulative sales per quarter—that is, the total sales to date.
As long as you sold at least one frabezoid each quarter, your cumulative graph will keep going up, like this one here:
If you look carefully, you can still see a vestige of the poor sales for last quarter: Although the line is still going up for the most recent quarter, it’s going up less steeply. That’s your clue that sales have dropped. But our brains aren’t very good at detecting rates of change such as these (what’s called the first derivative in calculus, a fancy name for the slope of the line). So on casual examination, it seems the company continues to do fabulously well, and you’ve made a whole lot of consumers believe that frabezoids are still the hottest thing to have.
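To see the trick with concrete numbers, here is a minimal sketch in Python, using made-up quarterly figures: the quarterly series ends with a roughly 12 percent drop, yet the running total still rises, just less steeply.

```python
# Hypothetical quarterly frabezoid sales over four years (sixteen quarters);
# the roughly 12 percent drop is in the final quarter.
quarterly = [100, 105, 110, 118, 125, 132, 140, 150,
             158, 166, 175, 186, 196, 208, 220, 194]

# The cumulative (running-total) series is what the "always rising" graph plots.
cumulative = []
total = 0
for q in quarterly:
    total += q
    cumulative.append(total)

print("last two quarters, quarterly:", quarterly[-2:])    # 220 -> 194: sales fell
print("last two quarters, cumulative:", cumulative[-2:])  # still rising, just less steeply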
This is exactly what Tim Cook, CEO of Apple, did recently in a presentation on iPhone sales.
© 2013 The Verge, Vox Media Inc. (live.theverge.com/apple-iphone-5s-liveblog/)
There are so many things going on in the world that some coincidences are bound to happen. The number of green trucks on the road may be increasing at the same time as your salary; when you were a kid, the number of shows on television may have increased with your height. But that doesn’t mean that one is causing the other. When two things are related, whether or not one causes the other, statisticians call it a correlation.
The famous adage is that “correlation does not imply causation.” In formal logic, there are two named fallacies that capture how this rule gets broken:
1) Post hoc, ergo propter hoc (after this, therefore because of this). This is the logical fallacy of assuming that just because one thing (Y) occurs after another (X), X must have caused Y. People typically brush their teeth before going off to work in the morning. But brushing their teeth doesn’t cause them to go to work; if anything, the causation runs in the other direction.
2) Cum hoc, ergo propter hoc (with this, therefore because of this). This is a logical fallacy that arises from thinking that just because two things co-occur, one must have caused the other. To drive home the point, Harvard Law student Tyler Vigen has written a book and a website that feature spurious co-occurrences—correlations—such as this one:
There are four ways to interpret this: (1) drownings cause the release of new Nicolas Cage films; (2) the release of Nicolas Cage films causes drownings; (3) a third (as yet unidentified) factor causes both; or (4) they are simply unrelated and the correlation is a coincidence. If we don’t separate correlation from causation, we can claim that Vigen’s graph “proves” that Nic Cage was helping to prevent pool drownings, and furthermore, our best bet is to encourage him to make fewer movies so that he can ply his lifesaving skills as he apparently did so effectively in 2003 and 2008.
In some cases, there is no actual connection between items that are correlated—their correlation is simply coincidence. In other cases, one can find a causal link between correlated items, or at least spin a reasonable story that can spur the acquisition of new data.
We can rule out explanation one, because it takes time to produce and release a movie, so a spike in drownings cannot cause a spike in Nic Cage movies in the same year. What about number two? Perhaps people become so wrapped up in the drama of Cage’s films that they lose focus and drown as a consequence. It may be that the same cinematic absorption also increases rates of automobile accidents and injuries from heavy machinery. We don’t know until we analyze more data, because those are not reported here.
What about a third factor that caused both? We might guess that economic trends are driving both: A better economy leads to more investment in leisure activities—more films being made, more people going on vacation and swimming. If this is true, then neither of the two things depicted on the graph—Nic Cage films and drownings—caused the other. Instead, a third factor, the economy, led to changes in both. Statisticians call this the third factor x explanation of correlations, and there are many cases of these.
More likely, these two are simply unrelated. If we look long enough, and hard enough, we’re sure to find that two unrelated things vary with each other.
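A quick simulation illustrates how easily that happens. This minimal sketch (plain Python, arbitrary parameters) generates fifty unrelated random series and then searches for the most correlated pair; with over a thousand pairs to choose from, a strong correlation will almost certainly turn up by chance.

```python
import random

random.seed(1)

# Fifty unrelated series, each a random walk over eleven "years".
series = []
for _ in range(50):
    walk, level = [], 0.0
    for _ in range(11):
        level += random.gauss(0, 1)
        walk.append(level)
    series.append(walk)

def corr(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hunt through all 1,225 pairs for the strongest correlation.
best = max(
    (abs(corr(series[i], series[j])), i, j)
    for i in range(50) for j in range(i + 1, 50)
)
print("strongest correlation found between two unrelated series:", round(best[0], 2))
```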
Ice-cream sales increase as the number of people who wear short pants increases. Neither causes the other; the third factor x that causes both is the warmer temperatures of summer. The number of television shows aired in a year while you were a child may have correlated with increases in your height, but what was no doubt driving both was the passage of time during an era when (a) TV was expanding its market and (b) you were growing.
How do you know when a correlation indicates causation? One way is to conduct a controlled experiment. Another is to apply logic. But be careful—it’s easy to get bogged down in semantics. Did the rain outside cause people to wear raincoats, or was it their desire to avoid getting wet, itself a consequence of the rain, that caused them to put the raincoats on?
This idea was cleverly rendered by Randall Munroe in his Internet cartoon xkcd. Two stick figures, apparently college students, are talking. One says that he used to think correlation implied causation. Then he took a statistics class, and now he doesn’t think that anymore. The other student says, “Sounds like the class helped.” The first student replies, “Well, maybe.”
Infographics are often used by lying weasels to shape public opinion, and they rely on the fact that most people won’t study what they’ve done too carefully. Consider this graphic that might be used to scare you into thinking that runaway inflation is eating up your hard-earned money:
That’s a frightening image. But look closely. The scissors are cutting the bill not at 4.2 percent of its size, but at about 42 percent. When your visual system is pitted against your logical system, the visual system usually wins, unless you work extra diligently to overcome this visual bias. The accurate infographic would look like this but would have much less emotional impact:
Often a statistic will be properly created and reported, but someone—a journalist, an advocate, any non-statistician—will misreport it, either because they’ve misunderstood it or because they didn’t realize that a small change in wording can change the meaning.
Often those who want to use statistics do not have statisticians on their staffs, and so they seek answers to their questions from people who lack proper training. Corporations, government offices, nonprofits, and mom-and-pop grocery stores all benefit from statistics about such things as sales, customers, trends, and supply chains. Incompetence can enter at any stage: in experimental design, data collection, analysis, or interpretation.
Sometimes the statistic being reported isn’t the relevant one. If you’re trying to convince stockholders that your company is doing well, you might publish statistics on your annual sales, and show steadily rising numbers. But if the market for your product is expanding, sales increases would be expected. What your investors and analysts probably want to know is whether your market share has changed. If your market share is decreasing because competitors are swooping in and taking away your customers, how can you make your report look attractive? Simply fail to report the relevant statistic of market share, and instead report the sales figures. Sales are going up! Everything is fine!
Another way a statistic can fail to be relevant is by being out of date. The financial profiles shown on people’s mortgage applications twenty-five years ago would probably not be much help in building a model for risk today. A model of consumer behavior on a website may become outdated very quickly. Statistics on the integrity of concrete used for overpasses may not be relevant for concrete on bridges (where humidity and other factors may have caused divergence, even if both civic projects used the same concrete to begin with).
You’ve probably heard some variant of the claim that “four out of five dentists recommend Colgate toothpaste.” That’s true. What the ad agency behind these decades-old ads wants you to think is that the dentists prefer Colgate above and beyond other brands. But that’s not true. The Advertising Standards Authority in the United Kingdom investigated this claim and ruled it an unfair practice because the survey that was conducted allowed dentists to recommend more than one toothpaste. In fact, Colgate’s biggest competitor was named nearly as often as Colgate (a detail you won’t see in Colgate’s ads).
Framing came up in the section on averages and implicitly in the discussion of graphs. Manipulating the framing of any message furnishes an endless number of ways people can make you believe something that isn’t so if you don’t stop to think about what they’re saying. The cable network C-SPAN advertises that it is “available” in 100 million homes. That doesn’t mean that 100 million people are watching C-SPAN. It doesn’t mean that even one person is watching it.
Framing manipulations can influence public policy. A survey of recycling yield on various streets in metropolitan Los Angeles shows that one street in particular recycles 2.2 times as much as any other street. Before the city council gives the residents of this street an award for their green city efforts, let’s ask what might give rise to such a number. One possibility is that this street has more than twice as many residents as other streets—perhaps because it is longer, perhaps because there are a lot of apartment buildings on it. The street is not the relevant unit of measurement unless all streets are otherwise identical. A better unit would be the living unit (measuring the recycling output of each family) or, better still, because larger families probably consume more than smaller families, the individual. That is, we want to adjust the amount of recycling material collected to take into account the number of people on the street. That is the true frame for the statistic.
The Los Angeles Times reported in 2014 on water use in the city of Rancho Santa Fe in drought-plagued California. “On a daily per capita basis, households in this area lapped up an average of nearly five times the water used by coastal Southern California homes in September, earning them the dubious distinction of being the state’s biggest residential water hogs.” “Households” is not the relevant frame for this statistic, and the LA Times was right to report per capita, that is, per individual: perhaps the residents of Rancho Santa Fe simply have larger families, meaning more showers, more dishes, and more flushing commodes. Another frame would look at water use per acre. Rancho Santa Fe homes tend to have larger lots. Perhaps it is desirable, for fire prevention and other reasons, to keep land planted with verdant vegetation, and the large lots in Rancho Santa Fe don’t use more water on a per-acre basis than land anywhere else.
In fact, there’s a hint of this in a New York Times article on the issue: “State water officials warned against comparing per capita water use between districts; they said they expected use to be highest in wealthy communities with large properties.”
The problem with the newspaper articles is that they frame the data to make it look as though Rancho Santa Fe residents are using more than their share of water, but the data they provide—as in the case of the Los Angeles recycling example above—don’t actually show that.
Calculating proportions rather than actual numbers often helps to provide the true frame. Suppose you are northwest regional sales manager for a company that sells flux capacitors. Your sales have improved greatly, but are still no match for your nemesis in the company, Jack from the southwest. It’s hardly fair—his territory is not only geographically larger but covers a much larger population. Bonuses in your company depend on you showing the higher-ups that you have the mettle to go out and get sales.
There is a legitimate way to present your case: report your sales as a function of the area or the population of the territory you serve. In other words, instead of graphing the total number of flux capacitors sold, look at the number sold per person in the region, or per square mile. By either measure, you may well come out ahead.
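Here is a minimal sketch of that reframing, with invented figures for the two territories: the raw totals favor the larger territory, but the per-person and per-square-mile rates favor the smaller one.

```python
# Hypothetical territories: raw sales alongside population and area, so the
# same numbers can be reframed per person and per square mile.
territories = {
    "Northwest (you)":  {"sales": 8_000,  "population": 4_000_000,  "sq_miles": 250_000},
    "Southwest (Jack)": {"sales": 12_000, "population": 15_000_000, "sq_miles": 450_000},
}

for name, t in territories.items():
    per_thousand_people = t["sales"] / t["population"] * 1_000
    per_sq_mile = t["sales"] / t["sq_miles"]
    print(f"{name}: {t['sales']:,} total, "
          f"{per_thousand_people:.2f} per 1,000 residents, "
          f"{per_sq_mile:.3f} per square mile")
```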
News reports showed that 2014 was one of the deadliest years for plane crashes: 22 accidents resulted in 992 fatalities. But flying is actually safer now than it has ever been. Because there are so many more flights today than ever before, the 992 fatalities represent a dramatic decline in the number of deaths per million passengers (or per million miles flown). On any single flight on a major airline, the chances are about 1 in 5 million that you’ll be killed, making it more likely that you’ll be killed doing just about anything else—walking across the street, eating food (death by choking or unintentional poisoning is about 1,000 times more likely). The baseline for comparison is very important here. These statistics are spread out over a year—a year of airline travel, a year of eating and then either choking or being poisoned. We could change the baseline and look at each hour of the activities, and this would change the statistic.
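A minimal sketch of such a baseline change, with entirely invented figures (the real rates would need careful sourcing): measured per year, one risk is a thousand times the other, but measured per hour spent on the activity, the gap shrinks to a factor of twenty.

```python
# Hypothetical risks and exposure times, purely to show how changing the
# baseline (per year of the activity versus per hour of it) changes the comparison.
activities = {
    # name: (annual deaths per million people, hours spent on the activity per year)
    "flying":           (0.2, 10),
    "eating (choking)": (200, 500),
}

for name, (per_million_year, hours_per_year) in activities.items():
    per_million_hour = per_million_year / hours_per_year
    print(f"{name:>17}: {per_million_year:6.1f} per million per year, "
          f"{per_million_hour:5.2f} per million per hour of the activity")
```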
Statistics are often used when we seek to understand whether there is a difference between two treatments: two different fertilizers in a field, two different pain medications, two different styles of teaching, two different groups of salaries (e.g., men versus women doing the same jobs). There are many ways that two treatments can differ. There can be actual differences between them; there can be confounding factors in your sample that have nothing to do with the actual treatments; there can be errors in your measurement; or there can be random variation—little chance differences that turn up, sometimes on one side of the equation, sometimes on the other, depending on when you’re looking. The researcher’s goal is to find stable, replicable differences, and we try to distinguish those from experimental error.
Be wary, though, of the way news media use the word “significant,” because to statisticians it doesn’t mean “noteworthy.” In statistics, the word “significant” means that the results passed mathematical tests such as t-tests, chi-square tests, regression, and principal components analysis (there are hundreds). Statistical significance tests quantify how easily pure chance can explain the results. With a very large number of observations, even differences that are trivial in magnitude can be beyond what our models of chance and randomness can explain. These tests don’t know what’s noteworthy and what’s not—that’s a human judgment.
The more observations you have in the two groups, the more likely you are to find a statistically significant difference between them, even a trivially small one. Suppose I test the annual maintenance costs of two automobiles, a Ford and a Toyota, by looking at the repair records for ten of each car. Let’s say, hypothetically, that the mean cost of operating the Ford is eight cents more per year. With only ten cars in each group, this will almost certainly fail to reach statistical significance, and clearly a cost difference of eight cents a year is not going to be the deciding factor in which car to buy; it’s just too small an amount to be concerned about. But if I look at the repair records for 500,000 vehicles, that eight-cent difference may well be statistically significant, even though it’s a difference that doesn’t matter in any real-world, practical sense. Similarly, a new headache medication may be statistically significantly faster at relieving your headache, but if it’s only 2.5 seconds faster, who cares?
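Here is a minimal sketch of that effect using SciPy’s two-sample t-test computed from summary statistics; the eight-cent gap is from the example above, and the $10 spread in repair costs is invented purely for illustration. With ten cars per group the difference is nowhere near significant; with 500,000 per group the identical difference sails past the conventional threshold.

```python
from scipy import stats

# Hypothetical summary statistics: annual repair costs that differ by eight cents,
# with the same (invented) spread of $10 in both groups.
mean_ford, mean_toyota, spread = 400.08, 400.00, 10.0

for n in (10, 500_000):
    # Two-sample t-test computed directly from means, spreads, and sample sizes.
    t, p = stats.ttest_ind_from_stats(mean_ford, spread, n,
                                      mean_toyota, spread, n)
    print(f"n = {n:>7,} cars per group: t = {t:.2f}, p = {p:.4f}")
```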
You go out in your garden and see a dandelion that’s four inches high on Tuesday. You look again on Thursday and it’s six inches high. How high was it on Wednesday? We don’t know for sure because we didn’t measure it Wednesday (Wednesday’s the day you got stuck in traffic on the way home from the nursery, where you bought some weed killer). But you can guess: The dandelion was probably five inches high on Wednesday. This is interpolation. Interpolation takes two data points and estimates the value that would have occurred between them if you had taken a measurement there.
How high will the dandelion be after six months? If it’s growing one inch per day, you might say that it will grow 180 more inches in six months (roughly 180 days), for a total of 186 inches, or fifteen and a half feet. You’re using extrapolation. But have you ever seen a dandelion that tall? Probably not. They collapse under their own weight, or die of other natural causes, or get trampled, or the weed killer might get them. Interpolation isn’t a perfect technique, but if the two observations you’re considering are very close together, it usually provides a good estimate. Extrapolation, however, is riskier, because you’re making estimates outside the range of your observations.
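The same arithmetic as a minimal sketch: one straight-line estimate gives a sensible answer when asked about a point between the observations and an absurd one when pushed half a year beyond them.

```python
# The two dandelion observations: four inches on Tuesday (day 0) and
# six inches on Thursday (day 2).
days = [0, 2]
heights = [4.0, 6.0]

def linear_estimate(day):
    """Fit a straight line through the two observations and evaluate it at `day`."""
    slope = (heights[1] - heights[0]) / (days[1] - days[0])   # one inch per day
    return heights[0] + slope * (day - days[0])

print(linear_estimate(1))     # interpolation: Wednesday, 5 inches -- plausible
print(linear_estimate(182))   # extrapolation: six months after Thursday, 186 inches -- absurd
```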
The amount of time it takes a cup of coffee to cool to room temperature is governed by Newton’s law of cooling (and is affected by other factors such as the barometric pressure and the composition of the cup). If your coffee started out at 145 degrees Fahrenheit (F), you’d observe the temperature decreasing over time like this:
Elapsed Time (mins) | Temp °F
0                   | 145
1                   | 140
2                   | 135
3                   | 130
Your coffee loses five degrees every minute. If you interpolated between two observations—say you want to know what the temperature would have been at the halfway point between measurements—your interpolation is going to be quite accurate. But if you extrapolate from the pattern, you are likely to come up with an absurd answer, such as that the coffee will have dropped below freezing after thirty minutes.
The extrapolation fails to take into account a physical limit: The coffee can’t get cooler than room temperature. It also fails to take into account that the rate at which the coffee cools slows down the closer it gets to room temperature. The rest of the cooling function looks like this:
Note that the steepness of the curve in the first ten minutes doesn’t continue—it flattens out. This underscores the importance of two things when you’re extrapolating: having a large number of observations that span a wide range, and having some knowledge of the underlying process.
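Here is a minimal sketch of the two behaviors, assuming a room temperature of 70 degrees Fahrenheit (the text doesn’t specify one): the exponential curve roughly reproduces the table for the first few minutes and then flattens toward room temperature, while the straight-line extrapolation marches on toward absurdly cold coffee.

```python
import math

ROOM_TEMP = 70.0   # assumed room temperature in degrees F (not given in the text)
START = 145.0      # starting coffee temperature, degrees F

# Choose the cooling constant so the first minute loses five degrees,
# matching the table above.
k = -math.log((140.0 - ROOM_TEMP) / (START - ROOM_TEMP))

def newton_cooling(minutes):
    """Temperature predicted by Newton's law of cooling after `minutes`."""
    return ROOM_TEMP + (START - ROOM_TEMP) * math.exp(-k * minutes)

def naive_linear(minutes):
    """What you get by extrapolating 'five degrees per minute' forever."""
    return START - 5.0 * minutes

for t in (3, 10, 30, 60):
    print(f"{t:>2} min: Newton {newton_cooling(t):6.1f} F   linear {naive_linear(t):7.1f} F")
```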
When faced with precise numbers, we tend to believe that they are also accurate, but precision and accuracy are not the same thing. If I say “a lot of people are buying electric cars these days,” you assume that I’m making a guess. If I say that “16.39 percent of new car sales are electric vehicles,” you assume that I know what I’m talking about. But you’d be confusing precision with accuracy. I may have made the number up. I may have sampled only a small number of people near an electric-car dealership.
Recall the Time magazine headline I mentioned earlier, which said that more people have cell phones than have toilets. This isn’t implausible, but it is a distortion because that’s not what the U.N. study found at all. The U.N. reported that more people had access to cell phones than to toilets, which is, as we know, a different thing. One cell phone might be shared among dozens of people. The lack of sanitation is still distressing, but the headline makes it sound like if you were to count, you’d find there are more cell phones in the world than there are toilets, and that is not supported by the data.
Access is one of those words that should raise a red flag when you encounter it in statistics. People having access to health care might simply mean they live near a medical facility, not that the facility would admit them or that they could pay for it. As you learned above, C-SPAN is available in 100 million homes, but that doesn’t mean that 100 million people are watching it. I could claim that 90 percent of the world’s population has “access” to A Field Guide to Lies by showing that 90 percent of the population is within twenty-five miles of an Internet connection, rail line, road, landing strip, port, or dogsled route.
One way to lie with statistics is to compare things—datasets, populations, types of products—that are different from one another, and pretend that they’re not. As the old idiom says, you can’t compare apples with oranges.
Using dubious methods, you could claim that it is safer to be in the military during an active conflict (such as the present war in Afghanistan) than to be stateside in the comfort of your own home. Start with the 3,482 active-duty U.S. military personnel who died in 2010. Out of a total of 1,431,000 people in the military, this gives a rate of 2.4 deaths per 1,000. Across the United States, the death rate in 2010 was 8.2 deaths per 1,000. In other words, it is more than three times safer to be in the military, in a war zone, than to live in the United States.
What’s going on here? The two samples are not similar, and so shouldn’t be compared directly. Active military personnel tend to be young and in good health; they are served a nutritious diet and have good health care. The general population of the United States includes the elderly, people who are sick, gang members, crackheads, motorcycle daredevils, players of mumblety-peg, and many people who have neither a nutritious diet nor good health care; their mortality rate would be high wherever they are. And active military personnel are not all stationed in a war zone—some are stationed in very safe bases in the United States, are sitting behind desks in the Pentagon, or are stationed in recruiting stations in suburban strip malls.
U.S. News & World Report published an article comparing the proportion of Democrats and Republicans in the country going back to the 1930s. The problem is that sampling methods have changed over the years. In the 1930s and ’40s, sampling was typically done by in-person interviews and by mail lists generated from telephone directories; by the 1970s, sampling was done predominantly by telephone. Through most of the twentieth century, then, sampling skewed toward people who had telephones: wealthier people, who, at least at that time, tended to vote Republican. By the 2000s, cell phones were being sampled, which skewed toward the young, who tended to vote Democratic. We can’t really know whether the proportion of Democrats to Republicans has changed since the 1930s, because the samples are incompatible. We think we’re studying one thing but we’re studying another.
A similar problem occurs when reporting a decline in the death rate from motorcycle accidents now versus three decades ago. The more recent figures might include more three-wheel motorcycles, compared with the predominantly two-wheeled ones of the last century; and the comparison might span an era when helmets were not required by law and the present, when they are required in most states.
Be on the lookout for changing samples before drawing conclusions! U.S. News & World Report (yes, them again) wrote of an increase in the number of doctors over a twelve-year period, accompanied by a significant drop in average salary. What is the takeaway message? You might conclude that now is not a good time to enter the medical profession because there is a glut of doctors, and that supply exceeding demand has lowered every doctor’s salary. This might be true, but there is no evidence in the claim to support this.
An equally plausible argument is that over the twelve-year period, increased specialization and technology growth created more opportunities for doctors and so there were more available positions, accounting for the increase in the total number of doctors. What about the salary decline? Perhaps many older doctors retired, and were replaced by younger ones, who earn a smaller salary just out of medical school. There is no evidence presented either way. An important part of statistical literacy is recognizing that some statistics, as presented, simply cannot be interpreted.
Sometimes, this apples-and-oranges comparison results from inconsistent subsamples—ignoring a detail that you didn’t realize was important. For example, when sampling corn from a field that received a new fertilizer, you might not notice that some ears of corn get more sun and some get more water. Or when studying how traffic patterns affect street repaving, you might not realize that certain streets have more water runoff than others, influencing the need for asphalt repairs.
Amalgamating is putting things that are different (heterogeneous) into the same bin or category—a form of apples and oranges. If you’re looking at the number of defective sprockets produced by a factory, you might combine two completely different kinds in order to make the numbers come out more favorably for your particular interests.
Take an example from public policy. You might want to survey the sexual behavior of preteens and teens. How you amalgamate (or bin) the data can have a large effect on how people perceive your data. If your agenda is to raise money for educational and counseling centers, what better way to do so than to release a statistic such as “70 percent of schoolchildren ages ten to eighteen are sexually active.” We’re not surprised that seventeen- and eighteen-year-olds are, but ten-year-olds! That will surely cause grandparents to reach for the smelling salts and start writing checks. But obviously, a single category of ten-year-olds to eighteen-year-olds lumps together individuals who are likely to be sexually active with those who are not. More helpful would be separate bins that put together individuals of similar age and likely similar experiences: ten to eleven, twelve to thirteen, fourteen to fifteen, sixteen to eighteen, for example.
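To make the binning effect concrete, here is a minimal sketch with entirely made-up rates and equal cohort sizes: a single broad ten-to-eighteen bin yields one headline number, while narrower bins reveal that the activity is concentrated among the oldest children.

```python
# Made-up rates of "sexual activity" by age, with equal numbers of children at
# each age, purely to show how the choice of bins changes the headline figure.
rates = {10: 0.01, 11: 0.01, 12: 0.03, 13: 0.05, 14: 0.15,
         15: 0.30, 16: 0.50, 17: 0.70, 18: 0.80}

def rate_for(ages):
    """Average rate across an age range, assuming equal cohort sizes."""
    ages = list(ages)
    return sum(rates[a] for a in ages) / len(ages)

print(f"one broad bin, ages 10-18: {rate_for(range(10, 19)):.0%}")
for lo, hi in [(10, 11), (12, 13), (14, 15), (16, 18)]:
    print(f"ages {lo}-{hi}: {rate_for(range(lo, hi + 1)):.0%}")
```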
But that’s not the only problem. What do they mean by “sexually active”? What question was actually asked of the schoolchildren? Or were the schoolchildren even asked? Perhaps it was their parents who were asked. All kinds of biases can enter into such a number. “Sexually active” is open to interpretation, and responses will vary widely depending on how it is defined. And of course respondents may not tell the truth (reporting bias).
As another example, you might want to talk about unemployment as a general problem, but this risks combining people of very different backgrounds and contributing factors. Some are disabled and can’t work; some are fired with good cause because they were caught stealing or drunk on the job; some want to work but lack the training; some are in jail; some no longer want to work because they’ve gone back to school, joined a monastery, or are living off family money. When statistics are used to influence public policy, or to raise donations for a cause, or to make headlines, often the nuances are left out. And they can make all the difference.
These nuances often tell a story themselves about patterns in the data. People don’t become unemployed for the same reasons. The likelihood that an alcoholic or a thief will become unemployed may be four times that of someone who is not. These patterns carry information that is lost in amalgamation. Allowing these factors to become part of the data can help you to see who is unemployed and why—it could lead to better training programs for people who need it, or more Alcoholics Anonymous centers in a town that is underserved by them.
If the people and agencies who track behavior use different definitions for things, or different procedures for measuring them, the data that go into the statistic can be very dissimilar, or heterogeneous. If you’re trying to pin down the number of couples who live together but are not married, you might rely on data that have already been collected by various county and state agencies. But varying definitions can yield a categorization problem: What constitutes living together? Is it determined by how many nights a week they are together? By where their possessions are, where they get mail? Some jurisdictions recognize same-sex couples and some don’t. If you take the data from different places using different schemes, the final statistic carries very little meaning. If the recording, collection, and measurement practices vary widely across collection points, the statistic that results may not mean what you think it means.
A recent report found that the youth unemployment rate in Spain was an astonishing 60 percent. The report amalgamated into the same category people who normally would appear in separate categories: Students who were not seeking work were counted as unemployed, alongside workers who had just been laid off and workers who were seeking jobs.
In the United States, there are six different indexes (numbered U1 through U6) to track unemployment (as measured by the Bureau of Labor Statistics), and they reflect different interpretations of what “unemployed” actually means. It can include people looking for a job, people who are in school but not looking, people who are seeking full-time assignments in a company where they work only part-time, and so on.
USA Today reported in July 2015 that the unemployment rate dropped to 5.3 percent, “its lowest level since April 2008.” More comprehensive sources, including the AP, Forbes, and the New York Times, reported the reason for the apparent drop: Many people who were out of work gave up looking and so technically had left the workforce.
Amalgamating isn’t always wrong. You might choose to combine the test scores of boys and girls in a school, especially if there is no evidence that their scores differ; in fact, it’s a good idea to do so, in order to increase your sample size (which gives you a more stable estimate of whatever you’re studying). Overly broad definitions of a category (as with the sexual-activity survey mentioned earlier) or inconsistent definitions (as with the couples-living-together statistic) present problems for interpretation. When performed properly, amalgamating helps us come up with a valid analysis of the data.
Suppose that you work for the state of Utah and a large national manufacturer of baby clothes is thinking about moving to your state. You’re thinking that if you can show that Utah has a lot of births, you’re in a better position to attract the company, so you go to the Census.gov website, and graph the results for number of births by state:
Utah looks better than Alaska, D.C., Montana, Wyoming, the Dakotas, and the small states of the Northeast. But it is hardly a booming baby state compared to California, Texas, Florida, and New York. But wait, this map you’ve made shows the raw number of births and so will be weighted heavily toward states with larger populations. Instead, you could graph the birth rate per thousand people in the population:
That doesn’t help. Utah looks just like most of the rest of the country. What to do? Change the bins! You can play around with which range of values goes into each category, those five gray-to-black bars at the bottom. By making sure that Utah’s rate is in a category all by itself, you can make it stand out from the rest of the country.
Of course, this only works because Utah does in fact have the highest birth rate in the country—not by much, but it is still the highest. By choosing a bin that puts it all by itself in a color category, you’ve made it stand out. If you were trying to make a case for one of the other states, you’d have to resort to other kinds of flimflam, such as graphing the number of births per square mile, or per Walmart store, as a function of disposable income. Play around long enough and you might find a metric to make a case for any of the fifty states.
What is the right way, the non-lying way to present such a graph? This is a matter of judgment, but one relatively neutral way would be to bin the data so that 20 percent of the states are contained in each of the five bins, that is, an equal number of states per color category:
Another would be to make the bins equal in size:
This kind of statistical chicanery—using unequal bin widths in all but the last of these maps—often shows up in histograms, where the bins are typically identified by their midpoint and you have to infer the range yourself. Here are the batting averages for the 2015 season for the Top 50 qualifying Major League Baseball players (National and American Leagues):
Now, suppose that you’re the player whose batting average is .330, putting you in the second highest category. It’s time for bonus checks and you don’t want to give management any reason to deny you a bonus this year—you’ve already bought a Tesla. So change the bin widths, amalgamating your results with the two players who were batting .337, and now you’re in with the very best players. While you’re at it, close up the ensuing gap (there are no longer any batters in the .327-centered bin), creating a discontinuity in the x-axis that probably few will notice:
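Here is a minimal sketch of that maneuver with hypothetical averages: under equal-width bins the .330 hitter sits one bin below the league leaders, but widening the top bin (and quietly dropping the now-empty one) folds that hitter in with the best.

```python
# A handful of hypothetical batting averages near the top of the league.
averages = [0.316, 0.321, 0.322, 0.326, 0.330, 0.337, 0.337, 0.338]

def bin_counts(edges):
    """Count how many averages fall into each half-open [low, high) bin."""
    return {(lo, hi): sum(lo <= a < hi for a in averages)
            for lo, hi in zip(edges[:-1], edges[1:])}

honest = bin_counts([0.315, 0.320, 0.325, 0.330, 0.335, 0.340])   # equal widths
flattering = bin_counts([0.315, 0.320, 0.325, 0.330, 0.340])      # widened top bin

print(honest)      # the .330 hitter sits one bin below the .337-.338 group
print(flattering)  # the .330 hitter now shares the top bin with the best hitters
```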
The opposite of amalgamating is subdividing, and this can cause people to believe all kinds of things that aren’t so. To claim that x is a leading cause of y, I simply need to subdivide other causes into smaller and smaller categories.
Suppose you work for a manufacturer of air purifiers, and you’re on a campaign to prove that respiratory disease is the leading cause of death in the United States, overwhelming other causes like heart disease and cancer. As of today, the actual leading cause of death in the United States is heart disease. The U.S. Centers for Disease Control report that these were the top three causes of death in 2013:
Heart disease: 611,105
Cancer: 584,881
Chronic lower respiratory diseases: 149,205
Now, setting aside the pesky detail that home air purifiers may not form a significant line of defense against chronic respiratory disease, these numbers don’t make a compelling case for your company. Sure, you’d like to save more than 100,000 lives annually, but to say that you’re fighting the third largest cause of death doesn’t make for a very impressive ad campaign. But wait! Heart disease isn’t one thing, it’s several:
Acute rheumatic fever and chronic rheumatic heart disease: 3,260
Hypertensive heart disease: 37,144
Acute myocardial infarction: 116,793
Heart failure: 65,120
And so on. Next, break up the cancers into small subtypes. By failing to amalgamate and instead creating these fine subdivisions, you’ve done it: chronic lower respiratory disease becomes the number one killer. You’ve just earned yourself a bonus. Some food companies have used this subdivide strategy to hide the amounts of fats and sugars contained in their products.
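To show the mechanics with the numbers above, here is a minimal sketch; the cancer subtypes are simply omitted, on the illustrative assumption that each one, once split out, also falls below the respiratory figure. The same deaths yield two different leading causes, depending on how the categories are drawn.

```python
# The death counts quoted above (CDC, 2013). In the subdivided table, heart
# disease is split into the listed subtypes; cancer subtypes are omitted on the
# illustrative assumption that each one also falls below 149,205.
amalgamated = {
    "Heart disease": 611_105,
    "Cancer": 584_881,
    "Chronic lower respiratory diseases": 149_205,
}

subdivided = {
    "Acute rheumatic fever / chronic rheumatic heart disease": 3_260,
    "Hypertensive heart disease": 37_144,
    "Acute myocardial infarction": 116_793,
    "Heart failure": 65_120,
    "Chronic lower respiratory diseases": 149_205,
}

for label, table in (("amalgamated", amalgamated), ("subdivided", subdivided)):
    top = max(table, key=table.get)
    print(f"{label}: leading cause is {top} ({table[top]:,} deaths)")
```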