Introduction: What Are You to Believe?
Before obtaining certainty we must often be satisfied with a more or less plausible guess.
—George Polya1
Try this sometime. Ask a friend, “Why do you believe what you believe? What sort of evidence persuades you that someone is right or that a product is good?” This question seldom elicits a careful, thoughtful response. Rather, it elicits silence and narrowed eyes. Most people think that their beliefs are shaped by logic and reason. Your friend will likely detect a whiff of insult in the question.
But our beliefs are fueled by much more than reason and fact. Yes, we are persuaded by solid evidence assembled into arguments that conform to principles of logic. But that’s true only for the messages that we examine, and we don’t have the time to audit every advertisement we hear and blog posting we read. We are pelted by information almost constantly. Just think of the ubiquity of screens. At airport gates, in restaurants, in waiting rooms, in the post office, even in hotel elevators. If a location provides a captive human audience, there is likely to be a screen, flashing updates from Afghanistan, coverage of a golf tournament, or an advertisement for Claritin. Much of this information is not neutral. It is meant to persuade us of something. Yet we don’t have the time or the mental energy to think through every message that comes our way.
Are we influenced by messages that we ignore? I stand in line at my bank and notice a large television behind the teller, displaying a channel exclusive to my bank. An advertisement appears, showing a sedan wending along a New England country road, scattering autumn leaves. I go into a reverie, thinking of the Berkshire mountains. I haven’t consciously noticed the make of the car . . . but am I nevertheless influenced? When I next need a car, even if it’s four years from now, perhaps I’ll be a bit more likely to buy this model because I was exposed to this ad. Will I be more likely to apply for a car loan at this bank, rather than shopping around for the best rate? Is it possible for attitudes to change outside my awareness? Although it makes us uncomfortable to contemplate it, psychological research from the last fifty years indicates that the answer is yes.
Sometimes, of course, I do pay attention to these messages, and I don’t fully trust what I’m hearing. For example, when I read Mother Jones or the Weekly Standard, I am aware that each has a political point of view, and I try to remember that information may be omitted or the interpretation of facts stretched to be consistent with that view. When I hear the president of Iran give a speech, I recall that he has denied that the Holocaust took place, so I am wary of any claim he makes. When I listen carefully to messages, am I able to account for the bias or trustworthiness of the source? To some extent, yes, but not completely.
I am making it sound as though we all are buffeted about—no, worse, systematically manipulated—by forces that operate outside our awareness or, even if we are aware of them, outside our control. Putting it that way is a bit dramatic, but it’s not far from the truth.
This book will tell you how to evaluate new ideas—in particular, those related to education—so that you are less likely to be persuaded by bad evidence.
Forewarned is forearmed. The first step in defending yourself from hidden persuaders is identifying them. I begin with what is perhaps the strangest example. The very shape that carries information to you has an impact on whether or not you believe this information. This story is a bit complex, although the mathematics behind it is relatively simple.
You and I have a number in common, a number that influences what we consider beautiful and worthy of our sustained attention: 1.618. (Actually, it’s 1.6180339887, but I’ll use the truncated version.) It’s important not as a number but as a ratio, and the simplest way to understand it is to consider the rectangle shown in Figure I.1.
FIGURE I.1: A rectangle with sides proportional to the Golden Ratio.
The ratio of the length of side b to side a is 1.618, and people find rectangles of this proportion more aesthetically pleasing than other rectangles. Confronted with, say, thirty rectangles of various proportions, most people pick this one as the most attractive. Because of its importance in aesthetics, 1.618 is called the Golden Ratio.
Researchers have observed this ratio in classical architecture. For example, the width and height of the façade of the Parthenon in Greece respect the Golden Ratio. It is also observed in the Great Pyramid of Giza. If one forms a triangle as shown, the ratio of the length of one face to half the length of the base is within 1 percent of the Golden Ratio (Figure I.2).
FIGURE I.2: Classic works of architecture such as the Parthenon (or the reproduction in Nashville, Tennessee, shown here) and the Great Pyramid of Giza have the Golden Ratio embedded in their proportions.
The Golden Ratio is observed in smaller-scale works of art as well, including the placement of figures in paintings by da Vinci and the elements of a Stradivarius violin (Figure I.3).
FIGURE I.3: Iconic works of Western art that show the Golden Ratio in their proportions.
Why would this ratio be aesthetically pleasing across cultures and across centuries? A reasonable suggestion is that it is commonly observed in nature. Indeed, the Golden Ratio is found in proportions of the human body (Figure I.4) and the human face, especially faces that others find attractive.
FIGURE I.4: Ratios of body parts also show the Golden Ratio. See text for description.
If the distance between the navel and the foot is taken as 1 unit, the height of a human being is typically equivalent to 1.618. Some other golden proportions in the average human body are
Naturally, there is variation across individuals in these proportions. The Golden Ratio is observed when we take averages across many individuals, and individuals with the “ideal proportions” are judged by others as having well-proportioned bodies.
The same is true for faces, and here the relationship to attractiveness is easy to appreciate. Faces are attractive not only because the eyes and the mouth are well shaped. The proportions of the face must be right. If a person’s eyes are too close together or too far apart, he or she is not attractive. The actress Jessica Alba, commonly considered to be very attractive, not only has a dazzling smile and beautiful eyes, but the distances between her features match the Golden Ratio perfectly (Figure I.5).
FIGURE I.5: Jessica Alba (a) is commonly considered one of the most beautiful women in Hollywood. These photos show some of the Golden Ratios found in the proportions of the ideal human face: (b) distance between pupils / distance between eyebrows; (c) width of mouth / width of nose; and (d) distance between lips and where eyebrows meet / length of nose.
The Golden Ratio is observed elsewhere in nature as a spiral. To understand how, you need a basic understanding of the underlying mathematics. The Golden Ratio was first described by twelfth-century mathematician Leonardo Fibonacci. Perhaps you’ve heard of the Fibonacci sequence: I begin with the numbers 0 and 1, and then add the last two numbers in the sequence to generate the next number. That is, 0 + 1 = 1, so the sequence begins 0, 1, 1. To obtain the next number, I add the final two in the sequence thus far, hence, 1 + 1 = 2. So now the sequence is 0, 1, 1, 2. Continuing, the sequence is: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, and so on. If I take the ratio of successive numbers, the values converge on the Golden Ratio (Table I.1).
TABLE I.1: The ratios of neighboring numbers in the Fibonacci sequence converge on the Golden Ratio.
Ratio | Value |
3 to 2 | 1.5000 |
8 to 5 | 1.6000 |
21 to 13 | 1.6154 |
55 to 34 | 1.6176 |
144 to 89 | 1.6180 |
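The convergence described above is easy to check by hand or by machine. The following is a minimal sketch, not part of the original text; the function names `fibonacci` and `successive_ratios` are chosen here for illustration, and the reference value is computed from the standard closed form (1 + √5) / 2.

```python
from math import sqrt

# Closed-form value of the Golden Ratio, about 1.6180339887.
GOLDEN_RATIO = (1 + sqrt(5)) / 2


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 0, 1, 1, 2, ..."""
    sequence = [0, 1]
    while len(sequence) < count:
        # Each new number is the sum of the last two.
        sequence.append(sequence[-1] + sequence[-2])
    return sequence[:count]


def successive_ratios(sequence):
    """Divide each number by the one before it, skipping the leading 0."""
    return [
        later / earlier
        for earlier, later in zip(sequence, sequence[1:])
        if earlier != 0
    ]


numbers = fibonacci(13)  # 0, 1, 1, 2, 3, 5, ..., 144
ratios = successive_ratios(numbers)

# Print every neighboring pair with its ratio and its distance from the limit.
for (earlier, later), ratio in zip(zip(numbers[1:], numbers[2:]), ratios):
    gap = abs(ratio - GOLDEN_RATIO)
    print(f"{later} to {earlier}: {ratio:.4f} (off by {gap:.4f})")
```

Running it prints every neighboring pair, not just the five shown in the table, so the back-and-forth approach to 1.618 (alternately a little above and a little below) is visible.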
Now suppose that I create squares, each with sides equivalent to the numbers in the Fibonacci sequence (that is, I create squares whose sides are of lengths 1, 1, 2, 3, 5, and so on). Each square I create is added to the others so that they form a rectangle (Figure I.6). I can create an arc by connecting opposite corners of the squares.
FIGURE I.6: A Fibonacci arc. See text for description.
This is called a Fibonacci arc, and it too is observed in nature: for example, in the shape of seashells like the nautilus, and in the pattern of the seeds of flowers (such as the sunflower and daisy, as shown in Figure I.7). Spirals are observed in other plants as well, such as the cauliflower, although they are easier to see in the Romanesco (a kind of broccoli-cauliflower hybrid).
FIGURE I.7: Examples of Fibonacci arcs observed in nature.
Fibonacci sequences are also present, though more subtly so, in the arrangement of leaves of many plants.
For example, in the rubber plant shown in Figure I.8, starting from the top we have three clockwise rotations before we meet another leaf directly below the first, passing five leaves on the way. If we go counterclockwise, we need just two rotations. Note that 2, 3, and 5 are consecutive Fibonacci numbers. This ratio of rotations to leaves is commonly observed.
FIGURE I.8: The leaves of many plants grow in a Fibonacci spiral, centered on the stem.
The interpretation of the aesthetic value of the Golden Ratio would seem to be clear: we are naturally drawn to objects showing the Golden Ratio because this ratio is found throughout nature.
But what is the connection of the Golden Ratio to persuasion? The great nineteenth-century British poet John Keats ended “Ode on a Grecian Urn” with these words: “Beauty is truth, truth beauty. That is all ye know on earth, and all ye need to know.” Keats, it turns out, was an excellent psychologist. We associate beauty and truth. When we see something that is physically beautiful, we assume that it has other good qualities, including truthfulness.
In semiotics (the study of symbols) one would call this a “sign.” Just as red means “hot” and blue means “cold,” beauty means “truth.” But the significance of red and blue to temperature is a cultural convention, and one that each of us must learn. The connection of beauty and truth is made across cultures, and need not be learned. It seems to be a natural part of the human makeup.
People are more likely to believe the contents of a book or magazine if its dimensions correspond to the Golden Ratio. Children’s books might be square, and so might art books or cookbooks, but something like 95 percent of the nonfiction books that seek to persuade are sold in dimensions within 2 percent of the Golden Ratio (Figure I.9). The figure for magazines is over 90 percent.
FIGURE I.9: A surprisingly high percentage of nonfiction books use page formats corresponding to the Golden Ratio, but only those that seek to persuade.
The Golden Ratio does exert a powerful and powerfully subtle influence on persuasion. Or it would, if not for a small problem: the Golden Ratio theory is bunk.
Some of the statistics I’ve cited here are just plain inaccuracies. Studies have been conducted in which people (ordinary people2 or professional artists and designers3) are shown a large selection of rectangles and are asked which they find most attractive. It’s not the case that people select the Golden Ratio rectangles. Another study examined the dimensions of 565 rectangular paintings by famous artists. Artists showed no predilection for canvas sizes that respected the Golden Ratio (the mean ratio was 1.34).4 And natural objects like the human body, faces, and seashells show lots of variability. It’s not the case that the most attractive show the Golden Ratio.5 The statistics about the dimensions of books and magazines are complete fabrications.
Some of the Golden Ratio phenomena are accurate but trivial—trivial because examples that fit the Golden Ratio are emphasized, and examples that do not fit are ignored. Why evaluate the Parthenon and not the Pantheon? Why the pyramid of Giza and not the pyramid of Khafre? For that matter, why not the Roman Colosseum, the Taj Mahal, the Alhambra, or the Eiffel Tower? Then, too, a complex figure like the Parthenon or The Last Supper has many measurable features; that makes it too easy to pick and choose measurements that yield the desired ratio.6
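The pick-and-choose problem can be made concrete with a small simulation. This is a rough sketch under stated assumptions, not an analysis from the studies cited: the lengths are drawn at random between 1 and 10 units (an arbitrary choice), a pair counts as "golden" if its ratio is within 2 percent of 1.618, and the names `golden_pairs` and `simulate` are invented for illustration.

```python
import random
from itertools import combinations
from math import sqrt

GOLDEN_RATIO = (1 + sqrt(5)) / 2


def golden_pairs(measurements, tolerance=0.02):
    """Return the pairs whose larger-to-smaller ratio falls within
    `tolerance` (a fraction, so 0.02 means 2 percent) of the Golden Ratio."""
    matches = []
    for a, b in combinations(measurements, 2):
        small, large = sorted((a, b))
        ratio = large / small
        if abs(ratio - GOLDEN_RATIO) / GOLDEN_RATIO <= tolerance:
            matches.append((small, large))
    return matches


def simulate(n_measurements, trials=1000, seed=0):
    """Fraction of trials in which purely random lengths contain at least
    one 'golden' pair. The uniform 1-to-10 range is an assumption."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        lengths = [rng.uniform(1.0, 10.0) for _ in range(n_measurements)]
        if golden_pairs(lengths):
            hits += 1
    return hits / trials


# The more features one is free to measure, the easier it is to find a match.
for n in (2, 5, 10, 20):
    print(f"{n} measurements: golden pair found in {simulate(n):.0%} of trials")
```

With only two measurements a match is rare, but twenty measurements give 190 possible pairs to choose from, so a "golden" pair turns up in nearly every trial even though the lengths are random.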
I apologize for beginning this book with a sucker punch. (Maybe some part of me wanted company. I fell for the Golden Ratio hook, line, and sinker when I first heard it.a) The Golden Ratio is not interesting because it’s true. It’s interesting because the idea survives and continues to attract believers even though it is known to be wrong. In that way, it’s an object lesson for this book. Knowing what to believe is a problem.
People believe lots of things for which the scientific evidence is absent: that a special coin brings them luck, that aliens visit Earth regularly, or that astrological predictions are better than chance.b Many such beliefs, though unfounded, are harmless. Maybe they cost us a little time or money, but we find them fun or interesting, and we don’t take them all that seriously anyway.
But unfounded beliefs related to schooling are of greater concern. The costs in time and money can be substantial; worse, faulty beliefs about learning can potentially cost kids their education. Scientific tools can be a real help in sorting out which methods and materials truly help students learn and which do not. We cannot afford to let educational practice be guided by hunch or hope if better information is available. But even though scientific tools are routinely applied, the product is often ignored, or else it’s twisted by people with dollars on their minds.
Consider learning styles theories. These theories maintain that different people have different ways of learning, and that we can identify an individual’s style, tune our teaching to that style, and make learning easier or more effective. For example, the most popular theory of learning styles holds that some people learn best by seeing things (visual learners), some by hearing things (auditory learners), and some by manipulating objects (kinesthetic learners). This theory has been around for at least twenty-five years, and it has been tested in scientific experiments. In fact, testing the theory is quite straightforward.
Experiments like this have been conducted, and there is no support for the learning styles idea.7 Not for visual, auditory, or kinesthetic learners, nor for linear or holistic learners, nor for any of the other learners described by learning styles theories.
Yet if you search for “learning styles” on the Internet, you will not find a brief, academic obituary for this interesting idea that turned out to be wrong. You’ll find almost two million hits. You’ll find almost two thousand books on Amazon. You’ll find the term mentioned on the syllabi of thousands of college courses. And you’ll find lots and lots of products that promise improved educational outcomes once you know students’ learning styles . . . although knowing a child’s learning style often requires buying the book they want to sell you, or attending a workshop they are conducting.
The main cost of learning styles seems to be wasted time and money, and some worry on the part of teachers who feel that they ought to be paying more attention to learning styles, for it appears that most teachers don’t do much with them. The cost of other scientifically inaccurate beliefs has been more substantial. Consider this example. Before about 1920, the way to teach children to read seemed obvious. You start by teaching them the sound associated with each letter or letter combination (Figure I.10).
FIGURE I.10: For many years, students learning to read were first taught to associate the shape of letters with associated sounds, as in this image, reproduced from the New England Primer, published around 1760.
In the first quarter of the twentieth century, another theory of reading rose to prominence.8 In essence, it argued that children should be taught to read the way adults read. Adults seem to read entire words or even phrases all at once. (Watch the eyes of someone reading, and you’ll see that they do not dwell on each word, but rather stop a few times as they scan each line.) Adults read silently, which is much faster than reading aloud. And adults read what interests them. Children, in contrast, are taught to read sound by sound (not whole words), aloud (not silently), and out of boring primers (not engaging material).
In what became known as the look-say or whole-word method, children were encouraged to memorize entire words. Books for reading instruction used a limited set of words so as to make memorization possible.9 Students were encouraged to guess at a word based on the surrounding context and pictures if they did not recognize it. This method also emphasized that phonics drills (memorizing letters and their associated sounds) are boring and that using them was likely to make children hate reading. Instead, the whole-word method suggested that children be surrounded by real books, not drill materials, and that the stories be ones that they can understand and identify with. The whole-word method became dominant in American education during the 1930s and 1940s.10
Two factors might have tipped off educators that this approach to reading instruction was suspect. First, written language is a sound-based system, not a meaning-based system. Seeing the three letters d, o, and g doesn’t tell you meaning. Letters signify sounds. If that weren’t true, then when I showed you an unfamiliar word—for example, “mielesta”—you wouldn’t just be uncertain of its meaning; you would also have no idea of how to pronounce it. Given that writing is sound based, teaching reading with a method that ignores sound seems risky.
Second, the theory encourages the teaching of reading based on the way adults read. On the one hand, you can see the logic: if you want to learn something, find someone who is good at it and then try to do what he or she does. On the other hand, there’s no guarantee that the expert did it that way when he or she was a beginner. An expert basketball player no longer needs to think about the basics of ball handling and footwork because that has been so extensively practiced. The expert thinks about play making and strategy, but the beginner needs to think about fundamentals. Copying an expert reader is not necessarily a good strategy for beginning readers.
In 1955, the book Why Johnny Can’t Read was published.11 It argued that if the direct teaching of the sounds that go with letters were omitted, reading was not being taught. The book was a strident, take-no-prisoners invective, and it became a best seller. The book was negatively reviewed by many education professionals, however.12 Professors who studied reading argued that the book was ill-informed and that the author was simply wrong. Over the next several years, arguments raged over how to teach reading, a conflict later dubbed “the Reading Wars.”
In 1961, the Carnegie Corporation sought a scholar to comb through all of the scientific studies and to draw a conclusion: Was phonics-based or whole-word instruction superior? Jeanne Chall, a professor at the Harvard Graduate School of Education, was selected to conduct the review. In her 1967 book, she said that the relevant research showed that the phonics method was superior.13
That sounds clear enough, right? Education rides off the rails briefly (well, actually for thirty years or so), but science comes to the rescue. So we would expect that post-1967, the whole-word approach to reading instruction would be relegated to the dustbin. Well, we’d be wrong. The basic idea behind whole-word reading resurfaced in the mid-1980s.14 Renamed “whole language,” the pitch was familiar: phonics instruction deadens interest and is unnecessary. Learning to read is as natural as learning to speak. Just surround kids with authentic materials, and they will learn to read on their own. The Reading Wars began again. Some districts and even entire states (notably California) adopted curricula based on the whole-language method of teaching reading.
In 1997, Congress asked the Department of Education to draw together a panel of reading experts to sort through the scientific research on reading instruction. Their conclusion, published in 2000, matched that of Chall in 1967.15 Phonics instruction is a critical part of learning to read. In its absence, some kids figure out on their own the sounds that go with letters and letter combinations. But some don’t. And those kids will end up disliking reading, and some of them will end up being labeled as dyslexic.
The first phase of the Reading Wars was understandable. Someone had an ill-conceived theory about reading instruction. It sounded good, so people tried it. It’s somewhat more difficult to understand why it took as long as it did—some thirty years—for the scientific evidence to influence public opinion and public policy. It is nearly inconceivable that the same mistake was made again twenty years later, sparking the second phase of the Reading Wars.c
When science is brought to bear on education problems carelessly or underhandedly, perhaps the worst damage is inflicted on kids with disabilities. (As the parent of a child with Edwards’ syndrome—also called Trisomy 18—I have personal experience here.) Many developmental disabilities do not have effective treatments, and parents are desperate. They are willing, even eager, to try unproven alternative treatments—anything that might work, anything that holds out some hope. Furthermore, there are a lot of disabled kids in this country—estimates are that about 13 percent of kids have some disability, ranging from very mild speech impairments to chromosome disorders that affect nearly every aspect of intellectual and physical development.16
Scam artists go where the money is, and the parents of kids with autism spectrum disorder (ASD) are some of their preferred targets because there are a lot of them. Kids with ASD show a fairly broad range of characteristic behaviors, but they tend to have these in common: (1) difficulty with communication, both verbal and nonverbal (that is, pointing, gesturing); (2) problems in social relations, especially in perceiving the emotions and thinking of others; and (3) repetitive behaviors, such as following a strict routine or repeating words or actions again and again. Rates of autism have skyrocketed since 1994 (probably due to changes in diagnostic criteria17) so that now approximately 1 in 110 American kids is diagnosed.18
Treatment options are limited. The most reliable are behavioral therapies. These boil down to trying to teach the child which behavior is appropriate in a given situation (for example, to make eye contact and to respond when a cashier says “thank you”). If the child knows which behavior is appropriate but usually doesn’t do it, the focus is on increasing how often he or she does it.
Behavioral therapy for ASD is frustrating for everyone involved. It’s slow and painstaking, and it must feel to parents like a Band-Aid. It doesn’t address the underlying problem, just the symptoms. The underlying problem is certainly not behavioral; kids don’t have ASD because of something their parents did or didn’t do. ASD has a biological basis. So it seems that the treatment ought to be biological.d
Hucksters offer a bazaar of dubious biological therapies for ASD, none of them approved by the Food and Drug Administration, and all of them seeming to offer the promise of getting at the root cause. The least expensive and safest (but certainly inconvenient to follow) include vitamins and supplements or special diets. Other therapies can be wildly expensive—for example, hyperbaric oxygen therapy. Here the child is put in a sealed environment of enriched oxygen at a pressure greater than atmospheric pressure, which helps the blood carry more oxygen to the organs. Treatments might cost several thousand dollars each month. Another unproven ASD treatment is immunoglobulin (antibodies approved for leukemia and AIDS), which runs about $10,000. Much worse than the costs are the potential side effects. Hyperbaric oxygen therapy can put stress on the lungs, heart, and other organs. Immunoglobulin can cause fever, headache, meningitis, or anaphylactic shock.19
What are these parents thinking? Why would they subject their children to unproven therapies? As is so often the case, treatments that initially sound bizarre do have a certain logic, once you scratch a bit beneath the surface. There are data showing evidence of inflammation in the brains of autistic children. These data come from a study published in the prestigious scientific journal Annals of Neurology, by a research team at Johns Hopkins University.20 The claimed benefit of hyperbaric oxygen therapy and of immunoglobulin is the reduction of inflammation. So there is a rationale for the treatment.
If you were a parent listening to someone trying to sell you one of these treatments, the odds are good that you would be told about the study showing inflammation in the brains of kids with ASD. What you would not be told is that the researchers anticipated that quacks would rush to use their research as the basis for ASD “remedies.” So on their Web site they published a plain-language explanation of the findings, along with a strongly worded warning against using these findings: “THERE IS NO indication for using anti-inflammatory medications in patients with autism.”21 The Web site mentions immunoglobulin in particular as unlikely to have a significant effect because of the mechanism by which it reduces inflammation.
When we read about fringe ASD treatments, there is a strong temptation to think, “I’m no sucker. I wouldn’t believe something for which there is no evidence.” Other parents, teachers, and administrators aren’t stupid either. As I note, the treatments they believe will work do have a certain logic behind them. Whole-language advocates were correct in criticizing many phonics drills as boring, and the idea of following the methods used by more expert readers has a surface plausibility. The purveyors of unproven ASD therapies can point to reputable scientific studies as their backing, and it would take some scientific sophistication to know that the studies were being misinterpreted. Digging deep enough to spot the misinterpretation may be tougher than you’d think.
Suppose you’re a parent looking for supplementary support for your child with dyslexia, or a teacher curious about your district’s plan to put a new math program in place, or an educational administrator who has been asked by a superintendent to attend a weekend seminar on team building. In each case, you are assured that the program is “research based.”
If you want to know whether something is really research based, how could you know? Well, “research based” means that someone has done some formal scientific studies to evaluate whether or not the program, therapy, or gadget really does what it’s purported to do. Such research will be published in specialty journals that are devoted to this sort of thing, so that’s where you will have to look for it. The simple act of trying to locate scientific studies of a practice may tell you that such studies haven’t been done. That alone is useful, and, happily, merely finding out whether the studies have been done is now fairly easy using the Internet. I’ll have more to say about this in Chapter Seven.
But knowing whether or not relevant research exists is usually not enough. We saw this in the case of immunoglobulin therapy for ASD. There is bona fide, high-quality evidence for inflammation in the brains of kids with ASD. There is bona fide, high-quality evidence that immunoglobulin can reduce inflammation. But to understand why the therapy is unlikely to work, you need finer-grained knowledge than most of us have; I’m dimly aware that there are multiple mechanisms by which brain tissue can become inflamed, but I really doubt I would have thought to wonder about that. I also would not have known that inflammation is not always a bad thing; it turns out that inflammation is sometimes a sign that the brain is trying to repair itself. I also probably would not have thought of the possibility that there is some other factor, call it X, that causes ASD and causes inflammation as a by-product. Treating the inflammation would be like treating the fever you get when you have the flu. It doesn’t make the virus go away, because fever is a symptom, not a cause.
Here’s another example of the need for deep knowledge when trying to evaluate whether something is research based. When I search “secretin autism” in Google Scholar (a database of scholarly research), I get 2,010 hits.22 (Secretin is a hormone that is important in digestion.) The first article is titled “Lack of Benefit of a Single Dose of Synthetic Human Secretin in the Treatment of Autism and Pervasive Developmental Disorder.” The second article is titled “Improved Social and Language Skills After Secretin Administration in Patients with Autistic Spectrum Disorders.” Hmm. So it seems there’s some controversy.23 Unfortunately, that’s typical. Human behavior is not a simple cause-and-effect system. Behaviors (for example, repetitive behaviors in kids with ASD) usually have multiple causes—for example, stress may make symptoms worse. And the problems will vary across kids. So even if secretin has some positive effects, you may see them in some studies and not in others. More important, studies will vary in quality; there are better and worse ways of conducting scientific research, and a study doesn’t have to be perfect to be published in a scientific journal. So what you really need to do is look at all the studies that have been done and try to see whether the ones that employ the best methodology are also the ones that show positive effects for secretin.
That sounds hard enough to do, but the problem is still one step more difficult than that. It’s not easy to know what constitutes a “good” study. Obviously, there are principles that guide research design and the use of statistics. Practice in thinking about those is certainly a help. But evaluating research quality also requires knowing the relevant scientific content. That’s because the content affects your interpretation of whether or not the study was well done. Here’s a simple example. Let’s say you read a study of the effect of secretin on the behavior of kids with ASD. The study reports that secretin didn’t help. You happen to notice that the study didn’t test boys and girls separately. All the kids with ASD were considered one big group. Does that make it a bad study? It depends. If previous research had shown that gender was an important variable in the way secretin works, that might mean this was a bad study. Or if there was reason to think that gender mattered in ASD, either in its symptomology or its treatment, then researchers probably should look at the effect of secretin separately on boys and girls. For any study, you can usually think up dozens of distinctions—right- or left-handedness, time of day, other medications taken, diet, gene markers—that might make a difference. And if we know that a factor has been important in prior work and the researcher ignored it, that’s a valid criticism.
Or suppose that the results of a study seemed to directly conflict with the results of previous research. The author ought to at least discuss the possible reasons for the conflict, if not pursue the issue in a new study. Or suppose that prior research has shown that the statistical method one might typically use in a given situation doesn’t work in this specialized case. Statistical techniques always rely on assumptions about the data, and let’s say it’s known that an important assumption doesn’t hold for a particular reading test when kids under the age of twelve take it with a time limit. It’s hard to be an intelligent reader of scientific articles if you are not already fairly well versed in the content.
All of the details I’m listing here are simply to emphasize that (1) saying “the research supports it” ought to mean that the research was conducted in the right way, and (2) knowing whether or not the research was conducted in the right way is no small matter. That’s not to say that only professional researchers can assess scientific quality. I’ve met people who became expert on one topic or another and were sophisticated consumers of research. But it took them a long time to get that way. Research expertise is just like any other type of expertise—developing it takes a lot of hard work and practice. Most people who have families, jobs, and other responsibilities cannot put in that sort of time. Is there a way to evaluate research that does not require becoming an expert?
We undertake other tasks that, if done properly, would be terribly complex and time consuming. The typical solution is not to invest the time and energy into doing a foolproof job. Rather, we find shortcuts that, although imperfect, get the job done. Consider the process of buying a car. It’s a big purchase, and you want to be sure that you get the most for your money, right? If you really want to optimize this decision, here’s what you ought to do. First, rate the importance to you of all of the features of an automobile from, say, 0 to 1.0. Hence, you might give “reliability” a 0.8, but “heated seats” just a 0.2. Second, rate every model of automobile for each of these features, using a 1–10 scale. The Porsche 911 gets a 10 for style, the Taurus gets a 3, and so forth. Third, for each car, multiply the ratings from step 2 and the importance values from step 1, and add the products. Now you have an overall value for each car that represents how much you like it. An example for a small set of features and for three vehicles is shown in Table I.2.
TABLE I.2: An example of a “logical” way to select an automobile for purchase.
Now that you know the desirability of each car, you need to factor in cost. In step 4, you would research the maintenance costs for each car, as well as depreciation. In step 5, you would visit every car dealership and negotiate a price for every model that they carry. In step 6, you would repeat steps 1 through 5 for used cars. In step 7, you would combine all of your information about desirability and cost to select the optimal car.
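For readers who think in code, the desirability score described above (multiply each rating by its importance value, then add the products) can be sketched in a few lines of Python. This is only an illustration. The weights for reliability and heated seats and the style ratings for the Porsche 911 and the Taurus come from the text; every other number, the "Minivan" entry, and the function names are invented here and are not the values in Table I.2.

```python
# Importance of each feature, on a 0 to 1.0 scale.
# "reliability" and "heated seats" come from the text; "style" is an assumed value.
IMPORTANCE = {"reliability": 0.8, "style": 0.5, "heated seats": 0.2}

# Ratings of each car on each feature, on a 1-10 scale.
# Only the two style ratings (10 and 3) come from the text; the rest are invented.
RATINGS = {
    "Porsche 911": {"reliability": 6, "style": 10, "heated seats": 8},
    "Taurus": {"reliability": 8, "style": 3, "heated seats": 5},
    "Minivan": {"reliability": 9, "style": 2, "heated seats": 1},  # hypothetical car
}


def desirability(ratings, importance):
    """Multiply each rating by its importance value and add the products."""
    return sum(importance[feature] * ratings[feature] for feature in importance)


def rank_cars(all_ratings, importance):
    """Return (car, score) pairs sorted from most to least desirable."""
    scores = {car: desirability(r, importance) for car, r in all_ratings.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for car, score in rank_cars(RATINGS, IMPORTANCE):
        print(f"{car}: {score:.1f}")
```

Even this toy version shows where the effort goes: the arithmetic is trivial, but someone has to supply a defensible number for every feature of every car.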
Obviously, no one picks a car this way. It’s too time consuming. We are confronted by many tasks with similar properties: there is a way to do the task that might be optimal, but we lack the time or knowledge to complete the task that way. So what do we do instead? We use an imperfect method called a heuristic. A heuristic is a shortcut. It’s not the best way to do something, but it yields a solution that’s usually pretty good, and it has the great benefit of practicality: it’s easy to calculate. When I last bought a car, my heuristic was “Buy the first car you see that is fairly reliable, has four doors, a big trunk, room for two car seats in back, and costs less than $18,000.” I may not end up with the optimal car, but by putting up front the characteristics I know I care about a great deal, I’ll probably end up satisfied. And I’ve made the problem manageable.
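The car-buying heuristic can be sketched the same way: a checklist of must-have features, applied to cars in the order they are seen, stopping at the first one that passes. The criteria below are the ones listed in the text. The `Car` class, its field names, and the sample cars are hypothetical, made up only to show the shape of the shortcut.

```python
from dataclasses import dataclass


@dataclass
class Car:
    # Hypothetical record; the fields mirror the criteria named in the text.
    name: str
    fairly_reliable: bool
    doors: int
    big_trunk: bool
    fits_two_car_seats: bool
    price: int  # in dollars


def acceptable(car, max_price=18_000):
    """Return True only if the car meets every criterion on the checklist."""
    return (
        car.fairly_reliable
        and car.doors == 4
        and car.big_trunk
        and car.fits_two_car_seats
        and car.price < max_price
    )


def first_acceptable(cars_in_order_seen):
    """Stop at the first car that passes; later cars are never compared."""
    for car in cars_in_order_seen:
        if acceptable(car):
            return car
    return None


if __name__ == "__main__":
    lot = [  # invented examples
        Car("Coupe", True, 2, False, False, 15_000),
        Car("Sedan", True, 4, True, True, 17_500),
        Car("Wagon", True, 4, True, True, 16_000),  # cheaper, but never examined
    ]
    choice = first_acceptable(lot)
    print(choice.name if choice else "keep looking")
```

In this made-up lot the search stops at the sedan even though the wagon is cheaper. That is the trade the heuristic makes: a good-enough answer in exchange for not examining every option.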
Evaluating research is like buying a car. There’s an optimal solution to the problem, which is to read and digest all of the relevant research, but most of us don’t have time to execute the optimal solution. The epigraph to this chapter says, “Before obtaining certainty we must often be satisfied with a more or less plausible guess.” Indeed, when certainty is not available, a plausible guess is the best we’re going to do. What we need is a good shortcut.
Our education system has multiple levels. The federal government attempts to influence the education policy of state governments. There are some thirteen thousand local school districts,24 each with its own administration, making decisions within the framework set by the states. School principals run their schools within the framework set by the district administration, and teachers run their classrooms within the framework set by the principal. If kids aren’t learning enough in those classrooms, parents try to supplement what they learn.
At every level of this system, people—motivated by politics, money, or altruism—try to influence what happens. And one of the most frequently used persuasive techniques is to figuratively don the white lab coat of science and intone, “The research says . . .” I’ve said that a shortcut is the best way to judge whether or not a claim really is scientifically supported, but before I introduce the shortcut, I need to persuade you that it makes sense. The reasoning behind the shortcut I described for buying a car is pretty transparent: cars have so many features that you can’t possibly evaluate all of them, so instead you pay attention only to the most important. For the research shortcut to look logical, we need to get a few issues straight.
First, we need to understand what sorts of things people find persuasive. I’ve already said that the people who try seemingly odd educational remedies—those who put their children in oxygen pressure tanks or who figure that kids will learn to read by memorizing what words look like—are not crazy, and they are not stupid. People were aware that the stakes were high as they made these decisions, and I’m more than ready to give them the benefit of the doubt that they made them thoughtfully. Nevertheless, people were persuaded that an educational intervention was scientifically backed when they should not have been so persuaded. What goes into a decision to believe or to disbelieve? In Chapter One, I’ll summarize some fifty years of research on this topic, and we’ll see that the mind actually comes pre-equipped with shortcuts. Certain features of messages are treated as signals of truth. For example, long messages are deemed more believable than short ones. You may not be aware of the message features your mind focuses on, but salespeople, politicians, and savvy social manipulators are. It’s time you knew about them too, and I’ll describe them in Chapter One.
Second, we need to understand how laypeople—not scientists—think about scientific evidence. Surveys show that scientists are trusted more than people in almost any other profession, and people believe that scientific research is the most reliable type of evidence. Why? Why this implicit trust in science? This story starts in sixteenth-century Europe, a time and place in which observing the world—the cornerstone of science—was considered the least persuasive type of evidence. The most persuasive was authority. If the Bible (or ancient thinkers, especially Aristotle) said it, it had to be right. The next hundred years saw a complete reversal of this attitude, with observation—especially controlled observation as found in experiments—held in the highest esteem. The change in the weighing of evidence was due primarily to the wild success of the method in explaining the world and improving the human condition. Science came to influence—and, usually, to improve—virtually every aspect of human affairs. This means that the simple veneer of scientific evidence is an important persuader. This type of evidence is so powerful that in other fields (for example, medicine and engineering) we have strong institutions that monitor and control its use. It’s illegal to say that a medicine has been scientifically tested if it has not. Education has no such constraints. Anyone can say that an educational nostrum is “research based,” and that’s why salespeople repeat the phrase like parrots. And that’s why the research shortcut is needed. In Chapter Two, we’ll look at how the situation reached this point.
Third, if we’re to have a research shortcut, we need to understand the path that we’re cutting short. The shortcut is meant to provide easy passage to the goal we would reach if we took the long route. The goal is “good science.” What does that look like? Interestingly enough, good science turns out to be as difficult to describe as pornography, but we cannot, as Supreme Court Justice Potter Stewart famously did, satisfy ourselves by saying, “I know it when I see it.” In Chapter Three, I’ll describe seven principles that most experts would agree are scientific essentials.
Chapter Three will describe what good science looks like, but not how to use it. That’s the subject of Chapter Four, which describes the distinction between basic sciences (for example, chemistry, biology, psychology) and disciplines like education that use findings from the basic sciences. For example, if psychologists learn a new fact about how children think, is that new fact ready to be used in the classroom? I’ll outline two ways that scientific findings can be used in education. The first method is rather painstaking and expensive, but yields quite reliable knowledge. The second is inexpensive and is still painstaking when done well, but is also easy to do sloppily. It produces knowledge of educational practice that ought to be considered tentative. As I’ll show, the difficult, expensive method is a rarity in American education. The cheap, sloppy method is commonplace. Part of the shortcut is to recognize the difference.
In Part One, I argue that people are persuaded by poor arguments (Chapter One) and especially by arguments that appear to be scientific (Chapter Two). Unfortunately, people are unable to distinguish between good and bad science (Chapter Three), and they are usually unclear on how scientific findings can be brought to bear on education problems (Chapter Four). Part Two offers the shortcut, composed of four steps to be applied to the candidate educational program: strip it and flip it; trace it; analyze it; and, in the fourth step, decide whether to adopt it.
“Strip it” means to lay the claim bare, devoid of the emotional language and other ornamentation that people use to cloak the actual scientific claim. Examining the claim in its simplest form can make many problems plain to you: the claim is true but self-evident, or the promised outcome is vague, or no one specifies the connection between what you’re supposed to do and what is supposed to improve. “Flip it” addresses the way that promised outcomes are sensitive to the description provided; for example, saying that ham is “90 percent fat free!” sounds quite different from saying it is “10 percent fat!” We’ll examine different ways that people try to make education products sound good, and how you can see past those claims.
“Trace it” is applied not to the educational program but to its inventor. Most of us use this step already and, in fact, overuse it. It means to pay attention to the qualifications and motivations of the person trying to persuade us. We are most convinced by people who are knowledgeable and impartial. Unfortunately, it’s hard to judge whether or not someone is knowledgeable about a subject unless we ourselves have some expertise. We tend, therefore, to rely on credentials. We believe doctors when they speak about medicine, and electricians when they talk about our fuse box. Naturally, credentials can be faked, but I’ll argue that even when they are genuine, credentials are not a reliable guide to believability in education. In fact, this most commonly used earmark of credibility is the least useful.
“Analyze it,” the third step of the shortcut, means to consider why you are being asked to believe something. We’ll take up two topics: how to use (and not to use) your own experience, and simple methods of evaluating research. I will argue that your experience does count: if the claims about an education product fly in the face of what you know to be true, there is a problem. At the same time, your experience is not an infallible guide. If it were, there would be no need for scientific research. So once we agree that your prior beliefs matter but are not definitive, we need to sort out the circumstances under which they are trustworthy and when they are likely to lead you astray. “Analyze it” also means to apply some simple guidelines to evaluate research claims. The point of the shortcut is to save you from having to evaluate research, so we’re not going to get too technical here. But there are some useful rules of thumb to apply.
After evaluating an idea’s scientific merit, you need to decide whether or not it should be adopted. Although I’m advocating for a shortcut here, I’m not advocating that a decision be rash. Nor am I saying that one should never adopt an educational program that lacks scientific support; as we’ll see, most lack such support. What I’m arguing for is adopting a program only when you have all of the relevant information before you. And that’s the final step—gathering what you know in one place at one time so that you can put it together.
• • •
This book will be especially useful in two situations. In the first, someone else has made a decision, and you’re affected by it. A superintendent has decided to adopt a reading program, and you’re a teacher who must implement it, or you are the parent of a child in the district. Or perhaps you’re a principal in the district who has just been told to bring the news to parents at the next PTA meeting. In each case, someone else has decided that an educational program is a good idea. This book can guide questions that you might pose to the decision maker. Decision makers often respond to any question by saying, “All the research supports it.” Or “This program was designed by Professor So-and-So at [fill in the name of prestigious university here].” This book will show you why these responses are inadequate, and will offer better questions to ask.
In the second situation, you yourself are the decision maker. You’re a parent looking for supplemental educational services for your child who is struggling in math. You’re a teacher who has been asked to recommend a software product for interactive whiteboards, to be used schoolwide. You’re a principal considering whether it’s worth using half of a teacher workday for a program on bullying recommended by a principal at another school. You’re a school board member wondering whether it’s worth sending all the principals in your district to a national conference. In each case, there is a product for sale, and you’re wondering about its educational value. This book will help you know which questions to ask, and what a good answer looks like.
This book will not turn you into a research expert. Indeed, the point of the book is to obviate the need for expertise. And the method I offer is imperfect, like all heuristics. You might apply these methods and still draw the wrong conclusion.
But I can promise this. Whatever your current level of research sophistication, this book will help you ask better questions about the research base behind a product, and it will help you think through the wisdom of purchasing and using a product in your classroom, school district, or home.
a. I was in graduate school. A professor laid out the evidence for the Golden Ratio with a straight face, and I was not merely interested: I was agape. I was sure that God himself had placed this number in nature as some sort of code for us to decipher. When the professor pointed out all the flaws in the Golden Ratio argument, I felt cheated.
b. If you believe any of these things, please don’t be insulted by my cavalier dismissal. I’m not here to tell you what to believe. But I will state flatly that there is no scientific evidence supporting these beliefs.
c. At least one historian (D. Ravitch, 2000, Left back [New York: Simon & Schuster]) argues that the whole-language method caught on because phonics instruction was taken up with too much zest; reading instruction had become boring for kids through the overuse of worksheets and the like. My emphasis on the importance of phonics instruction doesn’t mean that there is no value in any aspect of the whole-word or whole-language approaches. Advocates certainly got right the importance of the child’s interest. But teaching phonics is not negotiable.
d. Medications developed for other problems (for example, selective serotonin reuptake inhibitors) are sometimes prescribed for people with ASD. Some medications help with symptoms for some people, but there is not a medication that provides a full-blown cure.
Notes
1. Polya, G. (1973). How to solve it: A new aspect of mathematical method (2nd ed.). Princeton, NJ: Princeton University Press, p. 113. (Original work published 1945)
2. For example, Boselie, F. (1984). The aesthetic attractivity of the golden section. Psychological Research, 45, 367–375; Boselie, F. (1997). The golden section and the shape of objects. Empirical Studies of the Arts, 15, 131–141.
3. Macrosson, W.D.K., & Strachan, G. C. (1997). The preference amongst product designers for the golden section in line partitioning. Empirical Studies of the Arts, 15, 153–163; Macrosson, W.D.K., & Stewart, P. E. (1997). The inclination of artists to partition line sections in the Golden Ratio. Perceptual and Motor Skills, 84, 707–713.
4. Olariu, A. (1999). Golden section and the art of painting. Available online at http://arxiv.org/PS_cache/physics/pdf/9908/9908036v1.pdf.
5. Clement Falbo had the simple idea of measuring a bunch of seashells. They do indeed form logarithmic spirals, but the ratios he observed of real seashells were not close to 1.6; they were all in a range of 1.24–1.43. Falbo, C. (2005). The Golden Ratio—a contrary viewpoint. College Mathematics Journal, 36, 123–134. Available online at www.sonoma.edu/math/faculty/falbo/cmj123-134.
6. For an overview of problems, see Markowsky, G. (1992). Misconceptions about the Golden Ratio. College Mathematics Journal, 23, 2–19. Available online at http://laptops.maine.edu/GoldenRatio.pdf.
7. Pashler, H., McDaniel, M., Rohrer, D., & Bjork, R. (2008). Learning styles: Concepts and evidence. Psychological Science in the Public Interest, 9, 106–119; Riener, C., & Willingham, D. T. (2010). The myth of learning styles. Change, 42, 32–35.
8. In fact, it had been proposed much earlier, but did not catch on until the 1920s. Mathews, M. M. (1966). Teaching to read, historically considered. Chicago: University of Chicago Press.
9. Notable were the “Dick and Jane” book series by William Gray (longtime dean of the University of Chicago Graduate School of Education) and Zerna Sharp, published by Scott Foresman from the 1930s through the 1970s. They were often parodied for their repetitiveness, with page after page of text like “Oh see! Oh see Jane! Jane can run! Run, Jane, run!”
10. Balmuth, M. (1982). The roots of phonics: A historical introduction. New York: McGraw-Hill.
11. Flesch, R. (1955). Why Johnny can’t read. New York: Harper.
12. For example, Bienvenu, H. J., & Martyn, K. A. (1956). Why can’t Rudy read? National Education Association Journal, 44, 168–175; Betts, E. A. (1955). Teaching Johnny to read. Saturday Review, 38(31), 26–27; and Harris, A. J. (1956). Review of Why Johnny Can’t Read, Teachers College Record, 57, 263. Flesch specifically singled out linguists and psychologists as worthy researchers of reading; education researchers were, he said, the problem. The review in the journal of the Linguistic Society of America was mostly favorable: Hall, R. A., Jr. (1956). Review of Why Johnny Can’t Read. Language, 32, 310–313; but the review in American Psychologist less so: Carroll, J. B. (1956). The case of Dr. Flesch. American Psychologist, 11, 158–163.
13. Chall, J. S. (1967). Learning to read: The great debate. New York: McGraw-Hill.
14. It’s probably more accurate to say it resurfaced with prominence in the 1980s. It never really disappeared. Prominent publications included Goodman, K. (1986). What’s whole in whole language. Portsmouth, NH: Heinemann Educational Books; Smith, F. (1985). Reading without nonsense. New York: Teachers College Press.
15. National Institute of Child Health and Human Development. (2000). Report of the National Reading Panel: Teaching children to read: An evidence-based assessment of the scientific literature on reading and its implications for reading instruction. NIH publication no. 00-4754. Washington, DC: Government Printing Office.
16. Boulet, S. L., Boyle, C. A., & Schieve, L. A. (2009). Health care use and health and functional impact of developmental disabilities among US children, 1997–2005. Archives of Pediatrics & Adolescent Medicine, 163, 19–26.
17. Bishop, D.V.M., Whitehouse, A.J.O., Watt, H. J., & Line, E. A. (2008). Autism and diagnostic substitution: Evidence from a study of adults with a history of developmental language disorder. Developmental Medicine & Child Neurology, 50, 341–345.
18. Centers for Disease Control. (2006). Prevalence of autism spectrum disorders—Autism and Developmental Disabilities Monitoring Network, United States, 2006. Available online at http://www.cdc.gov/mmwr/preview/mmwrhtml/ss5810a1.htm.
19. Shute, N. (2010, October). Desperate for an autism cure. Scientific American, pp. 80–85.
20. Vargas, D. L., Nascimbene, C., Krishnan, C., Zimmerman, A. W., & Pardo, C. A. (2005). Neuroglial activation and neuroinflammation in the brain of patients with autism. Annals of Neurology, 57, 67–81.
21. Neuroimmunopathology Laboratory. (n.d.). FAQs: The meaning of neuroinflammatory findings in autism. Available online at http://www.neuro.jhmi.edu/neuroimmunopath/autism_faqs.htm.
22. Search conducted October 14, 2010.
23. As of November 2011, the National Institutes of Health does not recommend the use of secretin to treat ASD. National Institute of Child Health and Human Development. (2011, November). Autism spectrum disorders (ASDs). Available online at http://www.nichd.nih.gov/health/topics/asd.cfm.
24. National Center for Education Statistics, U.S. Department of Education. (2010, November). Table 90: Number of public school districts and public and private elementary and secondary schools: Selected years, 1869–70 through 2008–09. Digest of Education Statistics. Available online at http://nces.ed.gov/programs/digest/d10/tables/dt10_090.asp.