NOTES

PREFACE: A SOCIOLOGIST’S APOLOGY

  1. For John Gribbin’s review of Becker (1998), see Gribbin (1998).

  2. See Watts (1999) for a description of small-world networks.

  3. See, for example, a recent story on the complexity of modern finance, war, and policy (Segal 2010).

  4. For a report on Senator Kay Bailey Hutchison’s proposal, see Mervis (2006). For a report on Senator Coburn’s remarks, see Glenn (2009).

  5. See Lazarsfeld (1949).

  6. For an example of the “it’s not rocket science” mentality, see Frist et al. (2010).

  7. See Svenson (1981) for the result about drivers. See Hoorens (1993), Klar and Giladi (1999), Dunning et al. (1989), and Zuckerman and Jost (2001) for other examples of illusory superiority bias. See Alicke and Govorun (2005) for the leadership result.

CHAPTER 1: THE MYTH OF COMMON SENSE

  1. See Milgram’s Obedience to Authority for details (Milgram 1969). An engaging account of Milgram’s life and research is given in Blass (2009).

  2. Milgram’s reaction was described in a 1974 interview in Psychology Today, and is reprinted in Blass (2009). The original report on the subway experiment is Milgram and Sabini (1983) and has been reprinted in Milgram (1992). Three decades later, two New York Times reporters set out to repeat Milgram’s experiment. They reported almost exactly the same experience: bafflement, even anger, from riders; and extreme discomfort themselves (Luo 2004, Ramirez and Medina 2004).

  3. Although the nature and limitations of common sense are discussed in introductory sociology textbooks (according to Mathisen [1989], roughly half of the sociology texts he surveyed contained references to common sense), the topic is rarely discussed in sociology journals. See, however, Taylor (1947), Stouffer (1947), Lazarsfeld (1949), Black (1979), Boudon (1988a), Mathisen (1989), Bengston and Hazzard (1990), Dobbin (1994), and Klein (2006) for a variety of perspectives by sociologists. Economists have been even less concerned with common sense than sociologists, but see Andreozzi (2004) for some interesting remarks on social versus physical intuition.

  4. See Geertz (1975, p.6).

  5. Taylor (1947, p. 1).

  6. Philosophers in particular have wondered about the place of common sense in understanding the world, with the tide of philosophical opinion going back and forth on the matter of how much respect common sense ought to be given. In brief, the argument seems to have been about the fundamental reliability of experience itself; that is, when is it acceptable to take something—an object, an experience, or an observation—for granted, and when must one question the evidence of one’s own senses? On one extreme were the radical skeptics, who posited that because all experience was, in effect, filtered through the mind, nothing at all could be taken for granted as representing some kind of objective reality. At the other extreme were philosophers like Thomas Reid, of the Scottish Realist School, who were of the opinion that any philosophy of nature ought to take the world “as it is.” Something of a compromise position was outlined in America at the end of the nineteenth century by the pragmatist school of philosophy, most prominently William James and Charles Sanders Peirce, who emphasized the need to reconcile abstract knowledge of a scientific kind with that of ordinary experience, but who also held that much of what passes for common sense was to be regarded with suspicion (James 1909, p. 193). See Rescher (2005) and Mathisen (1989) for discussions of the history of common sense in philosophy.

  7. It should be noted that commonsense reasoning also seems to have backup systems that act like general principles. Thus when some commonsense rule for dealing with some particular situation fails, on account of some previously unencountered contingency, we are not completely lost, but rather simply refer to this more general covering rule for guidance. It should also be noted, however, that attempts to formalize this backup system, most notably in artificial intelligence research, have so far been unsuccessful (Dennett 1984); thus, however it works, it does not resemble the logical structure of science and mathematics.

  8. See Minsky (2006) for a discussion of common sense and artificial intelligence.

  9. For a description of the cross-cultural Ultimatum game study, see Henrich et al. (2001). For a review of Ultimatum game results in industrial countries, see Camerer, Loewenstein, and Rabin (2003).

10. See Collins (2007). Another consequence of the culturally embedded nature of commonsense knowledge is that what it treats as “facts”—self-evident, unadorned descriptions of an objective reality—often turn out to be value judgments that depend on other seemingly unrelated features of the socio-cultural landscape. Consider, for example, the claim that “police are more likely to respond to serious than non-serious crimes.” Empirical research on the matter has found that indeed they do—just as common sense would suggest—yet as the sociologist Donald Black has argued, it is also the case that victims of crimes are more likely to classify them as “serious” when the police respond to them. Viewed this way, the seriousness of a crime is determined not only by its intrinsic nature—robbery, burglary, assault, etc.—but also by the circumstances of the people who are the most likely to be attended to by the police. And as Black noted, these people tend to be highly educated professionals living in wealthy neighborhoods. Thus what seems to be a plain description of reality—serious crime attracts police attention—is, in fact, really a value judgment about what counts as serious; and this in turn depends on other features of the world, like social and economic inequality, that would seem to have nothing to do with the “fact” in question. See Black (1979) for a discussion of the conflation of facts and values. Becker (1998, pp. 133–34) makes a similar point in slightly different language, noting that “factual” statements about individual attributes—height, intelligence, etc.—are invariably relational judgments that in turn depend on social structure (e.g., someone who is “tall” in one context may be short in another; someone who is poor at drawing is not considered “mentally retarded” whereas someone who is poor at math or reading may be). Finally, Berger and Luckmann (1966) advance a more general theory of how subjective, possibly arbitrary routines, practices, and beliefs become reified as “facts” via a process of social construction.

11. See Geertz (1975).

12. See Wadler (2010) for the story about the “no lock people.”

13. For the Geertz quote, see Geertz (1975, p. 22). For a discussion of how people respond to their differences of opinions, and an intriguing theoretical explanation of their failure to converge on a consensus view, see Sethi and Yildiz (2009).

14. See Gelman, Lax, and Phillips (2010) for survey results documenting Americans’ evolving attitudes toward same-sex marriage.

15. It should be noted that political professionals, like politicians, pundits, and party officials, do tend to hold consistently liberal or conservative positions. Thus, Congress, for example, is much more polarized along a liberal-conservative divide than the general population (Layman et al. 2006). See Baldassarri and Gelman (2008) for a detailed discussion of how political beliefs of individuals do and don’t correlate with each other. See also Gelman et al. (2008) for a more general discussion of common misunderstandings about political beliefs and voting behavior.

16. Le Corbusier (1923, p. 61).

17. See Scott (1998).

18. For a detailed argument about the failures of planning in economic development, particularly with respect to Africa, see Easterly (2006). For an even more negative viewpoint of the effect of foreign aid in Africa, see Moyo (2009), who argues that it has actually hurt Africa, not helped. For a more hopeful alternative viewpoint see Sachs (2006).

19. See Jacobs (1961, p. 4).

20. See Venkatesh (2002).

21. See Ravitch (2010) for a discussion of how popular, commonsense policies such as increased testing and school choice actually undermined public education. See Cohn (2007) and Reid (2009) for analysis of the cost of health care and possible alternative models. See O’Toole (2007) for a detailed discussion on forestry management, urban planning, and other failures of government planning and regulation. See Howard (1997) for a discussion and numerous anecdotes of the unintended consequences of government regulations. See Easterly (2006) again for some interesting remarks on nation-building and political interference, and Tuchman (1985) for a scathing and detailed account of US involvement in Vietnam. See Gelb (2009) for an alternate view of American foreign policy.

22. See Barbera (2009) and Cassidy (2009) for discussion of the cost of financial crises. See Mintzberg (2000) and Raynor (2007) for overviews of strategic planning methods and failures. See Knee, Greenwald, and Seave (2009) for a discussion of the fallibility of media moguls; and McDonald and Robinson (2009), and Sorkin (2009) for inside accounts of investment banking leaders whose actions precipitated the recent financial crisis. See also recent news stories recounting the failed AOL–Time Warner merger (Arango 2010), and the rampant, ultimately doomed growth of Citigroup (Brooker 2010).

23. Clearly not all attempts at corporate or even government planning end badly. Looking back over the past few centuries, in fact, overall conditions of living have improved dramatically for a large fraction of the world’s populations—evidence that even the largest and most unwieldy political institutions do sometimes get things right. How are we to know, then, that common sense isn’t actually quite good at solving complex social problems, failing no more frequently than any other method we might use? Ultimately we cannot know the answer to this question, if only because no systematic attempt to collect data on relative rates of planning successes and failures has ever been made—at least, not to my knowledge. Even if such an attempt had been made, moreover, it would still not resolve the matter, because absent some other “uncommon sense” method against which to compare it, the success rate of commonsense-based planning would be meaningless. A more precise way to state my criticism of commonsense reasoning, therefore, is not that it is universally “good” or “bad,” but rather that there are sufficiently many examples where commonsense reasoning has led to important planning failures that it is worth contemplating how we might do better.

24. For details of financial crises throughout the ages, see Mackay (1932), Kindleberger (1978), and Reinhart and Rogoff (2009).

25. There are, of course, several overlapping traditions in philosophy that already take a suspicious view of what I am calling common sense as their starting point. One way to understand the entire project of what Rawls called political liberalism (Rawls 1993), along with the closely related idea of deliberative democracy (Bohman 1998; Bohman and Rehg 1997), is, in fact, as an attempt to prescribe a political system that can offer procedural justice to all its members without presupposing that any particular point of view—whether religious, moral, or otherwise—is correct. The whole principle of deliberation, in other words, presupposes that common sense is not to be trusted, thereby shifting the objective from determining what is “right” to designing political institutions that don’t privilege any one view of what is right over any other. Although this tradition is entirely consistent with the critiques of common sense that I raise in this book, my emphasis is somewhat different. Whereas deliberation simply assumes incompatibility of commonsense beliefs and looks to build political institutions that work anyway, I am more concerned with the particular types of errors that arise in commonsense reasoning. Nevertheless, I touch on aspects of this work in chapter 9 when I discuss matters of fairness and justice. A second strand of philosophy that starts with suspicion of common sense is the pragmatism of James and Dewey (see, for example, James 1909, p. 193). Pragmatists see errors embedded in common sense as an important obstruction to effective action in the world, and therefore take willingness to question and revise common sense as a condition for effective problem solving. This kind of pragmatism has in turn influenced efforts to build institutions, some of which I have described in chapter 8, that systematically question and revise their own routines and thus can adapt quickly to changes that cannot be predicted. 
This tradition, therefore, is also consistent with the critiques of common sense developed here, but as with the deliberation tradition, it can be advanced without explicitly articulating the particular cognitive biases that I identify. Nevertheless, I would contend that a discussion of the biases inherent to commonsense reasoning is a useful complement to both the deliberative and pragmatist agendas, providing in effect an alternative argument for the necessity of institutions and procedures that do not depend on commonsense reasoning in order to function.

CHAPTER 2: THINKING ABOUT THINKING

  1. For the original study of organ donor rates, see Johnson and Goldstein (2003). It should be noted that the rates of indicated consent were not the same as the eventual organ-donation rate, which often depends on other factors like family members’ approval. The difference in final donation rates was actually much smaller—more like 16 percent—but still dramatic.

  2. See Duesenberry (1960) for the original quotation, which is repeated approvingly by Becker himself (Becker and Murphy 2000, p. 22).

  3. For more details on the interplay between cooperation and punishment, see Fehr and Fischbacher (2003), Fehr and Gachter (2000 and 2002), Bowles et al. (2003), and Gurerk et al. (2006).

  4. Within sociology, the debate over rational choice theory has played out over the past twenty years, beginning with an early volume (Coleman and Fararo 1992) in which perspectives from both sides of the debate are represented, and continued in journals like the American Journal of Sociology (Kiser and Hechter 1998; Somers 1998; Boudon 1998) and Sociological Methods and Research (Quadagno and Knapp 1992). Over the same period, a similar debate has also played out in political science, sparked by the publication of Green and Shapiro’s (1994) polemic, Pathologies of Rational Choice Theory. See Friedman (1996) for the responses of a number of rational choice advocates to Green and Shapiro’s critique, along with Green and Shapiro’s responses to the responses. Other interesting commentaries are by Elster (1993, 2009), Goldthorpe (1998), McFadden (1999), and Whitford (2002).

  5. For accounts of the power of rational choice theory to explain behavior, see Harsanyi (1969), Becker (1976), Buchanan (1989), Farmer (1992), Coleman (1993), Kiser and Hechter (1998), and Cox (1999).

  6. See Freakonomics for details (Levitt and Dubner 2005). For other similar examples see Landsburg (1993 and 2007), Harford (2006), and Frank (2007).

  7. Max Weber, one of the founding fathers of sociology, effectively defined rational behavior as behavior that is understandable, while James Coleman, one of the intellectual fathers of rational choice theory, wrote that “The very concept of rational action is a conception of action that is ‘understandable,’ action that we need ask no more questions about” (Coleman 1986, p. 1). Finally, Goldthorpe (1998, pp. 184–85) makes the interesting point that it is not even clear how we should talk about irrational, or nonrational, behavior unless we first have a conception of what it means to behave rationally; thus even if it does not explain all behavior, rational action should be accorded what he calls “privilege” over other theories of action.

  8. See Berman (2009) for an economic analysis of terrorism. See Leonhardt (2009) for a discussion of incentives in the medical profession.

  9. See Goldstein et al. (2008) and Thaler and Sunstein (2008) for more discussion and examples of defaults.

10. For details of the major results of the psychology literature, see Gilovich, Griffin, and Kahneman (2002) and Gigerenzer et al. (1999). For the more recently established field of behavioral economics, see Camerer, Loewenstein, and Rabin (2003). In addition to these academic contributions, a number of popular books have been published recently that cover much of the same ground. See, for example, Gilbert (2006), Ariely (2008), Marcus (2008), and Gigerenzer (2007).

11. See North et al. (1997) for details on the wine study, Berger and Fitzsimons (2008) for the study on Gatorade, and Mandel and Johnson (2002) for the online shopping study. See Bargh et al. (1996) for other examples of priming.

12. For more details and examples of anchoring and adjustment, see Chapman and Johnson (1994), Ariely et al. (2003), and Tversky and Kahneman (1974).

13. See Griffin et al. (2005) and Bettman et al. (1998) for examples of framing effects on consumer behavior. See Payne, Bettman, and Johnson (1992) for a discussion of what they call constructive preferences, including preference reversal.

14. See Tversky and Kahneman (1974) for a discussion of “availability bias.” See Gilbert (2006) for a discussion of what he calls “presentism.” See Bargh and Chartrand (1999) and Schwarz (2004) for more on the importance of “fluency.”

15. See Nickerson (1998) for a review of confirmation bias. See Bond et al. (2007) for an example of confirmation bias in evaluating consumer products. See Marcus (2008, pp. 53–57) for a discussion of motivated reasoning versus confirmation bias. Both biases are also closely related to the phenomenon of cognitive dissonance (Festinger 1957; Harmon-Jones and Mills 1999), according to which individuals actively seek to reconcile conflicting beliefs (“The car I just bought was more expensive than I can really afford” versus “The car I just bought is awesome”) by exposing themselves selectively to information that supports one view or discredits the other.

16. See Dennett (1984).

17. According to the philosopher Jerry Fodor (2006), the crux of the frame problem derives from the “local” nature of computation, which—at least as currently understood—takes some set of parameters and conditions as given, and then applies some sort of operation on these inputs that generates an output. In the case of rational choice theory, for example, the “parameters and conditions” might be captured by the utility function, and the “operation” would be some optimization procedure; but one could imagine other conditions and operations as well, including heuristics, habits, and other nonrational approaches to problem solving. The point is that no matter what kind of computation one tries to write down, one must start from some set of assumptions about what is relevant, and that decision is not one that can be resolved in the same (i.e., local) manner. If one tried to resolve it, for example, by starting with some independent set of assumptions about what is relevant to the computation itself, one would simply end up with a different version of the same problem (what is relevant to that computation?), just one step removed. Of course, one could keep iterating this process and hope that it terminates at some well-defined point. In fact, one can always do this trivially by exhaustively including every item and concept in the known universe in the basket of potentially relevant factors, thereby making what at first seems to be a global problem local by definition. Unfortunately, this approach succeeds only at the expense of rendering the computational procedure intractable.

18. For an introduction to machine learning, see Bishop (2006). See Thompson (2010) for a story about the Jeopardy-playing computer.

19. For a compelling discussion of the many ways in which our brains misrepresent both our memories of past events and our anticipated experience of future events, see Gilbert (2006). As Becker (1998, p. 14) has noted, even social scientists are prone to this error, filling in the motivations, perspectives, and intentions of their subjects whenever they have no direct evidence of them. For related work on memory, see Schacter (2001) and Marcus (2008). See Bernard et al. (1984) for many examples of errors in survey respondents’ recollections of their own past behavior and experience. See Ariely (2008) for additional examples of individuals overestimating their anticipated happiness or, alternatively, underestimating their anticipated unhappiness, regarding future events. For the results on online dating, see Norton, Frost, and Ariely (2007).

20. For discussions of performance-based pay, see Hall and Liebman (1997) and Murphy (1998).

21. Mechanical Turk is named for a nineteenth-century chess-playing automaton that was famous for having beaten Napoleon. The original Turk, of course, was a hoax—in reality there was a human inside making all the moves—and that’s exactly the point. The tasks that one typically finds on Mechanical Turk are there because they are relatively easy for humans to solve, but difficult for computers—a phenomenon that Amazon founder Jeff Bezos calls “artificial, artificial intelligence.” See Howe (2006) for an early report on Amazon’s Mechanical Turk, and Pontin (2007) for Bezos’s coinage of “artificial, artificial intelligence.” See http://behind-the-enemy-lines.blogspot.com for additional information on Mechanical Turk.

22. See Mason and Watts (2009) for details on the financial incentives experiment.

23. Overall, women in fact earn only about 75 percent as much as men, but much of this “pay gap” can be accounted for in terms of different choices that women make—for example, to work in lower-paying professions, or to take time off from work to raise a family, and so on. Accounting for all this variability, and comparing only men and women who work in comparable jobs under comparable conditions, roughly a 9 percent gap remains. See Bernard (2010) and http://www.iwpr.org/pdf/C350.pdf for more details.

24. See Prendergast (1999), Holmstrom and Milgrom (1991), and Baker (1992) for studies of “multitasking.” See Gneezy et al. (2009) for a study of the “choking” effect. See Herzberg (1987), Kohn (1993), and Pink (2009) for general critiques of financial rewards.

25. Levitt and Dubner (2005, p. 20).

26. For details on the unintended consequences of the No Child Left Behind Act, see Sadovnik et al. (2007). For a specific discussion of “educational triage” practices that raise pass rates without impacting overall educational quality, see Booher-Jennings (2005, 2006). See Meyer (2002) for a general discussion on the difficulty of measuring and rewarding performance.

27. See Rampell (2010) for the story about politicians.

28. This argument has been made most forcefully by Donald Green and Ian Shapiro, who argue that when “everything from conscious calculation to ‘cultural inertia’ may be squared with some variant of rational choice theory … our disagreement becomes merely semantic, and rational choice theory is nothing but an ever-expanding tent in which to house every plausible proposition advanced by anthropology, sociology, or social psychology” (Green and Shapiro 2005, p. 76).

CHAPTER 3: THE WISDOM (AND MADNESS) OF CROWDS

  1. See Riding (2005) for the statistic about visitors. See http://en.wikipedia.org/wiki/Mona_Lisa for other entertaining details about the Mona Lisa.

  2. See Clark (1973, p. 150).

  3. See Sassoon (2001).

  4. See Tucker (1999) for the full article on Harry Potter. See Nielsen (2009) for details of the Facebook analysis. See Barnes (2009) for the story on movies.

  5. For the story about changes in consumer behavior postrecession, see Goodman (2009). Bruce Mayhew (1980) and Frank Dobbin (1994) have both made a similar argument about circular reasoning.

  6. This argument was made long ago by the physicist Philip Anderson in a famous paper titled “More Is Different” (Anderson 1972).

  7. For Thatcher’s original quote, see Keay (1987).

  8. The definition of “methodological individualism” is typically traced to the early twentieth century in the writings of the Austrian economist Joseph Schumpeter (1909, p. 231); however, the idea goes back much earlier, at least to the writings of Hobbes, and was popular among the thinkers of the Enlightenment, for whom an individualistic view of action fit perfectly with their emerging theories of rational action. See Lukes (1968) and Hodgson (2007) for a discussion of the intellectual origins of methodological individualism, as well as a scathing critique of its logical foundations.

  9. I am oversimplifying here, but not a lot. Although the original models of business cycles did assume a single representative agent, more recent models allow for multiple agents, each of which represents a different sector of the economy (Plosser 1989). Nevertheless, the same essential problem arises in all these models: the agents are not actually real people, or even firms, who pay attention to what other people and firms are doing, but rather are representative agents who make decisions on behalf of a whole population.

10. A number of excellent critiques of the representative individual idea have been written, most notably by the economist Alan Kirman (1992). That the criticism is so well known, however, and yet has had so little influence on the actual practice of social science, should demonstrate how difficult a problem it is to expunge.

11. Even rational choice theorists—who are as much as anyone the inheritors of methodological individualism—are in practice just as comfortable applying the principle of utility maximization to social actors like households, firms, unions, “elites,” and government bureaus as to individual people. See Becker (1976), Coleman and Fararo (1992), Kiser and Hechter (1998), and Cox (1999) for numerous examples of representative agents employed in rational choice models.

12. See Granovetter (1978) for details of the “riot model.”

13. For more details on the origins of social influence, see Cialdini (2001) and Cialdini and Goldstein (2004).

14. For examples of cumulative advantage models, see Ijiri and Simon (1975), Adler (1985), Arthur (1989), De Vany and Walls (1996), and De Vany (2004).

15. For the “army in a lab” quote, see Zelditch (1969). Experiments, it should be noted, are not entirely foreign to sociology. For example, the field of “network exchange” is one area of sociology in which it is common to run lab experiments, but these networks generally comprise only four or five individuals (Cook et al. 1983; Cook et al. 1993). Cooperation studies in behavioral economics, political science, and sociology also use experiments, but once again the groups involved are small (Fehr and Fischbacher 2003).

16. See Salganik, Dodds, and Watts (2006) for a detailed description of the original Music Lab experiment.

17. See Salganik and Watts (2009a, 2009b) for more background on Music Lab, and details of follow-up experiments.

CHAPTER 4: SPECIAL PEOPLE

  1. The movie The Social Network, about the founding of Facebook, was released in 2010. The Foster’s beer commercial is available at http://www.youtube.com/watch?v=nPgSa9djYU8.

  2. For a history of social network analysis, see Freeman (2004). For summaries of the more recent literature on network science, see Newman (2003), Watts (2004), Jackson (2008), and Kleinberg and Easley (2010). For more popular accounts, see Watts (2003) and Christakis and Fowler (2009).

  3. See Leskovec and Horvitz (2008) for details of the Microsoft instant messenger network study.

  4. See Jacobs (1961, pp. 134–35).

  5. Milgram did not invent the phrase “six degrees of separation,” referring only to the “small world problem.” Instead, it was the playwright John Guare who wrote a play with that title in 1990. Oddly, Guare has credited the origin of the phrase to Guglielmo Marconi, the Italian inventor and developer of radiotelegraphy, who reportedly said that in a world connected by the telegraph, everyone would be connected to everyone else via only six degrees of separation. According to numerous citations on the web (see, e.g., http://www.megastarmedia.us/mediawiki/index.php/Six_degrees_of_separation), Marconi is supposed to have made this claim during his Nobel Prize lecture in 1909. Unfortunately, the speech itself (http://nobelprize.org/nobel_prizes/physics/laureates/1909/marconi-lecture.html) makes no mention of the concept; nor have I been able to locate the source of Marconi’s quote anywhere else. Regardless of the ultimate origin of the phrase, however, Milgram deserves the credit for having been the first to put some evidence behind it.

  6. As a number of critics have noted, Milgram’s results were less conclusive than they have sometimes been portrayed (Kleinfeld 2002). In particular, of the three hundred chains that started out to reach the target, a third began in Boston itself, and another third began with individuals in Omaha who were investors in the stock market—which at the time would have required them to have access to a stockbroker. Seeing as the sole target of the experiment was a Boston stockbroker, it is not so surprising that these chains could reach him. Thus the most compelling evidence for the small-world hypothesis came from the ninety-six chains that began with randomly selected people in Omaha, and only seventeen of these chains actually made it. Given these uncertainties, one has to be careful not to place too much weight on the role of people like Mr. Jacobs, who could easily have been a statistical fluke. Indeed, Milgram himself noted as much, claiming only that “the convergence of communication chains through common individuals is an important feature of small world nets, and it should be accounted for theoretically.”

  7. See Gladwell (1999).

  8. Naturally, how many friends you count people as having depends a lot on how you define “friendship,” a concept that has always been ambiguous, and is even more so now in the era of social networking sites, where you can “friend” someone you don’t even know. The result is that what we might call “true” friendship has become difficult to distinguish from mere “acquaintanceship,” which in turn has gotten blurred together with the even more ephemeral notion of “one-way acquaintanceship” (i.e., “I’ve heard of you, but you don’t know me from Adam”). Although some people on MySpace have a million “friends,” as soon as we apply even the loosest definition of friendship, such as each person knowing the other on a first-name basis, the number immediately drops to the range of a few hundred to a few thousand. Interestingly, this range has remained surprisingly constant since the first studies were conducted in the late 1980s (McCormick et al. 2008; Bernard et al. 1989, 1991; Zheng et al. 2006).

  9. There are a number of subtleties to the issue of chain lengths in small-world experiments that have led to a certain amount of confusion regarding what can and cannot be concluded from the evidence. For details about the experiment itself, see Dodds, Muhamad, and Watts (2003), and for a clarifying discussion of the evidence, as well as a detailed analysis of chain lengths, see Goel, Muhamad, and Watts (2009).

10. See Watts and Strogatz (1998); Kleinberg (2000a; 2000b); Watts, Dodds, and Newman (2002); Watts (2003, ch. 5); Dodds, Muhamad, and Watts (2003); and Adamic and Adar (2005) for details on the searchability of social networks.

11. Influencers go by many names. Often they are called opinion leaders or influentials but they are also called e-fluentials, mavens, hubs, connectors, alpha mums, or even passionistas. Not all of these labels are intended to mean exactly the same thing, but they all refer to the same basic idea that a small number of special individuals have an important effect on the opinions, beliefs, and consumption habits of a large number of “ordinary” individuals (see Katz and Lazarsfeld 1955, Merton 1968b, Weimann 1994, Keller and Berry 2003, Rand 2004, Burson-Marsteller 2001, Rosen 2000, and Gladwell 2000 for a range of influentials-related labels). Ed Keller and Michael Berry claim that “One in ten Americans tells the other nine how to vote, where to eat, and what to buy.” They conclude, in fact, that “Few important trends reach the mainstream without passing through the Influentials in the early stages, and the Influentials can stop a would-be trend in its tracks” (Keller and Berry 2003, pp. 21–22); and the market-research firm Burson-Marsteller concurs, claiming that “The far-reaching effect of this powerful group of men and women can make or break a brand, marshal or dissolve support for business and consumer issues, and provide insight into events as they unfold.” All one needs to do, it seems, is to find these individuals and influence them. As a result, “Influencers have become the ‘holy grail’ for today’s marketers” (Rand 2004).

12. For the original quote, see Gladwell (2000, pp. 19–21).

13. See Keller and Berry (2003, p. 15).

14. See, for example, Christakis and Fowler (2009), Salganik et al. (2006), and Stephen (2009).

15. In fact, even then you can’t be sure. If A and B are friends, they are likely to have similar tastes, or watch similar shows on TV and so be exposed to similar information; thus what looks like influence may really just be homophily. So if, every time a friend of A’s adopts something that A has adopted, we attribute that to A’s influence, we are probably overestimating how influential A is. See Aral (2009), Anagnostopoulos et al. (2008), Bakshy et al. (2009), Cohen-Cole and Fletcher (2008a, 2008b), Shalizi and Thomas (2010), and Lyons (2010) for more details on the issue of similarity versus influence.

16. See Katz and Lazarsfeld (1955) for a discussion of the difficulty of measuring influence, along with a more general introduction to personal influence and opinion leaders. See Weimann (1994) for a discussion of proxy measures of influence.

17. See Watts (2003) and Christakis and Fowler (2009) for discussions of contagion in social networks.

18. The connection between influentials and contagion is most explicit in Gladwell’s analogy of “social epidemics,” but a similar connection is implied throughout the literature on influentials. Everett Rogers (1995, p. 281) claims that “The behavior of opinion leaders is important in determining the rate of adoption of an innovation in a system. In fact, the S-shape of the diffusion curve occurs because once opinion leaders adopt and tell others about the innovation, the number of adopters per unit time takes off.” Keller and Berry make a similar point when they claim that influentials are “like the central processing units of the nation. Because they know many people and are in contact with many people in the course of a week, they have a powerful multiplier effect, spreading the word quickly across a broad network when they find something they want others to know about” (Keller and Berry 2003, p. 29).

19. For details of the models, see Watts and Dodds (2007).

20. The original Bass model is described by Bass (1969).

21. See Gladwell (2000, p. 19).

22. A number of people interpreted this result as a claim that “influentials don’t exist,” but that’s actually not what we said. To begin with, as I’ve discussed, there are so many different kinds of influentials that it would be impossible to rule them all out even if that was what we intended to do. But we didn’t intend to do that. In fact, the whole point of our models was to assume the existence of influentials and see how much they mattered relative to ordinary individuals. Another misconception regarding our paper was that we had claimed that “influentials don’t matter,” but that’s not what we said either. Rather, we found only that influentials are unlikely to play the role described by the law of the few. Whether or not influentials, defined somehow, can be reliably identified and exploited in some manner remains an open question.

23. See Adar and Adamic (2005); Sun, Rosenn, Marlow, and Lento (2009); Bakshy, Karrer, and Adamic (2009); and Aral et al. (2009) for details.

24. For details of the Twitter study, see Bakshy et al. (2010).

25. For the anecdote about Kim Kardashian’s $10,000 Tweets, see Sorkin (2009b).

CHAPTER 5: HISTORY, THE FICKLE TEACHER

  1. A number of sociologists have even argued explicitly that history ought to be a scientific discipline with its own laws and methods for extracting them (Kiser and Hechter 1998). Historians, meanwhile, have been more circumspect regarding the scientific status of their discipline but have nonetheless been tempted to draw analogies between their own practices and those of natural scientists (Gaddis 2002).

  2. See Scott (1998) for a discussion of what he calls metis (the Greek word for “skill”), meaning the collection of formal decision procedures, informal rules of thumb, and trained instinct that characterized the performance of experienced professionals.

  3. For more on creeping determinism and hindsight bias, see the classic article by Baruch Fischhoff (1982). Philosophers and psychologists disagree over how strong our psychological bias to think deterministically really is. As Roese and Olson (1996) point out, people frequently do engage in counterfactual thinking—imagining, for example, how things might have worked out “if only” some antecedent event had not taken place—suggesting that commonsense views of causality are more conditional than absolute. A more correct way to state the problem, therefore, is that we systematically overweight the likelihood of what happened relative to the counterfactual outcomes. For the purpose of my argument, however, it is sufficient that we do the latter.

  4. See Dawes (2002, Chapter 7) for the full story and analysis of Flight 2605.

  5. See Dawes (2002) and Harding et al. (2002) for more on school shootings.

  6. See Gladwell (2000, p. 33).

  7. See Tomlinson and Cockram (2003) for details on the SARS outbreaks in the Prince of Wales Hospital and the Amoy Gardens apartment complex. Various theoretical models (Small et al. 2004; Bassetti et al. 2005; Masuda et al. 2004) have subsequently been proposed to explain the SARS epidemic in terms of superspreaders.

  8. See Berlin (1997, p. 449).

  9. Gaddis (2002), in fact, makes more or less this argument.

10. For the full argument, see Danto (1965).

11. For the full story of Cisco, see Rosenzweig (2007).

12. See Gaddis (2002).

13. See Lombrozo (2007) for details of the study. It should be noted that when told in simple terms the relative probabilities of the different explanations, participants did in fact choose the more complex explanation at a much higher rate. Such explicit information, however, is rarely available in real-world scenarios.

14. See Tversky and Kahneman (1983) for details.

15. For evidence of confidence afforded by stories, see Lombrozo (2006, 2007) and Dawes (2002, p. 114). Dawes (1999), in fact, makes the stronger argument that human “cognitive capacity shuts down in the absence of a story.”

16. For example, a preference for simplicity in explanations is deeply embedded in the philosophy of science. The famous Ockham’s razor—named for the fourteenth-century English logician William of Ockham—posits that “plurality ought never be posited without necessity,” meaning essentially that a complex theory ought never to be adopted where a simpler one would suffice. Most working scientists regard Ockham’s razor with something close to reverence—Albert Einstein, for example, once claimed that a theory “ought to be as simple as possible, and no simpler”—and the history of science would seem to justify this reverence, filled as it is with examples of complex and unwieldy ideas being swept away by simpler, more elegant formulations. What is perhaps less appreciated about the history of science is that it is also filled with examples of initially simple and elegant formulations becoming increasingly complex and inelegant as they struggle to bear the burden of empirical evidence. Arguably, in fact, it is in the scientific method’s capacity to pursue explanatory power, even at the cost of theoretical elegance and parsimony, that its real strength lies.

17. For Berlin’s full analysis of the differences between science and history, and the impossibility of remaking the latter in the image of the former, see Berlin (1960).

18. See Gaddis (2002) for a warning about the perils of generalizing, and also some examples of doing just that.

19. George Santayana (1905).

CHAPTER 6: THE DREAM OF PREDICTION

  1. See Rosenbloom (2009).

  2. See Tetlock (2005) for details.

  3. See Schnaars (1989, pp. 9–33) for his analysis and lots of entertaining examples. See also Sherden (1998) for additional evidence of the lousy forecasting record of futurologists. See Kuran (1991) and Lohmann (1994) for discussions of the unpredictability of political revolutions, specifically the 1989 collapse of East Germany. And see Gabel (2009) for a retrospective look at the Congressional Budget Office’s Medicare cost predictions.

  4. See Parish (2006) for a litany of intended blockbusters that tanked at the U.S. box office (although some, like Waterworld, later became profitable through foreign box office revenues and video and DVD sales). See Seabrook (2000) and Carter (2006) for some entertaining stories about some disastrous miscalculations and near-misses inside the media industry. See Lawless (2005) for some interesting background on the publisher Bloomsbury’s decision to acquire Harry Potter (for £2,500). General information about production in cultural industries is given in Caves (2000) and Bielby and Bielby (1994).

  5. In early 2010, the market capitalization of Google was around $160B, but it has fluctuated as high as $220B. See Makridakis, Hogarth, and Gaba (2009a) and Taleb (2007) for lengthier descriptions of these and other missed predictions. See Lowenstein (2000) for the full story of Long-Term Capital Management.

  6. Newton’s quote is taken from Janiak (2004, p. 41).

  7. The Laplace quote is taken from http://en.wikipedia.org/wiki/Laplace%27s_demon.

  8. Lumping all processes into two coarse categories is a vast oversimplification of reality, as the “complexity” of a process is not a sufficiently well understood property to be assigned anything like a single number. It’s also a somewhat arbitrary classification, as there’s no clear definition of when a process is complex enough to be called complex. In an elegant essay, Warren Weaver, then vice president of the Rockefeller Foundation, differentiated between what he called disorganized and organized complexity (Weaver 1958), where the former corresponds to systems of very large numbers of independent entities, like molecules in a gas. Weaver’s point was that disorganized complexity can be handled with the same kinds of tools that apply to simple systems, albeit in a statistical rather than deterministic way. By organized complexity, however, he meant systems that are neither simple nor subject to the helpful averaging properties of disorganized systems. In my dichotomous classification scheme, in other words, I have effectively lumped together simple systems with disorganized systems. As different as they are, however, they are similar from the perspective of making predictions; thus the conflation does not affect my argument.

  9. See Orrell (2007) for a slightly different take on prediction in simple versus complex systems. See Gleick (1987), Watts (2003), and Mitchell (2009) for more general discussions of complex systems.

10. When I say we can predict only the probability of something happening, I am speaking somewhat loosely. The more correct way to talk about prediction for complex systems is that we ought to be able to predict properties of the distribution of outcomes, where this distribution characterizes the probability that a specified class of events will occur. So, for example, we might predict the probability that it will rain on a given day, or that the home team will win, or that a movie will generate more than a certain level of revenue. Equivalently, we might ask questions about the number of points by which we expect the home team to win, or the revenue we expect a particular class of movies to earn, or even the variance that we expect to observe around the average. Regardless, all these predictions are about “average properties” in the sense that they can be expressed as an expectation of some statistic over many draws from the distribution of outcomes.

11. For a die roll, it’s even worse: The best possible performance is to be right one time out of six, or less than 17 percent. In real life, therefore, where the range of possible outcomes can be much greater than a die roll—think, for example, of trying to predict the next bestseller—a track record of predicting the right outcome 20 percent of the time might very well be as good as possible. It’s just that being “right” 20 percent of the time also means being “wrong” 80 percent of the time; that just doesn’t sound very good.
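The die-roll baseline is easy to check by simulation (a hypothetical sketch, not drawn from any study cited here; for a fair die, always guessing the same face is as good as any other strategy):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
trials = 100_000

# Guess the same face every time; for a fair die, any fixed or random
# guessing strategy has the same expected accuracy of 1/6.
hits = sum(1 for _ in range(trials) if random.randint(1, 6) == 3)

print(f"accuracy: {hits / trials:.3f}")  # hovers near 1/6 ≈ 0.167
```

Being “right” about 17 percent of the time is thus the best achievable track record here, even though stated baldly it sounds like failure.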

12. See http://www.cimms.ou.edu/~doswell/probability/Probability.html. Orrell (2007) also presents an informative discussion of weather prediction; however, he is mostly concerned with longer-range forecasts, which are considerably less reliable.

13. Specifically, “frequentists” insist that statements about probabilities refer to the relative fraction of particular outcomes being realized, and therefore apply only to events, like flipping a coin, that can in principle be repeated ad infinitum. Conversely, the “evidential” view is that a probability should be interpreted only as the odds one ought to accept for a particular gamble, regardless of whether it is repeated or not.

14. See de Mesquita (2009) for details.

15. As Taleb explains, the term “black swan” derives from the European settlement of Australia: Until the settlers witnessed black swans in what is now Western Australia, conventional wisdom held that all swans must be white.

16. For details of the entire sequence of events surrounding the Bastille, see Sewell (1996, pp. 871–78). It is worth noting, moreover, that other historians of the French Revolution draw the boundaries rather differently from Sewell.

17. Taleb makes a similar point—namely that to have predicted the invention of what we now call the Internet, one would have to have known an awful lot about the applications to which the Internet was put after it had been invented. As Taleb puts it, “to understand the future to the point of being able to predict it, you need to incorporate elements from this future itself. If you know about the discovery you are about to make, then you have almost made it” (Taleb 2007, p. 172).

CHAPTER 7: THE BEST-LAID PLANS

  1. Interestingly, a recent story in Time magazine (Kadlec 2010) contends that a new breed of poker players is relying on statistical analysis of millions of games played online to win at major tournaments.

  2. See Ayres (2008) for details. See also Baker (2009) and Mauboussin (2009) for more examples of supercrunching.

  3. For more details on prediction markets, see Arrow et al. (2008), Wolfers and Zitzewitz (2004), Tziralis and Tatsiopoulos (2006), and Sunstein (2005). See also Surowiecki (2004) for a more general overview of the wisdom of crowds.

  4. See Rothschild and Wolfers (2008) for details of the Intrade manipulation story.

  5. In a recent blog post, Ian Ayres (author of Supercrunchers) calls the relative performance of prediction markets “one of the great unresolved questions of predictive analytics” (http://freakonomics.blogs.nytimes.com/2009/12/23/prediction-markets-vs-super-crunching-which-can-better-predict-how-justice-kennedy-will-vote/).

  6. To be precise, we had different amounts of data for each of the methods—for example, our own polls were conducted over only the 2008–2009 season, whereas we had nearly thirty years of Vegas data, and TradeSports predictions ended in November 2008, when it was shut down—so we couldn’t compare all six methods over any given time interval. Nevertheless, for any given interval, we were always able to compare multiple methods. See Goel, Reeves, et al. (2010) for details.

  7. In this case, the model was based on the number of screens the movie was projected to open on, and the number of people searching for it on Yahoo! the week before it opened. See Goel, Reeves, et al. (2010) for details. See Sunstein (2005) for more details on the Hollywood Stock Exchange and other prediction markets.

  8. See Erikson and Wlezien (2008) for details of their comparison between opinion polls and the Iowa Electronic Markets.

  9. Ironically, the problem with experts is not that they know too little, but rather that they know too much. As a result, they are better than nonexperts at wrapping their guesses in elaborate rationalizations that make them seem more authoritative, but are in fact no more accurate. See Payne, Bettman, and Johnson (1992) for more details of how experts reason. Not knowing anything, however, is also bad, because without a little expertise, one has trouble even knowing what one ought to be making guesses about. For example, while most of the attention paid to Tetlock’s study of expert prediction was directed at the surprisingly poor performance of the experts—who, remember, were more accurate when making predictions outside their area of expertise than in it—Tetlock also found that predictions made by naïve subjects (in this case university undergraduates) were significantly worse than those of the experts. The correct message of Tetlock’s study, therefore, was not that experts are no better than anyone else at making predictions, but rather that someone with general knowledge of a subject can outperform both someone who knows nothing at all and someone with a great deal of specialized knowledge. See Tetlock (2005) for details.

10. Spyros Makridakis and colleagues have shown in a series of studies over the years (Makridakis and Hibon 2000; Makridakis et al. 1979; Makridakis et al. 2009b) that simple models are about as accurate as complex models in forecasting economic time series. Armstrong (1985) also makes this point.

11. See Dawes (1979) for a discussion of simple linear models and their usefulness to decision making.

12. See Mauboussin (2009, Chapters 1 and 3) for an insightful discussion on how to improve predictions, along with traps to be avoided.

13. The simplest case occurs when the distribution of probabilities is what statisticians call stationary, meaning that its properties are constant over time. A more general version of the condition allows the distribution to change as long as changes in the distribution follow a predictable trend, such as average house prices increasing steadily over time. However, in either case, the past is assumed to be a reliable predictor of the future.

14. Possibly if the models had included data from a much longer stretch of time—the past century rather than the past decade or so—they might have captured more accurately the probability of a large, rapid, nationwide downturn. But so many other aspects of the economy also changed over that period of time that it’s not clear how relevant much of this data would have been. Presumably, in fact, that’s why the banks decided to restrict the time window of their historical data the way they did.

15. See Raynor (2007, Chapter 2) for the full story.

16. Sony did in fact pursue a partnership with Matsushita, but abandoned the plan in light of Matsushita’s quality problems. Sony therefore opted for product quality while Matsushita opted for low cost—both reasonable strategies that had a chance of succeeding.

17. As Raynor writes, “Sony’s strategies for Betamax and MiniDisc had all the elements of success, but neither succeeded. The cause of these failures was, simply put, bad luck: the strategic choices Sony made were perfectly reasonable; they just turned out to be wrong.” (p. 44).

18. For an overview of the history of scenario planning, see Millet (2003). For theoretical discussions, see Brauers and Weber (1988), Schoemaker (1991), Perrottet (1996), and Wright and Goodwin (2009). Scenario planning also closely resembles what Makridakis, Hogarth, and Gaba (2009a) call “future perfect thinking.”

19. For details of Pierre Wack’s work at Royal Dutch/Shell, see Wack (1985a; 1985b).

20. Raynor actually distinguishes three kinds of management: functional management, which is about optimizing daily tasks; operational management, which is focused on executing existing strategies; and strategic management, which is focused on the management of strategic uncertainty. (Raynor 2007, pp. 107–108)

21. For example, a 2010 story about Ford’s then CEO claimed that “What Ford won’t do is change direction again, at least not under Mr. Mulally’s watch. He promises that he—and Ford’s 200,000 employees—will not waver from his ‘point of view’ about the future of the auto industry. ‘That is what strategy is all about,’ he says. ‘It’s about a point of view about the future and then making decisions based on that. The worst thing you can do is not have a point of view, and not make decisions.’” New York Times, January 9, 2010.

22. This example was originally presented in Beck (1983), but my discussion of it is based on the analysis by Schoemaker (1991).

23. According to Schoemaker (1991, p. 552), “A deeper scenario analysis would have recognized the confluence of special circumstances (e.g. high oil prices, tax incentives for drilling, conducive interest rates, etc.) underlying this temporary peak. Good scenario planning goes beyond just high-low projections.”

24. See Raynor (2007, p. 37).

CHAPTER 8: THE MEASURE OF ALL THINGS

  1. Some more details about Zara’s supply chain management are provided in a Harvard Business Review case study of the company (2004, pp. 69–70). Additional details are provided in Kumar and Linguri (2006).

  2. Mintzberg, it should be noted, was careful to differentiate strategic planning from “operational” planning, which is concerned with short-term optimization of existing procedures. The kind of planning models that don’t work for strategic plans actually do work quite well for operational planning—indeed, it was for operational planning that the models were originally developed, and it was their success in this context that Mintzberg believed had encouraged planners to repurpose them for strategic planning. The problem is therefore not that planning of any kind is impossible, any more than prediction of any kind is impossible, but rather that certain kinds of plans can be made reliably and others can’t be, and that planners need to be able to tell the difference.

  3. See Helft (2008) for a story about the Yahoo! home page overhaul.

  4. See Kohavi et al. (2010) and Tang et al. (2010).

  5. See Clifford (2009) for a story about startup companies using quantitative performance metrics to substitute for design instinct.

  6. See Alterman (2008) for Peretti’s original description of the Mullet Strategy. See Dholakia and Vianello (2009) for a discussion of how the same approach can work for communities built around brands, and the associated tradeoff between control and insight.

  7. See Howe (2008, 2006) for a general discussion of crowdsourcing. See Rice (2010) for examples of recent trends in online journalism.

  8. See Clifford (2010) for more details on Bravo, and Wortman (2010) for more details on Cheezburger Network. See http://bit.ly/9EAbjR for an interview with Jonah Peretti about contagious media and BuzzFeed, which he founded.

  9. See http://blog.doloreslabs.com for many innovative uses of crowdsourcing.

10. See Paolacci et al. (2010) for details of turker demographics and motivations. See Kittur et al. (2008) and Snow et al. (2008) for studies of Mechanical Turk reliability. And see Sheng, Provost, and Ipeirotis (2008) for a method for improving turker reliability.

11. See Polgreen et al. (2008) and Ginsberg et al. (2008) for details of the influenza studies. Recently, the CDC has reduced its reporting delay for influenza caseloads (Mearian 2009), somewhat undermining the time advantages of search-based surveillance.

12. The Facebook happiness index is available at http://apps.facebook.com/usa-gnh. See also Kramer (2010) for more details. A similar approach has been used to extract happiness indices from song lyrics and blog postings (Dodds and Danforth 2009) as well as Twitter updates (Bollen et al. 2009).

13. See http://yearinreview.yahoo.com/2009 for a compilation of the most popular searches in 2009. Facebook has a similar service based on status updates, as does Twitter. As some commenters have noted (http://www.collisiondetection.net/mt/archives/2010/01/the_problem_wit.php), these lists often produce rather banal results, and so might be more interesting or useful if constrained to more specific subpopulations of interest to particular individuals—like their friends, for example. Fortunately, modifications like this are relatively easy to implement; thus the fact that topics of highest average interest are unsurprising or banal does not imply that the capability to reflect collective interest is itself uninteresting.

14. See Choi and Varian (2008) for more examples of “predicting the present” using search trends.

15. See Goel, Hofman, Lahaie, Pennock, and Watts (2010) for details of using web search to make predictions.

16. Steve Hasker and I wrote about this approach to planning in marketing a few years ago in the Harvard Business Review (Watts and Hasker 2006).

17. The relationship between sales and advertising is in fact a textbook example of what economists call the endogeneity problem (Berndt 1991).

18. In fact, there was a time when controlled experiments of this kind enjoyed a brief burst of enthusiasm among advertisers, and some marketers, especially in the direct-mail world, still run them. In particular, Leonard Lodish and colleagues conducted a series of advertising experiments, mostly in the early 1990s using split cable TV (Abraham and Lodish 1990; Lodish et al. 1995a; Lodish et al. 1995b; and Hu et al. 2007). Also see Bertrand et al. (2010) for an example of a direct-mail advertising experiment. Curiously, however, the practice of routinely including control groups in advertising campaigns, for TV, word-of-mouth, and even brand advertising, never caught on, and these days it is mostly overlooked in favor of statistical models, often called “marketing mix models” (http://en.wikipedia.org/wiki/Marketing_mix_modeling).
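The logic of such control-group experiments can be sketched in a few lines (the numbers below are hypothetical; the point is simply that the ad’s effect is the difference between randomized treatment and control groups, not the raw purchase rate of those who saw the ad):

```python
import random

random.seed(1)  # reproducible run
n = 50_000      # customers randomly assigned to each group (hypothetical)

def purchases(group_size, rate):
    """Count how many customers in a group end up buying."""
    return sum(1 for _ in range(group_size) if random.random() < rate)

control = purchases(n, 0.020)    # baseline purchase rate, no ad shown
treatment = purchases(n, 0.023)  # assumed small lift from seeing the ad

lift = treatment / n - control / n
print(f"estimated lift: {lift:.4f}")  # noisy estimate of the true 0.003
```

Without the control group, the ad would appear responsible for all of the treatment group’s purchases rather than the small increment it actually caused.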

19. See, for example, a recent Harvard Business School article by the president and CEO of comScore (Abraham 2008). Curiously, the author was one of Lodish’s colleagues who worked on the split-cable TV experiments.

20. User anonymity was maintained throughout the experiment by using a third-party service to match Yahoo! and retailer IDs without disclosing individual identities to the researchers. See Lewis and Reiley (2009) for details.

21. More effective advertising may even be better for the rest of us. If you only saw ads when there was a chance you might be persuaded by them, you’d probably see many fewer ads, and possibly wouldn’t find them as annoying.

22. See Brynjolfsson and Schrage (2009). Department stores have long experimented with product placement, trying out different locations or prices for the same product in different stores to learn which arrangements sell the most. But now that virtually all physical products are labeled with unique barcodes, and many also contain embedded RFID chips, retailers have the potential to track inventory and measure variation between stores, regions, times of the day, or times of the year—possibly leading to what Marshall Fisher of the University of Pennsylvania’s Wharton School has called the era of “Rocket Science” retailing (Fisher 2009). Ariely (2008) has also made a similar point.

23. See http://www.povertyactionlab.org/ for information on the MIT Poverty Action Lab. See Arceneaux and Nickerson (2009) and Gerber et al. (2009) for examples of field experiments run by political scientists. See Lazear (2000) and Bandiera, Barankay, and Rasul (2009) for examples of field experiments run by labor economists. See O’Toole (2007, p. 342) for the example of the national parks and Ostrom (1999, p. 497) for a similar attitude toward common pool resource governance, in which she argues that “all policy proposals must be considered as experiments.” Finally, see Ayres (2007, Chapter 3) for other examples of field experiments.

24. Ethical considerations also limit the scope of experimental methods. For example, although the Department of Education could randomly assign students to different schools, and while that would probably be the best way to learn which education strategies really work, doing so would impose hardship on the students who were assigned to the bad schools, and so would be unethical. If you have a reasonable suspicion that something might be harmful, you cannot ethically force people to experience it even if you’re not sure; nor can you ethically refuse them something that might be good for them. All of this is as it should be, but it necessarily limits the range of interventions to which aid and development agencies can assign people or regions randomly, even if they could do so practically.

25. For specific quotes, see Scott (1998) pp. 318, 313, and 316, respectively.

26. See Leonhardt (2010) for a discussion of the virtues of cap and trade. See Hayek (1945) for the original argument.

27. See Brill (2010) for an interesting journalistic account of the Race to the Top. See Booher-Jennings (2005) and Ravitch (2010) for critiques of standardized testing as the relevant metric for student performance and teacher quality.

28. See Heath and Heath (2010) for their definition of bright spots. See Marsh et al. (2004) for more details of the positive deviance approach. Examples of positive deviance can be found at http://www.positivedeviance.org/. The hand-washing story is taken from Gawande (2008, pp. 13–28), who describes an initial experiment run in Pittsburgh. Gawande cautions that it is still uncertain how well the initial results will last, or whether they will generalize to other hospitals; however, a recent controlled experiment (Marra et al. 2010) suggests that they might.

29. See Sabel (2007) for a description of bootstrapping. See Watts (2003, Chapter 9) for an account of Toyota’s near catastrophe with “just in time” manufacturing, and also their remarkable recovery. See Nishiguchi and Beaudet (2000) for the original account. See Helper, MacDuffie, and Sabel (2000) for a discussion of how the principles of the Toyota production system have been adopted by American firms.

30. See Sabel (2007) for more details on what makes for successful industrial clusters, and Giuliani, Rabellotti, and van Dijk (2005) for a range of case studies. See Lerner (2009) for cautionary lessons in government attempts to stimulate innovation.

31. Of course, in attempting to generalize local solutions, one must remain sensitive to the context in which they are used. Just because a particular hand-washing practice works in one hospital does not necessarily mean that it will work in another, where a different set of resources, constraints, problems, patients, and cultural attitudes may prevail. We don’t always know when a solution can be applied more broadly—in fact, it is precisely this unpredictability that makes central bureaucrats and administrators unable to solve the problem in the first place. Nevertheless, working out which local solutions will generalize, and how, should be the focus of planning.

32. Easterly (2006, p. 6).

CHAPTER 9: FAIRNESS AND JUSTICE

  1. Herrera then sued the city, which in 2006 eventually settled for $1.5 million. Three other officers who were involved in the incident were fired, and overall seventeen members of the 72nd Precinct, including the commander, were disciplined. Police Commissioner Kerik opened an investigation into the operation of the midnight shift, which was apparently known to suffer from poor supervision and lax routines. Both Mayor Giuliani and his successor, Michael Bloomberg, weighed in on the case, as did Governor Pataki. The legal status of the unborn baby Ricardo resulted in a fight between the medical examiner, who claimed the baby did not live independently of its mother and was therefore not to be considered a separate death, and the district prosecutor, who claimed the opposite. From the initial reports of the accident through the settlement of the lawsuit, the New York Times published nearly forty articles about the tragedy.

  2. For a discussion of the relationship between rational organizing principles and the actual functioning of real social organizations, see Meyer and Rowan (1977), DiMaggio and Powell (1983), and Dobbin (1994). For a comprehensive treatment of the “new institutionalist” view of organizational sociology, see Powell and DiMaggio (1991).

  3. See Menand (2001, pp. 429–33) for a discussion of Oliver Wendell Holmes’s reasoning.

  4. The psychologist Ed Thorndike was the first to document the Halo Effect in psychological evaluations (Thorndike 1920). For a review of the psychological literature on the Halo Effect, see Cooper (1981). For the John Adams quote, see Higginbotham (2001, p. 216).

  5. For more examples of the Halo Effect in business, see Rosenzweig (2007). For a glowing story about the success of Steve & Barry’s, see Wilson (2008). For a story about their subsequent bankruptcy, see Sorkin (2008).

  6. See Rosenzweig (2007, pp. 54–56) for more examples of attribution error, and Staw (1975) for details of the experiment that Rosenzweig discusses.

  7. To illustrate, consider a simple thought experiment in which we compare a “good” process, G, with a “bad” process, B, and where, just for the sake of the example, G has a 60 percent chance of success, while B succeeds only 40 percent of the time. If you think this isn’t a big difference, imagine two roulette wheels that produced red outcomes 60 percent and 40 percent of the time—betting on red and black, respectively, one could quickly and easily make a fortune. Likewise, a strategy for making money in financial markets by placing many small bets would do very well if it paid out equal amounts of money 60 percent of the time, and lost them 40 percent of the time. But imagine now that instead of spinning a roulette wheel—a process we can repeat many times—our processes correspond to alternative corporate strategies or education policies. This now being an experiment that can be run only once, we observe the following probabilities:

Prob[G succeeds while B fails] = 0.6 * (1 - 0.4) = 0.36

Prob[B succeeds while G fails] = 0.4 * (1 - 0.6) = 0.16

Prob[G and B both succeed] = 0.6 * 0.4 = 0.24

Prob[G and B both fail] = (1 - 0.6) * (1 - 0.4) = 0.24

In other words, it is more likely that G will do at least as well as B than the other way around—just as one would expect. But it is also the case that only one time in three, roughly, will G succeed while B fails. Almost half the time, in fact, both strategies perform equally well—or poorly—and one time out of six, it will even be the case that B will succeed while G fails. With almost two-thirds probability, it follows that when the good and bad processes are run side by side, the outcomes will not accurately reflect their differences.
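The arithmetic above is easy to verify. The following short Python sketch reproduces the four probabilities (the 60/40 success rates are, as in the thought experiment, purely hypothetical values, not empirical figures):

```python
# Hypothetical success rates from the thought experiment: a "good"
# process G succeeds 60 percent of the time, a "bad" process B 40 percent.
p_g, p_b = 0.6, 0.4

outcomes = {
    "G succeeds, B fails": p_g * (1 - p_b),        # 0.36
    "B succeeds, G fails": p_b * (1 - p_g),        # 0.16
    "both succeed":        p_g * p_b,              # 0.24
    "both fail":           (1 - p_g) * (1 - p_b),  # 0.24
}

# In all cases except "G succeeds, B fails" (about 64 percent of runs),
# a single side-by-side trial fails to reveal that G is the better process.
indistinguishable = 1 - outcomes["G succeeds, B fails"]
print(outcomes, indistinguishable)
```

Note that the calculation assumes the two processes run independently; the point survives under weaker assumptions, but the numbers would differ.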

  8. See Brill (2009) for the original quote.

  9. The distinction is important because it is often argued that for any sufficiently large population of fund managers, someone will be successful for many years in a row, even if success in any given year is determined by a coin toss. But as Mauboussin (2006, 2010) shows, coin tossing is actually a misleading metaphor. Because the performance of managed funds is assessed after fees, and because the overall portfolio of managed funds does not necessarily mirror the S&P 500, there is no reason to think that 50 percent of funds should “beat the market” in any given year. In fact the actual percentage varied from 7.9 percent (in 1997) to 67.1 percent (in 2005) over the fifteen-year interval of Miller’s streak. When these empirical success rates are taken into account, the probability of observing a streak like Miller’s is closer to one in 2.3 million (Mauboussin 2006, p. 50).

10. For DiMaggio’s statistics, see “DiMaggio’s Statistics” at http://www.baseball-almanac.com/fur.

11. Arbesman and Strogatz (2008), using simulations, find that the likelihood of a fifty-six-game streak is somewhere between 20 percent and 50 percent. Interestingly, they also find that DiMaggio was not the most likely player to have attained this distinction; thus his streak was some mixture of skill and luck. See also McCotter (2008), who shows that long streaks happen more frequently than they should if batting average is constant, as Arbesman and Strogatz assume, suggesting that batters in the midst of a streak may be more likely to get a subsequent hit than their season average would suggest. Although they disagree with respect to the likelihood of streaks, however, both models agree that the correct measure of performance is the batting average, not the streak itself.
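A stripped-down version of this kind of simulation is easy to sketch. The code below is not Arbesman and Strogatz’s model—they simulate every player across baseball history—but a toy, single-player, single-season version under their constant-batting-average assumption; the .350 average and four at-bats per game are illustrative numbers, not DiMaggio’s actual figures:

```python
import random

def longest_streak(p_game, games=154, rng=random):
    """Longest run of consecutive games containing at least one hit."""
    best = run = 0
    for _ in range(games):
        if rng.random() < p_game:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best

# Under a constant .350 batting average and four at-bats per game, the
# chance of at least one hit in a given game is 1 - (1 - 0.35)**4, or
# roughly 0.82. (Both numbers are illustrative assumptions.)
p_game = 1 - (1 - 0.35) ** 4

trials = 20_000
long_streaks = sum(longest_streak(p_game) >= 56 for _ in range(trials))
print(f"Estimated P(56-game streak in one season): {long_streaks / trials}")
```

Even for a very good hitter, the single-season probability is tiny; estimates like Arbesman and Strogatz’s 20 to 50 percent emerge only after aggregating over the many players and seasons in baseball history.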

12. Of course, it’s not always easy to agree on what constitutes a reliable measure of talent in sports either: whereas for a 100-meter sprinter it is very clear, in baseball it is much less so, and fans argue endlessly over which statistics—batting average, strikeout rate, runs batted in, slugging percentage—ought to count for more. Mauboussin (2010), for example, argues that strikeout rate is a more reliable measure of performance than batting average. Whatever the right measure is, however, the main point is that sports afford relatively large numbers of “trials” that are conducted under relatively comparable conditions.

13. See Lewis (2009) for an example of measuring performance in terms of a player’s effect on the team’s win-loss record.

14. Of course, we could artificially increase the number of data points by looking at their daily or weekly performance rather than their annual one; but these measures are also correspondingly noisier than annual measures, so it probably wouldn’t help.

15. See Merton for the original paper. See also Denrell (2004) for a related argument about how random processes can account for persistent differences in profitability among businesses.

16. See Rigney (2010). See also DiPrete and Eirich (2006) for a more technical review of the cumulative advantage and inequality literature. See Kahn (2010) for detail on college graduates’ earnings.

17. See McDonald (2005) for the Miller quote.

18. Mauboussin (2010) makes this point in considerably more detail.

19. Ironically, the further removed a measure of success is from a direct measure of talent, the more powerful the Halo Effect becomes. As long as your claim to talent is based on your personally having performed a particular thing well, someone can always question how well it was actually done, or how worthwhile a thing it was to do in the first place. But as soon as one’s accomplishments become abstracted from their substance—as happens, for example, when a person wins important prizes, achieves great recognition, or makes fabulous amounts of money—concrete, individual metrics for assessing performance are gradually displaced by the Halo. A successful person, like a bestselling book or popular idea, is simply assumed to have displayed the appropriate merit, at which point the success effectively becomes a substitute for merit itself. But even more than that, it is merit that cannot be easily questioned. If one believes that the Mona Lisa is a great piece of art because of X, Y, and Z, a knowledgeable disputant can immediately counter with his or her own criteria, or point out other examples that ought to be considered superior. But if one believes instead that the Mona Lisa is a great piece of art simply because it is famous, our pesky disputant can come up with all the objections she desires and we can insist quite reasonably that she must be missing the point. No matter how knowledgeably she argues that the properties of the Mona Lisa aren’t uniquely special, we cannot help but suspect that something must have been overlooked, because surely if the artwork was not really special, then it wouldn’t be, well, special.

20. See http://www.forbes.com/lists/2009/12/best-boss-09_Steven-P-Jobs_HEDB.html.

21. Sometimes even the leaders themselves concede this point—but interestingly they tend to do so only when things are going badly. For example, when the leaders of the four largest investment banks testified before Congress in early 2010, they did not take personal responsibility for the performance of their firms, claiming instead to have been victims of a “financial tsunami” that had wreaked havoc on the economy. Yet in the years leading up to the crisis, when their firms were making money hand over fist, these same leaders were not turning down their bonuses on the grounds that everyone in their industry was making money, and therefore they shouldn’t be credited with doing anything special. See Khurana (2002) for details, and Wasserman, Anand, and Nohria (2010) for the empirical results on when leadership matters.

22. To quote Khurana directly: “strong social, cultural, and psychological forces lead people to believe in cause-and-effect relationships such as that between corporate leadership and corporate performance. In the United States, the cultural bias towards individualism largely discounts the influence of social, economic, and political forces in human affairs so that accounts of complicated events such as wars and economic cycles reduce the forces behind them to personifications.… This process of exaggerating the ability of individuals to influence immensely complex events is strongly abetted by the media, which fixate the public’s attention on the personal characteristics of leaders at the expense of serious analysis of events” (Khurana 2002, p. 23).

23. As Khurana and other critics are quick to acknowledge, their research does not mean that anyone can be an effective CEO, or that CEO performance is irrelevant. It is certainly possible, for example, for a CEO to destroy tremendous value by making awful or irresponsible decisions. And because avoiding bad decisions can be difficult, even satisfactory performance requires a certain amount of experience, intellect, and leadership ability. Certainly not everyone has the wherewithal to qualify for the job, or the discipline and energy to perform it. Many CEOs are impressive people who work long hours under stressful conditions and carry heavy burdens of responsibility. It’s therefore perfectly reasonable for corporate boards to choose candidates selectively and to compensate them appropriately for their talent and their time. The argument is just that they shouldn’t be selected or compensated on the grounds that their individual performance will have more than a weak influence on the future performance of their firm.

24. For a summary of Rawls’s and Nozick’s arguments, see Sandel (2009). For the original arguments, see Rawls (1971) and Nozick (1974).

25. See DiPrete (2002) for empirical evidence on intergenerational social mobility.

26. See, for example, Herszenhorn (2009) and Kocieniewski (2010).

27. See, for example, Watts (2009).

28. See Watts (2003, Chapter 1) for a detailed discussion of one such cascading failure—the 1996 failure in the western United States.

29. See Perrow (1984) for examples of what he calls normal accidents in complex organizations. See also Carlson and Doyle (2002) for a more technical treatment of the “robust yet fragile” nature of complex systems.

30. See Taibbi (2009) for an example of how Goldman Sachs profited from multiple forms of government assistance.

31. See Sandel (2009).

32. See Granovetter (1985).

33. See Berger and Luckmann (1966). The deliberative democracy literature mentioned in Chapter 1, note 25, is also relevant to Sandel’s argument.

CHAPTER 10: THE PROPER STUDY OF MANKIND

  1. The full text of Pope’s “Essay on Man” is available online from Project Gutenberg at http://www.gutenberg.org/etext/2428.

  2. Parsons’s notion of rationality was inspired by Max Weber, who interestingly was not a functionalist, or even a positivist, espousing instead what has become known as an interpretive school of sociology, manifest in his claim that rational action was that which was understandable (verstehen) to an analyst. Nevertheless, Weber’s work was quickly seconded by strongly positivistic theories, of which rational choice theory is the most obvious example, illustrating how deeply the positivistic urge runs in all forms of science, including social science. Parsons also is sometimes cast as an anti-positivist, but once again, his ideas have been incorporated into positivist theories of social action.

  3. For critiques of Parsons, see Mayhew (1980, p. 353), Harsanyi (1969, p. 514), and Coleman and Fararo (1992, p. xvii).

  4. Many sociologists—both before Merton and since—have been critical of what they have viewed as facile attempts to replicate the success of natural science by imitating its form rather than its methods. As early as the 1940s, for example, Parsons’s contemporary Huntington Cairns wrote that “We possess no such synoptic view of social sciences which encourages us to believe that we are now at a stage of analysis where we can with any certainty select the basic concepts upon which an integrated structure of knowledge can be erected” (Cairns 1945, p. 13). More recently, a steady drumbeat of criticism has been directed at rational choice theory, for much the same reasons (Quadagno and Knapp 1992; Somers 1998).

  5. Quotes are from Merton (1968a).

  6. See Merton (1968a) for his description of theories of the middle range, including the theory of relative deprivation and the theory of the role set.

  7. Harsanyi (1969, p. 514) and Diermeier (1996) both reference Newton, while the political scientists Donald Green and Ian Shapiro have called rational choice theory “an ever-expanding tent in which to house every plausible proposition advanced by anthropology, sociology, or social psychology” (Green and Shapiro 2005).

  8. The “success” or “failure” of rational choice theory, it should be noted, is highly controversial, with rational choice advocates claiming that it is unfair to evaluate rational choice theory as a “theory” in the first place, when really it should be regarded rather more as a family of theories unified only by their emphasis on purposive action as the cause of social outcomes over accident, blind conformity, or habit (Farmer 1992; Kiser and Hechter 1998; Cox 1999). Perhaps this is an accurate statement of what rational choice theory has become (although interestingly some rational choice theorists include even habit within the ambit of rational incentives [Becker and Murphy 2000]), but it’s certainly not what early proponents like Harsanyi intended it to be. Harsanyi in fact criticized Parsons’s theory explicitly for not being a “theory” at all, lacking the ability to derive conclusions logically from a set of axioms—or as he put it, “the very concept of social function in a collectivist sense gives rise to insoluble problems of definition and of empirical identification” (1969, p. 533). Whether or not it has subsequently metamorphosed into something more realistic should therefore not distract from the point that its original mission was to be a theory, and that in that sense it has been no more successful than any of its predecessors.

  9. Indeed, as Becker (1945, p. 84) noted long ago, natural scientists are every bit as prone as social scientists to overestimate their ability to construct predictive models of human behavior.

10. Stouffer (1947).

11. It should be noted that not all sociologists agree that measurement is really the problem that I’m making it out to be. According to at least one school of thought, sociological theories should help us to make sense of the world, and give us a language with which to argue about it; but they shouldn’t aim to make predictions or to solve problems, and so shouldn’t be judged by the pragmatic test in the first place. If this “interpretive” view of sociology is correct, the whole positivist enterprise that began with Comte is based on a fundamental misunderstanding about the nature of social science, starting with the assumption that it ought to be considered a branch of science at all (Boudon 1988b). Sociologists, therefore, would do better to focus on developing “approaches” and “frameworks”—ways of thinking about the world that allow them to see what they might otherwise miss, and question what other people take for granted—and forget all about trying to build theories of the kind that are familiar to us from physics. It was essentially this kind of approach to sociology, in fact, that Howard Becker was advocating in his book Tricks of the Trade, the review of which I encountered back in 1998, and that John Gribbin—the reviewer, who, remember, is a physicist—evidently found infuriating.

12. See, for example, Paolacci et al. (2010).

13. The privacy debate is an important one, and raises a number of unresolved questions. First, when asked, people say they care deeply about maintaining their privacy (Turow et al. 2009); however, their actions frequently belie their responses to survey questions. Not only do many people post a great deal of highly personal information about themselves in public, but they also decline to pay for services that would guarantee them a higher than default level of privacy. Possibly this disconnect between espoused and revealed preferences implies only that people do not understand the consequences of their actions; but it may also imply that abstract questions about “privacy” are less meaningful than concrete tradeoffs in specific situations. A second, more troubling problem is that regardless of how people “really” feel about revealing particular pieces of information about themselves, they are almost certainly unable to appreciate the ability of third parties to construct information profiles about them, and thereby infer other information that they would not feel comfortable revealing.

14. See Sherif (1937) and Asch (1953) for details of their pioneering experiments. See Zelditch (1969) for his discussion of small-group versus large-group studies. See Adar and Adamic (2005), Sun et al. (2009), and Bakshy and Adamic (2009) for other examples of tracking information diffusion in online networks.

15. Now that they’ve proven the concept, Reiley and Lewis are embarking on a whole array of similar experiments—for department stores, phone providers, financial services companies, and so on—with the aim of measuring differences across domains (do ads work differently for phones than for credit cards?), across demographics (are older people more susceptible than younger?), and even across specific ad layouts and designs (blue background versus white?).

16. See Lazarsfeld and Merton (1954) for the original definition of “homophily,” and McPherson et al. (2001) for a recent survey of the literature. See Feld (1981) and McPherson and Smith-Lovin (1987) for discussion of the importance of structural opportunities.

17. The reason is that social structure not only shapes our choices but is also shaped by them. It is true, for example, that whom we are likely to meet in the immediate future is determined to some degree by our existing social circles and activities. But on a slightly longer timescale it is also true that we may choose to do certain things over others precisely because of the people we expect to meet in the course of doing them. The whole point of “social networking” events in the business world, for example, is to put yourself in a situation where you might meet interesting people. Likewise, the determination of some parents to get their children into the “right” schools has less to do with the quality of education they will receive than the classmates they will have. That said, of course, it is not equally easy for everyone to get into Harvard, or to get invited to the most desirable social gatherings. On a longer timescale again, therefore, your position in the social structure constrains not only whom you can get to know now but also the choices that will determine your future position in the social structure. Arguments about the relative importance of individual preferences and social structure invariably get bogged down in this chicken-and-egg tangle, and so tend to get resolved by ideology rather than by data. Those who believe in the power of individual choice can always contend that structure is simply the consequence of choices that individuals have made, while those who believe in the power of structure can always contend that the appearance of choice is illusory.

18. A similar finding has subsequently been reported in another study of homophily using data collected from Facebook (Wimmer and Lewis 2010).

19. Some studies have found that polarization is increasing (Abramowitz and Saunders 2008; Bishop 2008), whereas others have found that Americans agree more than they disagree, and that views on one issue, say abortion, turn out to be surprisingly uncorrelated with views on other matters, like gun ownership, or immigration (Baldassarri and Gelman 2008; Gelman et al. 2008; DiMaggio et al. 1996; Fiorina et al. 2005).

20. See Baldassarri and Bearman (2007) for a discussion of real versus perceived agreement. In spite of the practical difficulties, some pioneering studies of precisely this kind have been conducted, first by Laumann (1969) and later by Huckfeldt and colleagues (Huckfeldt et al. 2004; Huckfeldt and Sprague 1987).

21. Clearly Facebook is an imperfect representation of everyone’s friendship network: Not everyone is on Facebook, so some close friends may be missing, while many “friends” are barely acquainted in real life. Counting mutual friends can help differentiate between genuine and illusory friendships, but this method is also imperfect, as even casual acquaintances on Facebook may share many mutual friends. A better approach would be to observe how frequently friends communicate or perform other kinds of relational acts (e.g., clicking on a newsfeed item, commenting, liking, etc.); however, this data is not yet available to third-party developers.

22. For details of the Friend Sense study, see Goel, Mason, and Watts (2010).

23. Projection is a well-studied phenomenon in psychology, but it has been difficult to measure in social networks, for much the same reasons that have stymied network research in general. For a review of the projection literature, see Krueger and Clement (1994), Krueger (2007), and Robbins and Krueger (2005).

24. See Aral, Muchnik, and Sundararajan (2009) for a recent study of influence in viral marketing.

25. For other recent work using e-mail data see, Tyler et al. (2005), Cortes et al. (2003), Kossinets and Watts (2006), Malmgren et al. (2009), De Choudhury et al. (2010), and Clauset and Eagle (2007). For related work using cell-phone data, see Eagle et al. (2007) and Onnela et al. (2007); and for work using instant messaging data, see Leskovec and Horvitz (2008).

26. For information on progress in the fight against cancer, see the excellent series of articles “The Forty Years War,” published in the New York Times. Search “forty years war cancer” or go to http://bit.ly/c4bsc9. For a similar account of the genomics revolution, see recent articles by Wade (2010) and Pollack (2010).

27. I have made a similar argument elsewhere (Watts 2007), as have a number of other authors (Shneiderman 2008; Lazer et al. 2009).