The message of the previous chapter is that the kinds of predictions that common sense tells us we ought to be able to make are in fact impossible—for two reasons. First, common sense tells us that only one future will actually play out, and so it is natural to want to make specific predictions about it. In complex systems, however, which comprise most of our social and economic life, the best we can hope for is to reliably estimate the probabilities with which certain kinds of events will occur. Second, common sense also demands that we ignore the many uninteresting, unimportant predictions that we could be making all the time, and focus on those outcomes that actually matter. In reality, however, there is no way to anticipate, even in principle, which events will be important in the future. Even worse, the black swan events that we most wish we could have predicted are not really events at all, but rather shorthand descriptions—“the French Revolution,” “the Internet,” “Hurricane Katrina,” “the global financial crisis”—of what are in reality whole swaths of history. Predicting black swans is therefore doubly hopeless, because until history has played out it’s impossible even to know what the relevant terms are.
It’s a sobering message. But just because we can’t make the kinds of predictions we’d like to make doesn’t mean that we can’t predict anything at all. As any good blackjack player can tell you, counting cards won’t tell you exactly which card is going to show up next, but by knowing the odds better than the house you can still make a lot of money over time by placing more informed bets, and winning more often than you lose.1 And even for outcomes that truly can’t be predicted with any reliability whatsoever, just knowing the limits of what’s possible can still be helpful—because it forces us to change the way we plan. So what kinds of predictions can we make, and how can we make them as accurately as possible? And how should we change the way we think about planning—in politics, business, policy, marketing, and management—to accommodate the understanding that some predictions cannot be made at all? These questions may seem distant from the kinds of issues and puzzles that we grapple with on an everyday basis, but one way or another—through their influence on the firms we work for, or the economy at large, or the issues that we read about every day in the newspaper—they affect us all.
To oversimplify somewhat, there are two kinds of events that arise in complex social systems—events that conform to some stable historical pattern, and events that do not—and it is only the first kind about which we can make reliable predictions. As I discussed in the previous chapter, even for these events we can’t predict any particular outcome any more than we can predict the outcome of any particular die roll. But as long as we can gather enough data on their past behavior, we can do a reasonable job of predicting probabilities, and that can be enough for many purposes.
Every year, for example, each of us may or may not be unlucky enough to catch the flu. The best anyone can predict is that in any given season we would have some probability of getting sick. Because there are so many of us, however, and because seasonal influenza trends are relatively consistent from year to year, drug companies can do a reasonable job of anticipating how many flu shots they will need to ship to a given part of the world in a given month. Likewise, consumers with identical financial backgrounds may vary widely in their likelihood of defaulting on a credit card, depending on what is going on in their lives. But credit card companies can do a surprisingly good job of predicting aggregate default rates by paying attention to a range of socioeconomic, demographic, and behavioral variables. And Internet companies are increasingly taking advantage of the mountains of Web-browsing data generated by their users to predict the probability that a given user will click on a given search result, respond favorably to a particular news story, or be swayed by a particular recommendation. As the legal scholar and economist Ian Ayres writes in his book Super Crunchers, predictions of this kind are being made increasingly in highly data-intensive industries like finance, healthcare, and e-commerce, where the often modest gains associated with data-driven predictions can add up over millions or even billions of tiny decisions—in some cases every day—to produce very substantial gains to the bottom line.2
So far, so good. But there are also many areas of business—as well as of government and policy—that rely on predictions that do not quite fit into this supercrunching mold. For example, whenever a book publisher decides how much of an advance to offer a potential author, it is effectively making a prediction about the future sales of the proposed book. The more copies the book sells, the more royalties the author is entitled to, and so the more of an advance the publisher should offer to prevent the author from signing with a different publisher. But if in making this calculation, the publisher overestimates how well the book will sell, it will end up overpaying the author—good for the author but bad for the publisher’s bottom line. Likewise when a movie studio decides to green-light a project, it is effectively making a prediction about the future revenues of the movie, and thus how much it can afford to spend making and marketing it. Or when a drug company decides to proceed with the clinical testing stage of a new drug, it must justify the enormous expense in terms of some prediction about the likely success of the trial and the eventual market size for the drug.
All these lines of business therefore depend on predictions, but they are considerably more complicated predictions than predictions about the number of flu cases expected in North America this winter, or the probability that a given user will click on a given ad online. When a publisher offers an advance for a book, the book itself is typically at least a year or two away from publication; so the publisher has to make a prediction not only about how the book itself will turn out but also what the market will be like for that kind of book when it is eventually published, how it will be reviewed, and any number of other related factors. Likewise predictions about movies, new drugs, and other kinds of business or development projects are, in effect, predictions about complex, multifaceted processes that play out over months or years. Even worse, because decision makers are constrained to making only a handful of such decisions every year, they do not have the luxury of averaging out their uncertainty over huge numbers of predictions.
Nevertheless, even in these cases, decision makers often have at least some historical data on which to draw. Publishers can keep track of how many copies they have sold of similar books in the past, while movie studios can do the same for box office revenues, DVD sales, and merchandising profits. Likewise, drug companies can assess the rates with which similar drugs have succeeded in reaching the market, marketers can track the historical success of comparable products, and magazine publishers can track the newsstand sales of previous cover stories. Decision makers often also have a lot of other data on which to draw—including market research, internal evaluations of the project in question, and their knowledge of the industry in general. So as long as nothing dramatic changes in the world between when they commit to a project and when it launches, then they are still in the realm of predictions that are at least possible to make reliably. How should they go about making them?
One increasingly popular method is to use what is called a prediction market—meaning a market in which buyers and sellers can trade specially designed securities whose prices correspond to the predicted probability that a specific outcome will take place. For example, the day before the 2008 US presidential election, an investor could have paid $0.92 for a contract in the Iowa Electronic Markets—one of the longest-running and best-known prediction markets—that would have yielded him or her $1 if Barack Obama had won. Participants in prediction markets therefore behave much like participants in financial markets, buying and selling contracts for whatever price is on offer. But in the case of prediction markets, the prices are explicitly interpreted as making a prediction about the outcome in question—for example, the probability of an Obama victory on the eve of Election Day was predicted by the Iowa Electronic Markets to be 92 percent.
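The arithmetic behind such contracts is simple enough to sketch in a few lines of code. The $0.92 price and $1 payoff come from the contract just described; everything else here, including the function names and the hypothetical trader who believes the true probability is 95 percent, is invented for illustration.

```python
# A winner-take-all prediction-market contract pays $1 if the event
# occurs and $0 otherwise, so its price can be read directly as the
# market's implied probability of the event.

def implied_probability(price: float, payoff: float = 1.0) -> float:
    """Market price of a contract divided by its payoff."""
    return price / payoff

def expected_profit(price: float, p_true: float, payoff: float = 1.0) -> float:
    """Expected profit per contract for a buyer who believes the
    true probability of the event is p_true."""
    return p_true * payoff - price

# The day-before-election contract from the text: $0.92 for a $1 payoff.
print(implied_probability(0.92))  # read as a 92 percent chance

# A hypothetical trader who believed the true chance was 95 percent
# would expect to earn about 3 cents per contract by buying at $0.92.
print(round(expected_profit(0.92, 0.95), 2))
```

The point of the sketch is only that price and predicted probability are the same number; actual markets add bid-ask spreads, fees, and other frictions on top.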
In generating predictions like this one, prediction markets exploit a phenomenon that New Yorker writer James Surowiecki dubbed the “wisdom of crowds”—the notion that although individual people tend to make highly error-prone predictions, when lots of these estimates are averaged together, the errors have a tendency to cancel out; hence the market is in some sense “smarter” than its constituents. Many such markets also require participants to bet real money, and so people who know something about a particular topic are more likely to participate than people who don’t. What’s so powerful about this feature of prediction markets is that it doesn’t matter who has the relevant market information—a single expert or a large number of nonexperts, or any combination in between. In theory, the market should incorporate all their opinions in proportion to how much each is willing to bet. In theory, in fact, no one should be able to consistently outperform a properly designed prediction market. The reason is that anyone who could outperform the market would have an incentive to bet in it—but the very act of making money in the market would immediately shift the prices to incorporate the new information.3
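The error-canceling effect at the heart of the wisdom of crowds is easy to demonstrate with a toy simulation. The setup below is entirely hypothetical: each “individual” estimates some true value with independent, unbiased noise, and the average of a thousand such estimates lands far closer to the truth than a typical individual does.

```python
import random

random.seed(42)

TRUE_VALUE = 100.0  # the quantity everyone is trying to estimate
NOISE = 20.0        # a typical individual is off by roughly this much

def individual_estimate() -> float:
    # An error-prone individual: unbiased but noisy.
    return random.gauss(TRUE_VALUE, NOISE)

def crowd_estimate(n: int) -> float:
    # The "wisdom of crowds": average n independent estimates,
    # letting the individual errors cancel out.
    return sum(individual_estimate() for _ in range(n)) / n

single_error = abs(individual_estimate() - TRUE_VALUE)
crowd_error = abs(crowd_estimate(1000) - TRUE_VALUE)
print(f"one person off by:     {single_error:.1f}")
print(f"crowd of 1,000 off by: {crowd_error:.1f}")
# With independent errors, the crowd's error shrinks roughly as 1/sqrt(n).
```

The cancellation depends on the errors being independent and unbiased; when everyone shares the same blind spot, as the financial-crisis discussion later in this chapter suggests, averaging does not help.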
The potential of prediction markets to tap into collective wisdom has generated a tremendous amount of excitement among professional economists and policy makers alike. Imagine, for example, that a market had been set up to predict the possibility of a catastrophic failure in deep-water oil drilling in the Gulf prior to the BP disaster in April 2010. Possibly insiders like BP engineers could have participated in the market, effectively making public what they knew about the risks their firms were taking. Possibly then regulators would have had a more accurate assessment of those risks and been more inclined to crack down on the oil industry before a disaster took place. Possibly the disaster could have been averted. These are the sorts of claims that the proponents of prediction markets tend to make, and it’s easy to see why they’ve generated so much interest. In recent years, in fact, prediction markets have been set up to make predictions as varied as the likely success of new products, the box office revenues of upcoming movies, and the outcomes of sporting events.
In practice, however, prediction markets are more complicated than the theory suggests. In the 2008 presidential election, for example, one of the most popular prediction markets, Intrade, experienced a series of strange fluctuations when an unknown trader started placing very large bets on John McCain, generating large spikes in the market’s prediction for a McCain victory. Nobody figured out who was behind these bets, but the suspicion was that it was a McCain supporter or even a member of the campaign. By manipulating the market prices, he or she was trying to create the impression that a respected source of election forecasts was calling the election for McCain, presumably with the hope of creating a self-fulfilling prophecy. It didn’t work. The spikes were quickly reversed by other traders, and the mystery bettor ended up losing money; thus the market functioned essentially as it was supposed to. Nevertheless, it exposed a potential vulnerability of the theory, which assumes that rational traders will not deliberately lose money. The problem is that if the goal of a participant is instead to manipulate perceptions of people outside the market (like the media) and if the amounts involved are relatively small (tens of thousands of dollars, say, compared with the tens of millions of dollars spent on TV advertising), then they may not care about losing money, in which case it’s no longer clear what signal the market is sending.4
Problems like this one have led some skeptics to claim that prediction markets are not necessarily superior to other less sophisticated methods, such as opinion polls, that are harder to manipulate in practice. However, little attention has been paid to evaluating the relative performance of different methods, so nobody really knows for sure.5 To try to settle the matter, my colleagues at Yahoo! Research and I conducted a systematic comparison of several different prediction methods, where the predictions in question were the outcomes of NFL football games. To begin with, for each of the fourteen to sixteen games taking place each weekend over the course of the 2008 season, we conducted a poll in which we asked respondents to state the probability that the home team would win as well as their confidence in their prediction. We also collected similar data from the website Probability Sports, an online contest where participants can win cash prizes by predicting the outcomes of sporting events. Next, we compared the performance of these two polls with the Vegas sports betting market—one of the oldest and most popular betting markets in the world—as well as with another prediction market, TradeSports. And finally, we compared the prediction of both the markets and the polls against two simple statistical models. The first model relied only on the historical probability that home teams win—which they do 58 percent of the time—while the second model also factored in the recent win-loss records of the two teams in question. In this way, we set up a six-way comparison between different prediction methods—two statistical models, two markets, and two polls.6
Given how different these methods were, what we found was surprising: All of them performed about the same. To be fair, the two prediction markets performed a little better than the other methods, which is consistent with the theoretical argument above. But the best-performing method—the Las Vegas market—was only about 3 percentage points more accurate than the worst-performing method, which was the model that always predicted the home team would win with 58 percent probability. All the other methods were somewhere in between. In fact, the model that also included recent win-loss records was so close to the Vegas market that if you used both methods to predict the actual point differences between the teams, the average error in their predictions would differ by less than a tenth of a point. Now, if you’re betting on the outcomes of hundreds or thousands of games, these tiny differences may still be the difference between making and losing money. At the same time, however, it’s surprising that the aggregated wisdom of thousands of market participants, who collectively devote countless hours to analyzing upcoming games for any shred of useful information, is only incrementally better than a simple statistical model that relies only on historical averages.
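Comparisons like this one are made between probabilistic forecasts, which are commonly evaluated with the Brier score: the average squared difference between the stated probability and the actual 0-or-1 outcome, with lower scores being better. The games and the market’s probabilities below are made up for illustration; only the 58 percent home-team baseline comes from the study described above.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and actual
    outcomes (1 = home team won, 0 = home team lost). Lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A hypothetical weekend of games: 1 means the home team won.
outcomes = [1, 0, 1, 1, 0, 1]

# The crudest model from the text: always predict that the home team
# wins with 58 percent probability.
baseline = [0.58] * len(outcomes)

# A made-up market that adjusts its probability game by game.
market = [0.70, 0.35, 0.65, 0.80, 0.45, 0.60]

print(f"baseline Brier score: {brier_score(baseline, outcomes):.3f}")
print(f"market Brier score:   {brier_score(market, outcomes):.3f}")
```

On invented data like this the market beats the constant baseline, but the study’s point is how narrow that gap turns out to be on real games.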
When we first told some prediction market researchers about this result, their reaction was that it must reflect some special feature of football. The NFL, they argued, has lots of rules like salary caps and draft picks that help to keep teams as equal as possible. And football, of course, is a game where the result can be decided by tiny random acts, like the wide receiver dragging in the quarterback’s desperate pass with his fingertips as he runs full tilt across the goal line to win the game in its closing seconds. Football games, in other words, have a lot of randomness built into them—arguably, in fact, that’s what makes them exciting. Perhaps it’s not so surprising after all, then, that all the information and analysis that is generated by the small army of football pundits who bombard fans with predictions every week is not superhelpful (although it might be surprising to the pundits). In order to be persuaded, our colleagues insisted, we would have to find the same result in some other domain for which the signal-to-noise ratio might be considerably higher than it is in the specific case of football.
OK, what about baseball? Baseball fans pride themselves on their near-fanatical attention to every measurable detail of the game, from batting averages to pitching rotations. Indeed, an entire field of research called sabermetrics has developed specifically for the purpose of analyzing baseball statistics, even spawning its own journal, the Baseball Research Journal. One might think, therefore, that prediction markets, with their far greater capacity to factor in different sorts of information, would outperform simplistic statistical models by a much wider margin for baseball than they do for football. But that turns out not to be true either. We compared the predictions of the Las Vegas sports betting markets over nearly twenty thousand Major League baseball games played from 1999 to 2006 with a simple statistical model based again on home-team advantage and the recent win-loss records of the two teams. This time, the difference between the two was even smaller—in fact, the performance of the market and the model were indistinguishable. In spite of all the statistics and analysis, in other words, and in spite of the absence of meaningful salary caps in baseball and the resulting concentration of superstar players on teams like the New York Yankees and Boston Red Sox, the outcomes of baseball games are even closer to random events than those of football games.
Since then, we have either found or learned about the same kind of result for other kinds of events that prediction markets have been used to predict, from the opening weekend box office revenues for feature films to the outcomes of presidential elections. Unlike sporting events, these occur without any of the rules or conditions that are designed to keep sports competitive. There is also a lot of relevant information that prediction markets could conceivably exploit to boost their performance well beyond that of a simple model or a poll of relatively uninformed individuals. Yet when we compared the Hollywood Stock Exchange (HSX)—one of the most popular prediction markets, which has a reputation for accurate prediction—with a simple statistical model, the HSX did only slightly better.7 And in a separate study of the outcomes of five US presidential elections from 1988 to 2004, political scientists Robert Erikson and Christopher Wlezien found that a simple statistical correction of ordinary opinion polls outperformed even the vaunted Iowa Electronic Markets.8
So what’s going on here? We are not really sure, but our suspicion is that the strikingly similar performance of different methods is an unexpected side effect of the prediction puzzle from the previous chapter. On the one hand, when it comes to complex systems—whether they involve sporting matches, elections, or movie audiences—there are strict limits to how accurately we can predict what will happen. But on the other hand, it seems that one can get pretty close to the limit of what is possible with relatively simple methods. By analogy, if you’re handed a weighted die, you might be able to figure out which sides will come up more frequently in a few dozen rolls, after which you would do well to bet on those outcomes. But beyond that, more elaborate methods like studying the die under a microscope to map out all the tiny fissures and irregularities on its surface, or building a complex computer simulation, aren’t going to help you much in improving your prediction.
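The weighted-die analogy can be made concrete with a short simulation. The die below is invented for illustration, with one face weighted to come up twice as often as the others; a few dozen rolls are usually enough to single it out, and tens of thousands more mostly just sharpen probabilities you already knew.

```python
import random

random.seed(7)

# An invented weighted die: face 6 comes up twice as often as any other.
FACES = [1, 2, 3, 4, 5, 6]
WEIGHTS = [1, 1, 1, 1, 1, 2]

def estimate_probs(n_rolls: int) -> dict:
    """Estimate each face's probability from n_rolls sample rolls."""
    rolls = random.choices(FACES, weights=WEIGHTS, k=n_rolls)
    return {face: rolls.count(face) / n_rolls for face in FACES}

few = estimate_probs(50)       # a few dozen rolls
many = estimate_probs(50_000)  # vastly more effort

# Even the small sample usually identifies face 6 as the one to bet on;
# the huge sample mostly refines estimates you effectively already had.
print("after 50 rolls:    ", {f: round(p, 2) for f, p in few.items()})
print("after 50,000 rolls:", {f: round(p, 2) for f, p in many.items()})
```

A microscope or a computer simulation of the die could, at best, recover the same six probabilities that the rolls already reveal, which is why the extra effort buys so little.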
In the same way, we found that with football games a single piece of information—that the home team wins slightly more than half the time—is enough to boost one’s performance in predicting the outcome above random guessing. In addition, a second simple insight, that the team with the better win-loss record should have a slight advantage, gives you another significant boost. Beyond that, however, all the additional information you might consider gathering—the recent performance of the quarterback, the injuries on the team, the girlfriend troubles of the star running back—will only improve your predictions incrementally at best. Predictions about complex systems, in other words, are highly subject to the law of diminishing returns: The first pieces of information help a lot, but very quickly you exhaust whatever potential for improvement exists.
Of course, there are circumstances in which we may care about very small improvements in prediction accuracy. In online advertising or high-frequency stock trading, for example, one might be making millions or even billions of predictions every day, and large sums of money may be at stake. Under these circumstances, it’s probably worth the effort and expense to invest in sophisticated methods that can exploit the subtlest patterns. But in just about any other business, from making movies or publishing books to developing new technologies, where you get to make only dozens or at most hundreds of predictions a year, and where the predictions you are making are usually just one aspect of your overall decision-making process, you can probably predict about as well as possible with the help of a relatively simple method.
The one method you don’t want to use when making predictions is to rely on a single person’s opinion—especially not your own. The reason is that although humans are generally good at perceiving which factors are potentially relevant to a particular problem, they are generally bad at estimating how important one factor is relative to another. In predicting the opening weekend box office revenue for a movie, for example, you might think that variables such as the movie’s production and marketing budgets, the number of screens on which it will open, and advance ratings by reviewers are all highly relevant—and you’d be correct. But how much should you weight a slightly worse-than-average review against an extra $10 million marketing budget? It isn’t clear. Nor is it clear, when deciding how to allocate a marketing budget, how much people will be influenced by the ads they see online or in a magazine versus what they hear about the product from their friends—even though all these factors are likely to be relevant.
You might think that making these sorts of judgments accurately is what experts would be good at, but as Tetlock showed in his experiment, experts are just as bad at making quantitative predictions as nonexperts and maybe even worse.9 The real problem with relying on experts, however, is not that they are appreciably worse than nonexperts, but rather that because they are experts we tend to consult only one at a time. Instead, what we should do is poll many individual opinions—whether experts or not—and take the average. Precisely how you do this, it turns out, may not matter so much. With all their fancy bells and whistles, prediction markets may produce slightly better predictions than a simple method like a poll, but the difference between the two is much less important than the gain from simply averaging lots of opinions somehow. Alternatively, one can estimate the relative importance of the various predictors directly from historical data, which is really all a statistical model accomplishes. And once again, although a fancy model may work slightly better than a simple model, the difference is small relative to using no model at all.10 At the end of the day, both models and crowds accomplish the same objective. First, they rely on some version of human judgment to identify which factors are relevant to the prediction in question. And second, they estimate and weight the relative importance of each of these factors. As the psychologist Robyn Dawes once pointed out, “the whole trick is to know what variables to look at and then know how to add.”11
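Dawes’s “trick” amounts to a weighted sum of whatever variables human judgment has flagged as relevant. The sketch below scores a hypothetical movie using the box office variables mentioned earlier; the weights, the intercept, and the movie itself are invented placeholders for values that would normally be estimated from historical data.

```python
# A Dawes-style linear model: know what variables to look at,
# then know how to add. All numbers here are hypothetical; in practice
# the weights would be fit to historical data (e.g., by least squares).

WEIGHTS = {
    "marketing_budget_millions": 1.8,  # revenue per $1M of marketing
    "opening_screens_thousands": 9.0,  # revenue per 1,000 opening screens
    "review_score": 0.5,               # revenue per review point (0-100)
}
INTERCEPT = -20.0

def predict_revenue(movie: dict) -> float:
    """Predicted opening-weekend revenue, in millions of dollars."""
    return INTERCEPT + sum(WEIGHTS[k] * movie[k] for k in WEIGHTS)

movie = {
    "marketing_budget_millions": 30,
    "opening_screens_thousands": 3.2,
    "review_score": 64,
}
print(f"predicted opening weekend: ${predict_revenue(movie):.1f}M")
```

Fitting the weights to data is precisely the step that humans are bad at doing in their heads: the model answers, mechanically, how a worse review trades off against an extra $10 million of marketing.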
By applying this trick consistently, one can also learn over time which predictions can be made with relatively low error, and which cannot be. All else being equal, for example, the further in advance you predict the outcome of an event, the larger your error will be. It is simply harder to predict the box office potential of a movie at green light stage than a week or two before its release, no matter what methods you use. In the same way, predictions about new product sales, say, are likely to be less accurate than predictions about the sales of existing products no matter when you make them. There’s nothing you can do about that, but what you can do is start using any one of several different methods—or even use all of them together, as we did in our study of prediction markets—and keep track of their performance over time. As I mentioned at the beginning of the previous chapter, keeping track of our predictions is not something that comes naturally to us: We make lots of predictions, but rarely check back to see how often we got them right. But keeping track of performance is possibly the most important activity of all—because only then can you learn how accurately it is possible to predict, and therefore how much weight you should put on the predictions you make.12
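Keeping track of predictions can be as simple as logging each stated probability next to what actually happened, then periodically checking calibration: of all the times you said “90 percent,” did the event in fact happen about 90 percent of the time? The log below is invented for illustration.

```python
from collections import defaultdict

# A running log of (stated probability, what actually happened).
# These entries are made up for the sake of illustration.
prediction_log = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True),
    (0.6, True), (0.6, False), (0.6, True), (0.6, False), (0.6, False),
]

def calibration(log):
    """For each stated probability, the fraction of times the
    predicted event actually occurred."""
    outcomes = defaultdict(list)
    for prob, happened in log:
        outcomes[prob].append(happened)
    return {p: sum(h) / len(h) for p, h in outcomes.items()}

for prob, actual in sorted(calibration(prediction_log).items()):
    print(f"said {prob:.0%} -> happened {actual:.0%} of the time")
```

A forecaster whose 60 percent calls come true only 40 percent of the time learns something valuable: not just that a prediction was wrong, but how much weight future predictions of that kind deserve.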
No matter how carefully you adhere to this advice, a serious limitation with all prediction methods is that they are only reliable to the extent that the same kind of events will happen in the future as happened in the past, and with the same average frequency.13 In regular times, for example, credit card companies may be able to do a pretty good job of predicting default rates. Individual people may be complicated and unpredictable, but they tend to be complicated and unpredictable in much the same way this week as they were last week, and so on average the models work reasonably well. But as many critics of predictive modeling have pointed out, many of the outcomes that we care about most—like the onset of the financial crisis, the emergence of a revolutionary new technology, the overthrow of an oppressive regime, or a precipitous drop in violent crime—are interesting to us precisely because they are not regular times. And in these situations some very serious problems arise from relying on historical data to predict future outcomes—as a number of credit card companies discovered when default rates soared in the aftermath of the recent financial crisis.
Even more important, the models that many banks were using to price mortgage-backed derivatives prior to 2008—like the infamous CDOs—now seem to have relied too much on data from the recent past, during which time housing prices had only gone up. As a result, ratings analysts and traders alike collectively placed too low a probability on a nationwide drop in real-estate values, and so badly underestimated the risk of mortgage defaults and foreclosure rates.14 At first, it might seem that this would have been a perfect application for prediction markets, which might have done a better job of anticipating the crisis than all the “quants” working in the banks. But in fact it would have been precisely these people—along with the politicians, government regulators, and other financial market specialists who also failed to anticipate the crisis—who would have been participating in the prediction market, so it’s unlikely that the wisdom of crowds would have been any help at all. Arguably, in fact, it was precisely the “wisdom” of the crowd that got us into the mess in the first place. So if models, markets, and crowds can’t help predict black swan events like the financial crisis, then what are we supposed to do about them?
A second problem with methods that rely on historical data is that big, strategic decisions are not made frequently enough to benefit from a statistical approach. It may be the case, historically speaking, that most wars end poorly, or that most corporate mergers don’t pay off. But it may also be true that some military interventions are justified and that some mergers succeed, and it may be impossible to tell the difference in advance. If you could make millions, or even hundreds, of such bets, it would make sense to go with the historical probabilities. But when facing a decision about whether or not to lead the country into war, or to make some strategic acquisition, you cannot count on getting more than one attempt. Even if you could measure the probabilities, therefore, the difference between a 60 percent and 40 percent probability of success may not be terribly meaningful.
Like anticipating black swans, making one-off strategic decisions is therefore ill suited to statistical models or crowd wisdom. Nevertheless, these sorts of decisions have to get made all the time, and they are potentially the most consequential decisions that anyone makes. Is there a way to improve our success here as well? Unfortunately, there’s no clear answer to this question. A number of approaches have been tried over the years, but none of them has a consistently successful track record. In part that’s because the techniques can be difficult to implement correctly, but mostly it’s because of the problem raised in the previous chapter—that there is simply a level of uncertainty about the future that we’re stuck with, and this uncertainty inevitably introduces errors into the best-laid plans.
Ironically, in fact, the organizations that embody what would seem to be the best practices in strategy planning—organizations, for example, that possess great clarity of vision and that act decisively—can also be the most vulnerable to planning errors. The problem is what strategy consultant and author Michael Raynor calls the strategy paradox. In his book of the same name, Raynor illustrates the paradox by revisiting the case of Sony’s Betamax videocassette, which famously lost out to the cheaper, lower-quality VHS technology developed by Matsushita. According to conventional wisdom, Sony’s blunder was twofold: First, they focused on image quality over running time, thereby conceding VHS the advantage of being able to tape full-length movies. And second, they designed Betamax as a closed, proprietary format, whereas VHS was “open,” meaning that multiple manufacturers could compete to make the devices, thereby driving down the price. As the video-rental market exploded, VHS gained a small early lead in market share, and this small lead then grew rapidly through a process of cumulative advantage. The more people bought VHS recorders, the more stores stocked VHS tapes, and vice versa. The result over time was near-total saturation of the market by the VHS format and a humiliating defeat for Sony.15
What the conventional wisdom overlooks, however, is that Sony’s vision of the VCR wasn’t as a device for watching rented movies at all. Rather, Sony expected people to use VCRs to tape TV shows, allowing them to watch their favorite shows at their leisure. Considering the exploding popularity of the digital video recorders (DVRs) that are now used for precisely this purpose, Sony’s view of the future wasn’t implausible at all. And if it had come to pass, the superior picture quality of Betamax might well have made up for the extra cost, while the shorter taping time may have been irrelevant.16 Nor was it the case that Matsushita had any better inkling than Sony how fast the video-rental market would take off—indeed, an earlier experiment in movie rentals by the Palo Alto–based firm CTI had failed dramatically. Regardless, by the time it had become clear that home movie viewing, not taping TV shows, would be the killer app of the VCR, it was too late. Sony did their best to correct course, and in fact very quickly produced a longer-playing BII version, eliminating the initial advantage held by Matsushita. But it was all to no avail. Once VHS got a sufficient market lead, the resulting network effects were impossible to overcome. Sony’s failure, in other words, was not really the strategic blunder it is often made out to be, resulting instead from a shift in consumer demand that happened far more rapidly than anyone in the industry had anticipated.
Shortly after their debacle with Betamax, Sony made another big strategic bet on recording technology—this time with their MiniDisc players. Determined not to make the same mistake twice, Sony paid careful attention to where Betamax had gone wrong, and did their best to learn the appropriate lessons. In contrast with Betamax, Sony made sure that MiniDiscs had ample capacity to record whole albums. And mindful of the importance of content distribution to the outcome of the VCR wars, they acquired their own content repository in the form of Sony Music. At the time they were introduced in the early 1990s, MiniDiscs held clear technical advantages over the then-dominant CD format. In particular, MiniDiscs could record as well as play, and because they were smaller and more resistant to jolts they were better suited to portable devices. Recordable CDs, by contrast, required entirely new machines, which at the time were extremely expensive.
By all reasonable measures the MiniDisc should have been an outrageous success. And yet it bombed. What happened? In a nutshell, the Internet happened. The cost of memory plummeted, allowing people to store entire libraries of music on their personal computers. High-speed Internet connections allowed for peer-to-peer file sharing. Flash memory allowed for easy downloading of music to portable devices. And new websites for finding and downloading music abounded. The explosive growth of the Internet was not driven by the music business in particular, nor was Sony the only company that failed to anticipate the profound effect that the Internet would have on the production, distribution, and consumption of music. Nobody did. Sony, in other words, really was doing the best that anyone could have done to learn from the past and to anticipate the future—but they got rolled anyway, by forces beyond anyone’s ability to predict or control.
Surprisingly, the company that “got it right” in the music industry was Apple, with its combination of the iPod player and the iTunes store. In retrospect, Apple’s strategy looks visionary, and analysts and consumers alike fall over themselves to pay homage to Apple’s dedication to design and quality. Yet the iPod was exactly the kind of strategic play that the lessons of Betamax, not to mention Apple’s own experience in the PC market, should have taught them would fail. The iPod was large and expensive. It was based on a closed architecture that Apple refused to license, ran on proprietary software, and was actively resisted by the major content providers. Nevertheless, it was a smashing success. So in what sense was Apple’s strategy better than Sony’s? Yes, Apple had made a great product, but so had Sony. Yes, they looked ahead and did their best to see which way the technological winds were blowing, but so did Sony. And yes, once they made their choices, they stuck to them and executed brilliantly; but that’s exactly what Sony did as well. The only important difference, in Raynor’s view, was that Sony’s choices happened to be wrong while Apple’s happened to be right.17
This is the strategy paradox. The main cause of strategic failure, Raynor argues, is not bad strategy, but great strategy that just happens to be wrong. Bad strategy is characterized by lack of vision, muddled leadership, and inept execution—not the stuff of success for sure, but more likely to lead to persistent mediocrity than colossal failure. Great strategy, by contrast, is marked by clarity of vision, bold leadership, and laser-focused execution. When applied to just the right set of commitments, great strategy can lead to resounding success—as it did for Apple with the iPod—but it can also lead to resounding failure. Whether great strategy succeeds or fails therefore depends entirely on whether the initial vision happens to be right or not. And that is not just difficult to know in advance, but impossible.
The solution to the strategy paradox, Raynor argues, is to acknowledge openly that there are limits to what can be predicted, and to develop methods for planning that respect those limits. In particular, he recommends that planners look for ways to integrate what he calls strategic uncertainty—uncertainty about the future of the business you’re in—into the planning process itself. Raynor’s solution, in fact, is a variant of a much older planning technique called scenario planning, which was developed by Herman Kahn of the RAND Corporation in the 1950s as an aid for cold war military strategists. The basic idea of scenario planning is to create what strategy consultant Charles Perrottet calls “detailed, speculative, well thought out narratives of ‘future history.’ ” Critically, however, scenario planners attempt to sketch out a wide range of these hypothetical futures, where the main aim is not so much to decide which of these scenarios is most likely as to challenge possibly unstated assumptions that underpin existing strategies.18
In the early 1970s, for example, the economist and strategist Pierre Wack led a team at Royal Dutch/Shell that used scenario planning to test senior management’s assumptions about the future success of oil exploration efforts, the political stability of the Middle East, and the emergence of alternative energy technologies. Although the main scenarios were constructed in the relatively placid years of energy production before the oil shocks of the 1970s and the subsequent rise of OPEC—events that definitely fall into the black swan category—Wack later claimed that the main trends had indeed been captured in one of his scenarios, and that the company was as a result better prepared both to exploit emerging opportunities and to hedge against potential pitfalls.19
Once these scenarios have been sketched out, Raynor argues that planners should formulate not one strategy, but rather a portfolio of strategies, each of which is optimized for a given scenario. In addition, one must differentiate core elements that are common to all these strategies from contingent elements that appear in only one or a few of them. Managing strategic uncertainty is then a matter of creating “strategic flexibility” by building strategies around the core elements and hedging the contingent elements through investments in various strategic options. In the Betamax case, for example, Sony expected that the dominant use of VCRs would be taping TV shows for later viewing, but it did have some evidence from the CTI experiment that the dominant use might instead turn out to be home movie viewing. Faced with these possibilities, Sony adopted a traditional planning approach, deciding first which of these outcomes they considered more likely, and then optimizing their strategy around that outcome. Optimizing for strategic flexibility, by contrast, would have led Sony to identify elements that would have worked no matter which version of the future played out, and then to hedge the residual uncertainty, perhaps by tasking different operating divisions to develop higher- and lower-quality models to be sold at different price points.
Raynor’s approach to managing uncertainty through strategic flexibility is certainly intriguing. However, it is also a time-consuming process—constructing scenarios, deciding what is core and what is contingent, devising strategic hedges, and so on—that necessarily diverts attention from the equally important business of running a company. According to Raynor, the problem with most companies is that their senior management, meaning the board of directors and the top executives, spends too much time managing and optimizing their existing strategies—what he calls operational management—and not enough thinking through strategic uncertainty. Instead, he argues that they should devote all their time to managing strategic uncertainty, leaving the operational planning to division heads. As he puts it, “The board of directors and CEO of an organization should not be concerned primarily with the short-term performance of the organization, but instead occupy themselves with creating strategic options for the organization’s operating divisions.”20
Raynor’s justification for this radical proposal is that the only way to deal adequately with strategic uncertainty is to manage it continuously—“Once an organization has gone through the process of building scenarios, developing optimal strategies, and identifying and acquiring the desired portfolio of strategic options, it is time to do it all over again.” And if indeed strategic planning requires such a continuous loop, it does make a kind of sense that the best people to be doing it are senior management. Nevertheless, it is hard to imagine how senior managers can suddenly stop doing the sort of planning that got them promoted to senior management in the first place and start acting like an academic think tank. Nor does it seem likely that shareholders or even employees would tolerate a CEO who didn’t consider it his or her business to execute strategy or to worry about short-term performance.21 This isn’t to say that Raynor isn’t right—he may be—just that his proposals have not exactly been embraced by corporate America.
A more fundamental concern is that even if senior management did embrace Raynor’s brand of strategic management as their primary task, it may still not work. Consider the example of a Houston-based oilfield drilling company that engaged in a scenario-planning exercise around 1980. As shown in the accompanying figure, the planners identified three different scenarios that they considered to represent the full range of possible futures, and they plotted out the corresponding predicted yields—exactly what they were supposed to do. Unfortunately, none of the scenarios considered the possibility that the boom in oil exploration that had begun in 1980 might be a historical aberration. In fact, that’s exactly what it turned out to be, and as a result the actual future that unfolded wasn’t anywhere in the ballpark of possibilities that the participants had envisaged. Scenario planning, therefore, left the company just as unprepared for the future as if they hadn’t bothered to use the method at all. Arguably, in fact, the exercise had left them in an even worse position. Although it had accomplished its goal of challenging their initial assumptions, it had ultimately increased their confidence that they had considered the appropriate range of scenarios, which of course they hadn’t, and therefore left them even more vulnerable to surprise than before.22
Scenario planning gone wrong (reprinted from Schoemaker 1991)
Possibly, this bad outcome was merely a consequence of poor execution of scenario planning, not a fundamental limitation of the method.23 But how is a firm in the throes of a scenario analysis supposed to know that it isn’t making the same mistake as the oil producer? Perhaps Sony could have taken the home video market more seriously, but what killed them was really the speed with which it exploded. It’s hard to see how they could have anticipated that. Even worse, when developing the MiniDisc, it’s unclear how Sony could possibly have anticipated the complicated combination of technological, economic, and cultural changes that arrived in short order with the explosive growth of the Internet. As Raynor puts it, “Not only did everything that could go wrong for Sony actually go wrong, everything that went wrong had to go wrong in order to sink what was in fact a brilliantly conceived and executed strategy.”24 So although more flexibility in their strategy might have helped, it’s unclear how much flexibility they would have needed in order to adapt to such a radically shifting marketplace, or how they could have accomplished the requisite hedging without undermining their ability to execute any one strategy in particular.
Ultimately, the main problem with strategic flexibility as a planning approach is precisely the same problem that it is intended to solve—namely that in hindsight the trends that turned out to shape a given industry always appear obvious. And as a result, when we revisit history it is all too easy to persuade ourselves that had we been faced with a strategic decision “back then,” we could have boiled down the list of possible futures to a small number of contenders—including, of course, the one future that did in fact transpire. But when we look to our own future, what we see instead is myriad potential trends, any one of which could be game changing and most of which will prove fleeting or irrelevant. How are we to know which is which? And without knowing what is relevant, how wide a range of possibilities should we consider? Techniques like scenario planning can help managers think through these questions in a systematic way. Likewise, an emphasis on strategic flexibility can help them manage the uncertainty that the scenarios expose. But no matter how you slice it, strategic planning involves prediction, and prediction runs into the fundamental “prophecy” problem I discussed in the previous chapter—that we just can’t know what it is that we should be worrying about until after its importance has been revealed to us. An alternative approach, therefore—and the subject of the next chapter—is to rethink the whole philosophy of planning altogether, placing less emphasis on anticipating the future, or even multiple futures, and more on reacting to the present.