6

COGNITIVE DIVERSITY

The first gasoline-powered horseless carriages started to appear on English roadways in the early twentieth century, and with them came an entirely new domain for civil management: traffic and road safety. Not only were there no street signs or clear rules of the road, but the roads themselves were not designed with motorists in mind. And so a social movement was launched to improve them. For its part, the Motor Union of Great Britain and Ireland suggested that the owners of British estates clip their high hedges to enable drivers on the roads to see over them.

In response, on July 13, 1908, the following letter appeared in the Times (London), dashed off in haste by an angry gentleman named Colonel Willoughby Verner.

Dear Sir,

 

Before any of your readers may be induced to cut their hedges as suggested by the secretary of the Motor Union they may like to know my experience of having done so. Four years ago I cut down the hedges and shrubs to a height of 4ft for 30 yards back from the dangerous crossing in this hamlet. The results were twofold: the following summer my garden was smothered with dust caused by fast-driven cars, and the average pace of the passing cars was considerably increased. This was bad enough, but when the culprits secured by the police pleaded that “it was perfectly safe to go fast” because “they could see well at the corner,” I realized that I had made a mistake.

Since then I have let my hedges and shrubs grow, and by planting roses and hops have raised a screen 8ft to 10ft high, by which means the garden is sheltered to some degree from the dust and the speed of many passing cars sensibly diminished. For it is perfectly plain that there are a large number of motorists who can only be induced to go at a reasonable speed at cross-roads by consideration for their own personal safety.

Hence the advantage to the public of automatically fostering this spirit as I am now doing. To cut hedges is a direct encouragement to reckless driving.

Your obedient servant,

Willoughby Verner      

Are all safety benefits simply consumed as performance benefits? Are human beings fundamentally wired to try to game the system for greater efficiency and reward? Engineering efforts are forever trying to help us become safer as a society but, ironically, these same innovations often push us closer and closer to the threshold of danger.

In 1815, Sir Humphry Davy, president of the Royal Society, invented a safety lamp for miners—the Davy lamp—that earned a reputation as one of the most significant safety improvements in the history of mining. When the lamp was put into use in the country’s mines, however, not only did the explosions and fatalities not decrease in number, they actually increased. How could this be?

As it turned out, the lamp operated at a temperature below the ignition point of methane, thereby permitting the extension of mining into methane-rich—and increasingly dangerous—atmospheres. What started out as an attempt at a safety measure ended up only pushing the system closer to its edge. This tendency—to slowly erode the controlling mechanisms that are present to ensure safety—is a central dynamic in many robust-yet-fragile systems, especially social ones.

•     •     •

In 1975, Sam Peltzman, a University of Chicago economist, analyzed all the federal auto-safety standards imposed throughout the late 1960s. In his published account, he concluded that though these standards provided greater safety for vehicle occupants, they also led to the deaths of pedestrians, cyclists, and other drivers on the road.

John Adams, professor of geography at University College London, has written on the subject of risk for more than two decades. In 1981, he published a now famous study on the impact of seat belts on highway fatalities.

“Why, in country after country that mandated seat belts, was it impossible to see the promised reduction in road accident fatalities?” Adams wrote in one of his many essays on risk. “It appears that measures that protect drivers from the consequences of bad driving encourage bad driving. The principal effect of seat belt legislation has been a shift in the burden of risk from those already best protected in cars, to the most vulnerable, pedestrians and cyclists, outside cars.”

Adams, along with a growing cadre of behavioral scientists and risk analysts, started to group these counterintuitive findings under the concept of risk compensation, the idea that humans have an inborn tolerance for risk. As safety features are added to vehicles and roads, drivers feel less vulnerable and tend to take more chances. The feeling of greater security tempts us to be more reckless.

The phenomenon can be observed in all aspects of our daily lives. Children who wear protective gear during their games have a tendency to take more physical risks. Public health officials have noted that improved HIV treatment has led to riskier and riskier sexual behavior. Forest rangers report that hikers take more risks when they think a rescuer can access them easily. Perhaps no one has put it better than Bill Booth, famous skydiver, who coined Booth’s rule number 2: “The safer skydiving gear becomes, the more chances skydivers will take, in order to keep the fatality rate constant.”

Most social scientists agree that risk compensation exists but, in 1976, Gerald Wilde devised a model that pushed the theory of risk thresholds even further. In collaboration with John Adams, Wilde postulated that humans are constantly balancing risk, safety, and reward in a dance that is analogous to a thermostat. The setting of the thermostat varies from one individual to another, from one group to another, from one culture to another. According to Adams, “Some like it hot—a Hell’s Angel or a Grand Prix racing driver, for example; others like it cool—a Mr. Milquetoast or a little old lady named Prudence. But no one wants absolute zero.” Wilde called this the “theory of risk homeostasis”: All individuals become accustomed to some acceptable level of risk—a risk temperature—so when they are required to reduce risk in one area of their life, they will find themselves, consciously or unconsciously, increasing other risks until they are back in their risk temperature comfort zone. If they are required to wear seat belts, the evidence suggests they drive faster, pass other cars more dangerously, put on makeup while driving—you name it—just to stay in their comfort zone. In effect, they consume the additional safety they are required to have by changing their driving behavior so as to attain other desirable ends.

The concept of homeostasis can be seen at work in many biological systems. Despite the widely fluctuating temperature outside, our bodies maintain a core temperature of around 98.6 degrees Fahrenheit. If the heat outside notches that number up, count on our bodies to perspire to cool us down. Such regulating mechanisms make up the essential machinery of all living systems. When the brain doesn’t have enough glucose to function properly, the liver kicks into gear, breaking down stored glycogen into glucose, which the blood carries to the brain to restore its levels, maintaining homeostasis in the whole system.

There is even evidence that organisms exhibit homeostasis at the population level. Mice kept in cages, amply fed and without any predators, do not grow in numbers indefinitely. Scientists find that after a certain optimum population is achieved, there is a decrease in ovulation and reproduction in the females. Guppies kept in tanks exhibit the same population homeostasis, but their numbers are kept in check by cannibalism of the young just after birth.

Risk homeostasis functions through similar feedback mechanisms. Think of your own household furnace. This is a simple but elegant example of homeostasis. The thermostat is connected to a thermometer by a switch so that when the temperature in the house drops to the level selected, that same switch turns on the furnace. As the furnace brings more and more heat into the house, the thermometer rises until it reaches the level of homeostasis—the optimum temperature—and then it switches off the furnace and shuts down the heat. When the temperature cools down again, that information is fed back into the furnace through the thermostat and the whole cycle begins again.
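The furnace cycle described above can be sketched as a small simulation. The set point, heating rate, and leak rate below are invented purely for illustration, but the structure is the one the paragraph describes: a switch that flips the furnace on below the target and off above it, producing a steady oscillation around the comfort zone.

```python
# A minimal thermostat simulation. The switch turns the furnace on when the
# house drops below the set point and off once it climbs back above it,
# producing the oscillation described above. All numbers are illustrative.

def simulate(set_point=68.0, outside=40.0, hours=24, leak=0.1, heat=1.5):
    temp, furnace_on, history = set_point, False, []
    for _ in range(hours * 60):          # one-minute time steps
        if temp < set_point - 0.5:       # too cold: switch flips the furnace on
            furnace_on = True
        elif temp > set_point + 0.5:     # warm enough: switch shuts it off
            furnace_on = False
        temp += heat / 60 if furnace_on else 0.0
        temp -= leak / 60 * (temp - outside) / 10   # heat leaking outside
        history.append(temp)
    return history

temps = simulate()
# The house never wanders far from the set point: the feedback loop
# keeps every reading inside a narrow comfort band around 68 degrees.
print(min(temps), max(temps))
```

The same loop, with "risk temperature" in place of room temperature, is Wilde's picture of the driver who speeds up as soon as the road feels too safe.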

These same oscillations, Wilde argues, occur when our risk thermostats are in a feedback loop with our surroundings, leading us to modify our behavior. Most of us slow down when we are driving in a snowstorm. Some men and most women think twice before walking down a dark alley in an unknown part of town. If a floor is slippery, we are more likely to tread carefully. These behaviors will strike us as simple common sense.

Less intuitive is the risky behavior that creates a balancing feedback loop to the safety mechanisms of policy and regulation. A three-year study of taxicabs in Munich, Germany, concluded that cabs with antilock brakes had slightly more accidents than those without. Even more disturbing, the cabs’ accelerometers showed clearly that the drivers with the safer cars accelerated faster and stopped harder.

Perhaps the most insidious example is the oft-cited study by Kip Viscusi of Duke University regarding the Consumer Product Safety Commission’s mandate for childproof aspirin caps:

 

A much more surprising result was the pattern displayed by poisoning rates after the advent of safety caps. For those products covered by safety caps, there was no downward shift in poisoning rates. This ineffectiveness appears to be attributable in part to increased parental irresponsibility, such as leaving the caps off bottles. This lulling effect in turn led to a higher level of poisonings for related products not protected by the caps.

 

Wilde likens our appetite for risk to a river delta. If the river divides into three channels before emptying out into the ocean, we can’t simply dam up two of the three channels and make the flow of water disappear. Our desire for risk, like a flowing river, will widen out the one remaining channel or open entirely new channels. In other words, if we make a risky activity like skydiving illegal, Wilde posits, all of the former skydivers are not likely to take up basket weaving. Instead they will innovate and develop new risk-laden fads until they are back in their risk thermostat comfort zone.

What if Wilde’s theory applies not just to individual choices but, more disconcertingly, to the culture of an entire community or organization?

CULTURES OF RISK

By now, the legacy of the BP Deepwater Horizon oil well disaster on April 20, 2010, which killed eleven and caused the largest man-made environmental catastrophe in U.S. history, is well documented.

Less appreciated, at least at the time, was that this disaster was part of a regular pattern at BP. Throughout the decade leading up to the spill, a serious catastrophe had been associated with the company about every other year. In 2003, a BP rig in the North Sea experienced a massive upwelling of gas that nearly destroyed the platform; in 2005, a BP refinery exploded in Texas City, Texas, killing fifteen workers; in 2006, a BP pipeline on the North Slope of Alaska ruptured, spilling 200,000 gallons of crude oil.

In 2007, Carolyn Merritt, chairman of the U.S. Chemical Safety Board, led an investigation of the Texas City explosion and noted, “As the investigation unfolded, we were absolutely terrified that such a [poor safety] culture could exist at BP.”

That same year, following a sex scandal, BP replaced its larger-than-life CEO, Lord John Browne. Browne had led the company for a decade, but he had earned a reputation for neglecting day-to-day concerns—including safety—in favor of large-scale deal making. Incoming CEO Tony Hayward noted that BP’s practices “failed to meet our own standards and the requirements of the law” and promised to “focus like a laser” on the company’s accident record.

Unfortunately, BP’s culture proved resistant to Hayward’s beam. In 2009, two years into his term, the Occupational Safety and Health Administration (OSHA) documented more than seven hundred violations at the very same Texas City refinery where the deadly 2005 explosion had occurred just four years before. The agency fined the company $87.4 million, more than four times the amount it had fined the company for the original 2005 explosion. Dangerous situations also continued to plague the company’s operations in Alaska during the same time period.

These fines and admonitions had little impact on the company, whose leadership increasingly came to see them as the cost of growth in the high-stakes world of global energy extraction. Instead, BP baked high-risk decision making into its engineering and operations plans, replacing normal procedures with ones designed to save money.

On the day of the Deepwater Horizon accident, for example, BP decided to replace heavy drilling mud with lighter seawater to seal the well; the maneuver was designed to accelerate a process that was running behind schedule and costing the company some $750,000 a day. But the procedure was untested, and workers on the rig, including the chief driller, Dewey Revette, expressed grave concerns. They were overruled. Just as the workers feared, the lighter material provided insufficient downward pressure to keep errant gas from escaping, leading directly to the blowout. Revette and ten others were killed.

According to survivors of the disaster and other former employees of BP, for years it was tacitly understood that if you raised safety concerns, you could get fired from the company. Oberon Houston, a former employee who had previously served as the deputy offshore installation manager of a BP-operated production platform in the North Sea, commented in a recent blog post, “BP management focused heavily on the easy part of safety, holding the hand rails, spending hours discussing the merits of reverse parking and the dangers of not having a lid on a coffee cup, but were less enthusiastic about the hard stuff, investing in and maintaining their complex facilities.

“A continual focus on costs and an undoubted commercial savvy was not complemented with similar expertise, or enthusiasm, for the nuts and bolts of the job,” Houston added. “Management listened intently to the views of market analysts, who knew little about the technical detail of the oil business, but instead were driven by quarterly results; encouraging and cheering on management’s relentless drive to reduce costs. This resulted in a chronic short term view at the very top of the company.”

The Deepwater Horizon spill epitomizes the central role of culture in both amplifying and mediating risk and in creating or obliterating the conditions for greater organizational resilience—and not just in the world of offshore drilling. For example, in a 2009 survey of almost five hundred bank executives conducted by the consulting firm KPMG, almost half—48 percent—of respondents cited the financial firms’ risk culture as a leading contributor to the financial crisis. More than half—58 percent—of corporate board members and internal auditors included in the survey said that their company’s employees had little or no understanding of how risks should even be assessed.

In the absence of any contravening internal signals, one or another level of risk homeostasis naturally takes root, and increasingly narrow styles of thought take hold. Those who espouse the dominant perspective and values are rewarded and promoted; those who espouse different norms can be systematically undermined and driven away. For good or ill, every time that happens, silent but powerful cues are sent to every individual who remains: You can’t change it. It’s just how things are done here. This is how you need to think and act if you want to succeed. Think back to the way Geoffrey West championed the dynamism of cities through their diversity, the presence of “crazy” people. Without these crazies—people capable of dissent—fragility sets in.

Can anything be done to reverse the resulting myopia? As the U.S. Army is discovering, one way is to build and deploy a corps of professional skeptics.

RED TEAM UNIVERSITY

Fort Leavenworth, located in its namesake town of Leavenworth, Kansas, is the oldest active army post west of the Mississippi River, in continuous operation since 1827. Given its heritage and stately architecture, the casual observer might confuse many of its buildings for those of a staid eastern college campus, yet it’s actually home to a far more visionary educational institution: the University of Foreign Military and Cultural Studies, more commonly known by its nickname, Red Team University.

Started in 2004 by a group led by retired army colonel Greg Fontenot, Red Team U is an effort to train professional military devil’s advocates—field operatives who bring critical thinking to the battlefield and help commanding officers avoid the perils of overconfidence, strategic brittleness, and groupthink.

First described by Irving Janis in 1972 to explain fiascos like the Bay of Pigs invasion, groupthink is an organizational pathology that can occur within any tightly knit group of people who depend on social cohesion to operate—a nearly perfect working description of units of fighting soldiers on the battlefield. Its hallmarks include a strong illusion of invulnerability by key decision makers; a belief in the inherent morality of the group; the stereotyping of those who do not agree with the group’s perspective; and overly simplistic moral formulations that dissuade deeper rational analysis. Self-appointed thought-guards prevent alternative views from being aired and place significant pressure on dissenters, leading to the illusion of unanimity, even if dissent is rampant below the surface. It’s this kind of cultural and cognitive insularity that can get soldiers killed and needlessly prolong wars, and Fontenot and his team are on a mission to stamp it out.

Fontenot is bald with dark glasses, a penchant for cigars, and a piercing, no-nonsense gaze that suggests he may have recently made more important decisions than speaking to you. He appears to be every bit the straight-from-central-casting, front-line tank commander he was for almost three decades. Until he opens his mouth, at which point he may, in a measured drawl, turn the conversation with equal erudition from an analysis of military strategies against asymmetric threats like al-Qaeda, to the Jungian archetypes of North Korean leaders, to culturally illuminating concepts of Chinese philosophy. This is, after all, a man who taught history at West Point.

Fontenot’s extensive military career had given him a front-row seat to the entire parade of post–cold war battles fought by the United States and firsthand experience both with the evolving purposes to which American military power would be seconded and with the creeping dangers of groupthink. He commanded a battalion in the army’s point division that broke the Saddam line in the First Gulf War and later commanded the first brigade to enter Bosnia. But he was unprepared for what he found when he arrived.

“I had studied World War I extensively, and so I was excited to get to Bosnia. But when I got there, I found we didn’t understand the operating environment at all. Everyone was killing one another over issues only they understood. To us, it was incomprehensible. Everyone looked the same, and we couldn’t imagine how they perceived us. It was humbling. Walking in, I thought I knew something about it. I quickly realized I knew nothing.”

With insufficient cultural awareness, Fontenot watched as units on the ground struggled to make sense of it all. When in doubt, the soldiers frequently reverted to what they knew, falling back into rote practices and mind-sets that were not necessarily well adapted to the circumstances in which they found themselves. Echoing Arquilla’s predictions of wars yet to come, the Department of Killing People and Breaking Things was being asked to try to put things back together, or at least keep things from falling apart further.

“In that kind of situation, culture is everything,” says Fontenot. “It’s not just about understanding the combatants’ capabilities, it’s about understanding how they think. The Balkans have historically been a bad neighborhood, with ancient enmities that were largely opaque to outsiders. Our forces were there with the right hardware, but without the right cultural software, the likelihood of things being misinterpreted goes up dramatically. One man’s act of deterrence is another’s act of war.”

•     •     •

To combat surprises in the field, militaries around the world have long engaged in war gaming. In these simulated rehearsals of real conflict, the home team is assigned the color blue and the enemy the color red. The blue team develops plans for the exercise, while the opposing red team attempts to either defend a position or disrupt blue’s operations. There are significant limitations to these exercises: For one thing, such games commonly hew to the blue team’s plans, without allowing the red team to alter its strategy and influence the blue team’s tactics. But even so, war games enable militaries to think about how enemies, civilians, and partners—collectively “the others” on a battlefield—might respond to a hypothetical situation. In an age dominated by increasingly complex, low-intensity, coalition-style fighting and outpost and outreach antiterrorism and counterinsurgency operations, the need for such insights has accelerated dramatically.

Fontenot wanted to take this “think like the others” war gaming approach to new heights of sophistication, remove it from the simulated context altogether, and embed it in real units doing real fighting and real peacekeeping. In 2004, Red Team University was launched; today almost three hundred graduates are operating in the field around the world.

The eighteen-week Red Team course offers an intensive, and intentionally eclectic, survey of connected ideas in everything from military theory to strategies for negotiation, business modeling to terrorism and counterinsurgency, mixed with case studies and a heavy dose of anthropology.

Fontenot and the instructors are constantly evolving the syllabus, which relies on a mix of well-known works on creative thinking and behavioral economics; strategic in-depth analyses of current concerns like Iraq and Iran, the Middle East, terror networks, and North Korea; and lesser-known treatises on philosophy and cultural criticism. A good example text is The Propensity of Things by French sinologist François Jullien, which explores the Chinese notion of shi, a term with multiple connotations and no direct English translation, which is intrinsic to a wide array of Chinese thought, military and otherwise. Shi encompasses notions of power, relationship, and circumstance, though Jullien translates it as “propensity,” or a tendency that, like a seed, germinates within a situation. Once the propensity of a situation is set off, it can’t be stopped until the situation comes back into equilibrium. Thus, according to Chinese thought, a great power imbalance contains within itself not merely the potential, but the propensity for a great rebalancing. If one understands and designs for the propensity of the actors on a battlefield and can shape the energetic forces that are already playing out, conflict itself may be avoided, even as the desired outcome is achieved.

Like many of the concepts introduced at Red Team U, learning about shi requires embracing new cultural frameworks and new modes of thought, in this case a shift from thinking about stockpiles and objects—the standard units of military force—to flows and relationships. This is a style of thinking to which few of the participating officers have previously been exposed.

The broad conceptual portfolio is a welcome respite for many uniformed officers who are used to narrower, more top-down teaching approaches, but this is not some lightweight seminar: There are 250 pages of reading and analysis, on average, every evening. The program is designed to encourage Red Team officers to think laterally, challenge and de-bias their assumptions, ask tough questions of a commander in the field, and help them consider less-obvious cultural perceptions of and by U.S. coalition partners, adversaries, and others. “The goal,” says Fontenot, “is to help them to escape the conventions of the Western military mind-set, so they can in turn help others see past it.”

Of course, making connections in a classroom is one thing; working with commanders in the field is another. Initial skepticism about the Red Team University approach in the field is expected; graduates sometimes face outright hostility. “Some of the flak results from the nature of the command-and-control structure itself, some of it is just a by-product of people working extremely hard and not being able to step back and see the larger picture,” says Steve Hall, a graduate of the program. “The senior guys, they want this, but some of the tactical guys give us some pushback—they say, ‘I don’t need someone looking over my shoulder.’”

That’s why a big part of Red Team U’s curriculum is focused on how to effectively pitch new ideas to soldiers in the field. “Our job isn’t to second-guess the commanders, it’s to help them become better thinkers—to consider perspectives and options that they wouldn’t normally,” adds Hall. Like all Red Team University graduates, he was taught to raise issues, then back away if things become too contentious. Pushing too hard can paralyze a group with indecision—exactly the opposite of what Red Teamers are trying to achieve. “We focus on the psychology of positive reinforcement, of suggestion as much as respectful challenge. You have to figure out how to sell the boss.” During the program, officers deconstruct scenes from films like The Godfather, exploring the tactics of the consigliere character played by Robert Duvall to mine ideas for effectively communicating externalities up the chain of command.

Hall adds that Red Team University’s approach is particularly timely, given that much of the training received by today’s officers was designed for entirely different kinds of conflict. “I was trained as a Cobra helicopter pilot,” says Hall. “We were trained to think in terms of dealing with the Soviet Union. We would fly our helicopters low and slow, at night, to avoid radar—and we’d worry about tanks. Now, in Iraq, none of that is relevant: You fly during the day, there is no radar to avoid, and you worry about Kalashnikov machine guns, an outdated weapon that can still kill you.”

“Our underlying assumption is not that people are evil, lazy, or incapable, but that it’s just hard to critique your own work when you’re doing it,” adds Fontenot. “People reason by analogy, and it’s hard to recognize your own untested hypotheses. If someone doesn’t challenge them, hubris can set in, bred of custom and complacency.”

•     •     •

In breaking up that complacency, Red Team increases what Scott Page, a professor of complex systems, political science, and economics at the University of Michigan, calls the “cognitive diversity” of the team—the distribution of different kinds of thinkers within each group. Mathematical modeling suggests that improving cognitive diversity within a team can lead to vastly better outcomes. Page points to the diversity prediction theorem, a mathematical identity showing that a crowd’s collective error equals its members’ average individual error minus the diversity of their predictions.

What does that mean exactly? To achieve a truly wise crowd with accurate predictive skills, you either need an extremely smart crowd (high ability) or a crowd of people who are moderately smart but also cognitively diverse (high diversity). Both ability and diversity contribute equally and positively—a deeply diverse team can be as good as a deeply talented one. Using a method similar to the portfolio approach we discussed in chapter 1, remixing talent into highly cognitively diverse teams produces safer, more resilient, and better-performing groups.

The performance benefits of cognitive diversity have been experimentally validated by Kevin Dunbar, a psychologist at the University of Toronto who has explored its impact in another highly rigorous and regimented setting: scientific research labs. Over the course of a year, Dunbar and his team studied the working habits of four molecular biology labs. In a turnabout of normal affairs, the scientists were the subjects, and Dunbar studied them in their habitat, much as a primatologist studies chimps in the wild. He attended regular lab meetings where scientists present their research and current problems to one another; reviewed their data and their interim work product; and spent time in the lab interviewing, observing, and just hanging out.

What did he find? Contrary to stereotypical notions of science as a rational, if tedious, process, “the actual process of science is surprisingly sloppy and fraught with uncertainty—you are constantly confronted with outcomes you didn’t expect,” says Dunbar. “When a scientist gets a surprising result—which can be as often as fifty percent of the time—they have to ask themselves: Was this the result of some methodological error on my part, a problem with the equipment, or is it a significant new result? What does it mean?”

When answering those questions, Dunbar found that scientists, like most people, tend to explain unexpected results through analogy. But there were big differences in how groups of scientists reasoned together. At lab meetings with lots of scientists from a single field, unexpected results tended to be interpreted through the lens of narrower, local analogies. For example, in a lab filled entirely with E. coli researchers, unexpected results would be interpreted almost exclusively in the light of prior E. coli findings, and these labs generally made slower progress. In contrast, labs filled with a more diverse array of scientists tended to use broader, more long-distance analogies, drawing on concepts and prior results well outside the target field of study, and tended to progress more quickly.

Dunbar makes (what else?) an analogy: “Local analogies are like pawns on a chess board—they only allow you to move in limited ways and require you to exhaustively check a large number of modestly different possibilities. Longer-distance analogies, on the other hand, are like queens, allowing you to rapidly travel to entirely different parts of the search space of solutions.”

The effect can be profound. “By coincidence, in one week, we happened to observe two different molecular biology labs confronting exactly the same technical problem. Lab A was headed by a brilliant scientist but a rather self-similar scientific staff. Lab B was filled with diverse scientists—a chemist, an MD, a geneticist, and so on. Two members of Lab B solved the problem in two minutes at a meeting, and Lab A was still working incrementally on the problem two months later.”

The diversity of Lab B is not free: it takes additional time to harmonize and integrate diverse team members, which can be seen as imposing a multidisciplinarity tax and modest suboptimality on the group, in exchange for dramatically better results in getting out of the inevitable jams when they occur. And the members can’t be so diverse that they diverge in terms of values, differ on the end goal, or can’t bridge their methodological differences. “We found the labs that harnessed their cognitive diversity occupied a warm zone where there were meaningful disciplinary differences between team members but not irreconcilable ones. Most important, the members were united by a commonly understood goal, whether that was hunting a virus or unlocking a gene mechanism. When all of those pieces are in place, the team members were motivated to spend lots of time explaining themselves to one another and developing their own shared terms and language; new participants end up learning that special, cross-disciplinary language as part of the process of integration.”

Equally important is that these teams employed the right kinds of analogies at the right time. “Over and over again, we found that local analogies were useful for fixing experiments that weren’t working; regional analogies for coming up with new hypotheses; and long-distance analogies for explaining things to nonspecialists.”

Dunbar’s warm zone findings with scientific teams are mirrored by additional research by Sinan Aral at New York University. Aral and a team of colleagues studied the email correspondence of nearly 1,400 teams of executive recruiters over five years and found that graphs of their productivity followed an inverted U shape. Like the scientists, when the recruiters were too closely aligned in terms of their subject-matter expertise and social networks, they were less productive at finding good candidates for open positions. On the other hand, when the recruiters were working far outside their area of expertise and their social networks, they were similarly less productive. But in the middle of these two extremes, where team members shared the same mental models but brought a moderate amount of diversity to the tasks at hand, productivity skyrocketed. “With the right amount of diversity, the volume, speed, and revenue generated by the work all went up.”

Dunbar’s team also found that cognitive diversity and actual diversity reinforce each other. For instance, as a rule, when male scientists received an unexpected finding, they assumed they knew what the cause was and went ahead anyway, which frequently sent them charging down blind alleys. Female scientists, on the other hand, tried more often to replicate results in order to find out why they got the unexpected findings. “There’s a pernicious view that women are passive or that only those who act like men can compete,” Dunbar said. “But in these labs we studied, the women were just as aggressive as the men—they just approached the unexpected in a completely different way.”

In addition to documenting the way these scientists worked, Dunbar also studied the way their brains responded to both expected and unexpected results. In cases where subjects were confronted with an expected result, the areas of the brain responsible for putting that information into memory were activated, akin to a reward for good behavior: I got what I wanted, I’ll remember this! When confronted with an unexpected result, however, the information was often not committed to memory at all. “Sometimes, data that you don’t like, you don’t even process,” Dunbar said.

This is the neurological manifestation of the widely observed confirmation bias, a person’s tendency to favor information that confirms his assumptions, preconceptions, or hypotheses, whether or not they are actually true. The English psychologist Peter Wason first demonstrated it in 1960 with a simple but powerful test. Wason would present experimental subjects with a triplet of numbers, say 2–4–6, and each would have to guess what rule tied the three numbers together. Subjects would do this by providing their own triplet back to the experimenter, who would either confirm the guess as conforming to the rule, or not.

When presented with the triplet above as a starting point, most subjects started by forming an initial hypothesis about them: They are a sequence of even numbers. They would test this hypothesis with a few guesses, such as 4–6–8, 4–8–12, and then perhaps 8–10–12, all of which would generate a positive confirmation from the experimenter, and then they would stop, confident they had confirmed the rule. Only they hadn’t: The rule was that the triplet contained ascending numbers, not even numbers. Surprisingly few people generated any guesses that would disconfirm their hypothesis (such as 8–6–4, 1–2–3, or 2–2–2) and only one in five subjects guessed the underlying rule correctly.

The subconscious aversion underlying confirmation bias is rooted in our universal distaste for finding out that we might be wrong. For evidence of its power, try a simple experiment: Post something politically opposite to your own beliefs on a forum like Twitter or Facebook, and you will almost certainly see instantaneous churn among your followers, as some stop following and others start, each manifesting his or her own confirmation biases.

This phenomenon not only causes us to avoid messages we don’t agree with, it can shade our interpretation of those we can’t avoid. In 2009, Heather LaMarre and her colleagues at Ohio State University found that when watching The Colbert Report, politically conservative U.S. viewers were more likely to report that Colbert disliked liberalism, only pretended to be joking, and genuinely meant what he said, while liberal viewers were more likely to report that Colbert used satire to mock conservatives and was not serious when offering conservative political statements with which they disagreed. Everyone agrees he’s funny, but the average viewer’s interpretation of whether Colbert is actually a liberal or a conservative strongly predicts whether the viewer is one or the other: The show serves as an elegant mirror, affirming our identity even as it confirms our prejudices.

If we so readily make mistakes about something as ubiquitous and designed for relevance as The Colbert Report, imagine the challenge for the army’s Red Teamers, embedded with forces operating in culturally foreign and sporadically lethal environments, where a command-and-control organizational structure dominates, and where confirmation bias can literally kill.

To be effective, they must challenge—supportively—where they can, expanding the conceptual search space of senior commanders’ thinking. They must give voice to unpopular, unconventional, or unorthodox views of strategies that may have been authored by the very commanders they serve. They must maintain a warm zone of respectful challenge and supportive dissent, becoming neither co-opted nor ostracized by the larger chain of command, while confronting the omnipresent dangers of homeostasis, groupthink, and confirmation bias. It can be difficult for the Red Teamers to show they’re having an impact, as theirs is a long game of institutional cultural change, of vigilantly challenging implicit norms and biases before they harden. If they’re doing their job well, they leave no fingerprints, just better decisions in their wake.

For their commanders, and indeed all leaders, the lessons are also clear: Resilient cultures are rooted in diversity and difference and are tolerant of occasional dissent. These factors protect the alternative search spaces that are so vital to any community struggling to change maladaptive cultural norms. The success of Red Team University—an intervention designed by the military for the military—is due, in no small part, to the social credibility of its professional skeptics. Though they are trained to be outsiders, they are still very much in the culture, allowing them to work in the diversity warm zone. As we’ll see next, embedding such resilience-enhancing interventions authentically within a community’s culture in this way isn’t just preferable—it’s essential to making them work.