2 Learning from the Firefighters

In our first study of firefighters, my research team and I developed our methods and our basic model of naturalistic decision making. We came to this task in 1984, when the federal government published a notice asking for research proposals to study how people make decisions under time pressure. The request came from the U.S. Army Research Institute for the Behavioral and Social Sciences, which is in charge of studying the human side of the battlefield equation. The notice was sent out through a new program for small research companies such as my own. The entire description of what the army wanted was covered in a single paragraph:

Topic Description: Commanders, intelligence analysts, and others are often required to make decisions under conditions of uncertainty and severe time stress. Uncertainties may be associated with missing, incomplete, or ambiguous information, or with future outcomes that are unknown. Research is needed to: (1) better understand the cognitive processes (e.g., memory, judgment, or problem solving) of the decisionmaker under such conditions, and (2) suggest approaches for supporting the cognitive processes so that the overall quality and timeliness of decisions made under uncertainty and time stress are enhanced.

My research company wrote a short proposal (the notice limited proposals to twenty-five pages), and we won the contract. Years later, during discussions with some civilians who administer programs at the Army Research Institute, I got some insight into why our proposal was judged favorably. They explained to me that the U.S. government had spent millions of dollars in the 1970s and early 1980s finding out how people make decisions, and the army had used these findings to build very expensive decision aids for battle commanders in the field. Unfortunately, most of the aids were disappointing. No one would use them. After ten years of research and considerable expense, they were not much further along than when they had begun.

These civilian program directors were also concerned about how to train people to make better decisions. Turnover in the army is high, with people coming in for two- or four-year stints. Even officers who stay in for a full twenty years are rotated every few years. For example, a new tank commander may spend six months getting trained in the rudiments, then another year coming up to speed. That gives him little more than a year to help train other people before he is moved to his next rotation. How can these officers develop skills faster? Here again, the decision research program had been a disappointment. The experiments did not shed much light on how to train a new lieutenant to make effective decisions in controlling his tank platoon. The army has doctrine about how decisions should be made, but it seemed that soldiers usually did not follow the doctrine.

Recovering from a Research Plan

I am still amazed at how poorly I designed this study of decision making. We developed our recognitional model of decision making from this study and spent the next several years following up leads from our results, but almost every major design feature in my original plan was wrong.

In the following paragraphs, I list all the design features we had planned to use, along with our starting hypotheses. (You can try to guess which were the good ones and which were the bad ones.)

  1. Fireground commanders. We wanted to study fireground commanders—the people in charge of urban and suburban fires. They decide how to attack the fire and how to use the crews. They are highly experienced and take charge of life-threatening situations. If someone is injured or killed, they are responsible. Commanders work under a great deal of time pressure. In between the fires, they might tolerate a team of scientists asking them questions.
  2. Observers. We planned to train college undergraduates as observers and put them in firehouses or in radio communication with the fire dispatchers, so they could quickly get to the scene of new fires and observe the decision making on the spot. We planned to observe the commanders during the fires and then interview them afterward. If we had used our higher-priced and better-trained researchers, we would have wasted a lot of money having them sit idly in station houses waiting for action.
  3. Exceptional cases. We thought that the most interesting decisions to examine would be the most difficult ones, such as whether to try to extinguish a fire or to give up and make sure it does not spread, rather than the routine ones, such as where to park the trucks.
  4. Two-option hypothesis. We hypothesized that under time pressure, the commanders could not think of lots and lots of options. Instead, they would have to consider only two options: one that was intuitively the favorite, and the other to serve as a comparison to show why the favorite was better.
  5. Analogies. We expected to see lots of analogical reasoning. We believed that the commanders could use their experience like a memory bank, to recognize that a fire was just like one they had worked on previously. In this way, they could directly use their memory to make their decisions quickly.
  6. Data analysis. We believed that all we would need to do to test the two-option hypothesis was to count how often the commanders had compared lots of options versus how often they had compared just two options.

Only two of these six expectations worked out well. The others were wrong.

Studying fireground commanders turned out to be a good choice. These commanders showed us how people function under the stress of having to make choices with high stakes. Our later studies showed that military commanders use the same strategies as the fireground commanders.

The plan to use undergraduate observers was also a mistake; big fires turned out to be too rare for on-the-spot observation to pay off. Even if there had been enough fires, we should not have used relatively inexperienced undergraduates. For an initial study, we needed to be on the spot ourselves. Only after we knew what was happening could we have turned this task over to others. We can easily train inexperienced research assistants to collect data in a standardized laboratory experiment, but at this first step of observation, we needed researchers with more experience and sophistication.

Peer Soelberg taught a course on decision making at the MIT Sloan School of Management, where he trained students to perform the classical method of decision analysis, which we can call the rational choice strategy (a brief worked sketch in code follows the list below). The decision maker:

  1. Identifies the set of options.
  2. Identifies the ways of evaluating these options.
  3. Weights each evaluation dimension.
  4. Rates each option on each evaluation dimension.
  5. Picks the option with the highest score.
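
To make the arithmetic concrete, here is a minimal sketch of this weighted-sum strategy in Python. The options, evaluation dimensions, weights, and ratings are all invented for illustration; none of them come from Soelberg’s course.

    # Rational choice strategy as a weighted-sum decision matrix.
    # Every option, dimension, weight, and rating here is hypothetical.
    weights = {"salary": 0.5, "location": 0.3, "growth": 0.2}  # step 3: weight the dimensions
    ratings = {                                                # step 4: rate each option, 1-10
        "Job A": {"salary": 8, "location": 5, "growth": 7},
        "Job B": {"salary": 6, "location": 9, "growth": 8},
    }
    # Steps 1 and 2 are implicit in the dictionary keys; step 5 picks the top score.
    scores = {option: round(sum(weights[d] * r[d] for d in weights), 2)
              for option, r in ratings.items()}
    print(scores)                       # {'Job A': 6.9, 'Job B': 7.3}
    print(max(scores, key=scores.get))  # Job B

The sketch matters only as a contrast: the rest of this chapter shows that neither Soelberg’s students nor the fireground commanders actually ran through any such matrix.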

For his Ph.D. dissertation, Soelberg studied the decision strategies his students used to perform a natural task: selecting their jobs as they finished their degrees. He assumed that they would rely on the rational choice strategy.

He was wrong. His students showed little inclination toward systematic thinking. Instead they would make a gut choice. By interviewing his students, Soelberg found he could identify their favorite job choice and predict their ultimate choice with 87 percent accuracy—up to three weeks before the students themselves announced their choice.

Soelberg had trained his students to use rational methods, yet when it was time for them to make a rational and important choice, they would not do it. Soelberg was also a good observer, and he tried to capture the students’ actual decision strategies.

What did the students do during this time? If asked, they would deny that they had made a decision yet. For them, a decision was just what Soelberg had taught: a deliberated choice between two or more options. To feel that they had made such a decision, they had to go through a systematic process of evaluation. They selected one other candidate as a comparison, and then tried to show that their favorite was as good as or better than the comparison candidate on each evaluation dimension. Once they had shown this to their satisfaction (even if it meant fudging a little or finding ways to beef up their favorite), then they would announce as their decision the gut favorite that Soelberg had identified much earlier. They were not actually making a decision; they were constructing a justification.

We hypothesized that the fireground commanders would behave in the same way. We thought this hypothesis—that instead of considering lots of options they would consider only two—was daring. Actually, it was conservative. The commanders did not consider two. In fact, they did not seem to be comparing any options at all. This was disconcerting, and we discovered it at the first background discussion we had with a fireground commander, even before the real interviews. We asked the commander to tell us about some difficult decisions he had made.

“I don’t make decisions,” he announced to his startled listeners. “I don’t remember when I’ve ever made a decision.”

For researchers starting a study of decision making, this was unhappy news. Even worse, he insisted that fireground commanders never make decisions. We pressed him further. Surely there are decisions during a fire—decisions about whether to call a second alarm, where to send his crews, how to contain the fire.

He agreed that there were options, yet it was usually obvious what to do in any given situation. We soon realized that he was defining the making of a decision in the same way as Soelberg’s students—generating a set of options and evaluating them to find the best one. We call this strategy of examining two or more options at the same time, usually by comparing the strengths and weaknesses of each, comparative evaluation. He insisted that he never did it. There just was no time. The structure would burn down by the time he finished listing all the options, let alone evaluating them.

Because Soelberg’s theory was one of my favorites, we kept asking questions about the two-option hypothesis for much of this study. We never found any evidence for it.

Example 2.1
The Falling Billboards

Chief V, a veteran with about twenty-five years of firefighting experience, is in charge of putting out an apartment fire. He looks up and sees some billboards on the roof, then remembers a previous fire where the wooden supports for the billboards were burned through, sending the billboards crashing to the street below. He orders his crew to push the crowds farther back, to make sure no bystanders are injured by a falling billboard.

In this incident, the memory of an earlier experience led the commander to detect a possible danger and make a quick decision by issuing an order that would reduce that danger. The memory was of part of an incident, though, not of a whole fire.

As for the data analysis, it turned out not to be straightforward at all. We had expected to count the number of times the commanders used the Soelberg evaluation strategy of favorite versus comparison, as opposed to the number of times they used a more complete decision matrix. In fact, they were using a different strategy altogether.

Figuring Out How to Do the Project

Rather than waiting for the tough cases to happen, we asked the commanders to tell us about the big fires they had worked on during the previous few weeks or months. We treated each critical incident as a story and made the interview flow around the storytelling of the commanders. This method enabled us to get at the context of their decision making. It also ensured their interest and participation, because they enjoyed relating their experiences.

We have found the same thing in other studies. People who are good at what they do relish the chance to explain it to an appreciative audience. Once, one of our data collectors was interviewing the command staff of firefighters who worked on forest fires. She was doing the interviews during an actual fire that had spread over six mountains in Idaho and took weeks to bring under control. Even under these circumstances, she got their cooperation. In fact, firefighters who watched what she was doing but were not on her list to be contacted would ask her for permission to be interviewed. They wanted to explain to her and to themselves what had happened at critical times.

The study did not just consist of people telling us stories. It is important to select the right incident to study. To define what we want to learn from the stories, we plan a strategy, sometimes with checklists of items to cover so that if they do not emerge during the story, we can ask about them. Usually we send two people on an interview: one to lead the interview and get the story moving, and the other to take notes and review the checklist of probes to make sure everything has been covered.

Over the years, we have compiled lists of cognitive probes we use, such as asking how the person’s understanding of the situation changed during the episode or where someone with less experience might have faltered. We have learned where the expertise comes in during an incident, so we know where to probe more deeply. We have evolved ways of diagramming the incidents during the telling and afterward. New staff members take a short workshop on interviewing and then assist others for at least six months before they lead any interviews. (I cover some of the details in chapter 11, when I discuss storytelling.)

In this first study with fireground commanders, we needed to build a framework for conducting the interviews and guiding the stories. Roberta Calderwood, one of the research team members, took the lead in preparing interview guides and standardizing them to make it easy to listen to the stories and direct them where needed.

In these first interviews, we asked the participants if they could recall a recent event that had been nonroutine and had demanded special experience. Once we found such an incident, we asked the commanders to go through it, telling it in their own words. After we had a sense of the story, we would go through the incident again to pin down what happened and when. We tried to identify what we call decision points—times when several courses of action were open. We asked whether the commander thought about other courses of action, and if so, how the choice was made. If the commander had not considered other options, we asked why not, and what about the situation made it so obvious. We tape-recorded the interviews and took extensive notes, since we were not sure what we were after or what would be important later.