If only it weren’t for the people . . . earth would be an engineer’s paradise.
—Kurt Vonnegut
As an avid scuba diver, Luca Parmitano was familiar with the risks of drowning. He just didn’t realize it could happen in outer space.
Luca had just become the youngest astronaut ever to take a long trip to the International Space Station. In July 2013, the thirty-six-year-old Italian astronaut completed his first spacewalk, spending six hours running experiments, moving equipment, and setting up power and data cables. Now, a week later, Luca and another astronaut, Chris Cassidy, were heading out for a second walk to continue their work and do some maintenance. As they prepared to leave the airlock, they could see the Earth 250 miles below.
After forty-four minutes in space, Luca felt something strange: the back of his head seemed to be wet. He wasn’t sure where the water was coming from. It wasn’t just a nuisance; it could cut off communication by shorting out his microphone or earphones. He reported the problem to Mission Control in Houston, and Chris asked if he was sweating. “I am sweating,” Luca said, “but it feels like a lot of water. It’s not going anywhere, it’s just in my Snoopy cap. Just FYI.” He went back to work.
The officer in charge of spacewalks, Karina Eversley, knew something was wrong. That’s not normal, she thought, and quickly recruited a team of experts to compile questions for Luca. Was the amount of liquid increasing? Luca couldn’t tell. Was he sure it was water? When he stuck out his tongue to capture a few of the drops that were floating in his helmet, the taste was metallic.
Mission Control made the call to terminate the spacewalk early. Luca and Chris had to split up to follow their tethers, which were routed in opposite directions. To get around an antenna, Luca flipped over. Suddenly, he couldn’t see clearly or breathe through his nose—globs of water were covering his eyes and filling his nostrils. The water was continuing to accumulate, and if it reached his mouth he could drown. His only hope was to navigate quickly back to the airlock. As the sun set, Luca was surrounded by darkness, with only a small headlight to guide him. Then his comms went down, too—he couldn’t hear himself or anyone else speak.
Luca managed to find his way back to the outer hatch of the airlock, using his memory and the tension in his tether. He was still in grave danger: before he could remove his helmet, he would have to wait for Chris to close the hatch and repressurize the airlock. For several agonizing minutes of silence, it was unclear whether he would survive. When it was finally safe to remove his helmet, a quart and a half of water was in it, but Luca was alive. Months later, the incident would be called the “scariest wardrobe malfunction in NASA history.”
The technical updates followed swiftly. The spacesuit engineers traced the leak to a fan/pump/separator, which they replaced moving forward. They also added a breathing tube that works like a snorkel and a pad to absorb water inside the helmet. Yet the biggest error wasn’t technical—it was human.
When Luca had returned from his first spacewalk a week earlier, he had noticed some droplets of water in his helmet. He and Chris assumed they were the result of a leak in the bag that provided drinking water in his suit, and the crew in Houston agreed. Just to be safe, they replaced the bag, but that was the end of the discussion.
The space station chief engineer, Chris Hansen, led the eventual investigation into what had gone wrong with Luca’s suit. “The occurrence of minor amounts of water in the helmet was normalized,” Chris told me. In the space station community, the “perception was that drink bags leak, which led to an acceptance that it was a likely explanation without digging deeper into it.”
Luca’s scare wasn’t the first time that NASA’s failure at rethinking had proven disastrous. In 1986, the space shuttle Challenger exploded after a catastrophically shallow analysis of the risk that circular gaskets called O-rings could fail. Although this had been identified as a launch constraint, NASA had a track record of overriding it in prior missions without any problems occurring. On an unusually cold launch day, the O-ring sealing the rocket booster joints ruptured, allowing hot gas to burn through the fuel tank, killing all seven Challenger astronauts.
In 2003, the space shuttle Columbia disintegrated under similar circumstances. After takeoff, the team on the ground noticed that some foam had fallen from the ship, but most of them assumed it wasn’t a major issue since it had happened in past missions without incident. They failed to rethink that assumption and instead started discussing what repairs would be done to the ship to reduce the turnaround time for the next mission. The foam loss was, in fact, a critical issue: the damage it caused to the wing’s leading edge let hot gas leak into the shuttle’s wing upon reentry into the atmosphere. Once again, all seven astronauts lost their lives.
Rethinking is not just an individual skill. It’s a collective capability, and it depends heavily on an organization’s culture. NASA had long been a prime example of a performance culture: excellence of execution was the paramount value. Although NASA accomplished extraordinary things, they soon became victims of overconfidence cycles. As people took pride in their standard operating procedures, gained conviction in their routines, and saw their decisions validated through their results, they missed opportunities for rethinking.
Rethinking is more likely to happen in a learning culture, where growth is the core value and rethinking cycles are routine. In learning cultures, the norm is for people to know what they don’t know, doubt their existing practices, and stay curious about new routines to try out. Evidence shows that in learning cultures, organizations innovate more and make fewer mistakes. After studying and advising change initiatives at NASA and the Gates Foundation, I’ve learned that learning cultures thrive under a particular combination of psychological safety and accountability.
Years ago, an engineer turned management professor named Amy Edmondson became interested in preventing medical errors. She went into a hospital and surveyed its staff about the degree of psychological safety they experienced in their teams—could they take risks without the fear of being punished? Then she collected data on the number of medical errors each team made, tracking serious outcomes like potentially fatal doses of the wrong medication. She was surprised to find that the more psychological safety a team felt, the higher its error rates.
It appeared that psychological safety could breed complacency. When trust runs deep in a team, people might not feel the need to question their colleagues or double-check their own work.
But Edmondson soon recognized a major limitation of the data: the errors were all self-reported. To get an unbiased measure of mistakes, she sent a covert observer into the units. When she analyzed those data, the results flipped: psychologically safe teams reported more errors, but they actually made fewer errors. By freely admitting their mistakes, they were then able to learn what had caused them and eliminate them moving forward. In psychologically unsafe teams, people hid their mishaps to avoid penalties, which made it difficult for anyone to diagnose the root causes and prevent future problems. They kept repeating the same mistakes.
Since then, research on psychological safety has flourished. When I was involved in a study at Google to identify the factors that distinguish teams with high performance and well-being, the most important differentiator wasn’t who was on the team or even how meaningful their work was. What mattered most was psychological safety.
Over the past few years, psychological safety has become a buzzword in many workplaces. Although leaders might understand its significance, they often misunderstand exactly what it is and how to create it. Edmondson is quick to point out that psychological safety is not a matter of relaxing standards, making people comfortable, being nice and agreeable, or giving unconditional praise. It’s fostering a climate of respect, trust, and openness in which people can raise concerns and suggestions without fear of reprisal. It’s the foundation of a learning culture.
In performance cultures, the emphasis on results often undermines psychological safety. When we see people get punished for failures and mistakes, we become worried about proving our competence and protecting our careers. We learn to engage in self-limiting behavior, biting our tongues rather than voicing questions and concerns. Sometimes that’s due to power distance: we’re afraid of challenging the big boss at the top. The pressure to conform to authority is real, and those who dare to deviate run the risk of backlash. In performance cultures, we also censor ourselves in the presence of experts who seem to know all the answers—especially if we lack confidence in our own expertise.
A lack of psychological safety was a persistent problem at NASA. Before the Challenger launch, some engineers did raise red flags but were silenced by managers; others were ignored and ended up silencing themselves. After the Columbia launch, an engineer asked for clearer photographs to inspect the damage to the wing, but managers didn’t supply them. In a critical meeting to evaluate the condition of the shuttle after takeoff, the engineer didn’t speak up.
About a month before that Columbia launch, Ellen Ochoa became the deputy director of flight crew operations. In 1993, Ellen had made history by becoming the first Latina in space. Now, the first flight she supported in a management role had ended in tragedy. After breaking the news to the space station crew and consoling the family members of the fallen astronauts, she was determined to figure out how she could personally help to prevent this kind of disaster from ever happening again.
Ellen recognized that at NASA, the performance culture was eroding psychological safety. “People pride themselves on their engineering expertise and excellence,” she told me. “They fear their expertise will be questioned in a way that’s embarrassing to them. It’s that basic fear of looking like a fool, asking questions that people just dismiss, or being told you don’t know what you’re talking about.” To combat that problem and nudge the culture toward learning, she started carrying a 3 × 5 note card in her pocket with questions to ask about every launch and important operational decision. Her list included:
What leads you to that assumption? Why do you think it is correct? What might happen if it’s wrong?
What are the uncertainties in your analysis?
I understand the advantages of your recommendation. What are the disadvantages?
A decade later, though, the same lessons about rethinking would have to be relearned in the context of spacewalk suits. As flight controllers first became aware of the droplets of water in Luca Parmitano’s helmet, they made two faulty assumptions: the cause was the drink bag, and the effect was inconsequential. It wasn’t until the second spacewalk, when Luca was in actual danger, that they started to question whether those assumptions were wrong.
When engineer Chris Hansen took over as the manager of the extravehicular activity office, he inaugurated a norm of posing questions like Ellen’s: “All anybody would’ve had to ask is, ‘How do you know the drink bag leaked?’ The answer would’ve been, ‘Because somebody told us.’ That response would’ve set off red flags. It would’ve taken ten minutes to check, but nobody asked. It was the same for Columbia. Boeing came in and said, ‘This foam, we think we know what it did.’ If somebody had asked how they knew, nobody could’ve answered that question.”
How do you know? It’s a question we need to ask more often, both of ourselves and of others. The power lies in its frankness. It’s nonjudgmental—a straightforward expression of doubt and curiosity that doesn’t put people on the defensive. Ellen Ochoa wasn’t afraid to ask that question, but she was an astronaut with a doctorate in engineering, serving in a senior leadership role. For too many people in too many workplaces, the question feels like a bridge too far. Creating psychological safety is easier said than done, so I set out to learn about how leaders can establish it.
When I first arrived at the Gates Foundation, people were whispering about the annual strategy reviews. It’s the time when program teams across the foundation meet with the cochairs—Bill and Melinda Gates—and the CEO to give progress reports on execution and collect feedback. Although the foundation employs some of the world’s leading experts in areas ranging from eradicating disease to promoting educational equity, these experts are often intimidated by Bill’s knowledge base, which seems impossibly broad and deep. What if he spots a fatal flaw in my work? Will it be the end of my career here?
A few years ago, leaders at the Gates Foundation reached out to see if I could help them build psychological safety. They were worried that the pressure to present airtight analyses was discouraging people from taking risks. They often stuck to tried-and-true strategies that would make incremental progress rather than daring to undertake bold experiments that might make a bigger dent in some of the world’s most vexing problems.
The existing evidence on creating psychological safety gave us some starting points. I knew that changing the culture of an entire organization is daunting, while changing the culture of a team is more feasible. It starts with modeling the values we want to promote, identifying and praising others who exemplify them, and building a coalition of colleagues who are committed to making the change.
The standard advice for managers on building psychological safety is to model openness and inclusiveness. Ask for feedback on how you can improve, and people will feel safe to take risks. To test whether that recommendation would work, I launched an experiment with a doctoral student, Constantinos Coutifaris. In multiple companies, we randomly assigned some managers to ask their teams for constructive criticism. Over the following week, their teams reported higher psychological safety, but as we anticipated, it didn’t last. Some managers who asked for feedback didn’t like what they heard and got defensive. Others found the feedback useless or felt helpless to act on it, which discouraged them from continuing to seek feedback and their teams from continuing to offer it.
Another group of managers took a different approach, one that had less immediate impact in the first week but led to sustainable gains in psychological safety a full year later. Instead of asking them to seek feedback, we had randomly assigned those managers to share their past experiences with receiving feedback and their future development goals. We advised them to tell their teams about a time when they benefited from constructive criticism and to identify the areas that they were working to improve now.
By admitting some of their imperfections out loud, managers demonstrated that they could take it—and made a public commitment to remain open to feedback. They normalized vulnerability, making their teams more comfortable opening up about their own struggles. Their employees gave more useful feedback because they knew where their managers were working to grow. That motivated managers to create practices to keep the door open: they started holding “ask me anything” coffee chats, opening weekly one-on-one meetings by asking for constructive criticism, and setting up monthly team sessions where everyone shared their development goals and progress.
Creating psychological safety can’t be an isolated episode or a task to check off on a to-do list. When discussing their weaknesses, many of the managers in our experiment felt awkward and anxious at first. Many of their team members were surprised by that vulnerability and unsure of how to respond. Some were skeptical: they thought their managers might be fishing for compliments or cherry-picking comments that made them look good. It was only over time—as managers repeatedly demonstrated humility and curiosity—that the dynamic changed.
At the Gates Foundation, I wanted to go a step further. Instead of just having managers open up with their own teams about how they had previously been criticized, I wondered what would happen if senior leaders shared their experiences across the entire organization. It dawned on me that I had a memorable way to make that happen.
A few years earlier, our MBA students at Wharton decided to create a video for their annual comedy show. It was inspired by “Mean Tweets,” the late-night segment on Jimmy Kimmel Live! in which celebrities read cruel tweets about themselves out loud. Our version was Mean Reviews, where faculty members read harsh comments from student course evaluations. “This is possibly the worst class I’ve ever taken in my life,” one professor read, looking defeated before saying, “Fair enough.” Another read, “This professor is a b*tch. But she’s a nice b*tch,” adding with chagrin: “That’s sweet.” One of my own was “You remind me of a Muppet.” The kicker belonged to a junior faculty member: “Prof acts all down with pop culture, but secretly thinks Ariana Grande is a font in Microsoft Word.”
I made it a habit to show that video in class every fall, and afterward the floodgates would open. Students seemed to be more comfortable sharing their criticisms and suggestions for improvement after seeing that although I take my work seriously, I don’t take myself too seriously.
I sent the video to Melinda Gates, asking if she thought something similar might help with psychological safety in her organization. She not only said yes; she challenged the entire executive leadership team to participate and volunteered to be the first to take the hot seat. Her team compiled criticisms from staff surveys, printed them on note cards, and had her react in real time in front of a camera. She read one employee’s complaint that she was like Mary F***ing Poppins—the first time anyone could remember hearing Melinda curse—and explained how she was working on making her imperfections more visible.
To test the impact of her presentation, we randomly assigned one group of employees to watch Melinda engage with the tough comments, a second to watch a video of her talking about the culture she wanted to create in more general terms, and a third to serve as a pure control group. The first group came away with a stronger learning orientation—they were inspired to recognize their shortcomings and work to overcome them. Some of the power distance evaporated—they were more likely to reach out to Melinda and other senior leaders with both criticism and compliments. One employee commented:
In that video Melinda did something that I’ve not yet seen happen at the foundation: she broke through the veneer. It happened for me when she said, “I go into so many meetings where there are things I don’t know.” I had to write that down because I was shocked and grateful at her honesty. Later, when she laughed, like really belly-laughed, and then answered the hard comments, the veneer came off again and I saw that she was no less of Melinda Gates, but actually, a whole lot more of Melinda Gates.
It takes confident humility to admit that we're a work in progress. It shows that we care more about improving ourselves than proving ourselves. If that mindset spreads far enough within an organization, it can give people the freedom and courage to speak up.
But mindsets aren’t enough to transform a culture. Although psychological safety erases the fear of challenging authority, it doesn’t necessarily motivate us to question authority in the first place. To build a learning culture, we also need to create a specific kind of accountability—one that leads people to think again about the best practices in their workplaces.
In performance cultures, people often become attached to best practices. The risk is that once we’ve declared a routine the best, it becomes frozen in time. We preach about its virtues and stop questioning its vices, no longer curious about where it’s imperfect and where it could improve. Organizational learning should be an ongoing activity, but best practices imply it has reached an endpoint. We might be better off looking for better practices.
At NASA, although teams routinely debriefed after both training simulations and significant operational events, what sometimes stood in the way of exploring better practices was a performance culture that held people accountable for outcomes. Every time they delayed a scheduled launch, they faced widespread public criticism and threats to funding. Each time they celebrated a flight that made it into orbit, they were encouraging their engineers to focus on the fact that the launch resulted in a success rather than on the faulty processes that could jeopardize future launches. That left NASA rewarding luck and repeating problematic practices, failing to rethink what qualified as an acceptable risk. It wasn’t for a lack of ability. After all, these were rocket scientists. As Ellen Ochoa observes, “When you are dealing with people’s lives hanging in the balance, you rely on following the procedures you already have. This can be the best approach in a time-critical situation, but it’s problematic if it prevents a thorough assessment in the aftermath.”
Focusing on results might be good for short-term performance, but it can be an obstacle to long-term learning. Sure enough, social scientists find that when people are held accountable only for whether the outcome was a success or failure, they are more likely to continue with ill-fated courses of action. Exclusively praising and rewarding results is dangerous because it breeds overconfidence in poor strategies, incentivizing people to keep doing things the way they’ve always done them. It isn’t until a high-stakes decision goes horribly wrong that people pause to reexamine their practices.
We shouldn’t have to wait until a space shuttle explodes or an astronaut nearly drowns to determine whether a decision was successful. Along with outcome accountability, we can create process accountability by evaluating how carefully different options are considered as people make decisions. A bad decision process is based on shallow thinking. A good process is grounded in deep thinking and rethinking, enabling people to form and express independent opinions. Research shows that when we have to explain the procedures behind our decisions in real time, we think more critically and process the possibilities more thoroughly.
Process accountability might sound like the opposite of psychological safety, but they’re actually independent. Amy Edmondson finds that when psychological safety exists without accountability, people tend to stay within their comfort zone, and when there’s accountability but not safety, people tend to stay silent in an anxiety zone. When we combine the two, we create a learning zone. People feel free to experiment—and to poke holes in one another’s experiments in service of making them better. They become a challenge network.
One of the most effective steps toward process accountability that I’ve seen is at Amazon, where important decisions aren’t made based on simple PowerPoint presentations. They’re informed by a six-page memo that lays out a problem, the different approaches that have been considered in the past, and how the proposed solutions serve the customer. At the start of the meeting, to avoid groupthink, everyone reads the memo silently. This isn’t practical in every situation, but it’s paramount when choices are both consequential and irreversible. Long before the results of the decision are known, the quality of the process can be evaluated based on the rigor and creativity of the author’s thinking in the memo and in the thoroughness of the discussion that ensues in the meeting.
In learning cultures, people don’t stop keeping score. They expand the scorecard to consider processes as well as outcomes:
Even if the outcome of a decision is positive, it doesn’t necessarily qualify as a success. If the process was shallow, you were lucky. If the decision process was deep, you can count it as an improvement: you’ve discovered a better practice. If the outcome is negative, it’s a failure only if the decision process was shallow. If the result was negative but you evaluated the decision thoroughly, you’ve run a smart experiment.
The ideal time to run those experiments is when decisions are relatively inconsequential or reversible. In too many organizations, leaders look for guarantees that the results will be favorable before testing or investing in something new. It’s the equivalent of telling Gutenberg you’d only bankroll his printing press once he had a long line of satisfied customers—or announcing to a group of HIV researchers that you’d only fund their clinical trials after their treatments worked.
Requiring proof is an enemy of progress. This is why companies like Amazon use a principle of disagree and commit. As Jeff Bezos explained it in an annual shareholder letter, instead of demanding convincing results, experiments start with asking people to make bets. “Look, I know we disagree on this but will you gamble with me on it?” The goal in a learning culture is to welcome these kinds of experiments, to make rethinking so familiar that it becomes routine.
Process accountability isn’t just a matter of rewards and punishments. It’s also about who has decision authority. In a study of California banks, executives often kept approving additional loans to customers who’d already defaulted on a previous one. Since the bankers had signed off on the first loan, they were motivated to justify their initial decision. Interestingly, banks were more likely to identify and write off problem loans when they had high rates of executive turnover. If you’re not the person who greenlit the initial loan, you have every incentive to rethink the previous assessment of that customer. If they’ve defaulted on the past nineteen loans, it’s probably time to adjust. Rethinking is more likely when we separate the initial decision makers from the later decision evaluators.
[Figure: Sketchnote summary of A Spectrum of Reasons for Failure, drawn May 2020 in London by Hayley Lewis. © 2020 HALO Psychology Limited.]
For years, NASA had failed to create that separation. Ellen Ochoa recalls that traditionally “the same managers who were responsible for cost and schedule were the ones who also had the authority to waive technical requirements. It’s easy to talk yourself into something on a launch day.”
The Columbia disaster reinforced the need for NASA to develop a stronger learning culture. On the next space shuttle flight, a problem surfaced with the sensors in an external engine tank. It reoccurred several more times over the next year and a half, but it didn’t create any observable problems. In 2006, on the day of a countdown in Houston, the whole mission management team held a vote. There was overwhelming consensus that the launch should go forward. Only one outlier had voted no: Ellen Ochoa.
In the old performance culture, Ellen might’ve been afraid to vote against the launch. In the emerging learning culture, “it’s not just that we’re encouraged to speak up. It’s our responsibility to speak up,” she explains. “Inclusion at NASA is not only a way to increase innovation and engage employees; it directly affects safety since people need to feel valued and respected in order to be comfortable speaking up.” In the past, the onus would’ve been on her to prove it was not safe to launch. Now the onus was on the team to prove it was safe to launch. That meant approaching their expertise with more humility, their decision with more doubt, and their analysis with more curiosity about the causes and potential consequences of the problem.
After the vote, Ellen received a call from the NASA administrator in Florida, who expressed surprising interest in rethinking the majority opinion in the room. “I’d like to understand your thinking,” he told her. They went on to delay the launch. “Some people weren’t happy we didn’t launch that day,” Ellen reflects. “But people did not come up to me and berate me in any way or make me feel bad. They didn’t take it out on me personally.” The following day all the sensors worked properly, but NASA ended up delaying three more launches over the next few months due to intermittent sensor malfunctions. At that point, the manager of the shuttle program called for the team to stand down until they identified the root cause. Eventually they figured out that the sensors were working fine; it was the cryogenic environment that was causing a faulty connection between the sensors and computers.
Ellen became the deputy director and then the director of the Johnson Space Center, and NASA went on to execute nineteen consecutive successful space shuttle missions before retiring the program. In 2018, when Ellen retired from NASA, a senior leader approached her to tell her how her vote to delay the launch in 2006 had affected him. “I never said anything to you twelve years ago,” he said, but “it made me rethink how I approached launch days and whether I’m doing the right thing.”
We can’t run experiments in the past; we can only imagine the counterfactual in the present. We can wonder whether the lives of fourteen astronauts would have been saved if NASA had gone back to rethink the risks of O-ring failures and foam loss before it was too late. We can wonder why those events didn’t make them as careful in reevaluating problems with spacesuits as they had become with space shuttles. In cultures of learning, we’re not weighed down with as many of these questions—which means we can live with fewer regrets.