AT A 2013 ROBOTICS conference, the MIT researcher Kate Darling invited attendees to play with animatronic toy dinosaurs called Pleos, which are about the size of a Chihuahua. The participants were told to name their robots and interact with them. They quickly learned that their Pleos could communicate: The dinos made it clear through gestures and facial expressions that they liked to be petted and didn’t like to be picked up by the tail. After an hour, Darling gave the participants a break. When they returned, she handed out knives and hatchets and asked them to torture and dismember their Pleos.
Darling was ready for a bit of resistance, but she was surprised by the group’s uniform refusal to harm the robots. Some participants went as far as shielding the Pleos with their bodies so that no one could hurt them. “We respond to social cues from these lifelike machines,” she concluded in a 2013 lecture, “even if we know that they’re not real.”
This insight will shape the next wave of automation. As Erik Brynjolfsson and Andrew McAfee describe in their book The Second Machine Age, “thinking machines”—from autonomous robots that can quickly learn new tasks on the manufacturing floor to software that can evaluate job applicants or recommend a corporate strategy—are coming to the workplace and may create enormous value for businesses and society. But although technological constraints are dissolving, social ones remain. How can you persuade your team to trust artificial intelligence? Or to accept a robot as a member—or even as a manager? If you replace that robot, will morale suffer?
Answering these questions requires an understanding of how humans will work with and relate to thinking machines. A growing body of research is expanding our knowledge, providing essential insights into how such collaborations can get work done. As these machines evolve from tools to teammates, one thing is clear: Accepting them will be more than a matter of simply adopting new technology.
The first challenge in working with thinking machines is recognizing that they often know more than we do. Consider this 2014 finding: Researchers from Wharton ran a series of experiments in which participants were financially rewarded for good predictions and could either go with their own judgment or defer to an algorithm to make those predictions. For example, in one experiment they were shown admissions data for a group of past MBA students and asked to estimate how well each student had performed during the program. Most people preferred to go with their gut rather than defer to the algorithm’s estimates.
This phenomenon is called “algorithm aversion,” and it has been documented in many other studies. Whether they’re diagnosing patients or forecasting political outcomes, people consistently prefer human judgment—their own or someone else’s—to algorithms, and as a result they often make worse decisions. The message for managers is that helping humans trust thinking machines will be essential.
Unfortunately, simply showing people how well an algorithm performs doesn’t make them trust it. When the Wharton researchers let participants see their guesses, the algorithm’s, and the correct answers, the participants recognized that the algorithm usually performed better. But seeing the results also meant seeing the algorithm’s errors, which undermined their trust in it. “People lose confidence in algorithms after they’ve seen them err,” says Berkeley Dietvorst, one of the researchers. Even though the humans were wrong more often than the algorithm was, he says, “people don’t lose confidence in themselves.” In other words, we seem to hold mistakes against an algorithm more than we would against a human being. According to Dietvorst, that’s because we believe that human judgment can improve, but we think (falsely) that an algorithm can’t.
Algorithm aversion may be even more pronounced for work that we perceive as more sophisticated or instinctive than number crunching. Researchers from Northwestern’s Kellogg School and Harvard Business School asked workers on the crowdsourcing site Mechanical Turk to complete a variety of tasks; some were told that the tasks required “cognition” and “analytical reasoning,” while others were told that they required “feeling” and “emotion processing.” Then the participants were asked whether they would be comfortable if this sort of work were outsourced to machines. Those who had been told that the work was emotional were far more disturbed by the suggestion than those who had been told it was analytical. “Thinking is almost like doing math,” concludes Michael Norton, of HBS, one of the study’s authors. “And it’s OK for robots to do math. But it’s not OK for robots to feel things, because then they’re too close to being human.”
Norton believes that simply framing a task as analytical could help overcome people’s skepticism about algorithms. In another experiment, he and Adam Waytz, of Kellogg, found that people were more likely to be comfortable with the idea of a robot’s taking the job of math teacher when they were told that it “requires a lot of analytic skill to teach the students various formulas and algorithms,” and less likely to approve when told that it requires “the ability to relate to young people.”
Dietvorst and his Wharton colleagues offer another answer. If people prefer their own judgment to an algorithm’s, why not incorporate the former into the latter? In one experiment they let people tweak the output of an algorithm slightly. They asked the participants to estimate, on the basis of a variety of data points, how well a high school student had performed on a standardized math test. Rather than being forced to choose between their own estimate and the algorithm’s, participants could adjust the algorithm’s estimate up or down by a few percentage points and submit the result as their prediction. The researchers found that people given this option were more likely to trust the algorithm. Dietvorst thinks that’s because they no longer felt they were giving up control over the forecast.
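For readers who want to picture the mechanism, here is a minimal sketch in Python of how such a constrained override might work; the function name and the five-point band are illustrative assumptions, not details from the study. The person’s input is accepted but clamped to a narrow range around the algorithm’s estimate, so the final prediction stays anchored to the model while preserving a sense of control.

```python
# A minimal sketch (not the researchers' actual code) of the "constrained
# adjustment" idea: the person may nudge the algorithm's estimate, but only
# within a narrow band. The function name and the +/-5-point band are
# illustrative assumptions.

def constrained_forecast(algorithm_estimate: float,
                         human_adjustment: float,
                         max_adjustment: float = 5.0) -> float:
    """Return the algorithm's estimate nudged by the human within a band.

    algorithm_estimate: the model's predicted percentile (0-100)
    human_adjustment:   how far the person wants to move that estimate
    max_adjustment:     the widest nudge allowed, in percentage points
    """
    # Clamp the human's adjustment so the final prediction stays close to
    # the algorithm's output while still giving the person a say.
    bounded = max(-max_adjustment, min(max_adjustment, human_adjustment))
    # Keep the combined prediction inside the valid percentile range.
    return max(0.0, min(100.0, algorithm_estimate + bounded))


if __name__ == "__main__":
    # The model says 72nd percentile; the participant wants +10, but only
    # +5 is permitted, so the submitted prediction becomes 77.
    print(constrained_forecast(72.0, 10.0))  # 77.0
```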
Another way to encourage people to trust thinking machines is to make the machines themselves more humanoid. Studies suggest that giving a machine or an algorithm a voice or a recognizably human body makes it more relatable. Researchers at Northwestern, the University of Connecticut, and the University of Chicago examined this thesis in the context of self-driving cars. In their experiment, participants used a driving simulator and could either do the steering themselves or engage the self-driving feature. In some cases the self-driving feature merely took control of the simulator’s steering and speed. In other cases it also had humanoid characteristics—it was named Iris, had a female voice, and spoke to the drivers during the trip. Drivers in Iris-equipped cars were more likely to engage the self-driving feature. The researchers also programmed a simulated accident that appeared to be the fault of another car rather than of the self-driving feature. Participants who experienced the accident with Iris at the wheel were more relaxed and less likely to blame the self-driving feature for causing it than those whose feature had no name or voice.
People trusted Iris more, according to the researchers, because of a tendency toward anthropomorphism—the attribution of human characteristics and motivations, such as the capacity to think, feel, or express intent, to nonhumans. A long line of research suggests that giving machines a voice, a body, or even a name can tap into this tendency and make people more comfortable working with them. For instance, we seem to collaborate with robots more effectively when they make “eye contact” with us, and we think they’re cuter and more humanoid when they tilt their heads to one side. (Remember the Pleo?)
Researchers at Carnegie Mellon explored this idea with a four-and-a-half-foot-tall autonomous robot named Snackbot, which had wheels, arms, a male voice, and an LED mouth that could smile and frown. Snackbot’s job was to deliver snacks within an office, but it was explicitly designed to evoke an anthropomorphic response. As predicted, people in the office made conversation with it and treated it with kindness. Asked about their interactions with the robot, one participant said, “Snackbot doesn’t have feelings, but I wouldn’t want to just take the snack and shut the door in its face.”
Snackbot was programmed to have “personalized” conversations with some people, commenting on their favorite snacks, for example. Workers who got this treatment were more satisfied with the robot’s service and more likely to cooperate when Snackbot made requests of them, such as asking which parts of the office it should add to a tour it would be giving.
One challenge with adding humanoid features to thinking machines is that it may lead us to put too much faith in their abilities. Researchers at the University of Manitoba conducted a series of experiments in which people were asked to do dull, repetitive work: renaming files on a computer. The participants were not told how long the experiment would run—simply that they could leave at any time—but the number of files to be renamed appeared limitless. When, inevitably, they did try to quit, they were prodded to keep going by a two-foot-tall humanoid robot named Jim. Jim sat on a desk, spoke with a robotic voice, gazed inquisitively around the room, and made hand gestures. These features were designed to project intelligence. (Unbeknownst to participants, the robot was actually controlled by researchers and could do little on its own.) When someone tried to quit the task, Jim would say something like “Please continue—we need more data,” or “It’s essential that you continue.” This went on until either the participants ignored the prodding and gave up or 80 minutes had passed. What most struck James Young, one of the study’s authors, was that many people “treated the robot as someone they could negotiate with.” They argued that it was being unreasonable in telling them to press on, even though the robot did nothing but repeat the same few phrases. The fact that the robot had a voice and a body seemed to be enough to persuade some people that it had the ability to reason.
Another problem is that as machines become more humanoid, we are likelier to stereotype or even discriminate against them, much as we do with people. An experiment by researchers at Soongsil University, in South Korea, gauged people’s satisfaction with a security robot that monitored CCTV footage looking for suspicious activity. When the robot was named John and had a male voice, it was rated as more useful than when it was named Joan and had a female voice—even though John and Joan did identical work. Other research has documented the opposite effect for robots that operate within the home.
Finally, humanoid robots can create interpersonal issues in the workplace. In the Snackbot experiment, one person felt awkward when the robot commented, within earshot of other employees, on how much that participant liked to order Reese’s Peanut Butter Cups. Another expressed jealousy after Snackbot complimented a colleague for being in the office all the time and therefore being a hard worker. “The more you add lifelike characteristics, and particularly the more you add things that seem like emotion, the more strongly it evokes these social effects,” says Jonathan Gratch, a professor at the University of Southern California who studies human-machine interactions. “It’s not always clear that you want your virtual robot teammate to be just like a person. You want it to be better than a person.”
In his own research Gratch has explored how thinking machines might get the best of both worlds, eliciting humans’ trust while avoiding some of the pitfalls of anthropomorphism. In one study he had participants in two groups discuss their health with a digitally animated figure on a television screen (dubbed a “virtual human”). One group was told that people were controlling the avatar; the other group was told that the avatar was fully automated. Those in the latter group were willing to disclose more about their health and even displayed more sadness. “When they’re being talked to by a person, they fear being negatively judged,” Gratch says.
Gratch hypothesizes that “in certain circumstances the lack of humanness of the machine is better.” For instance, “you might imagine that if you had a computer boss, you would be more likely to be truthful about what its shortcomings were.” And in some cases, Gratch thinks, less humanoid robots would even be perceived as less susceptible to bias or favoritism.
How we work with thinking machines will vary according to the work we’re doing, how it’s framed, and how the machines are designed. But under the right conditions, people are surprisingly open to a robotic coworker. Julie Shah and her colleagues at MIT set up an experiment in which a participant, an assistant, and a robot collaborated to build Lego kits. They were told to approach the job as if they were working in manufacturing and had a tight deadline to complete the work. Allocating tasks effectively among team members was critical to completing the project quickly.
Participants built three kits under three different conditions. In one case, the robot assigned the tasks—fetching Lego parts from one bench, assembling them on another. In the second case, the participant assigned the tasks. In the third case, the participant scheduled his or her own work, while the robot allocated the remaining tasks to itself and the assistant. The researchers guessed that the participants would be most satisfied in the third scenario, because they would get some benefit from the robot’s algorithmic expertise in scheduling but would also have autonomy over their own work. In fact people preferred having the robot assign all the tasks. That was also the most efficient scenario: Teams took the least time to complete the project.
Why were these participants so much more accepting than the ones at Wharton who refused to rely on an algorithm? We don’t yet know enough to say for sure. Shah points to the fact that the task was difficult to complete in the required time frame, so people recognized that they would benefit from the robot’s help. How the work was framed most likely helped too: The goal was to maximize productivity in a controlled environment while racing against the clock—the sort of logical challenge a robot might be good at meeting. Finally, although the robot had no voice and wasn’t designed to be social, it did have a body, which may have made it seem more intelligent than a disembodied algorithm.
At the end of Shah’s experiment the participants gave feedback about why they preferred one scenario over the others. Tellingly, those who preferred having the robot in charge didn’t emphasize its humanoid qualities or the bonds they had formed with it. Instead they gave reasons such as “I never felt like I was wasting time” and “It removes the possibility of scheduling being influenced by the ego of the team leader.” The robot made a great teammate because it did what robots do best.
Originally published in June 2015. Reprint R1506F