20

THE VOICE IS HERD

On the Sources of Herd Behavior

OCCASIONALLY WHEN I AM AT DINNER WITH FRIENDS AT A FINE RESTAURANT I notice a peculiar behavioral phenomenon. After we have all intently studied every item on the menu from start to end and even discussed what we might want to order and what we want to avoid, the moment of decision arrives. If it is my rotten luck to be the first person whom the waiter approaches, I try to make a brave choice, expressed in a determined manner.

When the next person in turn is asked by the waiter what she wishes to order, I look at her with compassion. But when the third person orders, I start raising an eyebrow and by the time it is the fourth person’s turn, I really start to sweat. After that I don’t even bother listening to what the rest of the party are ordering—I just know I have made a fatal mistake. There is nothing left to do but wait anxiously until the waiter is done taking the orders and walks away. At that point I weakly apologize to my friends and run to the kitchen to change my order.

If you go through even a pale version of this, you are not alone.

Our overconfidence evaporates the minute we are asked to make a decision in parallel with others, or after others around us have decided on the same question. That is when we are most susceptible to conformism, to copying others and dismissing our own opinions much too quickly in the face of majority opinion.

Our tendencies toward this type of conformism do not necessarily contradict our biases toward self-confidence. Self-confidence relates to our subjective judgment of our own abilities, while our tendency toward conformism is often due to mistaken information processing. Sometimes it stems from a fear of being perceived as peculiar.

The herd phenomenon has important implications in a wide range of social situations. Hundreds of studies have been published on the subject in economics, finance, and psychology. To some extent herd behavior is responsible for many financial market crashes, along with the bubbles preceding them. It is also the reason many erroneous stigmas spread so easily (such as, “if I see that none of my acquaintances have hired employees with disabilities, then it is probably better for me also to avoid hiring such a person”). It is responsible for the sort of homogeneity in thought and behavior that depresses creativity and renewal in societies. But the worst effect of herd behavior is that it can cause an immense number of individuals to make wrong decisions in a dynamic process, with each person influencing others around him in a misguided way, despite good intentions.

Imagine yourself on holiday in Malaga, Spain, looking for a good place to eat lunch. After an hour of exhaustive search, hungry and tired, you decide that you will enter the next restaurant you pass, no matter what. After a minute, you find yourself looking at two adjacent restaurants: one so crowded that there is barely an empty table to be seen, the other as empty as a ghost town. It is not difficult to guess which restaurant you will choose. Researchers debate whether your decision to enter the crowded restaurant stems from efficient information processing or alternatively whether a misguided herd that knows nothing about restaurants has dragged you into making the wrong decision.

We will use this example to illustrate how herd behavior can occur even if every single person is acting fully rationally, meaning that everyone satisfies the following conditions:

       1.    Individuals have their own sources of information that they use to arrive at correct decisions.

       2.    Each individual perfectly well understands how to use probabilistic models and is not limited in calculation ability.

       3.    Individuals seek to maximize their own utility.

It is entirely possible that even under these perfect conditions of rationality, herd behavior can lead everyone to the worse restaurant.

Let’s call one of the restaurants Salvador’s and the other one El Torero. Let’s further suppose that Salvador’s is a better restaurant than El Torero. Suppose now that there are one hundred tourists on a particular day trying to decide whether to eat at Salvador’s or El Torero. With these assumptions, I will now describe a process that will lead all one hundred tourists to El Torero in an entirely rational and well-calculated manner.

Suppose that prior to arriving in Malaga, each tourist looks up some information on the city’s restaurants. That information is not enough to determine decisively which restaurant of the two is the better one, but let’s suppose that each tourist slightly prefers Salvador’s. This can happen if each tourist ascribes a 51 percent chance to Salvador’s being the better restaurant and only a 49 percent chance to El Torero’s being the better one (for example, because a popular tourist guide book notes that Salvador’s once ranked higher in the Michelin restaurant ranking).

On arriving in Malaga, the tourists receive another indication of the relative qualities of the restaurants (such as an email from a friend, a Web site ranking, or a recommendation from a hotel clerk). It is reasonable to assume that since Salvador’s is objectively better, there will be more positive indications for Salvador’s than El Torero. But there is some random element to these recommendations. A tourist could, for example, have received an email from a friend who happened to wander in the past into El Torero and liked the food served there (it is not a bad restaurant, after all, just not as good as Salvador’s).

Based on the new information received, each tourist now updates his probabilistic assessment of the relative qualities of the two restaurants using Bayes’s Rule (as described in the previous chapter). Recall that we assumed that all of the tourists are not only rational, they are experts in probability theory. Suppose further that all the indications are sufficiently strong that after this updating process each tourist has a high level of confidence that he knows which restaurant is truly the better one. Given the rationality that everyone exhibits, a tourist who receives only one positive indication for one restaurant but two positive indications for the other restaurant updates his probabilities in such a way that he ascribes higher probability to the restaurant with two positive indications being the better one.
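The updating step can be sketched in a few lines of Python. This is only an illustration under assumptions of my own choosing: every indication is taken to be equally reliable and independent, pointing to the truly better restaurant with probability `accuracy` (0.7 here, a made-up figure), and the function name is hypothetical.

```python
def posterior_salvador(for_salvador, for_torero, prior=0.51, accuracy=0.7):
    """Probability that Salvador's is better after counting indications.

    Assumes (for illustration only) independent indications that each
    favor the truly better restaurant with probability `accuracy`.
    """
    prior_odds = prior / (1 - prior)
    # Each indication multiplies the odds by the same likelihood ratio,
    # so only the difference between the two counts matters.
    likelihood_ratio = accuracy / (1 - accuracy)
    odds = prior_odds * likelihood_ratio ** (for_salvador - for_torero)
    return odds / (1 + odds)
```

With these numbers, `posterior_salvador(1, 2)` falls below one half: one indication for Salvador’s against two for El Torero is enough to overturn the slight initial preference for Salvador’s.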

AND NOW, ON TO THE MAIN COURSE. IMAGINE ALL ONE HUNDRED tourists standing in a line at 11:59 a.m., waiting for the two restaurants to open their doors to the lunchtime crowd at noon. Each tourist has received one indication favoring one restaurant or another, with the two tourists at the head of the line having received positive indications for El Torero (remember again that some of the tourists have received recommendations for El Torero, and it is not surprising that the two who happen to be at the front of the line may be among them).

At noon the front doors of the restaurants swing open. The waiters in each of the so-far empty restaurants wait in anticipation for the lunchtime crowd to enter. Every successive tourist in the queue decides, in turn and entirely rationally, where he will eat. The tourist at the head of the line, based on the positive indication for El Torero that he received up to that moment, naturally chooses El Torero. The second tourist, who has also received a positive indication for El Torero, does the same.

What about the third tourist? Let’s suppose that prior to noon she has received an indication that Salvador’s restaurant is the better one. However, she has just seen the two people ahead of her in the queue choose El Torero. She thus surmises that they each received positive indications for El Torero (which clearly differ from the indication she received). She can now take this new information into account in making her decision: she knows that there are two indications for El Torero (based on the choices of the two people in front of her in the queue) and only one indication for Salvador’s, which she previously received. That makes a majority of two to one in favor of El Torero. The third tourist thus promptly enters El Torero for lunch, overriding the indication that she had personally previously received. In other words, the third tourist will choose El Torero regardless of the signal that she herself got.

The fourth tourist is in a situation similar to the one that the third tourist was in. He knows that he cannot really learn anything from the behavior of the third tourist, who chose El Torero independently of the indication that she received. But he does know that the first two tourists did receive positive indications for El Torero. From his perspective, that constitutes a majority of positive indications for El Torero over Salvador’s, and therefore he, too, goes directly to El Torero for lunch.

It should be clear to anyone by now how this interesting lunchtime crowd of tourists is going to behave. Each and every tourist, based on the choices of the first two tourists (the choices of the rest are irrelevant, since they are basing their choices on those of the first two) will choose El Torero over Salvador’s, using the same reasoning as the third tourist. And thus the poor owner of Salvador’s, who really has worked hard to produce a meal superior to that of El Torero, will spend the entire afternoon in his empty restaurant, sadly watching his rival El Torero filled to the rafters with a herd consisting of every single tourist in town.
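The queue can be replayed as a small simulation. What follows is a minimal sketch rather than the model from the paper itself: it assumes every indication is equally reliable (`accuracy`, set arbitrarily to 0.7), keeps the 51 percent prior for Salvador’s, and uses function and variable names invented for the illustration.

```python
import math

def simulate_queue(signals, prior_salvador=0.51, accuracy=0.7):
    """Return each tourist's choice ('S' or 'T') given private signals in queue order.

    'S' stands for Salvador's and 'T' for El Torero. The signal accuracy
    is an assumed value used only for illustration.
    """
    step = math.log(accuracy / (1 - accuracy))  # evidence carried by one signal
    # Log-odds for Salvador's that everyone can infer from earlier choices.
    public = math.log(prior_salvador / (1 - prior_salvador))
    choices = []
    for signal in signals:
        # What this tourist would choose under each possible private signal.
        choice_if_s = "S" if public + step > 0 else "T"
        choice_if_t = "S" if public - step > 0 else "T"
        choice = choice_if_s if signal == "S" else choice_if_t
        choices.append(choice)
        # Onlookers learn the signal only when the choice depended on it.
        if choice_if_s != choice_if_t:
            public += step if choice == "S" else -step
    return choices

# Two El Torero signals at the head of the line, Salvador's signals behind them.
choices = simulate_queue(["T", "T"] + ["S"] * 98)
```

Under these assumptions all one hundred entries in `choices` are `'T'`: after the first two choices the public belief stops moving, and every later tourist enters El Torero whatever his or her own signal says.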

The story I have just told is based on a mathematical model appearing in a paper published in 1992 by three professors of finance at UCLA.1 The authors of that paper claimed that herd behavior typically occurs as a result of the most rigorously rational thinking, as in their model, and not because of psychological biases such as conformity, lack of self-confidence, and so on. It is quite an ingenious observation (if slightly contrived) that perfect rationality can still lead to herd behavior. But is this really the way herd behavior actually happens?

It was precisely to answer this question that three colleagues (from the Max Planck Institute in Germany, the University of Paris, and the University of Aberdeen) and I conducted a research study that featured a laboratory experiment in which we induced herd behavior.2 In our experiment, subjects were not asked to choose between restaurants; we based the experiment instead on the urns described in the previous chapter.

Two urns were filled with balls, one hundred balls in each urn. The first urn contained fifty red balls and fifty black balls. The second urn was filled with twenty-five red balls and seventy-five black balls. The subjects of the experiment were informed that one of these two urns would be selected, with the first (50–50) urn selected 51 percent of the time and the second (75–25) urn selected 49 percent of the time. They were also told that they would be rewarded monetarily for correctly guessing which urn was actually selected. Each subject in turn was given one opportunity to secretly remove a ball at random from the urn, check its color, and then place the ball back in the urn. After doing so, he or she was to announce publicly, in front of all the other subjects in the experiment, his or her guess as to which urn was selected (note that this public announcement parallels choosing one of the restaurants in the above story, with correctly guessing the identity of the urn parallel to correctly choosing the better restaurant).
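For readers who want to check what a fully rational subject should conclude from a single draw, here is a short sketch of the calculation using the numbers above. The function name and the string labels are mine, introduced only for the illustration.

```python
def posterior_first_urn(ball, prior_first=0.51):
    """Probability that the first (50 red, 50 black) urn was selected,
    given the color of one privately drawn ball ('red' or 'black')."""
    red_share = {"first": 0.50, "second": 0.25}

    def likelihood(urn):
        # Chance of drawing a ball of this color from the given urn.
        return red_share[urn] if ball == "red" else 1 - red_share[urn]

    joint_first = prior_first * likelihood("first")
    joint_second = (1 - prior_first) * likelihood("second")
    return joint_first / (joint_first + joint_second)

after_red = posterior_first_urn("red")      # about 0.68
after_black = posterior_first_urn("black")  # about 0.41
```

A red ball therefore points to the first urn and a black ball to the second, which is what makes a subject’s public guess informative to those who come after.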

As expected, we managed to create significant herd behavior in the lab. The herd usually began to form after three or four identical guesses had been made out of nine. That is, once the first three participants had publicly announced the same guess, the six others in each experimental round made the same guess, regardless of which color ball they pulled out of the urn.

In the second stage of the experiment we carefully tested whether the explanation for the occurrence of herd behavior suggested by the three UCLA professors held up to scrutiny. Note that their explanation depends crucially on the assumption that after the first two tourists have made identical choices all the rest of the herd follows their example, but everyone else is doing so knowing that they can only learn something from the behavior of that first pair, not from the behavior of all the others. In other words, when the one hundredth tourist sees the ninety-nine tourists preceding him entering El Torero, his level of confidence in the choice of El Torero as the better restaurant is identical to the level of confidence of the third tourist who has only seen two tourists before him choose that restaurant. Both are basing their decisions solely on the decisions of the first two tourists.

That seemed unrealistic to us. If that were true, it would mean that if we were to give the one hundredth tourist a slightly better indication than that given to the first two tourists, he would choose based solely on the indications he received personally, even after seeing ninety-nine tourists ahead of him choosing differently (since he is supposed to ignore the behavior of everyone but the first two tourists in the queue). We did the equivalent thing in our experiment in order to test these assumptions. Selected subjects in our experiment, at various points in time during the development of the herd behavior, were given significantly better indications than others regarding which urn had been selected.
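The fragility that the rational model predicts can be made concrete with a short sketch. The numbers are assumptions for illustration only: ordinary indications are right 70 percent of the time, the better indication 85 percent of the time, and the function names are hypothetical.

```python
import math

def log_odds(p):
    return math.log(p / (1 - p))

def public_belief_at(position, prior_salvador=0.51, accuracy=0.7):
    """Public log-odds for Salvador's facing the tourist at `position` (1-based),
    when the first two tourists revealed El Torero signals and everyone
    after them simply followed."""
    informative = min(position - 1, 2)  # only the first two choices reveal signals
    return log_odds(prior_salvador) - informative * log_odds(accuracy)

def chooses_salvador(position, own_accuracy):
    """True if a tourist holding a Salvador's signal of the given accuracy
    rationally picks Salvador's despite the herd."""
    return public_belief_at(position) + log_odds(own_accuracy) > 0
```

In this sketch `public_belief_at(3)` and `public_belief_at(100)` are identical, so `chooses_salvador(100, 0.85)` is true just as `chooses_salvador(3, 0.85)` is: according to the model, a herd of ninety-nine is no harder to break than a herd of two.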

If the UCLA professors’ explanation were correct, these subjects should have always followed the indications they received, independently of the intensity of herd behavior that they were witnessing. But that is not what happened. When the herd behavior was just beginning to develop, and only a small number of subjects had made identical guesses, the subjects given private extra indications did indeed follow those indications to a greater extent than they followed the herd. But after the herd behavior had gathered strong momentum, they ignored their private indications and joined the crowd, as we had expected. Our conclusion was that the UCLA explanation did not stand up to close scrutiny. Herd behavior is much more stable and less fragile than their model would indicate, and it cannot be explained within a purely materially rational framework.

It is unreasonable to expect there to be one dominant explanation for herd behavior. The context in which the herd phenomenon develops is relevant. Even in phenomena such as real estate bubbles or stock market crashes there are several forces at work. When the stock market enters a downward spiral, we usually rush to sell our shares for at least two reasons. First, falling stock prices may be an indication that market fundamentals have taken a downturn, and our expectations of share profits fall correspondingly. Second, even if we are perfectly assured that the fall in prices is due solely to irrational panic while market fundamentals remain strong and stable, we are perfectly justified in selling our stock holdings as quickly as possible. With everyone else selling, the longer we hold on to our stocks, the less they are worth, minute by minute. In other words, it is quite possible that everyone rationally knows that there is no fundamentally sound reason at all to sell stocks and flee the market, yet everyone does exactly that because of the expectation that everyone else is going to do that.

Most financial crises, in fact, are caused by such self-confirming expectations. It is precisely in such situations that government intervention can be used most effectively for rebuilding trust and cooperation, reducing the fears driving investors to flee the market. This is why many governments extend deposit insurance for bank accounts. Without it, bank runs would be far too common.

In contrast, in many situations herd behavior develops because people feel a desire to join certain groups. The rapidity with which clothing fashions, artistic styles, and even ideologies spread in societies is an example of this phenomenon. There is no role for information and updating probabilities here, only a desire on the part of some individuals to be identified with other individuals. Many instances of herd behavior arise from the types of collective emotions discussed in an earlier chapter.

There is another phenomenon studied in the economics literature that is not considered to be herd behavior but is definitely related to it: peer effects. This occurs in situations in which peers (work colleagues, fellow students, and so forth) tend to copy each other’s behaviors. Bruce Sacerdote, a Dartmouth College economist, published a study in 2001 of how peers influence the extent to which students invest time and effort in university studies.3 Students of various backgrounds, majoring in different subjects, were assigned to student dorms, two to a room. The students had no input or influence on these rooming assignments, which were made entirely at random. Despite this, by the end of the academic year dorm-mates exhibited a strong degree of correlation in the grades they received. The study’s conclusion was that these correlations were formed by mutual influences between dorm-mates. A student who conscientiously devoted time to studies apparently influenced his or her dorm-mate.

Similar phenomena have been noted among work colleagues in several research studies. But workers have a positive incentive in getting their colleagues to invest efforts in hard work (because the harder their colleagues work, the more successful the workplace will be, to the advantage of all the workers). It is more difficult to explain why peer effects, with respect to investing efforts in studies, should appear among students of diverse backgrounds and majoring in different subjects. One possible explanation is simply a human tendency to copy the behavior of others, but the phenomenon might also stem from competitiveness.

The simplest and most general explanation for the diverse varieties of herd phenomena, in fact, goes back to the distinction between rule rationality and act rationality that was described earlier in this book. Correctly processing information is a very difficult task to accomplish. Experts often fail at it. To illustrate just how difficult it is to use correct probabilistic reasoning for making decisions, consider the following three stories, taken from scientific journals:

       1.    Nature Neuroscience, one of the leading journals in the field of brain studies, published a paper in 2011 looking into common mistakes in probability calculations made by neuroscientists. The authors reviewed 513 papers published in the foremost brain studies journals over a period of two years.4 They found that in 157 papers in which errors in probability could have been made, half contained such errors, compromising the conclusions that they reached.

       2.    One of the most impressive experiments conducted by Daniel Kahneman, a Nobel Prize winner in economics, along with his long-time collaborator Amos Tversky, dealt with the abilities of physicians to process probabilistic calculations in their decision making.5 Kahneman and Tversky’s simple experiment involved as subjects medical interns at leading hospitals in the United States. The interns were presented with true data on cancer mortality rates in patients in the first five years after their initial diagnoses of cancer, based on the types of treatments they received: surgery versus radiation treatment. Two separate groups of interns were given exactly the same data, but it was expressed in different ways. One group was informed what percentage of cancer patients died over a five-year period while the other group was informed what percentage survived over that same period (for example, if one group was told that 60 percent of patients treated by surgery died within the first five years, then the other group was told that 40 percent of patients treated by surgery survived the first five years). Obviously, both sets of data were saying exactly the same thing. Despite this, the two groups of interns gave very different treatment recommendations, depending on how the data were presented to them.

       3.    Maya Bar-Hillel, a student of Daniel Kahneman, conducted an interesting experiment, using senior Israeli court judges as subjects, to study the extent to which they understood principles of probability. Given that the Israeli justice system (like those of all Western nations) is based on a standard of evidence requiring “proof beyond reasonable doubt,” Bar-Hillel was interested in ascertaining what the judges regarded as reasonable doubt and whether they correctly apply the standards they are sworn to uphold. To achieve this, she presented the judges with examples of evidence and asked them to decide whether or not the examples satisfied the requirement of providing proof beyond reasonable doubt.

                     Here is one example of the sort of evidence that Bar-Hillel used in this study, slightly reworded: a motorist asked for a court to review a parking ticket that he was issued when his car was parked at a location with a maximal continuous parking time of one hour. A traffic warden testified that he had twice seen the car parked at the same spot over a period of an hour and a half. In his defense, the motorist claimed that he had parked at that location for three-quarters of an hour, moved the car backward to the spot behind him, and then returned to the same parking spot fifteen minutes later, hence he had not parked continuously in the same location for over an hour.

                     The traffic warden retorted that in this case he had conducted detailed surveillance of the car’s position by recording the positions of the air-pressure valves of each of the car’s four tires (assigning one of the four positions: north, south, east, and west) both times he saw the car parked at the same spot. In each case, their positions were identical. The claim that followed from this observation was that it is unreasonable for the car to have been moved and then returned to the same location with the four air-intake valves restored to their exact positions. Most judges tended to agree with this claim. They explained that if this had been observed of only one tire, they would be less inclined to accept the evidence. But if it was observed in all four tires—now that was convincing.

                      Only a few judges noticed that, in a straight and short move of the car, if the position of the air-intake valve of one tire had been restored to its previous position, then it is almost certain that the same holds true of all four tires. Since each valve was recorded in one of only four positions, the probability that the valves would return to their recorded positions purely by chance turns out to be almost 25 percent, making it rather reasonable to suppose that the motorist could indeed have moved the car and later returned to the same spot.
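The gap between the judges’ intuition and the correct reasoning in the third story can be put into numbers. The sketch below rests on simplifying assumptions of my own: each valve is equally likely to come to rest in any of the four recorded positions, and in a short, straight move all four wheels turn by the same amount.

```python
from fractions import Fraction

POSITIONS = 4  # north, south, east, west
WHEELS = 4

# Intuition of most judges: four separate coincidences, one per wheel.
independent_match = Fraction(1, POSITIONS) ** WHEELS

# Straight, short move: the wheels roll the same distance, so if one valve
# returns to its recorded position, the other three do as well.
linked_match = Fraction(1, POSITIONS)
```

Treating the four wheels as independent gives a chance of 1 in 256, which looks like proof beyond reasonable doubt. Recognizing that they turn together gives 1 in 4, which does not.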

Since we are, in general, unable to make efficient decisions when faced with the need to undertake complex probabilistic calculations, we all tend to use heuristic reasoning instead. The heuristic that supposes that “the majority is right” is a simple one that serves us well in many real-life situations. The herdlike behavior that results is unfortunate, but it is ultimately an acceptable side effect.