Amazing Babies

Alison Gopnik

Psychologist, University of California, Berkeley; Author, The Philosophical Baby

The biggest question for me is, “How is it possible for children, young human beings, to learn as much as they do as quickly and as effectively as they do?” We’ve known for a long time that human children are the best learning machines in the universe. But it has always been like the mystery of the hummingbirds. We know that they fly, but we don’t know how they can possibly do it. We could say that babies learn, but we didn’t know how.

But now there’s this really exciting confluence of work in artificial intelligence and machine learning, in neuroscience, and in developmental psychology, all trying to tackle this question about how children could possibly learn as much as they do.

What’s happened is that there are more and more really interesting models coming out of AI and machine learning. Computer scientists and philosophers are starting to understand how scientists or machines or brains could actually do something that looks like powerful inductive learning. The project we’ve been working on for the past ten years or so is to ask whether children and even young babies implicitly use some of those same really powerful inductive learning techniques.

It’s been very exciting because, on the one hand, it helps to explain the thing that’s been puzzling developmental psychologists since Piaget. Every week we discover some new amazing thing about what babies and young children know that we didn’t realize before. And then we discover some other amazing thing about what they don’t yet know. So we’ve charted a series of changes in children’s knowledge—we know a great deal about what children know when. But the great mystery is how could they possibly learn it? What computations are they performing? And we’re starting to answer that.

It’s also been illuminating because the developmentalists can help the AI people do a sort of reverse engineering. When you realize that human babies and children are these phenomenal learners, you can ask, okay, what would happen if we actually used what we know about children to help program a computer?

The research starts out from an empirical question and a practical question: How do children learn so much? How could we design computers that learn? But then it turns out that there’s a big, grand philosophical question behind it. How do any of us learn as much as we do about the world? All we’ve got are these little vibrations of air at our eardrums and photons hitting the backs of our retinas. And yet human beings know about objects and people, not to mention quarks and electrons. How do we ever get there? How could our brains, evolved in the Pleistocene, get us from the photons hitting our retinas to quarks and electrons? That’s the big, grand philosophical question of knowledge.

Understanding how we learn as children actually ends up providing at least the beginning of an answer to this much bigger philosophical question. The philosophical convergence, which also has a nice moral quality, is that these very, very high-prestige learning systems like scientists and fancy computers at Microsoft turn out to be doing similar things to the very, very low-prestige babies and toddlers and preschoolers. Small children aren’t the sort of people that philosophers and psychologists and scientists have been paying much attention to over the last 2,000 years. But just looking at these babies and little kids running around turns out to be really informative about deep philosophical questions.

For example, it turns out that babies and very young children already are doing statistical analyses of data, which is not something that we knew about until the last ten years. This is a really very, very new set of findings. Jenny Saffran, Elissa Newport, and Dick Aslin at Rochester started it off when they discovered that infants could detect statistical patterns in nonsense syllables. Now every month there’s a new study that shows that babies and young children compute conditional probabilities, that they do Bayesian reasoning, that they can take a random sample and understand the relationship between that sample and the population that it’s drawn from. And children don’t just detect statistical patterns, they use them to infer the causal structure of the world. They do it in much the same way that sophisticated computers do. Or for that matter, they do it in the same way that every scientist does who looks at a pattern of statistics and doesn’t just say oh, that’s the data pattern, but can then say oh, and that data pattern tells us that the world must be this particular way.
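The kind of statistic at stake in the nonsense-syllable studies can be made concrete with a small sketch. The Python below computes transitional probabilities, that is, how often one syllable is followed by another, over a continuous stream. The three made-up words and their fixed ordering are assumptions chosen for illustration, not the stimuli used in the original experiments.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Return P(next syllable | current syllable) for each adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical three-syllable "words" (illustrative only).
words = [("bi", "da", "ku"), ("pa", "do", "ti"), ("go", "la", "bu")]
# A fixed word order with no immediate repeats, so the result is reproducible.
order = [0, 1, 2, 0, 2, 1, 0, 1, 2, 1, 0, 2]
stream = [syllable for i in order for syllable in words[i]]

tp = transitional_probabilities(stream)
print(tp[("bi", "da")])  # inside a word
print(tp[("ku", "pa")])  # across a word boundary
```

Inside a word, each syllable predicts the next with probability 1. At word boundaries the probability falls well below 1 (0.5 from "ku" to "pa" in this toy stream), and a dip of that kind is one cue a learner could use to find where words begin and end.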

How could we actually ask babies and young children to tell us whether they understand statistics? We know that when we ask even adults to explicitly solve a probability problem, they collapse. How could we ask little kids to do it?

The way that we started out was that we built a machine we called the blicket detector. The blicket detector is a little machine that lights up and plays music when you put certain things on it but not others. We can actually control the information that the child gets about the statistics of this machine. We put all sorts of different things on it. Sometimes the box lights up, sometimes it doesn’t, sometimes it plays music, sometimes it doesn’t. And then we can ask the child things like what would happen if I took the yellow block off? Or which block will make it go best? And we can design it so that, for example, one block makes it go two out of eight times, and one block makes it go two out of three times.

Four-year-olds, who can’t add yet, say that the block that makes it go two out of three times is a more powerful block than the one that makes it go two out of eight times. That’s an example of the kind of implicit statistics that even two- and three- and four-year-olds are using when they’re trying to just figure out something about how this particular machine goes. And we’ve used similar experiments to show that children can use Bayesian reasoning, infer complex causal structure, and even infer hidden, invisible causal variables.
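The comparison the children make can be written out directly. In the sketch below the block names are hypothetical labels; the counts (two activations in eight trials versus two in three) come from the example above. Both blocks set the machine off the same number of times, so only the rate distinguishes them.

```python
from fractions import Fraction


def activation_rate(activations, trials):
    """Fraction of trials on which the block set the machine off."""
    return Fraction(activations, trials)


# (activations, trials) for two hypothetical blocks.
observations = {"block_a": (2, 8), "block_b": (2, 3)}

rates = {name: activation_rate(*obs) for name, obs in observations.items()}
strongest = max(rates, key=rates.get)

for name, rate in rates.items():
    print(name, rate, float(rate))
print("stronger block:", strongest)
```

A learner who tracked only raw counts would see a tie at two activations each; choosing the two-out-of-three block requires attending to the proportion.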

With even younger babies, Fei Xu showed that nine-month-olds were already paying attention to the statistics of their environment. She would show the baby a box of mostly red Ping-Pong balls, 80 percent red, 20 percent white. And then a screen would come up in front of the Ping-Pong balls and someone would take out a number of Ping-Pong balls from that box. They would pick out five red Ping-Pong balls or else pick out five white Ping-Pong balls. Well, of course, neither of those events is impossible. But picking out five white Ping-Pong balls from an 80 percent red box is much less likely. And even nine-month-olds will look longer when they see the white Ping-Pong balls coming from the mostly red box than when they see the red Ping-Pong balls coming from the mostly red box.

Fei did a beautiful control condition. Exactly the same thing happens, except now instead of taking the balls from the box, the experimenter takes the balls from her pocket. When the experimenter takes the balls from her pocket, the baby doesn’t know what the population is that the experimenter is sampling. And in that case, the babies don’t show any preference for the all-red versus all-white sample. The babies really seem to have an idea that some random samples from a population are more probable, and some random samples from a population are less probable.
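The arithmetic behind the Ping-Pong ball experiment can be sketched as follows. The code assumes, as a simplification, that each of the five draws is independent and follows the box proportions (80 percent red, 20 percent white), which is roughly right for a large box.

```python
from fractions import Fraction


def prob_all_same_color(color_share, n_drawn):
    """Probability that n independent draws all come up the given color."""
    return color_share ** n_drawn


RED_SHARE = Fraction(4, 5)    # 80 percent red
WHITE_SHARE = Fraction(1, 5)  # 20 percent white

p_five_red = prob_all_same_color(RED_SHARE, 5)
p_five_white = prob_all_same_color(WHITE_SHARE, 5)

print(float(p_five_red))          # about 0.33
print(float(p_five_white))        # about 0.0003
print(p_five_red / p_five_white)  # how many times likelier all-red is
```

Under these assumptions an all-red sample is about a thousand times more probable than an all-white one. In the pocket condition no such calculation is available, because the proportions of the population being sampled are unknown.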

The important thing is not just that they know this, which is amazing, but that once they know this, then they can use that as a foundation for making all sorts of other inferences. Fei and Henry Wellman and one of my ex-students, Tamar Kushnir, have been doing studies where you show babies the unrepresentative sample . . . someone picks out five white balls from a mostly red box. And now there are red balls and white balls on the table and the experimenter puts her hand out and says, “Give me some.”

Well, if the sample wasn’t representative, then you think, well, okay, why would she have done that? She must like the white balls. And, in fact, when the sample’s not representative, the babies give her the white balls. In other words, not only do the babies recognize whether this is a random sample or not, but when it isn’t random, they say oh, this isn’t just a random sample, there must be something else going on. And by the time they’re eighteen months old, they seem to think oh, the thing that’s going on is that she would rather have white balls than red balls.
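One way to formalize the step from "this sample is improbable" to "she must like white balls" is Bayes' rule over two hypotheses. The sketch below is only an illustration: the two-hypothesis setup, the even prior, and the assumption that someone with a preference always picks her preferred color are modeling choices made here, not claims about what the studies measured or what the babies literally compute.

```python
from fractions import Fraction


def posterior_preference(color_share, n_drawn, prior_preference=Fraction(1, 2)):
    """Posterior probability that the sampler prefers the drawn color.

    Hypothesis "random": each draw follows the box proportions.
    Hypothesis "preference": she always picks her preferred color (assumed).
    """
    likelihood_random = color_share ** n_drawn
    likelihood_preference = Fraction(1)
    numerator = prior_preference * likelihood_preference
    denominator = numerator + (1 - prior_preference) * likelihood_random
    return numerator / denominator


# Five white balls from a box that is 20 percent white.
p_prefers_white = posterior_preference(Fraction(1, 5), 5)
# Five red balls from a box that is 80 percent red.
p_prefers_red = posterior_preference(Fraction(4, 5), 5)

print(float(p_prefers_white))
print(float(p_prefers_red))
```

With these assumptions, five white balls from a mostly red box make a preference for white nearly certain, while five red balls from the same box are much weaker evidence of a preference for red, since random sampling explains them fairly well.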

Not only does this show that babies are amazing, but it actually gives the babies a mechanism for learning all sorts of new things about the world. We can’t ask these kids explicitly about probability and statistics, because they don’t yet understand that two plus two equals four. But we can look at what they actually do and use that as a way of figuring out what’s going on in their minds. These abilities provide a framework by which the babies can learn all sorts of new things that they’re not innately programmed to know. And that helps to explain how all humans can learn so much, since we’re all only babies who have been around for a while.

Another thing that it turns out that kids are doing is that they’re experimenting. You see this just in their everyday play. They are going out into the world and picking up a toy and pressing the buttons and pulling the strings on it. It looks like random play, but when you look more carefully, it turns out that that apparently random play is actually a set of quite carefully done experiments that let children figure out how it is that that toy works. Laura Schulz at MIT has done a beautiful set of studies on this.

The most important thing for children to figure out is us, other human beings. We can show that when we interact with babies they recognize the contingencies between what we do and they do. Those are the statistics of human love. I smile at you and you smile at me. And children also experiment with people, trying to figure out what the other person is going to do and feel and think. If you think of them as little psychologists, we’re the lab rats.

The problem of learning is actually in Turing’s original paper that is the foundation of cognitive science. The classic Turing problem is, “Could you get a computer to be so sophisticated that you couldn’t tell the difference between that computer and a person?” But Turing said that there was an even more profound problem, a more profound Turing test. Could you get a computer, give it the kind of data that every human being gets as a child, and have it learn the kinds of things that a child can learn?

The way that Chomsky solved that problem was to say: Oh, well, we don’t actually learn very much. What happens is that it’s all there innately. That’s a philosophical answer that has a long tradition going back to Plato and Descartes and so forth. That set the tone for the first period of the cognitive revolution. And that was reinforced when developmentalists like Andrew Meltzoff, Liz Spelke, and Renee Baillargeon began finding that babies knew much more than we thought.

Part of the reason why innateness seemed convincing is that the traditional views of learning were very narrow, like Skinnerian reinforcement or association. Some cognitive scientists, particularly connectionists and neural network theorists, tried to argue that these mechanisms could explain how children learn, but the argument wasn’t convincing. Children’s knowledge seemed too abstract and coherent, too far removed from the data, to be learned by association. And, of course, Piaget rejected both these alternatives and talked about “constructivism,” but that wasn’t much more than a name.

Then about twenty years ago, a number of developmentalists working in the Piagetian tradition, including me and Meltzoff and Susan Carey, Henry Wellman, and Susan Gelman, started developing the idea that I call the “theory theory.” That’s the idea that what babies and children are doing is very much like scientific induction and theory change.

The problem with that was that when we went to talk to the philosophers of science and we said, “Okay, how is it that scientists can solve these problems of induction and learn as much as they do about the world?” they said, “We have no idea, go ask psychologists.” Seeing that what the kids were doing was like what scientists were doing was sort of helpful, but it wasn’t a real cognitive science answer.

About fifteen years ago, quite independently, a bunch of philosophers of science at Carnegie Mellon, Clark Glymour and his colleagues, and a bunch of computer scientists at UCLA, Judea Pearl and his colleagues, converged on some similar ideas. They independently developed these Bayesian causal graphical models. The models provide a graphical representation of how the world works and then systematically map that representation onto patterns of probability. That was a great formal computational advance.
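To give a flavor of what mapping a graph onto patterns of probability means, here is a minimal sketch of a three-variable causal chain, A causes B and B causes C. The variable names and all the numbers are hypothetical. The graph says the joint distribution factors as P(A) P(B | A) P(C | B), and that factorization implies a testable pattern: A and C are correlated, but once B is known, A tells you nothing more about C.

```python
from fractions import Fraction
from itertools import product

# Hypothetical parameters for the chain A -> B -> C.
P_A = Fraction(1, 2)
P_B_GIVEN_A = {True: Fraction(4, 5), False: Fraction(1, 5)}
P_C_GIVEN_B = {True: Fraction(3, 4), False: Fraction(1, 4)}


def bernoulli(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1 - p_true


def joint(a, b, c):
    """Joint probability read off the graph: P(a) * P(b | a) * P(c | b)."""
    return (bernoulli(P_A, a)
            * bernoulli(P_B_GIVEN_A[a], b)
            * bernoulli(P_C_GIVEN_B[b], c))


def prob(event):
    """Sum the joint over every assignment where the event holds."""
    return sum(joint(a, b, c)
               for a, b, c in product((True, False), repeat=3)
               if event(a, b, c))


def conditional(event, given):
    return prob(lambda a, b, c: event(a, b, c) and given(a, b, c)) / prob(given)


p_c = prob(lambda a, b, c: c)
p_c_given_a = conditional(lambda a, b, c: c, lambda a, b, c: a)
p_c_given_b = conditional(lambda a, b, c: c, lambda a, b, c: b)
p_c_given_a_and_b = conditional(lambda a, b, c: c, lambda a, b, c: a and b)

print(p_c, p_c_given_a)                # learning A changes the odds of C
print(p_c_given_b, p_c_given_a_and_b)  # given B, A adds nothing
```

Running the inference in the other direction, from observed patterns of dependence and independence back to a graph that could have produced them, is the learning problem these models make tractable.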

Once you’ve got that kind of formal computational system, then you can start designing computers that actually use that system to learn about the causal structure of the world. But you can also start asking, well, do people do the same thing? Clark Glymour and I talked about this for a long time. He would say oh, we’re actually starting to understand something about how you can solve inductive problems. I’d say gee, that sounds a lot like what babies are doing. And he’d say no, no, come on, they’re just babies, they couldn’t be doing that.

What we started doing empirically about ten years ago was to actually test the idea that children might be using these computational procedures. My lab was the first to do it, but now there is a whole set of great young cognitive scientists working on these ideas. Josh Tenenbaum at MIT and Tom Griffiths at Berkeley have worked on the computational side. On the developmental side, Fei Xu, who is now at Berkeley, Laura Schulz at MIT, and David Sobel at Brown, among others, have been working on empirical experiments with children. We’ve had this convergence of philosophers and computer scientists on the one hand, and empirical developmental psychologists on the other hand, and they’ve been putting these ideas together. It’s interesting that the two centers of this work, along with Rochester, have been MIT, the traditional locus of “East Coast” nativism, and Berkeley, the traditional locus of “West Coast” empiricism. The new ideas really cross the traditional divide between those two approaches.

A lot of the ideas are part of what’s really a kind of Bayesian revolution that’s been happening across cognitive science, in vision science, in neuroscience and cognitive psychology and now in developmental psychology. Ideas about Bayesian inference that originally came from the philosophy of science have started to become more and more powerful and influential in cognitive science in general.

Whenever you get a new set of tools, unexpected insights pop up. And, surprisingly enough, thinking in this formal, computational, nerdy way actually gives us new insights into the value of imagination. This all started by thinking about babies and children as being like little scientists, right? We could actually show that children would develop theories and change them in the way that scientists do. Our picture was . . . there’s this universe, there’s this world that’s out there. How do we figure out how that world works?

What I’ve begun to realize is that there’s actually more going on than that. One of the things that makes these causal representations so powerful and useful in AI is that not only do they let you make predictions about the world, but they let you construct counterfactuals. And counterfactuals don’t just say what the world is like now. They say here’s the way the world could be, other than the way it is now. One of the great insights that Glymour and Pearl had was that, formally, constructing these counterfactual claims was quite different from just making predictions. And causal graphical representations and Bayesian reasoning are a very good combination because you’re not just talking about what’s here and now, you’re saying . . . here’s a possibility, and let me go and test this possibility.
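The formal gap between predicting and asking "what if the world were otherwise" shows up even in a tiny model. The sketch below uses a hypothetical machine in which a hidden switch tends to turn on both a light and music; all the numbers are made up. It compares what we expect after seeing the light on with what we expect after reaching in and forcing the light on. Forcing the light is an intervention, the simplest kind of "the world could be different" query, and in a causal graph it is modeled by deleting the arrow into the variable being set.

```python
from fractions import Fraction

# Hypothetical graph: switch -> light, switch -> music.
P_SWITCH_ON = Fraction(1, 2)
P_LIGHT_GIVEN_SWITCH = {True: Fraction(9, 10), False: Fraction(1, 10)}
P_MUSIC_GIVEN_SWITCH = {True: Fraction(4, 5), False: Fraction(1, 5)}


def p_switch(state):
    return P_SWITCH_ON if state else 1 - P_SWITCH_ON


def music_given_light_observed():
    """P(music | light seen on): the light is evidence about the switch."""
    numerator = sum(p_switch(s) * P_LIGHT_GIVEN_SWITCH[s] * P_MUSIC_GIVEN_SWITCH[s]
                    for s in (True, False))
    denominator = sum(p_switch(s) * P_LIGHT_GIVEN_SWITCH[s]
                      for s in (True, False))
    return numerator / denominator


def music_given_light_forced():
    """P(music | we force the light on): the switch -> light arrow is cut,
    so the light no longer says anything about the switch."""
    return sum(p_switch(s) * P_MUSIC_GIVEN_SWITCH[s] for s in (True, False))


print(music_given_light_observed())  # seeing the light raises the odds of music
print(music_given_light_forced())    # forcing the light leaves them unchanged
```

The same probabilities give two different answers depending on whether the light was observed or set by hand, which is why a table of correlations alone cannot answer questions about what would happen if we changed something.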

If you think about that from the perspective of human evolution, our great capacity is not just that we learn about the world. The thing that really makes us distinctive is that we can imagine other ways that the world could be. That’s really where our enormous evolutionary juice comes from. We understand the world, but that also lets us imagine other ways the world could be, and actually make those other worlds come true. That’s what innovation, technology, and science are all about.

Think about everything that’s in this room right now: there’s a right-angle desk and an electric light and computers and windowpanes. Every single thing in this room is imaginary from the perspective of the hunter-gatherer. We live in imaginary worlds.

When you think that way, a lot of other things about babies and young children start to make more sense. We know, for instance, that young children have these incredible, vivid, wild imaginations. They live 24/7 in these crazy pretend worlds. They have a zillion different imaginary friends. They turn themselves into ninjas and mermaids. Nobody’s really thought about that as having very much to do with real hard-nosed cognitive psychology. But once you start realizing that the reason why we want to build theories about the world is so that we can imagine other ways the world can be, you could say that not only are these young children the best learners in the world, but they’re also the most creative imaginers in the world. That’s what they’re doing in their pretend play.

About ten years ago psychologists like Paul Harris and Marjorie Taylor started to show that children aren’t confused about fantasy and imagination and reality, which is what psychologists from Freud to Piaget had thought before. They know the difference between imagination and reality really well. It’s just they’d rather live in imaginary worlds than in real ones. Who could blame them? In that respect, again, they’re a lot like scientists and technologists and innovators.

One of the other really unexpected outcomes of thinking about babies and children in this new way is that you start thinking about consciousness differently. Now of course there’s always been this big question . . . the capital C question of consciousness. How can a brain have experiences? I’m skeptical about whether we’re ever going to get a single answer to the big capital C question. But there are lots of very specific things to say about how particular kinds of consciousness are connected to particular kinds of functional or neural processes.

Edge asked a while ago in the World Question Center, what is something you believe but can’t prove? And I thought, well, I believe that babies are actually not just conscious but more conscious than we are. But of course that’s not something that I could ever prove. Now, having thought about it and researched it for a while, I feel that, while I can’t quite prove it, I can at least make a pretty good empirical case for the idea that babies are in some ways more conscious, and certainly differently conscious, than we are.

For a long time, developmental psychologists like me had said, well, babies can do all these fantastic amazing things, but they’re all unconscious and implicit. A part of me was always skeptical about that, though, just intuitively, having spent so much time with babies. You sit opposite a seven-month-old, and you watch their eyes and you look at their face and you see that wide-eyed expression and you say, goddamn it, of course she’s conscious, she’s paying attention.

We know a lot about the neuroscience of attention. When we pay attention to something as adults, we’re more open to information about that thing, but the other parts of our brain get inhibited. The metaphor psychologists always use is that it’s like a spotlight. It’s as if what happens when you pay attention is that you shine a light on one particular part of the world, make that little part of your brain available for information processing, change what you think, and then leave all the rest of it alone.

When you look at both the physiology and the neurology of attention in babies, what you see is that instead of having this narrow focused top-down kind of attention, babies are open to all the things that are going on around them in the world. Their attention isn’t driven by what they’re paying attention to. It’s driven by how information-rich the world is around them. When you look at their brains, instead of just, as it were, squirting a little bit of neurotransmitter on the part of their brain that they want to learn, their whole brain is soaked in those neurotransmitters.

The thing that babies are really bad at is inhibition, so we say that babies are bad at paying attention. What we really mean is that they’re bad at not paying attention. What we’re great at as adults is not paying attention to all the distractions around us, and just paying attention to one thing at a time. Babies are really bad at that. But the result is that their consciousness is like a lantern instead of being like a spotlight.

They’re open to all of the experience that’s going on around them.

There are certain kinds of states that we’re in as adults, like when we go to a new city for the first time, where we recapture that baby information processing. When we do that, we feel as if our consciousness has expanded. We have more vivid memories of the three days in Beijing than we do of all the rest of the months that we spend as walking, talking, teaching, meeting-attending zombies. So we can actually say something about what babies’ consciousness is like, and that might tell us some important things about what consciousness itself is like.

I come from a big, close family. Six children. It was a somewhat lunatic artistic intellectual family of the 1950s and 1960s, back in the golden days of postwar Jewish life. I had this wonderful rich, intellectual and artistic childhood. But I was also the oldest sister of six children, which meant that I was spending a lot of time with babies and young children.

I had the first of my own three babies when I was twenty-three. There have really only been about five minutes in my entire life when I haven’t had babies and children around. I always thought from the very beginning that they were the most interesting people there could possibly be. I can remember being in a state of mild indignation, which I’ve managed to keep up for the rest of my life, about the fact that other people treated babies and children contemptuously or dismissively or neglectfully.

At the same time, from the time I was very young, I knew that I wanted to be a philosopher. I wanted to actually answer, or at least ask, big, deep questions about the world, and I wanted to spend my life talking and arguing. And, in fact, that’s what I did as an undergraduate. I was an absolutely straight down the line honors philosophy student as an undergraduate at McGill, president of the Philosophy Students Association, etc. I went to Oxford partly because I wanted to do both philosophy and psychology.

But what kept happening to me was that I’d ask these philosophical questions, and I’d say, well, you know, you could find out. You want to know where language comes from? You could go and look at children, and you could find out how children learn language. Or you want to find out how we understand about the world? You could look at children and find out how they, that is we, come to understand about the world. You want to understand how we come to be moral human beings? You could look at what happens to moral intuition in children. And every time I did that, back in those bad old days, the philosophers around me would look as if I had just eaten my peas with a knife. One of the Oxford philosophers said to me after one of these conversations, “Well, you know, one’s seen children about, of course. But one would never actually talk to them.” And that wasn’t atypical of the attitude of philosophy toward children and childhood back then.

I still think of myself as being fundamentally a philosopher; I’m an affiliate of the Philosophy Department at Berkeley. I give talks at the American Philosophical Association and publish philosophical papers. It’s coincidental that the technique I use to answer those philosophical questions is to look at children and think about children. And I’m not alone in this. Of course, there are still philosophers out there who believe that philosophy doesn’t need to look beyond the armchair. But many of the most influential thinkers in philosophy of mind understand the importance of empirical studies of development.

In fact, largely because of Piaget, cognitive development has always been the most philosophical branch of psychology. That’s true if you look not just at the work that I do, but the work that people like Andrew Meltzoff or Henry Wellman or Susan Carey or Elizabeth Spelke do, or certainly what Piaget did himself. Piaget also thought of himself as a philosopher who was answering philosophical questions by looking at children.

Thinking about development also changes the way we think about evolution. The traditional picture of evolutionary psychology is that our brains evolved in the Pleistocene, and we have these special purpose modules or innate devices for organizing the world. They’re all there in our genetic code, and then they just unfold maturationally. That sort of evolutionary psychology picture doesn’t fit very well with what most developmental psychologists see when they actually study children.

When you actually study children, you certainly do see a lot of innate structure. But you also see this capacity for learning and transforming and changing what you think about the world and for imagining other ways that the world could be. In fact, one really crucial evolutionary fact about us is that we have this very, very extended childhood. We have a much longer period of immaturity than any other species does. That’s a fundamental evolutionary fact about us, and on the surface a puzzling one. Why make babies so helpless for so long? And why do we have to invest so much time and energy, literally, just to keep them alive?

Well, when you look across lots and lots of different species, birds and rodents and all sorts of critters, you see that a long period of immaturity is correlated with a high degree of flexibility, intelligence, and learning. Look at crows and chickens, for example. Crows get on the cover of Science using tools, and chickens end up in the soup pot, right? And crows have a much longer period of immaturity, a much longer period of dependence than chickens.

If you have a strategy of having these very finely shaped innate modules just designed for a particular evolutionary niche, it makes sense to have those in place from the time you’re born. But you might have a more powerful strategy. You might not be very well designed for any particular niche, but instead be able to learn about all the different environments in which you can find yourself, including being able to imagine new environments and create them. That’s the human strategy.

But that strategy has one big disadvantage, which is that while you’re doing all that learning, you are going to be helpless. You’re better off being able to consider, for example, should I attack this mastodon with this kind of tool or that kind of tool? But you don’t want to be sitting and considering those possibilities when the mastodon is coming at you.

The way that evolution seems to have solved that problem is to have this kind of cognitive division of labor, so the babies and kids are really the R&D department of the human species. They’re the ones that get to do the blue-sky learning, imagining, thinking. And the adults are production and marketing. We can not only function effectively but we can continue to function in all these amazing new environments, totally unlike the environment in which we evolved. And we can do so just because we have this protected period when we’re children and babies in which we can do all of the learning and imagining. There’s really a kind of metamorphosis. It’s like the difference between a caterpillar and a butterfly, except it’s more like the babies are the butterflies that get to flitter around and explore, and we’re the caterpillars who are just humping along on our narrow adult path.

Thinking about development not only changes the way you think about learning, but it changes the way that you think about evolution. And again, it’s this morally appealing reversal, which you’re seeing in a lot of different areas of psychology now. Instead of just focusing on human beings as the competitive hunters and warriors, people are starting to recognize that our capacities for caregiving are also fundamental, and in many respects even more fundamental, in shaping what our human nature is like.