Girls and boys, in short, would play harmlessly together, if the distinction of sex was not inculcated long before nature makes any difference.
—Mary Wollstonecraft,
A Vindication of the Rights of Woman, 1792
“We live in jeans, don’t we? They go with everything!” coos the mother. Her six-month-old daughter is wearing the tiniest pair of jeans I’ve ever seen, and she herself is dressed head to foot in denim.
We’re sitting together in the baby lab at Birkbeck College in central London. It reminds me of a nursery, but a somewhat unusual one. A purple elephant decorates the door to a waiting area full of toys. Downstairs, meanwhile, a baby might be hooked up to an electroencephalograph that monitors her brain’s electrical activity while she watches pictures on a screen. In another room, scientists could be watching a toddler play, examining which toys he happens to choose. Meanwhile, in this small laboratory that I’ve been invited into, a baby is being gently stroked along her back with a paintbrush. She’s the thirtieth infant to be studied so far in this experiment.
“She really just likes sitting and watching, taking it all in. I’m happy sitting and observing, myself,” her mother continues, bouncing the girl on her knee. Researchers suspect that human touch like this has an important impact on development in the early years. They just don’t know how or why. So the goal of today’s experiment is to measure how touch affects a baby’s cognitive development. It’s one of countless ways in which children are affected by their upbringing, slowly shaped into the people they will become.
Cute though babies are, studying them this way is not as much fun as it might seem. It’s almost like working with animals. The challenge is to come up with clever experiments that get to the heart of their behavior without accidentally reading too much into what an infant does. A stare can be meaningful or mindless, while even the most charming smile may just be wind. In this case, the researchers are using a paintbrush to run their touch experiment because that’s the only way to control for parents stroking their children in different ways. With a brush, you can be sure it’s the same every time.
Unfortunately, the baby’s bottom lip begins to quiver and she erupts into tears. It’s clear the paintbrush doesn’t measure up to real touch. This is one result that can’t be used.
“This is what baby science is. Trying to get a signal out of the noise,” laughs Teodora Gliga, a psychologist at Birkbeck’s Centre for Brain and Cognitive Development, who carries out research in the baby lab. Gliga’s work focuses on how children develop in their early years, in the tradition of the Swiss psychologist Jean Piaget who, from the early twentieth century, observed his own children and famously realized that many of the assumptions scientists had made about early development were wrong. Babies aren’t blank slates. Instead, he believed they are preprogrammed with their own ways of organizing knowledge about the world. The simplest example of this is a newborn’s instinctive reflex to suck.
But this is just the start, scientists are realizing. The aim now is to figure out exactly how smart children are at birth, and what this means. One other use of baby research is to investigate differences between boys and girls. If children really are preprogrammed in some way, is the programming different depending on the sex? Do little girls prefer dolls dressed in pink because they’re female or because society has taught them they should prefer dolls and the color pink?
Plenty of research has already been done. We know that around the age of two or three, children start to become aware of their own sex. Between the ages of four and six, a boy will realize that he will grow up to be a man and a girl that she will be a woman. It’s also by then that children have some understanding of what’s appropriate for each gender according to the culture they’re in. American psychologist Diane Ruble and gender development expert Carol Lynn Martin have explained how, by the age of five, children already have in their heads a constellation of gender stereotypes. They describe one experiment in which children were shown pictures of people doing things like sewing and cooking. When a picture contradicted a traditional stereotype, the kids were more likely to remember it incorrectly. In one instance, instead of remembering that they had seen a picture of a girl sawing wood—which they had—some said instead that they’d seen one of a boy sawing wood.
Some parents are acutely aware of the problem. The mother of the baby I’m observing in the lab today tells me that she’s a researcher with a PhD and she would like her daughter to have a PhD one day, too. Along the way, she’s trying to avoid exposing her to gender stereotypes that might harm her sense of what she’s able to do. “I’m not averse to pink, but we’ve tended to buy navy and blue things,” she tells me. Someone offered to sell her a dolls’ house recently, but she refused to take it. “I’d rather have something more neutral,” she adds.
Researchers like those at Birkbeck College have realized that one of the most effective ways for scientists to sift nature from nurture, the biological from the social, is by studying children so young that they haven’t yet been exposed to society’s heavily gendered ways. “I don’t think that studying adults tells us anything about sex differences. It tells us something about the lives those people lived. It’s more about their experiences than about the biology of it,” explains Teodora Gliga.
“The earlier you go in development, the closer you are to nature.”
In 2000 a brief scientific article was published in the international journal Infant Behavior and Development describing an experiment that would shape the way people around the world thought about sex differences at birth. It was written by a team from the Departments of Experimental Psychology and Psychiatry at Cambridge University, which included Simon Baron-Cohen, a psychologist, neuroscientist, and famous expert on the medical condition autism. The paper claimed to prove for the first time that there were noticeable and important sex differences in the way newborn babies behaved.
The results were so powerful that they’ve been cited at least three hundred times in other research papers, as well as in books about pregnancy and childhood. When the then president of Harvard University, Lawrence Summers, controversially suggested in 2005 that the shortfall of female scientists and mathematicians might be because of innate biological differences between women and men, Simon Baron-Cohen used this study to defend him. Harvard University cognitive scientist Steven Pinker and London School of Economics philosopher Helena Cronin have both deployed it to argue that innate differences between the sexes exist. It has even made it into a Bible-inspired self-help book, His Brain, Her Brain, about how “divinely designed differences” between the sexes can help strengthen a marriage.
Since 2000, Baron-Cohen’s department has made a formidable name for itself. At the time his paper was published, he was just two years away from unveiling a controversial and wide-ranging new theory about men and women, which he has named empathizing-systemizing theory. Its basic message is that the “female” brain is hardwired for empathy, while the “male” brain is built for analyzing and building systems, like cars and computers. People may show varying degrees of maleness and femaleness in their brains, but as the adjectives helpfully suggest, men on average tend to have “male” brains while women tend to have “female” ones.
Autism, which makes it difficult for people to understand and relate to others, is an extreme version of the male brain, adds Baron-Cohen. This is why people diagnosed with autism (until a few decades ago, they were almost all men, but many more women are now being identified with the condition, too) sometimes show unusual systemizing behavior, like the ability to do mathematical calculations in their heads very quickly or to memorize train timetables.
As yet, no one has been able to fully explain how, at the very start of a child’s life, its brain gets set on a path toward being more male or more female. If such a mechanism is at work, the details are likely to be complicated. But according to Baron-Cohen, the crucial element is sex hormones—the chemicals at the root of many of the physical differences we see between women and men. Testosterone exposure in the womb, he argues, doesn’t affect just the gonads and genitals but somehow also seeps into the male fetus’s developing brain, molding it into a systemizing male brain. Female fetuses, which tend not to have as much testosterone, are left by default with empathizing female brains.
So then, what was the significance of his paper on newborn babies? Baron-Cohen wanted to see whether the stereotypes of women having stronger social skills and men being more mechanically minded might have a biological basis—in other words, whether girls are born empathizers and boys are born systemizers. For the first time anywhere, as far as he and his team were aware, they convinced the maternity ward of a local hospital to allow them to run a study on the youngest possible group of people. More than a hundred babies were included in the study, all around two days or younger, and all clearly far too young to be affected by social conditioning. What they would observe, they claimed, would be nature untainted by nurture. And this made it a vitally important piece of evidence on which his empathizing-systemizing theory would hang.
As many senior scientists do, Baron-Cohen left the experiment itself to a junior colleague, who had just joined his team. Jennifer Connellan was a twenty-two-year-old American postgraduate student. “I can’t believe he accepted me into his lab actually,” she tells me. By her own admission, she was young and inexperienced. Before arriving at Cambridge she was lifeguarding on a beach in California.
Each day, Connellan would turn up to the maternity ward to see if any mothers had given birth. The experiment itself was simple. “We wanted to contrast social versus mechanical,” she says. So every baby was shown a face, which happened to be Connellan’s own, and a mechanical mobile made from a picture of Connellan’s face. They then measured how long every child looked at each one, if they looked at all. This long-established experimental method in baby research is known as “preferential looking.” More socially inclined babies, the researchers hypothesized, would prefer to stare at the face, while more mechanically inclined babies might choose to look at the mobile. “It was quite rudimentary as far as the design,” she recalls. “I felt like it was kind of like a science fair project.”
When the results came in, a large proportion of babies showed no preference for the face or the mobile. But around 40 percent of the baby boys preferred to look at the mobile, compared to a quarter who preferred the face. Meanwhile, around 36 percent of the baby girls preferred the face, while only 17 percent preferred the mobile. It certainly wasn’t the case that every boy was different from every girl, but, in research terms, the difference was statistically significant, enough for the scientific community to take notice.
In the published paper, Jennifer Connellan, Simon Baron-Cohen, and their colleagues argued that this was overwhelming evidence that boys are born with a stronger interest in mechanical objects, while girls tend to have naturally better social skills and more emotional sensitivity. “Here we demonstrate beyond reasonable doubt that these differences are, in part, biological in origin,” they wrote.
“We were surprised that it was significant, that there was a significant difference,” Connellan remembers. “[Baron-Cohen] was excited. I would say both of us were. We spent a lot of time going through it, making sure the results were what we thought they were.” And sure enough, there it was, some of the seemingly strongest evidence yet that boys and girls really are born different. Cultural stereotypes about women being more empathic and men being more interested in building things might not just be due to the way their parents raised them and how society treated them.
“The fact that this was the earliest gender difference, that part was almost, like, shocking,” she tells me.
The next few years saw Simon Baron-Cohen put more meat on the bones of his idea that there are such things as distinctly female and male brains.
In 2003 he published The Essential Difference, a book written for the general public that lays bare what he sees as fundamental gaps between how men and women think. It includes a description of Connellan’s experiment, along with pictures of her face and the mobile she showed the babies. “This sex difference in social interest was on the first day of life,” he writes, adding elsewhere: “This difference at birth echoes a pattern we have seen right across the human lifespan. For example, on average, women engage in more ‘consistent’ social smiling.” The clear implication is that the sexes don’t appear to behave differently because of society or culture, but because of something profoundly innate and biological.
The differences, Baron-Cohen explains in his book, can be spotted in the types of hobbies people tend to choose.
Those with the male brain tend to spend hours happily engaged in car or motorbike maintenance, small-plane piloting, sailing, bird- or train-spotting, mathematics, tweaking their sound systems, or busy with computer games and programming, DIY or photography. Those with the female brain tend to prefer to spend their time engaged in coffee mornings or having supper with friends, advising them on relationship problems, or caring for people or pets, or working for volunteer phone-lines listening to depressed, hurt, needy or even suicidal callers.
It’s a slightly odd list. Peculiarly middle class and English, for one. It’s also difficult not to notice that the male brain appears better suited to higher-paying, higher-status jobs like computer programming or mathematics, while the female brain seems to fit best with lower-status jobs, such as a caregiver or unpaid helpline worker.
Nonetheless, Baron-Cohen’s ideas have been popular. His paper on the extreme male brain theory of autism has been cited more than a thousand times by other researchers. And the ideas behind empathizing-systemizing theory have been widely mentioned by academics and intellectuals working in child development and gender. The eminent British biologist Lewis Wolpert talks about his work in his own book on sex differences, Why Can’t a Woman Be More Like a Man?, published in 2014. “In general. . .the trend may be summarised as males tending to think narrowly while females think broadly,” writes Wolpert.
Professor of biology and gender studies at Brown University Anne Fausto-Sterling, however, is wary of research that claims to see sex differences in such young children. It’s a controversial area of science, especially given how unpredictable babies can be. It’s also too easily swallowed by parents looking to understand their kids better, she adds. “You see it on baby websites. You know, ‘Expect your girl to do this, expect your boy to do this.’” When scientists make these claims, argues Fausto-Sterling, they need to be sure their findings are reliable. If Simon Baron-Cohen’s work is taken seriously, his ideas could have important consequences for the way society makes judgments about what men and women should be doing with their lives. “I think you end up having a theory that gives you permission to limit both boys and girls to certain kinds of behaviors or longer term interests, eventually vocations,” she adds.
Simon Baron-Cohen was always aware that he was wading into divisive territory. He writes near the start of The Essential Difference that he delayed finishing it for years because he thought the topic was too politically sensitive. He makes the defense often made by scientists when they’re publishing work that might be interpreted as sexist—that science shouldn’t shy away from the truth, however uncomfortable it is. It’s a claim that runs all the way through work by people who claim to see sex differences. Objective research, they say, is objective research.
“A lot of research findings never get replicated and are probably false.”
When sex hormones were identified at the start of the twentieth century, many scientists assumed they had just a fleeting effect on sexual behavior, the same way we now realize that someone might get an adrenaline rush when they’re stressed or a surge of oxytocin when they’re in love. As research progressed, however, some began to suspect that there might be something more permanent going on.
In 1980 two American researchers, psychologist and primate expert Robert Goy and neuroscientist Bruce McEwen, published a survey of animal experiments from preceding decades that explored the effects of testosterone levels around the time of birth. One study revealed that female rats given a single injection of testosterone on the day they were born showed less sexual behavior associated with females and more that associated with males when they became adults. Similar results were shown in rhesus monkeys, a species that’s biologically not so far from humans and often used in research. A rhesus monkey was the first mammal sent into space, for instance. The more testosterone the monkeys were given, the more dramatic the differences.
Goy and McEwen’s book Sexual Differentiation of the Brain claimed that testosterone has a lasting impact on future sexual behavior. But research like theirs couldn’t be divorced from the age in which it was being done. Both science and gender studies had established the enormous role that culture plays in gender identity. In 1980 people commonly assumed that male and female brains were the same, and that behavioral differences in adults must be due to the way people were raised by their parents and shaped by society. One commentator compared talking about fetal testosterone and sex differences in the brain to talking about race and gaps in intelligence.
In an atmosphere like this, ideas like Goy and McEwen’s marked a radical shift. And of course, they didn’t go unchallenged. Critics pointed out, for instance, the bias in language being used to describe masculinity and femininity. Anything tomboyish, for instance, was interpreted as a girl behaving like a boy. But who was to say that this wasn’t in fact a normal, common feature of being female? Others later complained that theories relying on primate studies for evidence didn’t take into account that monkeys might treat their male and female offspring differently, as humans tend to do. If their genitals were affected by hormone treatment, this might affect how their mothers related to them, which might then have repercussions on their play or sexual behavior as adults.
Even though not everyone was comfortable with Goy and McEwen’s findings, their line of research continued. It took its biggest leap with the controversial idea that the brain’s entire structure might be shaped by testosterone levels in the womb, making men and women fundamentally different from birth—affecting not just sexual behavior but other behaviors as well.
Scottish neurologist Peter Behan and the US-based neurologists Norman Geschwind and Albert Galaburda said that studies on rats and rabbits showed how, even before a baby was born, higher than normal levels of testosterone slowed development on the left side of the brain, making the right side more dominant. Extended to humans, since boys naturally have more testosterone exposure before birth than girls do, it followed that men would be the ones who tend to have this larger right half of the brain. Interviewed by a reporter for the journal Science in 1983, Geschwind claimed that, if the mechanism connecting higher than usual levels of testosterone and the way a person responded to it was “just right, you get superior right hemisphere talents, such as artistic, musical or mathematical talent.” It might explain, he implied, the higher numbers of world-class male rather than female composers and artists.
At the time, there was no medical way of safely measuring testosterone levels in a living fetus. So Geschwind instead relied on studying people who were left-handed (the right half of the brain tends to control muscles on the left side of the body, and vice versa, so someone with a dominant right half would be more likely to be left-handed). By this rough measure, at least one study at the time did indeed show slightly more left-handers among mathematically gifted children compared to the population in general.
In 1984 Geschwind and Galaburda published a book titled Cerebral Dominance, spelling out how their evidence supported the concept that men’s brains were profoundly steered in a different direction by testosterone. And this is the very research that Simon Baron-Cohen has called upon in developing his own theory about empathizing female brains and systemizing male brains.
Geschwind died the same year that Cerebral Dominance came out. His death left the lingering question of whether he was right. Did the small amount of evidence in its favor mean that male brains really were hugely shaped by testosterone, or was the truth more complicated? “He was one of the most distinguished of neurologists,” says Chris McManus, a professor of psychology at University College London, who has spent years dissecting the Geschwind-Behan-Galaburda theory. This was part of the problem with his work on testosterone and the brain, he adds. Geschwind’s eminence in his field made it easy for his theory to be published in important journals, even when it turned out that the evidence for it was worryingly thin.
According to McManus, the Geschwind-Behan-Galaburda theory simply tried to do too much. At the time, it became a grand theory of how the brain was organized, drawing big connections between things that weren’t necessarily connected, and between which the connections hadn’t been proven. It was so broad that, even to this day, researchers have difficulty pinning it down. “If you’re lucky, you can make it explain anything. . . . You can cut these things any way you want when you float free from data,” says McManus.
But that doesn’t mean that it was utter hokum.
Since the 1980s, detailed research using new techniques on animals does seem to suggest that sex hormones affect the brains of fetuses as they develop, leading to small differences in certain behaviors later on. It’s a phenomenon that now has enough evidence behind it that neuroscientists and psychologists feel they cannot ignore it, even if this runs counter to their instincts. This is the unexpected nature of science: findings don’t always sit happily with politics, and results are not always black and white. In this case, even though Geschwind’s grand theory turns out to have been a little too grand, there may have been a kernel of a promising line of research hidden inside it.
In 2010 Cambridge University psychology professor Melissa Hines, who has carried out some of the world’s most influential studies on sex and gender and is heavily referenced in Baron-Cohen’s own papers, wrote in the journal Trends in Cognitive Sciences that thousands of experimental studies on nonhuman mammals show testosterone levels in the womb really do have an effect on behavior later on. Work like this is done by artificially injecting primates with extra hormones before monitoring their behavior. Her article includes a compelling pair of photographs, one of a female monkey inspecting a doll and the other of a male monkey moving a toy police car along the floor in a way that a child might.
But then, monkeys and humans are not the same. Making the leap from animals to people is critical to proving whether testosterone really does shape our own complicated minds in the same ways. If there is a similar difference, is it small, as it is in other mammals? Or is it large, in the way that Simon Baron-Cohen at Cambridge University suggests it is in his controversial empathizing-systemizing theory of male and female brains? Where does the truth lie?
Of course, the ethical standards for doing this research with humans are very different from those for primates. Scientists can’t artificially inject a fetus or a child with more hormones to study the effects. Instead they must turn to people who have naturally very high or very low sex hormone levels. And these people are rare.
“I was unfinished when I was born,” says Michael.
Michael isn’t his real name, which I agree not to use. His real name isn’t even his original name, which was Eilean. Michael’s fifty-first birthday was two days ago, but he tells me he chooses not to celebrate it because he doesn’t want to be reminded of the day he was born. That was the day his parents were told to raise him as a girl.
Michael was born a man, but a rare genetic condition meant that, at birth, his body didn’t reflect this. Women typically have two sex chromosomes, known as “XX,” and men have two called “XY.” This Y chromosome is crucial because it helps prompt a fetus to produce androgens such as testosterone that make his body become obviously male inside the womb. Genes and hormones working together in this way are what make males look more male and females look more female. Michael is a regular XY male, but he has five-alpha-reductase deficiency, which means he’s missing the enzyme that converts testosterone into a chemical that’s crucial to developing the sex organs before birth. This means that he was genetically male, but his genitalia were ambiguous.
Cases like Michael’s have helped biologists and psychologists get a grip on what it really means for humans to be born biologically one sex or another. If we want to know how sex hormones influence how masculine or feminine a person is, there’s no better way than to study a person who is genetically male but whose body doesn’t respond to hormones in the same way as the average man.
“When I was born, my sex wasn’t determined at first look,” he explains. “I had a penis but it was very, very small.” It used to be common in cases like these for doctors to advise people like Michael to live their lives out as girls, because surgery to make their genitals appear female is simpler than constructing a penis. When Michael was born, experts believed that gender was so heavily shaped by society that this was a perfectly reasonable choice to make. If he were treated like a girl, he might feel like one. Some children in similar situations have adapted to their new gender identities. But for many, including Michael, decisions like these have led to personal tragedy.
His underdeveloped testes were left inside his body, before being partly removed when he was five years old, long before puberty set in. This surgery was accidentally left incomplete, which meant that he was still producing small amounts of testosterone. The whole time, he was oblivious to his genetic sex. To the world he was a girl, but inside he became increasingly aware that he didn’t feel like one.
At around the age of three, he started showing an interest in typical boys’ toys. Later in his school’s physical education lessons, when girls were told to go to one side of the sports hall and boys to the other, he would stand in the middle, uncertain. “The teachers kept separating me off from the boys,” he remembers. For a young boy, the situation was as tragic as it was confusing. Another time, when a shopkeeper asked him, “What can I get for you, son?” he imagined in delight that she must have seen him for who he truly was. When someone behind him explained that he was actually a girl, it felt like a slap in the face. “As I got older, I looked at my grandmother, and mother, and female cousins and realized I will never be like them,” he recalls.
His childhood was an impossible confusion, trapped between what society expected of him—including being repeatedly told, “Girls don’t behave like that!”—and his personal conviction that he was a boy. He remembers his shame when, as a member of a choir as a child, his voice began to break and he had no choice but to blame it on a sore throat. When he was much older, people assumed he was just a very athletic girl. “People identified me as a tomboy,” he explains.
People with conditions like Michael’s are today described as “intersex.” It’s an umbrella under which many extremely rare conditions sit, including androgen insensitivity syndrome, in which a person with male chromosomes appears entirely female because their body doesn’t recognize testosterone, and congenital adrenal hyperplasia, in which women are born looking female but have high levels of male hormones, which can cause ambiguous genitalia. They’re not eunuchs or hermaphrodites. They don’t fit the binary categories of male and female, but instead occupy a biological middle point, which many people have yet to accept or understand.
“I have seen less than ten cases in my entire career,” says British endocrinologist Richard Quinton of androgen insensitivity syndrome, one condition he treats. A career spent observing people with intersex conditions, along with others who want to change gender, has given Quinton a special insight into how hormones affect sexual identity. Many patients choose to keep quiet about their conditions. But Quinton heard of an instance once in the Middle East where two sisters, both with androgen insensitivity syndrome, brought a case before the Islamic courts to be recognized as men so they could secure their family inheritance, which wouldn’t be passed down to them if they were women. With congenital adrenal hyperplasia, he says, “at the extreme end you can have some births that tend to look male,” although most look female with some masculine features. These patients “are said to be more tomboyish in their behavior, certainly in childhood. And when older, many are also attracted to the same sex.”
At sixteen years old, after finding out his true medical history, Michael finally had a chance to make his own decision about how to live the rest of his life. At nineteen he began transitioning into a man, taking weekly testosterone injections. His voice got deeper; hair grew on his arms, legs, and face; and he developed more muscle. “It was like the sun coming out,” he says.
The genital surgery inflicted on him when he was born was described at the time as “tidying up,” but he sees it now as child abuse. “What happens with a lot of these children is that they grow up in confusion,” says Michael, who has since found acceptance and understanding through the support group UK Intersex Association.
Today Michael is a psychologist with a specialty in child mental health, a career he chose partly because of his own experiences. His voice is strong and clear. His gender is indisputably male. He is also living evidence that at least some aspects of gender identity must be rooted in biology. Hormones don’t just affect how our bodies look, but how we perceive ourselves, too. The question this then raises is how much more of an effect do hormones have on how we think and behave? How much do testosterone, estrogen, and progesterone shape our minds and steer them in different directions?
I’m told that psychology professor Melissa Hines is one of the most balanced and fair researchers in her field—which is important in a field that is sometimes neither balanced nor fair. Her office, at the end of a warren of old, wooden corridors behind a small lane in Cambridge, is lined with books about gender from all sides of the debate. She chooses her words carefully.
“We’ve looked at a variety of behaviors,” she begins. Hines relies on intersex cases like Michael’s to carry out her research on the effects that hormones have on psychological sex difference, including intelligence. Like baby research, this is an important part of the equation when it comes to understanding nature and nurture. If testosterone does steer boys toward having a distinctly male brain, different from a female brain, then we should see clear differences in how people with unusually high or low testosterone behave.
Her findings reveal three areas that show a statistically marked difference. Starting with the obvious first, “for gender identity, the differences are huge. Most men think of themselves as men and most women don’t,” she states. “The second thing is sexual orientation. Most women are interested in men, and most men aren’t.” The third one is childhood play behavior. Studying girls with congenital adrenal hyperplasia, with higher than normal levels of testosterone, she found, “Rough-and-tumble play is increased in girls exposed to androgens. They like boys’ toys a bit more, girls’ toys a bit less, and they like to play with boys more than the average girl does, but not as much as the average boy. That’s been replicated by seven or eight independent research teams.”
The fact that research is replicated is crucial. A lot of work in the field of psychology, even the most widely reported on in the press, hasn’t been. If a number of independent scientists come to the same conclusions based on different studies across a broad range of people, then it’s far easier to be confident about the results. “A lot of research findings never get replicated and probably are false,” she admits. “It’s just the way science works. You can’t study the whole world, so you have to take a sample, and your sample may or may not be representative.” This is so important to Hines that, when I meet her, she goes so far as to warn me that she isn’t even sure about the reliability of some of her own research because it hasn’t yet been replicated elsewhere.
On toy preferences, now, she has little doubt left. “One of the first studies I did in this area was bringing children into the playroom with all the toys and just recording how much time they spend playing with each toy,” she describes. “I was really surprised by the results because, at the time, the thought was that toy choices are completely socially determined. And you can see why, because there is so much social pressure for children to play with the gender-appropriate toy.” She and others found in study after study that boys on average really do prefer to play with trucks and cars, while girls on average prefer dolls. “The main toys are vehicles and dolls. Those are the most gendered type of toys,” she says.
A study that Hines and her colleagues carried out on infants in 2010, watching for how long children look at one toy over another, suggested that these preferences might start to emerge close to the age of two. “Between twelve and twenty-four months, children were already showing preferences for sex-typed toys. So, the girls were looking longer at the dolls than at the car, and the boys were looking longer at the car than at the doll,” she says. But at twelve months, both boys and girls spent longer looking at the doll than the car.
Statistically, this difference in how young children play is significant. “Toy preferences, I like to compare to height,” she explains. “We know that men are taller than women but not all men are taller than all women. So the size of that sex difference is two standard deviations. The sex difference in time playing with dolls versus trucks is about the same as the sex difference in height.” A standard deviation is a measure of how spread out data are. The spread of height looks like a bell-shaped curve. The average height of men is around sixty-nine inches and the standard deviation is three inches. This means that, in a large group of men, more than two-thirds will be within one standard deviation of the average, making them between sixty-six and seventy-two inches tall. The farther you get from the average, toward the thin ends of the bell curve, the fewer men there are. Two standard deviations away will be men who are six inches taller or six inches shorter than the average man (less than 5 percent of men are two or more standard deviations away from the average). A difference in behavior of two standard deviations between men and women would therefore be like a difference of six inches between their average heights. In everyday life, it’s a noticeable gap.
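The arithmetic behind the height comparison can be checked in a few lines. This is only a sketch: it assumes that men's heights follow a perfect bell curve, and it uses the figures quoted above (an average of sixty-nine inches, a standard deviation of three). The function name is made up for illustration.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal (bell-shaped) distribution lying within
    k standard deviations of the average."""
    return math.erf(k / math.sqrt(2))

MEAN_IN = 69.0  # average male height in inches, as quoted in the text
SD_IN = 3.0     # standard deviation in inches, as quoted in the text

for k in (1, 2):
    low = MEAN_IN - k * SD_IN
    high = MEAN_IN + k * SD_IN
    inside = fraction_within(k)
    print(f"within {k} SD ({low:.0f} to {high:.0f} inches): "
          f"{inside:.1%} of men, leaving {1 - inside:.1%} outside")
```

Run as written, it should put roughly 68 percent of men within one standard deviation and a little over 95 percent within two, which matches "more than two-thirds" and "less than 5 percent" above.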
In studying girls with congenital adrenal hyperplasia, Hines’s team was keen to test whether they might be getting some unconscious encouragement to play with boys’ toys, perhaps because their families knew of their intersex condition. “So we thought, let’s bring parents in with them and see how they react. Are they encouraging the girls to play in that way or not in the playroom?” she says. “But we found what they actually did was try to get them to play with female typical toys. More so than with their other daughters, they would introduce female toys. If she was playing with a female toy they would say, ‘That’s nice,’ and give them a hug.” It’s more evidence, she implies, that the differences they’ve seen in toy preferences aren’t purely due to social conditioning but have a biological element, too.
This difference in toy choices, however, is a far leap from the theory that the brains of men and women are deeply structurally different because of how much testosterone they’ve been exposed to. It’s also a considerable distance from Baron-Cohen’s claim that there’s such a thing as a typical male brain and a typical female brain—one that prefers mathematics and another that likes coffee mornings. For him to be right, there would have to be noticeable gaps in lots of other behaviors as well. Those with female brains would have to clearly behave on average like empathizers and those with male brains like systemizers.
According to Hines, this isn’t what we see. Tallying all the scientific data she has seen across all ages, Hines believes that the “sex difference in empathizing and systemizing is about half a standard deviation.” With a standard deviation of three inches, this would be equivalent to a gap of about an inch and a half between the average heights of men and women. It’s small. “That’s typical,” she adds. “Most sex differences are in that range. And for a lot of things, we don’t show any sex differences.”
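One rough way to picture the gap between two standard deviations and half of one is to ask how much two bell curves overlap at each distance. The sketch below is an illustration, not anything taken from Hines's work. It assumes both groups are normally distributed with the same spread, and the function names are invented here. It also converts each gap into inches using the three-inch standard deviation for height quoted earlier.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def overlap(d: float) -> float:
    """Shared area of two equal-spread bell curves whose averages
    sit d standard deviations apart (1.0 means identical curves)."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)

def chance_higher(d: float) -> float:
    """Chance that a randomly picked member of the higher-scoring group
    outscores a randomly picked member of the other group."""
    return normal_cdf(d / math.sqrt(2))

HEIGHT_SD_IN = 3.0  # standard deviation of height in inches, from the text

for label, d in (("toy preference", 2.0), ("empathizing and systemizing", 0.5)):
    print(f"{label}: gap of {d} SD, like {d * HEIGHT_SD_IN:.1f} inches of height; "
          f"curves overlap by {overlap(d):.0%}; "
          f"chance of outscoring {chance_higher(d):.0%}")
```

Under these assumptions, a half standard deviation leaves the two curves sharing most of their area, while two standard deviations leaves them sharing only about a third.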
Researchers have known this for a long time. In their 1974 book The Psychology of Sex Differences, American researchers Eleanor Maccoby and Carol Nagy Jacklin picked through an enormous mass of studies looking at similarities and differences between boys and girls. They concluded that the psychological gaps between women and men were far smaller than the differences that existed in society among women and among men. In 2010, Hines repeated this exercise using more recent research. She found that only the tiniest gaps, if any, existed between boys’ and girls’ fine motor skills, ability to perform mental rotations, spatial visualization, mathematics ability, verbal fluency, and vocabulary. On all these measures, boys and girls performed almost the same.
Teodora Gliga from the Birkbeck baby lab agrees that when it comes to children raised under normal conditions, without unusual medical conditions, large gaps between girls and boys haven’t been found. “It’s quite rare to find differences in typical development.” The overlap between the sexes is so huge, she explains, that scientists have struggled to find and replicate results that suggest that there is a real gap between the sexes. “For the time being, the baby science is not convincingly showing any consistent differences.”
Even studying the tiny minority of girls who have been exposed to higher than usual levels of androgens, adds Hines, while it does tell us something about sex differences, doesn’t tell us that these differences are particularly big. “If genetically I am a girl fetus that produces a bit more androgen, maybe I’ll play a bit more with boys than if I had a bit less. Then maybe I’ll have two friends who are boys, instead of one.” Beyond gender identity and toy preference, on pretty much every other behavioral and cognitive measure that scientists have investigated (in a field that has left few stones unturned), girls and boys overlap hugely. Indeed, almost entirely. In a study by Hines exploring color preferences, for example, she found infant girls also had no more of a love of pink than boys did.
In 2005 University of Wisconsin, Madison, psychologist Janet Shibley Hyde proposed a “gender similarities hypothesis” to demonstrate just how big this overlap is. In a table more than three pages long, she lists the statistical gaps that have been found between the sexes on all kinds of measures, from vocabulary and anxiety about mathematics to aggression and self-esteem. In every case, except for throwing distance and vertical jumping, females are less than one standard deviation apart from males. On many measures, they are less than a tenth of a standard deviation apart, which is indistinguishable in everyday life.
When it comes to intelligence, too, it has been convincingly established that there are no differences between the average woman and man. Psychologist Roberto Colom at the Autonomous University of Madrid, Spain, found negligible differences in “general intelligence” (a measure that takes into account intelligence, cognitive ability, and mental ability) when he tested more than ten thousand adults who were applying to a private university between 1989 and 1995. His paper, published in the journal Intelligence in 2000, confirms what earlier studies have repeatedly shown.
Some have argued that there is statistically more variation among men than among women, which means that even though the average man is no more intelligent than the average woman, there are more men of extremely low intelligence and more men of extremely high intelligence. At the far ends of the bell curve where the overlap ends, they say, the difference becomes clear. This may have been the basis for the controversial point made by Harvard president Lawrence Summers in 2005 when he was hunting for explanations for why there are so many more male than female science professors at top universities.
Studies haven’t fully supported this explanation. In 2008, using populationwide surveys of general intelligence among eleven-year-olds in Scotland, a team of researchers based at the University of Edinburgh confirmed that males did show more variability in their test results. These differences aren’t as extreme as some in the past have suggested they are, they note, but they are substantial. At the same time, the authors point out that the biggest effect is seen at the bottom end of the scale. Those with the very lowest intelligence scores tend to be male. This is partly genetic. X-linked mental retardation, for instance, affects far more men than women.
“Mainly it’s at the bottom extreme because they have more developmental disorders,” explains Melissa Hines. “At the upper extreme, it’s not that big a difference.” The authors of the Scottish study showed that the smaller differences they saw at the top end certainly weren’t enough to account for the gaps between women and men taking up mathematics and science. In their particular set of data, around two boys for every girl achieved the very highest intelligence test scores. At universities, gaps in the numbers of male and female science professors are usually far bigger.
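The logic of the variability argument, and its limits, can be sketched with a toy model. Everything specific here is an assumption made for illustration. Both groups are treated as perfect bell curves with the same average, and the male spread is set 10 percent wider, a hypothetical figure that is not taken from the Scottish study. The function names are invented.

```python
import math

def upper_tail(z: float) -> float:
    """Fraction of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def tail_ratio(cutoff_sd: float, sd_ratio: float) -> float:
    """Males per female above a cutoff, when both groups share the same
    average and the male spread is sd_ratio times the female spread.
    The cutoff is measured in female standard deviations."""
    return upper_tail(cutoff_sd / sd_ratio) / upper_tail(cutoff_sd)

SD_RATIO = 1.1  # HYPOTHETICAL: male spread 10% wider, chosen only to illustrate

for cutoff in (2.0, 2.5, 3.0):
    print(f"above {cutoff} SD: about {tail_ratio(cutoff, SD_RATIO):.1f} "
          f"males per female")
```

In this toy model a modest difference in spread yields ratios in the region of two to one at high cutoffs, and the ratio climbs as the cutoff moves farther out. The model is symmetrical, so it predicts the same ratio at the bottom of the scale. It cannot capture the study's finding that the biggest effect is at the bottom end, and it says nothing about the social factors that might widen or narrow the gap.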
Hines adds that this difference in Scottish test results could also be due to social factors. “Even though on the average there is no sex difference in IQ, I think still boys get encouraged at the top. I think in some social environments, they don’t get encouraged at all, but I think in affluent, educated social environments, there is still a tendency to expect more from boys, to invest more in boys,” she tells me.
This observation is backed up by recent research into how people often think of genius as being a male feature. A 2015 study published in the journal Science explored whether this expectation of raw brilliance in men might affect the gender balance in certain subjects. Led by the Princeton University philosophy professor Sarah-Jane Leslie and University of Illinois psychologist Andrei Cimpian, the researchers asked academics from thirty disciplines across the United States if they believed being a top scholar in their field required “a special aptitude that just can’t be taught.” They found that in those disciplines in which people thought you did need to have an innate gift or talent to succeed, there were fewer female PhDs.
The subjects that instead valued hard work tended to have more women.
“It’s hard to separate our opinion from the data.”
Perhaps naive, Jennifer Connellan didn’t expect the backlash when it came. But then, no one could have expected that, when it came, it would be so huge.
Not long after her and Simon Baron-Cohen’s study on newborns preferring faces or mobiles was published in 2000, people began to question their research. Could it be true that there was such a deep sex difference in the behavior of newborn babies? Were girls really preprogrammed to be empathizers while boys were born systemizers? Flickers of doubt were raised about her methods and the reliability of the results.
The skepticism came to a head in 2007 when New York psychologists Alison Nash and Giordana Grossi dissected the experiment in forensic detail and catalogued a string of problems, big and small. For one thing, the paper’s grand claim that the experiment’s conclusions were “beyond reasonable doubt” seemed an uncomfortable stretch when, in fact, not even half the boys in the study preferred to stare at the mobile and an even smaller percentage of the girls preferred to stare at the face.
But their most damning criticism was that Connellan knew the sex of at least some of the babies she was testing. This could have caused any number of subtle biases. For instance, consciously or not, she may have moved her head to make the girls look at her longer, Nash and Grossi pointed out. The need to avoid this sort of problem is exactly why scientists are advised to carry out these studies blind, without knowing the sex of their subjects. Without this safety measure, it’s hard to take the results seriously.
Psychologist and author Cordelia Fine, who in 2010 published Delusions of Gender, a book about the problems with brain research that includes Nash and Grossi’s findings, adds that, even if their findings were right, Connellan, Baron-Cohen, and their colleagues made too big a leap when speculating about what they might mean. “One assumption is that these visual preferences predict a child’s later empathizing versus systemizing interests, for which there is no evidence either way,” she tells me.
When I put these criticisms to Connellan herself, now fifteen years since her paper was published, she accepts them humbly. At the time, her paper was out before she had been awarded her doctorate, and the flood of criticism came to bite when she turned up to defend her work in front of a panel of reviewers. She was told she had failed. “To have the defense go as poorly as it did was really surprising,” she says. She attributed it to “lots of politics in there with the reviewers. . . . We appealed it and got some more neutral people.” Only then, with a new set of reviewers, did she finally pass.
The experiment did have its problems, she admits. She found it impossible to prevent herself from being aware of the sex of some babies, mostly because she was in a maternity ward surrounded by newborn paraphernalia, including pink and blue balloons, and sometimes even their names. “We were testing the babies in a neutral zone, where there were no balloons or anything like that, and the blankets were all neutral. That was actually where we did the experimentation,” she says. But in getting permission to test the babies, they had to go see the mothers first, in an environment that was far from neutral.
“We did the best we could with the results that we had,” she admits. “Are they perfect? No.” In writing the paper, too, she says that she may have become overexcited by the results. “I was very inexperienced, and I think that inexperience caused more of the problems than anything else.”
When I ask Simon Baron-Cohen to give me his own thoughts on the experiment, he tells me by e-mail, “It was designed thoroughly and was scrutinised through peer review and as such it met the bar for good science. No study is above criticism in the sense that one can always think of ways to improve the study, and I hope when a replication is attempted, it will also be improved.”
In fact, replication has been one of the biggest problems for the experiment. To date, nobody has attempted to copy it to check if the findings were reliable. “Studies have to be replicated,” comments Teodora Gliga, “especially if it’s a new idea. It needs to be replicated, otherwise it’s not believable. It’s an interesting idea, but not a fact.” Subsequent studies with slightly older children have shown no sex differences. And, as Melissa Hines’s work has revealed, there appear to be no toy preferences among children until at least the age of one, and possibly closer to two years old.
Baron-Cohen, however, tells me that “the fact that the study hasn’t yet been replicated does not invalidate it at all. It simply means we are still awaiting replication.” One explanation he gives for why no other researchers have tried to copy it is that babies are difficult to test, which means you need large groups to get a reliable result. “Second, it appears that testing for psychological sex differences in neonates still attracts a fair amount of controversy. So some researchers may have been deterred by not wanting to walk into a potential political minefield,” he adds.
Jennifer Connellan has since abandoned the minefield altogether. Her career in Simon Baron-Cohen’s lab turned out to be brief. After getting her degree, she left Cambridge to join Pepperdine University. Today, she runs a tutoring company in California. She’s also mother to a girl and a boy. She tells me that she remains intrigued by the idea of empathizing and systemizing brain types, but believes that it’s only at the extremes where researchers seem to find any discrepancies. “It’s all a bell curve. . .and for the kids in the middle there’s almost no sex difference there at all,” she says.
Baron-Cohen, meanwhile, presses on in trying to establish links between levels of testosterone before birth and sex differences in the brain. In 2002 he and another postgraduate student, Svetlana Lutchmaya, claimed that twelve-month-old girls they observed in experiments made more eye contact than boys of the same age did. This study has been cited by other researchers more than two hundred times.
Then in 2014 Baron-Cohen and his colleagues published the results of a study looking at one of the biggest sources of data in the world: more than nineteen thousand amniotic fluid samples in Denmark, taken from pregnant women for medical reasons between 1993 and 1999. If ever a set of data could reliably prove his hypothesis that high fetal testosterone levels are linked to autism, leading to the “extreme male brain,” it was this one. His team managed to measure hormone levels in these fluid samples to find out how much testosterone the babies would have been exposed to. They could then crosscheck all this with the medical and psychiatric records of the same set of children when they were older. It was an amazingly large and thorough set of patient information.
The database included 128 males who were diagnosed with a condition on the autism spectrum. But Melissa Hines tells me that Baron-Cohen’s results didn’t show a direct link between them and high fetal testosterone levels. “That was like the ultimate test, and there was no correlation between testosterone and getting an autism spectrum diagnosis,” she says. “That’s just one study, but it doesn’t support it.”
Without evidence of a clear connection between the “extreme male brain” and testosterone, when their findings were published in the journal Molecular Psychiatry in 2014, Baron-Cohen and his colleagues instead claimed to see a correlation between autism and a mixture of hormones, including testosterone, but also the female sex hormones, progesterone and estrogen. He tells me the reason they did this is because “the sex steroid hormones in that pathway are not independent of each other because each is synthesized from its precursor so that the level of one hormone will directly affect the level of the next one in the pathway.”
Hines has since run her own study of correlations between fetal testosterone levels and autistic traits on children with congenital adrenal hyperplasia, which was published in the Journal of Child Psychology and Psychiatry in 2016. She found no link.
I can’t help wondering what Hines thinks is going on in her own field. She stops short of using the word sexism, but she does believe scientists haven’t always done as good a job on sex and gender differences as they could have done. “I don’t think people do it intentionally. I think these are things we deal with every day,” she says. Gender is one of those subjects that everyone has an opinion on, and of course, of which everyone has direct experience. Perhaps unsurprisingly then, there’s sometimes a lack of objectivity in the field.
“It’s hard to separate our opinion from the data,” she warns. “I think this is something the human mind does. It wants to have things that define maleness and things that define femaleness. Now maleness, historically in psychology, has been instrumentality, so that’s kind of like systemizing, and femaleness has been nurturing, warmth, kind of like empathy. So there is a long tradition of conceptualizing this in similar ways. . . . But I’m not sure where it gets us, because there’s lots of overlap. So you can’t give someone a test and get these scores and say they’re male or female. There’s too much individual variability.”
“I think we really have to be extraordinarily careful. . .when we talk about overlapping populations with huge variability,” agrees Brown University’s Anne Fausto-Sterling, one of the world’s leading researchers on gender.
She believes that Simon Baron-Cohen’s theory of male and female brains makes little sense. Connecting testosterone levels before birth to behavioral sex differences later on, she says, “is just this huge explanatory leap, and it leaves me uncomfortable because I don’t think it’s much of a scientific explanation when you make such a big leap. . . . We do see the differences, and I don’t disagree with that finding. What I disagree with is leaping to the idea that this means it is something innate or inborn,” she adds. “I do think that if you just jump to the prenatal. . .you miss a whole developmental window when something very important and very social is going on.”
Fausto-Sterling belongs to a vanguard of biologists and psychologists who see the nature versus nurture question as old-fashioned. “There is a better way of looking at the body and how it works in the world, and understanding the body as a socially formed entity, which it is,” she explains. Men and women may be different, but only in the same way that every individual is from the next. Or, as she has also put it, “that gender differences fall on a continuum, not into two separate buckets.”
“I think that people tend to think of this in an either-or kind of way,” agrees Teodora Gliga. Either girls and boys are born very different or they’re the same. The scientific picture emerging now is that there may be very small biological differences, but that these can be so easily reinforced by society that they appear much bigger as a child grows.
“My opinion is that you will find differences wherever they were reinforced, because we love categories. . .we need to have categories. And so once we’ve decided, once we’ve labeled ‘this is a girl,’ ‘this is a boy,’ then we have so many culturally strong biases that we maybe produce differences in abilities. So for example, in physical abilities, if we push boys to be more active and to deal with danger, then of course later in life when they’re children, they will look different. But that does not mean the differences were in the biology,” says Gliga.
Instead of the binary categories we have now, Fausto-Sterling believes that every individual should be thought of as a developmental system—a unique and ever-changing product of upbringing, culture, history, and experience, as well as biology. Only this way, she argues, can we truly get to the heart of why women and men across the world appear to be so different from each other, when studies into mathematics ability, intelligence, motor skills, and almost every other measure consistently tell us they’re not.
If toy preferences don’t emerge until at least age one and other differences reveal themselves even later, she suggests, then what else could be happening up until age one? One line of research that hasn’t been fully explored, for example, is counting exactly how many toys babies are given in the first year of life, and what kinds of toys they are. “I can say that boys see more boys’ toys and girls see more girls’ toys, but honestly there is no data to show that,” she says.
In her most recent research project, Fausto-Sterling has tried to get closer to answers by filming mothers playing with their children. She recounts one vivid example: “You see a little three-month-old boy, just slouched on the couch. He’s not even big enough to sit up on his own, but he’s kind of propped up with pillows. His mother is trying to engage him in play, and she’s stuffing little soft footballs in his face, American footballs. . . . She’s thrusting this football at him and saying, ‘Don’t you want to hold the football? Don’t you want to play football like your daddy does?’ And he’s just sitting there like a kind of blob. He has no interest one way or the other,” she describes.
The impact of actions like these, small as they may seem, can be long lasting. “If that kind of interaction is going on iteratively in the early months, then if at some point he does reach out and grab, when he’s big enough to do that, at four months, five months, or six months, he’s going to get a very positive reinforcing response from his mother,” Fausto-Sterling explains. This relationship between the boy and footballs is strengthened as he sees how happy they make his mother, and also because the toy is already so familiar to him. “He may see them again at an older age, when he is more capable of physically interacting with them. And just seeing them and recognizing them may give him a certain kind of pleasure.” By the end, the boy appears to love football.
Fausto-Sterling adds that evidence is emerging from her team’s observations of mothers that boys are also handled differently from girls, which might be influencing the way they grow. “The mothers of sons in my cohort are moving them around a lot more. They’re shifting them, they’re playing with them, and they’re talking to them less. They’re more affectionate to them when they’re moving them physically.” This could simply be because boys demand more physical movement from the start, but again, it’s another element of the development process that hasn’t been fully studied.
Work like hers, while in its early days, reinforces that countless little thumb marks are in the ball of dough that is a developing child. Hormonal effects on the brain or other deep-seated biological gaps aren’t necessarily the most powerful reason for the gaps we see between the sexes. Culture and upbringing could better explain why boys and girls grow up to seem different from each other.
And if this is the case, a change in culture or tweaks to upbringing might reverse the differences. “If you see what you think is a disability, understand how it developed in the body and where it came from. Understand that bodies are shaped by culture from the very get-go,” explains Fausto-Sterling. “If you neglect a child at birth, their brain stops developing and they’re pretty messed up. If you highly stimulate a child, if they’re within a normal developmental range, they now develop all sorts of capacities you didn’t know they had or didn’t have the potential to develop. So the question always goes back to how development works.”
Melissa Hines agrees that there’s no reason nature should determine a girl’s destiny, despite her studies showing that testosterone may explain some small behavioral sex differences. “I do believe that testosterone prenatally sets things in motion in a certain direction, but that doesn’t mean it’s inevitable. It’s like a river. You can change its course if you want to,” she tells me.
Changing the river’s course is harder than it seems. It depends on society wanting to change in the first place. And this is a world in which even cold, rational scientists can’t abandon their desire to hunt for differences between women and men. The effects of testosterone on the brain are just one example. In 2013 a team from Taiwan, Cyprus, and the United Kingdom (in which, incidentally, one member was neuroscientist Simon Baron-Cohen) highlighted another. They got together a large number of independent studies into sex differences in brain volume and density to see what they could tell us in summary. In their paper published the following year, the team proclaimed that men’s brains were typically bigger by volume than women’s brains. The gap ranges from 8 to 13 percent.
This isn’t news. It’s long been known that men have on average slightly bigger heads and slightly bigger brains than women. It’s a finding that’s been popping up in scientific journals for more than a century.
But it indicates a problem that doesn’t go away, no matter how much time passes. Brain researchers have never been able to resist the urge to scour the skulls of women and men in search of variation. And the reason they persist with this endeavor is simple. Because if a man’s brain looks physically different from a woman’s, well then, perhaps this will confirm that something different is going on in their minds, too.