12. LIKE YESTERDAY

There is a hard and fast convention about how scientific epiphanies are supposed to happen. The scientist should be all alone, deep in thought, perhaps in an empty late-night lab, and despairing of ever solving an intractable problem. But when Charles Darwin had his epiphany about the origin of language he was on all fours with an apron covering his face.

On 27 December 1839, after the birth of his first child, Darwin began taking detailed notes on the development of the boy William, nicknamed Doddy. In these notes, which he called ‘A Biographical Sketch of an Infant’, Darwin records the exact date at which Doddy first shows signs of the human behaviours, such as: ‘Anger. Fear. Affection. Association of Ideas, Reason, &c. Moral Sense. Shyness. Means of Communication’.

The purpose of these notes is to compile empirical data about what is instinctual and what is acquired in newborns. They are also an accidental portrait of a besotted father, woozy with oxytocin, and full of love for his first-born.

He lists all the things that make Doddy laugh, such as when he covers his face with an apron and then whisks it off. Over and over again, Darwin whisks the pinny from his face, eliciting squeals of joy. He throws shadows on the wall, pleads for a kiss, growls like a monster, pulls faces when they are both sitting in front of the mirror, and notes the precise date when Doddy first looks round from the reflection to him.

In making these notes, Darwin is on the lookout for universal milestones in infant development. What stands out, however, isn’t the universal but the particular. What we read are notes on a very Victorian, very English infant of the haute bourgeoisie – as the following entries show.

When five months old … as soon as his hat and cloak were put on, Doddy was very cross if he was not immediately taken out of doors.

At 6 and a half months, he displayed violent passion when dressed in clothes he had already worn that season.

At 11 months Doddy became exceedingly distressed upon hearing nursemaid express support for Irish Home Rule.

At 14 months, after suckling at the bosom, burst into tears of remorse, and spoke his first complete sentence: ‘I have polluted your purity, Madam, with my depraved lusts and vile appetites!’

At 18 months, while nursemaid reads to him, Doddy becomes agitated by rise of literacy among the lower classes. Nursemaid manages to calm him down by saying that she memorised the story through an oral folk tradition, and just has the book there as a prop which she is only pretending to read, as she is completely illiterate. Doddy was much assuaged by this very clever answer. A bit too clever, if you ask me. Will fire her tomorrow. Let her find work with her Fenian socialist friends!

Some observations, however, do have the universality for which Darwin was probing. In the section entitled Moral Sense, for example, he finds an answer to a question which has long fascinated students of psychology: at what age do we first deliberately and consciously deceive? With Doddy this comes when he is 2 years and 7½ months old, which is when Darwin:

… met him coming out of the dining room with his eyes unnaturally bright, and an odd … or affected manner, so that I went into the room … and found that he had been taking pounded sugar, which he had been told not to do … A fortnight afterwards, I met him coming out of the same room, and he was eyeing his pinafore which he had carefully rolled up; and again his manner was so odd that I determined to see what was within his pinafore, notwithstanding that he said there was nothing and repeatedly commanded me to ‘go away’, and I found it stained with pickle-juice; so that here was carefully planned deceit. I informed him that from now on he was dead to me.

But what snags Darwin’s attention more than anything else is the musicality of Doddy’s speech. ‘The use of these intonations seems to have arisen instinctively’, he notes. The more he listens to Doddy’s cadences, the more he wonders whether language evolved from a combination of singing and mimicry. The first germs of Darwin’s origin of language theory begin to form. His theory of language can be stated in this way: first came the music and then came the lyrics.

With typical Darwinian alacrity he mulls over this theory for the next thirty years before rushing into print. This is the man, after all, who spent eight years writing a book about barnacles. (You don’t want to be too hasty about barnacles.) He was, if nothing else, a man tuned to geological timescales, who slowed his perception of the world until he could see the worms churning the soil and lifting the fields by inches.

The 1870s are his decade of publishing ideas about specifically human evolution, and this is when he develops his origin of language theory. In Descent of Man (1871), he argues that ‘some early progenitor of man probably first used his voice in producing true musical cadences, that is, in singing.’ Language came from mimicking ‘the voices of other animals’, and from modifying our ‘own instinctive cries, aided by signs and gestures’.

The following year, in The Expression of the Emotions in Man and Animals (1872), the origin of speech in song is narrowed down to love songs and serenades. After noting how gibbons sing ‘an exact octave of musical sounds, ascending and descending the scale by half-tones’, Darwin writes that our ancestral primates: ‘probably uttered musical tones before they had acquired the power of articulate speech’, and used these musical tones for wooing, and so they ‘became associated with the strongest emotions … ardent love, rivalry and triumph.’ A vestige of this operatic origin of language, he argues, is the way the voice gets more musical the more emotional we become.

In 1877 Darwin submitted ‘A Biographical Sketch of an Infant’ to the journal Mind: A Quarterly Review of Psychology and Philosophy. After first describing how Doddy’s ‘whine of dissent had a different resonance and timbre’ from his ‘strongly emphatic … humph of assent’, Darwin concludes that the instinctive sing-songs of infant speech allow us to infer that:

before man used articulate language, he uttered notes in a true musical scale as does the anthropoid ape Hylobates [gibbon].

The idea that language emerged from singing seems frivolous set against the dog-eat-dog tone prevalent in evolutionary theory.

In the classic work Language: Its Nature, Development and Origin (1922), Otto Jespersen says that language theorists tend to imagine ‘our primitive ancestors … as sedate citizens with a strong interest in the purely business and matter-of-fact aspects of life.’ Jespersen argues that ‘the genesis of language is found … in the poetic side of life; the source of speech is not gloomy seriousness, but merry play and youthful hilarity.’

Though the insistence on dourness in theories of language is nothing new, it became more pronounced after the Time and Motion men came calling on evolutionary theory.

The Time and Motion men may have disappeared down the Bottomless Well of Infinite Regress, but their influence is everywhere, not least in the fact that Darwin’s theory that language evolved from singing is not widely known. But that may be about to change. Lately, two lines of enquiry have converged upon Darwin’s origin of language theory, one of which we can call Colossal Hypoglossals and the other The Paleolithic Crowd Sourcing of Language.

Colossal Hypoglossals

In human skulls the anterior condylar canal is a groove scored by the hypoglossal nerve that articulates the tongue. The stronger and more dexterous the tongue the fatter the nerve. In ape and Australopithecus the anterior condylar canal is tiny. In endocasts of Homo heidelbergensis skulls, paleontologists have discovered the condylar canal to be exactly the same size as our own. These colossal hypoglossals show that our prelingual ancestors could do all the noises we do. Any sound we can make they could make. Hundreds of thousands of years before spoken language, they could perform alliterative tongue-twisters and assonant rhymes. They could do funny voices to stop the children crying, or mimic forest sounds to distract the dying from their pains. They could impersonate all the subtle gradients of sound that soup makes as it comes to the boil. And they could sing.

It would appear that cave men and women were not going ‘Ug Ug Ug’ after all. When producers of TV science documentaries reconstruct our hominin ancestors, their stunted vocabulary (hominins, not TV producers) always goes with stunted vocal range, the unlikely inference being that the vocal cords become more supple the more words you know, as if the best singers were the most articulate conversationalists. Is it time to start imagining our forebears as less butch? Perhaps we could begin this process by renaming early humans.

At present we name our ancestors after remote parts of Germany, such as the Neander Valley or Heidelberg – the Germans having got their bones, like their swimming towels, down early. Or else we name them after the most popular types of tools or hardware, be it hand-axe or beaker. It is a sobering thought that when our bones are dug up in a future epoch, we may also be classified according to our most popular hardware. The B&Q people, the Wickes Man, the Robert Dyas of the Younger Dryas.

Early humans were more likely to have identified themselves by musical genre than by hand-axes. If only acoustic caves existed that could somehow preserve ancient sounds, then we might more accurately name early humans by musical genre. Homo emo, Homo mbaqanga, Homo milonga, Homo gamelan, Homo Nederpop, Homo Happy Hardcore, The Old Wave of Old Wave people. The A Cappella people.

Naming hominin groups after their distinctive musical genres might also throw light on intractable mysteries of prehistory. It could have revealed to us, for instance, that early humans left Africa due to musical differences.

The Crowd Sourcing of Language

We were living in complex communities for at least half a million years before we started talking to each other. A growing school of thought argues that a rich non-verbal human culture came first and language a distant second. There were complex interactions long before there were words to help them along. Or to unnecessarily complicate them, as the case may be. In this chapter I want to explore the idea that language was the first crowd source drawing on culture – the first cloud store.

In A Mind So Rare, Merlin Donald reverses the usual scenario where the early humans first evolve language and then use it to create a culture: ‘Just as the physical environment drove the evolution of perceptual capacities, so cultural energy drove the evolution of sophisticated communication capacities.’

I like the general tenor of this, but I’d just like to add a caveat that the physical environment can also drive cognitive capacities. In my last book The Entirely Accurate Encyclopaedia of Evolution, I looked at how more complex coral produces more intelligent fish. Of two populations of stripy blue cleaner fish on the Great Barrier Reef, the population living in coral, with more nooks, crannies, and biodiversity, are more intelligent than those inhabiting a duller stretch of coral.

In their 2014 Ethology paper, ‘Variation in Cleaner Wrasse Cooperation and Cognition: Influence of the Developmental Environment?’, Sharon Wismer et al. argue that the more complex physical environment elicits the phenotype of smarter fish among a population of stripy blue cleaner fish. They describe cleaner fish so intelligent that they recruit moray eels to their hunting parties, with nods of the head. In calm water moray eels use the lid of the sea as a mirror. By looking up at the glassy ceiling they can see who or what is hiding behind the coral. They also use this aquatic periscope to see round corners.

Sharon Wismer and colleagues found that cleaner wrasse from larger shoals are more intelligent than those from smaller populations. So it appears that both physical and social complexity make intelligent fish, which brings us back to Merlin Donald’s point about the social basis of the inner life, how language couldn’t have emerged except as a product of living in complex groups, since it exceeds what any one person was ever equipped to do on their own. Long before early Homo sapiens had language, he says, they were able to ‘bond, coordinate group activity, transfer and refine skills, and establish a network of custom and convention.’

This chimes with ideas of the social basis of thinking and speech advanced by the brilliant Russian psychologist Lev Vygotsky (1896–1934), who died of tuberculosis when he was just 37 years old, but whose work is currently enjoying a renaissance.

Fifty years after Darwin finally published his observations on Doddy’s early learning, Vygotsky and his team of researchers were combining detailed observation of early learning with imaginative experiments designed to explore the different ways in which children solve problems alone or in groups, the ways they talk to themselves, their use of whole phrases and chunks of borrowed speech, and how these are deployed when performing tasks within and then beyond their competency, with and without help from adults.

Until Vygotsky the following tale was told about how our thought and speech develop. (And since Vygotsky, too, alas, so much did his early death damage his influence.)

A child starts off with her own internal monologue which she then takes to nursery school where she learns to turn monologue into dialogue, and learns to merge and to attenuate the imperious demands of self with the recognition that there are others, and that those others possess, in the words of George Eliot, an ‘equivalent centre of self.’ Vygotsky came to believe that what actually happens is precisely the opposite. Through being with others – family, nursery, reception – infants learn speech and thought, which they then use to pollinate a rich inner life.

What we can do in groups excels what we do alone. Development traces an intricate dance between the inside and the outside world. When Vygotsky says that a five-year-old’s experiences shape her brain more than her brain shapes her experiences, he is absolutely not arguing that the child is a blank slate. It is the very richness of our emotional constitution that demands so much external nourishing. If there were less to us, we wouldn’t need so much outside assistance and support. As the philosopher Mary Midgley puts it:

… society is not an alternative to genetic programming. It requires it. To become a member of any kind of society, an infant must be programmed to respond to it. Others give him his cues. But he has to be able to pick them up and complete the dialogue.*

* Mary Midgley, Beast and Man: The Roots of Human Nature, 2002.

The greater the endowment the more there is to learn.

This learning takes place in what Vygotsky called the ‘zone of proximal development’. This zone describes those capacities just out of reach. The zone is the shimmery no man’s land between what the child can almost-but-not-quite do on her own, and what she can do with the teacher leading her. Good education focuses on this zone. Vygotsky developed these ideas in the 1930s. His ideas should not be confused with the contemporaneous teaching methods being developed at Marcia Blaine School for Girls in Edinburgh, where Miss Jean Brodie told her pupils: ‘The word “education” comes from the root e, out, and duco, I lead. It means a leading out. To me, education is a leading out of what is already there in the pupil’s soul.’ Education is that, of course, but it’s also the making of the pupil’s soul.

If not like Jean Brodie’s, Vygotsky’s ideas are close in spirit to those of another contemporary pedagogue, American philosopher John Dewey, who in 1929 wrote:

Failure to recognize that [our] world of inner experience is dependent upon an extension of language which is a social product and operation led to the subjectivistic, solipsistic and egotistic strain in modern thought.

The social basis of consciousness allowed others to be present for us even when not actually physically there, because far away or dead. ‘The words of the dead’, wrote Auden, ‘are modified in the guts of the living.’ This modification was the germ of the inner voice. Long before writing, repeating an elder’s rhythmic mime when we were alone helped us recall where the fish nets were sunk, or helped us remember the correct sequence of actions needed to gut a marmot, or start a fire. This repetitive mime could have been done by actions and a work song with no words, only noises. This would have been a type of prelingual transmission of information and encouragement by way of movement and song. Slowly and gradually, early humans crowd-sourced language from the cognitive communal cloud store. Such theories of language, and the discovery that colossal hypoglossals scored deep grooves in Homo heidelbergensis skulls, lend support to the epiphany that first struck Darwin when playing with his first-born son, his theory that language evolved from 500,000 years of singing.

Our genus Homo is two million years old, but language emerged only forty thousand years ago. This is like yesterday. It’s also like ‘Yesterday’. Famously that song’s tune popped out perfect and fully formed (in a dream Paul McCartney had at Jane Asher’s house), but the words were a long time coming. According to John Lennon, ‘the song was around for months and months before we finally completed it.’ ‘Yesterday’ was originally ‘Scrambled Eggs’ and its opening couplet went like this:

Scrambled eggs,
Oh my baby how I love your legs.

But the words didn’t take because they didn’t fit the emotional timbre of the music. Just as complex, lyrical expressions gradually emerged to turn ‘Scrambled Eggs’ into ‘Yesterday’, so, says Darwin, ‘the most complex and fine shades of expression must all have had a gradual and natural origin.’

If Darwin is right, if we communicated by music tones before the invention of speech, then perhaps instead of names everyone had a leitmotif, like in Peter and the Wolf. I mean we must have had some way of naming each other before the invention of language some forty thousand years ago.

A dissenting voice to all this is my own theory of the origin of speech. After a thirty-year-long stand-up comedy career, I like to think I’ve become something of an authority on long uncomfortable silences. After the first quarter of a million years without talking, I suspect early humans couldn’t stand the silence anymore. The long silence started to get really uncomfortable. People couldn’t bear it. Something had to be done. Of course we didn’t burst into speech straight off. Instead I think we broke the eggy silence with some tentative beginnings. It was a gradual process. Language began, I believe, with a few thousand years of strained humming and self-conscious throat clearing. Towards the end of the Late Pleistocene, it seems probable that ‘ahem-ahem’ may have given way to someone rubbing their hands together after a single clap, and that was when, in an overly loud voice, the first ever human word was spoken, and it was: ‘Annn-yy-way…’