One of the grim things you learn after many years working as an editor – I hardly dare confess this – is that you can hold a poem at arm’s length and, without having read a word, know there’s a 90 per cent chance that it’s bad. Most often this is because any random two- or three-line passage appears to contain all the letters of the alphabet. (Centred text, coloured ink, copperplate fonts, falling potpourri, money and photos of babies, dogs and naked poets are also reliable pointers.) This means the poem is unlikely to have any music. The phenomenon of ‘music’ in poetry is often spoken about as if it were a mysterious quality; but if we mean ‘music’ as in ‘music’, rather than ‘some ineffable thing which my poetic intuition can subjectively divine but is beyond human articulation’, it’s very simply characterised. With few exceptions, it means that the poem displays deliberate organisation and some form of parallelism in its arrangement of sound. If a ‘music’ is ascribed to a poem, but cannot be described through pointing to some salient or parallel phonic effect (or – stretching the definition to its limit – a patterned silence), the only ‘music’ the listener has identified is that which resides naturally in the language itself.

The error is often made because this language-music is not inconsiderable. But nor is the poetic music we superimpose upon it a rare occurrence: even in everyday speech, given a choice of synonyms, we will express an unconscious preference for the more harmonious, contextually lyric sound when we need to make strong sense. Think of just about any phrase that strikes you as memorable or felicitous – a proverb, a cliché, an advertising slogan – and you’ll almost invariably find some patterning in its sounds.1 The more considered our speech, the more this effect is naturally strengthened, and speech-writers alliterate and assonate almost helplessly. Written prose shows a higher degree of lyric patterning again, and poetry even more so. (The self-conscious foregrounding of this patterning in prose – along with an excessive lingering over concrete description – is what most often leads to that equivocal diagnosis, ‘poetic’ writing.) But even a random series of words will appear to demonstrate a musical coherence by virtue of any one language being a closed phonemic system, and having a finite set of sounds it can combine. This is the ‘musicality’ we quickly divine in languages or dialects we have trouble understanding, but are slow to acknowledge in our own: left to focus on the sound alone, we can attend to their distinct and often alien music (hence the apparently infinite suggestiveness of song-lyrics in languages we don’t know). Each language uses only a fraction of the possible sounds that the human voice can produce. English is a musically versatile tongue: of the 200-odd phonemes in global employment, it manages to use around 50. We might pity the native Hawaiian speaker with their mere 13, but a poet would sensibly envy them: it must be an effort to speak a sentence in Hawaiian that is not lyrically coherent.2 I was once asked to comment on that automatically generated collage-text you sometimes get with your spam-mail; the journalist remarked on how beautiful and poetic it could sound. The trouble is that any old random garbage often strikes us as beautiful and poetic – but this pays a compliment to language itself, not to the ‘poem’. In such circumstances, having little in the way of actual meaning to distract us, we can attend to the sound alone, and enjoy the distinctive gabble of the Anglophone.3

Nonetheless, even without salient sound effects like rhyme, assonance or alliteration to point to, we often have the strong sense that something else is going on beside the intrinsic musicality of the language; and indeed there is. In English poetry, the feeling that a piece of writing is ‘musical’ usually means that it also exhibits two kinds of phonetic bias. Between them, they effect a pattern of repetition and variation, similarity and difference – the motif the human brain craves in everything it perceives, if it is simultaneously to make both connection and distinction. The first is the deliberate variation of stressed vowels; the second is the quiet, generally backgrounded patterning of consonants.

Between them, these two tendencies have come to represent an unconscious ‘lyric ideal’ in English. Importantly, they must be no more than tendencies. Generally speaking, if sound-patterning is too strong and too conspicuous, it will be perceived as contrived and will distract from the sense (open Robert Bridges’s Collected Poems at random, if you want to see what I mean: ‘Alone, aloud in the raptured ear of men / We pour our dark nocturnal secret; and then, / As night is withdrawn / From these sweet-springing meads and bursting boughs of May, / Dream, while the innumerable choir of day / Welcome the dawn …’4) unless it performs some explicit mnemonic or structural function, such as Anglo-Saxon alliteration, or terminal rhyme.5

A shift towards vowel heterophony and consonantal homophony creates the unconsciously experienced ‘lyric ground’ upon which the more consciously registered saliences of rhyme, assonance, alliteration and anomalous consonant can cleanly stand. Just as we see a global shift from denotative to connotative speech, so the phonosemantic principle ensures a concomitant global shift from an inchoate language-music to an explicit poetry-music. Nearly all poets with half an ear default to this lyric ground most of the time. It is, in effect, the poet’s working medium, the canvas, clay or stone from which they make the poem.

In the human voice, the vowel carries the bulk of our ‘feeling’ in its complex tonal and quantitative discriminations; the consonants which interrupt that breath create the bulk of the semantic sense. The consonant, in making the distinction between blue and shoe and true, gives the phonetic differentiation we need for a sign-system capable of carrying distinct meanings. The envelopes with which it shapes the vowel allow for discrete words to be cut from the air, in much the same way that physical borders allow us to perceive discrete objects. The material basis of that sign-system, though, is the voiced breath, which we use to sing the language. Vowel fills the word with its fairly uniform stuff, while the consonant carves it into a recognisable shape. Consider, say, a mother’s frustrated, third-attempt demand to her child, ‘Put down the cup!’ It’s easy to separate out the four vowels  (‘oo – ow – ih – uh’), then imagine the first vowel pitched high to indicate urgency; the second dipping down an interval of a fifth or so to reinforce the impression of sane control; the third pitched identically to the first to reinforce the imperative, and the fact of the demand’s repetition; and the last rising another fifth – and increased in loudness – to convey both the anger and nonnegotiability of the request. The emotional if not the literal sense would be clear from such a performed sequence of tones; whereas the consonants pt dn th cp alone will give us a fair stab at the semantic content, but would provide very few clues to the emotional context. (Note that with the consonants removed, speech suddenly becomes an extremely complex kind of singing – a kind of ‘jazz ballad for human horn’, full of emotional articulacy, but with no referential content.)

In the non-performative context of written language, however, things are trickier. If you want to demonstrate the hopelessly attenuated emotional palette of written speech, try all the different ways you can pitch and shape the vowels of ‘I love you’. Spoken, it’s easy to draw out shades of meaning that are alternately questioning, pleading, heartfelt, insecure, angry, desperate, tender, insincere, placatory and so on, just by modifying the song of the vowels. None of those things can be represented by the written phrase; the performative cues have to be figured out through interpreted context alone. At the level of word, meaning is something delegated largely to the consonant – and its emotional aspect is catastrophically attenuated in the process. Because written words can’t represent the pitches and lengths that give them their spoken expressive range, it’s easy for the vowel to become devalued, to the extent that some graphic systems, like Ancient Hebrew and Classical Arabic, did without it altogether. (Writing systems have their left-brain, denotative bias built in; never forget that the original purpose of our immortal inscription was the recording of debt and the issuing of receipts.) In declaring itself as emotional, urgent speech, and in signalling its kinship with song, poetry must find a way to put the vowel back centre-stage.

Because vowels have perceptible duration, they are easy to hear. You can test this by trying to repeat the vowel sounds in the previous sentence. You should be able to do so almost without thinking about it, and just say the words as a form of de-consonated baby talk. Because vowels have duration, they are easy to hear  – or something like it. Now try and do the same with the consonants. You’ll find it almost impossible without thinking about it very carefully and consciously; this is because the vowel is the main durational component of the word, and the consonant is often experienced as temporally negligible.
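For readers who would like to see the two halves of a phrase pulled apart on the page, here is a rough sketch in Python. It works on spelling rather than sound, which is an assumption made purely for convenience: the letters a, e, i, o and u stand in for vowels, and everything else alphabetic counts as a consonant (so the ‘w’ of ‘down’ survives where the ear would discard it). The function names are invented for this illustration.

```python
# Crude, spelling-based separation of a phrase into its vowel letters
# and its consonant letters. Letters are only a stand-in for sounds.
VOWEL_LETTERS = set("aeiouAEIOU")


def keep_vowels(text):
    """Return the phrase with consonant letters and punctuation removed."""
    return "".join(ch for ch in text if ch in VOWEL_LETTERS or ch == " ")


def keep_consonants(text):
    """Return the phrase with vowel letters and punctuation removed."""
    return "".join(
        ch for ch in text
        if ch == " " or (ch.isalpha() and ch not in VOWEL_LETTERS)
    )


phrase = "Put down the cup"
print(keep_vowels(phrase))      # u o e u
print(keep_consonants(phrase))  # Pt dwn th cp
```

The consonant line still reads as a recognisable instruction, while the vowel line is pure undifferentiated breath until a voice gives it pitch and length; a version faithful to the ear would need a phonetic transcription rather than the spelling.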

From ‘temporally negligible’ it’s a short imaginative leap to ‘timeless’, and it’s interesting to note that those languages whose writing systems omit vowels – Egyptian hieroglyphs, Hebrew and Classical Arabic – have found it easy to sustain the idea of a holy book that had existed before the dawn of time, and then ‘fell to earth’ as a block of eternal and monolithic consonant, into which the impure, sour, time-bound breath of the human had not yet been introduced. Both the Torah and early versions of the Quran were written without vowels or diacritical marks, and both the Kabbalists and Sufis were engaged in the mystical project of re-envowelling their holy books in order to come up with alternative, deeper interpretations to set alongside the ‘standard’ reading; in this way, they could intuit the secret intentions of the Divine.

The researches of those early mystics are almost procedurally identical, incidentally, to what we call ‘pararhyme’ or ‘consonantal’ rhyme, where words with the same consonantal structure are sought out by choosing a word, removing the vowels, and then re-envowelling the consonantal string to generate a secret cognate, or a whole series of them. If we take ‘green’, we might produce groin, Egraine, groan, girn, garni, Goran, grown, migraine, agrin; ‘press’ gives Parisi, peruse, disperse, prissy, purse, Paris, Percy, oppress; ‘table’ – tableau, tibial, eatable, unstable, tabbouleh, pitbull, beatable, tubule; ‘cut’ – cat, acute, Cato, kite, quite, Hecate, acquit, Kyoto, predicate, kumquat, coot. ‘Consonance’ is a global phonestheme whose semantic aspect is something like ‘formal or structural equivalence’ or ‘isomorphism’, and I’ll expand on this as we go – but if you recall my earlier remark: the way consonants shape the air to allow us to hear discrete words is deeply analogous to the way that physical borders allow us to perceive discrete objects. I strongly suspect that the concepts indicated by words connected by pararhyme and consonance are often regarded, in some unconscious and synaesthetic way, as also somehow related in their form;6 there is, I think, some unconsciously perceived formal equivalence or symmetry declared between Caesar and scissor and seizure, brick and barrack and bark, rubble and rebel and terrible, however nonsensical our conscious mind might find the idea. (Because they also have a visual presence, consonantal symmetries in written language are far more obvious than in spoken.)
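The procedure of stripping a word to its consonants and looking for other words that share the remainder can be sketched mechanically. What follows is only an approximation, and its assumptions should be stated plainly: it compares spellings rather than sounds, it treats ‘y’ as a vowel letter, it collapses doubled letters, and the names `skeleton` and `pararhymes` are invented here. It will therefore miss pairs the ear accepts at once (‘grown’ keeps its silent ‘w’, ‘Percy’ spells its sibilant with a ‘c’), and a serious version would want a pronouncing dictionary.

```python
from itertools import groupby

# Spelling-based stand-in for "vowels"; 'y' is counted as a vowel letter here.
VOWEL_LETTERS = set("aeiouy")


def skeleton(word):
    """Reduce a word to its consonant letters, collapsing doubled letters first."""
    letters = [key for key, _ in groupby(word.lower()) if key.isalpha()]
    return "".join(ch for ch in letters if ch not in VOWEL_LETTERS)


def pararhymes(target, candidates):
    """Return the candidates whose consonant skeleton matches the target's."""
    wanted = skeleton(target)
    return [
        word for word in candidates
        if word.lower() != target.lower() and skeleton(word) == wanted
    ]


word_list = ["groin", "groan", "girn", "garni", "grown", "table"]
print(skeleton("green"))               # grn
print(pararhymes("green", word_list))  # ['groin', 'groan', 'girn', 'garni']
```

Matching the whole skeleton is a stricter test than the one used in the series above, which also admits words that merely contain the consonantal string (migraine, disperse, unstable); loosening the comparison to a substring test would bring those in, along with a good deal of noise.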

One sensible use of this phenomenon is to use the generated word-series as a means of interrogating the unconscious, the memory and the imagination by asking that they link the words up as stations in an intelligible narrative. Essentially, this procedure treats the mind as if it were itself a holy book.

Here, the use of pararhyme less ‘echoes’ Paul Muldoon’s larger artistic project of unexpected connection than facilitates it at every turn. Nairac was a rogue British Army officer who went missing while working undercover, and was killed by the IRA; his body was never found. He is found in this poem, however – by his rogue analogue, a freelance mink whose form is twinned in his army-issue anorak. I have no idea how this poem was composed; but I know that consonantal rhymes must be built far earlier into the compositional process than other kinds of rhyme. A poem does not just ‘accommodate’ them; it allows itself to be partly dictated by them.

Incidentally, Kabbalistic exegesis is not quite as crazy as it sounds. Semitic languages like Arabic and Hebrew work by using vowel-sounds to systematically modify a root group of consonants called a ‘triliteral’, or triconsonantal root. For example, in Arabic ‘k-t-b’ is the ‘write’ group, and yields kataba, to write; yaktubu, he writes; kitab, book; maktaba, library; and so on. Hebrew does something very similar with the same triliteral. Pararhyme, if you like, is built into their structure: the Kabbalists and Sufis were really just imaginatively extending the rules of their own morphology. English has no such excuse, and pararhymes generally sound a bit perverse if they’re not concealed a little. Wilfred Owen’s ‘Strange Meeting’ is a notable exception, though here the rhymes brilliantly enact the subject. Paul Muldoon conceals his rhymes through a mixture of wide separation and variable line length; a uniform line would foreground them uncomfortably. Alas, Muldoon has virtually copyrighted the procedure in English poetry, making it almost impossible for anyone else to employ it in a way that doesn’t simply sound imitative of him – although he uses pararhyme more and more as a deep compositional procedure, and less and less as a way of unifying the music simply for the reader’s enjoyment. This is seen in the distance placed between his rhymes, which are often many lines and pages apart – and sometimes separated by entire books.8 Nonetheless, the rhymes in his poetry, no matter how distant from each other they find themselves, seem bound by a kind of quantum entanglement.
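The triliteral mechanism mentioned above lends itself to a toy demonstration. The sketch below slots the three consonants of a root into vowel templates; the four templates are simply read off the four Arabic forms quoted above (kataba, yaktubu, kitab, maktaba) and written in a simplified transliteration that ignores vowel length. The template notation, using the digits 1, 2 and 3 for the root consonants, and the function name are assumptions of this illustration, and nothing here should be mistaken for a general account of Arabic morphology.

```python
def apply_pattern(root, pattern):
    """Fill a vowel template with the three consonants of a root.

    In the template, '1', '2' and '3' mark where the first, second and
    third root consonants go; every other character is copied as it stands.
    """
    if len(root) != 3:
        raise ValueError("a triliteral root needs exactly three consonants")
    slots = {"1": root[0], "2": root[1], "3": root[2]}
    return "".join(slots.get(ch, ch) for ch in pattern)


# Templates inferred from the forms quoted in the text, with their glosses.
PATTERNS = [
    ("1a2a3a", "to write"),
    ("ya12u3u", "he writes"),
    ("1i2a3", "book"),
    ("ma12a3a", "library"),
]

for pattern, gloss in PATTERNS:
    print(apply_pattern("ktb", pattern), "-", gloss)
```

Seen this way, the consonants are the fixed frame and the vowels the moving part, which is the relation the pararhymer imposes artificially on English.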

Though it cheers me to think that poetry takes the opposite approach from that of religion: we have long thought of ourselves as starting not with the logos but the pneuma, not with those Platonic consonantal forms, but with the ether that encloses and unites them, the inspiration, the afflatus, the breath – the breath being that infinite possibility into which consonant, not vowel, must be driven to have it make any sense in the currency of our human speech. This strikes me as a far more serious kind of word-game than the Kabbalists ever played (and, of course, it’s the one Muldoon and Owen are principally playing, like all lyric poets). Poets from Tennyson to Antonio Machado have alighted on ‘wind’ as the idealised inspirational source – shaping its one long vowel around every object it meets, making the unity we pursue through our articulation of the specific less an impossible contradiction than a paradox to be dwelt within. And the wind brings weather, words, voices, scents from afar, from impossible elsewheres.9

Song works by ‘unnaturally’ elongating the vowel and diminishing the prominence of the consonant. This can be seen in its treatment of end-rhymes; when sung, the words soon, room, cool, roof will be perceived as close-to-full rhymes, and the longer the note, the fuller they sound. For lyricists, assonantal rhymes are often effectively full rhymes: when Bob Dylan sings ‘Let me sleep in your meadows with the green grassy leaves / Let me walk down the highway with my brother in peace’, we’re quite untroubled by the ‘inaccuracy’ of the rhyme. And as you’d expect, in exaggerating the vowel, singing will often strongly foreground the emotional sense at the expense of the denotative. This is why great songs can survive awful lyrics, and what lies behind statements like ‘She could break your heart singing the phonebook’.10

Instrumental music may be usefully considered as an unbroken vowel, a kind of pure tonal and quantitative speech whose sole purpose is to carry emotion – akin to the spoken vowel, but vastly more supple and articulate. However, the absence of consonantal stops in music means that we are left with something possessive of emotional articulacy, but with no differentiating ability, and so no way of constructing a sign-system; it can therefore have no denotative power.11 When Richard Strauss said, ‘I look forward to the day I can describe a teaspoon accurately in music’, everyone was justifiably sceptical – and indeed Strauss’s teaspoon has remained wholly elusive. In a good jazz ballad solo – especially played on an instrument close in timbre to the human voice, like a tenor saxophone – the timing and pitch of the notes are so closely mapped to the rhythms and cadences of plaintive speech and articulate argument, you can easily imagine ‘enconsonating’ the notes to give a clear denotative sense. Indeed, some ‘vocalese’ artists couldn’t resist doing exactly that with a number of famous jazz solos, with predictably underwhelming results. In a sense, however, that’s just the solution poetry arrives at, albeit from the other direction: in making a shift towards the privileging of the vowel, we restore some of the quantitative length absent from speech; poetry then becomes a kind of transitive, articulate music. In doing so, it leaves the door open to the reader to make their own highly personal intonational interpretation, i.e. the superimposition of subjective, performed sense on its stressed vowels. (But fix those vowels as a melodic series of pitches, and in a sense you destroy the possibility of a wholly personal reading; the poem may now be a serviceable lyric, but it has ceased to be a poem. A ‘set poem’, in a sense, nails its emotional sense to a single interpretation. Of course, singers have found a thousand ways to get round this limitation, even if composers would often rather they didn’t. Since consonant is the tool of differentiation, of denotative meaning, a solid block of consonant like the Torah seems to propose a monosemic source. This is perhaps why the multiple interpretations proposed by Kabbalistic re-envowelling of the Pentateuch struck many as heretical. A block of uninterrupted vowel such as music represents, on the other hand, seems to imply a polysemic source, which is why a single interpretation often seems equally heretical in its ‘precision’ – as well as reductive and redundant.12)

It’s primarily the exaggerated prominence given to the vowel that distinguishes the characteristic noise of the poem from that of prose or conversational speech, though the effect can be a subtle one. It’s also what opens up the poem’s interpretative potential at the level of performance, since the vowel is also where the personality, geographical origin, age, social class, health, size and present mood of the reader find their tonal expression. There is an unconsciously received ideal of ‘beautiful’ English lyric. If we closely examine some representatively ‘beautiful’ texts, we find a backgrounded default where stressed vowels are arranged by careful contrast and variation. Each word retains its distinct spirit, and the reader has the vague sense of it standing in a clearly stated, discrete spatial and temporal relation to the words on either side. Against this varied ground, we also see occasional stark and consciously perceived deviation, which is to say assonance and rhyme. Assonance doesn’t have any effect unless the vowel-changes are continually being rung elsewhere, and is a way of foregrounding important detail:

… I remember no ship

slipping from the dock –

no cluster of hurt, proud family …

… but we have surely gone,

and must knock with brass

kilted pipers

doors to the old land;

we emigrants of no farewell

who keep our bit language

in jokes and quotes;

our working knowledge

of coal-pits, fevers …

(KATHLEEN JAMIE, ‘The Graduates’)15

Some poets use assonance far more than others – but even then it’s rare that you’ll find a long run of similar vowels or assonantal pairs, as anything approaching vowel homophony will diminish assonance’s foregrounding power. However, the ‘varied vowel rule’ may arguably be less a conscious strategy and more a matter of the unconscious avoidance of similarity.

Varied vowels also reinforce the impression that we are indeed saying things once, with especial clarity, just as our ‘invoked hush’ demands. Vowel-variation is a natural feature of both prose and spoken English too, but since they are generally conducted in rapidly delivered phrases, many stressed vowels on content-words are demoted to something approaching ‘schwa’, which I’ll explain shortly. By contrast, lineation slows speech, restores the vowel to its full value and makes us conscious of the tonal quality of its various formants; it means we can then deliberately exaggerate the vowel-varying tendency of spoken English. This is yet another ‘peak shift’ strategy. Try slowly mouthing only the vowel-sounds in the following passages (you might also find it useful to compare them with a random chunk of good journalistic prose as a ‘control’):

‘This man can’t bear our life here and will drown,’

The abbot said, ‘unless we help him.’ So

They did, the freed ship sailed, and the man climbed back

Out of the marvellous as he had known it.

(SEAMUS HEANEY, ‘Lightenings: viii’)17

I would spread the cloths under your feet:

But I, being poor, have only my dreams;

I have spread my dreams under your feet;

Tread softly, because you tread on my dreams.

(W. B. YEATS, ‘He Wishes for the Cloths of Heaven’)18

In each case you should have felt like you were trying to deal with a gigantic, intractable and invisible toffee. Upping the stressed vowel count has the effect of lengthening the line, and is actively achieved by deliberate word-choice, and passively aided by lessening the occurrence of thin or weak vowels. This means, essentially, keeping ‘schwa’ down to a minimum. Schwa is the short neutral vowel sound that occurs in most unstressed syllables, and is represented by the /ə/ symbol in the IPA. Schwa has its roots in the word for ‘nought’ in Hebrew, and its nondescript little grunt can be substituted for any written vowel, if it occurs in an unstressed position: the ‘a’ in ‘abet’ and ‘petal’; the ‘e’ in ‘bagel’; the ‘i’ in ‘stencil’; the ‘o’ in ‘arrogant’ or ‘condition’; the ‘u’ in ‘crocus’ and the ‘y’ in ‘satyr’, and so on. In rapid speech, their numbers multiply. Schwa by definition can’t be stressed or sustained; it’s no more possible to sing a long schwa than it is to play Gregorian chant on the banjo.

One way of de-schwa-ing the line is to avoid too many polysyllabic words drawn from a classical word-base. Though Latin pronunciation was a very different matter, the way the vowels in polysyllabic Latin words come out in Germanic speech collapses them around a dominant stress. (Listen to how short most of the vowels are in words like authoritative, intermittently, inimical and so on.) There’s a marked tendency in the English lyric tradition to just avoid them, and poetry often defaults to the mono- and disyllabic Anglo-Saxon, Norman or Norse word-hoard. However, there’s a more important way of lengthening and aerating the line: through metre. One immediate consequence of writing in metre is that the unstressed syllable count falls dramatically in comparison with a prose passage of equal length. Whatever other purposes it might serve, metrical writing is also a fine way of increasing the relative number of big vowels per sentence, because it insists that every second syllable (in duple metres) or every third (in triple) must take a strong stress. (This is a shocking oversimplification, but by the time we get to ‘Metre’ you’ll be thanking me for it. However, the broad principle holds.) This automatically lowers the schwa-count just by making it harder for unstressed vowels to squeeze in; it insists on an unusually high number of content words to take up those strong stresses, since function words are generally demoted to unstressed vowels in speech; and it even tricks the brain into processing some function words as content words when we drop them into a strong-stress position. (Note that if free verse is going to subscribe to the same lyric ideal, it must supply its large stressed vowels by more deliberate means, and its expert practitioners do precisely this.) This leaves the poetic line unusually information-rich. 
However, as I’ve mentioned, the act of lineation itself compels a slower spoken delivery, metred or not – and this often results in the re-emergence of our gobbled conversational schwas as full vowels. Contemporary poetry may indeed sound close to ‘chopped-up prose’ at times, but we use a magic knife – one it takes a fairly long apprenticeship to learn how to wield. It allows silence to take up residence between the stanzas, lines and words, and for the breath within words to expand. Or, to lineate the words of a certain disgraced US comedian:

Always end

the name

of your child

with a vowel

so when

you yell

the name

will carry

Here, what would be schwa or near-schwa when spoken rapidly in a joke – ‘your’, ‘with’, ‘so’, ‘you’ and ‘will’ – move towards full vowels. This is because short lines tend to be read very slowly with long gaps between them. This alone seems enough to confer a little ‘poetry’ on the statement; while it remains funny, we’re also forced into a slower meditation of its wider resonance. This increase in vowel-presence forms half our ‘lyric ground’. The effect is subtle, and the reader is mostly unconscious of it – but they experience it as a sense of deepened length, space, breath, musicality and tonal (and hence emotional) differentiation.

The other half of the lyric ground is formed by the patterning of consonants. Simply put: if we employ consonants as the tool of semantic differentiation, the project of unifying the sense of our material will be broadly aided by our patterning them. This is, in effect, another global application of the phonosemantic principle described in the first part of this essay. Since consonants take up little space, their musical effect only really becomes audible through either their proximate repetition or their bold deviation from a repetitive ground. (Sound-linked but distant words – like pararhymed pairs – have to be processed consciously, and must therefore be highly salient, i.e. placed in prominent terminal, initial or caesural positions, or lyrically or semantically ‘deviant’. Some critics have the unfortunate habit of identifying non-salient lyric effect in words separated by many lines, and then claiming that the poet had intended some semantic echo. ‘In “Ground Beef” by Terence Unthank, for example, we see how the idea of the death of the parent is pursued through the repetition of the fa- sounds in “fatal” and “father” in lines 3 and 29 …’ and so on. This is popular criticism of the paranoiac school; these effects are nearly always accidental, and indeed imperceptible, unless you go hunting for them.) What we tend to find, just as we did with the variation of vowel-sound, is that in ‘musical’ writing there are almost always subtle patterns of consonantal echo. Here’s an unsubtle example, where the English lyric default has been turned up to 11:

Though it can hold an underrated appeal for the eye, spelling is irrelevant to the ear; a poet who has learned to hear and recognise close phonetic relatives is at a distinct advantage when it comes to their fluent patterning. For compositional purposes, they can often be freely substituted for one another, and the difference between, say, a voiced and unvoiced consonant is relatively slight. I’ll list the most common English sounds here. (I appreciate this book is not a primer, but a few poets may still be reading, and there is no earthly excuse for poets not to know them.) The unvoiced consonant is the first of each pair.

(As a Scot I am obliged to add the rolled ‘r’ or alveolar tap [ɾ], as well as the ‘ch’ of loch, or voiceless velar fricative [x].)

Most of these pairs or groups are close enough to function as interchangeable allophones in one language or another, l and r in Japanese being maybe the most famous example. [See endnote 5 for more on allophones and their role in shibboleths.] This is a very rough list, and the phonic relations between these sounds are far deeper and more interesting than just the voiced/unvoiced pairs, the nasal group, and so on. However, most of the closer cousins will already be familiar through various speech phenomena:

a) Grimm’s law of consonant shift, which showed, amongst other things, how the classical unvoiced p / t / k stops morph into the Germanic unvoiced f / th / h (see also Verner’s Law).

b) Foreign allophones: Spanish betacism, where /b/ and /v/ are blurred; also Spanish /d/ and /r/; Japanese lambdacism, where /l/ and /r/ are blurred; the Filipino /p/ and /f/, as well as differences in pronunciation like French /R/ for /r/.

c) Dialectal pronunciation, as in the Cockney /f/ and /v/ for /θ/ and /ð/ (‘first’ for ‘thirst’ and ‘vose’ for ‘those’); the Scots /x/ for /h/ (‘daachter’ and ‘wecht’ for ‘daughter’ and ‘weight’; and so on. This is less a dialectal variant than a retention from an earlier Germanic speech – it was the English ‘gh’ that became silent). Also /r/ for /ɹ/ or /ɾ/, the burr of Scottish rhotacism; the Scouse /χ/ for final /k/; the almost universal Estuary English droppin’ of /ŋ/ for /n/; the Devon /z/ for /s/ (‘zider’ for ‘cider’); the Shetland /d/ for /ð/ (‘da’ for ‘the’); the US /d/ for /t/ (as in ‘Creadive Wriding’); the Irish /t/ for /θ/ (as in ‘tree fellers’ for ‘three fellers’); the Gaelic /tʃ/ for /dʒ/ (‘Chudge Chudy’ for ‘Judge Judy’). One could easily construct a map of the UK where every vaguely proximate consonant had been reversed somewhere; indeed you soon find bizarre examples like the Fife shibboleth /ʃ/ for /θ/ (‘shree’ for ‘three’) or the Aberdonian /f/ for /ʍ/ (‘fit’ for ‘what’, ‘fun-bus’ for ‘whin-bush’), or the Doric /r/ which becomes a rather French /x/.21

d) Speech impediments: lisping, where /s/ becomes /θ/ (long a source of scurrilous Carry On-style British humour); the childish (or Cockney) /w/ for /r/ (‘weadin’, wi’in’ an’ awiffma’ic’); ‘talking through a cold’, where the nasals /m/, /n/ and /ŋ/ become /b/, /d/ and /g/, the dose being bugged up with bucus.
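As a small aid to that kind of ear-training, the closest of these relations, the voiced and unvoiced pairing, can be written down as a lookup. The pairs below are the standard textbook ones for English, given with the unvoiced member first; they are offered as an assumption of this sketch rather than as the list referred to earlier, and the function names are invented for the purpose.

```python
# Standard English voicing pairs, unvoiced consonant first (IPA symbols).
VOICING_PAIRS = [
    ("p", "b"), ("t", "d"), ("k", "g"), ("f", "v"),
    ("θ", "ð"), ("s", "z"), ("ʃ", "ʒ"), ("tʃ", "dʒ"),
]

# Build a two-way lookup so either member of a pair finds its partner.
PARTNER = {}
for unvoiced, voiced in VOICING_PAIRS:
    PARTNER[unvoiced] = voiced
    PARTNER[voiced] = unvoiced


def counterpart(sound):
    """Return the voiced or unvoiced partner of a consonant, or None."""
    return PARTNER.get(sound)


def differ_only_in_voicing(a, b):
    """True when the two consonants form one of the pairs listed above."""
    return PARTNER.get(a) == b


print(differ_only_in_voicing("z", "s"))    # Devon 'zider': True
print(differ_only_in_voicing("d", "t"))    # US 'Wriding': True
print(differ_only_in_voicing("tʃ", "dʒ"))  # 'Chudge Chudy': True
print(differ_only_in_voicing("f", "θ"))    # Cockney 'first' for 'thirst': False
```

The last case shows the limit of so narrow a table: the Cockney substitution moves the place of articulation rather than the voicing, so a fuller model of consonantal kinship would need to record place and manner as well.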

While a thorough study of these connections can provide poets with some invaluable ear-training, the practical method of achieving a unified consonantal music in the poetic line is simple: one must be alive to the music each line itself suggests in the initial stages of its composition. The poet must divine its consonantal signature, then tighten and sharpen it, vary the vowels, foreground important detail through assonance and anomalous consonants – and eliminate all unproductive, loud sound effects that enhance no sense. It isn’t quite a matter of looking at something like …

But when your fine fizzog filled up his verse,

Then lacked I theme; that really weakened mine.

and thinking – ‘OK, fine fizzog filled … verse is a pointless and gratuitous alliteration; but I can pick up on the ‘k’ of lacked if I change fizzog to countenance, and enfeebled will chime better with filled, and while theme keeps the nasal music going nicely, matter does the same – and also solves the problem of that metre-padding really, as well as keeping these nice clipped plosives going … Let’s try

But when your countenance filled up his line,

Then lacked I matter; that enfeebled mine.

… Nailed it!’ No, few poets would come from so far behind in a first draft. But almost every line will contain some revisions of this sort – and even if these decisions are rarely consciously articulated, a good ear, allied to some intuitive reasoning, will allow us to make them by a pretty similar process.

The poet’s almost pathological sensitivity to the weight and texture of words is a faculty that tends, I suspect, to get burned into the circuits in the age of lyric innocence, during one’s early adventures in ‘voice-finding’ – Pound claimed this occurs between the ages of seventeen and twenty-two – but it could surely be learned by anyone prepared to cultivate the necessary obsession.22 It leaves us with an odd skill: we can often identify, in the ‘given’ line or phrase, a kind of phonosemantic DNA, a generative proposition that, we feel, somehow prefigures the whole poem. The search for ‘what it is we mean’ is then conducted through that narrowed lyric-semantic channel, whose musical colour shifts from line to line, like another twist of the kaleidoscope. The compositional unit of the poetic line will display more and more of this memorable, unifying, song-like, phrase-creating patterning as the poet drafts and redrafts.

First lines are often good places to study such propositions. Take ‘Throw all your stagey chandeliers in wheelbarrows and move them north’ (Robert Crawford, ‘Opera’).23 Besides the surreal drama of the line, look at the assonantal and consonantal echo between ‘throw’ and ‘north’; the sibilants and liquids uniting ‘chandeliers’ and ‘wheelbarrows’; the tonal antithesis of the high-society ‘chandeliers’ and the labouring-class signifier of ‘wheelbarrows’ made all the more palpable through their assonantal connection, and their being both saliently polysyllabic;24 the insistent (Glaswegian) nasals of ‘and move them north’; ‘stagey chandeliers’ presents the line’s most memorable phrase, with the two words close enough in sound that we vaguely sense the ‘pale apple’ effect, that of a phonestheme being conjured from the air.

I’ll cover this in a more technical way when I discuss metre, but it’s important to mention for now that poetic lines tend to something around three seconds in duration. This is the length of the human ‘moment’, and corresponds to what our brains can experience as an unbroken, living instant – an instant whose auditory contents we can then mentally replay, then either choose to remember or not, depending on their salience and importance.25 (Although our brains can perform fairly accurate chronoceptions, our hearing is the only traditional sense which measures time with much accuracy.) The three-second rule has a powerful influence on the form and delivery of the poetic line, which – unsurprisingly, for an art whose cultural function has long been associated with memory and the memorable – universally defaults to a frequency of about 0.3Hz to form an ideal mnemonic unit. This is the frequency of the carrier-wave of poetic meaning. In English, it comes out as a line of between eight and twelve syllables; our most popular literary line, iambic pentameter, strongly reflects this tendency.
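
The arithmetic behind these figures can be made explicit with a minimal sketch in Python. The three-second line and the range of eight to twelve syllables come from the paragraph above; the function names are invented for illustration, and the rates are simple division rather than measurements of anyone's actual delivery.

```python
LINE_SECONDS = 3.0  # the 'three-second' moment described in the text


def line_frequency_hz(seconds=LINE_SECONDS):
    """One line per `seconds` seconds, expressed as a frequency in hertz."""
    return 1.0 / seconds


def syllables_per_second(syllables, seconds=LINE_SECONDS):
    """Average delivery rate if a line of `syllables` fills the whole moment."""
    return syllables / seconds


if __name__ == "__main__":
    print(f"line frequency: {line_frequency_hz():.2f} Hz")
    # Eight and twelve are the limits given in the text; ten is a pentameter.
    for count in (8, 10, 12):
        print(f"{count} syllables: {syllables_per_second(count):.2f} per second")
```

On these assumptions a line every three seconds is a third of a hertz, which rounds to the 0.3 quoted above, and a ten-syllable pentameter moves at a little over three syllables a second.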

The three-second rule also explains what we mean by ‘short’ and ‘long’ lines. Short and long compared with what? Well – with a line that takes around three seconds to deliver. Short lines tend to lengthen the gap between them, as the brain will unconsciously seek to establish a three-second rhythm; similarly, long lines are read more quickly with only brief interlineal pauses as we try to shrink them down to size. This is also the reason that weak vowels on function words are often promoted to something closer to strong in short lines, and strong vowels on content words demoted to schwa in longer ones. All this will receive careful defence and explanation later, but for now it’s important to understand that the three-second rule also influences the lyric weave: a sound-event has about three seconds on either side in which it can be either prepared for, or have its use retrospectively sanctioned. Any further apart, and the reader is unlikely to perceive them as belonging to the same ‘moment’, and connect them. Therefore, as the poet instinctively pursues this strategy in their redrafting, working the line again and again through their lyric handloom, every two- or three-line passage will most often start to exhibit its own distinctive pattern and colour. As I’ve mentioned, most of the sound-correspondences that some commentators identify as working over the distance of several lines simply aren’t. The exceptions are sounds that the poet has made deliberately salient: the sounds that land on internal or terminal rhyme or pararhyme, sounds attached to dramatic or significant sense, rogue and unusual phonemes, or indeed any noise which is prominent enough to allow it to enter the memory, and then be recalled when its strong echo occurs later in the text.

Here’s a short passage from ‘The Dry Salvages’. Look at the immensely deft way in which Eliot weaves a pattern of consonants through the vowel’s warp:

The j in ‘reject’ could have been left standing very lonely indeed, but has been anticipated by the affricate in ‘voyage’, and later consolidated by that of ‘reach’ and ‘angelus’. Note too the guttural chime between ‘dark’ and ‘reject’; the power of the two monosyllables of ‘dark throat’, and their repeated plosive closure; the ‘ships’ / ‘lips’ rhyme; the echoic sibilance of ‘sound of the sea bell’s / Perpetual angelus’; the heavily patterned r/l liquids in the last two lines; the assonance of ‘bell’s / Perpetual’, doubly consolidated in sense by the alliterating labial plosives … and so on. These effects are, of course, not the only reason the passage works – but they’re the principal reason the passage is experienced as sensually beautiful. All this is work completed best by the instinct (and I have little doubt that this is mostly the way Eliot went about it); but the instinct can be consciously trained into making better and more consistent decisions.

1 Patterning seems exceptionally important in the formation of the lexicalised phrase or ‘phraseme’, which takes the forms of idiom, cliché and collocation. We don’t hear ‘below the belt’ as anything to do with belts or punches, since we have learned it as a fixed phrase meaning ‘unfair’ or ‘underhand’ – i.e. it has effectively been learned as a word like ‘house’ or ‘faith’, for which an approximate synonym (there is no other kind) can be readily provided. A phrase seems to have a better chance of becoming a phraseme when its sounds observe some assonantal or consonantal patterning; this clearly aids the left brain’s memorisation and storage of the term along the axis of selection.

2 Though this is nothing compared to Pirahã, the language of Pirahãn Amazons, which has around 10. ‘Xigihai xoi kapioxiai. Xigioawaxai? xai xigiai xaaga. Ti gaisai. Xigiaixaaga. Xaooii xoai?’ – seems a typical sample, as far as I can tell. They compensate for their limited sound-palette in the most beautiful way, however, with an intonational prosody so complex that their whole language can be encoded in music, and whistled or hummed instead of spoken. Note that this might appear directly to contradict my later assertion in this essay that music cannot be denotative; but this is less ‘music’ than a tonally encoded sign-system. If a language had a single phoneme, it would have to delegate its sign-making capability to intonational and qualitative manipulations of its one vowel, and such ‘music’ as it produced would be the mere byproduct of its sign-system (though this would, I suppose, provide one theory for the evolution of song in an ‘oligophonemic’ ur-language).

Much controversy surrounds Pirahã, and the linguist Daniel Everett has claimed that the apparent absence of recursion in Pirahã contradicts a fundamental tenet of the Chomskyan model – Daniel L. Everett, ‘Cultural Constraints on Grammar and Cognition in Pirahã: Another Look at the Design Features of Human Language’, Current Anthropology 46 (2005): 621–46. This claim has been widely discredited: Pirahã does indeed demonstrate recursive features; and besides, its speakers have no trouble learning a normal recursive language like Portuguese, for which it seems their brains are just as grammatically well-equipped as those of any other human.

3 Spam text is produced using Markov chains, a stochastic algorithm which spits out a sequence of random variables; these can be used to generate ‘plausible’ fake texts from a series of real ones. I’ll say more on this later.
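
For the curious, the principle can be sketched in a few lines of Python. This is a toy word-level chain under stated assumptions, not a claim about how any actual spam generator is built: the function names and the sample sentence are invented for illustration, and real systems will differ in scale and detail.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the source."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, length, seed=None):
    """Walk the chain: each next word depends only on the current word."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word is never followed in the source
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Toy corpus, invented for illustration.
SOURCE = "the cat sat on the mat and the dog sat on the log"

if __name__ == "__main__":
    print(generate(build_chain(SOURCE), "the", 8))
```

Every adjacent pair of words in the output has occurred somewhere in the source, which is why the result looks locally plausible while meaning nothing as a whole.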

4 Robert Bridges, ‘Nightingales’, The Humours of the Court, and Other Poems (London: George Bell and Sons, 1893).

5 Of course, the formal conventions of Anglo-Saxon verse are perceived as contrivance nowadays, but they would have been passed over as unremarkable by the Anglo-Saxon ear as the culturally agreed mode of poetic artifice, the invisible fashion of the day. Some US readers feel that full rhyme is just as much of a glaring anachronism, a fact that causes me some misery.

6 Hence, I suspect, our stubborn superstition that an anagram of someone’s name reveals something of their inner character. (Florence Nightingale = angel of the reclining, etc. Although my favourite is Elvis = lives.) I also have some strong anecdotal evidence that editors are peculiarly susceptible to submissions made by authors whose names are anagrams of their own; we are all our own blind spots.

7 Paul Muldoon, Poems 1968–1998 (London: Faber & Faber, 2011).

8 Note that, if we’re using pararhyme purely as a compositional aid, we can bury the rhymes by simply not placing them in a terminal position. You’ll lose the reader’s applause, as they’ll likely be none the wiser; but it’s a legitimate use of a technique that, like syllabics, really has nothing to do with the reader at all. (Though I confess I was once reduced to the pathetic move of pointing out that a poem of mine was based on twenty-eight variations on the same consonantal string, as ten years had passed without anyone noticing – something a less insecure ego might have been pleased about.)

9 It seems to me that this universe operates on a principle of rapid alternation between ‘pneuma and logos’, duration and event, syntagm and paradigm, function and content – a pattern clearly imprinted in the basic linguistic phenomenon of metre. It is also realised in the vowel-consonant-vowel-consonant pattern of our speech, and further reflected in our unconscious mapping of consonants to ‘contentual’ qualities of physical boundary, and of vowels to ‘functional’ qualities of spatiotemporal relation.

10 Vowel-mangling can happen in other ways too. Most librettists will have had the miserable experience of hearing their best line rendered unintelligible by its being set for long notes in the upper register of the soprano voice, which has the vowel /i/, and no other. In their defence, though, composers have devised a million other ways to destroy a good line. This is one reason the librettist–composer relationship should sensibly be considered a co-operative rather than collaborative one. Hand it over, walk away and let them do their worst.

11 ‘Denotative’ in the normal sense of a sign indicating a concept. However, I believe there is a seme at work in music that corresponds to what I call the ‘patheme’. I’ll discuss this in Part II.

12 As anyone who has suffered Eddie Jefferson’s miserable enconsonation of Coleman Hawkins’s immortal solo on ‘Body and Soul’ will testify with alacrity: ‘Don’t you know he was the king of the saxophone – yes indeed he was, talkin’ ’bout the guy who made it sound so good, some people knew him as The Bean but Hawkins was his name. He sure could swing,’ etc.

13 Philip Larkin, Collected Poems, ed. Anthony Thwaite (London: Faber & Faber, 2003).

14 Seamus Heaney, New Selected Poems 1966–1987 (London: Faber & Faber, 1990).

15 Kathleen Jamie, Jizzen (London: Picador, 1999).

16 Elizabeth Bishop, Poems (New York: Farrar, Straus & Giroux, 2011), 43.

17 Seamus Heaney, Seeing Things (London: Faber & Faber, 1991).

18 W. B. Yeats, The Collected Poems of W. B. Yeats (Ware: Wordsworth Editions, 1994), 59.

19 As well as all its bonkers staple-gun consonants, there are a few too many assonantal pairs in this poem for my taste too. Hopkins is a fine poet, of course – but he also holds a perennial appeal for the tone deaf, as even they cannot fail to hear his glorious racket. Many of Hopkins’s effects are spectacular, but are sometimes achieved at the expense of his quieter ones; he can wander rather too cheerfully over the line between baroque lyricism and demented echolalia, and the result is language that can sound – to me at least – unrelated to anything we ever experience as natural human speech. Nonetheless, no poet better trains the apprentice lug, and neither Seamus Heaney nor George Mackay Brown (to name two marvellously attuned poets) would have grown the ears they had without him.

20 The last two categories are more properly ‘approximants’, but old habits die hard. ‘Glides’ and ‘liquids’ are nicer and more expressive names.

21 It’s off the chart in more ways than one, but I do want to mention my favourite substitution – the Dundee dialectal swapping of nearly everything for the voiceless glottal plosive, leaving us with a phonemic palette as impoverished as something you’d find up a mountain in Papua New Guinea. Thus a statement like I ate all of it, didn’t I – a common response to most food-related enquiries in Tayside – becomes Eh ay’ aa’ o’ i’ dih’ uh. (In an even more brutal linguistic economy, the negative affix is also dropped entirely: ‘did’ means both ‘did’ and ‘didn’t’. Thus Eh dih’ dae i’, dih’ uh is ‘I didn’t do it, did I’. An extra glottal stop standing in as the -n’t affix – dih’ /’/ uh – would be regarded as an effete refinement.) As the joke goes round here – Q: ‘And how many ts in Paterson, sir?’ A: ‘Nane’. The aleph of the Hebrew alphabet, incidentally, used to represent the sacred ur-consonant of the glottal stop, from which we can reasonably surmise that God talks with a Dundee accent.

22 Perhaps ‘poet’ can be simply defined as ‘anyone prepared to cultivate the necessary obsession with writing poems’. We now know that just about any obsessive practice, indulged for long enough, will effect physical change in the corresponding part of the brain, making it in turn easier to indulge and perfect the practice itself. Einstein’s brain turned out to be just as anomalous as we’d hoped – he was missing a bit of his lateral sulcus, which may or may not have helped with neuronal communication; but there was also a statistical bump in the glial cell count in the left inferior parietal region, a part of the associative cortex – which is where we synthesise information from other areas of the brain. It’s speculated that this may have been the product of a lifetime spent in the environment of scientific problem-solving. Perhaps most geniuses ‘mutantise’ themselves; it would surely be weirder to find that Shakespeare’s brain had not grown a little distorted in its language and empathy centres.

23 Robert Crawford, ‘Opera’, Selected Poems (London: Jonathan Cape, 2005).

24 This is a very Shakespearian trick: the poet creates a kind of tonal oxymoron by uniting an antithesis or deepening a paradox through a bonding lyric effect. Look at this palindromic assonance from Sonnet 79: ‘My verse alone had all thy gentle grace; / But now my gracious numbers are decayed.’ The frisson will be registered, but the means by which it has been achieved will remain hidden.

25 The relevance to poetry of the ‘auditory present’ was first affirmed in Ernst Pöppel and Frederick Turner’s famous paper ‘The Neural Lyre’ (1983); its research has been subsequently validated, and the ‘three-second line’ has become many a poet’s favourite scientific factoid. However, I have grave reservations over the original paper’s reactionary conclusions, which can be roughly summarised as ‘formal verse is therefore better than free’, a silly and unsupportable position. The three-second rule has implications that are, if anything, even more profound for free than formal verse.

26 T. S. Eliot, ‘The Dry Salvages’, Four Quartets (London: Faber & Faber, 1979).