Chapter 10
Grasping the Melody of Language
In This Chapter
Using juncture for different speaking styles and rates
Exploring the syllable and stress assignment
Patching with sonority and prominence measures
Transcribing is more than just getting the vowels and consonants down on paper. You need that extra zest! For instance, you should be able to describe how phonemes and syllables join together, a property called juncture. A phonetician must be able to hear and describe the melody of language, focusing on patterns meaningful for language. This important sound aspect, called prosody, gives speech its zing and is described with a number of specialized terms. This chapter gives you the tools to handle bigger chunks of language, so that you can master description of the melody of language.
Joining Words with Juncture
Unless you’re a lifeless android (or have simply had a very bad night), you probably don’t say things such as “Hel-lo-how-are-you-to-day?” That is, people don’t often speak one word (or syllable) at a time. Instead, speech sounds naturally flow together. Juncture is the degree to which words and syllables are connected in a language. These sections explain some characteristics of juncture and help you transcribe it.
Knowing what affects juncture
A number of factors can affect juncture, including the following:
Some factors are language-specific. Some languages (such as Hawaiian) break things up and have relatively little carryover between syllables, while other languages (such as French) allow sounds to be run together. In French, the process of sounds blending into each other is called liaison, in which sounds change across word boundaries. Check out these two examples:
In these examples, the syllables of Hawaiian have little effect on each other, whereas the French has resyllabification (the shift of a syllable boundary) and a voicing of an underlying /s/ sound — a clear example of adjacent sounds affecting each other.
Other factors are more personal. They include speaking formality and rate. Think about how your speech changes when you formally address a group versus talking casually with your friends. In a formal setting, you usually use more polite forms of address (sir and madam), fancier terms for things (restroom or public convenience instead of john or loo), and frillier sentence constructions (Would you kindly pass the hors d’oeuvre please? instead of Yo. The cheese, please?).
In informal speech, talkers usually have less precise boundaries than in formal speech. This register change often interacts with rate, because rapid speech often causes people to undershoot articulatory positions (not reach full articulatory positions). The result can be vowel centralization (sounds taking on more of an [ə]-like quality), de-diphthongization (diphthongs becoming monophthongs), changes in consonant quality (such as the tongue moving less completely to make speech sounds), and changes in juncture boundaries (including one boundary shifting into another).
Check out these examples from American and British English:
Changes in register and style clearly affect juncture (how speech sounds are connected in terms of pauses or gaps). Some phoneticians refer to juncture as oral punctuation because it acts somewhat like the commas and periods in written language.
Transcribing juncture
You can transcribe juncture in a couple of different ways. They are as follows:
Close juncture: This default way of transcribing shows that sounds are close together by placing IPA symbols close together in transcription from phoneme to phoneme. An example is “Have a nice day!” /hævə naɪs ˈdeɪ/.
Open juncture: You use open juncture (also referred to as plus juncture) symbols when you need to emphasize gaps separating sounds. Consider these two expressions:
“Have a nice day!” /ˈhævə + naɪs ˈdeɪ/
“Have an ice day!” /ˈhævən + aɪs deɪ/
Many speakers would probably produce this second example (“Have an ice day”) with a glottal stop before the vowel of ice, as a way of marking the gap between the words “an” and “ice.” To distinguish these two expressions, the exact placement of the gap relative to the /n/ is critical: it falls before the /n/ in “a nice” and after it in “an ice.” Therefore, open juncture symbols are helpful.
I . . . went to the store.
I went . . . to the store.
I went to . . . the store.
I went to the . . . store.
And so on. You get the idea. Transcribing all the potential variations in the exact same way wouldn’t make sense. What’s important is showing where all the gaps take place. Many phoneticians use the IPA pipe symbol ([ǀ]), which technically indicates a minor foot, a prosodic unit that acts like a comma (I describe it in greater detail in Chapter 11). However, many transcribers also use this symbol to represent a short pause, whereas they use a double bar ([‖]) to represent a long pause, such as at the end of a sentence. Here are some examples:
/aɪ ǀ wɛn tə ðə stɔɹ‖/
/aɪ wɛnt ǀ tə ðə stɔɹ‖/
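The pause-marking convention above can be sketched in a few lines of Python. This is only an illustration: the function name mark_pauses, the input format, and the 0.5-second cutoff between a short and a long pause are assumptions made for the example, not values taken from the IPA or from this chapter.

```python
SHORT_MARK = "ǀ"   # short pause
LONG_MARK = "‖"    # long pause, such as at the end of a sentence
LONG_PAUSE = 0.5   # hypothetical cutoff in seconds (an assumption)

def mark_pauses(words, pauses, long_pause=LONG_PAUSE):
    """Join transcribed words, marking any pause that follows a word.

    words  -- list of IPA strings, one per word
    pauses -- list of pause durations in seconds, one per word
              (0 means no audible pause after that word)
    """
    pieces = []
    for word, pause in zip(words, pauses):
        pieces.append(word)
        if pause >= long_pause:
            pieces.append(LONG_MARK)
        elif pause > 0:
            pieces.append(SHORT_MARK)
    return "/" + " ".join(pieces) + "/"

words = ["aɪ", "wɛnt", "tə", "ðə", "stɔɹ"]
# A short pause after "went" and a long pause at the end of the sentence.
print(mark_pauses(words, [0, 0.2, 0, 0, 0.8]))
```

A real transcriber decides what counts as short or long by ear, so treat the cutoff as a knob to adjust rather than a rule.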
Emphasizing Your Syllables
A syllable is something everyone knows intuitively, but it can drive phoneticians nuts trying to pin it down precisely. By definition, a syllable is a unit of spoken language consisting of a single uninterrupted sound formed by a vowel, diphthong, or syllabic consonant, with other sounds preceding or following it. Phoneticians don’t see the definition as so cut and dried.
Phoneticians consider a syllable an essential unit of speech production. It’s a unit with a center having a louder portion (made with more air flow) and optional ends having quieter portions (made with less air flow). Phoneticians agree on descriptive components of an English syllable, as shown in Figure 10-1.
Illustration by Wiley, Composition Services Graphics
Figure 10-1: Parts of an English syllable.
From Figure 10-1, you can see that an English syllable (often represented by the symbol sigma [σ]) consists of an optional onset (beginning) and a rhyme (main part). The rhyme consists of the vowel and any consonants that come after it. Two syllables rhyme when this part sounds alike in both. At a finer level of description, the rhyme is divided into the nucleus (the vowel part) and the coda (tail or end) where the final consonants are. From this figure, you can take a word like “cat” and identify the different parts of the syllable. For “cat” (/kæt/), the /k/ is the onset, /æ/ is the nucleus, and the /t/ is the coda.
This is why this type of poem rhymes:
Roses are red, violets are blue. . . .
blah blah blah blah, blah blah blah blah . . . you.
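The onset, nucleus, and coda split described above can be tried out in code. The sketch below is a minimal illustration, assuming a tiny hand-picked vowel list and one syllable at a time; the names VOWELS, split_syllable, and rhyme_of are made up for this example.

```python
# A few vowel symbols for the examples; a real inventory would be longer.
# The diphthong comes first so that it is matched as a whole.
VOWELS = ("aɪ", "æ", "ɛ", "ɪ", "u")

def split_syllable(ipa):
    """Split one syllable (an IPA string) into (onset, nucleus, coda)."""
    for i in range(len(ipa)):
        for vowel in VOWELS:
            if ipa.startswith(vowel, i):
                return ipa[:i], vowel, ipa[i + len(vowel):]
    raise ValueError("no vowel found in " + ipa)

def rhyme_of(ipa):
    """Return the rhyme: the nucleus plus the coda."""
    onset, nucleus, coda = split_syllable(ipa)
    return nucleus + coda

print(split_syllable("kæt"))                # onset k, nucleus æ, coda t
print(rhyme_of("blu") == rhyme_of("ju"))    # "blue" and "you" share a rhyme
```

The onset is thrown away when comparing rhymes, which is why “blue” and “you” match even though they begin differently.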
Languages vary considerably in which kinds of onsets and codas are allowed. Table 10-1 shows some samples of syllable types permissible for English.
Table 10-1 Sample Syllable Types in English
Example | IPA | Syllable Type
eye | /aɪ/ | V
hi | /haɪ/ | CV
height | /haɪt/ | CVC
slight | /slaɪt/ | CCVC
sliced | /slaɪst/ | CCVCC
sprints | /spɹɪnts/ | CCCVCCC
The last column lists a common abbreviation for each syllable type, where “C” represents a consonant and “V” represents a vowel or diphthong. For instance, “eye” is a single diphthong and thus has the syllable structure “V.” At the bottom of the table, “sprints” consists of a vowel preceded and followed by three consonants, having the structure “CCCVCCC.”
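The C and V abbreviations in Table 10-1 can be worked out mechanically. Here is a rough sketch, assuming a small vowel list that covers only the table’s examples; the names DIPHTHONGS, MONOPHTHONGS, and cv_skeleton are invented for the illustration.

```python
DIPHTHONGS = ("aɪ",)           # counted as a single V
MONOPHTHONGS = set("æɛɪiu")    # just enough vowels for these examples

def cv_skeleton(ipa):
    """Turn an IPA string for one syllable into its C/V pattern."""
    skeleton = []
    i = 0
    while i < len(ipa):
        if ipa.startswith(DIPHTHONGS, i):
            skeleton.append("V")
            i += 2                     # skip both symbols of the diphthong
        elif ipa[i] in MONOPHTHONGS:
            skeleton.append("V")
            i += 1
        else:
            skeleton.append("C")       # anything else is treated as a consonant
            i += 1
    return "".join(skeleton)

for word in ["aɪ", "haɪ", "haɪt", "slaɪt", "slaɪst", "spɹɪnts"]:
    print(word, cv_skeleton(word))
```

Treating every non-vowel symbol as a consonant is a shortcut that works for these words but would trip over stress marks or diacritics.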
Strings of consonants next to each other are called consonant clusters (or blends). Each language has its own rules for consonant cluster formation. The permissible types of consonant clusters in English are, well, rather odd. Figure 10-2 shows some of the English initial consonant clusters in a chart created by the famous Danish linguist, Eli Fischer-Jørgensen.
Illustration by Wiley, Composition Services Graphics
Figure 10-2: Some English syllable-initial consonant clusters.
Notice the phonotactic (permissible sound combination) constraints at work in Figure 10-2. It’s possible to have sm- and sn- word beginnings, but not sd-, sb-, or sg-. There can be an spl- cluster, but not a ps- or psl- cluster.
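A phonotactic check like this boils down to looking a cluster up in a list. The sketch below uses only the clusters named in the paragraph above, so anything else comes back as “unknown”; the names ALLOWED, DISALLOWED, and onset_status are assumptions made for the example.

```python
# Only the clusters mentioned in the text; the full English list is much longer.
ALLOWED = {"sm", "sn", "spl"}
DISALLOWED = {"sd", "sb", "sg", "ps", "psl"}

def onset_status(cluster):
    """Report whether a word-initial cluster is allowed, not allowed, or unknown."""
    if cluster in ALLOWED:
        return "allowed"
    if cluster in DISALLOWED:
        return "not allowed"
    return "unknown"

for cluster in ["sm", "spl", "sd", "psl", "tr"]:
    print(cluster, onset_status(cluster))
```

Returning “unknown” instead of guessing keeps the sketch honest about how little of the chart it actually contains.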
Stressing Stress
Nothing makes a person stand out as a foreign speaker more than placing stress on the wrong syllable. In order to effectively teach English as a second language, transcribe patient notes for speech-language pathology purposes, or work with foreign accent reduction, you need to know how and where English stress is assigned. This, in turn, requires an understanding of phonetic stress at the physiologic and acoustic levels.
Table 10-2 Physical, Acoustic, and Perceptual Markers of Stress in English
Articulatory Change | Acoustic Change | Perceptual Impression
Increased airflow, greater intensity of vocal fold vibration | The amplitude increases | Louder
Increased duration of vocal and consonantal gestures | The duration increases | Longer sound (“length”)
Higher rate of vocal fold vibration | The fundamental frequency increases | Higher pitch
In each case (whether you’re correctly or incorrectly pronouncing it), the stressed syllable should sound as if someone cranked up the volume. The following sections tell you more about how stress operates at the word, phrase, and sentence level in English.
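To see how the three acoustic cues in Table 10-2 might be combined, here is a rough sketch that picks the most stressed syllable in a word. The function name most_stressed, the equal weighting of the cues, and the sample numbers are all assumptions for illustration; they are not measurements or a method from this chapter.

```python
def most_stressed(syllables):
    """Return the index of the syllable with the strongest combined cues.

    syllables -- list of dicts with 'amplitude', 'duration', and 'f0' keys.
    Each cue is divided by its largest value in the word, so all three
    cues sit on the same 0-to-1 footing before they are added together.
    """
    cues = ("amplitude", "duration", "f0")
    maxima = {cue: max(s[cue] for s in syllables) for cue in cues}
    scores = [sum(s[cue] / maxima[cue] for cue in cues) for s in syllables]
    return scores.index(max(scores))

# Made-up values for a two-syllable word, for illustration only.
word = [
    {"amplitude": 70.0, "duration": 0.22, "f0": 190.0},
    {"amplitude": 62.0, "duration": 0.15, "f0": 160.0},
]
print(most_stressed(word))   # index of the louder, longer, higher syllable
```

Listeners don’t necessarily weigh loudness, length, and pitch equally, so the equal weighting here is a convenience rather than a claim.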
Eyeing the predictable cases
Stress serves four important roles in English. They are as follows:
Lexical (word level): When you learn an English word, you learn its stress. This is because stress plays a lexical (word specific) role in English: it’s assigned as part of the English vocabulary. For example, syllable is pronounced /ˈsɪləbəl/, not /sɪˈlʌbəl/ or /sɪləˈbʌl/.
Noun/verb pairs: In English, stress also distinguishes different functions of words. Try saying these noun-verb pairs, and listen to how stress alternation makes a difference (the stress mark [ˈ] in the IPA column precedes each stressed syllable):
Spelling | Part of Speech | IPA
(to) record | Verb | [ɹəˈkʰɔɹd]
(a) record | Noun | [ˈɹɛkɚd]
(to) rebel | Verb | [ɹəˈbɛɫ]
(a) rebel | Noun | [ˈɹɛbɫ̩]
These stress contrasts are common in stress-timed languages, such as English and Dutch (whereas tone languages, such as Vietnamese, may distinguish word meaning by contrasts in pitch level or pitch contour on a given syllable).
Compounding: With compounding, two or more words come together to form a new meaning, and more stress is given to the first than the second. For example, the words “black” and “board” create “blackboard” /ˈblækbɔɹd/.
Also, the juncture is closer than in a corresponding adjective + noun construction. For example, if you pronounce the following pair, you’ll notice a longer pause between the words in the first example (the adjective + noun row) than between the words in the second example (the compound noun row).
Grammatical Role | English | IPA
Adjective + noun | a black board | /ə blæk ˈbɔɹd/
Compound noun | a blackboard | /ə ˈblækbɔɹd/
Emphasis in phrases and sentences: Also known as focus, this is a pointer-like function that draws attention to a part of a phrase or sentence. By making a certain syllable louder, longer, and higher, the talker subtly changes the meaning. It’s as if the utterance answers a different question. For example (the emphasized word is capitalized):
DYLAN sings better than Caruso. (Who sings better than Caruso?)
Dylan SINGS better than Caruso. (What does Dylan do better than Caruso?)
Dylan sings better than CARUSO. (Who does Dylan sing better than?)
People handle this kind of subtlety every day without much problem. However, just think how difficult it is to get computers to understand this type of complexity.
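For a small taste of the bookkeeping a computer needs, the noun/verb pairs listed earlier can be stored and checked for where the stress mark falls. This is a minimal sketch; the names PAIRS, STRESS, and has_initial_stress are made up for the example, and the transcriptions are the ones given in the noun/verb list above.

```python
STRESS = "ˈ"   # the IPA primary stress mark

# Transcriptions of the noun/verb pairs from this chapter.
PAIRS = {
    ("record", "verb"): "ɹəˈkʰɔɹd",
    ("record", "noun"): "ˈɹɛkɚd",
    ("rebel", "verb"): "ɹəˈbɛɫ",
    ("rebel", "noun"): "ˈɹɛbɫ̩",
}

def has_initial_stress(ipa):
    """True if the stress mark comes before the first syllable."""
    return ipa.startswith(STRESS)

for (word, part_of_speech), ipa in PAIRS.items():
    place = "first syllable" if has_initial_stress(ipa) else "later syllable"
    print(word, part_of_speech, place)
```

In these pairs the nouns carry stress on the first syllable and the verbs on the second, which is exactly what the lookup reports.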
Identifying the shifty cases
For the most part, English stress remains fairly consistent. However, some cases realign and readjust. You may think of it as a musical score having to be switched around here and there to keep with the rhythm. These adjustments, called stress-shift, are a quirky part of English phonology.
1. Say “clariNET MUsic” three times, stressing the capitalized syllables.
Doing so sounds a bit awkward, right? It should have been more difficult because two stressed syllables had to butt up against each other.
2. Say “CLArinet MUsic” three times.
You should notice that this second pattern flows more naturally because it permits the usual English stress patterns (strong/weak/strong/weak) to persist.
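The clash-and-shift idea in these two steps can be written as a toy rule. In the sketch below, each word is a list of stress levels, one per syllable (2 for primary stress, 1 for secondary, 0 for unstressed). That number coding, the function name shift_stress, and the choice to demote the final syllable to level 1 are assumptions for illustration, not a full account of English stress shift.

```python
def shift_stress(first, second):
    """Return a copy of `first` with its main stress moved back on a clash.

    A clash here means the last syllable of `first` and the first
    syllable of `second` both carry primary stress (level 2).
    """
    shifted = list(first)
    clash = shifted[-1] == 2 and second[0] == 2
    if clash and 1 in shifted[:-1]:
        earlier = shifted.index(1)    # first syllable with secondary stress
        shifted[earlier] = 2          # promote it to primary stress
        shifted[-1] = 1               # demote the clashing syllable
    return shifted

clarinet = [1, 0, 2]   # cla-ri-NET when said on its own
music = [2, 0]         # MU-sic
print(shift_stress(clarinet, music))
```

When the next word doesn’t begin with a stressed syllable, the function leaves the first word alone, which matches the idea that the shift happens only to avoid a clash.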
Sticking to the Rhythm
Another way an English speaker can show adeptness with the language is having the ability to use English sentence rhythm patterns, where greater stresses occur at rhythmic intervals, depending on talking speed. To get a sense of these layered rhythms, consider these initially stressed polysyllabic words: “really,” “loony,” “poodle,” “swallowed,” “fifty,” “plastic,” and “noodles.”
When you put them together in a sentence, they form:
The really loony poodle swallowed fifty plastic noodles.
Although speaking this sentence is possible in many fashions, a typical way people produce it is something like this:
The REAlly loony POOdle swallowed FIFty plastic NOOdles.
That is, regularly spaced, strongly stressed syllables (capitalized) are interspersed with words that still retain their primary stress (such as “loony”), yet they’re relatively deemphasized in sentential context. This kind of timing is rhythmic and can reach high levels in art forms like vocal jazz (or perhaps, rap). Chapter 11 discusses ways you can transcribe this kind of information.
Tuning Up with Intonation
In phonetics, sentence-level intonation refers to the melodic patterns over a phrase or sentence that can change meaning. For instance, rising or falling melodic patterns can change a statement to a question, or vice versa. Intonation is quite different from tone, which refers to phoneme-level pitch differences that affect word meaning in languages such as Mandarin, Hausa, or Vietnamese (see Chapter 18). English really has no tone. The following sections take a closer look at the three patterns of sentence-level intonation that you find in English.
Making simple declaratives
A basic pattern of English intonation is the simple declarative sentence, which is a statement used to convey information. A couple of examples are “The sky is blue” or “I have a red pencil box.” English speakers typically produce such statements with falling intonation.
Falling intonation seems to be a universal pattern, perhaps due to the fact that it takes energy to sustain the thoracic pressure needed to keep the voice box (larynx) buzzing. As a person talks, the air pressure drops and the amount of buzzing tends to drop, causing the perceived pitch to fall, as well.
Answering yes-no questions
The second pattern of sentences is called the “yes/no question.” When you’re asking a question that has a yes or no answer, you probably have rising intonation. This means you start low and end high.
You probably noticed these English statements (“The sky is blue?”) have now turned into questions. Specifically, they’re questions that can be answered with yes or no answers. This rising pitch pattern for questions is fairly common among the world’s languages. For instance, French forms most questions in this manner. Note: Some languages don’t use intonation at all to form a question. For instance, Japanese forms questions by simply sticking the particle /ka/ at the end of a sentence.
Focusing on “Wh” questions
The third pattern includes English questions formed with Wh-words, including “who,” “what,” “when,” “where,” “why,” and “how.” These questions are produced with falling pitch, rather than rising. Try a few, while determining whether your voice goes up or down:
Who told you that?
What did he say?
When did he tell you?
Where will they take you?
Why are you going?
How much will it cost?
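If you have pitch measurements for an utterance, a very rough way to label its melody as rising or falling is to check the overall trend. The sketch below is a simplification: it looks at the trend across the whole utterance, whereas the rise or fall that matters is often concentrated at the end. The function name contour_direction and the sample values are assumptions for illustration.

```python
def contour_direction(f0_values):
    """Label a pitch contour as 'rising', 'falling', or 'level'.

    f0_values -- fundamental frequency samples in Hz, in time order.
    Uses the sign of a least-squares slope fitted over the samples.
    """
    n = len(f0_values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(f0_values) / n
    # Numerator of the slope formula; the denominator is always
    # positive, so the sign of this sum is the sign of the slope.
    trend = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(f0_values))
    if trend > 0:
        return "rising"
    if trend < 0:
        return "falling"
    return "level"

# Made-up pitch values, for illustration only.
print(contour_direction([200, 190, 175, 160]))   # statement-like fall
print(contour_direction([160, 170, 185, 210]))   # question-like rise
```

Real pitch tracks are noisy and have gaps at voiceless sounds, so a practical version would need smoothing before any label could be trusted.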
Showing Your Emotion in Speech
When someone talks, part of the melody serves a language purpose, and part serves an emotional purpose. When you’re transcribing speech, you need to understand emotional prosody because it can interact in complex ways with the linguistic functions of prosody. In fact, people can show many emotions in speech, including joy, disgust, anger, fear, sadness, boredom, and anxiety.
Studies have shown that people express happiness (joy) and fear at higher frequency ranges (heard as pitch) than emotions such as sadness. Anger seems to be an emotion that can go in two directions, phonetically:
Hot anger: When people go up high with the voice and show much variability.
Cold anger: When people are brooding with low pitch range, high intensity, and fast attack times (sudden rise in amplitude) at voice onset.
Fine-Tuning Speech Melodies
Phoneticians can be sticklers for detail. They just don’t like messy bits left over. In addition to the different types of stress, intonation, focus, and emotional prosody, certain aspects of speech melody still require measures to account for them. These sections examine two such measures.
Sonority: A general measure of sound
Sono- means sound, and sonority is therefore a measure of the relative amount of sound something has. Technically, sonority refers to a sound’s loudness relative to that of other sounds having the same length, stress, and pitch. This measure of sound is particularly handy for working with tone languages, such as Vietnamese, where decisions about tone structure are important.
The concept of sonority is relative, which means phoneticians often refer to sonority hierarchies or scales. In a sonority hierarchy, classes of sounds are grouped by their degree of relative loudness. Check out www-01.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsTheSonorityScale.htm for an example of one.
A sonority scale expresses more fine-grained details. For instance, according to phonologist Elizabeth Selkirk, English sounds show the following ranking:
([ɑ] > [e=o] > [i=u] > [r] > [l] > [m=n] > [z=v=ð] > [s=f=θ] > [b=d=ɡ] > [p=t=k])
If you try out some points on this scale, you’ll hear, for example, that [ɑ] is more sonorous than [i] and [u].
Sonority is an important principle regulating many phonological processes in language, including phonotactics (permissible combinations of phonemes), syllable structure, and stress assignment.
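The ranking quoted above can be turned into numbers and used to compare sounds or to count sonority peaks in a string of sounds. This is a sketch under assumptions: the numeric ranks (10 down to 1), the names SCALE, RANK, more_sonorous, and count_peaks, and the example sound strings are all made up for illustration.

```python
# The ranking quoted above, from most to least sonorous.
SCALE = [
    ["ɑ"], ["e", "o"], ["i", "u"], ["r"], ["l"], ["m", "n"],
    ["z", "v", "ð"], ["s", "f", "θ"], ["b", "d", "ɡ"], ["p", "t", "k"],
]
# Give each sound a number: 10 for the top group down to 1 for the bottom.
RANK = {sound: len(SCALE) - level
        for level, group in enumerate(SCALE) for sound in group}

def more_sonorous(a, b):
    """True if sound `a` ranks higher on the scale than sound `b`."""
    return RANK[a] > RANK[b]

def count_peaks(sounds):
    """Count sounds that are more sonorous than both of their neighbors."""
    ranks = [0] + [RANK[s] for s in sounds] + [0]
    return sum(1 for i in range(1, len(ranks) - 1)
               if ranks[i - 1] < ranks[i] > ranks[i + 1])

print(more_sonorous("ɑ", "i"))
print(count_peaks(["p", "l", "ɑ", "n"]))   # one peak, on the vowel
print(count_peaks(["m", "ɑ", "n", "i"]))   # two peaks, one per vowel
```

In these two made-up strings each peak lines up with a vowel, which hints at why sonority is tied to syllable structure.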
Prominence: Sticking out in unexpected ways
When all is said and done, some problem cases of prosody can still challenge phoneticians. One such problem is exactly how stress is assigned to syllables in words. For instance, some English words can be produced with different numbers of syllables. Consider the words “frightening” and “maddening.”
Do you say them with two syllables, such as /ˈfɹaɪtnɪŋ/ and /ˈmædnɪŋ/? Or do you use three syllables, such as /ˈfɹaɪtənɪŋ/ and /ˈmædənɪŋ/? Or sometimes with two and sometimes with three?
Other English words may change meaning based on whether they are pronounced with two or three syllables. For instance:
“lightning” (such as in a storm) /ˈlaɪtnɪŋ/
“lightening” (such as, getting brighter) /ˈlaɪtənɪŋ/
A proposed solution for the more difficult cases of stress patterns is to rely on a feature called prominence, consisting of a combination of sonority, length, stress, and pitch. According to this view, prominence peaks are heard in words to define syllables, not solely sonority values.
Prominence remains a rather complex and controversial notion. It’s an important concept in metrical phonology (a theory concerned with organizing segments into groups of relative prominence), where it’s often supported with data from speech experiments. However, other phoneticians have suggested different approaches may be more beneficial in addressing the problems of syllabicity in English (such as the application of speech technology algorithms, rather than linguistic descriptions).