Form
Richard Ashley
When musicians speak of the “form” of a piece of music they are referring to its overall design, plan, pattern, or structure. The nature of musical form is a central concern in Western musical scholarship; systematic writings on form date from the 18th century (Burnham, 2002). This chapter introduces some basic concepts of form in Western tonal music, relates these theoretical constructs to empirical research into the perception and comprehension of larger-scale musical structure, evaluates such research, and offers some directions for future investigation. We will deal only with tonal music, primarily from the Western classical tradition, as consideration of nontonal music would require another chapter-length treatment.
Musical works, composed or improvised, are rarely so brief as to be contained within the perceptual present (a duration of a few seconds). Folk songs, nursery rhymes, music for dancing, concert works, and popular songs exhibit global as well as local musical structure. What might serve as the foundation for perceiving and comprehending formal structures in music? The basis of perceptual patterning, in music as elsewhere, begins with detection of similarities and dissimilarities among the features, dimensions, or attributes of the perceptual signal (Francès, 1988; Deliège, 2001). Music, like speech, is built on contrastive aspects of acoustic units. To create patterns, percepts (in our case, musical events) must be discriminable from one another, allowing us to perceive them as individual acoustic events, whether notes and motives or phonemes and words. In Western music theory, analysis involves the segmentation of the musical work into smaller units and the description of relationships between these segments (Hanninen, 2012). Many kinds of musical events can be taken as segments out of which form may be built: individual notes; “ideas,” “motives,” or “gestures”; “phrases,” “periods,” and “sections”; and, in some compositions, “movements.” Such units’ combinations and relationships are the domain of formal analysis (e.g. Berry, 1966; Caplin, 1998).
Two aspects of musical structure are typically seen as fundamental in analysis of musical form: the pattern, plan, or design of the piece’s divisions and subdivisions, and the tonal relationships across the work. The first typically focuses on segments, which are identified and distinguished from others primarily by some kind of motivic or melodic content, whereas the second relates segments to one another by their tonal, harmonic functions. The first approach investigates, for example, a first or “principal” theme as opposed to a “secondary” theme in the opening movement of a Beethoven piano sonata, or the verse and chorus distinctions of pop songs. The second approach views sonata form not primarily as a sequence of contrasting melodies or themes, but as harmonic contrasts over time. Most current music-theoretic approaches to musical form integrate the two and recognize the simultaneous, interacting operations of musical parameters in both the frequency and time domains. First and second themes in sonatas may contrast not only harmonically, but also melodically and texturally (Berry, 1966; Caplin, 1998; Hepokoski & Darcy, 2006), or may be understood as a work’s “deep structure” elaborated at different levels or time-scales. Such elaborations are often conceived of hierarchically, using basic melodic, harmonic, and contrapuntal patterns such as harmonic motion from tonic to dominant and back to tonic (I–V–I), or ascending melodic arpeggios, such as an upward gesture through several scale degrees, which is then filled in with descending stepwise motion before coming to rest on the tonic. These patterns will be found simultaneously occurring at two or more different levels, with some unfolding more quickly near the musical surface, and others more slowly, at higher, more basic levels of musical structure (Schenker, 1979).
According to one current family of music theories, the fundamental building blocks of musical works are contrapuntal patterns which serve as “schemata,” which composers thread together skillfully, based on a deep, practiced understanding of their properties, relationships, and typical deployment (Gjerdingen, 2007, this volume). Timbre and texture may motivate form, as in popular music, where segmentation is often clearly marked by these parameters, or musical form may be seen as the result of processes of the ongoing transformation of groups of pitches over time, through processes such as inversion or reordering (Lewin, 1993). Other approaches to form are based in musical rhythm, arising from hierarchic patterns of upbeats and downbeats (Cone, 1968), or present approaches to repertoires outside the Western art music canon (Covach, 2005).
Common to most analytic approaches to musical form is a notion of coherence or unity in the musical work. An analyst is motivated to reveal not only a work’s constituent elements, but how the relationships between these elements cause the work to be unified and coherent in an organic or logical manner, thereby demonstrating a composer’s skill and adherence to fundamental principles of good musical organization (Schenker, 1979). As we will see, numerous studies have investigated compositional coherence and unity empirically.
Traditional music theory sees compositions as hierarchically structured, with smaller units combining into larger ones. An influential cognitively oriented theory of music’s hierarchical structure is Lerdahl and Jackendoff’s Generative Theory of Tonal Music, or GTTM (Lerdahl & Jackendoff, 1983). GTTM aims to explicate how a listener familiar with a tonal idiom understands works in that idiom; it is not a theory of the listening process, but of final-state understanding. GTTM proposes that a listener transforms the “musical surface”—the piece in sounding form, although conceived of as an aggregation of discrete events rather than as continuously varying amplitudes and spectra—through four interacting musical parameters which we now consider briefly.
Rhythmic analysis in GTTM begins with grouping structure, which segments the musical surface into contiguous events, or groups; lower-level segments combine into larger ones hierarchically. Grouping structure has parallels with traditional theories of phrase-structure and related hierarchical formal analyses. Grouping structure is produced through the application of analytic rules. As with Gestalt principles of perceptual organization (Lee, this volume), similar events are grouped together, and difference produces segmentation; likewise, events proximate in pitch or in time are grouped together, and grouping boundaries occur when distances of time or pitch are larger. GTTM also describes metric structure, based on previous hierarchic theories of musical rhythm (e.g. Cooper & Meyer, 1960; Cone, 1968). In place of simpler accented/unaccented or upbeat/downbeat categorizations of musical events, and their hierarchical organization, GTTM begins with the beat (a durationless point in time) employed in a recurring, isochronous tactus or reference level pulse. The tactus may be divided and subdivided into faster levels, and also subsumed by hierarchically slower beat trains. These faster and slower sequences occur in phase with each other, creating emphases on higher-level beats and thus the structures of musical meter (see Martens & Benadon, this volume).
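The temporal-proximity idea behind grouping can be illustrated with a small Python sketch. It is only loosely modeled on the principle described above and is not GTTM's formal rule system: it assumes that note onsets are given in seconds and places a boundary wherever an inter-onset interval is longer than both of its neighbors. The function names and the example onset list are hypothetical, chosen purely for illustration.

```python
def proximity_boundaries(onsets):
    """Return indices i such that a boundary falls between note i and note i + 1.

    Assumption: a boundary is proposed wherever the inter-onset interval (IOI)
    is strictly larger than the IOIs immediately before and after it.
    """
    iois = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    boundaries = []
    for i in range(1, len(iois) - 1):
        if iois[i] > iois[i - 1] and iois[i] > iois[i + 1]:
            boundaries.append(i)
    return boundaries


def split_into_groups(onsets):
    """Cut the onset list into contiguous groups at the proposed boundaries."""
    cuts = [i + 1 for i in proximity_boundaries(onsets)]
    starts = [0] + cuts
    ends = cuts + [len(onsets)]
    return [onsets[s:e] for s, e in zip(starts, ends)]


# Hypothetical surface: three bursts of three notes separated by longer gaps.
example_onsets = [0.0, 0.5, 1.0, 2.5, 3.0, 3.5, 5.0, 5.5, 6.0]
example_groups = split_into_groups(example_onsets)
```

A sketch like this yields only one possible segmentation; in GTTM itself several rules interact, and they express preferences rather than obligatory outcomes.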
The third and fourth parameters in GTTM are pitch-based and are derived from prior music-theoretic notions (such as those of Schoenberg, 1969, and Schenker, 1979); they focus on harmony rather than melody or theme. Time-span reduction identifies a tone, or a chord, as being the most important in any event in grouping structure and can be thought of as a harmonic “reduction,” an underlying tonal structure for the composition. Prolongational structure addresses the phenomenological sense that pieces based on the harmonic principles of Western “classical” music change in their musical “tension” over time: “. . . the kind of tension we wish to address here is the . . . sort whose opposite is relaxation—the incessant breathing in and out of music in response to the juxtaposition of pitch and rhythmic factors” (Lerdahl & Jackendoff, 1983, p. 179). GTTM’s view of musical structure, especially hierarchical grouping and harmonic structure, has provided fertile ground for empirical researchers. We begin our survey of empirical research into musical form with segmentation and grouping structure.
GTTM’s grouping rules have been used as the basis for a number of experimental studies. In an early example, Deliège (1987) investigated listeners’ segmentations of musical works. Her participants’ judgments were generally found to be in agreement with the segmentations produced by the GTTM rules, although additional parameters (such as timbre and texture) were implicated, and segmentations were not always placed at the precise rhythmic positions proposed by GTTM. Additionally, some complexities related to multiple rules’ simultaneous application were noted. Clarke and Krumhansl (1990), using music by Mozart and Stockhausen, and Krumhansl (1996), focusing on Mozart’s Sonata K. 282, Mvt. 1, also provided support for GTTM’s segmentation rules; the sonata movement was segmented into three main parts as expected. Throughout these studies, participants were in greater agreement at primary divisions of the compositions’ forms than at intermediate or lower level divisions; thus higher levels of hierarchic grouping structure were more strongly attested. The variability seen in listeners’ segmentations of the music, as when some listeners hear a segment boundary at a particular location and others do not, is in line with the spirit of GTTM’s preference rules, which indicate possible but not obligatory interpretations of musical structure. We will discuss processing of musical themes in more detail shortly.
A musical work’s form is typically analyzed not only in terms of segmentation or grouping structure but also with regard to the content of these groups and their relationships to one another. The presence of different, contrasting melodic or thematic materials is integral to sectional design, with sections often clearly distinguished by these means. Describing designs as ABA’ (e.g. rounded binary), ABACABA (as in a rondo), or Intro-Verse-Verse-Chorus-Bridge-Chorus (in a pop song), illustrates this approach. More complicated structures, including sonata-allegro form, can also be understood thematically (Hepokoski & Darcy, 2006).
A refinement of these approaches considers musical segments or groups in terms of their formal functions, such as beginning, middle, end, before-the-beginning, and after-the-end (Caplin, 1998); parallels may be seen in popular music, where such functions as introduction and bridge are common (Covach, 2005). GTTM considers the beginnings and ends of groups to be important, but gives little attention to thematic design; however, these matters have engaged empirical researchers. These studies often focus on listeners’ abilities to recognize different musical ideas or themes in a composition and to use these in comprehending the work’s structure.
Melodies and themes are of analytic and cognitive importance and researchers have sought to understand how they are represented and processed. However, it is common for musical ideas to undergo transformations during a composition, and thus thematic comprehension must involve abstraction from the musical surface. Listeners’ capacities to identify themes and melodies, including transformed versions, are thus an important topic.
It has long been known that listeners’ perception of and memory for melodies is imperfect and schematic in nature, even in studies using simplified stimuli (Dowling, 1978). Given the limitations of real-time perception and of working memory seen in such studies, the question arises: how do listeners deal with these constraints and come to grips with the rich, detailed and evanescent musical surfaces of “real” musical works? Addressing this question, Pollard-Gott (1983) investigated listeners’ abilities to recognize relationships between varied versions of musical themes in a complex nineteenth-century piano work, the Sonata in B minor by Liszt. Pollard-Gott focused on versions of the first (A) and second (B) main themes from the sonata. For each theme, four variants were excerpted which differed in their surface attributes (register, dynamics, etc.). Participants heard the first 12 minutes of the work, then the excerpts, taking notes on each of them. The excerpts were then presented in pairs and participants judged the similarity of each pair. Thereafter, the A and B themes were presented as examples of melodic categories in the work and participants assigned each of a new set of excerpts to one of these two categories.
Crucially, one group of participants made these categorical judgments after carrying out the listening, notetaking, and similarity judgments only once, but another group followed this procedure three times before categorizing the novel themes. A multidimensional scaling analysis indicated that the single-listening group’s similarity judgments were based largely on surface features such as loud-soft and smooth-jumpy; the repeated-listening group used such features as well, but by the final iteration, theme identity had emerged as one significant dimension, with A and B themes spaced well apart, rather than interspersed as in the other dimensions. Thus, listeners were able to infer—develop—categories, given more exposure to the music and the “depth of processing” facilitated by the notetaking procedure. Using an approach related to Pollard-Gott’s, Lamont and Dibben (2001) asked listeners to rate similarities of excerpt pairs from compositions by Beethoven and Schoenberg. As in Pollard-Gott’s single-listening condition, parameters such as register and features such as melodic contour were found to be more important than motivic content in explaining listeners’ judgments, although the authors note that a clean disjunction between thematic content and such “surface” features is not always possible to obtain.
What processes of thematic abstraction in music listening might explain such findings? GTTM provides one approach, through its time-span and prolongational reductions, which provide an underlying musical structure which listeners infer from the musical surface. Dibben (1994) investigated perceptual relationships between such reductions and their original sources. For brief musical excerpts (1 to 16 measures) from compositions by Handel and Brahms, Dibben produced two different reductions, one a “correct” or appropriate structural reduction and another “incorrect” one. Participants heard the original excerpt and then the reductions, indicating which of these best matched the original; they chose the correct version over its incorrect counterpart in a statistically reliable manner. Dibben concluded that her results supported listeners’ use of a hierarchical representation of the music. Aware that her “incorrect” reductions may have been less tonally coherent than their “correct” counterparts, Dibben conducted a second experiment that did not support lack of coherence as a factor in such judgments.
However, GTTM’s reductions are primarily harmonic in nature, and other approaches have been explored to deal with melodic and other significant musical features. In an extended research program, Deliège proposed and investigated a “cue abstraction” model (Deliège & Mélen, 1997; Deliège, 2001), in which a listener first segments the musical surface, and then extracts salient cues from each group. These cues are then used as the basis for processes of categorization, recall, and comparison of groups and events. In one study, Deliège (1996) investigated listeners’ categorical perception of musical motifs in the Finale of Bach’s Sonata for unaccompanied violin, BWV 1005. Like Pollard-Gott, Deliège found that repeated listenings clarified categorical memberships of individual events, and that surface features as well as music-structural relationships were important in these processes; participants with less musical background benefitted the most from repeated exposure to the music. In this study, participants were presented with pre-identified thematic categories. To address these matters in a more naturalistic listening situation, Deliège, Mélen, Stammers, & Cross (1996) used a short but complete composition (Schubert’s Valse Sentimentale, D. 779) as the musical material for a set of three experiments. In the first of these, participants without musical training identified musically salient events (cues) while listening to the piece. Their choices, mostly consistent between a first and second listening, were based primarily on surface features, with harmonic structures being less often invoked. Subsequent experiments had participants place the piece’s segments on a timeline or attempt to reconstruct the piece from its segments; results showed considerable divergence from the original’s structure. Nonmusicians oriented toward surface features as the basis for their decisions whereas musically trained participants showed some sensitivity to formal function and larger patterns of tension/release.
Many music analysts consider the tonal plan of a work—the key areas set out and traversed over time—to be its most important structural factor. From a cognitive standpoint, this implies that listeners need to be able to, in some probably tacit manner, find keys and track their changes over moderate to extended time spans. Different approaches to investigating the influence of tonal structure and design on the perception of musical form have been used in the literature, including probing memory for tonal centers and relating tonal change to musical tension; we address these next.
Tonal structure in much Western art music follows certain statistically regulated patterns of chord progression (see Shanahan, this volume), although other tonal idioms, such as those used in popular music, may behave differently (Temperley and de Clercq, this volume). Some progressions, such as tonic to dominant and back, are found at multiple hierarchic levels in tonal music. A C chord may move directly to a G chord and back; a symphony movement in sonata-allegro form may begin in the key of C major, modulate to G major after some time (all in the exposition), and after changing keys for some time in the development, move fixedly to C (recapitulation). The assumption is that return to the original key is perceived by listeners and provides perceptual coherence to the musical piece. Here we outline some findings regarding perception of larger (higher) levels of structure.
Cook (1987) addressed memory for tonal centers in two experiments. In Experiment 1, Cook’s participants listened to six excerpts from the Western art music repertoire in their original versions, in which establishing and returning to a central key is presumed to be crucial to structural unity and aesthetic value; each excerpt was also presented in a version where such tonal unity was disrupted by ending in a different key. Care was taken to make surface transitions in the modified excerpts unobtrusive. Listeners compared the two versions of each piece, rating them for pleasure, expressiveness, coherence, and sense of completion. Only two of the six stimuli—the shortest ones—showed any significant results in the ratings, suggesting that tonal closure had at best a limited role in listeners’ assessments of the works. A follow-up experiment produced similar results: tonal closure seemed to affect judgments only for very short pieces (ca. one minute).
Marvin and Brinkman (1999) revisited the question of tonal memory. In a first experiment, they presented musical excerpts (1.5–3 minutes) to participants with substantial musical experience. Excerpts remained in one key, modulated to the dominant, or modulated to a less-closely-related key. Participants answered various questions about each excerpt, including its historical style/period, meter, and texture; the crucial question was whether or not the excerpt began and ended in the same key, testing memory for tonality explicitly. Performance was significantly above chance regarding tonal change or not, indicating that skilled musicians were sensitive to longer-range tonal movement. A second experiment used short pieces (primarily Baroque dance movements) in two versions: originals, with idiomatic tonal relationships including the establishment of, departure from, and return to a central tonality, and alterations, where tonal plans of the works were disrupted by reordering the works’ segments so as to not begin and end in the originals’ tonic. Participants were unable to determine if the works began and ended in the same key, although they were able to distinguish works that ended in the tonic from those that did not.
Research into memory for tonality and tonal structure continues to this writing. In a recent study, Farbood (2016) sought to determine for how long tonal centers were held in memory by listeners. Using tension judgments, and in particular the ways in which their slopes change as harmonic progressions unfold, Farbood’s two experiments found that some memory trace for a tonal center could persist for up to 20 seconds, but there was no evidence for times longer than this. We are thus unable to directly evaluate listeners’ comprehension or use of tonal structure over longer time spans.
Some researchers have related tonal progressions to a felt sense of musical tension. The empirical literature based on this notion stems from a variety of theoretical sources (Schoenberg, 1969; Parncutt, 1989; Lerdahl & Jackendoff, 1983; Lerdahl, 2001). Theories of tonal tension connect dynamic internal or phenomenological states (tension and relaxation) to the establishment of, departure from, and return to tonal points of repose (cf. Shanahan, this volume). Moving away from a tonal resting place increases tension: the further the distance, the greater the tension.
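The intuition that tension grows with distance from a tonal point of repose can be made concrete with a toy calculation. The Python sketch below is an illustrative proxy only, not any of the models cited above: it assumes that "distance" is measured as the number of steps between major keys on the circle of fifths, ignores mode, and uses a hypothetical key plan (C, G, A, C) loosely resembling the tonic, dominant, remote area, and return of a sonata-like design.

```python
# Pitch-class numbers for the twelve major-key tonics (C = 0).
PITCH_CLASS = {
    "C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11,
}


def fifths_distance(tonic, key):
    """Number of steps between two keys on the circle of fifths (0 to 6).

    Multiplying a pitch-class interval by 7 modulo 12 converts semitones
    into circle-of-fifths positions, because 7 * 7 = 49 = 1 (mod 12).
    """
    steps = ((PITCH_CLASS[key] - PITCH_CLASS[tonic]) * 7) % 12
    return min(steps, 12 - steps)


def tension_profile(tonic, keys):
    """Toy tension curve: distance of each successive key area from the tonic."""
    return [fifths_distance(tonic, key) for key in keys]


# Hypothetical key plan: home key, dominant, a more remote area, return home.
plan = ["C", "G", "A", "C"]
profile = tension_profile("C", plan)
```

Under these assumptions the profile rises as the music leaves the home key and falls to zero on its return. As the studies reviewed here show, listeners' reported tension does not always follow such a curve.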
Krumhansl (1996) investigated listeners’ responses to the first movement of Mozart’s piano sonata K. 282. This movement is either a binary, two-part form, or a sonata-allegro form (Hepokoski & Darcy, 2006) with three main divisions (exposition, development, and recapitulation); it presents a typical harmonic plan of tonic moving to dominant (exposition), working through other key areas more briefly (development), and returning to and remaining in the tonic (recapitulation). Krumhansl’s participants provided segmentations of the movement, continuous ratings of musical tension using a slider, and identifications of new ideas in the music as they arose. Musical tension responses seemed to be based on a variety of musical features including harmonic tension as well as melodic contour, dynamics, structural and stylistic details, and fluctuations in tempo. When tempo and dynamics were controlled for, tension ratings remained much the same, indicating that performance parameters may contribute to, but do not define, the ebb and flow of musical tension.
Bigand and Parncutt (1999) investigated musical tension in chord sequences longer than in most experiments but briefer than full compositions; one sequence was a simplification of most of the Prelude Op. 28/9 by Chopin. They tested predictions from Parncutt (1989) and Lerdahl (Lerdahl & Jackendoff, 1983; Lerdahl, 1988): Pitch commonality between successive chords versus hierarchic “distances” between harmonic events as sources of perceived tension. Their findings supported an important role for harmony in such judgments, but not necessarily as predicted: Local phenomena, such as cadence formulas, were more influential than higher-level hierarchic structures. Further, a return to tonic after a modulation was heard as increasing tension, rather than decreasing it. These results indicated that listeners were able to use harmonic information at local levels but not as part of an integrated hierarchic view of the excerpts, in line with findings of Tillmann, Bigand, and Madurell (1998) and Cook (1987).
Building on these earlier studies, Lerdahl and Krumhansl (2007) present a quantitative model of tonal tension with four components: hierarchical (prolongational) event structure, tonal pitch space distance, surface (psychoacoustic) dissonance, and voice-leading attraction (the sense that some tones tend more or less strongly toward others, e.g. "ti" → "do"). Excerpts included all of Chopin Op. 28/9 (ca. 90 seconds), as well as excerpts by Bach and Wagner. Participants heard the first event and made a judgment, then the first two events, and so on until the entire excerpt had been heard. Judged and predicted tension matched well after the addition of a fifth parameter, melodic contour. The authors conclude that their model is suitable for understanding perception of hierarchical tonal structure, although some tension will be situational rather than algorithmic.
An alternative to tonal hierarchy models suggests that idiom-specific categories of musical events act as “signposts” in a composition’s rhetorical flow to help a listener track long-range structure. Granot and Jacoby (2011a, 2011b) used a version of the puzzle-unscrambling task, with the first movement of sonatas by Mozart (first study) or Haydn (second study) as stimuli. The segments of a movement were presented in a random order, to be reassembled into a musically logical and coherent order. Few participants replicated the original composition, but their responses showed sensitivity to the placement of the opening and ending segments, with their conventional rhetorical features; the overall ABA’ design of a movement based on exposition, development, and recapitulation; and the function within the movements of the less stable segments (bridge and development). Their responses also indicated sensitivity to relationships between motivic/thematic materials, but as in other studies harmonic structure was important only at local, not global, levels. Participants’ decisions seemed to be guided by their knowledge of musical symmetry, tension, and repose; the uses of typical opening and closing gestures; and thematic, but not harmonic, organization.
On a musically smaller scale (ca. 10 seconds), Neuhaus (2013) provides evidence from both behavioral tasks and EEG (focusing on the N300 ERP) that participants were sensitive to formal units and their combinations, such as grouping AABB hierarchically into two chunks of AA and BB, or ABAB into two chunks, AB AB, as opposed to hearing these purely as linear sequences. Behavioral results indicated that contrasts between adjacent segments, for example in melodic content (A vs B) or contour, facilitated hierarchic interpretations. Contrastingly, the brain responses shown in the ERPs indicated participants’ recognition of recurring patterns in the music, particularly rhythmic patterns, irrespective of whether the same pattern occurred in adjacent or nonadjacent segments (e.g. AA BB or A B A B). Neither of these results supports a strict concatenationist account of music listening.
Vallières, Tan, Caplin, and McAdams (2009) investigated formal functions (Caplin, 1998) directly, using excerpts from compositions by Mozart with functions of beginning, middle, or end, and asking participants to categorize each with one of these labels, as well as assessing how strongly the function was conveyed. Excerpts were correctly labeled at levels significantly above chance; endings were the most successfully identified, followed by middles and beginnings. Musical experience also impacted performance, with musically trained participants performing better, due to their greater experience with the musical style and its characteristic gestures (see Gjerdingen, this volume). These results echo the findings of Granot and Jacoby (2011a, 2011b), showing listeners are sensitive to such formal functions both in and out of whole-composition contexts.
Finally, Sears, Caplin, and McAdams (2012), focusing on stereotyped ending formulae, found that listeners distinguished between different types of cadential formulae (perfect, imperfect, and half) in their ratings of how complete a sense of closure these types communicated. Unlike some other studies of cadences, this study used repertoire excerpts (rather than more abstract chord progressions). Listeners did not have to rely solely on harmonic progression, but could make use of the rich variety of musical details available in understanding the excerpts they heard (cf. Gjerdingen, this volume).
We have seen that listeners’ ability to remember several larger-scale aspects of musical works seems to be quite limited, calling into question the status of music-analytic theories as psychological theories. But what if the goal of musical structures, and the listenings they enable, is not comprehension of musical events, but is instead aesthetic experience—to find the work pleasing, moving, or expressive (Margulis, this volume)? Remembering events and their interrelationships would then not be as important as aesthetic response.
This question has been investigated extensively by Bigand and Tillmann and their various collaborators; for a broad and thoughtful review, see Tillmann and Bigand (2004). A basic method used in their collaborations (Tillmann & Bigand, 1996, 1998; Tillmann, Bigand, & Madurell, 1998; Bigand, Madurell, Tillmann, & Pineau, 1999) was the manipulation of relatively short musical compositions (up to about three minutes in duration) so as to potentially impact their structural cohesiveness or musical expressiveness.
Tillmann and Bigand (1996) presented participants with three short compositions (by Bach, Mozart, and Schoenberg) in their normal form and with each piece divided into brief “chunks” and played with these segments in reverse order. Participants without significant musical expertise found the original and retrograded versions equivalent when rated on various scales related to expressiveness; the retrograding did affect ratings for the Schoenberg composition somewhat. The experimenters conclude that musical structure is not necessarily important at the global level for a piece to be heard as expressive. In a subsequent study (Tillmann & Bigand, 1998), short minuets were segmented into “chunks” of 4, 2, or 1 measures, and the coherence of the composition disrupted, for example by scrambling the order of the segments or by transposing the pitch level of the segments by 1 or 2 semitones. Transformations had little impact on listeners’ abilities to comprehend the music, as measured by their performance in identifying a “target” passage while listening; disturbing a piece’s tonal coherence by transposition had no significant effect, and the reordering manipulation’s impact was limited to the most extreme version, in which 15 of 16 measures were reordered. The overall conclusion was that musical coherence was a local phenomenon, not one related to a piece’s overall structure. Tillmann and Bigand read these results as support for the moment-to-moment “concatenationist” listening framework of Levinson (1997), in which musical comprehension is strongly constrained to the perceptual present.
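The chunk-reordering manipulations used in these studies are simple to state procedurally. The Python sketch below assumes that a piece is represented as a list of measure numbers; the function names, the fixed random seed, and the 16-measure example are hypothetical and do not reproduce the authors' actual stimulus preparation. It shows chunking, playing the chunks in reverse order, scrambling them, and counting how many measures end up away from their original position.

```python
import random


def chunk(measures, size):
    """Divide a list of measures into consecutive chunks of the given size."""
    return [measures[i:i + size] for i in range(0, len(measures), size)]


def reverse_chunks(measures, size):
    """Keep each chunk intact internally but play the chunks in reverse order."""
    return [m for c in reversed(chunk(measures, size)) for m in c]


def scramble_chunks(measures, size, seed=0):
    """Play the chunks in a pseudo-random order (seeded for repeatability)."""
    chunks = chunk(measures, size)
    random.Random(seed).shuffle(chunks)
    return [m for c in chunks for m in c]


def displaced(original, altered):
    """Count positions at which the altered version differs from the original."""
    return sum(1 for a, b in zip(original, altered) if a != b)


# Hypothetical 16-measure minuet, cut into 4-measure chunks.
minuet = list(range(1, 17))
retrograde = reverse_chunks(minuet, 4)
scrambled = scramble_chunks(minuet, 2)
moved = displaced(minuet, retrograde)
```

Smaller chunk sizes disturb more of the local continuity, which is the dimension along which such studies grade the severity of their manipulations.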
However, claims for a purely concatenationist view of musical perception need some moderating. Lalitte and Bigand (2006) presented listeners with 3-minute excerpts from six pieces, both in their original form and in a version segmented into 29 chunks and scrambled. Participants found the original versions more musical (as assessed by their judgment of the musical skill of the producer assembling the segments into a whole), indicating that the lack of coherence in large-scale structure was perceptible.
Mindful of the violence being done to musical structure by such scrambling methods, Eitan and Granot (2008) took another approach. Their participants heard either intact compositions by Mozart (K. 332 Mvt. 1, or K. 280, Mvt. 1) or hybrids, where analogous sections from the two movements were interchanged, preserving the structural plan but changing content and disrupting any compositionally-ordained coherence and unity. Participants heard one intact and one hybrid version with a cover story: Mozart composed both of these but chose to publish only one—which one? Responses included ratings for coherence, interest, and other musical and aesthetic aspects, as well as indicating the preferred version. Listeners did not significantly prefer original versions to hybrids, sometimes preferring the hybrids. The authors conclude that these results provide evidence against organic coherence and inner unity as the basis for listeners’ assessment of musical works.
Empirical research into the perception of form in tonal music indicates that traditional theories of musical structure can provide motivation and hypotheses but are not satisfactory as theories of perception. In particular, limitations on real-time processing and on memory constrain listeners’ performance on many experimental measures of formal perception—but seemingly without fatally damaging the hedonic and aesthetic dimensions of music listening. Nevertheless, we are beginning to understand how listeners make sense of larger musical structures—of what matters and how. Cognitively-based theories of rhythmic and pitch structures have arisen in part as the result of such insights and have been useful in understanding hierarchical structure and the ebb and flow of musical tension; listeners understand the typical and schematic aspects of musical idioms, including formal functions and important semiotic and rhetorical signposts related to beginnings, middles, and ends of musical groups.
Taken together, such results reinforce findings from other studies which show that participants’ familiarity with the music they are hearing, for example through repeated and focused listening, facilitates the kind of categorization and associative thought which is crucial for grasping musical structure involving nonlocal relationships and nonadjacent events (Zbikowski, this volume). Passive listening, especially to less preferred or unfamiliar musical idioms, should not be expected to particularly facilitate musical comprehension, compared to the enriched results to be obtained with more active involvement with the music (Godinho, 2016). Listeners’ existing knowledge of music, stored in long-term memory, is the basis on which their comprehension of any new musical work is predicated, and to the degree their knowledge does not match new percepts, comprehension of these will be difficult and incomplete (cf. Danielssen & Brøvig-Hanssen, and Gjerdingen, this volume). Results from my own research group (Ashley, 2015, submitted) indicate that testing college-aged participants on their memory for and perception of form in pop songs, the kind of music with which all of them are most familiar, yields quite high levels of performance. We should not be surprised that listeners for whom “classical” music is a foreign or acquired musical language do not experience it as might be expected; we should, rather, seek to understand the ways they use their everyday musical experiences to understand new works, and to better understand how they comprehend musical works in their “mother tongues.”
Burnham, S. (2002). Form. In T. Christensen (Ed.), The Cambridge history of Western music theory (pp. 880–906). Cambridge: Cambridge University Press.
Deliège, I., & Mélen, M. (1997). Cue abstraction in the representation of musical form. In I. Deliège & J. A. Sloboda (Eds.), Perception and cognition of music (pp. 387–412). Hove, UK: Psychology Press.
Dibben, N. (1994). The cognitive reality of hierarchic structure in tonal and atonal music. Music Perception, 12, 1–25.
Krumhansl, C. (1996). A perceptual analysis of Mozart’s Piano Sonata K. 282: Segmentation, tension, and musical ideas. Music Perception, 13, 401–432.
Lerdahl, F., & Jackendoff, R. (1983). A generative theory of tonal music. Cambridge, MA: MIT Press.
Marvin, E. W., & Brinkman, A. (1999). The effect of modulation and formal manipulation on perception of tonic closure by expert listeners. Music Perception, 16, 389–407.
Pollard-Gott, L. (1983). Emergence of thematic concepts in repeated listening to music. Cognitive Psychology, 15, 66–94.
Tillmann, B., & Bigand, E. (2004). The relative importance of local and global structures in music perception. The Journal of Aesthetics and Art Criticism, 62, 211–222.
Ashley, R. (2015). Grammaticality in popular music form: Perceptual and structural aspects. Presentation at the Society for Music Perception and Cognition, Nashville, Tennessee.
Ashley, R. (submitted). Comprehension of musical form in popular songs: Structure, perception, and memory.
Berry, W. (1986). Form in music (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Bigand, E., Madurell, F., Tillmann, B., & Pineau, M. (1999). Effect of global structure and temporal organization on chord processing. Journal of Experimental Psychology: Human Perception and Performance, 25(1), 184–197.
Bigand, E., & Parncutt, R. (1999). Perceiving musical tension in long chord sequences. Psychological Research, 62(4), 237–254.
Caplin, W. E. (1998). Classical form: A theory of formal functions for the instrumental music of Haydn, Mozart, and Beethoven. New York, NY: Oxford University Press.
Clarke, E. F., & Krumhansl, C. L. (1990). Perceiving musical time. Music Perception, 7, 213–251.
Cone, E. T. (1968). Musical form and musical performance. New York, NY: W.W. Norton.
Cook, N. (1987). The perception of large-scale tonal closure. Music Perception, 5, 197–205.
Cooper, G., & Meyer, L. (1960). The rhythmic structure of music. Chicago, IL: University of Chicago Press.
Covach, J. (2005). Form in rock music. In D. Stein (Ed.), Engaging music: Essays in music analysis (pp. 65–76). New York, NY: Oxford University Press.
Deliège, I. (1987). Grouping conditions in listening to music: An approach to Lerdahl & Jackendoff's grouping preference rules. Music Perception, 4, 325–360.
Deliège, I. (1996). Cue abstraction as a component of categorisation processes in music listening. Psychology of Music, 24, 131–156.
Deliège, I. (2001). Introduction: Similarity perception ↔ Categorization ↔ Cue abstraction. Music Perception, 18, 233–243.
Deliège, I., Mélen, M., Stammers, D., & Cross, I. (1996). Musical schemata in real-time listening to a piece of music. Music Perception, 14, 117–160.
Dowling, W. J. (1978). Scale and contour: Two components of a theory of memory for melodies. Psychological Review, 85, 341–354.
Eitan, Z., & Granot, R. Y. (2008). Growing oranges on Mozart's apple tree: "Inner form" and aesthetic judgment. Music Perception, 25, 397–418.
Farbood, M. (2016). Memory of a tonal center after modulation. Music Perception, 34, 71–93.
Francès, R. (1988). The perception of music. (W.J. Dowling, trans.). New York, NY: Psychology Press. (Originally published in 1958)
Gjerdingen, R. (2007). Music in the galant style. New York, NY: Oxford University Press.
Godinho, J. C. (2016). Miming to recorded music: Multimodality and education. Psychomusicology: Music, Mind, and Brain, 26, 189–195.
Granot, R., & Jacoby, N. (2011a). Musically puzzling I: Sensitivity to overall structure in the sonata form? Musicae Scientiae, 15, 365–386.
Granot, R., & Jacoby, N. (2011b). Musically puzzling II: Sensitivity to overall structure in a Haydn E-minor sonata. Musicae Scientiae, 16, 67–80.
Hanninen, D. (2012). A theory of music analysis: On segmentation and associative organization. Rochester, NY: University of Rochester Press.
Hepokoski, J. A., & Darcy, W. (2006). Elements of sonata theory: Norms, types, and deformations in the late-eighteenth-century sonata. New York, NY: Oxford University Press.
Lalitte, P., & Bigand, E. (2006). Music in the moment? Revisiting the effect of large scale structures. Perceptual and Motor Skills, 103, 811–828.
Lamont, A., & Dibben, N. (2001). Motivic structure and the perception of similarity. Music Perception, 18, 245–274.
Lerdahl, F. (1988). Tonal pitch space. Music Perception, 5, 315–345.
Lerdahl, F. (2001). Tonal pitch space. New York, NY: Oxford University Press.
Lerdahl, F., & Krumhansl, C. L. (2007). Modeling tonal tension. Music Perception, 24, 329–366.
Lewin, D. (1993). Musical form and transformation: Four analytic essays. New Haven, CT: Yale University Press.
Levinson, J. (1997). Music in the moment. Ithaca, NY: Cornell University Press.
Neuhaus, C. (2013). Processing musical form: Behavioural and neurocognitive approaches. Musicae Scientiae, 17, 109–127.
Parncutt, R. (1989). Harmony: A psychoacoustical approach. Berlin: Springer-Verlag.
Schenker, H. (1979). Free composition: Volume III of new musical theories and fantasies. (E. Oster, trans.). New York, NY: Pendragon Press. (Original work published 1935)
Schoenberg, A. (1969). Structural functions of harmony. New York, NY: W.W. Norton.
Sears, D., Caplin, W., & McAdams, S. (2012). Perceiving the classical cadence. Music Perception, 31, 397–417.
Tillmann, B., & Bigand, E. (1996). Does formal musical structure affect perception of musical expressiveness? Psychology of Music, 24, 3–17.
Tillmann, B., & Bigand, E. (1998). Influence of global structure on musical target detection and recognition. International Journal of Psychology, 33, 107–122.
Tillmann, B., Bigand, E., & Madurell, F. (1998). Local versus global processing of harmonic cadences in the solution of musical puzzles. Psychological Research, 61, 157–174.
Vallières, M., Tan, D., Caplin, W. E., & McAdams, S. (2009). Perception of intrinsic formal functionality: An empirical investigation of Mozart’s materials. Journal of Interdisciplinary Music Studies, 3, 17–43.