James A. W. Heffernan
In the Western world at least, the long history of theorizing about the relations among language, literature, and visual art is almost as old as written language itself. Some time before the middle of the fifth century BCE, the Greek lyric poet known as Simonides of Ceos (c.556–468 BCE) reportedly stated that “Painting is mute poetry, poetry a speaking picture” (quoted in Plutarch 1954, 4: 346f). In his admirably compact history of interart theory up through the Renaissance, Leonard Barkan takes Simonides’ dictum as his point of departure, but in observing that it “seems to be even‐handed” (Barkan 2013: 30), he curiously fails to note that its chiastic symmetry masks a radical asymmetry: while poetry equals picture plus speech, painting equals poetry minus speech. Nevertheless, the logocentrism thus lodged in what is perhaps the oldest known formula for interart relations is the key to Barkan’s argument about the history of those relations up to and through the Renaissance. When, for instance, Horace compares poetry to painting (ut pictura poesis, “as in a painting, so a poem”), he takes for granted the meaning of pictura “so as to prove something about” poetry.1 From Aristotle to Sidney, Barkan contends, logocentric theorists use “the point that x is true of pictures” to argue that x is also true of poems, but x “refers to a set of properties that word‐makers have imposed on pictures” (Barkan 2013: 30).
In the history of post‐Renaissance theorizing about the arts, logocentrism likewise stamps the most influential of all treatises on the relation between poetry and painting: G. E. Lessing’s Laocoön, first published in 1766. Taking arms against the claim that poetry and the visual arts are fundamentally similar, and especially against the notion that poetry is a kind of painting, Lessing argued that visual art—sculpture as well as painting—fundamentally differs from poetry: while poetry is essentially temporal, representing a succession of actions, visual art is essentially spatial, representing fixed forms juxtaposed in space. “Succession of time,” Lessing writes, “is the province of the poet just as space is that of the painter. It is an intrusion of the painter into the domain of the poet, which good taste can never sanction, when the painter combines in one and the same picture two points necessarily separate in time” (Lessing 1766: 98).
By means of overtly territorial terms such as “domain,” Lessing’s decree severely restricts the stories that visual art can tell. Visual art can tell a story, he writes, but only by depicting its “most pregnant” (prägnantesten) or “most suggestive” moment: the moment that best recalls what precedes it and best anticipates what follows it (1766: 34). Representing, for instance, the Virgilian story of how Laocoön was killed, the famous ancient sculpture of the Trojan prophet and his sons shows them fatally gripped by the pair of giant serpents who have just slithered up out of the sea and will shortly leave the three men dead. Nevertheless, Lessing insists that no one work of visual art can combine “two points necessarily separate in time” without violating “good taste.” Does Michelangelo really affront “good taste” on the ceiling of the Sistine Chapel, where he depicts in a single fresco both the original sin of Adam and Eve and their expulsion from paradise?2 Even if paintings like these could be somehow categorized as aberrational departures from the mainstream tradition of one‐point perspective and uni‐temporal art, we may well ask ourselves if any such essentialist theory of the arts can survive in an age when all arts have been digitized, when all texts and pictures are ultimately reducible to pixels or dots—the stem cells of all printed words and reproducible images, most of which and soon all of which, no doubt, can be readily called up on our computer screens.3
If Lessing’s laws can hardly accommodate Michelangelo’s depiction of the Temptation and Fall, what would he say about works of digital art such as Ori Gersht’s Pomegranate (2006)? When examined for more than a few seconds, this would‐be still life of a pomegranate, a cabbage, and a pumpkin turns out to be a high‐definition film of the pomegranate struck by a bullet and then exploding its seeds in slow motion.4 Yet if Lessing’s theory seems to oversimplify the ways in which art and literature represent space and time, even in his own time (let alone ours), we may be underestimating the complexity of his text.
Consider for instance the conflict between his claims for the limitless freedom of poetry and his firmly stated rule about what poetry cannot do. On the one hand, Lessing follows Du Bos and Mendelssohn (the latter a crucially important precursor) in distinguishing between the so‐called “natural” symbols of visual art and the arbitrary signs of language, which give the poet much more freedom than the painter (Lessing 1766: 40). On the other hand, Lessing comes very near to decreeing that poetry cannot give us more than one verbal picture at a time. In Virgil’s description of the serpents encoiling the stomach (medium) and neck (collo) of Laocoön, he says, “we must see first only the serpents and then Laocoön; we must not attempt to picture to ourselves how both would look together” (1766: 43). How can we reconcile these severe restrictions on the writer and reader of poetry with the limitless freedom that Lessing claims for language, not to mention the intensity with which Virgil’s description binds the serpents to the body of Laocoön, making it impossible for us not to see both together?
The problem with Lessing’s argument in chapter 6 resurfaces in chapters 16 and 17, where Lessing puts both of the arts on a short leash: painting can imitate actions, but only through bodies; poetry can depict bodies, but only through actions (1766: 78). After thus giving perfectly symmetrical tasks to painting and poetry, he binds them both to a principle of physical, material analogy that effectively denies to poetry the freedom he elsewhere claims for the arbitrary signs of its language. “Signs that follow one another,” he says, “can express only objects whose wholes or parts are consecutive,” and “poetry in its progressive imitations can use only one single property of a body” (1766: 78–9). How then can Lessing explain or justify Virgil’s reference to two parts of Laocoön’s body, his stomach (medium) and his neck (collo), in a single line of the Aeneid (2: 218)?
In chapter 17, however, Lessing takes up a more fundamental objection to his argument when he finally tackles the glaring contradiction between his claims for the freedom of arbitrary signs in language and his severe restrictions on the language of poetry. With the aid of his explanatory letter to Friedrich Nicolai (written in 1769, three years after Laocoön first appeared), we can see how he tries to resolve this contradiction. While language consists of “arbitrary” signs, as he says in Laocoön (1766: 88), painting should use only “natural signs” (quoted in Krieger 1992: 48). But unlike prose, poetry “must try to raise its arbitrary signs to natural signs: only that way does it differentiate itself from prose and become poetry” (quoted in Krieger 1992: 49). Just as painting aims to create the illusion that we are seeing one or more objects actually deployed in space, poetry should create the illusion that we are actually watching one event follow another in time. For Lessing, then, poetry should take as its model the illusionism of visual art. Is this the very same theorist who set out to liberate poetry from the shackles of pictorialism? So far from reiterating a reductive formula, he seems to have had a lively debate with himself, and thus left us with provocative questions as much as with straightforward answers.
Two hundred and fifty years after Lessing’s Laocoön, with centuries of theorizing behind us and digital technology all around us, we might be inclined to say that Lessing has long since had his day—especially since his concept of the “natural sign” has been all but demolished by contemporary semiotics, the theory of signs. Yet paradoxically, Lessing’s own discussion of signs anticipates semiotics, which aims to bridge the divide that Lessing sought to establish between poetry and painting, language and visual art. Taking the elements of visual art as signs, semiotically minded theorists of art argue that pictures can be decoded and thus read just as if they were texts.
The semiotic approach to visual art remains truly provocative. Though all pictures worthy of close scrutiny demand some sort of interpretation, we instinctively tend to dissociate the viewing of pictures—or even the study of them—from reading, which typically means the perusal of verbal texts. Consider this simple linguistic fact: while literacy is a virtually universal criterion of cultural advancement for individuals and nations alike, the English language has no generally accepted word for the ability to understand pictures, which is commonly thought to be—so to speak—written into the eye at birth, a product of instinct rather than education. However technical, sophisticated, and deliberate may be the art of rousing this ability, the means by which a picture does so have been naturalized by master tropes such as the window, which literally as well as figuratively frames the rules of perspective formulated in the Renaissance. By translating the three dimensions of visual experience into two, Renaissance masters such as Alberti prompt us to take the translation for the original, to read depth into a flat surface without any consciousness of doing so, and hence to embrace the illusion that we are looking through an open window: that we are seeing rather than reading.
We feel anew our bondage to this illusion every time it is broken by verbal or visual means. Do we ever wholly cease to feel cognitive dissonance when a painting that prompts or even compels us to “see” a pipe is labelled “This is not a pipe”? Or when the sharp, pointed fragments of a window pane that has been painted to represent a rural sunset are shown fallen to the floor in a painting wittily entitled Evening Falls?5 Paradoxically, paintings such as these confirm the power of illusion in the very act of playfully undermining it. We cannot even discuss such paintings without referring to at least some of their elements as if they were real objects, as I did just now in mentioning “fragments of a window pane.”
The notion that pictures are automatically readable—instantly recognizable by “natural” resemblance to what they represent—is what chiefly underwrites the long history of theorizing about the difference between images and words, which signify by means of arbitrary convention. Surprisingly enough, the theorist best known for his landmark essay on the difference between words and images actually treated both of them as signs. But in defining poetry as essentially temporal and painting as essentially spatial, Lessing makes both depend on a natural correspondence that aims to circumvent or elide the process of decoding and hence reading signs, to make them windows through which we sensuously, intuitively “see” the natural world of time and space.
Though the problematic concept of the “natural sign” seems to bridge the divide between nature and convention in Lessing’s theory of visual art, the rise of semiotics has merely driven them farther apart. Forty years ago, in explaining C. S. Peirce’s typology of signs (icon, index, symbol), Jonathan Culler wrote that “the icon involves actual resemblance between signifiant and signifié: a portrait signifies the person of whom it is a portrait not by arbitrary convention only but by resemblance,” which he goes on to call “natural resemblance.” Yet “[i]n the sign proper as Saussure understood it,” Culler adds, “the relationship between signifier and signified is arbitrary and conventional” (Culler 1975: 16). Natural signs, therefore, can take their place within the realm of semiotics only if they have been denaturalized.
For semiotically minded theorists, all visual art is a compendium of denaturalized signs. In Vision and Painting (1983), which appeared just eight years after Culler’s Structuralist Poetics, Norman Bryson took arms against what he called the “doctrine of Perceptualism” espoused by E. H. Gombrich. According to Gombrich, whose theory of art and its history was largely based on the psychology of illusion, painting is a record of perception: re‐creating for the viewer what the artist has seen, it generates from a two‐dimensional canvas the illusion of three‐dimensional space (Gombrich 1969). Contesting this doctrine even as applied to figural painting (it hardly fits abstract art), Bryson defined painting as “an art of signs, rather than percepts.” Amplifying Ferdinand de Saussure’s conception of meaning as the product of binary oppositions among signs in a self‐enclosed system, Bryson argued that painting “is an art in constant touch with signifying forces outside” it. Signs, therefore, must be decoded in light of the world that surrounds the production of them, which includes the viewer as someone not timelessly “given” but historically constructed—diachronic rather than synchronic. The viewer thus becomes “an interpreter” (Bryson 1983: xii–xiv).
The concept of the viewer as interpreter trails a long history of its own. It dates at least as far back as the third century of the Christian era, when a Greek rhetorician named Philostratus described a series of paintings “for the young, that by this means they may learn to interpret paintings.” Fourteen centuries later, Nicolas Poussin advised a viewer of his Fall of the Manna in the Wilderness (1638–1639) to read the painting as well as the biblical story it represents in order to see how everything in it fits the subject. “Lisez l’histoire et le tableau,” he wrote, “afin de connaitre si chaque chose est appropriée au sujet” (quoted in Lee 1967: 30). What Poussin meant by reading a painting closely resembles what Philostratus did, which was to convert each painting into a narrative, or deliver the narrative implied by the moment of action it represents.6 But as Bryson presents it, the semiotic approach to painting moves well beyond narrative. It construes each element of a painting as a historically determinate sign of the culture that generated it, and the painting itself as a site of “interaction between political, economic, and signifying practices” (Bryson 1983: xiii).
Three years after Bryson’s Vision and Painting appeared, W. J. T. Mitchell published Iconology, the first of a series of books that have long since established him as the premier theorist of interart relations in our time. Like Bryson, Mitchell regards pictures as pictorial signs, and with the aid of Nelson Goodman, he strongly contests Gombrich’s claim that images signify objects by means of “natural” resemblance rather than convention, as words do (Gombrich 1981). “The history of culture,” Mitchell writes, “is in part the story of a protracted struggle for dominance between pictorial and linguistic signs, each claiming for itself certain proprietary rights to a ‘nature’ to which only it has access” (Mitchell 1986: 43). Central to Mitchell’s ongoing case against this proprietary claim is the conviction that words and images deeply inform each other. Just as pictures can hardly be seen or “read” except in terms of language, language is so thoroughly steeped in the imagery of metaphor that we can hardly say where “image” ends and “word” begins. This is not to deny all differences between words and pictures, for as Mitchell has noted, “you can hang a picture, but you can’t hang an image.” An image, he writes, “is what can be copied from the painting in another medium, in a photograph or a slide projection or a digital file” (Mitchell 2015: 16).
Nevertheless, semiotic theory has not yet wholly vanquished perceptualism, nor altogether solved what might be called the enigma of recognition. In identifying what pictures and even photographs represent, we may well be reading them through the framework of cultural conventions, as semioticians argue. But how can cultural conventions explain our capacity to recognize pictures made by people to whose culture we have no other access—such as the Paleolithic cave paintings in Lascaux, France, which depict what are widely recognized as a bison, horses, stags, and a bull? Recognition, in short, has yet to be banished from the experience of art, has yet to be subsumed by any theory that would simply equate the viewing of a picture with the decoding of signs. What Mitchell himself wrote in 1994 remains true to this day: in an age of “all‐pervasive image‐making, we still do not know exactly what pictures are, what their relation to language is, how they operate on observers and on the world, and what is to be done with or about them” (Mitchell 1994: 13; emphasis mine). The point I have italicized, I believe, is the key one. No linguistically based theory of signs can exhaust the meanings generated by visual art, no label can predetermine or predict all that we can discover in the patient scrutiny of a painting, and some of its most poignant features may be impossible to name.
For the purposes of literary theory, however, the most important questions to be raised about the relation between the visual and the verbal revolve around the words we use to describe or interpret works of visual art in a kind of writing known as ekphrasis, which demands special attention.
Broadly speaking, visual art can respond to language in three ways: a work of art can illustrate a work of literature, as Poussin does with an Old Testament story in Fall of the Manna in the Wilderness; a work of art can incorporate language, as William Blake famously does in the composite art of his illuminated poems; and in our own time, works of art sometimes depict language alone, as in the black‐and‐white stenciled words of Christopher Wool’s Apocalypse Now (1988).7
To these three ways in which visual art may converse with language, ekphrasis adds a fourth: writing about art. Unlike illustration, composite art, and word pictures, ekphrasis is wholly verbal, written to be read. It does not even require the existence of an actual picture, for the picture represented by ekphrasis may be wholly imaginary or otherwise inaccessible to the reader, as in the Eikones of Philostratus. As already noted, this ancient Greek rhetorician (born about 190 CE) unpacked the stories told by a number of paintings which have long been lost, if they ever existed at all.
Spawned by the ancient Greek teaching of rhetoric, the word ekphrasis has a complicated history. In the first century of our era, Ailios Theon of Alexandria defined it as logos periegematikos, literally “leading around speech”: language that vividly describes or exhibits anything visible (Webb 1992: 35). By the fifth century, however, ekphrasis had come to denote the description of visual art, and in current usage it is normally used to mean “the rhetorical description of a work of art,” as the Oxford Classical Dictionary defines it, or “the poetic description of a pictorial or sculptural work of art,” as Leo Spitzer defined it in 1955 (Spitzer 1962: 72). More recently, Murray Krieger has treated ekphrasis chiefly as “word‐painting,” the verbal counterpart of visual art. As “the sought for equivalent in words of any visual image, in or out of art,” he writes, it “include[s] every attempt, within an art of words, to work toward the illusion that it is performing a task we usually associate with an art of natural signs.” Ekphrasis thus gratifies our lust for natural signs (for the immediate presence of the object signified) by defying the “arbitrary character and … temporality” of language. It offers us a verbal icon, “the verbal equivalent of an art object sensed in space” (Krieger 1992: 9–10). Yet if works of art “are structures in space–time” rather than either spatial or temporal, as W. J. T. Mitchell argues (Mitchell 1986: 103), ekphrasis must allow for both elements in the works it represents. For this reason I have defined it as the verbal representation of visual representation (J. Heffernan 1993: 3). This definition makes room for descriptions of paintings and sculptures that represent anything at all, whether someone or something in motion or a still object like Magritte’s famous pipe.
As a literary genre, ekphrasis ranges from ancient rhetorical exercises in description through art criticism to poetry and fiction. Furthermore, since digital technology and cinema have animated visual art itself, the verbal representation of visual representation has become more fluid than ever before. While traditional ekphrasis generates a narrative from a work of art that is still in both senses, silent and motionless, cinematic ekphrasis exploits the metamorphic power of film to conjure a dream world that rivals and contests the order of realistic fiction (J. Heffernan 2016). In all of these cases, the verbal version of a work of visual art remakes the original. The rhetoric of art criticism aspires to make the work of art “confess itself” in language that is always that of the critic (J. Heffernan 2006: 39–68); ekphrastic poetry turns the work of art into a story that expresses the mind of the speaker; and ekphrastic fiction turns the work of art—whether still or moving—into a story that mirrors the mind of a character. Ekphrasis, then, is a kind of writing that turns pictures into storytelling words.
Thus defined, ekphrastic writing invites comparison with art criticism, and specifically with its rhetoric. This move may seem a detour from the high road of literature—especially if art criticism entails art history, the compilation of facts about painters and paintings and schools of painting and the sequence of pictorial styles. But the line between literature and art criticism starts to blur as soon as we consider the kinship between Homer’s description of the shield sculpted for Achilles in the eighteenth book of the Iliad—the founding instance of ekphrasis in Western literature—and the Eikones of Philostratus, the father of art criticism. Typically, Philostratus interprets a painting by turning it into a narrative: not the story of its making, as in Homer’s account of Achilles’ shield, but the story suggested by its shapes, which are identified with the figures they represent. Though he never explains just how the episodes of a story are depicted or arranged in a painting, he aims to make the work “confess itself”—in Leo Steinberg’s phrase (Steinberg 1972: 6)—through the inferred speech of its characters. He sometimes tells us what painted figures are saying to each other and what sounds they signify, such as shouting and piping. As Leonard Barkan has recently observed, the Eikones “do everything that pictures cannot do by themselves … They exploit picture to create words. All the nonpictorial experiences that the ekphrases elicit from paintings are linguistic” (Barkan 2013: 22–4).
Like art itself, of course, art criticism has undergone major changes since the time of Philostratus. Until photographic reproductions became widely available in the twentieth century, art criticism had to reproduce paintings in words, as Denis Diderot regularly did in the later eighteenth century for subscribers to his Salons, where he describes the paintings regularly exhibited at the Louvre and often generates elaborate stories from them. But now, we might say, art criticism no longer needs description, and storytelling is surely irrelevant to much of modern art—especially abstract art.
What sort of story, after all, can be told about an art that seems to turn its back on representation, on reference to any object or figure that we might recognize from our experience of the world outside the painting, and that might thus give us something to talk about? Modern art has been charged with declaring war on language itself. Yet if modern art ever aimed to silence the viewer, it has conspicuously failed. Its very renunciation of what we commonly take to be subject‐matter intensifies our need to talk about it. So far from silencing the critic, then, abstract art provokes and demands at least as much commentary as any of its precursors. In writing, for instance, about Shade (1959), an “abstracted nightscape” by Jasper Johns, Leo Steinberg reactivates most of the rhetorical strategies that have permeated art criticism from Philostratus onward. His commentary is driven by a series of narratives: the Homeric story of how Johns made the painting, the quotidian tale of a day ending (the shade “has been pulled down as if for the night”), the quasi‐apocalyptic story of darkness immutable (“and obviously for the last time”), and finally the art‐historical narrative of what Johns does with Alberti’s master trope: the open window of Renaissance art, with its sunlit three‐dimensional vistas, becomes the impenetrably occluded window of modern or postmodern art, with its resolutely flattened opacity (Steinberg 1972: 309).
Unapologetically literary, with allusions to Milton and Joyce, Steinberg’s response to Johns’s painting clearly shows how much we can learn about the art of ekphrasis by studying it in what might be called its purest form: as art criticism. Art criticism works so close to the border of ekphrastic poetry that it sometimes crosses that border. In one of the most remarkable ekphrastic poems of the twentieth century, John Ashbery’s “Self‐Portrait in a Convex Mirror,” Ashbery quotes not only from Giorgio Vasari’s sixteenth‐century Lives of the Painters, Sculptors, and Architects but also from Sydney Freedberg’s Parmigianino (1950), a modern scholarly monograph. Though Ashbery, who has written a great deal of art criticism himself, surely knows the difference between that and poetry, he also demonstrates how much the first can feed the second (J. Heffernan 1993: 169–89).
Nevertheless, ekphrastic poetry differs from art criticism (almost but not quite equivalent to ekphrastic prose) in some important ways. Typically, I have argued, the art critic delivers from a painting or sculpture some kind of story about what it represents. At the same time, art criticism draws our attention to the medium of representation—oil, watercolor, stone, wood—and the technique of the artist, who is himself (or herself) a major part of the story told by the critic. In other words, art criticism typically operates on three major components: the work of art, the thing it represents, and the artist who represents it. In some cases, of course, one or more of these three components is suppressed. Philostratus makes no reference to any of the painters who produced the works he describes, and in explaining the painting of Narcissus, he nearly elides the difference between the work and what it represents.
Ekphrastic poetry may likewise blur this difference, as when John Keats’s “Ode on a Grecian Urn” addresses the sculpted figures on the urn as if they could think and feel and pant and move (Keats 1982: 282–3). On the other hand, in repeatedly reminding them that they are fixed and frozen, the poem highlights their difference from the figures they represent, thus reckoning with both the work and the world it signifies. At the same time, Keats elides any reference to the sculptor who stands behind the urn. In spite of all the art historical questions he raises about the figures on the urn—“what men or gods are these?”—he never asks the first question typically posed by art history: who made it? This is largely because the work of sculpture described in the poem is imaginary or “notional,” as John Hollander calls it, made up in words by the poet himself (Hollander 1988).8
While many other ekphrastic poems likewise ignore the artist, this is hardly a defining feature of ekphrastic poetry, which—as in Ashbery’s “Self‐Portrait”—may have plenty to say about the creator of the work it contemplates. What truly differentiates an ekphrastic poem from a piece of art criticism is that the poem demands to be read as a work of art in its own right. So while art criticism treats the painter, the painting, and the object represented, the critic of ekphrastic poetry must also reckon with two other elements: the poet and the poem. Here too some elements may be suppressed. In his ode on the urn, Keats says nothing explicit about himself; just as he elides the sculptor, he seems to edit out the poet. But in the final stanza the poet creeps in as one of the observers of the urn, which teases “us out of thought,” thus making explicit his presence as one who is both struggling to grasp what the urn represents and shaping his own work of art in the process.
Besides taking the form of poetry, ekphrasis can also be found in works of prose fiction (Karastathi 2015), and ekphrastic fiction can represent not only painting, still photography, and sculpture but also film. In Manuel Puig’s novel Kiss of the Spider Woman, one prisoner describes to another a series of films that are mostly imaginary or “notional” but often composed of elements drawn from actual films. Cinematic ekphrasis, which I have discussed at length elsewhere (J. Heffernan 2014), radically challenges the notion that ekphrasis deals only with pictures that are still in every sense, silent and motionless. In Puig’s novel, the stories told about films strongly suggest that cinematic ekphrasis exploits the inherently dreamlike character of film, its metamorphic fluidity. This metamorphic character of film evokes a particular kind of embedded narrative to be found in literature well before the advent of cinema: the story of a dream, which can all too easily become a nightmare.
If the history of the relations between the verbal and the visual—word and image, literature and visual art—could be summed up in a single word, that word might be dialectical. Though a great deal of early theorizing about the arts highlights their sisterhood or similitude, a strong suggestion of rivalry and contention emerges well before the eighteenth century, when Lessing codified their differences. Yet in plainly stating that words and images are both signs, Lessing laid the groundwork for a theoretical dialogue or conversation between them that has vigorously persisted into our own time. In practice as well as theory, the relation between the visual and verbal arts has become almost symbiotic. On one hand, theoreticians such as Mitchell have persuasively shown how much language depends on imagery; on the other hand, the kinetic energy of digitized art has broken the line between verbal narrative and pictorial stasis, and pictures made entirely of words have erased the line between words and images. Confronted with all of the ways in which the visual and verbal converse in works of literature and art, theory can accommodate them only by listening as carefully as possible to their conversation.