© Springer Nature Switzerland AG 2020
T. Catalano, L. R. Waugh, Critical Discourse Analysis, Critical Discourse Studies and Beyond, Perspectives in Pragmatics, Philosophy & Psychology 26, https://doi.org/10.1007/978-3-030-49379-0_2

2. Precursors to CDA and Important Foundational Concepts

Theresa Catalano (1) and Linda R. Waugh (2)
(1) Department of Teaching, Learning and Teacher Education, University of Nebraska-Lincoln, Lincoln, NE, USA
(2) Departments of French, English, Linguistics, Anthropology; Language, Reading and Culture; and Interdisciplinary Doctoral Program in Second Language Acquisition and Teaching, University of Arizona, Tucson, AZ, USA
 
Keywords
Precursors to CDA; Systemic functional linguistics (SFL); Systemic functional grammar (SFG); Metafunctions; Critical linguistics (CritLing); Social semiotics (SocSem); Visual design; Multimodality

2.1 Introduction

This chapter and the next are devoted to the precursors to, and the beginnings of, CDA. CDA did not arise all at once, since the originating work was developed at different times in different academic communities and, in some cases, without their knowing about similar work until later. It is widely agreed that Critical Linguistics (CritLing¹), which was developed in the UK and Australia in the 1970s, was the earliest foundation stone of CDA from within linguistics. It culminated in two important books at the end of the 1970s (Language and Control, Fowler, Hodge, Kress, & Trew, 1979a; Language and Ideology, Kress & Hodge, 1979) and then others in the 1980s (especially Kress, 1989, Linguistic Processes in Sociocultural Practice), all of which we will discuss below. CritLing came out of the line of British functionalist linguistics begun by John R. Firth, which was infused with the ideas of his anthropological colleague Bronislaw Malinowski, and was quite different from the formalist and structuralist approaches to linguistics at that time. Firth’s most eminent successor was generally recognized to be Michael A. K. Halliday, who developed Systemic Functional Linguistics (SFL), Systemic Functional Grammar (SFG) and Social Semiotics (SocSem). CritLing “was closely associated with ‘systemic’ linguistic theory (Halliday, 1978, 1985a)” (Fairclough & Wodak, 1997: 263), and thus many critical linguists and many of those in CDA (but by no means all) have used SFL and SFG as their main linguistic source. As a result, “an understanding of the basic claims of Halliday’s grammar and his approach to linguistic analysis is essential for a proper understanding of CDA” (Wodak, 2001: 8), as developed not only by Halliday himself but also other ‘systemicists’ who have “not only applied the theory, but also elaborated it” (e.g., Kress, 1976; Martin & Hasan, 1989; Martin, 1992).
Halliday’s approach is very different from that of other linguists of the time (and now), and therefore we will discuss those facets that are most important for an understanding of CritLing and CDA/CDS. However, we should also say here that other approaches in linguistics (such as discourse analysis (DA), text linguistics and sociolinguistics), approaches in other disciplines (e.g., ethnography and linguistic anthropology in anthropology; speech act theory in philosophy and then pragmatics; microsociology in sociology) and the interdisciplinary area of pragmatics are also important: they were precursors to CDA/CDS or developed at the same time, and they were used in CDA/CDS. We will define those very briefly below as they become relevant. But first, we set the scene briefly, with a few words about Firth and Malinowski.

2.2 British Linguistics: John R. Firth and Bronislaw Malinowski

John Rupert Firth (Firth, 1957a; Bazell, Catford, Halliday, & Robins, 1966; Palmer, 1968a) was seen by many as the ‘father’ of British linguistics and founder of the London School (aka Firthian linguistics) at University College London (UCL). While he was familiar with the European and American approaches to linguistics at that time, he was very different from them, since he embraced many of the ideas of his well-regarded colleague at UCL, the British social anthropologist (of Polish origin) Bronislaw Malinowski (1923, 1935), the “father of modern ethnography” (Duranti, 1985: 196; see Sects. 4.5, 6.4, and 6.5), who worked on the “primitive” languages and cultures of the South Pacific. Malinowski declared that language is “an instrument of action” (Malinowski, 1935); thus, language is not self-contained; rather, it is dependent on the society and culture in which it is used (Kress, 1976: viii). In this way, language and culture are “bound up inextricably with one another and the context of situation is indispensable for the understanding of the words” (Malinowski, 1923: 306; see Firth, 1957b), where ‘context of situation’ is often understood as the speech event, generally defined (Jakobson, 1960; Hymes, 1964, 1972; see Sects. 6.4 and 6.5). Malinowski emphasized that the meaning of a word is its use, defined meaning as function in (social) context and declared that “the meaning of any particular instance of everyday speech is […] deeply embedded in the living processes of persons maintaining themselves in society” (1952: 13). He also pioneered the study of (types of) situational meaning (1957a: 179–180).

Firth developed his theory of language based on these ideas and established his “functional” approach (see Firth, 1934, 1935, 1957c), which accepted Malinowski’s idea of context of situation, his “view of the relation between language and society” and his “definition of meaning as function in context” (Kress, 1976: x). He extended the latter to all linguistic units (e.g., sounds, words, sentences), and thus he didn’t consider the study of meaning (semantics) as a separate area of linguistics, since for him linguistics is ultimately concerned with the meaning of linguistic items in context. He also developed the notion of system as a set of choices in a given context; and eventually he characterized the language system as polysystemic, composed of several systems, including the systems of sound (for which he developed prosodic analysis, which many see as his major achievement; see Palmer, 1968b: 8; Kress, 1976: xv).

2.3 Michael A. K. Halliday and the Systemicists

2.3.1 Introduction: Systemic Functional Linguistics (SFL), Systemic Functional Grammar (SFG), Language as Social Semiotic (SocSem)

The British-born linguist, Michael Alexander Kirkwood Halliday (aka M.A.K. or Michael Halliday) had some of his education in China and the rest in the UK, wrote his dissertation under Firth, taught at various universities in the UK, and ultimately became Professor and head of the Department of General Linguistics at UCL. After spending some time away from UCL at other places (including in the US), he went to Australia in early 1976 as the Foundation Professor and head of the Department of Linguistics, University of Sydney, from which he retired in 1987. Starting with his writings in the early 1960s, he was productive until well into the 2000s (he passed away in April 2018). He developed his own wide-ranging functional theory of language and grammar, comprising Systemic Functional Linguistics (SFL, aka Systemic or Functional Linguistics), Systemic Functional Grammar (SFG, aka Systemic or Functional Grammar) and language as Social Semiotic (SocSem), with an emphasis on language purpose, language system and language use in social and cultural (i.e., semiotic) contexts. The systemic linguists (systemicists) who generally followed this theory and agreed with Halliday on many (but not all) points suggested various additions, emendations, and elaborations to his approach, especially his wife, Ruqaiya Hasan (Halliday & Hasan, 1976, 1985; Martin & Hasan, 1989) at Macquarie University and his University of Sydney colleague James Martin starting in the 1980s (see Martin, 1992), Robin Fawcett (Fawcett, 2000) at Cardiff University, and his colleague and biographer Jonathan Webster, Director of the Halliday Centre for Language Studies at the City University of Hong Kong.
We will pay special attention to SFL/SFG as it was defined from the 1970s until well into the 1990s in Australia, since it was during this period that it was influential on CritLing, SocSem, and also (to a certain extent) on the beginnings of CDA; we will also identify proposals by Hasan and Martin in particular which are cited by others (see in particular Eggins, 2004, which we have used as our major source for description and exemplification of SFL).

Halliday called his approach ‘functional’: he viewed language as being the way it is because it serves the expression and exchange of meaning (i.e., it has to do with purpose and meaning); and eventually, he and his co-author and former student, Christian Matthiessen, said that “functionality is intrinsic to language […] the entire architecture of language is arranged along functional lines” (Halliday & Matthiessen, 2004: 31, bolding in the original).² Due to this, he saw language as connected with all the areas that make human beings what we are, and he (and his colleagues, students, and followers) explored the overlapping areas of linguistics with other disciplines, such as sociology, anthropology/ethnography, psychology, history, politics, education (Halliday, 2002a: 5). In other words, one of the foundational ideas in SFL was that, rather than seeing linguistics as autonomous (independent) according to “the prevailing ideology of the 60s and 70s” (1978: 4, i.e., of the American linguist Noam Chomsky among others), Halliday believed that linguistics should be interdisciplinary, since each perspective for viewing language “is equally valid and language looks somewhat different from each of these vantage points” (2002a: 6) and an understanding of language needs to take all of this into account. Systemicists have looked at language in terms of, e.g., grammar, text and discourse analysis, stylistics, register variation, phonology (especially intonation and prosody), computational linguistics, language education, cognition, machine translation, and so on. However, Halliday favored the sociocultural angle and thus his major focus was on language use as an activity that takes place in sociocultural contexts.³
He also believed that language system and language use are in a dialectic relationship; thus he disagreed with Chomsky and others, who divided abstract competence from concrete performance and viewed linguistics as the study of competence only (Halliday, 1978: 4; 1970: 145; see also Martin, 2013).

For Halliday, SFL “provides something to think with, a framework of related concepts that can be drawn on in many different contexts where there are problems that turn out to be, when investigated, essentially problems of language” (2009: viii). It makes available what is necessary in order to “say sensible and useful things” (1994a: xv) about language (spoken or written) and understand the purposes it serves. This was based on four fundamental and intertwined theoretical claims about language use in the systemic approach (see Eggins, 2004: 3): (1) that it is functional: the form of language is determined by the functions it has evolved to serve; (2) that its function is semantic: it is “a system for making meanings” (Halliday, 1994: xvii); (3) “that these meanings are influenced by the social and cultural context in which they are exchanged” (Eggins, 2004: 3) and thus it is situationally and socioculturally contextual; and (4) “that the process of using language is a semiotic process, a process of making meanings by choosing” (2004: 3)—all of which Eggins summarized as a “functional-semantic approach to language” (2004: 3).

This functional-semantic approach is the backdrop for understanding how language operates in society. It “has developed as it has, both in the functions it serves, and in the structures which express these functions, in response to the demands made by society and as a reflection of these demands” (Kress, 1976: xx), as a means by which people can accomplish everyday social life (Eggins, 2004: 3) and thus, there is no separation between language and society. Halliday worked on “an interpretation of the functioning of language in socially significant contexts” (1973a: 8), including the deepest patterns of culture and social structure, as “the principle whereby the culture regulates the range of meanings […] that is typically associated by its members with particular social contexts” (Kress, 1976: xxi). He cited the British sociologist Basil Bernstein’s social-cultural theory of codes (1971, 1973, 1977), including restricted and elaborated codes, and the general principles of “sociological semantics” for relating meaning to “the social contexts in which language operates” (Halliday, 1973a: 8). His point was that meanings are “themselves the expression, or realization, of options in behavior, and some of these options have a broad socio-cultural significance”. And since the function of language is to make meanings, language is an “infinitely complex network of meaning potential” (1978: 5).

For Halliday, this meant that language is a ‘social semiotic’ (SocSem, see Halliday, 1978; see also Halliday & Hasan, 1985; Hasan, 2015). He insisted on the term ‘social’ “to indicate that language is an evolved system arising from the exchange of meanings between the members of a community as they lead their daily lives. Language constructs social life, just as social life is constructed by language” (Steele, 1987: xxii). He also used the formulation ‘social semiotic’ to refer to “interpreting language within a sociocultural context, in which the culture itself is interpreted in semiotic terms” (1978: 2), since the context of situation is a “semiotic structure whose elements are social meanings” (Kress, 1976: xxi) and culture is “an edifice of meanings—a semiotic construct”.

For Halliday’s SocSem approach, “semiosis”, that is, “the making and understanding of meaning” (Halliday & Matthiessen, 2004: 5), is done by social practices in a community. In this, he was influenced by the ideas of the American anthropologist-linguist, Benjamin Lee Whorf (1956), especially his insistence on the link between language, culture, and thought and his “view of language as the embodiment of a conceptual system” (Kress, 1976: x) and grammatical categories as providing what he called a “world view” (Whorf, 1956). Halliday (1973a, 1973b) credited Bernstein (1967, 1971, 1973) with showing that many interesting questions about language have to do with meaning differences and their different functions in different contexts. He also stressed “how the semiotic systems of the culture become differentially accessible to different social groups” (1978: 2), as in Bernstein’s notion of restricted vs. elaborated codes and that, in order to explain words like AIDS, macho, or privatization, one needs to “refer to their social origins and uses” (Fowler, 1991a: 91). Halliday lauded the American sociolinguist William Labov (1966, 1970a, 1970b) for having shown “how variation in the linguistic system is functional in expressing variation in social status and roles” (Halliday, 1978: 2) and advocated asking and answering such sociolinguistic questions as “How and why do people of different social classes or other subcultural groups develop different dialectal varieties and different orientations towards meaning?” (1978: 108),⁴ for which he would have a SocSem answer.

Furthermore, Halliday adopted “a perspective on language that is grounded in how we actually use language to construe reality and enact social relationships” (Webster 2009: 1), influenced again by Whorf (1956). For him, both language in general and the particular language(s) we learn, especially early in life, inform the way we see and understand the world, which means that our view of that world is the result of the social process by which children learn language (1978: 1; see also Halliday, 1975; Halliday & Hasan, 1976; Halliday, 1977a). In particular, the child is Learning how to mean (Halliday, 1975/1977), i.e., constructing the “functional semantic system of the mother tongue” (Kress, 1976: xxi) for use in future social interactions of meaning-making; at the same time, the child is building up an understanding of reality that is “inseparable from the construal of the semantic system in which the reality is encoded” (Halliday, 1978: 1). Indeed, “language (and other semiotic systems) has developed and is as it is because of the meanings that people have needed to create in order to communicate; semiotic systems (such as language) reflect, construe and enact our reality” (Andersen, Boeriis, Maagerø, & Tønessen, 2015: 2). In addition, in Halliday’s view, language use is in a dialectic relationship with society which it both reflects and creates, since language not only transmits the social order, it also potentially modifies it. And, since speakers have the possibility of creating new meanings and of acting upon and shaping the social world, language can change in order to adapt to the particular needs and interests of the society in which it is used and/or it can create or change those needs and interests (and thus he took exception to the strict separation of synchrony and diachrony/history in most other linguistic approaches).

And finally, Halliday also used the formulation ‘language is a social semiotic’ to mean that “language, like other semiotic systems, is a systemic resource for making and exchanging meaning” (Webster, 2015b: 316), and it also refers to “interpreting language within a sociocultural context, in which the culture itself is interpreted in semiotic terms” (1978: 2), since culture (including society) is “an edifice of meanings—a semiotic construct”. And since language is one of the semiotic systems that both constitutes and is constituted by a given society, this enables members of that society to understand each other, act out social structure, affirm their statuses and roles, establish and transmit shared systems of value and knowledge, and so forth.

This integration of language into a larger, SocSem framework (see Halliday, 1978; Halliday & Hasan, 1985; Hasan, 2015) started a robust intellectual trend in the field which still exists today in various forms (see Sects. 2.5, 2.6, and 3.9) and is connected with CDA/CDS (see Sect. 4.6); we view this as the SFL/SFG version of SocSem.

2.3.2 Stratal-Functional Model, System, Structure and SFG

For his functional-semantic approach, Halliday developed a coherent and explicit framework “within which it is possible to state the relationship of units on all levels to each other” and which “provides a statement of the context of all linguistic units” (Kress, 1976: xvi). He proposed a “stratal-functional” (aka “scale-and-category”) hierarchical model of levels of language (1973a, 1976a, 1976b, 1978; inspired by Hjelmslev, 1953; Firth, 1957a; Lamb, 1966)—and later he used a set of concentric circles (Halliday, 1985a, 1994a). In both of these models he visualized language as going from meaning to form (and not the reverse, as in many other approaches). The highest stratum (outermost circle) is ‘semantics’, the meaning related to the clause or clause complex in the sentence. The other meaning-oriented units go (downward or inward) from clauses to phrases, then to words and morphemes, and then to form-oriented units, such as sounds and their combinations, at the bottom (the innermost circle). For the meaning- (or content-) oriented units (i.e., clause complexes, clause, phrase, word, morpheme) Halliday used Firth’s term, ‘lexicogrammar’ (or just ‘grammar’), which combines syntax, morphology and lexis (lexicon) and which he defined in terms of meaning potential, i.e., its capacity to be used for meaning-making. And since for him, all form carries meaning, he rejected the strict separation of meaning and form in many twentieth century linguistic approaches. But lexis is not a separate component since it includes not only vocabulary (words), but also, e.g., unanalyzed phrases, lexical/syntactic patterns (e.g., greeting formulas) and collocations. Thus lexis is the “delicate” end of (lexico)grammar, where “delicacy” means depth in detail (Halliday & Matthiessen, 2006: 6; see also Halliday, 1961; Hasan, 1996). 
In other words, lexis and grammar are at different ends of a cline and are therefore “different ways of looking at the same phenomenon” (Halliday & Matthiessen, 2006: 6). And lexicogrammar is a ‘natural’ symbolic system because “both the general kinds of grammatical pattern that have evolved in language, and the specific manifestations of each kind, bear a natural relation to the meanings they have evolved to express” (Halliday, 1994a: xviii; see also Kress, 1993).

Inspired by Malinowski’s ideas about language as a range of possibilities (its potential), Prague School notions of the paradigmatic axis and Firth’s vision of language as polysystemic, Halliday defined system as a paradigmatic array of options (1973a: 51–52), of choices; for example, a clause may be imperative or indicative, and if indicative, declarative or interrogative, and if interrogative then either a yes/no question or a wh- question. Choices like these carry the meaning potential that lies behind every instance of text, i.e., what the speaker can do, say and mean in an act of communication; and the “power of language resides in its organization as a huge network of interrelated” and meaningful choices, which are “terms in systems, with interrelated systems represented in the form of a system network” (Webster, 2015b: 317). Language users select from those options for forming their texts and thus semantics includes the study of both the meaning potential of units in the system and the “realized” meaning in a specific instance of use (“instantiation”). In other words, in SFL there is no separation between ‘semantics’ (the study of meaning) and ‘pragmatics’ (the study of meaning in use) that other linguists make.
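The idea of a system as a network of interrelated choices can be made concrete with a small sketch. What follows is only an illustration under our own assumptions, not Halliday's notation or any established SFL software: the names `MOOD_NETWORK`, `selection_paths` and `is_valid_selection` are hypothetical, and the network contains only the clause-type options named in the paragraph above.

```python
# Hypothetical encoding of the clause-type choices named in the text.
# Each key is an entry condition; its value lists the options that
# become available once that entry condition has been selected.
MOOD_NETWORK = {
    "clause": ["imperative", "indicative"],
    "indicative": ["declarative", "interrogative"],
    "interrogative": ["yes/no question", "wh- question"],
}


def selection_paths(network, entry="clause"):
    """Return every complete path of choices starting from the entry condition."""
    paths = []
    for option in network[entry]:
        if option in network:
            # The option is itself the entry condition of a further system.
            for tail in selection_paths(network, option):
                paths.append([option] + tail)
        else:
            # Terminal option: no further system depends on it.
            paths.append([option])
    return paths


def is_valid_selection(network, choices, entry="clause"):
    """Check that a sequence of choices follows the network down to a terminal option."""
    current = entry
    for choice in choices:
        if current not in network or choice not in network[current]:
            return False
        current = choice
    # A selection is complete only when no further choice remains open.
    return current not in network


if __name__ == "__main__":
    for path in selection_paths(MOOD_NETWORK):
        print(" -> ".join(path))
```

Enumerating the paths gives a rough picture of the meaning potential of this small network: each complete path is one option that a speaker can select when forming a clause, and a selection such as "indicative, interrogative" counts as incomplete because a further choice is still open.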

Once the language user makes the choices, they are arrayed in what Halliday called ‘structure’; this includes the syntactic (syntagmatic) arrangements and combinations, as in the clause; they are typically hierarchical in (semantic) organization but linear in (syntagmatic) realization. However, the ‘systemic networks’ that lie behind the structural array free the grammar from the restrictions imposed by structure since they precede it. In other words, Halliday construed the nature of language as going not only from meaning and function to form but also from system to structure; as a result, meaning and system have priority over, and determine, form and structure. This meant that his approach was very different from other contemporary approaches, since most of them were “primarily syntagmatic in orientation” (Halliday, 1994a: xxviii) and structuralist, with form at their foundation (‘formalist’), while he, in contrast, allied himself with primarily paradigmatic and functional(ist) ones, such as the Prague School (see Vachek, 1966), which takes semantics as its foundation and focuses on text (see Martin, 2013).

In contrast with other approaches to language, Halliday wrote a detailed (although not complete) (multi)functional grammar of English that would provide a concrete example of his SFL/SFG approach and serve the needs of, e.g., linguistic, stylistic, educational and semiotic research into language (use). His An Introduction to Functional Grammar (IFG, Halliday, 1985a, 1994a; Halliday & Matthiessen, 2004; Matthiessen, 2014) is “an introduction both to a functional theory of the grammar of human language in general and to a description of the grammar of a particular language, English, based on this theory” (Matthiessen, 2014: xiii); and more specifically it is an explicit and detailed description of the meaning-making resources of English (see also Eggins, 2004). In Halliday’s view his functional grammar of English, as detailed in IFG, is “functional in three distinct although closely related senses” (1994a: xiii), i.e., in its interpretation of: (1) texts, (2) the system, and (3) the elements of linguistic structures. What is detailed in IFG, however, is not the systemic, but rather “the structural portion which determines how the options are realized” (Halliday, 1994a: xv) in their syntagmatic (syntactic) combinations (e.g., phrases, clauses, clause complexes). As a result, IFG is meant to account for how (the English) language is used; and thus “the aim has been to construct a grammar for purposes of text analysis; one that would make it possible to say sensible and useful things about any text, spoken or written, in modern English” and show “how, and why, the text means what it does” (1994a: xv).

2.3.3 Text and Context: Register and Genre

In SFL, text (or discourse⁵) is a spoken or written instance of language use, an instantiation (realization) of the system and structure of a language “in any medium, that makes sense to someone who knows the language (cf. Halliday & Hasan, 1976: Chap. 1)” (Halliday & Matthiessen, 2004: 3). As we will see, systemicists refer to text either with the mass noun ‘text’ (without an article), thereby giving it a more abstract meaning, or with the count noun ‘a/the text’ (with an article), which has a more concrete meaning. We will follow these conventions for the use of ‘text’, ‘a/the text’⁶ (and a few other nouns) in our discussion below. Given the different perspectives in the SFL approach, text is both process, e.g., speaking, which is dynamic and unfolding in time, and product, e.g., what has been said, which may be present to us in memory as product, or a written text which is presented as product (Halliday, 1994a: xxii). (Spoken) text as process is at the same time (an) intersubjective activity, sociocultural event, semiotic encounter, act of meaning making, exchange of meaning in context of situation, semantic process of social dynamics, and so forth (Halliday, 1977b). At the same time text is the primary channel of the transmission of culture, and as such may be long-lasting or ephemeral, momentous/memorable or trivial/soon forgotten, spoken or written, prose or verse. As well, in Halliday’s view, the system and its realization in text are the same thing seen from different points of view, since they display deep complementarity. And since text takes place, and is realized, in a socioculturally defined context, it is the meeting point of that context and linguistic expression (Eggins, 2004: 21); indeed, Eggins (2004: 2) stated that Halliday’s interest was in “the meanings of language in use in the textual processes of social life, the ‘sociosemantics’ of text”.
It should be underscored here that from systemicists’ point of view, “there is no single meaning ‘in’ a text which can be ‘uncovered’/‘discovered’ by analysts” (Birch & O’Toole, 1987a, 1987b: 11). Thus, textual meaning is not the sole property of the speaker/writer nor the hearer/reader; and how a text is received (and understood) is not a passive process, since meaning is constructed and/or interpreted by writer and reader, speaker and hearer. Therefore, what SFL does is to offer a way for textual analysis and criticism to show “how, why and where those interpretations come from” (1987a, 1987b: 11).

Halliday and other systemicists have participated in the analysis of both highly valued literary texts and mundane everyday texts, spoken and written, and insisted that no new branch or separate level of linguistics (e.g., stylistics) is needed for analyzing literary texts, scientific ones, a mystery novel, a fund-raising letter, or a dissertation (see Halliday, 2002b) since SFL/SFG should handle all types of text. Halliday described an (oral) defense of a dissertation that he participated in as a lexicogrammatical event (since it used lexical items/grammar to build the text of the defense), which exemplified the power of discourse to change the environment that engendered it (Webster, 2015b: 326): at the end a Ph.D. was awarded, and the defense was thus an expanded performative. However, he also conceded that the latter “would obscure a more fundamental point, which is that every text is performative in this sense. There can be no semiotic act that leaves the world exactly as it was before” (Halliday, 1994b: 254).

The basic unit of text is clause, a lexicogrammatical unit which is made up of smaller units (phrases, words) and at the same time is part of higher units, such as clause complexes or complex sentences, which combine in various ways to create text. Clause contributes to the overall meaning of a text in three different ways. (1) It is a representation and conveys some type of information: a way of understanding since it construes ongoing experience with some type of meaning; thus, e.g., the ‘actor’ of the clause functions as the active participant in the process, the element portrayed by the speaker as the one who “did the deed” (Halliday, 1994a: 34). (2) It is an exchange: a transaction between speaker and hearer; thus, e.g., the ‘subject’ of the clause functions as the element the speaker makes responsible for the validity of what s/he is saying. (3) It is a message, which is construed by the way it functions in the overall text; thus, e.g., the ‘theme’ of the clause selected by the speaker functions as the point of departure for the message, as the ground(ing) for what s/he is going to say. And, according to Halliday, “Theme, Subject and Actor do not occur as isolates: each occurs in association with other functions from the same strand of meaning” (1994b: 34).

The fact that Halliday discussed on the one hand lexicogrammar and semantics in terms of clause and on the other hand text in terms of cohesion and coherence (see below) was seen early on as a problem in his approach. This led to a series of proposals by others in SFL/SFG about semantics at the text/discourse level; see, e.g., Martin’s proposal (1985, 1992; Martin & Rose, 2003, 2008; see Andersen et al., 2015) for “discourse semantics”, which is the next higher level (or outer concentric circle) after (lexicogrammatical) semantics and thus the place where theme and rheme, among others, are located. Halliday also insisted that an understanding of the text as a whole rests on the connection between it and its context, since the actual choice among the various possibilities (options) takes place in a given context: the context of situation (situational configuration), which is the immediate (linguistic and situational) context (the context of the speech event) in which a given text is situated, to which it contributes and from which it gets (part of) its meaning. For some in SFL, there is another level or type of context: the (more variable and larger) sociocultural context, which encompasses all facets that are relevant for a text in a particular society.

Halliday proposed that there are three aspects of the context of situation (the immediate context of the speech event) that are relevant collectively for understanding how we use language: field, tenor and mode (as in Halliday, 1985a, 1994b, although there has been some controversy about how they should be defined, which we can’t detail here). For Halliday, ‘field’ refers to characteristics of the social process and the subject-matter being treated: for instance, a discussion about Bernstein’s ideas in a classroom, based on a reading of some of his work. ‘Tenor’ touches on the social characteristics of the participants, their status(es), their social roles, their relationship (power or solidarity), etc.: for instance, one teacher and many postgraduate students engaged in discussion in a classroom. ‘Mode’ refers to the role language plays in the interaction, the kind of text that is being made by the interaction, the part it plays in the immediate context and various aspects of the channel of communication: for instance, a very complex monologic and dialogic, partly unscripted and partly prepared, classroom discussion. The values of the three variables of field, tenor, mode taken together help language users trace situational context, identify it and predict the meanings to be communicated, and thus make language efficient and understandable in communication. For some systemicists, field, tenor and mode together determine the ‘register’ of a text, the functional variety of language that corresponds to the specific situation, “the configuration of semantic resources that the member of a culture typically associates with a situation type” (Fowler, 1991b: 37).
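As a purely illustrative aid, the three contextual variables can be pictured as a simple record whose values jointly characterize a situation type. The sketch below rests on our own assumptions: the class `ContextOfSituation`, the helper `differing_variables` and the free-text values are hypothetical, and SFL itself describes these variables in far richer terms than plain strings.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ContextOfSituation:
    """Hypothetical record of the three contextual variables."""
    field: str  # the social process and subject-matter
    tenor: str  # the participants, their roles and relationships
    mode: str   # the role language plays and the channel used


# The classroom example from the text, paraphrased as free-text values.
classroom = ContextOfSituation(
    field="discussion of Bernstein's ideas, based on a reading of his work",
    tenor="one teacher and many postgraduate students",
    mode="partly unscripted, partly prepared classroom discussion",
)

# An invented variant (our assumption): same field and tenor, different mode.
lecture = replace(classroom, mode="prepared monologic lecture")


def differing_variables(a, b):
    """Name the contextual variables on which two situations differ."""
    return [
        name
        for name in ("field", "tenor", "mode")
        if getattr(a, name) != getattr(b, name)
    ]


if __name__ == "__main__":
    print(differing_variables(classroom, lecture))
```

Comparing two such records shows which variable has shifted between situations, which is the kind of difference that, for some systemicists, corresponds to a difference in register.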

The other facet of context, culture /society, enables participants, for example, to understand each other, act out social structure, affirm their statuses and roles, and establish and transmit shared cultural systems of value and knowledge (Halliday, 1978: 2). With regard to text , the dimension at issue here, according to some systemicists , especially Martin (1986, 1992; Martin & Rose 2008) and the “Sydney School”, is ‘genre ’, “which has to do with the social relevance of a text ” (Birch & O’Toole, 1987b: 1). It is the product of recognizable and recurrent social activity types, which “become habitualized and, eventually, institutionalized as genres” (Eggins, 2004: 58; see Bakhtin, 1994: 83)—e.g., ordinary conversations, (spoken and written) narratives, job interviews, medical pamphlets, textbooks, etc. Members of a society develop genres as models since they can reproduce them easily in order to accomplish their goals and because doing something in almost the same way over and over again saves time and energy (see Berger and Luckmann, 1967: 71; Eggins, 2004: 57). Genres also make communication with, and understanding by, the listener possible or at least more efficient (Bakhtin, 1994: 84). Genre is another contentious area of SFL and has been given other definitions; for example, in a SocSem context , genre is a reflection of the semiotic structures which mediate between the cultural context (institutions and ideologies) and the sayings and doings of the community (Threadgold, 1986: 5, 35; see also Lemke, 1985; Thibault, 1991). At the same time, since genres are social in nature, they are fluid, dynamic and subject to change as social patterns change in speaking and/or in writing.

2.3.4 The Three Metafunctions: Ideational, Interpersonal, and Textual

Halliday developed very early the notion of ‘metafunction’ and defined it as the organization of the functional framework around three major, and interconnected, kinds of meaning in (adult) language use: “firstly, the ideational function through which language lends structure to experience”, and construes reality and our understanding of the world; “secondly, the interpersonal function which constitutes relationships between the participants; and thirdly, the textual function which constitutes coherence and cohesion in texts” and texture (Wodak, 2001: 8). They permeate the whole view of language in SFL, for instance, they occur every time we use language and are “intrinsic to language as both system and process” (Hasan, 2015: 123); they are “each equally essential in the formation of the semantic and grammatical units”, including clauses, sentences and texts, which contain features of meaning which come from all three; and “each is nonhierarchical” (2015: 123). That is, “our utterances overwhelmingly display all metafunctional strands of meaning-wording” (Hasan, 2015: 132) continuously and simultaneously, and in principle, each clause contains features of meaning which come from each of the three metafunctional areas and every text encodes meaning on these three levels simultaneously. One general way to understand them is that the ‘ideational’ and the ‘interpersonal’ strands are woven together by the ‘textual’ metafunction into a unified text (or discourse); and the task of the analyst is to disentangle them in order to identify them.

In the ideational metafunction, language is used to transmit information between members of a society (Kress, 1976: xix) about the world around them; it also makes sense of or construes our experience in both our outer and inner (thought) worlds and is akin to what others have called the propositional content, cognitive meaning or referential function of sentences (early on it was separated into the ‘informational’ or ‘logical’ vs. ‘experiential’ functions, but then the two were coalesced). It also has a dialectical relationship with social structure—reflecting, creating and influencing it. Moreover, in this view, the world is not a fixed, objective reality represented neutrally through language (as assumed by many linguists, philosophers, and cognitive psychologists), since the world we talk about can differ depending on who is speaking, which language is being used, how it is used, what its socio-cultural context is, who is using it, and so forth—and also because language (use) lends structure to experience and thus can change it.

At the level of context of situation, in Halliday’s view, ideation has to do with ‘field ’, what the text is about; at the semantic level, it has to do with “how we represent reality in language” (Eggins, 2004: 206). The clause, which is the main resource provided by the grammar of English for ideation, is made up of two components of (lexico)grammar: experiential meaning and logical linkage. With regard to the former, the role of the clause is to represent “some process—some doing or happening, saying or sending, being or having—with its various participants and circumstances” (Halliday and Matthiessen, 2004: 29). Transitivity is the major system involved in this: “Transitivity patterns represent the encoding of experiential meanings: meanings about the world, about experience, about how we perceive and experience what is going on. By examining the transitivity patterns in text , we can explain how the situation is being constructed, i.e., we can describe ‘what is being talked about’ and how shifts in the field are achieved” (Eggins, 2004: 249). While the classification of process types expressed by verbs is not entirely agreed upon by all systemicists, we can say, following Halliday (1994a: 106–175, Halliday & Matthiessen, 2004), that there are three major types in English. (1) Material processes of doing-and-happening in the physical world, with the basic meaning “that some entity does something, undertakes some action” (Eggins, 2004: 215) at some time or place, under certain circumstances; (2) Mental processes of sensing, thinking, feeling, seeing, etc. in the world of consciousness, typically with a conscious human participant and a non-active participant; (3) Relational processes where things are stated in relation to other things (they are assigned attributes or identities). And, there are three minor process types, e.g., physiological or psychological behavior; verbal actions; and existential processes (e.g., there is). 
As for the logical linkage of clauses, they can put clause complexes into “coherent, semantically sequenced packages” (Eggins, 2004: 295), e.g., according to whether they are either equal and independent, as in coordination (parataxis, with and, or, but) or unequal with one dependent on the other, as in subordination (hypotaxis, with if, while, since).

In the ideational metafunction, the structural mechanisms create a grid through which the view of the (social and natural) world is mediated (Halliday & Matthiessen, 2004: 28). In addition, ideational structure, both experiential and logical, is in a dialectical relationship with social structure: it both reflects and influences it. Hence, a “text, under social pressures, offers a mediated, partial, interpretation of the objective reality of which it claims to speak” (Fowler, 1991a: 91). Language constructs “human experience. It names things, thus construing them into categories […] and the fact that these differ from one language to another is a reminder that the categories are in fact construed in language” (Halliday & Matthiessen, 2004: 29). As well, “there is no facet of human experience which cannot be transformed into meaning. In other words, language provides a theory of human experience, and certain of the resources of the lexicogrammar of every language are dedicated to that function”.

In the interpersonal metafunction, language is used to establish, maintain and specify relations between members of society and thus between the participants in an interaction; every text addresses someone and enacts our personal and social relations. When we question, offer something, or express our attitude about what we are saying, we are utilizing the interpersonal metafunction, which expresses intersubjective meanings (i.e., shared by the participants in an interaction) about roles (social relationships with other language users) and attitudes. This kind of meaning is quintessentially associated with “‘language as action’” (2004: 30); and since “one of the main purposes of communicating is to interact with other people: to establish and maintain appropriate social links with them” (Thompson, 1984: 38), it has to do with the way relations between speaker and hearer (or writer and reader) are established through, or expressed by, language.

In the context of situation, the interpersonal metafunction has to do with who the participants are, what relation they have with one another, and what they are doing with each other, and thus it is related to, e.g., “politeness”, a complex concept based on a number of linguistic, contextual and cultural factors (see Brown & Levinson, 1987). At the lexicogrammatical (clause) level, it has to do with text as an exchange between the participants, who make statements, ask questions or give commands, by using the mood structure of the clauses. It also has to do with modality, a very complex area of English grammar which allows language users to express attitudes or judgments and is typically divided into two different kinds of meanings (Eggins, 2004: 172–174): on the one hand, probability, where the speaker expresses judgments as to the degree of likelihood of something happening or being (e.g., possibly, probably, certainly, i.e., low, median, and high likelihood), and on the other hand, usuality, “where the speaker expresses judgments as to the frequency with which something happens” or exists (low: sometimes, median: usually, or high: always). Modality is also involved in conveying (degrees and types of) “obligation, necessity”, e.g., ‘You should/must/are obliged to/are required to read Harry Potter’ (see Eggins, 2004: 179; cf. Halliday & Matthiessen, 2004: 147).

It is through these highly complex systems of mood and modality that speakers of English make meanings about “the power or solidarity of their relationship; the extent of their intimacy; their level of familiarity with each other; and their attitudes and judgments” (Eggins, 2004: 184) and so forth. Thus, there is a direct link between the clausal patterns, the semantics of interpersonal meanings and the context of situation; therefore, “in studying the grammar of the clause as exchange we are actually studying how interpersonal meanings get made […] we have a way of uncovering and studying the social creation and maintenance of hierarchic, socio-cultural roles” (2004: 187).

The ideational and interpersonal metafunctions are usually combined in messages as the two basic functions of language: (1) “every message is both about something and addressing someone”; and, (2) the two metafunctions can be “freely combined—by and large, they do not constrain each other” (Halliday & Matthiessen, 2004: 30). However, the successful “negotiation” of a text involves more than these two types of meaning; they have to be combined in a way that is understandable and reliable. This is the task of the textual metafunction , where language is used to provide textual meaning, “texture, the organization of discourse as relevant to the situation” (2004: 30). Thus in this metafunction, actual discourse (text ) is created. It facilitates the construction or composition of text for communication by building up sequences of sentences and by “organizing the discursive flow and creating cohesion and continuity as it moves along” (2004: 30), such that the ideational meanings and the interpersonal ones are woven together into a unified text .

Since the textual metafunction relates to how (the) text is organized as a message so that it can be negotiated, Halliday described it early on as the “relevance” or “enabling” metafunction (Halliday, 1975/1977: 95, 97; see also Halliday & Matthiessen, 2004: 30): it enables the connecting of facets of ideational meaning with interpersonal meaning so that the resulting text is effective (has textual meaning), given its purpose and context . The textual metafunction is thus of “crucial ideological significance … it undoubtedly ‘breathes relevance into the other two’” (Birch & O’Toole, 1987b: 11). It is also concerned with communicating both information and aspects of interpersonal relations as efficiently as possible through the overall organization of the text and through making it relevant to both the context of situation and the culture /society of which it is a part. This means that the content associated with the ideational and interpersonal metafunctions must be constructed in such a way that it “signals to us which part of the text is more/less important to an understanding of the overall text ” (Eggins, 2004: 295) and that it enables listeners and readers through clauses to interpret the speaker’s or writer’s priorities and direction.

Thus, the textual metafunction brings together the construction of a text , its internal organization, its composition, the way in which bits of text are related to each other semantically, and so forth, so that they have the property of textual unity (texture). This is the result of the interaction of two components: ‘cohesion’ and ‘coherence’. Cohesion ties the elements of the text together and creates connectedness of its linguistic forms and patterns and continuity between one part of text and another (Halliday & Hasan, 1976, 1985). It can be seen as the ‘glue’ of the text . At the same time, it “is fundamentally about the ongoing contextualization of meanings” (Eggins, 2004: 51) and can be divided into three main types (2004: 33–53; Halliday & Matthiessen, 2004: Chap. 9): “reference”, which has to do with the introduction of participants (people, places, and things) and keeping track of them; “lexical cohesion”, concerned with how words in a text relate to each other through classification or composition or word chains (lexical strings); and “conjunction”, focused on “how the writer creates and expresses logical relationships between the parts of a text ” (Eggins, 2004: 47), as in textual coordination and subordination. The other component of texture, ‘coherence’, refers to the way a group of sentences relates to the context (Halliday & Hasan, 1976: 23). For some systemicists (but not all), coherence can be broken down into the two types discussed above: registerial coherence in relation to the context of situation (Eggins, 2004: 29) and generic coherence in relation to the context of culture . In very effective texts, contextual coherence and internal (organizational) cohesion act together and reflect each other.

The textual metafunction also deals with the system of theme (vs. rheme), the topic of one of Halliday’s first major series of articles (1961, 1967–1968; see also 1985), based on work in the Prague School. The general assumption is that there is a major configuration of clauses “into the two functional components of Theme (point of departure for the message) and Rheme (new information about the point of departure)” (Eggins, 2004: 296). The system of theme “contributes very significantly to the communicative effect of the message” (2004: 298) since it is concerned with what the clause is going to be about (theme has sometimes been called the ‘psychological’ subject of the clause, but not necessarily the grammatical subject) and with what is going to develop the theme, i.e., the rheme. The thematic structure of the clause is typically signaled by the order of the constituents of the clause, with the theme coming first (in English, but not necessarily in other languages), and is often analyzed as given, or familiar, information. Rheme is everything after that in the clause (and is often new, or unfamiliar, information about the theme). Given and new are analyzed as part of information structure and typically realized through intonation, as well as other elements in spoken English (see Halliday, 1967) and still other, different elements in written English (and this seems to differ across varieties of English); given this complexity, they will not be treated in depth here (see Halliday & Matthiessen, 2004: 87–205; Halliday & Greaves, 2008). Work on the theme/rheme structure of the clause is merely the first (micro) level of textual organization since theme or rheme may be related to the topic and/or the subject, and may be associated with other elements of the clause. In addition, as proposed by Martin, texts are made of sentences, paragraphs, phases, etc. and include other elements, such as hyperTheme or macroTheme, etc. (Martin, 1992a; Martin & Rose, 2003, 2008; see also Eggins, 2004: 326).

2.3.5 Grammatical Metaphor

Halliday’s definition of what he called ‘grammatical metaphor’ was associated with the ideational and interpersonal metafunctions, his work on the difference between spoken and written language, and his (controversial) claim (1985/1989: 95; see also 1994a) “that written language is associated with the use of grammatical metaphor”7 and spoken language isn’t or is less so. While the notion of metaphor is often found in, e.g., rhetoric, literature, and cognitive linguistics, and in CDA/CDS (see Sect. 4.9) for, e.g., the use of a lexical item that normally means one thing to mean something else, Halliday’s notion is quite different. From his point of view, there are typical ways of saying things in (lexico)grammar, which he calls “congruent”, and there are “others that are in some respect ‘transferred’ or metaphorical” (1994a: 342) or ‘non-congruent’. In some cases, they become so frequent that they may not be recognized as metaphors (a point also made in cognitive linguistics and in CDA/CDS).

In grammatical metaphor, according to Halliday, there is first the decoupling and then the recoupling of lexicogrammar and semantics. Thus, in the lexicogrammar of everyday English, nouns typically encode things and people, while verbs typically encode happenings and processes, as with the verb discuss. However, a grammatical metaphor is used when, e.g., that same process is represented with different grammar, e.g., a noun such as discussion, which is non-congruent, i.e., metaphorical,8 because in this case the noun is used to encode a process. Thus, one of the major ways grammatical metaphor arises is through nominalization (e.g., a noun is used instead of a verb, or an adjective). The grammatical metaphor of nominalization can also be used to express a particular attribute of a person or thing: e.g., ambivalence which can be used to mean ‘people are ambivalent towards/about’ something (see Thompson, 1984: 167). There are of course types of grammatical metaphors that are not nominalizations, e.g., the verb shows that with discussion as its subject (as in e.g., ‘this discussion shows that …’) can be interpreted to mean ‘as a result of discussing, people find out …’.

More important than understanding what nominalizations are is knowing the consequences of using them. Nominalizations often lead to condensation and encapsulation, in which there is a reduction and/or loss of various types of information, e.g., the doer of the process (and details of the process) is unknown, as in the use of ambivalence or discussion where it’s unknown who is ambivalent (the word people can be used in the more congruent version, but that is still vague) or what it entails. These types have been the focus of much work first in CritLing and then in CDA/CDS since nominalization can lead to avoiding naming the doer of a process (which is also the case in the use of passive constructions). Nominalization can also make it more difficult for the reader or hearer to disagree with a point because it is expressed as if it were an objective, factual description of the event/activity, as in, e.g., David’s failure to apply common sense led to… (in the sense of ‘it is a fact that David failed to apply common sense and therefore…’, even if it is only a claim on the part of the speaker or writer). However, in some cases, it is almost impossible to give a complete, congruent rewording “which adequately reflects the meanings encoded in the metaphorical wording” and this “opens a potentially bottomless pit of possible rewordings” (Thompson, 1984: 177). In other words, the concept of grammatical metaphor “is essential in explaining how the language works, but it is a dangerously powerful” concept (1984: 177). Despite that issue, it has become an important part of CDA/CDS analyses, although they don’t often use the term ‘grammatical metaphor’ nor do they recognize the importance Halliday gave to it.

2.3.6 ‘Appliable Linguistics’ and Social Action

Halliday called his approach to language part of ‘appliable linguistics’, by which he meant that, given the interdisciplinary and functional orientation of his model of language, it could be applied to a variety of domains and uses. In IFG, Halliday said that “a theory is a means of action, and there are very different kinds of action one may want to take involving language” (1994a: ix; see also 1969). More generally, given his commitment to social justice from his earliest days as a student with a commitment to Marxism, he wanted to create a linguistic theory that would be “socially accountable” in two senses: “that it put language in its social context, and at the same time it put linguistics in its social context, as a mode of intervention in critical social practices” (Halliday, 1993: 73). In other words, he envisioned linguistics as an “ideologically committed form of social action” (1985b: 5; see also 2015) and he imagined the possibilities from a functional and social perspective of making the world better. Thus, he devoted his career, as have other systemicists, to the development of an approach that could be used to address productively many different human and societal concerns, most notably his “concern with language in relation to the process and experience of education” (1978: 5).

For instance, very early on, he searched for new ways of teaching language (see Halliday, McIntosh, & Strevens, 1964) in order to improve literacy rates, including programs for primary and secondary school students. Halliday (1973b) argued for talking about reading readiness in social-functional terms, taking spoken language seriously in education, reappraising the significance of both reading and writing and seeing them in the context of the learning of language as a whole. He also worked with teachers at all levels (primary, secondary, tertiary) in various aspects of language teaching and learning, including developing one’s native language (mother tongue), studying foreign languages, and learning about the nature of language (1978: 5). He eventually became “convinced of the importance of the sociolinguistic background to everything that goes on in the classroom”, including the linguistic patterns found in the family, neighborhood, school and community, as well as the child’s specific experiences of language from infancy (1978: 5). He argued that we should build educational contexts on what children already know (including what they know about language), by starting from what is common knowledge to all, thereby creating continuity between the culture children come from and that of school. He also explored the sociolinguistic aspects of mathematical education in light of, e.g., the relation between mathematics and natural language and the issue of levels of technicality; these were accompanied by a “‘checklist’ of possible sources of linguistic difficulty facing a learner of mathematics” (1978: 204). In all of this he greatly influenced many different scholars in SFL and in other approaches.

Halliday also said that knowledge of the standard language should not be a precondition of success by school children, but learning the standard language could be “a natural consequence of the process of learning to read and write” (1978: 210). He argued for “a milieu that is child-centred but in which the teacher functions as a guide, creating structure with the help of the students themselves” (1978: 210). His argument was that our societies need to change our cultural attitudes towards language (and education, learning and teaching)—we need to be “a lot more serious about language, and at the same time a great deal less solemn about it” (1978: 210). And he also suggested topics, each one accompanied by points to consider, that could be explored by “linguists of all ages” (1978: 211), such as teachers in their study groups, pupils in class or students in their families, regarding: language development in young children; language and socialization; a neighbourhood language profile ; language in the life of the individual; language and the context of situation; language and institutions (e.g., family, school, factory); language attitudes (see Halliday et al., 1964) (1978: 211–235); and so forth.

The applications of his approach by himself and others are too many to list here, but we can give an overall sense of their breadth by saying that they ranged from “research applications of a theoretical nature to quite practical tasks where problems have to be solved” (1994a: xxix) in a number of domains, e.g., theoretical, historical, developmental, textual, variational, aesthetic, evolutionary, societal/cultural, educational, medical, communicational, computational, and legal. “Underlying all these very varied applications is a common focus on the analysis of authentic products of social interaction (texts), considered in relation to the cultural and social context in which they are negotiated” (Eggins, 2004: 2) and the most generalizable application of SFL is “to understand the quality of texts: why a text means what it does, and why it is valued as it is” (Halliday, 1994a: xix) and why it is or is not effective. In the 1990s through to the current decade, SFL has been “increasingly recognized as a very useful descriptive and interpretive framework for viewing language as a strategic, meaning-making resource” (Eggins, 2004: 2) and has been used by systemicists to say ‘sensible and useful’ things about texts not only in language education and child language development, but also in the study of computational linguistics, media discourse, casual conversation, history, administrative language, among others (see Eggins, 2004: 2 for specific references). Some have adopted SFL as a whole, with many aspects agreed upon and only a small list of others open to debate (e.g., genre, register, syntax), leading some to say that it is a ‘closed system’ in the sense of Chomsky’s theory. And, yet, many of the ideas that Halliday and the systemicists incorporated into SFL have inspired scholars in a variety of other intellectual and (inter)disciplinary areas.
As a result, others, like ourselves, have found it ‘useful to think with’ and have been inspired by certain SFL/SFG ideas, such as: language (system) as resource and a set of options, meaning as central, meaning potential, meaning making, language system and language use as related to each other, language as social semiotic , metafunctions (especially the interpersonal and textual), the social nature of grammar, grammatical metaphor, sociocultural significance of meaning choices, grammar and lexicon as linked with each other, etc. As a linguistic and functional approach to meaning in text,

systemic linguistics has (or has had) common ground with text grammarians and discourse analysts from a range of perspectives […] points of connection with research in areas such as sociolinguistics […] and the ethnography of speaking […] exploring ways in which social and cultural context impacts on language use. As a semiotic approach, it has common ground with semiotic theoreticians and those, following Fairclough, working in what has become known as the Critical Discourse Analysis (CDA) approach (Eggins, 2004: 21)

In 2008, The Halliday Centre for Intelligent Applications of Language Studies was launched at the City University of Hong Kong (directed by Jonathan Webster, see Webster, 2015a) “to apply our knowledge about how language works” in order to “construe our experience and enact social relationship; apply linguistic insight in such areas as education, and computer processing of language, i.e. practicing ‘appliable linguistics’”. SFL/SFG as a theory of language is practiced world-wide, particularly in language education, where it is associated with work in applied linguistics, linguistics, educational linguistics, education, second language acquisition, corpus linguistics, computational linguistics, natural language generation and processing, and so forth. SFL has led to, e.g., the International Systemic Functional Linguistics Association (ISFLA), which puts on an annual conference that rotates between Australia, Asia, the Americas and Europe, has two highly informative websites (with many pages of information and lists of publications, conferences, publishers, software, etc.), and has published many volumes of collected or selected papers. And there are many other resources, too numerous to list here.

We will now turn to the discussion of CritLing.

2.4 Critical Linguistics (CritLing)

2.4.1 Introduction

Critical Linguistics (CritLing) began in the (mid-) 1970s and continued into the 1980s and 1990s (and beyond); the Critical Linguists were also called the East Anglians, since they were at the University of East Anglia (UEA) in Great Britain during a very formative period of their joint work. They were a group of “socially directed” (Fowler, 1991a: 89), “and politically aware” (Wodak & Chilton, 2005: xi) scholars, who, through very productive collaboration, proposed and developed “systematic ways of analyzing the political and social import of text” (2005: xi). They drew their “theoretical support from the intellectual interests and social engagements of a group of co-writers/co-workers (David Aers, Roger Fowler, Bob [aka Robert] Hodge, Gunther Kress, Tony Trew) whose disciplinary interests ranged from Literary Criticism, Sociology, Politics, Philosophy to Linguistics, and who all had a theoretical and practical commitment to different forms of Marxism” (Kress, 1991: 166). During that time, being critical, Marxist and activist was very much in the air in the UK, intellectually and politically, and thus they were dealing with the issue of what linguistics could offer scholars in the humanities and social sciences that would be meaningful in that context.

While members of the group developed many of their ideas in papers with each other (and a few others) that were published in the local UEA Papers in Linguistics, they are best known for two major books which together have been called the CritLing manifesto: first, Language and Control (Fowler, Hodge, Kress, & Trew, 1979a), chapters of which were co-authored by one or more of them (and in one case, Gareth Jones); this book is known as the foundational text of CritLing . And, secondly, Language as Ideology (Kress & Hodge, 1979), which was seen by some as a more advanced version of CritLing , given its more interdisciplinary, programmatic and theoretically explicit account with sharper focus on ideology (Kress, 1991: 166, 172). It was also characterized (Hodge & Kress, 1993: 159) as “a handbook” for CritLing 9, and “as the first comprehensive account of the theory of language that underpinned the critical discourse enterprise” (1993: ix), i.e., CDA. During the 1980s and early-mid 1990s, there were modifications and extensions of CritLing by Fowler (in the UK) and especially by Hodge and Kress (in Australia and the UK in the case of Kress), and others. The most important other CritLing publications, from the point of view of their influence on and impetus towards CDA, were Kress 1985a and Hodge and Kress 1988 (on SocSem); we will discuss them below, after our general discussion of CritLing .

2.4.2 CritLing and Other Approaches to Linguistics

According to Kress (1990: 88), CritLing had two aims: (1) “to use the tools provided by linguistic theories […] to uncover the structures of power in texts, and (2) to make the discipline of linguistics itself more accountable, more responsible, and more responsive to questions of social equity”. For their linguistic approach, CritLing borrowed many ideas from many linguists, although Halliday’s approach, which was “the most fully developed” (Fowler, Hodge, Kress, & Trew, 1979b: 3) of the functionalist theories of language, was “the major inspiration behind the model” of CritLing (Fowler, 1991a: 91) since it used “chiefly concepts and methods associated with the ‘systemic-functional’ linguistics developed by M.A.K. Halliday” (Kress, 1990: 89). However, it needs to be understood that, on the one hand, there were some of Halliday’s (and systemicists’) ideas that they rejected explicitly or implicitly, and on the other hand, they combined Halliday’s approach with ideas from other approaches, so that it became an “eclectic” (Fowler, 1991b: 243) “composite of a number of sources” (Fowler et al., 1979b: 3), as we will see below. As we know from the discussion above, Halliday was actively working on his ideas and publishing them in the 1970s through the 1990s, including his contribution to the first issue of the UEA papers (1976b); as well, Kress had studied under Halliday and finished editing Halliday’s book on function and structure (Halliday, 1976a) after his arrival at UEA. They were also attracted by the fact that Halliday insisted that there are “strong and pervasive connections between language structure and social structure” (Fowler & Kress, 1979: 185) and that he “propose[d] that the structures of language have developed in response to the communicative needs that language is called upon to serve” (Fowler, 1991a: 90). 
For them, this “implied a demand for a thorough-going account of social structure in order to make sense of linguistic structurings” (Kress, 1991: 163), and therefore they rejected “theorizing ‘language’ and ‘society’ as separate entities” (Fowler, 1991a: 92), as was widespread at the time (and to this day) in much of ‘mainstream’ linguistics.

They took text “as the relevant linguistic unit, both in theory and in description/analysis” (Kress, 1990: 88). They selected and adapted certain parts of Halliday’s model for their own use, “drew largely on categories from sentence and below-sentence grammar” and used many Hallidayan concepts, such as metafunctions, transitivity types, modality, theme, and so forth. They worked on, e.g., the many lexico-grammatical resources of ideational and interpersonal meaning and cohesive and other devices for textual structuring. They ultimately “worked to make the model less ‘narrowly linguistic’ and more integrated with general theories of society and ideology” (Fowler, 1991a: 91; see also Kress, 1985a; Threadgold, 1986) and also better suited to the analysis of text. They were also influenced by Chomsky’s early work in transformational(-generative) grammar (TG, 1957, 1965; see also Smith & Wilson, 1979), especially as “reinterpreted in the direction of the earlier formulations of Zellig Harris” (Kress, 1991: 166), but they were careful to say that they did not agree with post-1965 Chomskyan theory. On the one hand, they were in accord with Halliday that there were many flaws in Chomsky’s work; indeed, they used the term ‘autonomous linguistics’ (a negative evaluation) to group Chomsky with the earlier American descriptivists (influenced by Bloomfield) and the European structuralists, since they all separated language from society and culture. On the other hand, they endorsed Chomsky’s acceptance of Harris’ extension of linguistics to the sentence (i.e., not just sound and word structure), but they disagreed with his claim that the sentence is an abstract structure and the highest unit of language, since they analyzed actual, concrete language use and included discourse and text as an essential part of the scope of linguistics.
And they also argued that what seems to be the same sentence could have different meanings in different contexts and thus that context and contextualization also had to be treated in depth in any linguistic approach.

As a result, although they accepted Chomsky’s claim that an active sentence and its corresponding passive are (closely) related to each other through transformations, they used the notion in a way that Trew (1979a: 94; 1979b: 117; also Kress & Trew, 1978a, 1978b) characterized as a “departure […] from the more familiar notion of transformation”, since they were in total disagreement with Chomsky’s further claim that the active and passive have the same meaning. For example, Trew analyzed two newspaper headlines, Rioting Blacks Shot Dead by Police [as ANC Leaders Meet] vs. Police Shoot 11 Dead in Salisbury Riot (1979a: 94), as different in meaning and thus in need of an in-depth analysis according to the character of the discourse, its context(s), purpose(s), ideology, and so forth. That is, they (van Leeuwen, 2006: 292–293):

took the fundamental step of interpreting grammatical categories as potential traces of ideological mystification, and broke with a tradition in which ways of saying the same thing were seen as mere stylistic variants, or as conventional and meaningless indicators of group membership categories such as class, professional role, and so on.

They also used Chomsky’s concept of surface vs. deep (or underlying) structure (Chomsky, 1965; see also 1970, 1971, 1972) as inspiration for their technique of starting from the surface of the text and attempting to “recover the forms which were the starting point of the utterance” (Kress & Hodge, 1979: 17) in their quest to uncover hidden meaning and ideologies and provide demystification. This was not at all what Chomsky meant by the relationship between deep and surface structure (since he didn’t include in that relationship hidden meaning, ideology or demystification), and he and his followers objected strongly to what they saw as a redefinition of his terminology (see Kress & Trew, 1978a, 1978b; Fowler, 1972, 1977).

The critical linguists referenced as well Chomsky’s writings on political and social issues and his condemnation of the war in Vietnam (e.g., Chomsky, 1969). Given their commitment to social change (discussed below), they found the political side of Chomsky’s thinking attractive, but they disagreed with his rigid separation of linguistic theory from political theory and his refusal to see any relationship or connection between his political writings and his linguistic work (see Caldas-Coulthard & Coulthard, 1996: xi). In the long run, the critical linguists continued to use the term transformation (and deep structure, although less often) according to their own definition.

2.4.3 Definition of CritLing; Fowler et al. (1979a): ‘Language and Control’, and Other Work

What most clearly differentiated CritLing from other linguistic approaches of that time was announced boldly and unequivocally by the use of the term ‘critical’ in the name of their approach. The term ‘critical linguistics’ was “quite self-consciously adapted” (Kress, 1990: 88) from Critical Sociology (the title of Connerton, 1976; see now Cook, 1987), and was used in the title of “the synoptic and programmatical concluding chapter” (Fowler, 1991a: 89), their code of practice, by Fowler and Kress (1979), titled “Critical Linguistics”, of their co-authored book (Fowler, Hodge, Kress & Trew, 1979a, 1979b: Language and Control10). It was also used in the title of Part IV, “Towards a Critical Linguistics”, of Chilton (1985a). Critical (and critique) were associated with a set of assumptions that are important in understanding their point of view, since many of them were different from many other approaches at that time and were carried into CDA. It meant an approach to linguistics “which is aware of the assumptions on which it is based and prepared to reflect critically about the underlying causes of the phenomena it studies, and the nature of the society whose language it is” (Fowler & Kress, 1979: 186). It critiqued both existent social forms and the discipline of linguistics, since the latter was dominated by asocial, apolitical approaches in which language was autonomous from society, including European and American linguistics (but not, as we saw above, SFL). It also positioned CritLing in the context of its more general socio-philosophical counterpart, critical theory, e.g., contemporary (neo)Marxist, post-structuralist, post-modernist and deconstructionist theories of the 1970s and the 1980s, as well as the “cross-fertilization between linguistics and the social sciences […] a remarkably interdisciplinary and international project” (Wodak & Chilton, 2005: xi).
As a result, CritLing was responsive to the “major questions put by post-modernist writing, especially in its post-structuralist mode, without, at the same time, adopting headlong many of its major tenets” (Kress, 1991: 171). Thus, their goal “was to provide an illuminating account of verbal language as a social phenomenon, especially for use of critical theorists in a range of disciplines—history, literary and media studies, education, sociology—who wanted to explore social and political forces and processes as they act through and on texts and forms of discourse” (Kress & Hodge, 1979: vii), and “to relate forms of thought to the existence of the producers of those thoughts, as individuals living in a material world under specific conditions in specific societies at given times” (1979: ix).

The critical linguists stated that their approach was motivated by the fact that “so much of social meaning is implicit” (Kress, 1990: 196), even when it is conveyed by language, so that what is needed is the activity of unveiling, or demystifying, a text’s (hidden) meaning. CritLing was also devised in response to problems of fixed, invisible ideology permeating language. As Kress said, they were ultimately critiquing “the structures and goals of a society which has impregnated its language with social meanings many of which we regard as negative, dehumanizing and restrictive in their effects”. They also aimed at developing a social, and socially directed, application of linguistic analysis which would expose the “strong and pervasive connections between language structure and social structure” (Fowler & Kress, 1979: 185), including structures of power. They sought “to display to consciousness the patterns of belief and value which are encoded in the language—and which are below the threshold of notice for anyone who accepts the discourse as ‘natural’” (Fowler, 1991b: 67). In this respect, they endorsed the ideas of the American linguists/anthropologists Edward Sapir (1921) and Benjamin Lee Whorf (1956) about ‘linguistic relativity’ (in ways similar to Halliday), i.e., that the language we use “embodies specific views—or ‘theories’—of reality” (Fowler et al., 1979b: 1). They accepted the ‘weak’ (non-deterministic) version of Whorf’s view as ‘influence’ of language upon thought and agreed that “syntax can code a world-view without any conscious choice on the part of a writer or speaker” (Fowler & Kress, 1979: 185), but they rejected the strong/extreme position of linguistic determinism also attributed to Whorf. They went on to argue that “world-view comes to language-users from their relation to the institutions and the socio-economic structure of their society. It is facilitated and confirmed for them by a language use which has society’s ideological impress”.
Thus, they argued for an ideological point of view in research, “since any aspect of linguistic structure, whether phonological, syntactic, lexical, semantic, pragmatic or textual, can carry ideological significance” (Fowler, 1991b: 67) and society (including culture) is neither innocent nor neutral nor natural.

In doing their version of social linguistics and their social theory of the functioning of language, they rejected “the dichotomy between the grammatical structures of a language and the ways in which these are employed in actual instances of communication” (Thompson, 1984: 118). Thus, they were pleased to see that sociolinguists (and others) were breaking with the structuralist and generativist tradition that regarded language as monolithic and were documenting various types of sociolinguistic variation. However, they were dismayed by conventional, or “correlational”, sociolinguistics (as exemplified by Labov, 1972a, 1972b; Trudgill, 1974), which studied language and society as two independent phenomena that can be separately described and quantified, “so that one is forced to talk of ‘links between the two’, whereas for us language is an integral part of social process” (Fowler & Kress, 1979: 189). As such, language “serves to confirm and consolidate the organizations that shape it, being used to manipulate people, to establish and maintain them in economically convenient roles and statuses, to maintain the power of state agencies, corporations and other institutions” (1979: 190). Thus, terms like correlation are “too weak an account of the relationship. Sociolinguistic variation is to be regarded as functional rather than merely fortuitous” (Fowler, 1991a: 92). Social groupings influence linguistic behavior, which in turn influences and manipulates (unconsciously, automatically) non-linguistic behavior, since “variation in types of discourse is inseparable from social and economic factors” (Fowler et al., 1979b: 1). As a result, “linguistic variations reflect and, what is more, actively express the structural social differences which give rise to them. They express social meanings” and should be studied in this light (Fowler et al., 1979b: 1).

The critical linguists also rejected the sociolinguists’ claim that their description of linguistic variation of various types and its circumstances was done “objectively and scientifically, without evaluation of the phenomena described” (Fowler & Kress, 1979: 192). As an example, they stated that Labov’s notion (1966, 1972a) of upward social mobility as the reason why certain individuals in a stratified class system use certain linguistic elements, “should be regarded not as a generally applicable concept in sociological theory but as a product of the academic ideology of a particular society” (Fowler & Kress, 1979: 192), i.e., the US. They also said that the idea that forms of language are “freely chosen” (1979: 194) hides the fact that they are selected not just because they are appropriate in the given situation, but also because that “appropriateness is established by socio-economic factors outside the control of the language-user” and which are often unconscious, learned through socialization and sanctioned by the “social norms” established by those in power. In addition, while they agreed that language is an instrument of social communication, they insisted that language usage is “a part of social process. It constitutes social meanings and thus social practices” (Fowler et al., 1979b: 1). What speakers say “is interconnected with the life of individuals in social formations” (Chilton, 1985b: xv), and, since language is “one of the mechanisms through which society reproduces and regulates itself” (Kress, 1991: 93), it is also “an intervention in social processes. Critical linguistics invites a view of language that makes ‘intervention’ a general principle: language is a social practice, one of the mechanisms through which society reproduces and regulates itself. Hence, language is ‘in’ rather than ‘alongside’ society” (Fowler, 1991a: 93) and society is ‘in’ language.

The critical linguists set out to “describe the social, interpersonal and ideological functions” (Fowler et al., 1979b: 3) of many linguistic constructions in a wide variety of actual examples of “ordinary texts”, including the media and popular culture, as well as literary texts (of “high” or “low” culture), and so forth11 (see Fowler, 1981: 24–45). For them, all of these “show how linguistic structures are used to explore, systematize, transform, and often obscure, analyses of reality; to regulate the ideas and behavior of others; to classify and rank people, events and objects; to assert institutional or personal status” (Fowler et al., 1979b: 3). In other words, as stressed in the title of the book (Language and Control), language, language use and language structure (patterning) are employed for control or limitation of behavior, thought, belief, ideologies, etc. An early, much cited and admired, example of work in this area was Trew’s comparison (1979a, mentioned above; see also Kress & Trew, 1978a, 1978b) of headlines from different British newspapers that covered the same event of civil disorder in what is now Zimbabwe (formerly Rhodesia). Trew argued that the choice of certain linguistic devices (e.g. the passive rather than the active) could affect the meaning and force of the text as a whole. Therefore, linguistic analysis could expose the potential ideological significance of using constructions in which agents of the verbal process are explicitly stated as subjects of the verb or downgraded in a by-phrase or not stated at all.
He discussed in detail (1979a: 94) the implications of the newspaper headline: ‘RIOTING BLACKS SHOT DEAD BY POLICE AS ANC12 LEADERS MEET’.13 Here, ‘Blacks’ are classified as rioters, are put at the beginning of the sentence and are thus salient, while the actions of the police, who did the killing, are de-emphasized by the passive construction; in this case the police could even be elided, as in (the possible but unattested) ‘rioting blacks shot dead as ANC leaders meet’, leading to what many called an agentless passive. The same effect could also be achieved by the use of a grammatical metaphor, i.e., a nominalization (‘the shooting of rioting blacks’). By contrast, the structure of the headline about the same event in another paper, ‘Police Shoot 11 Dead in Salisbury Riot’, effectively does the reverse: that is, the police are prominent, while the ‘11 Dead’, a phrase expanded in the text to ‘Eleven African Demonstrators’, are presented as victims. In sum, Trew suggests that linguistic structure is both an effect of, and contributes to, the different slant in the two newspapers and thus can be aligned with their different ideological orientations. He (1979a, 1979b) also did “some particularly fruitful work on ‘discourse in progress’ in newspapers—the transformation of materials from news agencies and other sources into news reports, and the transformations a story undergoes from one report to another, from reports to in-depth analyses to editorials, over a period of time” (Fairclough, 1995: 26) and also looked at the transformation in real time of the development (rewriting, recasting, updating, etc.) of a story over a period of days.

All of these findings were more than reminiscent for the critical linguists of George Orwell’s novel 1984 (Orwell, 1949; see also Orwell, 1968). In fact, Hodge and Fowler (1979) is a study of “Orwellian linguistics” and emphasizes Orwell’s recognition of “some of the connections between language, ideas and social structure” (Fowler et al., 1979b: 2). Indeed, they said that Orwell’s novel had an impact on their general consciousness about the language of politics (Hodge & Fowler, 1979: 6), how thought could be controlled or limited through language, and how Orwell’s concepts of ‘doublethink’, ‘newspeak’, and ‘duckspeak’ “rest on recognizable principles of language-patterning” (Hodge & Fowler, 1979: 2, 9). In fact, Paul Chilton’s edited book, Language and the Nuclear Arms Debate: Nukespeak Today (1985a, 1985b; see also 1988, 1996), used the term ‘Nukespeak’ in its title as a punning allusion to ‘newspeak’; it included contributions by the critical linguists and other scholars with similar concerns from the UK, Australia, and western Europe about nuclear (dis)armament14.

2.4.4 Kress and Hodge 1979: ‘Language as Ideology’

As said above, Kress and Hodge, who had participated in Fowler et al. 1979a, also co-authored Language as ideology (Kress & Hodge 1979 [1st Edn.]; Hodge & Kress, 1993 [2nd Edn.]). They underscored the social nature of language, its status as a social fact; they also did a more thorough exposition of CritLing than was in the other part of the CritLing ‘manifesto’ (Fowler et al., 1979a, 1979b); and in the context of their discussion of ideology in language, they explored their debt to Sapir (1921), Whorf (1956), Fillmore (1968) and Halliday (Sect. 2.3), and others, as well as their debt to and differences with Chomsky. Their book, as they said in the Preface, aimed “to provide an illuminating account of verbal language as a social phenomenon, especially for use of critical theorists in a range of disciplines—history, literary and media studies, education, sociology—who wanted to explore social and political forces and processes as they act through and on texts and forms of discourse” (Kress & Hodge, 1979: vii). Therefore, the task they set for themselves and for linguistics was “to relate forms of thought to the existence of the producers of those thoughts, as individuals living in a material world under specific conditions in specific societies at given times […] linguistics had to provide the theoretical and methodological framework for the analysis of materials studied by all kinds of intellectual and cultural historian, indeed, by everyone concerned with culture and thought” (1979: ix). In short, they set out to show that “the requisite theory must encompass the study of syntax and the basic rule systems of the language along with the social uses of language, that is, the relations between language and society and between language and mind, in a single integrated enterprise” (1979: 2). 
In conformity with CritLing thinking, they insisted that their “conception of social reality includes antagonisms and conflicts within and between groups in a class society […] linguistics, then, is an exceptionally subtle instrument for the analysis of consciousness and its ideological bases, the ‘true shapes […] of invisible and bodiless thought’” (1979: 13) and thus, it “should ‘be an instrument of discovery, clarification and insight’, to make language itself speak” (1979: 14; with reference to Whorf, 1956).

As a result, they insisted, no linguistic form is neutral: not only does the (choice of a given) materiality of language have meaning, but also all representation is mediated. Thus a central component of their program was that language reflects, (re)produces, and constructs ideology, and linguistics allows the analyst to explore the value systems and sets of beliefs that reside in texts, since language use has “society’s ideological impress” and “ideology is linguistically mediated and habitual” for the user (1979: 185). For them, ideology is a systematic body of ideas “organized from a particular point of view” (1979: 6), which underlies our “everyday perceptions of the world (whether social or ‘natural’)” (Trew, 1979a: 97), including taken-for-granted assumptions, beliefs and value-systems. And since language is often overlooked or taken for granted, “the differences in constructs may seem to be natural, universal and unalterable when in reality they may be produced by a specific form of social organization” (Belsey, 1980: 42). What interested the critical linguists most, therefore, was to bring ideology, which is hidden under “the habitualisation of discourse, to the surface for inspection” (Fowler, 1991a: 89), particularly in the context of social formations (Fowler, 1987: vol. 2: 482; see also Kress & Hodge, 1979), in order to shed light not only on social and political processes (Fowler, 1991a: 89) but also on the “ways in which people order and justify their lives” (1991a: 92), and to challenge and change them. This type of consciousness-raising was also evident (Chilton, 1985a, 1985b) in Nukespeak, since it was intended to expose the obfuscation and dissimulation surrounding the discourse on nuclear arms, and also in (Threadgold, Grosz, Kress, & Halliday, 1986; Hodge & Kress, 1988; Thibault, 1989).

At the same time, Kress and Hodge stressed that “language is ideological in another, more political, sense […] it involves systematic distortion” (1979: 6), i.e., the abuse of language by those with social and political power. Indeed, they underscored that inequality of power is a prominent facet of social structure which influences linguistic structure and use, with the result that “language not only encodes power differences but is also instrumental in enforcing them” (Fowler & Kress, 1979: 195), and, in some cases, in creating them. As a result, ideology is pervasive and thus all texts in all types of contexts should be put under the CritLing linguistic lens. Kress also published an “extended account of the operation of ideological structures in the language uses of the media, in print, radio and television” (1991: 169); and in other work of the same time (cf. Kress & Threadgold, 1988), he gave more attention than in earlier work in CritLing to productive and interpretative practices associated with (types of) texts. Of particular interest also was the work of Theo van Leeuwen, who was to co-author important work later with Kress (see Sect. 2.6), and who at this time did meticulously detailed analyses (see 1983, 1985a) of the interrelation of sound, language, and other aspects of social practices in the speech of radio announcers on different radio stations, showing how the station policies, their audience, and their approaches to news values, were realized in certain phonological facets of their speech (e.g., pitch, rhythm, intensity). In the same vein, he also worked on images and the “ideological effects of editing of videotapes in television news programmes (1985b, 1986)”, and on the socio-semantics of music (1987) and other semiotic modes (see discussion in Sect. 3.​9). All of this would be part of his later work with Kress on SocSem (see Sects. 2.6 and 2.7).

2.4.5 CritLing: Interdisciplinarity and a ‘Useable’ Approach

Given their beliefs and goals, the critical linguists went further against the ‘mainstream’ of linguistics by espousing an inter/multidisciplinary approach to language and text, with input from other domains to linguistics and output from linguistics to those same and different domains; in other words, they engaged in what Kress (1991: 170) called “intellectual trade”. As said above, they were very interested in interdisciplinary connections with other fields of knowledge than those related to texts or linguistics, and as a result, they embraced insights, analytical concepts and terms from a variety of different sources. They insisted that “linguistics had to provide the theoretical and methodological framework for the analysis of materials studied by all kinds of intellectual and cultural historians, indeed, by everyone concerned with culture and thought” (Kress & Hodge, 1979: ix) and thus linguistics could be seen as a branch of anthropology, sociology or psychology, and also as an instrument for the study of culture.

While their work in the late 1970s was strongly critical of “the dominant currents” in all the disciplines they dealt with (Fowler et al., 1979b: 5), by the late 1980s—in the opinion of Kress (1991: 171)—the “critical aspects of social and philosophical theorising” also led them to question some of the bases for linguistics itself, such as the notion of ‘speaker’ as an unproblematic construct. They were influenced by “the increasing focus on the socially and linguistically formed social ‘subject’ who is active in the construction and reconstruction of text. This in turn has led to questions around the concept of the language system as a stable and unproblematic entity in most linguistic theories, including systemic-functional linguistics” (1991: 171).

As part of their interdisciplinary effort and their desire to reach linguists and scholars in other domains, the critical linguists also insisted that their approach should lead to practical analysis, in the sense of a useable mode of analysis (but not an analytical routine), that is “simple and consistent enough to be applied by non-linguists in a ‘do-it-yourself’ critical linguistics of texts” (Fowler et al., 1979b: 4). Thus, their aim was to create a wide-ranging and interdisciplinary approach for those interested in linguistics, sociolinguistics, sociology, political theory, social theory, and so forth. In this effort, Fowler and Kress (1979) included “an annotated checklist of linguistic features which have frequently proved revealing” (1979: 198), while cautioning that “there is no predictable one-to-one association between any one linguistic form and any specific social meaning”. Indeed, they warned that to lift “components of a discourse out of their context and consider them in isolation would be the very antithesis” (1979: 198) of their approach, which relates different features and processes to each other and insists on the “multi-functional use of linguistic form and […] emphasis on the systematic nature of selections”. For their part, Hodge and Kress included two appendices in the second edition of their book in an attempt to help the reader: (1) “Key concepts in a theory of social semiotics” (1988: 261–268), and (2) an “Annotated bibliography” of a diverse “set of readings from many different disciplines” that were influential on their thinking (1988: 269–272). These were invaluable then, and they still are, as a way of showing their originality, their positioning in linguistics and critical theory, and their interdisciplinarity.

2.4.6 Kress 1989: ‘Linguistic Processes in Sociocultural Practice’

While there were many works by the critical linguists in the 1980s and 1990s that were important and influential, there are certain works that have been singled out by one or more of the prominent originators and current practitioners of CDA15. Notable among those16 is Kress (1989, first edition in 1985b), Linguistic processes in sociocultural practice, especially for his discussion of educational texts and “spoken dialogue, including interview” (Fairclough & Wodak, 1997: 264); and the shifts he makes in CritLing can be seen as an affirmation of, and response to, critiques of his own earlier work as well as the need for more work on CritLing, some of which eventually led to CDA. Kress’s book17 presents “a theory in which all aspects of linguistic activity appear as social practice, and in which all linguistic forms and processes are treated as and accounted for in terms of social forms and social processes” (1989: 1). This includes the social practice of interviews, on which we will focus here. Kress had already participated in research in this area in (Fowler et al., 1979a), where they were treated as sociolinguistic mechanisms of control of subordinate groups by dominant ones, and “not simply a random example of the role of language in social practice. Their embodiment of inequality of power and their use as an instrument of control make them a typical example” (Fowler et al., 1979b: 2). Kress and Fowler (1979) and Hodge, Kress and Jones (1979) related the linguistic features of interviews to the functions and meanings of the social situation and the purposes of the participants. Kress and Fowler (1979) defined an interview as a type of conversation, a rather simple and clear genre in which the means of expression are highly overt, strict and (socially) legitimized; it is socially structured face-to-face discourse that “exhibits an inequality, a skew in the distribution of power” (1979: 63).
And they showed how language reflects this inequality through an in-depth analysis of, e.g., on the one hand, type of interview, status of the participants, issue of agency, conflicting ideologies, etc., and on the other hand, passives, nominalization, pronouns, modality, questions, syntax, transitivity, verb types, etc. In their opinion, “these interviews are only a specialized, institutionally validated, variety of the interactions revolving around power differences which go on all the time in our society” (1979: 80).

The major difference between Kress’s account (1989) and these earlier studies is his use of “discourse” as defined in the work of Michel Foucault (1972), and in particular the idea that “institutions and social groupings have specific meanings and values which are articulated in language in specific ways” (Kress, 1989: 6). As a consequence, “discourses are systematically-organised sets of statements which give expression to the meanings and values of an institution. Beyond that, they define, describe, and delimit what it is possible to say and not possible to say […] with respect to the area of concerns of that institution, whether marginally or centrally” (1989: 6–7). For Foucault, Kress said, discourse also “organises and gives structure to the manner in which a particular topic, object, process is to be talked about. In that it provides descriptions, rules, permissions and prohibitions of social and individual actions” (1989: 7) from the point of view of the meanings and values of a specific institution. Citing types of discourse (such as educational, nationalist, sexist, feminist, patriarchal, romantic, Christian, conservative, capitalist, medical, etc.), Kress said that they “do not exist in isolation but within a larger system of sometimes opposing, contradictory, contending, or merely different discourses”, which result in dynamic relations, shifts, movement, mismatches, disjunctions, discontinuities, and so forth, especially when they collide in a particular context. Discourses attempt to reconcile these differences “by making that which is social seem natural and that which is problematic seem obvious […] unchallengeable […] ‘common sense’” (1989: 10), with no alternatives. As a result, speaking/listening and writing/reading are often determined by one’s place in (intersecting sets of) discourses.

He also insisted that we are all members of social groups in which several different institutions and their discourses operate and intersect and thus we are subject to “social group discursive multiplicity, contestation, and difference” (1989: 11). The discursive history of each member of a social group may be the same as, partially similar to, or quite different from, others in the same social place, and thus there is “social determination of an individual’s knowledge of language on the one hand, and individual difference and differing position vis-à-vis the linguistic system on the other” (1989: 11–12). This is particularly apparent in dialogues such as interviews, which “display discursive difference at every point” and “the structure of difference particularly clearly” (1989: 12, 14), since by definition they are built on differences around power and knowledge. Kress (1989: 23) also makes the point that the “forms and functions of the social occasion and the purposes of the participants clearly give form and meaning” to the interview as a genre in spoken language. Its interactional nature is foregrounded and a number of formal features structure the interaction, e.g., turn-taking is directed by the interviewer, who has power and control; the text of the interview is overtly motivated by difference; and “the textual strategies are direction and questioning, on the part of the interviewer, and response, information, and definition, on the part of the interviewee” (1989: 23). Interviews are, thus, highly structured and rule governed. 
Kress also emphasizes that while the forms and meanings of texts are determined by discourses and genres, the sources of which are social/cultural, they are at the same time the product of individual speakers who are located in “a network of social relations, in specific places in a social structure” (1989: 5) with their specific modes and forms of speaking, practices, values, meanings, demands, prohibitions and permissions, as well as “the kinds of texts that have currency and prominence in that community, and the forms, contents and functions of those texts” (1989: 6). Thus, in an interview, they are themselves “attempting to make sense of the competing, contradictory demands and claims of differing discourses” (1989: 31). As a result, “the discursive differences are negotiated, governed by differences in power, which are themselves in part encoded in and determined by discourse and by genre” (1989: 32). The resolution of these discourses is the source of dialogue, with oneself or with another, and leads to texts that are not the work of any one person (Bakhtin, 1986) and are the sites of struggle and linguistic and cultural change.

For Kress, language is entwined in social power, including distributions of power and relations of difference, in a number of ways: by indexing it, expressing it, heightening it, challenging it, subverting it, and even altering it in some cases. Power differences and their effects are often conceptualized through textual and visual metaphors using space and distance, and language often articulates a finely nuanced means of talking about social hierarchical structures, both as a static system and as a dynamic process, as often happens in interviews (1989: 53), since they are social occasions and genres that are based on power differences. He examines ‘keeping one’s distance’, both in the spatial arrangement of interviewer and interviewee and through linguistic usage, such as pronouns, names, modes of behavior, commands, and modality, all of which help in the assignment of, e.g., superior knowledge and more power to the interviewer vs. inferior knowledge and less power to the interviewee. This can be mitigated by politeness conventions on the part of the interviewer or heightened by subject positions, e.g., an assertive interviewer vs. a tentative interviewee. The interviewer may also create more distance, by assuming “a certain stance towards the content of the interaction, or to the possibility of an interaction … a retreat into an institutional impersonality, or a retreat into individual invisibility … [which] make the sources of power or authority difficult to detect, and therefore difficult or impossible to challenge” (1989: 57). This also extends to the various forms of language, e.g., high vs. low register, working class vs. middle class, etc. and which ones are deemed to be appropriate in a given interview situation. These issues are “subject to the laws of social power” (1989: 64), including the class, race, gender, ethnicity, age, status, etc. of the interlocutors, the particular genre (e.g., job interview for an English teacher vs. for a sales manager), the particular institution (e.g., education vs. industry), and so forth. This shows that “we need to adopt a constantly critical stance towards our own practices and assumptions, in every detail and at every level” (1989: 66) and to realize that all social activity, including linguistic activity, is governed by larger, and sometimes competing, ideologies.

All of this signifies that “while discourse and genre provide the systematically-organised linguistic categories which make up a text, ideology determines the configuration of discourses that are present together and their articulation in specific genres. Ideology is therefore intricately connected into the construction of texts” (1989: 83), into how the discourses are to be valued, how they relate to each other, and how they are arranged in text in response to the demands of larger social structures. As a result, he claims that ideology is typically resistant to change (and thus, perhaps paradoxically, ideology can be conservative), since it is based on established social and material practices, and even when there is change, it provides the categories that often shape the thinking about the new discourse practices, including how to classify them and how to make them into common sense. But there is at the same time a tension between social reality and social and material practices; and “material practices continue to affect and shape cultural ideological categories, those of language included. Here, in this difference and in this constant dialogue lies the motor of social change, and therefore of language change” (1989: 84). This can consist of a change in the ideological, discursive and generic positions of individual speakers or a change in the linguistic system brought about by speakers in some cases, which is in contrast with the tradition in (socio)linguistics and historical linguistics, which focuses on, and considers as very likely, changes in the system with no human agency.

Kress was also interested in how individual action can effect change. He took as his example ‘sexist discourse’, about which there was much written at the time, which suggests specific subject positions for women, which in turn strongly shape the kinds of language that women use or that is used about women. The effects of gender roles (and sexism) meant that a woman had a typical placement in certain types of genres; for example, if she participated in an interview, she was usually not the interviewer but, more likely, the interviewee, treated as not being intelligent, patronized by the interviewer, and so forth. If the situation is generalized, the same types of texts are produced on many occasions and a recognizable manner of speaking emerges and is seen as the natural or proper way for women to talk or be talked about. However, “modes of talking can become altered. The theoretical analyses of feminist writers, and the social practices of feminists over a long period are bringing about a recognisable change in the discourse around gender, and in social practices” (Kress, 1989: 94; we will discuss this in more detail in Sect. 6.7). As we will see, much of what Kress called for in 1989 was taken up by CDA and thus his book could be seen as pre-/proto-CDA. However, before we go into this, we first need to discuss developments within SocSem that Kress was also very much involved in.

2.5 Hodge and Kress 1988: ‘Social Semiotics’ and Other Work

Hodge and Kress’s highly praised book, Social Semiotics (1988), acknowledged that Halliday’s book (1978)18 “had a profound influence on our own theory” (Hodge & Kress, 1988: 270), but it set out to correct a number of intrinsic limitations in the scope of CritLing and in particular in the theory of their earlier work (Kress & Hodge, 1979; Fowler et al., 1979a; Aers, Hodge, & Kress, 1981). These needed to be redressed “in order to fulfill our initial aim for a usable, critical theory of language” (Hodge & Kress, 1988: vii). Thus they presented a way forward and proposed an updated CritLing version of SocSem. They began with “social structures and processes, messages and meanings as the proper standpoint from which to attempt the analysis of meaning systems” (1988: vii). They also emphasized that a theory of language “has to be seen in the context of a theory of all sign systems as socially constituted and treated as social practices” (1988: vii–viii), and thus that linguistics and the study of verbal language should be “thoroughly assimilated into a general theory of the social processes through which meaning is constituted and has its effects”, i.e., a “theory of communication and society” (1988: viii). Their earlier theoretical position of language as ideology (Kress & Hodge, 1979; see Sect. 2.4.4) was extended to all the means whereby a society constitutes its cultures and its meanings: “texts and contexts, agents and objects of meaning, social structures and forces and their complex interrelationships together constitute the minimal and irreducible object of semiotic analysis” (Hodge & Kress, 1988: viii). They thus extended the critical linguists’ sociopolitical orientation to meaning-making in general, and presented a coherent and usable framework which was both an interdisciplinary synthesis and a single coherent scheme of methods and concepts, from semiotics, linguistics, psychology, sociology and others (see also Hodge, 1990).

Their work was based especially on semiotics and in particular, on a number of premises that they wanted to emphasize as different from and/or a furthering of their earlier joint work (Kress & Hodge, 1979) as well as Halliday (1978), e.g.: meaning is produced and reproduced under specific social conditions and through specific material forms and agencies; meaning exists in relationship to concrete subjects and objects and is inexplicable except in terms of this set of relationships; society is typically constituted by structures and relations of power, either exercised or resisted; society is characterized by conflict and cohesion, so that the structures of meaning at all levels, from dominant ideological forms to local acts of meaning, show traces of contradiction, ambiguity, and polysemy. “So for us, texts and contexts, agents and objects of meaning, social structures and forces and their complex interrelationships together constitute the minimal and irreducible object of semiotic analysis” (Hodge and Kress, 1988: viii). In previous work Kress (1982) had already said that “a more comprehensive notion of ‘text ’ will have to include both the verbal and pictorial elements of the one text ” (1982: xi); and thus, in Social Semiotics “real efforts [were] made to understand systems of representation other than language—visual images, music, and performance. This understanding is then ‘turned back’ on language in new theorizations of the characteristics of language” (Kress, 1990: 94) and they argued that “no single code can be successfully studied or fully understood in isolation because meaning resides so persuasively in a multiplicity of visual, aural, and behavioral codes” (1990: 96). As a result, “meaning, in all its manifestations and in all places and how meaning is made, was the issue which provided the underlying coherence” of their work (Böck & Pachler, 2013: 23–24). 
They took a multi-semiotic standpoint and applied it to a wide range of semiotic media and forms: e.g., images, TV, comics, sculpture, fashion, architecture, culture, media, education, and advertising.

Since the central premise of their approach was that “the social dimensions of semiotic systems are so intrinsic to their nature and function that the systems cannot be studied in isolation” (Hodge & Kress, 1988: 1), they set themselves against ‘mainstream’ semiotics (or semiology), which, like ‘mainstream’ linguistics of the time, “emphasizes structures and codes, at the expense of functions and social uses of semiotic systems, the complex interrelations of semiotic systems in social practice, all of the factors which provide their motivation, their origins and destinations, their form and substance. It stresses system and product, rather than speakers and writers or other participants in semiotic activity as connected and interacting in a variety of ways in concrete social contexts” (1988: 1). Despite this criticism, they didn’t reject semiotics, as others did; rather, they argued for a reconstitution, and a reconsideration, of a ‘new’ semiotics (just as the critical linguists had argued for a ‘new’ linguistics), since “semiotics […] must provide this possibility of analytic practice, for the many people in different disciplines who deal with different problems of social meaning and need ways of describing and explaining the processes and structures through which meaning is constituted” (1988: 2; see also Hodge & Kress, 1982).

Hodge and Kress (1988) took as their starting point the Marxist critique of capitalism and their view of ideology as “a level of social meaning with distinctive functions, orientations and content for a social class or group” (1988: 3) which can be combined with other ideologies to represent “the social order as simultaneously serving the interests of both dominant and subordinate”. As a result, “the meanings and the interests of both dominant and non-dominant act together in proportions that are not predetermined, to constitute the forms and possibilities of meaning at every level” (1988: 8). For an example of this, see the discussion (in Sect. 4.6) of their analysis of a billboard advertisement for Marlboro cigarettes and its amendment by a group against ‘unhealthy promotions’ (1988: 8–9). They used Saussure’s structuralist theory of semiotics as “an antiguide” and articulated an “alternative semiotics” (1988: 18) based on “Saussure’s Rubbish Bin” (1988: 15–18), which they conceptualized as being filled with those facets of language which Saussure minimized, or treated as fixed and not needing further linguistic analysis, or claimed were not amenable to scientific analysis, or excluded from linguistics and semiotics as extrasemiotic phenomena. They, however, treated those very same facets as the basic premises of their work. They focused on: culture, society and politics; other semiotic systems in addition to language; the processes and the products of speaking (parole); signifying practices in other codes; “the processes of signification, the transactions between signifying systems and structures of reference … [and] the material nature of the sign” (1988: 18). 
Reflecting on issues discussed in Kress and Hodge (1979), they argued that there is an intrinsic connection between diachrony, time, history, process and change, since all semiotic activity takes place in time and is subject to transformational activity, and every transformation is a concrete event with agents and reasons derived from material and social life (1988: 35).

In the course of the book, they amplified various facets of their theory: e.g., context as meaning; style as ideology; social definitions of the real and reality; transformation and time; transformations of love and power: the social meaning of narrative; and entering semiosis: training (young) subjects for culture. At the same time, they analyzed a wide range of verbal and visual phenomena and discussed an impressive variety of visual artifacts with (usually photographic) visual accompaniment: e.g., paintings, mosaics, sculptures, a photograph in a magazine, a children’s drawing, cartoons, a hand-written text, kinship diagrams, family photographs, and so forth. Other phenomena discussed in the text (without visual accompaniment) were just as wide: e.g., fashion, a TV interview, a magazine article, rites of passage in various communities (e.g., weddings, birthdays, funerals), the social meaning of folklore, and so forth. Moreover, the inclusion of renderings of the visual phenomena made the book very different from earlier ones and helped to open up a new way of discussing and presenting visual (social) semiotics, which paved the way for further development in this area (see the next section and Sect. 4.6).

In essence, SocSem as Hodge and Kress (1988) present it is the study of both the social dimensions of meaning and the power of human processes of signification and interpretation in shaping individuals and societies. It is primarily interested in the way language is used in social contexts, whether visual or verbal in nature, and the way we use language to create society (see Thibault, 1991; Machin & Mayr, 2012). It also includes the study of how people design and interpret the meanings of texts, and addresses the issue of how meanings are adapted as society changes. Hodge and Kress also tried to account for the variability of semiotic practices. This different focus shows how individual creativity, changing historical circumstances, and new social identities and projects can change patterns of design and usage, since, from their SocSem perspective, the many different channels for meaning-making are not fixed into unchanging codes but are resources which people use and adapt (or design) to make meaning. This view provided the impetus for them and many other SocSem scholars to reject the term ‘sign’ and replace it with ‘resource’, based on Halliday’s view that the grammar of a language is a “resource for making meanings” (1978: 192; see Sect. 2.3).

Two years later, Hodge (1990) published a book on literature as discourse, in which he provided a highly accessible exposition of the concepts and methods of CritLing SocSem and a new type of (literary/linguistic) criticism that puts forth a theorization of (English) literature. He characterized the framework presented in Hodge and Kress (1988) as an interdisciplinary synthesis and a single coherent scheme of methods and concepts (from semiotics, linguistics, psychology, sociology and others), a new strategy for dealing with text in all its media and forms, and “a broadly based practice that is situated socially and historically” (Hodge, 1990: 233), which is also needed for work on literature. The notion of language as social semiotic, i.e., socially derived and with socially instrumental meanings, and this new interdisciplinary version of SocSem, became the model for investigation by Australian scholars at the interface of language, literary and semiotic studies, who did (dynamic) intertextual analysis, such as Terry Threadgold (1986, 1988a, 1988b) and Paul Thibault (1991), who found Halliday’s original SocSem model too closely preoccupied with linguistic structure (grammar).

In 1993, Hodge and Kress published the second edition of Language as Ideology, in which they positioned themselves as being proponents neither of the older version of CritLing nor of CDA, which had started in 1991 (see Sect. 3.2). They developed CritLing as a theory of language as a social practice, where “the rules and norms that govern linguistic behaviour have a social function, origin and meaning” (1993: 204). Their involvement in SocSem had a profound influence on much subsequent research. The relation between CritLing and SocSem was very strong, and for some, CritLing encompassed SocSem, while for others SocSem encompassed CritLing; those practicing SocSem used the term SocSem to emphasize the interplay between language and other social semiotic systems (Hodge & Kress, 1988; Kress & Threadgold, 1988) and semiotically oriented studies of literature (Threadgold, 1988a, 1988b). Still others continued to use CritLing as they developed what would become CDA or they discussed CDA favorably (Fairclough, 1992, 1995).

Meanwhile, SocSem was about to undergo a new phase.

2.6 Kress and Van Leeuwen 1996: ‘Reading Images: The Grammar of Visual Design’

2.6.1 Introduction

SocSem had a renewal of sorts through the publication of Kress and van Leeuwen’s book, Reading Images, in 1996 (2nd edition 2006; earlier version 1990), which represented a turn to the visual. Theirs was the first systematic account of the “grammar” (i.e., the choices and rules of combination) of visual design, which “offers a much more comprehensive theory of visual communication than the earlier book” of 1990 (2006: ix), which was less theoretical and more practical (i.e., oriented toward practice). They set their 1996 book in the theoretical framework of SocSem and SFL/SFG (cf. O’Toole, 1994)19, which gave them the tools they needed in order to understand visual representation and communication and to put analytical and methodological emphasis on the integration of the visual and the verbal as semiotic phenomena. They stipulated two major caveats: that the affordances and formal organizational meanings differ across the two modes (due to, e.g., the prevalence of time in spoken language and space in images) and that the term grammar means that visual analysis should move beyond interpreting the meaning of individual elements (i.e., treating them as isolated words) and beyond focusing on the ‘denotative’ vs. ‘connotative’ meanings, or the iconographical vs. iconological significance, of elements in images. In short, it should “examine the structures such elements form within a visual composition” (Djonov & Zhao, 2018: 4). Due to their 1996 book, Kress and van Leeuwen, separately and together, are seen as the founders of the addition of visual (aspects of) texts to SocSem, and more widely, of serious visual analysis in semiotics and other domains. They also provided an impetus for looking at the ways in which images (and other modes of communication) are not neutral, since they reflect social and power relationships, ideology, and a particular version of social reality (Machin, 2007). 
And they launched (see also Kress & van Leeuwen, 2001) the new area called multimodality (see Sect. 2.7), which has had far-reaching effects on SocSem in general, CDA/CDS (see Sect. 4.6), work in visual communication and visual semiotics, and beyond.

Since Reading Images (the title was inspired by Reading Television, by Fiske & Hartley, 1979) was set within the theoretical framework of SFL/SFG SocSem, they discussed three schools of semiotics that “applied ideas from the domain of linguistics to other, non-linguistic modes of communication” (Kress & van Leeuwen, 1996: 6), such as painting, cinema, theatre, photography, fashion, music, etc. These were: the Prague School, which drew on the Russian Formalists and their concept of foregrounding, which results from deviation from standard forms for artistic or aesthetic purposes; the Paris School of semiotics (semiology), which applied the ideas of Saussure and others; and the “still fledging movement in which insights from linguistics have been applied to other modes of representation has two sources, both drawing on the ideas of Michael Halliday” (1996: 6). One source grew out of CritLing in the 1970s at UEA and led “to the outline of a theory that might encompass other semiotic modes” (1996: 6), as provided in Hodge and Kress (1988; discussed in Sect. 2.5). The other was a development of SFL and SocSem by Australian scholars in semiotically oriented studies of literature (Threadgold, Thibault), music (van Leeuwen, 1996: 6), and visual semiotics (O’Toole—and “ourselves”, Kress & van Leeuwen, 1996: 6). As they said in their preface (Kress & van Leeuwen, 2006: viii), Halliday’s “view of language as social semiotic and the wider implications of his theory” gave them the means to go beyond structuralist approaches and thus it influenced their work greatly. In addition, they cited and used much previous work in art history and visual communication.

We have discussed above various aspects of Kress’s prior work in SFL, CritLing and SocSem (see Sects. 2.3.1, 2.4, and 2.5). As for van Leeuwen, we should note here that, in an interview much later, he called his career “a mixture of design and serendipity” (Andersen et al., 2015: 93). In this case, he had much of the background needed to work with Kress on this project. He was both a practitioner and a theorist of the visual, who had long experience in scriptwriting, film direction, editing and production and at the same time, knowledge of Paris School semiotics, interest in social theory, experience working with SFL/SFG and SocSem, and a deep-seated conviction that “creativity and intellect can be combined” (2015: 94). He also had a desire to write about “the language of the image” (2015: 93—we will discuss further his work in SocSem and CDA in Sects. 3.​9 and 4.​6). Kress and van Leeuwen’s joint work on visual communication began in the later 1980s and included, especially, the stimulating environment of the Newtown Semiotic Circle in Sydney (the ‘Semiotics Salon’), where they participated in discussions and debates about SocSem which “helped shape our ideas in more ways than we can acknowledge” (2006: vii)—they gave special thanks to “Jim Martin” and “Fran Christie” (who were in Sydney) and “Bob Hodge” (who was at Murdoch University in Perth, Western Australia). From that participation came their 1990 book, which was “used in courses on communication and media studies, and as a methodology for research in areas such as media representation, film studies, children’s literature and the use of illustrations and layout in school textbooks” (1996: ix). The 1996 book was finished and published when both were in London, Kress at the Institute of Education and van Leeuwen at the College of Printing. 
Their “view that language and visual communication both realize the same more fundamental and far-reaching systems of meaning that constitute our cultures, but that each does so by means of its own specific forms” (1996: 17) and their specific proposals, “had a positive reception among a wide group from the professions and disciplines which have to deal with real problems and real issues involving images” (2006: vii).

However, the application of ideas from, or at least the search for parallels with, linguistics for the analysis of visuals was controversial (see Machin, 2007 for an overview). In the Preface to the 2nd edition (Kress & van Leeuwen, 2006), they answered some critiques by stating that they had attempted to use the general semiotic aspects of Halliday’s SFG (Halliday, 1985a), e.g., the metafunctions, and not its “specific linguistically focused features”, and that their goal of showing (Kress & van Leeuwen, 2006: viii) “how visual communication works in comparison to language” had been misunderstood as “an attempt to impose linguistic categories on the visual”, while their concern was, rather, to bring out both “the differences between language and visual communication” and “the broader semiotic principles that connect, not just language and image, but all the multiple modes in multimodal communication”. In light of these critiques, they delimited and/or reformulated in (Kress & van Leeuwen, 2006, 2nd edition) various parts of their proposal and addressed various omissions of the first edition, e.g., moving images, color, a wider range of three dimensional objects. At the same time, they reflected on the major societal changes in images and their use in the 10 years between the two editions and expressed their thoughts about the future of visual communication, given all the new affordances and meanings due to technology, the internet, websites, web-based images, social media, etc. Those reflections, however, led others to take a critical perspective on the issue of the unity and dominance of western visual language on which their analysis (in Kress & van Leeuwen, 1990, 1996, 2006) is based. 
This is because this unity derives from the global power of the western mass media and cultural industries and their technologies, which sometimes co-exists with more traditional forms (with higher, equal or lower status), or creates transitional forms/stages through the integration of (often dominant) western elements with a local visual semiotic, or exerts a normalizing influence through a variety of means on visual communication around the world, and so forth. All of this, others said, should be looked at more carefully with a critical lens (see further discussion of this in Sect. 3.9.4, which deals with van Leeuwen’s later work, in which he did just that).

2.6.2 General Overview

As said above, Kress and van Leeuwen deliberately left aside the study of visual lexis or vocabulary (specific signs, which many previous accounts of visual semiotics had concentrated on) since their primary aim was to make generalizations about visual design and also to discuss “the broad historical, social and cultural conditions that make and remake visual ‘language’” (1996: 4) in the Western visual cultural tradition (including regional and social variation) over the last five centuries or so. They used visual design as an all-encompassing term to cover “oil painting as well as magazine layout, the comic strip as well as the scientific diagram” (1996: 3), and many other types of visual texts, and thus they analyzed a wide range of examples, from children’s drawings to textbook illustrations, photo-journalism to fine art, scientific and other diagrams to maps and charts, and so forth. And they also “made a beginning with the study of three-dimensional communication: sculpture, children’s toys, architecture and everyday designed objects” (1996: vii). The term ‘grammar’ denotes how meaning is produced through recurring visual patterns of combination which are semiotic resources for meaning making in the actions and artifacts we use to communicate and the way in which people, places and things depicted in images “are combined into a meaningful whole […] in visual ‘statements’ of greater or lesser complexity and extension” (1996: 1). Thus, they set out to provide “inventories of the major compositional structures which have become established as conventions in the course of the history of visual semiotics, and to analyse how they are used to produce meaning by contemporary image-makers” (1996: 1). They also said that visual communication in general is less the domain of specialists than before and much more crucial in the area of public communication, leading to new and more rules and most importantly to the need for everyone to have ‘visual literacy’.

Reading Images had theoretical aims in addition to descriptive ones, and thus it developed a framework that could be used for ideological analysis, since, just as different ideological positions can be expressed by different grammatical structures (e.g., active vs. passive), they saw “images of whatever kind as entirely within the realm of ideology” (1996: 12), and thus different images could convey different ideologies. They also regarded their book as “a contribution to a broadened critical discourse analysis” (1996: 13; CDA had already started in 1991; see Sect. 3.2), which would encompass other semiotic modes than language, especially since at that time there was an “incursion of the visual into many domains of public communication where formerly language was the sole and dominant mode” (1996: 13). This became a significant theme in CDA/CDS itself (see Sect. 4.6). They argued that “visual structures realize meanings as linguistic structures do also” (1996: 2). Moreover, although the term ‘grammar’ is often understood to mean a set of normative rules for speaking and writing, many creative and aesthetic uses of (the grammar of) language exist in literature and poetry (and elsewhere); in the same way, the grammar of visual design can be creatively employed by those in the fine arts, and at the same time it can be used in socially normalized ways to underpin the production of layouts, images, diagrams for reports, brochures, communiqués, advertising, etc.

Since their book is on visual design, it is about sign making (cf. Halliday’s notion of meaning making) or sign using, about “representation as a process in which the makers of signs […] seek to make a representation of some object or entity, whether physical or semiotic, and in which their interest in the object, at the point of making the representation, is a complex one, arising out of the cultural, social and psychological history of the sign-maker, and focused by the specific context in which the sign is produced” (1996: 6). This means that visual signs do not necessarily pre-exist their making, nor are they necessarily encoded in a repository of signs; rather, they are the result of a process of construction of signs as “motivated conjunctions of signifiers (forms) and signifieds (meanings)” (1996: 7), where ‘motivation’ is understood “in relation to the sign maker and the context in which the sign is produced” (1996: 7; see also Kress, 1993), in relation to what the producer wants to convey. In this way, the producer’s choice of a given sign in a given context is motivated, in both its form and its meaning. What is also crucial here is that the concept of motivation comes into play in the use of a sign in a social context. This means that semiotic resources like images do not have fixed meanings but instead have a semiotic potential that can be applied differently in different contexts (see Abousnnouga & Machin, 2013).

In what is one of the striking aspects of the breadth of how they define the visual, Kress and van Leeuwen (1996: 11) exemplified these and many other theoretical points by children’s drawings (and illustrations in children’s books), because they believed that “the production of signs by children provides the best model for thinking about sign-making, and that it applies also to fully socialized and acculturated humans”. Indeed, the first few figures discussed in (1996: 6–10) are drawings by very young children and the section about visual literacy analyzes an illustration from Baby’s First Book by an adult about “every night I have my bath before I go to bed” (1996: 21). Later in their book (1996: 155–158), they compared two Self-Portraits by Rembrandt and two drawings by two 8-year-old boys of themselves for the covers of their class projects. Their point is that we can see the interpersonal metafunction at work in both the portraits and the drawings, especially in how they convey (in Rembrandt) and don’t convey (in the class projects) interaction with the viewer.

2.6.3 The Three Metafunctions in Visual Design

As Kress and van Leeuwen say in their Introduction (1996: 13), the structuring principle of both their ideas and the book came from Halliday (1985a, their major source of inspiration, and 1978), in particular his three metafunctions (see Sect. 2.3.4), which they (re)define as a “social semiotic theory of communication” (Kress & van Leeuwen, 1996: 40), with the understanding that each semiotic mode “has to serve several communication (and representational) requirements, in order to function as a full system of communication”.

Their discussion of the ideational (experiential) metafunction (1996: 45–113) deals with (1) narrative representations, which design social action, and (2) conceptual representations, which design social constructs—on the basis of both of which they discuss “the patterns of representation which the ‘grammar of visual design’ makes available” (1996: 13) for representing the world around us in social and natural space and inside us in conceptual space. They divided images into two types: detailed naturalistic pictures (e.g., a photograph of people in a landscape) and highly abstract images (such as abstract art, or diagrams, maps, charts). And they analyzed each of them into participants and processes (1996: 47). They combined Arnheim’s art theory (1974, 1982) about volumes or masses (i.e., participants) and vectors, tensions or dynamic forces (i.e. processes), with Halliday’s SFG and SocSem (1978, 1985), and extended Halliday’s notions of action and process vs. actor, goal, and recipient to specific visual signs that are “about something which participants are doing to other participants” (Kress & van Leeuwen, 1996: 49). In other cases, they analyzed a picture about participants who are simply carriers of meaning (through their attributes) as being “about the way participants fit together to make up a larger whole” (Machin, 2007: 127), i.e., not about doing anything, but about being “what they are”.

On this basis, they divided “representational structures” into two types: narrative or social action structures, which present “unfolding actions and events, processes of change, transitory spatial arrangements” (Kress & van Leeuwen, 1996: 79), vs. conceptual structures, which represent “participants in terms of their more generalized and more or less stable and timeless essence, in terms of class or structure or meaning”. In the case of social action, the participants (of various sorts, e.g., people, animals, objects) are the actors or agents who are connected to a vector (e.g., road, arrow, line), i.e., “they are represented as doing something to or for each other” (1996: 56). They call this a narrative, as long as there is some feature of directionality in the image, e.g., a painting of soldiers creeping up on their enemy with guns pointing at them. Conceptual representations design social constructs, such as classifications, where participants are related to each other in terms of a kind of relation, an overt or covert taxonomy (typically with at least one participant who is superordinate and one or more who are subordinate (1996: 81)). Examples include illustrations of various artifacts found in an archeological dig, tree structure diagrams, pie-charts, graphs, and many of the diagrams found in Halliday (1985a, 1978), in this book (see Table 4.1, this volume) and in many scientific and technical books (see also Machin, 2007: 127).

Kress and van Leeuwen’s discussion of the interpersonal metafunction treats representation and interaction, i.e., designing the position of the viewer, and modality, i.e., designing models of reality. It “deals with the patterns of interaction which the ‘grammar of visual design’ makes available, and hence with the things we can do to or for each other with visual communication, and with the relations between the makers and viewers of visual ‘texts’ which this entails” (1996: 13). In their view “any semiotic system has to be able to project a particular social relation between the producer, the viewer and the object represented” (1996: 41). Participants produce and make sense of images “in the context of social institutions which, to different degrees and in different ways, regulate what may be ‘said’ with images, and how it should be said, and how images should be interpreted” (1996: 119).

The involvement of participants may be direct and immediate (face-to-face) or there may be no immediate and/or no direct involvement. The context of production and the context of reception may be the same or at least connected—or there may be a disjunction between them; in the former case, the producer and the viewer may be physically present, but in the (very common) case of disjunction, typically the producer is absent and the viewer has only the image. Whether there is connection or disjunction between the contexts of production and reception, they have in common “the image itself, and a knowledge of the communicative resources that allow its articulation and understanding, a knowledge of the way social interactions and social relations can be encoded in images” (1996: 120). As a result “the interactive meanings are visually encoded in ways that rest on competencies shared by producers and viewers … [they] derive from the visual articulation of social meaning in face-to-face interaction, the spatial positions allocated to different kinds of social actors in interaction” (1996: 120–121).

From this a variety of meanings arise. There are many different kinds of interpersonal relations, e.g., a person in a photograph may “address viewers directly, by looking at the camera” (1996: 43) or may seem to look directly at the viewer’s eyes or make an inviting gesture, etc.—they call this a “demand” image (based on Halliday, 1985a), which demands something from the viewer (Kress & van Leeuwen, 1996: 118), asks for some sort of (possibly) imaginary relationship between the viewer and the image and therefore conveys a sense of interaction. Or the person may be “turned away from the viewer and this conveys the absence of a sense of interaction” (1996: 43); there is no gaze at the viewer—it is an ‘offer’ image (Halliday, 1985a), in which the participant is the object of the viewer’s scrutiny (not only in photographs of people, but also in the cases of a drawing in a scientific textbook, and also diagrams, maps and charts, etc., Kress & van Leeuwen, 1996: 119). And we understand both types of interpersonal images “because we understand the way images represent social interactions and social relations” (1996: 121), including relations with objects. There are other interpersonal facets of images: e.g., size of the frame (a scale from close-up to medium to long shot), which suggests different relations between participants and viewers, with regard to, e.g., social distance, ranging from close personal distance to far personal distance to close social distance to far social distance to public distance (1996: 130–135; see also van Leeuwen, 1986). These patterns can be conventionalized in, e.g., television, diagrams, newspaper and magazine photos, advertisements, landscapes, and they are typically not in an either-or relation but in “scales” or gradations (see Machin, 2007). However, because they are conventional, they can also lead to misunderstanding, due to intercultural differences.

Another way in which images bring about or reproduce relations between represented participants and the viewer is perspective, the selection of an angle, a point of view. There are various meanings associated with horizontal vs. oblique angle: a frontal view typically represents involvement or intimacy or subjectivity (part of our world, something we are involved with), whereas the further it goes to an oblique angle, the more detachment there is or objectivity (not part of our world), with degrees (or a scale) of involvement-shading-into-detachment (1996: 142–143). In either case, they can express subjective attitudes that are “often socially determined” (Kress & van Leeuwen, 1996: 135) and yet “naturalized” at the same time, based on an “impersonal, geometric basis”. Beginning with the Renaissance, “visual composition became dominated by the system of perspective, with its single, centralized viewpoint” (1996: 136), dependent on the viewer. This resulted in two types of images, either “subjective”, “with (central) perspective (with a built-in point of view)”, or “objective”, “without (central) perspective (without a built-in point of view)” (1996: 136). Many late nineteenth and twentieth century images combine both (as do, e.g., advertisements, 1996: 138–139). The angle of the shot is also important: a vertical angle can signify differences of imaginary symbolic power, depending on whether the viewer sees the participant from above, looking down (and thus, the viewer is represented as exerting symbolic power over the participant, who could have lower status, vulnerability, or inferiority), or looking up from below (which represents less power and status, vulnerability or inferiority for the viewer and more power, authority or respect for the participant (1996: 146)). A horizontal camera angle can symbolize equality and no power differential (see further discussion of distance, angle, gaze in Sect. 4.6.3).

According to Machin (2007: 48) “it was Kress and Hodge (1979) in Language as Ideology who first pointed out that modality could also be expressed non-verbally”, but it was Kress and van Leeuwen (1996) who proposed a variety of visual techniques “whereby modality can be reduced and reality can be avoided or changed”. As said (in Sect. 2.3.4), in SFG, ‘modality’ refers to the truth value or credibility or degree of certainty of statements about the world; and in their discussion of “modality: designing models of reality”, Kress and van Leeuwen (1996: 159) stated that “one of the crucial issues in communication is the question of the reliability of messages”. Is what we see true, real, or is it fiction, something outside reality? “The questions of truth and reality remain insecure, subject to doubt and uncertainty, and, even more significantly, to contestation and struggle” (1996: 159). And yet we have to make decisions and we have to trust (or not) the information we receive, and thus the message (text, visual image) should have some modality markers, i.e., cues established by the social groups within which we interact as reliable guides to the truth, factuality, certainty, credibility of messages, or their falsehood, fiction, doubt, unreliability²⁰. Visual SocSem does not claim to establish absolute truth or falsity of messages, but it can show how a visual proposition is represented as true or not (1996: 159), according to the values and beliefs of a given group. That is, the definition of reality is based on “currently dominant conventions and technologies of visual representation” (1996: 163) and “abstraction relative to the standards of contemporary naturalistic representation” (1996: 165).
And “modality is realized by a complex interplay of visual cues” (1996: 167), based on eight modality markers (scales) of visual modality that go from “‘certain’ to ‘uncertain’ with ‘probable’ in between” (Machin, 2007: 48), but also can descend again to lower modality, if it goes beyond what is judged to be “naturalistic”, according to the (social) criteria for what counts as real. There are also, e.g., scales for color (1996: 165–168; also Kress & van Leeuwen, 2002), contextualization (“degrees of articulation of the background”, Machin, 2007: 51), representation from “maximum abstraction to maximum representation of pictorial detail” (Kress & van Leeuwen, 1996: 166). These rest “on culturally and historically determined standards of what is real and what is not” (1996: 168), which can differ according to social communities and over time.

This has led to different coding orientations (Bernstein, 1981), “which inform the way in which texts are coded by specific social groups, or within specific institutional contexts” (Kress & van Leeuwen, 1996: 170), of which there are four types in western society: technological (based on the effectiveness of the visual representation as a kind of “blueprint”); sensory (used in contexts “in which the pleasure principle is allowed to be dominant”); abstract (used by sociocultural elites and thus a mark of social distinction, e.g., “in ‘high’ art, in academic and scientific contexts”); and naturalistic, assumed to be common sense. This last one has been dominant in our society—although with new image technologies, the status of this type of coding is coming into crisis (1996: 170–171). The issue of modality is particularly complex in modern art, which attempts to redefine reality and to reject photographic naturalism (1996: 171–180).

Kress and van Leeuwen (1996, 2006) also discussed the meaning of composition (see also van Leeuwen, 2003), which is based on the textual metafunction and deals with “the way in which representations and communicative acts cohere into the kind of meaningful whole we call ‘text’” (Kress & van Leeuwen, 1996: 14), which covers any kind of semiotic artifact. Given that “any semiotic system has to have the capacity to form texts, complexes of signs which cohere both internally and with the context in and for which they were produced” (1996: 41), visual grammar provides a range of resources and different compositional arrangements in order “to allow the realization of different textual meanings”. In particular, it is concerned with “the way the representational and interactive elements are made to relate to each other, the way they are integrated into a meaningful whole” (1996: 181) by “three principles of composition” (1996: 183): (1) ideational “information value”, i.e., placement of elements left and right, top and bottom; (2) interpersonal salience, which attracts the viewer’s attention according to foreground vs. background placement, relative size, contrasts in tonal value or color, differences in sharpness, etc.; and (3) textual “framing”, i.e., framing devices such as dividing lines, or connectedness or disconnectedness of the elements of the image. This happens not only with single pictures or simple images, but also with composite visuals, or “multimodal texts (and any text whose meanings are realized through more than one semiotic code is multimodal)” (1996: 183).
They concluded that the meanings of this type of text are not simply the sum of the meaning of discrete elements or parts, but that “the parts should be looked upon as interacting with and affecting one another” (1996: 183) and thus, the whole is “an integrated text”, and neither the verbal nor the visual aspects of the text are by definition prior to or more important than or independent of the other. This, they explain, is why they draw comparisons between visual and verbal communication, seek to break down disciplinary boundaries between the study of each, and use comparable language and terminology for both.

Using these multimodal texts/images, they then go on to discuss elements of the composition of the whole image (1996: 177–200) based on information value, salience and framing, which they apply to an array of visual phenomena (such as film, advertisements, newspapers, painting, diagrams, science textbooks, children’s drawings), each of which can have a variety of ideological meanings, depending on many factors. They identify three types of information value, dependent on placement in the image. In those cultures with a left to right, top to bottom writing system, the first is left vs. right placement, where left placement means ‘given’ (familiar, agreed upon, point of departure) and right placement means ‘new’ (unknown, not yet agreed upon, needing special attention) (1996: 181, cf. ‘theme’ vs. ‘rheme’ in Halliday, Sects. 2.3.2, 2.3.4). The second is top vs. bottom placement, where top placement means ‘ideal’ (aspirations, desires, abstract representation, idealized, generalized) and bottom placement means ‘real’ (more or less factual, specific and detailed, realistic, practical). The third is center vs. margin(s), with “the crucial element” (Arnheim, 1982; Kress & van Leeuwen, 1996: 192), the ‘nucleus’ of the information, in the center vs. the elements that are ‘ancillary’ or ‘secondary’ in the margins. This third dimension is less used in Western art and when it is, it is typically combined with one or both of the others, and the center is the bridge between, e.g., ‘given’ on the left and ‘new’ on the right, and thus acts as ‘mediator’ (1996: 209, although there are many complexities).

They defined interpersonal ‘salience’ as “the degree to which an element draws attention to itself, due to its size, its place in the foreground or its overlapping of other elements, its colour, its tonal values, its sharpness or definition, and other features” (1996: 210). Typically, it creates an interpersonal hierarchy of importance among the elements, selecting some as more important or more worthy of attention, and it interacts with given-new, ideal-real and center-margin, in a variety of ways. In visuals, “when composition is the integration mode, salience is judged on the basis of visual clues” (1996: 202), such as the greater the weight of an element, the greater its salience, which may also depend on potent cultural symbols, size, color, tone, focus, foregrounding, overlapping, etc. The third key element in composition, ‘framing’, can be stronger or weaker; if it is stronger, what is framed is a separate unit of information, and context then shows the precise nature of the separation. Framing stresses individuality and differentiation, a gap of some sort, whereas its absence or a weak frame stresses group identity, as belonging together, and a strong sense of connection across the frame of the (two) parts of the image. Framing can be given by frame lines, objects in the image, discontinuities of color or shape, or empty space; and connectedness can be emphasized by vectors, depicted elements, abstract graphic elements, or repetition of shapes or colors. And every element, given or new, ideal or real, center or margin, can be framed either strongly or weakly or not at all.

While those compositional elements seem to be well established in Western art, Kress and van Leeuwen show that the issue of “reading paths” by the viewer is quite complex. Some reading paths are obvious, as in linear (syntagmatic) compositions which are strictly coded to be read horizontally left to right and line by line and vertically top to bottom (1996: 204) as in densely printed pages of text (such as this one). Or they can be more paradigmatic, less strictly coded and read in more than one way, such as websites, which are “specifically designed to allow multiple reading paths”, and also “increasingly many texts (newspapers, billboards, comic strips, advertisements, websites) are of this kind” (1996: 208), and this is even more the case in the twenty-first century. It is a fascinating part of the new ways of communication, and “the study of the meaning of new kinds of reading paths has barely begun” (1996: 206) and has become another issue of interest in visual studies, including in van Leeuwen’s work (see Sect. 3.​9.​4).

2.6.4 Other Facets of the Visual

Kress and van Leeuwen also discuss the issue of the materiality of meaning, i.e., the materiality of the signs themselves, including the surfaces on which inscriptions are made (e.g., paper, canvas, film, computer screen), and the means and processes of inscription (e.g., ink, paint, chemicals), since, e.g., “it means something quite specific whether a painting is executed in watercolours or in oil, whether a knife is used to apply the paint or a spraygun” (1996: 230). However, they point out that many linguists would say that it is the same text when written with a pencil on scrappy bits of paper with bad handwriting, or pen and ink on glossy paper with no cross-outs or corrections, or printed out on good bond paper using a word-processor and printer (Halliday, 1985a calls this ‘realization’, see Sect. 2.3)—as long as they are identical word-for-word. However, a teacher, a sculptor, an artist, a potential employer or a marketing executive would say that they are very different, since presentation matters. Thus, Kress and van Leeuwen (1996: 14) emphasize that differences in presentation contribute to the meaning of visual texts. And, we could add, the meaning of, e.g., a spoken text differs when produced with a different regional or social or foreign accent and/or with a variety of different intonations (as van Leeuwen had said several years before, see Sect. 3.9.1).

For Kress and van Leeuwen, representational practices differ in the degree to which the materiality of the text plays a role in semiosis. In their view, “the material expression of the text is always significant” (1996: 216); it is a separately variable semiotic feature. “Texts are material objects which result from a variety of representational and production practices that make use of a variety of signifier resources organized as signifying systems” (1996: 216), each of which contributes to the meaning of the text in its own particular way. Thus, the production of a text is “a culturally and socially produced resource for meaning-making” and “it is in this process that unsemioticized materiality is drawn into semiosis” (1996: 231). Meaning potentials are different from culture to culture, from context to context, etc. And even typography and letterform have their own meaning potential, metaphorical association and transport of meaning, and metafunctions. Thus, it is not the case that all of the aspects that go into the making of an image are part of a single representational system in which all the units are of the same kind. For example, a portrait painting involves multiple signifying systems, including not only various aspects of the painting itself, but also the size of the painting, the type of canvas it is on, various aspects of the frame, the caption on the painting or above or next to the painting, the signature of the artist, and so forth. The different means and processes of inscribing words constitute one system among many since they can be changed while other aspects of the production of the image are held constant. In addition, any of the signifying systems can realize all the choices from the ideational, interpersonal and textual metafunctions.
This is also the case with language; in particular, the material expression of the text is, from a social semiotic point of view, always meaningful, a separately variable semiotic feature, a culturally and socially produced resource for meaning making. This is highly complex and there is no established inventory in relation to, e.g., the way representations are produced, especially in view of various new technologies. Thus, they are produced by, e.g., (1) the hand (with a variety of means for creating the representation, e.g., pencil, pen, typewriter, etc.), (2) recording (of various sorts), e.g., printing press, and (3) synthesizing technologies. These are linked to ongoing theoretical discussions not only about production but also about transmission, reception, and distribution and the ongoing (changes in the) limits of technology—all of which bring up many other issues (see 1996: 217–238).

In addition, in their extension of their previous discussion “into the domain of three-dimensional visuals” (1996: 14), Kress and van Leeuwen underscored key similarities and differences between two- and three-dimensional objects. They showed that the latter are themselves also subject to a grammar of visual design, as in sculptures, which are primarily symbolic objects “for contemplation and veneration” (1996: 240) vs. three-dimensional “designed objects” such as scientific models, children’s toys, or everyday objects, which are made for practical use (for the user), although they may also “convey symbolic messages”, and there are many other types of objects such as motorcars, architecture, and stage sets (which were explored later by van Leeuwen, see Sect. 3.​9). All of these depend on available forms and meanings, but they also bring in new categories (e.g., for sculpture: presence or absence of a pedestal, frontal or oblique placement with respect to the viewer, etc.; see also discussion of van Leeuwen’s work in Sect. 3.​9). They also discussed moving images and the role of color, as in film (later discussed in Kress & van Leeuwen, 2002; van Leeuwen, 2011).

And, finally, as we have seen in our discussion, many of the images and objects discussed by Kress and van Leeuwen were composites of more than one semiotic mode —for which they used the term ‘multimodality’.

2.7 Kress and van Leeuwen 2001: ‘Multimodal Discourse: The Modes and Media of Contemporary Communication’

While Kress and van Leeuwen’s work on the grammar of visual design became well known for its many proposals about the analysis of visual images and was part of the turn to focusing on visual design as a distinct mode, there was perhaps even more interest within the SocSem and ultimately the CDA community in their new concept of ‘multimodality’. Multimodality was based on their recognition that the way we communicate is typically done through a number of semiotic modes simultaneously (see also O’Toole, 1994). One of the new ideas in Kress and van Leeuwen (1990, 1996) was their argument that we should not analyze each semiotic resource on its own, but rather the ensemble, the “multimodal text” (1996: 177), whose meanings are realized simultaneously through more than one semiotic mode. They insisted that each of the modes “should be looked upon as interacting with and affecting one another” (1996: 177), and the choices from various semiotic resources should be studied as to how they interact to create meaning multimodally. In their work, they did not treat “the verbal text as prior and more important, nor treat visual and verbal text as entirely discrete elements” (1996: 177); rather they looked at the whole as an “integrated text” in which the “integration of different semiotic modes is the work of an overarching code whose rules and meanings provide the multimodal text with the logic of its integration”. They also indicated that the verbal and visual were not the only semiotic modes that could combine into a text, but that there can be many modes in many different combinations with different (meta)functions and hierarchies, and that the analyst should pay equal and equally detailed attention to each without privileging one of them and also interpreting them in terms of how they interact with and affect each other.

Kress and van Leeuwen discussed the new theoretical concept of multimodality briefly in their 1996 and 2006 texts, but it received much greater attention in their co-authored book (Kress & van Leeuwen, 2001) on Multimodal Discourse, and other publications (e.g., van Leeuwen, 1999; Kress, van Leeuwen, & García, 2000; Kress, Jewitt, Ogborn, & Tsatsarelis, 2001; Kress, Jewitt, Franks, Bourne, Hardcastle, Jones, & Reid, 2005; van Leeuwen, 2005; Kress, 2010; see also Sects. 3.9 and 4.6). Kress and van Leeuwen (2001: 2) present “a view of multimodality in which common semiotic principles operate in and across different modes” and reflect on the fact that with advances in digital technologies, non-specialists are increasingly able to select and combine semiotic resources in a way that only specialists were able to do in the past, and that therefore, the study of contemporary communication requires “a unified and unifying semiotics” (2001: 2). This rests on two fundamental facets of human communication. The first is that it is multimodal: for example, “meaning-making involves selecting from different modes (e.g., written language, sound, gesture, visual design) and media (e.g., face-to-face, print, film) and combining these selections according to the logic of space (e.g., a sculpture), time (e.g., a sound composition), or both (e.g., a film)” (Djonov & Zhao, 2013: 1; see also Kress, 2010). Thus, “multimodality names both a field of work and a domain to be theorized” (Kress, 2010: 54). The second facet of human communication is that it is always social, as underscored by work in all types of SocSem, and thus it is both defined by and construes its social context, and over time it can be transformed by and transform that social context.

Multimodal Discourse (Kress & van Leeuwen, 2001) is based on “the idea of communication”, of how people “use the variety of semiotic resources to make signs in concrete social contexts” (2001: Preface). At the same time, it also presents the “common semiotic principles [which] operate in and across different modes” (2001: 2). It presents “fundamental principles for a unified theory of multimodality” (Djonov & Zhao, 2018: 5), such as: the study of multimodal communication should focus on “broad semiotic principles that apply across different semiotic resources”; and multimodal analysis should always consider “semiotic resources in relation to specific, situated social practices” in context (see also Moschini, 2014) and should deal with all aspects of communication in their complexity (see Kress, 2018). Kress and van Leeuwen (2001) define four ‘strata’ (based on Halliday, 1985) of analysis: discourse, i.e., “socially constructed knowledges of (some aspect of) reality” (Kress & van Leeuwen, 2001: 4); design (the arrangement, the composition, of discursive materials); production (the material realization of a semiotic event or object); and distribution (which in the late twentieth and into the twenty-first century often adds or changes meaning, 2001: 7). Interest in multimodality led to work by others influenced by SocSem and/or SFL, who saw the exciting consequences of this new point of view for their own research and publications (e.g., Lemke, 1998; Scollon, 2001; Norris, 2004; and others). Multimodal SocSem (see van Leeuwen, 2005; Andersen et al., 2015) became even more interdisciplinary in nature and a way of understanding the practice of meaning-making across a range of texts and institutions. It also allowed different theoretical voices, all concerned with the process and practice of semiotic meaning-making, to dialogue with each other, for example in the journal Social Semiotics (founded in 1991).

In light of heightened interest in multimodality, many have said that there is much more multimodality than there had been before, and that the 2000s and after are a historic moment marked by ongoing changes in the roles of the different modes and a broad change in the way we communicate (going from monomodality to more and more multimodality, and a different distribution of the three metafunctions; see Machin, 2007). These changes are attributed to the many new digital, computational, and internet-based technologies and their semiotic potentials, and to the fact that frames are dissolving everywhere and formerly clear boundaries are becoming ever more blurred (see more in Sect. 3.9 on van Leeuwen and Sect. 4.6 on SocSem and multimodality).

Multimodality has been embraced by various other domains and has become a field of research in its own right. It was also incorporated into SocSem and (along with SocSem and SFL) into DA/DS, which for some is named multimodal DA (MDA) and DS (MDS). Research on multimodality can also be critical (but not necessarily part of CDA/CDS) and yet at the same time some of the critical approaches have been taken into, or have become part of, CDA/CDS as multimodal CDA (MCDA, to be discussed in Sect. 4.6). Fairclough and Wodak (1997: 264) characterize the visual and multimodal version of SocSem by saying that it “draws attention to the multi-semiotic character of most texts in contemporary society, and explores ways of analyzing visual images […] and the relationship between language and visual images”. In their estimation, SocSem also attends not only to productive and interpretive practices of texts but also to the texts themselves, and reflects a new orientation to struggle and historical change in discourse. Both Kress and van Leeuwen have taken multimodality into new domains: Kress in his research on education along with his students and colleagues (see Böck & Pachler, 2013) and van Leeuwen in his ongoing interest and innovations in SocSem (see Sect. 3.9 and Djonov & Zhao, 2013, 2018). Given the important presence of multimodality in CDA, we postpone further discussion to Sects. 3.9 and 4.6.

We now turn to Chap. 3, in which we will discuss the (official) emergence of CDA and the work of its founders, Kress, Fairclough, van Dijk, Wodak and van Leeuwen.