Gert De Sutter, Marie-Aude Lefer and Isabelle Delaere
In corpus-based translation studies (CBTS), many scholars have conducted research based on the hypothesis that translated texts share certain linguistic characteristics which do not occur, or occur to a lesser extent, in original, non-translated texts. Baker’s (1993) seminal paper described these characteristics as “features which typically occur in translated text rather than original utterances and which are not the result of interference from specific linguistic systems” (Baker 1993: 243). Research of this kind has produced observations of, for example, how translations conform to the typical characteristics of the target language (normalization) (Bernardini and Ferraresi 2011; Scott 1998), how translated texts are linguistically more homogeneous than non-translated texts (levelling out) (Olohan 2004), how translated texts are more explicit than non-translated texts (explicitation) (Olohan and Baker 2000; Øverås 1998) or how translated texts exhibit fewer unique items (under-representation) (Tirkkonen-Condit 2004). In recent years, however, it has been shown that these characteristics are not attributable solely to the difference between translated and non-translated texts, but co-vary with other (language-external) factors as well, such as text type, source language and the translator’s educational background (see e.g. Bernardini and Ferraresi 2011; De Sutter, Delaere, and Plevoets 2012; Kruger and van Rooy 2012; Neumann 2011). As a consequence, linguistic behaviour in translations versus non-translations has to be considered a multifactorial phenomenon rather than a monofactorial one.
Multifactorial investigations into the linguistic behaviour of translators compared to non-translators remain rather scarce, though. As a result, standard multivariate statistical techniques (e.g. multidimensional scaling, hierarchical cluster analysis, mixed-effect models), which can be used to visualize, describe, explain and predict patterns of variation within translations and between translations and non-translations, do not easily find their way into CBTS. This type of multifactorial investigation, using advanced and appropriate statistical techniques, is urgently needed in order to find out which factors simultaneously affect linguistic behaviour in translations compared to non-translations. In addition to the (language-external) factors mentioned above, other potentially influential factors include characteristics of the writing process (Did the translator use translation software? Did the translator experience any time pressure? What is the degree of editorial control? What is the policy of the publishing house?), typological or usage differences between source and target languages, the sociological status of the source and target languages, the style of the translator or original author, the sociological status of different types of translators, etc.
Whereas the identification of the determining factors is a necessary first step, the ultimate goal of CBTS is to find out what these factors reveal – on a higher level – about the underlying sociological, cognitive and other causes and motivations of linguistic choices in translations vs. non-translations. In recent years, several interesting high-level explanatory mechanisms have been developed from different perspectives, but they have not yet been the object of extensive empirical testing. From a sociological point of view, Pym (2008) has introduced the idea of translators being risk averse: if they can choose between a safe option (e.g. a variant that is widely accepted as a standard variant) and a risky option (e.g. a variant that is considered restricted to informal conversations), translators will most often opt for the former, depending on whether or not they are rewarded for taking a risk. From a cognitive point of view, Halverson (2003, 2010) has introduced the so-called gravitational pull hypothesis, which seeks to connect translation behaviour with underlying cognitive properties, such as salience and activation. The gravitational pull hypothesis states that translation characteristics such as under-representation can be explained by the structure of semantic networks and prototypes, i.e. the distance between the activated concepts in the semantic network of the bilingual or multilingual translator.
The present volume aims to push the frontiers of CBTS by presenting original and innovative research which is methodologically rigorous, descriptively adequate and theoretically relevant. Each of the chapters sheds new light on what constrains translational behavior – and to what extent – and how this all fits into an empirical theory of translation. More particularly, this book’s aim is twofold: (i) to bring together advanced quantitative (multifactorial) studies of translated texts (compared to non-translated texts on the one hand and/or source texts on the other hand), building on large-scale, well-structured parallel or comparable corpora, which provide additional evidence for the effect of (language-external) factors on translation behavior, resulting in more fine-grained insights into translational tendencies, and which elaborate on explanatory devices uncovered in previous studies; (ii) to investigate to what extent other, complementary methods from related research fields or new data sources can improve the descriptive and explanatory accuracy of corpus-based results. By embracing other, complementary methods aiming at descriptive and theoretical progress, the field of Corpus-Based Translation Studies will eventually emerge as Empirical Translation Studies, in which different methods and models are confronted, ultimately leading to a more adequate and fully fledged empirical theory of translation.
Sandra Halverson’s chapter exemplifies the type of new-generation research envisaged in the previous paragraph, viz. research that is theory-based and methodologically pluralistic and that improves our understanding of the translational act. Starting out from a well-informed cognitive-linguistic model of bilingual language processing, Halverson investigates how translators deal with semasiological salience, using so-called converging empirical evidence (corpus data and elicited data). She distinguishes between three different types of salience effects, which might cause translations to be linguistically different from non-translations: a magnetism effect occurs when a translator is attracted to a prominent sense in the target language, a gravitational pull effect occurs when a translator is attracted to a prominent sense in the source language, and an effect of association strength occurs when two senses in the source and target language are often used as translational equivalents. In order to test which of these effects occur under which circumstances, Halverson develops a multi-stage and multi-methodological research design. First, an independent sentence generation test and a semasiological contrastive corpus analysis of the English polysemous verb to get and two of its Norwegian equivalents få and bli are conducted in order to establish a semantic network of these verbs, elucidating which senses are more salient and how strong the connection between the translation equivalents is. Then, a corpus analysis of Norwegian fiction and non-fiction translated into English is carried out in order to determine which of the above-mentioned salience effects occur. Her results show, among other things, a clear magnetism effect for one of get’s most prominent senses, but other hypothesized effects remain unverified. Finally, an online keystroke experiment reveals that salience also affects revision behavior in that high-frequency verbs tend to be replaced more often than low-frequency verbs during later stages of the translation process. Although much more research is needed along the lines sketched in this chapter, the research presented here clearly demonstrates how the effect of bilingual cognition can be studied within an empirical translation framework.
Stefan Evert and Stella Neumann present an advanced multivariate methodology for investigating differences and similarities between original and translated German and English. Starting out from no fewer than 27 lexicogrammatical features shared by both languages (frequency of finite verbs, passives, prepositions, etc.), they apply a series of multivariate techniques, such as principal component analysis, linear discriminant analysis and support vector machines, to discern visual patterns in the data. The results convincingly show that English and German originals have a clearly unique profile in terms of the lexicogrammatical bundles they display, and that translations shift to some extent towards the source language, which is interpreted as a shining-through effect. This effect, however, is more prominent in translated German (from English) than in translated English (from German). The authors tentatively connect this finding with Toury’s hypothesis that less prestigious languages are more tolerant towards interference (or shining through) than more prestigious ones. In sum, this chapter stands out not only because of its solid empirical foundations (27 features) and the use of a series of multivariate techniques, but also because of the clear presentation of the methodology and the reasoning behind it (thereby enabling replication studies). At the same time it reveals clear patterns, thus contributing to a better understanding of translational behavior.
Isabelle Delaere and Gert De Sutter investigate three fundamental factors that can affect the linguistic features of translated text, namely source language, register and editorial intervention. Relying on the Dutch Parallel Corpus, the authors apply two multivariate statistical techniques (profile-based correspondence analysis and logistic regression analysis) to measure the effect of the three factors investigated on the variability of English loanword use in translated and non-translated Belgian Dutch. Their study, which draws on both comparable and parallel data, shows that source language, register and editorial intervention all influence the use of loanwords (vs. endogenous alternatives) in translated Belgian Dutch. The findings are interpreted in relation to the normalization behavior of both translators and writers of original texts. Isabelle Delaere and Gert De Sutter’s study compellingly illustrates the need to simultaneously consider a wide range of factors that can influence the linguistic make-up of translated language. As shown by their study, this can be done by relying on a combination of advanced multivariate statistics and careful qualitative analyses, which makes it possible to further our understanding of the cognitive and social mechanisms that shape translation.
Next, Haidee Kruger examines the under-researched effect of editorial intervention on the linguistic traits of texts. To do so, she relies on data extracted from a monolingual English parallel corpus of originally produced edited texts and their unedited counterparts, representing four registers (academic, instructional, popular writing and reportage). Looking at eight features traditionally used as linguistic operationalizations of increased explicitness, simplification and conventionalization in CBTS (such as cohesive markers, sentence length and trigrams), she convincingly shows that revisers/editors make texts more explicit, syntactically simpler and more conventional, three features which, to date, have been attributed to the translation process itself. Haidee Kruger’s study has far-reaching implications for CBTS and – more generally – for studies of language mediation and constrained communication (Lanstyák and Heltai 2012), as it demonstrates that features attributed to translation may very well, in fact, be features of editing/revision, or more general features typical of mediated and constrained language (some of these traits, for instance, have also been found to characterize New Englishes). This should encourage translation scholars to take editorial intervention into account in their own work and to start collecting new types of corpora in order to tease apart features of translated language and edited language.
Adriano Ferraresi and Maja Miličević’s chapter also addresses issues related to language mediation, as it adopts an intermodal approach, i.e. an approach where two translation modes (written translation and simultaneous interpreting) are compared, with the aim of identifying the typical features of translated language and interpreted language. Together with Silvia Bernardini, the authors have built the comparable and parallel European Parliament Translation and Interpreting Corpus (EPTIC), which contains four components: (1) speeches delivered at the European Parliament and (2) their interpretations, (3) verbatim reports of the proceedings (which are edited versions of the original speeches) and (4) their translations. In this study, the authors rely on four EPTIC sub-corpora: interpreted Italian, translated Italian (both with English as source language), original spoken Italian and original written Italian. The study focuses on phraseology, which has been extensively studied in CBTS so far, mainly in relation to interference and normalization/conventionalization. More specifically, the study is devoted to infrequent, highly frequent and strongly associated collocations made up of a noun and a modifier. The results suggest that translations are more phraseologically conventional than interpretations, especially as regards strongly associated expressions, which require more time for processing. This trend, the authors argue, may be related to the cognitive and task-related constraints characterizing translation and interpreting. It clearly emerges from this chapter that CBTS can (and will) benefit from a broader research focus, in which, among other things, different translation modes are systematically compared (not only comparing written translation with simultaneous interpreting, but also considering sight translation, consecutive interpreting, voice-over, subtitling, dubbing, etc., provided comparable corpora can be compiled).
Oliver Čulo, Silvia Hansen-Schirra and Jean Nitzke focus on an under-researched, technology-related factor in CBTS, viz. the effect of computer-aided translation. More particularly, the authors investigate terminological variation across three types of translations: human translations, machine translations and post-edited translations. They contrast texts translated from English into German from two specific genres which have been relatively overlooked in previous research, viz. manuals and patient information leaflets. To do so, they rely on the perplexity coefficient, a measure borrowed from the domain of Machine Translation, which, to date, has not been used in CBTS. Although the results suggest that post-edited translations are influenced by the initial machine translation output, further research is needed to determine the cause(s) of this trend. The authors put forward a number of hypotheses that require further investigation, such as the idea that post-editors might tend to focus on the micro-level rather than the overall text, thereby paying less attention to terminological consistency.
Along the same lines, Ekaterina Lapshinova-Koltunski is the first to shed empirical-quantitative light on the interplay between translation method and text register. In this study, she compares a number of linguistic features in five translation varieties, such as professional human translation and rule-based machine translation, and in seven written registers (including, for example, manuals, tourism leaflets and fiction). The lexico-grammatical patterns under investigation originate from the Hallidayan framework of field, tenor and mode and are linked to a number of well-known translation features such as explicitation, simplification and shining through. The author applies an unsupervised technique, i.e. hierarchical cluster analysis, to investigate (i) variation across translation methods, (ii) variation across registers, and (iii) the interplay between translation method and register. The results reveal that both dimensions are present in the clusters. Interestingly, an additional dimension emerges from the analysis, i.e. translation expertise, which certainly requires further research in the field.
The study presented by Bert Cappelle and Rudy Loock re-opens a discussion which had been relegated to the periphery in Mona Baker’s research programme (Baker 1993), viz. the effect of typological differences in source languages on translational products. The authors set out to determine whether there is a difference in the usage of phrasal verbs in English translations from Romance languages and from Germanic languages. Their study relies on a monolingual comparable corpus made up of three components: the British National Corpus and two Translational English Corpus components, representing six Romance source languages and six Germanic source languages, respectively. The distribution of phrasal verbs with up, down and out reveals that source language family interference has a significant effect on translation. This leads the authors to dismiss normalization and levelling-out as translation universals. Additionally, a small-scale, more qualitative complementary study on Le Petit Prince and its English translation is carried out to determine what elements in the source text lead to phrasal verbs in the target text, revealing that morphologically complex verbs are much more likely to be translated with a phrasal verb than simplex source verbs.
Finally, Kerstin Kunz, Stefania Degaetano-Ortlieb, Ekaterina Lapshinova-Koltunski, Katrin Menzel and Erich Steiner present the findings of a contrastive study of cohesive devices in German and English original texts. Their aim is to uncover contrastive trends that can help translators overcome language-pair specific pitfalls and make strategic choices with regard to the translation and use of cohesive devices. The distribution of cohesive features is analysed both in written and spoken registers in GECCo, a German-English corpus, which allows for deriving suggestions with regard to register-specific translation strategies as well. GECCo is analysed by means of an exploratory data analysis technique, i.e. correspondence analysis, so as to uncover similarities and differences with regard to cohesive devices between the languages and the registers investigated. In addition, a supervised technique with support vector machines is applied to determine which cohesive features are distinctive and therefore contribute to the differences between the languages and registers under investigation. The results show, among other things, that (i) register is an important variable when it comes to lexicogrammatical variation, and (ii) the differences between registers in the German subcorpus are more pronounced than those in the English subcorpus which, in turn, reflects the importance of the language variable.
Most of the chapters in this volume were first presented at the New Ways of Analyzing Translational Behavior in Corpus-Based Translation Studies workshop, held at the 46th Societas Linguistica Europaea (SLE) meeting in Split, Croatia in 2013. We would like to thank the organizers of the SLE meeting in Split for providing us with optimal circumstances to discuss the current state of the art in CBTS and identify some of the future directions that need to be explored. In editing the present volume we were supported by various reviewers. We would like to thank them all for sharing their insightful comments and advice. We are also deeply grateful to the authors for doing such a wonderful job and for not giving up on us after another round of critical remarks and suggestions. We are very well aware that it might have annoyed them (and perhaps even frustrated them at times), but we are convinced that it has greatly contributed to the quality of the present volume. Finally, we wish to thank most heartily Julie Miess at Mouton for taking care of all practical details concerning this publication.
Baker, M. 1993. Corpus linguistics and translation studies. Implications and applications. In M. Baker, G. Francis & E. Tognini-Bonelli (eds.), Text and technology. In honour of John Sinclair, 233–250. Amsterdam: John Benjamins.
Bernardini, S. & A. Ferraresi. 2011. Practice, description and theory come together: Normalization or interference in Italian technical translation? Meta 56(2). 226–246.
De Sutter, G., I. Delaere & K. Plevoets. 2012. Lexical lectometry in corpus-based translation studies. Combining profile-based correspondence analysis and logistic regression modeling. In M. Oakes & J. Meng (eds.), Quantitative Methods in Corpus-based Translation Studies. A practical guide to descriptive translation research, 325–345. Amsterdam/Philadelphia: John Benjamins.
Halverson, S. 2003. The cognitive basis of translation universals. Target. International Journal of Translation Studies 15(2). 197–241.
Halverson, S. 2010. Cognitive translation studies: developments in theory and method. In G. Shreve & E. Angelone (eds.), Translation and Cognition, 349–369. Amsterdam: John Benjamins.
Kruger, H. & B. van Rooy. 2012. Register and the features of translated language. Across Languages and Cultures 13(1). 33–65.
Lanstyák, I. & P. Heltai. 2012. Universals in language contact and translation. Across Languages and Cultures 13(1). 99–121.
Neumann, S. 2011. Contrastive register variation. A quantitative approach to the comparison of English and German. Berlin: Mouton de Gruyter.
Olohan, M. 2004. Introducing corpora in translation studies. Taylor & Francis.
Olohan, M. & M. Baker. 2000. Reporting that in translated English: Evidence for subconscious processes of explicitation? Across Languages and Cultures 1(2). 141–158.
Øverås, L. 1998. In search of the third code. An investigation of norms in literary translation. Meta 43(4). 557–570.
Pym, A. 2008. On Toury’s laws of how translators translate. In A. Pym, M. Shlesinger & D. Simeoni (eds.), Descriptive Translation Studies and beyond. Investigations in Honor of Gideon Toury, 311–328. Amsterdam/Philadelphia: John Benjamins.
Scott, N. 1998. Normalisation and readers’ expectations: A study of literary translation with reference to Lispector’s A Hora Da Estrela. Liverpool: University of Liverpool doctoral dissertation.
Tirkkonen-Condit, S. 2004. Unique items – over- or under-represented in translated language? In A. Mauranen & P. Kujamäki (eds.), Translation Universals. Do they exist?, 177–184. Amsterdam/Philadelphia: John Benjamins.