Bibliography and Further Reading

This book is accompanied by the following website: http://lattice.cnrs.fr/machinetranslation. The website provides a variety of supplementary material, including corrections of mistakes and other resources that should be useful to readers. It also presents an analysis of the output of different machine translation systems, which could not be included in this book because the systems’ performance evolves too quickly. In addition, this chapter contains suggestions for further reading for the reader who would like to know more than what could be said in this short introduction. This list is by no means exhaustive; an exhaustive list would in any case be impossible, since new publications on the topic appear every day. The following references can be considered the main ones to consult in order to explore different aspects of the topic in greater detail. Most references contain, in turn, their own list of references on specific issues.

Historical aspects of the topic are very well documented thanks to the comprehensive work of John Hutchins. Some other aspects are more difficult to explore, because they are plentiful and very technical (especially concerning current lines of research) or rare and quickly obsolete (e.g., questions related to the commercial aspects of the field).

On historical aspects, the reader should refer to John Hutchins’ website: http://www.hutchinsweb.me.uk. John Hutchins has also written three major books on the question:

The 1992 book, co-authored with Harold Somers, remains interesting, even if it is of course now dated. The other two books are essential reading for anyone interested in the history of machine translation. The 1986 book contains a description of the main research groups involved in the domain up to the early 1980s. It also includes descriptions of the main systems and the techniques used by different research groups. The 2000 book contains more historical anecdotes and personal stories, as well as firsthand accounts by the main actors of the domain.

Other publications by John Hutchins are also useful for a quick but reliable overview. For example:

On corpus alignment and the statistical approach to machine translation, the following books are quite technical but important:

On natural language processing in general, several good overviews exist, for instance:

In what follows, we also give other references that could be useful to the reader who wants to know more on some specific aspects of the question. The following references have also been used as the main sources for this book.

Chapter 2: The Trouble with Translation

Many publications have addressed the problem of translation, but they cannot all be listed here. The recent book by David Bellos is an entertaining and captivating introduction, even if many others could also have been cited.

Chapter 3: A Quick Overview of the Evolution of Machine Translation

This chapter presents a quick overview of the field. The reader should thus refer to the general references given at the beginning of this section.

Chapter 4: Before the Advent of Computers…

The literature on universal languages is huge but the introduction by Umberto Eco is accessible and entertaining. Hutchins’ article on Artsrouni and Trojanskij is the main source in English on these two researchers.

Chapter 5: The Beginnings of Machine Translation: The First Rule-Based Systems

For this chapter, apart from the writings of Weaver and Bar-Hillel themselves, one can refer to Hutchins’ texts on Weaver (“Warren Weaver and the launching of MT: Brief biographical note”) and Y. Bar-Hillel (“Yehoshua Bar-Hillel: A philosopher’s contribution to machine translation”), both in Early Years in Machine Translation (see the full reference at the beginning of this chapter).

Chapter 6: The 1966 ALPAC Report and Its Consequences

The ALPAC report and some comments on it, especially by John Hutchins, can easily be found on the Internet.

Chapter 7: Parallel Corpora and Sentence Alignment

The book by Tiedemann, Bitext Alignment (Morgan and Claypool Publishers, 2011; see full reference at the beginning of this chapter), gives a general overview of the topic. A few historical research papers remain the main contributions to the domain. See, for example:

Chapter 8: Example-Based Machine Translation

Several research papers are accessible and give a good overview of the benefits but also the limitations of this paradigm.

Chapter 9: Statistical Machine Translation and Word Alignment

The most important references (by Koehn on statistical machine translation and by Tiedemann on corpus alignment) were given at the beginning of this chapter. The series of historical papers published by the IBM team in the late 1980s and early 1990s should be read carefully by anyone interested in statistical machine translation.

A website (http://www.statmt.org) gives access to a large amount of information on the domain, including research papers, tutorials, links to free software, and so on.

Chapter 10: Segment-Based Machine Translation

The previous website (http://www.statmt.org) is probably the best source of information for recent trends related to statistical machine translation, of which segment-based machine translation is part.

Chapter 11: Challenges and Limitations of Statistical Machine Translation

See http://www.statmt.org, as for chapter 10 above.

Kenneth Church (2011). “A pendulum swung too far.” Linguistic Issues in Language Technology 6 (5).

Chapter 12: Deep Learning Machine Translation

The book by Goodfellow et al., although technical, offers an accessible and comprehensible introduction to deep learning. One can also refer to the blogs of commercial systems that offer interesting overviews (see, for example, Google’s Research Blog, https://research.googleblog.com/2016/09/a-neural-network-for-machine.html, or the Systran blog: http://blog.systransoft.com/how-does-neural-machine-translation-work). Google’s paper describing their first operational deep learning machine translation system is also worth reading.

Chapter 13: The Evaluation of Machine Translation Systems

The BLEU, NIST, and METEOR measures are described in the following three publications:

We have also cited the following four references:

Chapter 14: The Machine Translation Industry: Between Professional and Mass-Market Applications

There are very few studies on this topic. The Directorate-General for Translation of the European Commission gives some figures on its website: http://ec.europa.eu/dgs/translation/faq/index_en.htm#faq_4/.

The following compendium of companies in the field was quite comprehensive in 2010 but is already obsolete because the field evolves very quickly.

In addition, specialized journals and magazines in computer science and information technology report the main news concerning the domain, along with traditional newspapers from the financial domain.

  1. John Hutchins (1986). Machine Translation: Past, Present, Future. Series in Computers and Their Applications. Chichester, UK: Ellis Horwood.
  2. John Hutchins and Harold L. Somers (1992). An Introduction to Machine Translation. London: Academic Press.
  3. John Hutchins (ed.) (2000). Early Years in Machine Translation: Memoirs and Biographies of Pioneers. Amsterdam: John Benjamins.
  1. John Hutchins (2010). “Machine translation: A concise history.” Journal of Translation Studies 13 (1–2): 29–70. Special issue: The teaching of computer-aided translation, ed. Chan Sin Wai.
  1. Philipp Koehn (2009). Statistical Machine Translation. Cambridge: Cambridge University Press.
  2. Jörg Tiedemann (2011). Bitext Alignment. San Rafael, CA: Morgan and Claypool Publishers.
  1. Dan Jurafsky and James H. Martin (2016). Speech and Language Processing (3rd ed. draft). Available online: https://web.stanford.edu/~jurafsky/slp3/.
  1. David Bellos (2011). Is That a Fish in Your Ear? Translation and the Meaning of Everything. London: Penguin/Particular Books.
  2. Adam Kilgarriff (2006). “Word senses.” In Word Sense Disambiguation: Algorithms and Applications (E. Agirre and P. Edmonds, eds.). Dordrecht: Springer.
  1. René Descartes (1991). The Philosophical Writings of Descartes. Volume 3: The Correspondence. Cambridge: Cambridge University Press.
  2. Umberto Eco (1997). The Search for the Perfect Language. Oxford: Wiley.
  3. John Hutchins (2004). “Two precursors of machine translation: Artsrouni and Trojanskij.” International Journal of Translation 16 (1): 11–31.
  4. Philip P. Wiener (ed., 1951). Leibniz Selections. New York: Simon and Schuster.
  1. Yehoshua Bar-Hillel (1958 [1961]). “Some linguistic obstacles to machine translation.” Proceedings of the Second International Congress on Cybernetics (Namur, 1958), 197–207, 1961 (reprinted as Appendix II in Bar-Hillel 1959).
  2. Yehoshua Bar-Hillel (1959). “Report on the state of machine translation in the United States and Great Britain.” Technical report, 15 February 1959. Jerusalem: Hebrew University.
  3. Yehoshua Bar-Hillel (1960). “The present status of automatic translation of languages.” Advances in Computers 1: 91–163.
  4. Richard H. Richens (1956). “A general program for mechanical translation between two languages via an algebraic interlingua.” Mechanical Translation 3 (2): 37.
  5. Karen Sparck Jones (2000). “R. H. Richens: Translation in the NUDE.” In Early Years in Machine Translation (W. J. Hutchins, ed.). Amsterdam: John Benjamins, 263–278.
  6. Warren Weaver (1949 [1955]). “Translation.” Reproduced in Machine Translation of Languages (W. N. Locke and D. A. Booth, eds.). Cambridge, MA: MIT Press, 15–23.
  1. The Automatic Language Processing Advisory Committee (1966). “Language and Machines—Computers in Translation and Linguistics.” Washington, DC: National Academy of Sciences, National Research Council. [This publication is better known as the “ALPAC Report.”]
  2. John Hutchins (2003). “ALPAC: The (in)famous report.” In Readings in Machine Translation (S. Nirenburg, H. L. Somers, Y. Wilks, eds.), 131–135. Cambridge, MA: MIT Press.
  3. John Hutchins (1988). “Recent developments in machine translation: A review of the last five years.” In New Directions in Machine Translation: Conference Proceedings, Budapest 18–19 August 1988 (D. Maxwell, K. Schubert, and T. Witkam, eds.), 7–62. Foris Publications (Distributed Language Translation 4), Dordrecht.
  4. Anthony G. Oettinger (1963). “The state of the art of automatic language translation: An appraisal.” In Beiträge zur Sprachkunde und Informationsverarbeitung, n°2, 17–29.
  1. William A. Gale and Kenneth W. Church (1993). “A program for aligning sentences in bilingual corpora.” Computational Linguistics 19 (1): 75–102.
  2. Martin Kay and Martin Röscheisen (1993). “Text-translation alignment.” Computational Linguistics 19 (1): 121–142.
  1. Makoto Nagao (1984). “A framework of a mechanical translation between Japanese and English by analogy principle.” In Artificial and Human Intelligence (A. Elithorn and R. Banerji, eds.). Elsevier Science Publishers, Amsterdam.
  2. Eiichiro Sumita and Hitoshi Iida (1991). “Experiments and prospects of example-based machine translation.” Proceedings of the Twenty-Ninth Conference of the Association for Computational Linguistics, 185–192. Berkeley, CA.
  3. Thomas R. Green (1979). “The necessity of syntax markers: Two experiments with artificial languages.” Journal of Verbal Learning and Verbal Behavior 18: 481–496.
  4. Harold Somers (1999). “Example-based machine translation.” Machine Translation 14 (2): 113–157.
  5. Nano Gough and Andy Way (2004). “Robust large-scale EBMT with marker-based segmentation.” Proceedings of the Tenth International Conference on Theoretical and Methodological Issues in Machine Translation, 95–104. Baltimore, MD.
  1. Peter Brown, John Cocke, Stephen Della Pietra, Vincent Della Pietra, Frederick Jelinek, Robert Mercer, and Paul Roossin (1988). “A statistical approach to language translation.” In Proceedings of the Twelfth Conference on Computational Linguistics, Vol. 1, 71–76. Association for Computational Linguistics, Stroudsburg, PA. http://dx.doi.org/10.3115/991635.991651/.
  2. Peter F. Brown, John Cocke, Stephen A. Della Pietra, Vincent J. Della Pietra, Frederick Jelinek, John D. Lafferty, Robert L. Mercer, and Paul S. Roossin (1990). “A statistical approach to machine translation.” Computational Linguistics 16 (2): 79–85.
  3. Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer (1993). “The mathematics of statistical machine translation: Parameter estimation.” Computational Linguistics 19 (2): 263–311.
  1. Ian Goodfellow, Yoshua Bengio, and Aaron Courville (2016). Deep Learning. Cambridge, MA: MIT Press.
  2. Yonghui Wu, et al. (2016). “Google’s neural machine translation system: Bridging the gap between human and machine translation.” Published online. arXiv:1609.08144.
  1. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu (2002). “BLEU: A method for automatic evaluation of machine translation.” Fortieth Annual Meeting of the Association for Computational Linguistics, 311–318. Philadelphia.
  2. George Doddington (2002). “Automatic evaluation of machine translation quality using n-gram cooccurrence statistics.” Proceedings of the Human Language Technology Conference, 128–132. San Diego.
  3. Satanjeev Banerjee and Alon Lavie (2005). “METEOR: An automatic metric for MT evaluation with improved correlation with human judgments.” Proceedings of the Workshop on Intrinsic and Extrinsic Evaluation Measures for MT and/or Summarization at the Forty-Third Annual Meeting of the Association for Computational Linguistics. Ann Arbor, MI.
  1. Martin Kay (2013). “Putting linguistics back into computational linguistics.” Conference given at the Ecole normale supérieure, Paris. http://savoirs.ens.fr/expose.php?id=1291/.
  2. Philipp Koehn, Alexandra Birch, and Ralf Steinberger (2009). “462 machine translation systems for Europe.” Proceedings of MT Summit XII, 65–72. Ottawa, Canada.
  3. David Vilar, Jia Xu, Luis Fernando D’Haro, and Hermann Ney (2006). “Error analysis of machine translation output.” Proceedings of the Language Resource and Evaluation Conference, 697–702. Genoa, Italy.
  4. John S. White, Theresa O’Connell, and Francis O’Mara (1994). “The ARPA MT evaluation methodologies: Evolution, lessons, and future approaches.” Proceedings of the 1994 Conference, Association for Machine Translation in the Americas, 193–205. Columbia, MD.
  1. John Hutchins, on behalf of the European Association for Machine Translation (2010). “Compendium of translation software.” http://www.hutchinsweb.me.uk/Compendium.htm.