15 Conclusion: The Future of Machine Translation

Over the course of this book, we have traced the evolution of machine translation, from the first experiments in the 1950s to today's systems, which are operational and available on the Internet at no cost. We have also examined the main features of these systems: some are based on dictionaries and transfer rules, others on the statistical analysis of very large corpora. Lastly, we have described a new approach based on deep learning that seems highly promising and is especially exciting from a technical and cognitive point of view. But let us first have a look at the commercial challenges.

Commercial Challenges

As previously noted, machine translation has undergone a profound renewal since the 1990s, the period when very large quantities of bilingual documents became freely and directly available online. At the same time, the development of the Internet played a crucial role, since people can now communicate worldwide through email, blogs, and social networks. There is thus a need for tools that make it possible to communicate across languages without having to master them. This technological revival was therefore supported by commercial and strategic prospects, notably in the fields of telecommunications and information technology.

In the course of this book, we have described several kinds of applications. Everybody is familiar with the translation tools freely available on the Internet and created by Internet giants such as Google or Microsoft. In a multipolar, multilingual world, mastering this technology is an absolute must for Internet and telecommunications companies with global ambitions. Machine translation is the key to multiple products with great potential in the near future, such as live multilingual communication or multilingual access to patent databases. Some of these applications will generate significant revenue in the years to come.

Another type of application is probably not as well known: several specialized companies supply professional commercial translation products. These products are complex, adaptable, and often sold with specific services (especially the development of specialized dictionaries or the rapid integration of new languages on demand). This kind of product is primarily sold to large companies and government administrations, especially in the military and intelligence domains. Strategic interests thus play a major role in this context. The adaptability of such systems is also a key element: a solution provider’s ability to rapidly supply accurate translations for a new field or a new language is of utmost importance. In this framework, it is also often crucial for customers to be able to develop resources themselves, especially when the data to be analyzed are classified.

The development of communications networks, mobile Internet, and the miniaturization of electronic devices also highlight the need to move quickly toward audio applications that can translate speech directly. Speech processing has been the subject of intensive research in recent decades, and performance is now acceptable. However, the task remains difficult, since both speech recognition and machine translation have to be performed in real time, and errors accumulate (i.e., if a word has not been properly recognized by the speech recognition system, it will not be properly translated). Large companies producing connected devices (Apple, Google, Microsoft, or Samsung, to name a few) develop their own solutions and regularly acquire start-ups in these technological domains. They need to be first on the technological front and to offer new features that may become an important source of revenue in the future.

The future will likely see the integration of machine translation modules into new kinds of appliances, as seen in chapter 14. Microsoft has already presented live demonstrations of multilingual conversations, integrating speech translation into Skype. Google, Samsung, and Apple are creating similar applications for mobile phones, and even for “smart” eyeglasses. While it is not yet clear whether these gadgets will really be used in everyday life, they are interesting for specific professional contexts, such as the maintenance of complex systems in the aeronautics or nuclear industries, where technicians must be able to communicate while keeping their hands free. It is clear that commercial challenges will continue to drive research toward more powerful and accurate systems.

We live in a multilingual world, yet machine translation (and the field of natural language processing as a whole) raises problems of language domination, since the domain is of course not independent of economic and political considerations. As has been emphasized, even if the systems available on the Internet officially offer translation for several dozen languages, the quality is very poor for most of them, especially when English is not the source or, above all, the target language. Aside from the major Indo-European languages (English, Russian, French, German, etc.), some languages (such as Arabic or Chinese) are now the focus of intensive research. These are usually among the most widely spoken languages in the world and are associated with great economic potential. One can also find research projects addressing rarer languages, but they remain marginal, and the quality of the resulting systems is generally modest. Processing rarer languages remains a highly interesting challenge, provided it is not driven purely by economic interests.

A Cognitively Sound Approach to Machine Translation?

To conclude this journey, we would like to say a few words about cognitive issues. The most active researchers in the field of machine translation generally avoid addressing cognitive issues and draw few parallels with the way humans perform a translation. In the past, the field of artificial intelligence has suffered too much from spectacular and inflated claims, often made about systems that had nothing to do with the way humans think or reason. It may thus seem reasonable to focus on technological issues and to leave aside any parallel with human behavior, especially because we do not, in fact, know much about the way the human brain works.

However, despite what has just been said, it may be interesting in this conclusion to have a look at cognitive issues, because the evolution of the field of machine translation is arguably highly relevant from this point of view. The first systems were based on dictionaries and rules, and on the assumption that it was necessary to encode all kinds of knowledge about the source and target languages in order to produce a relevant translation. This approach largely failed because information is often partial and sometimes contradictory, and knowledge is contextual and fuzzy. Moreover, no one really knows what knowledge is, or where it begins and where it ends. In other words, humans cannot efficiently develop a comprehensive system of rules for machine translation, since the task is potentially infinite and it is not clear what should be encoded in practice.

Statistical systems then seemed like a good solution, since these systems are able to efficiently calculate complex contextual representations for thousands of words and expressions. This is something the brain probably does in a very different way, but nevertheless very efficiently: we have seen in chapter 2 that any language is full of ambiguities (cf. “the bank of a river” vs. “the bank that lends money”). Humans are not bothered at all by these ambiguities: most of the time we choose the right meaning in context without even considering the other meanings. In “I went to the bank to negotiate a mortgage,” it is clear that the word “bank” refers to the lending institution, and the fact that there is another meaning for “bank” is simply ignored by most humans. A computer still has to consider all options, but at least statistical systems offer interesting and efficient ways to model word senses based on the context of use.
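
To make this idea concrete, here is a minimal sketch, written for this discussion rather than drawn from any actual system, of how context can be used to choose between the two senses of “bank.” The tiny sense-tagged examples, the stopword list, and the scoring are all invented for illustration; real statistical systems rely on vastly larger corpora and more sophisticated probabilistic models.

```python
# Illustrative only: count which context words co-occur with each sense of
# "bank" in a tiny invented "corpus", then score a new sentence against
# those counts to pick the most likely sense.
from collections import Counter

STOPWORDS = {"the", "a", "an", "to", "of", "on", "i", "we", "my"}

# Invented sense-tagged examples (not real corpus data).
training = {
    "bank/finance": [
        "I went to the bank to negotiate a mortgage",
        "the bank approved the loan",
    ],
    "bank/river": [
        "we walked along the bank of the river",
        "the boat drifted toward the muddy bank",
    ],
}

def content_words(sentence):
    """Lowercase the sentence and drop 'bank' itself plus a few function words."""
    return [w for w in sentence.lower().split()
            if w != "bank" and w not in STOPWORDS]

# Per-sense counts of the context words observed in the training examples.
context_counts = {
    sense: Counter(w for s in sentences for w in content_words(s))
    for sense, sentences in training.items()
}

def disambiguate(sentence):
    """Return the sense whose observed context words best match the sentence."""
    words = content_words(sentence)
    scores = {sense: sum(counts[w] for w in words)
              for sense, counts in context_counts.items()}
    return max(scores, key=scores.get)

print(disambiguate("the bank refused my mortgage application"))       # bank/finance
print(disambiguate("they picnicked on the grassy bank of the river")) # bank/river
```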

We have also witnessed rapid progress, from the very first systems based on a word-for-word approach to segment-based approaches: progressively longer sequences of text have been taken into account, leading to better translations. The new generation of systems based on deep learning takes the whole sentence as the basic translation unit and thus offers a valuable solution to the limitations of previous approaches. We have also seen that this approach takes into account all kinds of relations between words in the sentence, which means that structural knowledge (i.e., some kind of syntax) is involved in the translation process. Because all this information is embedded and processed at the same time within a single learning process, one does not need to deal either with the delicate integration of various complex modules or with the propagation of analysis errors, contrary to what happened with most previous systems (note, however, that errors can still propagate within a neural network; the sole use of neural networks does not solve all problems magically, of course).
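
The following sketch illustrates this end-to-end idea in a few lines of code. It is only an illustration under simplifying assumptions: the vocabulary sizes, dimensions, and random “data” are invented, PyTorch is assumed to be available, and real neural translation systems add attention mechanisms, subword units, and many other refinements. The point is simply that a single network reads the whole source sentence and generates the target sentence, and that one loss trains everything jointly, with no hand-built pipeline of separate modules.

```python
# A toy encoder-decoder translation model (assumptions: PyTorch installed;
# all sizes and data are invented for illustration).
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB, HID = 100, 100, 32, 64

class TinySeq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(SRC_VOCAB, EMB)
        self.tgt_emb = nn.Embedding(TGT_VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)
        self.decoder = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TGT_VOCAB)

    def forward(self, src_ids, tgt_ids):
        # Encode the whole source sentence into a single hidden state.
        _, state = self.encoder(self.src_emb(src_ids))
        # Decode the target sentence conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
        return self.out(dec_out)  # a score per target-vocabulary word per position

# One joint training step: a single loss updates encoder and decoder together.
model = TinySeq2Seq()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

src = torch.randint(0, SRC_VOCAB, (8, 10))   # batch of 8 toy "source sentences"
tgt = torch.randint(0, TGT_VOCAB, (8, 12))   # corresponding "target sentences"

logits = model(src, tgt[:, :-1])             # predict each next target word
loss = loss_fn(logits.reshape(-1, TGT_VOCAB), tgt[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
print(float(loss))
```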

In practice, deep learning systems still suffer from important limitations, and we saw in chapter 12 a number of the research issues at stake (unknown words, long sentences, optimization problems, etc.). While we are still far from perfect machine translation systems, it is nonetheless interesting to see that the best-performing systems now operate directly at the sentence level, make limited use of manually defined syntactic or semantic knowledge, and produce translations on the fly based on the huge quantities of data used for training. They thus seem to reflect several characteristics of human language: for example, the fact that child language acquisition is based on exposure to language (and not on the explicit learning of grammar rules), and the fact that word distribution and linguistic complexity play a role (some words are more frequent than others and are learned earlier, and simpler syntactic structures are easier to acquire and easier to translate). It is not completely clear how neural networks work, what knowledge they actually use, or how their architecture influences the overall result, but it is clear that they bear interesting similarities to basic features of human language.

As already said, deep learning machine translation is still in its infancy. We can expect rapid progress as these systems achieve better quality and gradually appear in a wider range of professional contexts. Automatic systems will, of course, not replace human translation (this is neither a goal nor a desired outcome), but they will help millions of people gain access to information they could not grasp otherwise. Digital communication will continue to grow, as will research in the machine translation domain, and one can expect that in the not-too-distant future it will be possible to converse over the phone with someone speaking another language. One will then just need to insert a small device into one's ear to understand any language, and Douglas Adams' Babel fish will no longer be fiction, although the device may not be a fish!