The Manifest Destiny of Artificial Intelligence
Will AI create mindlike machines, or will it show how much a mindless machine can do?
The Vodka Is Strong
In The Hitchhiker’s Guide to the Galaxy, Douglas Adams introduces the Babel Fish: “If you stick one in your ear, you can instantly understand anything said to you in any form of language.” Here in our little corner of the galaxy, Babel Fish is the name of a web service (part of Yahoo) that also performs translation, though it’s limited to languages from Planet Earth. Google and Microsoft offer similar services. Depending on your needs and expectations, the quality of the results can seem either amazing or risible.
Efforts to build a translation machine were already under way in the 1950s. The simplest of the early schemes was essentially an automated bilingual dictionary: The machine would read each word in the source text, look it up in the dictionary, and return the corresponding word or words in the target language. The failure of this approach is sometimes dramatized with the tale of the English→Russian→English translation that began with “The spirit is willing but the flesh is weak” and ended with “The vodka is strong but the meat is rotten.” John Hutchins, in a history of machine translation, thoroughly debunks that story, but the fact remains that word-by-word dictionary lookup was eventually dismissed as useless.
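To make the dictionary scheme concrete, here is a minimal sketch in Python. It is not a reconstruction of any 1950s system: the tiny English-to-French dictionary and the function name translate_word_by_word are invented for illustration, and the rule of always taking the first listed equivalent is my own simplifying assumption.

```python
# Hypothetical toy dictionary, invented for illustration only.
# Each English word maps to a list of possible French equivalents.
TOY_DICTIONARY = {
    "the": ["le", "la"],
    "cat": ["chat"],
    "fears": ["craint"],
    "cold": ["froid", "froide"],
    "water": ["eau"],
}


def translate_word_by_word(text, dictionary):
    """Replace each word with its first dictionary equivalent.

    Words missing from the dictionary pass through unchanged.
    Word order, gender agreement and context are all ignored,
    which is exactly the weakness of the approach.
    """
    output = []
    for word in text.lower().split():
        choices = dictionary.get(word)
        output.append(choices[0] if choices else word)
    return " ".join(output)


print(translate_word_by_word("The cat fears cold water", TOY_DICTIONARY))
# prints: le chat craint froid eau
```

Every word in the output is a defensible dictionary equivalent, yet the result is not French: the adjective is in the wrong position and the wrong gender, and the article that belongs with eau is missing.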
Later programs worked with higher-level linguistic structures—phrases and sentences rather than individual words. In the early 1970s Yorick Wilks, who was then at Stanford, built an English-to-French translation program that explicitly tried to reproduce some of the mental processes of a human translator. The program would read a sentence, break it into component phrases, try to assign meanings to the words based on their local context, and then generate corresponding phrases in the target language.
Wilks’s project never got beyond the prototype stage, but another translation system with roots in the same era is still in wide use and under active development. SYSTRAN, founded in 1968 by Peter Toma, now powers the Babel Fish service. It began as a dictionary-based system but now has a complex structure with many modules.
In recent years a quite different approach to machine translation has attracted more interest and enthusiasm. The idea is to ignore the entire hierarchy of syntactic and semantic structures—the nouns and verbs, the subjects and predicates, even the definitions of words—and simply tabulate correlations between words in a large collection of bilingual texts. The early work on this statistical approach to translation was done by Peter F. Brown and his colleagues at IBM, who had already applied analogous ideas to problems of speech recognition. The method is now the basis of translation services from both Google and Microsoft.
Deliberately ignoring everything we know about grammar and meaning would seem to be a step backward. However, all the information encoded in grammar rules and dictionary definitions is implicitly present in a large collection of texts; after all, that’s where the grammarians and the lexicographers get it from in the first place. Where do the bilingual texts come from? Government bodies that publish official documents in two or more languages, such as Canada and the European Union, have been an important source.
Suppose you have a large corpus of parallel documents in French and English, broken down into pairs of matched sentences. With this resource in hand, you are asked to provide an English translation of the French proverb Chat échaudé craint l’eau froide. Here is one way to begin: Take each word of the proverb, find all the French sentences in the corpus that include this word, retrieve the corresponding English sentences, and look for words that appear in these sentences with unusually high frequency. In some cases the outcome of this process will be easy to interpret. If a French sentence includes the word chat, the English version is very likely to mention cat. Other cases could be equivocal. The French craint might be strongly correlated with several English words, such as fears, dreads and afraid. And occasionally it might happen that no word stands out clearly. By taking all the English words identified in this way, and perhaps applying a threshold rule of some kind, you can come up with a list of words that have a good chance of appearing in the translation.
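The word-selection step can be sketched in a few lines of Python. Everything here is an assumption made for the sake of illustration: the six-sentence corpus is invented, the function name candidates is hypothetical, and the particular threshold rule (a word must appear in at least min_count matched sentences and be at least threshold times more common there than in the corpus overall) is just one of many that could be chosen.

```python
from collections import Counter

# Invented toy corpus of (French, English) sentence pairs.
CORPUS = [
    ("le chat dort", "the cat sleeps"),
    ("le chat noir mange", "the black cat eats"),
    ("le chien dort", "the dog sleeps"),
    ("le chien craint le chat", "the dog fears the cat"),
    ("il craint la nuit", "he fears the night"),
    ("la nuit est noire", "the night is black"),
]


def candidates(french_word, corpus, threshold=1.5, min_count=2):
    """List English words unusually frequent opposite french_word."""
    overall = Counter()  # English sentences containing each word
    matched = Counter()  # same, but only where the French side has french_word
    n_matched = 0
    for french, english in corpus:
        english_words = set(english.split())
        overall.update(english_words)
        if french_word in french.split():
            matched.update(english_words)
            n_matched += 1
    if n_matched == 0:
        return []

    scores = {}
    for word, count in matched.items():
        if count < min_count:
            continue  # too little evidence to trust the ratio
        rate_in_matched = count / n_matched
        rate_overall = overall[word] / len(corpus)
        scores[word] = rate_in_matched / rate_overall

    # Highest ratio first; ties broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, score in ranked if score >= threshold]


print(candidates("chat", CORPUS))    # prints: ['cat']
print(candidates("craint", CORPUS))  # prints: ['fears']
```

Note that the word the turns up in every English sentence paired with chat, but it turns up in every other sentence as well, so its ratio is 1 and it falls below the threshold. Comparing against the overall frequency is what separates a real correlation from a merely common word.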
Now the task is to put the selected words in order, and thus make an English sentence out of them. This too can be done by a probabilistic process, guided by the relative frequencies of short sequences of words (n-grams) in English text. The likeliest arrangement of the words is taken as the translation.
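Here is a minimal sketch of the ordering step, using bigrams (sequences of two words). The four-sentence English sample, the function names and the add-one smoothing are all assumptions of mine, and trying every permutation is feasible only for a handful of words; it is meant to show the idea, not how a production system searches.

```python
import math
from collections import Counter
from itertools import permutations

# Invented sample of English text used to estimate bigram frequencies.
ENGLISH_TEXT = [
    "a scalded cat fears cold water",
    "the cat fears the dog",
    "cold water is good",
    "a cat drinks water",
]


def train_bigrams(sentences):
    """Count each word as a left context, and each adjacent word pair."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams


def log_score(words, contexts, bigrams, vocab_size):
    """Log-probability of a word sequence under the bigram model."""
    tokens = ["<s>"] + list(words) + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen pairs from scoring log(0).
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def best_order(words, sentences):
    """Return the arrangement of words with the highest bigram score."""
    contexts, bigrams = train_bigrams(sentences)
    vocab = {w for s in sentences for w in s.split()} | {"</s>"}
    best = max(
        permutations(words),
        key=lambda order: log_score(order, contexts, bigrams, len(vocab)),
    )
    return " ".join(best)


print(best_order(["water", "cat", "cold", "fears", "scalded", "a"], ENGLISH_TEXT))
# prints: a scalded cat fears cold water
```

The result is no surprise, since the target sentence is itself part of the tiny sample. With a realistic body of English text the same scoring would draw on millions of sentences, none of which need contain the proverb.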
In practice, statistical translation programs are not quite as crude and simple-minded as the algorithm presented here. In particular, the ordering of the English words is done by an alignment process that starts with the French sequence and allows for insertions, deletions and transpositions. Still, the entire translation is done in total ignorance of meaning and grammatical structure. It seems a bit of a miracle when something sensible comes out. For Chat échaudé craint l’eau froide, Google Translate suggests: A scalded cat fears cold water. My high school French teacher would have given full credit for that answer.
This numerical or statistical approach to translation seems utterly alien to the human experience of language. As in the case of game-playing, I am tempted to protest that the computer has solved the problem but missed the point. Surely a human translator works at a higher level, seeking not just statistical correlations but an equivalence of meaning and perhaps also of mood and tone. In the case at hand, the artful translator might render a proverb with another proverb: Once burned, twice shy. That kind of deft linkage between languages seems beyond the reach of programs that merely shuffle symbols.
But are human readers really so different from the computer plodding through its database of sentences? How do people learn all those nuanced meanings of thousands of words and the elaborate rules for putting them together in well-formed sentences? We don’t do it by consulting dictionaries and grammar books. Starting as young children, we absorb this knowledge from exposure to language itself; we listen and talk, then later we read and write. For the lucky polyglots among us, these activities go on in multiple languages at once, with fluid interchange between them. In other words, we infer syntax and semantics from a corpus of texts, just as the statistical translator does. Most likely the mental process underlying the acquisition of language involves no explicit calculation of probabilities of n-grams, but it also requires no dictionary definitions or memorized conjugations of verbs. In spirit at least, our experience of language seems closer to statistical inference than to rule-based deduction.