When Intoxicado ≠ Intoxicated: Avoiding False Friends and Critical Consequences in Machine Translation

In Madame Bovary, Gustave Flaubert observed that “Human language is like a cracked kettle on which we beat out tunes for bears to dance to, when all the time we are longing to move the stars to pity.” Referring to the painful misconceptions that arise when we use language to translate human emotion, he knew how flawed an instrument human language is. We still rely on it because we have no choice. And sometimes we also have no choice but to rely on a machine-generated version that creates challenges far beyond those of misconstrued feeling. (Editor)

Screen shot courtesy of Google and Google Translate

Author: Jennifer DeCamp, Ph.D.

In 1980, medical personnel in a Florida emergency room mistakenly translated the Spanish “intoxicado” as the seemingly similar “intoxicated” (i.e., by drugs or alcohol) rather than as “food poisoned.” This error resulted in delays in critical care that left a Cuban-American boy a quadriplegic. It caused anguish not only to the patient and his family and friends, but also to the hospital community. It also cost the hospital system a $17 million lawsuit [1].

A check of Google Translate, one of the best general machine translation (MT) systems according to testing by the National Institute of Standards and Technology (NIST), found that it translated “intoxicado” as “intoxicated” [2]. It provided “poison” as a secondary translation in the accompanying notes (see screen capture above). However, secondary translations do not show up when Google Translate is integrated with other tools, as it is in the popular application Skype Translate [3]. Another popular MT system, SYSTRANet, provided only the translation “intoxicated” [4].

It was not only disturbing but also odd that both MT systems provided the translation “intoxicated.” Since Google Translate and SYSTRANet are both based extensively on statistical machine translation, they draw on past translations to provide the statistically most likely term in English. To double-check the translation, I looked up “intoxicado” on ProZ Term Search [5], a popular online terminology resource used extensively by professional translators. The translation given for “intoxicado” was “food poisoned.” In online chat on the site, translators discussed whether “intoxicado” could also mean “nauseous,” but the community’s view was that nausea was just a symptom of food poisoning. There was no mention of intoxication by drugs or alcohol.

I then searched ProZ for instances of past translations and found only “food poisoned.” However, when I expanded the search from just Spanish-into-English to any language pair, I found instances where English “intoxicated” had been translated into Spanish “intoxicado.” When I expanded my search from ProZ to the full internet (i.e., by searching on “intoxicado translate”), I found more instances of “intoxicado” being translated as “intoxicated” (e.g., by the bab.la Spanish-to-English online dictionary [6]).

As T.S. Eliot wrote: “Words strain, / Crack and sometimes break, under the burden [of time], under the tension, slip, slide, perish, decay with imprecision, will not stay in place, / Will not stay still” [7]. Language changes, and it changes particularly fast at points where different cultures come together. The high rate of translation of “intoxicado” as “intoxicated” shows language change and variation. But when there is variation, how do we protect people from miscommunications like the one that occurred in the Florida emergency room?

Many Spanish-English medical interpreters receive certification training which includes discussion of these false cognates (known informally as “false friends”). The interpreters learn the translation to be used in the medical environment, regardless of what is used elsewhere in the community. They have access to terminologies that show them possible and preferred translations. They are alert to terms that cause confusion, and they learn to check which meaning is being used. However, this degree of control and accuracy does not always extend to MT.

Like these human interpreters, machine translation can be trained on selected and standardized translations and terminologies specific to certain fields and subject areas. In MT systems, this correct data can be prioritized so that a given translation appears each and every time. Some applications, like Google Translate, enable interpreters to enter a term and view alternatives (see the screen capture above). Like human interpreters, MT systems can be rigorously tested by third parties and certified for specific applications. Like these interpreters, systems could be programmed to alert the user to the presence of these potentially dangerous cognates and to suggest alternatives. Systems could also use artificial intelligence (AI) or even simple programming to ensure that the right translation is being used. They could use interactive MT so that each translation is checked by a human expert.
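The “simple programming” mentioned above might look something like the following minimal sketch. It is only an illustration under stated assumptions: the `FalseFriend` class, the one-entry `MEDICAL_FALSE_FRIENDS` glossary, and the `check_translation` function are hypothetical names invented here, not part of Google Translate, SYSTRANet, or any other MT product. A real glossary would be curated by certified medical interpreters and would need to handle inflected forms such as “intoxicada.”

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FalseFriend:
    source_term: str      # term as it appears in the source language
    risky_rendering: str  # look-alike translation a general MT system may produce
    preferred: str        # translation preferred in the medical setting


# Hypothetical one-entry glossary, used here for illustration only.
MEDICAL_FALSE_FRIENDS = [
    FalseFriend("intoxicado", "intoxicated", "food poisoned"),
]


def check_translation(source_text, mt_output, glossary=MEDICAL_FALSE_FRIENDS):
    """Return one warning per glossary entry whose source term appears in
    the source text and whose risky rendering appears in the MT output."""
    warnings = []
    for entry in glossary:
        in_source = re.search(
            rf"\b{re.escape(entry.source_term)}\b", source_text, re.IGNORECASE
        )
        in_output = re.search(
            rf"\b{re.escape(entry.risky_rendering)}\b", mt_output, re.IGNORECASE
        )
        if in_source and in_output:
            warnings.append(
                f"Possible false friend: '{entry.source_term}' was rendered as "
                f"'{entry.risky_rendering}'; in a medical context consider "
                f"'{entry.preferred}' and confirm the meaning with the speaker."
            )
    return warnings


if __name__ == "__main__":
    for message in check_translation(
        "El paciente está intoxicado", "The patient is intoxicated"
    ):
        print(message)
```

The sketch only raises an alert and suggests an alternative; it deliberately does not rewrite the MT output, so the decision about which meaning applies stays with a human.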

In addition, just as medical staff are increasingly being trained in how to use interpreters, these professionals and volunteers need training in how to use machine translation. They can learn to check whether an MT system (e.g., Conversa for Healthcare [9]) is specifically trained for that application and subject area in that particular hospital or hospital system. They can learn to double-check critical information by asking a question in different ways, collecting more information, checking online terminologies, and reaching back to trained expert human interpreters or translators. They can use more than one MT system, although, as shown in the case of “intoxicado,” that strategy is not foolproof. If the input is unclear, they can move from speech to written text to view the input with fewer errors. In addition, they can learn and be alert to the presence of the false friends that occur in their subject areas where misinterpretations can create critical consequences.
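The caution about using more than one MT system can be made concrete with a small sketch. The `cross_check` function and the system labels below are hypothetical, and the outputs are typed in by hand rather than fetched from any real service. The point is only that agreement between systems shows consistency, not correctness.

```python
def cross_check(outputs):
    """Compare translations from several MT systems.

    `outputs` maps a system label to the translation it returned.
    Returns (all_agree, distinct_translations), ignoring case and
    surrounding whitespace.
    """
    distinct = sorted({text.strip().lower() for text in outputs.values()})
    return len(distinct) == 1, distinct


if __name__ == "__main__":
    # Hand-entered outputs mirroring the case described in this article.
    agree, distinct = cross_check(
        {"system_a": "intoxicated", "system_b": "Intoxicated"}
    )
    # Both systems agree here, yet the shared answer is the dangerous one,
    # so agreement alone should never close the question.
    print(agree, distinct)
```

A disagreement between systems is a useful signal to stop and ask, but as the “intoxicado” case shows, two systems trained on similar data can agree on the same wrong answer.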


Mar 23, 2017


