Managing Language Chaos

Echoing Lewis Carroll, a colleague reminded me once that words mean whatever we choose to have them mean—a practice that might work if we lived in isolation from one another and if nothing that we said or did mattered. Alas, language, like personal devices, seems to function best when it wields a certain amount of interoperability. And as machines become more proficient speakers, the need for consensus will only grow.—Editor

Image of Tower of Babel, Pieter Bruegel the Elder, 1563, public domain

Author: Dr. Jennifer DeCamp

A fundamental fact documented for thousands of years, including in the story of the Tower of Babel [1], is that a lack of common language and means of communication often leads to bad engineering. People using different definitions for the same terms (or worse, using different definitions without knowing that those definitions are operationally divergent) end up with mismatched plans and implementations that resemble the tower painted by Pieter Bruegel in the mid-1500s. My concept is not your concept, and my dimensions are not your dimensions, even when the terms look the same. An example is the term situational awareness, for which the Air Force Materiel Command documented fifteen operationally different definitions within a single military branch [2].

There are many reasons for this proliferation of definitions for the same term, a condition technically known as “polysemy.” First, language adapts to new environments and ecosystems, as the beaks of Darwin’s finches adapted to demands and opportunities in the Galápagos Islands. Second, governments issue directives and other guidance, such as a directive to implement “data reuse.” Planners then try to figure out what that term means in their own spaces. Third, some terms such as situational awareness (SA) become popular in government presentations, scientific reports, or other forms of communication, and people then want to weigh in on the subject, as described in a paper titled “Can SA Be Defined?” by Captain Cynthia Dominguez. The term becomes “an organizing feature” for comments from many sources, and it serves to “generate a unique train of study designs and influence the choice of measures used in these studies.” As Captain Dominguez pointed out, “Although the concept of SA is accepted as important without qualification, nobody is willing to accept anybody else’s definition” [2].

Despite this chaos, people and technology to date have been fairly adept at deciphering the different usages or sometimes just adept at muddling through. However, the world is becoming much more challenging. Documents are no longer just used within small closed communities, but rather are increasingly posted for broader use. Analytic tools crunch through large quantities of text, often further obscuring the source. And the fast pace of life creates an ever faster pace of change. There are thus ever more readers who do not know the specific definition used by the writers and who may not even know that more than one definition is in play.

The problem becomes compounded, or at least more complicated, as terms are translated by humans and/or machines into foreign languages. Tools such as Translation Memory Systems (TMS) and Machine Translation (MT) check past translations to see which terms have been used previously. As broader sets of documents are used for reference, the systems provide the translation that is statistically most prevalent in that set. Terms with different definitions (i.e., different concepts) are probably going to have different translations. However, because the English terms are undifferentiated, those tools may choose the wrong translated terms.
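The selection behavior described above can be illustrated with a minimal sketch. It assumes a toy translation memory stored as (English term, Spanish rendering, community) tuples. The renderings, the community labels, and the function names are hypothetical, chosen only for illustration, and do not describe any real TMS or MT product.

```python
from collections import Counter

# Toy translation memory: (English term, Spanish rendering, originating community).
# All entries are invented for illustration.
TRANSLATION_MEMORY = [
    ("situational awareness", "conciencia situacional", "aviation"),
    ("situational awareness", "conciencia situacional", "aviation"),
    ("situational awareness", "conciencia situacional", "aviation"),
    ("situational awareness", "conocimiento de la situación", "cyber"),
]


def most_prevalent_translation(term, memory):
    """Return the rendering seen most often for `term`, ignoring community."""
    counts = Counter(target for source, target, _ in memory if source == term)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def translation_for_community(term, community, memory):
    """Return the most frequent rendering of `term` within one community."""
    scoped = [entry for entry in memory if entry[2] == community]
    return most_prevalent_translation(term, scoped)
```

In this sketch, a document from the "cyber" community would receive the "aviation" rendering from `most_prevalent_translation`, because nothing in the English term signals which concept is meant. Only when the community is known, as in `translation_for_community`, can the less frequent rendering be recovered.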

So, understanding that we cannot manage all the chaos, how do we at least manage a small amount of it where it matters, such as where polysemy creates misunderstandings, wasted time and effort, or loss of critical information? First, we can stop loading up critical terms with so many definitions. When we create a new definition, we can modify the term itself, so that there is some apparent (i.e., above-water) sign that a different meaning is intended. This modification might be a date, a location, a version number, or another means of designating that a different meaning is in play. It might be an annotation, such as a parenthetical notation or a footnote stating that the term being used comes from a different organization. In each case, we would have made the differentiation apparent in the material presented to the reader or listener. With mechanisms such as markup namespaces (for example, XML namespaces), we could make the differentiation apparent to machines as well.
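One way to picture such a differentiated term is as a term paired with explicit qualifiers. The following is a minimal sketch, assuming a glossary keyed by term, source organization, and version. The class name `QualifiedTerm`, the organizations "Org A" and "Org B", the versions, and the definitions are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedTerm:
    """A term plus the markers that say which definition is in play."""
    term: str
    source: str   # organization that owns the definition (hypothetical values below)
    version: str  # date or version number of that definition

    def label(self):
        """Human-readable form, usable as a parenthetical annotation."""
        return f"{self.term} ({self.source}, {self.version})"


# Hypothetical glossary with placeholder definitions, not real ones.
GLOSSARY = {
    QualifiedTerm("situational awareness", "Org A", "2015"): "placeholder definition A",
    QualifiedTerm("situational awareness", "Org B", "2017"): "placeholder definition B",
}


def definitions_in_play(term, glossary):
    """List every qualified variant of `term`, sorted by label."""
    return sorted(q.label() for q in glossary if q.term == term)
```

Because the qualifiers are part of the key, two definitions of the same surface term never collide, and a reader or a tool can ask which variants exist before assuming a meaning.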

Second, we can focus on additional means to help the reader or listener better understand the different definitions in play. We can devote space and time in documents and meetings to defining the terms that will be used, so that readers and listeners can better understand what is being discussed. We can migrate from definitions applicable only to narrow communities to definitions useful to broader groups, which may mean unloading some information from the specific term. We can use existing authoritative definitions (e.g., from the Department of Defense’s DOD Dictionary of Military and Associated Terms) rather than making up our own.

We can also practice terminology management, using terminology management systems and hiring professional terminologists. In addition, we can discuss definitions with our partners, including our international partners, so that we agree on what means what. Once we have these agreements, we can use the negotiated terms with those partners in meetings and in translations of materials. Above all, we can start to think of language as communication, and of our audience as an increasingly broad set of national and international participants.

References

[1] The Tower of Babel. Genesis 11:1-9. (2001). The Holy Bible: King James Version. Iowa Falls, IA: World Bible Publishers.

[2] Dominguez, Captain C. (1992). Can SA Be Defined? In Situation Awareness: Papers and Annotated Bibliography. Air Force Materiel Command, Wright-Patterson Air Force Base. Publicly released; unlimited distribution. Pp. 11-15.

Dr. Jennifer DeCamp serves as principal engineer for Translation and Terminology at The MITRE Corporation. In this position, she works across the U.S. government and across international standards bodies to improve translation and terminology practices and tools. Dr. DeCamp is chair of the American Translators Association (ATA) Translation Committee and is U.S. Head of Delegation for the International Organization for Standardization (ISO) Technical Committee 37 on Terminology and Other Language and Content Resources.

See also:

When Intoxicado ≠ Intoxicated: Avoiding False Friends and Critical Consequences in Machine Translation

© 2017 The MITRE Corporation. All rights reserved. Approved for public release. Distribution unlimited. Case number: 17-3026.

The MITRE Corporation is a not-for-profit organization that operates research and development centers sponsored by the federal government.
