
Translation in the New Millennium: I, Translator

With new developments in electronic translation tools such as machine translation, and with their increasing accessibility, some have argued that these technologies will eventually replace the human translator. Are human translators really essential to the translation process, or can their labor be simplified and processed by a machine?

Researchers have been working for decades to build computer systems capable of translating from one natural language to another. These systems essentially deconstruct the components of a text, such as punctuation marks, recognized idioms, single-word terminology, and sentence structure, and then reconstruct these elements in the target language by applying specific linguistic rules and “learning” from existing translations.

 This all sounds promising, but we are forgetting that languages are filled with ambiguities and complex rules that not even a computer can successfully resolve. As all translators know, the process of translation involves much more than simply replacing the source word with the target language word. This process involves using cultural, grammatical, syntactic and semantic knowledge in order to interpret the real meaning and ensure that it makes sense to native readers.

Translation is often noted as one of the oldest vocations in human history. Since the beginning of time, humans have sought ways to communicate across cultures and languages. Today global trade is at its pinnacle, and the amount of content being produced for global audiences is massive. Never before has demand for content so strongly driven the need for low-cost, automated translation.

With open source initiatives, Google’s influence and new technology, why haven’t we perfected machine translation? Or have we?

 How exactly does it work?

Machine translation (MT) is the process by which computer software translates text from one natural language to another. For any translation, human or machine, to be successful, the meaning of the text in the source language must be fully restored in the target language.

Although this sounds straightforward, it is actually much more complex as translation is not simply word‐for‐word substitution. The machine must interpret and analyze all the features of a text including grammar, semantics, syntax and culture in order to effectively convey the meaning and intention of the text as a whole.
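
To see why word-for-word substitution falls short, consider the minimal Python sketch below. The tiny English-to-Spanish lexicon, the word lists and both function names are made up for illustration; no real engine works from a five-word dictionary. The sketch contrasts naive substitution with a single hand-written reordering rule of the kind a rule-based system might contain.

```python
# Toy English-to-Spanish lexicon (hypothetical, for illustration only).
LEXICON = {"the": "el", "red": "rojo", "car": "coche", "is": "es", "fast": "rápido"}

# Hand-tagged word classes; a real system would use a full grammar and lexicon.
ADJECTIVES = {"red", "fast"}
NOUNS = {"car"}


def word_for_word(sentence: str) -> str:
    """Replace each source word with its lexicon entry, keeping the word order."""
    return " ".join(LEXICON.get(word, word) for word in sentence.lower().split())


def with_reordering_rule(sentence: str) -> str:
    """Apply one grammar rule (adjective + noun becomes noun + adjective), then substitute."""
    words = sentence.lower().split()
    reordered = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and words[i] in ADJECTIVES and words[i + 1] in NOUNS:
            # Spanish usually places the adjective after the noun.
            reordered.extend([words[i + 1], words[i]])
            i += 2
        else:
            reordered.append(words[i])
            i += 1
    return " ".join(LEXICON.get(word, word) for word in reordered)


print(word_for_word("the red car"))         # el rojo coche (unnatural word order)
print(with_reordering_rule("the red car"))  # el coche rojo
```

Even this one rule only covers a single pattern. Gender agreement, idioms and context would each need rules of their own, which is where the real complexity lies.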

 Different models and their applications

1. Stand-alone: pure machine translation, set up on a series of rules that you enter to “train” your engine to understand your corpus (a large body of reference content that the machine learns from). There are several methods, or MT engine types, including:

  • Rule-based model: A rule-based machine translation system consists of a collection of grammar rules, a lexicon, and software programs to process the rules. It is extensible and maintainable. The rule-based approach was the first strategy developed in the field of machine translation. Rules are written using knowledge gathered from linguists.
  • Statistical model: Statistical machine translation starts with a very large data set of good translations, that is, texts that have already been translated into multiple languages, and uses those texts to automatically infer a statistical model of translation. That model is then applied to new texts to make a guess at a reasonable translation.
  • Hybrid model: The hybrid model leverages the strengths of both statistical and rule-based translation methodologies. Hybrid approaches differ in a number of ways:
    - Rules post-processed by statistics: Translations are performed using a rule-based engine. Statistics are then used in an attempt to adjust or correct the output from the rules engine.
    - Statistics guided by rules: Rules are used to pre-process data in an attempt to better guide the statistical engine. Rules are also used to post-process the statistical output to perform functions such as normalization. This approach offers considerably more power, flexibility and control when translating.

2. Integrated: machine translation with human post-editing. Integrating machine translation with a human translation workflow can increase your productivity significantly and even improve the results coming from machine translation.
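
As a rough illustration of how a statistical engine can infer translations from existing texts, the Python sketch below counts which words tend to appear together in a tiny parallel corpus. The five sentence pairs and the function names `train` and `best_translation` are hypothetical, and the Dice-coefficient scoring used here is a deliberately simple stand-in for the alignment and language models that real statistical systems rely on.

```python
from collections import Counter
from itertools import product

# Hypothetical parallel corpus: (English, Spanish) sentence pairs.
PARALLEL_CORPUS = [
    ("the house", "la casa"),
    ("a house", "una casa"),
    ("the green house", "la casa verde"),
    ("a green car", "un coche verde"),
    ("the car", "el coche"),
]


def train(corpus):
    """Count how often each word, and each source/target word pair, occurs per sentence pair."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_sentence, tgt_sentence in corpus:
        src_words = set(src_sentence.split())
        tgt_words = set(tgt_sentence.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))
    return src_count, tgt_count, pair_count


def best_translation(word, model):
    """Return the target word with the highest Dice association score, or None if unseen."""
    src_count, tgt_count, pair_count = model
    scores = {
        tgt: 2 * together / (src_count[src] + tgt_count[tgt])
        for (src, tgt), together in pair_count.items()
        if src == word
    }
    return max(scores, key=scores.get) if scores else None


model = train(PARALLEL_CORPUS)
for english in ("house", "green", "car"):
    print(english, "->", best_translation(english, model))
```

Nothing in this sketch knows any Spanish; the pairings emerge purely from counting, which is why the size and quality of the corpus matter so much for this family of engines.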

How good is it?

Machine translation quality is getting better. The quality varies considerably by language pair, translation engine and the degree to which the engine is trained. Training the engine costs money and requires a lot of sample content to build a corpus. Although it is possible to get the machine to a very high level of output, in most cases the cost of reaching 100% human-level quality outweighs the benefit.

Machine translation is coming into its own as a practical aid to translation. On its own, it is still not suited for publication without human revision and post-editing. More commonly, it is used by companies that want an instant gist of a document, or a machine translation of documents that only require a 'good enough' result.
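
Quality levels like these are usually estimated with automatic metrics such as BLEU, which compares machine output against a human reference translation. The Python sketch below shows only one ingredient of BLEU, the clipped (or “modified”) n-gram precision; full BLEU also combines several n-gram sizes and applies a brevity penalty. The function names and the example sentences are assumptions chosen for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Share of candidate n-grams found in the reference, with repeats clipped."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Clipping: an n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


reference = "the cat is on the mat"
print(modified_precision("the the the cat", reference, n=1))  # 0.75
print(modified_precision(reference, reference, n=2))          # 1.0
```

The clipping step is what stops an engine from scoring well simply by repeating common words, yet a metric like this still measures overlap with a reference, not whether the meaning survived.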

Why haven’t we mastered it?

Most Americans speak only one language, English, and most of us have forgotten the rules of grammar: proper tenses, prepositions and punctuation. We speak and write from habit. Without knowledge of another language, we may not realize how much grammar differs from one language to the next. Some languages have additional gender rules, cases, classifiers, formal and informal structures, and other elements that differ from English. A machine translation engine has to account for every one of these differences.

Google Translate and other free online translation tools can be great for instant, informal translation. When expectations are properly set, particularly for low-value text, unedited machine translation can be quite useful. However, when a user overestimates machine translation capabilities, the results can be confusing at best.

For instance, when one online machine translation tool apparently mistranslated a common Chinese word as “Wikipedia,” Chinese menus began popping up everywhere with English translations for menu items like “stir-fried Wikipedia” and “barbecued Congo eel with Wikipedia and fermented bean curd.” Though odd, the error is relatively harmless. However, when the text has important implications in law, finance or marketing, the results can be terribly costly.

You have a need for it. How do you buy it?

So you’ve determined you have a need for machine translation. You’ve concluded that the content requiring translation does not warrant a high-priced, 100% perfect human translation, and that you are fine with lower quality. So, how do you go about setting this up or buying it? Here are four ways to get started, ranging from free to very expensive.

1. Do it yourself via the Internet.  You can run machine translation through a web-based tool like Google. Typically the number of characters allowed is limited and the quality varies by language pair and subject matter.

  • Pros.  It’s usually free.
  • Cons.  Limited number of words you can translate at a time.
  • Quality:  Low to medium  (BLEU score range)

2. Build your own.  Moses open source code is available to anyone to build a translation engine.  (include url to site) This undertaking is not for the faint of heart and requires not only skilled engineers and linguists, but lots and lots of content to train the engine.

  • Pros.  A very cost-effective way to manage high volumes of translation in-house.
  • Cons.  Time-consuming, and an upfront investment in engineering and talent is required to get started.
  • Quality:  Low to 100%. It really depends on how well you train your engine and control your content.

3. Buy a commercial tool.  Companies like Systran and WordMagic offer translation software by language pair.  Features of the software allow you to train and customize the engine. These tools are typically rule-based.

  • Pros.  A very cost-effective way to get started for in-house use.
  • Cons.  Time-consuming, and an upfront investment in engineering and talent is required to get started.
  • Quality:  Low to 100%. It really depends on how well you train your engine and control your content.

4. Contact a Language Service Provider.  An increasing number of language service providers offer this service in conjunction with their own tools and will work with you to build and train the engine.  Often, the output is paired with human editing to achieve the required level of quality.

  • Pros.   Expertise and consultation provided across many languages and subject matter. Hybrid and human editing services also available.  
  • Cons.  Upfront investment required, although most likely less than if you build your own.
  • Quality:  Medium to high.  Again, this is heavily influenced by how well the engine is trained to your subject matter, but an agency can help to make this possible and bring the value of its own tools and content.

 

In the end, machine translation offers an attractive package for users: instant turnaround times and a systematic, consistent approach to handling translations. However, while human translation is much slower, only humans can determine the suitability of a translation for a particular audience and culture, and make linguistic and stylistic choices based on experience instead of a database.