Google Translate
Google Translate is a free statistical machine translation service provided by Google Inc. to translate a section of text, a document, or a webpage into another language.

The service was introduced on April 28, 2006 for the Arabic language. Prior to October 2007, for languages other than Arabic, Chinese and Russian, Google used a SYSTRAN-based translator, which is also used by other translation services such as Babel Fish, AOL, and Yahoo.

On May 26, 2011, Google announced that the Google Translate API had been deprecated and that it would cease functioning on December 1, 2011 "due to the substantial economic burden caused by extensive abuse." The planned shutdown of the API, which was used by a number of websites, led to criticism of Google and to developers questioning the viability of using Google APIs in their products.

On June 3, 2011, Google announced that it was canceling its plan to terminate the Translate API due to public pressure. In the same announcement, Google said that it would release a paid version of the Translate API.

Features and limitations

The service limits the number of paragraphs, or the range of technical terms, that will be translated. Searches can also be entered in a source language and translated into a destination language first, so that users can browse results from the selected destination language and read them in the source language. For some languages, users are asked for alternative translations, for example of technical terms, to be included in future updates to the translation process. Text in a foreign language can be typed in, and if "Detect Language" is selected, the service not only detects the language but also translates the text into English by default.

Google Translate, like other automatic translation tools, has its limitations. While it can help the reader to understand the general content of a foreign language text, it does not always deliver accurate translations. Some languages produce better results than others. As of 2010, French to English translation is very good; however, rule-based machine translations perform better if the text to be translated is shorter; this effect is particularly evident in Chinese to English translations.

Texts written in the Greek, Devanagari, Cyrillic and Arabic scripts can be transliterated automatically from phonetic equivalents written in the Latin alphabet.

Browser integration

Several Firefox extensions exist for Google Translate, as for other Google services, and provide right-click access to the translation service.

An extension for Google's Chrome browser also exists; in February 2010, Google Translate was integrated into the standard Google Chrome browser for automatic webpage translation.

Android version

Google Translate is available as a free downloadable application for Android OS users. The first version was launched in January 2010. It works like the browser version. Google Translate for Android offers two main options: "SMS translation" and "History".

An early 2011 version supported Conversation Mode when translating between English and Spanish (in alpha testing). This new interface within Google Translate allows users to communicate fluidly with a nearby person in another language. In October it was expanded to 14 languages.

The application supports 53 languages and voice input for 15 languages. It is available for devices running Android 2.1 and above and can be downloaded by searching for "Google Translate" in Android Market. It was first released in January 2010, with an improved version available on January 12, 2011.

Latest version: 2.0.0 build 42


iOS version

In August 2008, Google launched a Google Translate HTML5 web application for iOS for iPhone and iPod touch users. The official iOS app for Google Translate was released on February 8, 2011. It accepts voice input for 15 languages and allows translation of a word or phrase into one of more than 50 languages. Translations can be spoken out loud in 23 different languages.

Language options

(by chronological order of introduction)
  • 1st stage
    • English to French
    • English to German
    • English to Spanish
    • French to English
    • German to English
    • Spanish to English
  • 2nd stage
    • English to Portuguese
    • English to Dutch
    • Portuguese to English
    • Dutch to English
  • 3rd stage
    • English to Italian
    • Italian to English
  • 4th stage
    • English to Chinese (Simplified)
    • English to Japanese
    • English to Korean
    • Chinese (Simplified) to English
    • Japanese to English
    • Korean to English
  • 5th stage (launched April 2006)
    • English to Arabic
    • Arabic to English
  • 6th stage (launched December 2006)
    • English to Russian
    • Russian to English
  • 7th stage (launched February 2007)
    • English to Chinese (Traditional)
    • Chinese (Simplified to Traditional)
    • Chinese (Traditional) to English
    • Chinese (Traditional to Simplified)
  • 8th stage (launched October 2007)
    • all 25 language pairs use Google's machine translation system
  • 9th stage
    • English to Hindi
    • Hindi to English
  • 10th stage (as of this stage, translation can be done between any two languages, going through English, if needed) (launched May 2008)
    • Bulgarian
    • Croatian
    • Czech
    • Danish
    • Finnish
    • Greek
    • Norwegian
    • Polish
    • Romanian
    • Swedish
  • 11th stage (launched September 25, 2008)
    • Catalan
    • Filipino
    • Hebrew
    • Indonesian
    • Latvian
    • Lithuanian
    • Serbian
    • Slovak
    • Slovene
    • Ukrainian
    • Vietnamese
  • 12th stage (launched January 30, 2009)
    • Albanian
    • Estonian
    • Galician
    • Hungarian
    • Maltese
    • Thai
    • Turkish
  • 13th stage (launched June 19, 2009)
    • Persian
  • 14th stage (launched August 24, 2009)
    • Afrikaans
    • Belarusian
    • Icelandic
    • Irish
    • Macedonian
    • Malay
    • Swahili
    • Welsh
    • Yiddish
  • 15th stage (launched November 19, 2009)
    • The Beta stage is finished. Users can now choose to have the romanization written for Chinese, Japanese, Korean, Russian, Ukrainian, Belarusian, Bulgarian, Greek, Hindi and Thai. For translations from Arabic, Persian and Hindi, the user can enter a Latin transliteration of the text, and the text is converted to the native script for these languages as the user types. The text can now be read by a text-to-speech program in English, Italian, French and German.
  • 16th stage (launched January 30, 2010)
    • Haitian Creole
  • 17th stage (launched April 2010)
    • Speech program launched in Hindi and Spanish
  • 18th stage (launched May 5, 2010)
    • Speech program launched in Afrikaans, Albanian, Catalan, Chinese (Mandarin), Croatian, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Icelandic, Indonesian, Latvian, Macedonian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Swahili, Swedish, Turkish, Vietnamese and Welsh (based on eSpeak).
  • 19th stage (launched May 13, 2010)
    • Armenian
    • Azerbaijani
    • Basque
    • Georgian
    • Urdu
  • 20th stage (launched June 2010)
    • Provides romanization for Arabic.
  • 21st stage (launched September 2010)
    • Allows phonetic typing for Arabic, Greek, Hindi, Persian, Russian, Serbian and Urdu.
    • Latin
  • 22nd stage (launched December 2010)
    • Romanization of Arabic removed.
    • Spell check added.
    • Google replaced the text-to-speech synthesizers for some languages, switching from eSpeak's robotic voice to natural-sounding native-speaker voice technology made by SVOX (Chinese, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Norwegian, Polish, Portuguese, Russian, Swedish, Turkish), as well as the older voices for French, German, Italian and Spanish. Latin uses the same synthesizer as Italian.
    • Speech program launched in Arabic, Japanese, and Korean.
  • 23rd stage (launched January 2011)
    • Users can choose among alternative translations for a word.
  • 24th stage (launched June 2011)
    • 5 new Indian languages (in alpha) and a transliterated input method:
      • Bengali
      • Gujarati
      • Kannada
      • Tamil
      • Telugu
  • 25th stage (launched July 2011)
    • Users can rate translations.

Translation methodology

Google Translate does not apply grammatical rules, since its algorithms are based on statistical analysis rather than traditional rule-based analysis. Indeed, the system's original creator, Franz Josef Och, has criticized the effectiveness of rule-based algorithms in favor of empirical approaches. The system is based on a method called statistical machine translation, and more specifically on research by Och, who won the DARPA contest for speed machine translation in 2003. He is now the head of Google's machine translation group.

Understanding how Google Translate works helps users achieve better translations.

Google Translate usually does not translate directly from one language to another (L1 -> L2); it most often translates first into English and then into the target language (L1 -> EN -> L2). With about 70 languages, this reduces the number of dictionaries needed from about 2,500 (70×70÷2) to about 70.
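
The arithmetic behind the pivot approach can be checked with a short Python sketch. The function names are illustrative, and the figure of 70 languages is the approximate count used above; the exact number of unordered pairs is n(n-1)/2, slightly below the rounder 70×70÷2 estimate.

```python
def direct_dictionaries(n_languages: int) -> int:
    """Unordered language pairs if every language is paired directly with every other."""
    return n_languages * (n_languages - 1) // 2


def pivot_dictionaries(n_languages: int) -> int:
    """Pairs needed if every non-pivot language is paired only with the pivot (English)."""
    return n_languages - 1


n = 70  # approximate language count, taken from the text
print(direct_dictionaries(n))  # 2415
print(pivot_dictionaries(n))   # 69
```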

But English, like all human languages, is ambiguous and depends on context. This causes translation errors. For example, translating vous from French to Russian gives vous -> you -> ты OR вы. If Google were using an unambiguous, artificial language as the intermediary, it would be vous -> you2 -> вы OR tu -> you1 -> ты. Such suffixing of words disambiguates their different meanings.
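
The loss of information at the pivot step can be illustrated with a toy Python sketch. The dictionaries below are hand-made for this example and contain only the words discussed above; they are an assumption for illustration, not Google's data or method.

```python
from typing import Dict, List

# Toy dictionaries with English as the pivot: both French pronouns collapse to "you".
FR_TO_EN: Dict[str, str] = {"vous": "you", "tu": "you"}
EN_TO_RU: Dict[str, List[str]] = {"you": ["ты", "вы"]}

# Toy dictionaries with a hypothetical unambiguous pivot (suffixed senses).
FR_TO_PIVOT: Dict[str, str] = {"vous": "you2", "tu": "you1"}
PIVOT_TO_RU: Dict[str, List[str]] = {"you1": ["ты"], "you2": ["вы"]}


def translate_via_pivot(word: str,
                        src_to_pivot: Dict[str, str],
                        pivot_to_tgt: Dict[str, List[str]]) -> List[str]:
    """Return every target candidate reachable through the pivot word."""
    pivot_word = src_to_pivot[word]
    return pivot_to_tgt[pivot_word]


print(translate_via_pivot("vous", FR_TO_EN, EN_TO_RU))        # ['ты', 'вы']
print(translate_via_pivot("vous", FR_TO_PIVOT, PIVOT_TO_RU))  # ['вы']
```

With English as the pivot, the target side receives two candidates and has to guess; with the suffixed pivot, only one candidate remains.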

Hence, publishing in English, using unambiguous words, providing context, and using expressions such as "you all" often yield a better one-step translation.

Overlooking the grammar of the language can cause mistakes. For example, consider the following sentence:

Пишет (3rd person: it writes) вам (dative: to you (all)) письмо (letter) семья (family) Дарьи (genitive: of Daria).

Based on the word order, Google translates: You wrote a letter to family Darya.

Based on declensions (word functions), it means: [it's] Daria's family [that] writes you a letter, exactly the opposite.

Google rendered "to you" as "you", "of Daria" as "Daria", and "the family" as "to the family".

When translating back to Russian, however, Google says: Семья Дарьи пишет вам письмо.

That is correct, because Google understood the English word order.

Respecting the same word order as in English or publishing in English as above may help.

According to Och, a solid base for developing a usable statistical machine translation system for a new pair of languages from scratch would consist of a bilingual text corpus (or parallel collection) of more than a million words and two monolingual corpora, each of more than a billion words. Statistical models derived from this data are then used to translate between those languages.
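How a statistical model can be derived from a parallel corpus can be sketched with IBM Model 1, a classic textbook word-alignment model trained by expectation-maximization. This is only an illustration of the general idea: the source does not say which models Google uses, and the three-sentence German-English corpus and all function names below are hypothetical.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] = P(target | source) with EM."""
    target_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # start from a uniform distribution
    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per source word
        for src, tgt in pairs:
            for f in tgt:
                # E-step: share each target word among the source words
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts into probabilities
        t = defaultdict(float,
                        {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def best_translation(t, source_word):
    """Return the most probable target word for a source word."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy parallel corpus: (source sentence, target sentence).
corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
model = train_ibm_model1(corpus)
for word in ["das", "Haus", "Buch", "ein"]:
    print(word, "->", best_translation(model, word))
```

With only three sentences the model learns purely from co-occurrence: "das" appears with "the" twice, so that pairing gains probability, which in turn pushes "Haus" toward "house". Corpora of the size Och describes apply the same principle at a much larger scale, together with a language model built from the monolingual data.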

To acquire this huge amount of linguistic data, Google used United Nations documents. The UN typically publishes documents in all six official UN languages, which has produced a very large 6-language corpus.

Google representatives have been involved with domestic conferences in Japan, where they have solicited bilingual data from researchers.

Translation mistakes and oddities

Because Google Translate uses statistical matching to translate rather than a dictionary/grammar rules approach, translated text can often include apparently nonsensical and obvious errors, often swapping common terms for similar but nonequivalent common terms in the other language, as well as inverting sentence meaning.

See also

  • Asia Online
  • Babel Fish (website)
  • Bing Translator
  • Comparison of machine translation applications
  • Google Dictionary
  • Jollo
  • List of Google services and tools
  • Rosetta Stone (software)


External links

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 