Wikipedia:Wikipedia Signpost/2015-06-24/Special report


Special report

Small impact of the large Google Translation Project on Telugu Wikipedia

The Telugu Wikipedia article on Belgium, translated from the English Wikipedia original: one of the thousands of articles generated by the Google Translation Project. But how much did this help Wikipedia?

From 2009 to 2011 Google ran the Google Translation Project (GTP), a program that used paid translators to translate the most popular English Wikipedia articles into various Indian-language Wikipedias. The program was organized as part of a bid to extend and improve Google Translate software services in various languages: in a presentation[1] at Wikimania 2010 a company presenter stated that "Google has been working with the Wikimedia Foundation, students, professors, Google volunteers, paid translators, and members of the Wikipedia community to increase Wikipedia content in Arabic, Indic languages, and Swahili". For more background on the effort, see the Signpost coverage of Wikimania 2010 and of the Bengali and Swahili experiences.

At first, the Google Translation Project was visible only through "Recent changes" items whose comments mentioned the use of a "Google translator toolkit".[2] This toolkit was first made public in June 2009; Google initially experimented with Hindi, but quickly expanded the initiative to Arabic, Tamil, Telugu, Bengali, Kannada and Swahili. Google shared the details in a presentation[1] at Wikimania 2010. At the same event, a critique of GTP[3] was presented by a Tamil Wikipedian on behalf of Ravishankar, who could not attend because of visa delays. The critique identified many of the issues facing the project. First, what was popular among English readers rarely matched what was popular among Tamil readers. Second, the translations had many quality problems: too many red links, mechanical translation, and operational problems such as the overwriting of stub articles. Google tried to address the community's recommendations on improving the quality of the generated content by engaging in a dialogue, but did not succeed. In response to a query from the author, Google announced the closure of the project in June 2011.[4] It also announced the launch of the Indic web and the availability of Google Translate for several Indic languages.[5]

As one of the first large-scale human-aided machine translation efforts on Wikipedia, the project also exposed important philosophical friction within the community over the nature of volunteerism on the projects. That friction, still unaddressed, would re-emerge in the debate over the role and propriety of bots on the Swedish Wikipedia, which passed the million-article milestone in 2013 with almost half (~454,000) of its articles bot-created.

This review uses Wikipedia's metrics on contributions and page requests to analyze the impact of the project, focusing as a case study on just one of the targeted wikis: the Telugu Wikipedia. The complete data and code are also available[6] so that other communities can validate the analysis and apply it to their own Wikipedias.

  1. Galvej, Michael (2010). "Submissions/Google translation – Wikimania 2010 in Gdańsk". wikimania2010.wikimedia.org. Retrieved 19 June 2015.
  2. "Google Translate Blog: Translating Wikipedia". googletranslate.blogspot.in. 2010. Retrieved 28 May 2015.
  3. Ayyakanu, Ravishankar (2010). "A Review on Google Translation project in Tamil Wikipedia" (PDF). Retrieved 28 May 2015.
  4. A, Ravishankar (2011). "[Wikimediaindia-l] Google's Indic Wikipedia translation project closing down". lists.wikimedia.org. Retrieved 19 June 2015.
  5. "Official Google Blog: Google Translate welcomes you to the Indic web". googleblog.blogspot.in. 2011. Retrieved 19 June 2015.
  6. Chavala, Arjuna Rao (2015). "Github repository with data and analysis for data scientists for reproducing/validating the research". github.com. Retrieved 17 June 2015.