Would you believe that seeing a clause from Google’s Terms of Service could become a jaw-dropping experience for a group of professionals and business owners? Neither would I, until I presented a short resources workshop to a Jerusalem networking group last week.
When asked whether they use Google Translate in their workflow, almost one half of the participants raised their hands. However, when I showed a slide with the following excerpts from Google’s TOS, hands clapped over mouths all around the room:
“ …By submitting, posting or displaying the content you give Google a perpetual, irrevocable, worldwide, royalty-free, and non-exclusive license to reproduce, adapt, modify, translate, publish, publicly perform, publicly display and distribute any Content which you submit, post or display on or through, the Services.”
“… You agree that this license includes a right for Google to make such Content available to other companies, organizations or individuals with whom Google has relationships for the provision of syndicated services, and to use such Content in connection with the provision of those services.”
In human language that means that by uploading any content to any one of Google’s services, including Gmail, Docs, and Translate, we allow Google unlimited use of our content and give up all rights to the privacy of information. (I would love to be proven wrong on this, but after searching through all of Google’s licensing and privacy materials, I have not been able to find anything to the contrary.)
Google already makes extensive use of search word analysis for displaying ads that match the content of Gmail messages. As Orwellian as that may sound, the greatest privacy issues are posed by Google Translate.
Translators should understand that all content uploaded to Google Translate is immediately used by the company for future translation unit matches. This is especially pertinent when using Google Translate with computer-aided translation (CAT) tools, such as Trados. (Trados actually warns users of possible privacy issues before enabling the Google Translate option.) By turning on Google Translate, we give Google access to both source and target texts. Google then replicates those matches whenever anyone else tries to translate even remotely similar text strings.
From conversations with colleagues, I have heard of instances in which Google has returned 100% perfect matches on translation units containing proprietary client information (including company names). That means that somebody somewhere had already used Google Translate while working on sensitive documents. These strings can now surface as fuzzy matches. That sets the stage for data mining.
The same is true of any public translation memory. Although a CAT tool with access to an automated translation service can be a powerful timesaver, our clients’ privacy and proprietary rights take precedence over any considerations of expediency. Before enabling any information-sharing service ask yourself the following questions:
- Have you signed a Non-disclosure Agreement (NDA) with your client? If yes, you are prohibited from sharing any information with a third party (yes, Google is a third party).
- Even if there is no NDA, would the client mind making this information available to others? You can never know the answer to that question. Even seemingly innocuous information may be considered highly confidential by a particular client. It is always prudent to ask for permission.
- In regard to public translation memories, such as MyMemory, does contributing a translation memory violate the client’s privacy? Although MyMemory makes it possible to hide company and brand names when using its service, it is not always sufficient for maintaining appropriate confidentiality.
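For readers who script parts of their workflow, here is a minimal sketch of the "hide company and brand names" idea from the last question. It is only an illustration, not how MyMemory or any CAT tool actually does it: the function names `mask_terms` and `unmask_terms`, the `[[TERM0]]` placeholder format, and the sample company names are all hypothetical. As noted above, masking names alone is not always enough to keep a text confidential, and it does not replace asking the client for permission.

```python
import re


def mask_terms(segment, confidential_terms):
    """Replace each confidential term with a numbered placeholder.

    Returns the masked segment and a placeholder -> term mapping.
    Matching is case-insensitive; unmasking restores the term as it
    was spelled in the list, not as it was cased in the text.
    """
    mapping = {}
    masked = segment
    # Longest terms first, so "Acme Widgets Ltd" is masked before "Acme".
    ordered = sorted(confidential_terms, key=len, reverse=True)
    for i, term in enumerate(ordered):
        placeholder = f"[[TERM{i}]]"
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(masked):
            # A function replacement avoids any backslash interpretation.
            masked = pattern.sub(lambda _m, p=placeholder: p, masked)
            mapping[placeholder] = term
    return masked, mapping


def unmask_terms(segment, mapping):
    """Put the original terms back into a (translated) segment."""
    for placeholder, term in mapping.items():
        segment = segment.replace(placeholder, term)
    return segment


# Hypothetical example: mask before sending, unmask after receiving.
source = "Acme Widgets Ltd will acquire Foo Corp in May."
safe_source, names = mask_terms(source, ["Acme", "Acme Widgets Ltd", "Foo Corp"])
print(safe_source)
print(unmask_terms(safe_source, names))
```

Only the masked string would ever leave your computer. The surrounding wording still goes to the third party, so a sentence that is revealing even without the names remains a problem.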
Obviously, it is impractical to read the End User License Agreement for every website we encounter on the Internet. However, translators must understand the terms of any service that has access to confidential information on their computers.
If you would like to learn more about translation and small business technology and resources, come to my Tools for Wordsmiths workshop on June 21 at Writepoint. Contact me for more information.
Very useful information! I guess I’m incurably old-fashioned and prefer the long way…dictionaries, thesauri, etc…tedious, but safer and surer!
[...] This post was mentioned on Twitter by Sarah Lipman, Leah Aharoni. Leah Aharoni said: Google #Translate – What Every User Should Know http://bit.ly/c2Kjsu #translation #xl8 #smallbiz #kishor #locworld #LWB10 [...]
It’s actually interesting. Take into account, though, that we are not talking about allowing Google to be exposed to the content, but rather about allowing them to use it. Yahoo, Google, Facebook and also all the ISPs can view anything you send using their respective tools and services. If you are illegally downloading, your ISP does know about it, but it can do nothing to stop you because doing so would violate your privacy. Google is taking it a step further by saying, in effect, “we have the rights to your privacy and content and can do with it whatever we want,” and by agreeing to these terms, you are acknowledging this reality. If you want real privacy, work offline, or exclusively on the intranet of the company that hired you.
@Yehoshua I think there is a gap between imperfect privacy associated with Internet use and conscious submission of confidential information to a service which mines that information for its own and public use. The former is a given in today’s commercial dealings, while the latter is a violation of trust (or law).
Thank you for sharing this on your blog. “What every user should know” is far more than a catchy title for an article. It really is important for people to understand that when they use Google Translate to help them, Google is obtaining full control over the uploaded texts.
Some good points, especially about client concerns even if no NDA exists. One slight correction, however: when you turn on Google Translate in SDL Trados Studio 2009, the process is one-way only – only source text lookups are performed and no translations are uploaded back to Google’s servers. Of course, what we don’t know is whether Google retains the source text lookups and what use it makes of them. But by that stage it’s largely irrelevant anyway, since as you say, you’ve probably already broken an NDA by submitting the query.
Using Google Translator Toolkit – Google’s own online translation memory interface – is different. Here, target text translations are stored and potentially mined.
As others note, however, even searching on the web could cause problems – imagine a situation where current search terms are displayed on a public page in real time, as has been done in the past by various search engines.
I’m not sure this article makes sense. If you use Google to translate or detect language, you use Google’s AJAX license here http://code.google.com/apis/ajaxlanguage/documentation/#Translate. The license terms covering language translation are here http://code.google.com/apis/ajaxlanguage/terms.html. That license does not grant Google any license rights to your content. In fact, it contains a clause that says if there is a conflict with other Google agreements, the AJAX license controls. And the Universal TOS has a clause that says any conflict between the TOS and the terms of any other Google service will be resolved in favor of the latter.
So I think you have to be very careful to make sure you understand exactly which agreement you’re subject to with Google. It is not an easy task, and there is a lot of misinformation. But we use AJAX to do language detection and translation and operate under the AJAX agreement.
@Miryam – I find it unfortunate that most users have no clue of possible privacy and rights issues. Even technology-savvy reviewers usually overlook the issue.
@Iwan – Thank you for the correction. I’d appreciate hearing about your source for this information, as I was under the impression that Google can access both source and target segments.
@Bill – What’s the connection? AJAX API and Translate are two distinct services for completely different audiences. Regarding AJAX terms, these cover the relationship between the developer and Google, not between Google and application users. Just reread the Proprietary Rights clause (2.1) and you’ll see that.
@Leah – You’re right. I assumed the professionals addressed in the seminar were building translation capability into their workflows by developing and auto-translating code. They would fall under the AJAX terms and would be okay. If they are cutting and pasting text for Google to translate via its online service, then you’re right about the license, and it is really helpful to bring that to people’s attention.
THANK YOU. When I asked Google reps point-blank about this and how it fit into corporate (un)responsibility at this year’s ALC conference, they refused to give a straightforward answer and a lot of people admonished me for even asking. But this is something that end clients need to know!
I have no problem with MT and think that Google translate–or a similar service–is even the best tool for certain types of translation. But the average Google user is not aware of the point you address, and people need to be fully aware of where they are sending their information and what can in turn be done with it. Thank you for bringing some exposure to this important topic.
A good reason to use a downloadable MT system such as http://apertium.org ;-)
However, I’m not sure how they would be able to get at your post-edited target text — why would that be uploaded? (Unless the application you use explicitly uploads it to Google as a “suggested translation”; since that is more work for the developer, and only helps Google, I doubt they would do that without at least asking the user.)
I think this is a well-written article in some respects … I agree there still is an element of naivety regarding confidentiality issues and the internet in general, but my own take is that there is a little too much finger-pointing. Perhaps we should turn this around on ourselves a little more and agree that we need to learn to take ownership of any material we expose on the web. Surely, in taking the decision to utilise a service offered to us ‘free of charge’ on a public forum, such as Google Translate, one immediately takes responsibility, oneself, for any data one ‘chooses’ to submit. Censorship starts at one’s own fingertips and common sense plays a great part here. For me, it is not enough to blame Google for storing a confidential document, or for the fact that a segment of that document may appear later as a perfect translation match in someone else’s search, including names, product data and / or any other sensitive material that we consciously uploaded. We would no sooner publish a confidential customer document on Facebook than on a public blog, so why would the principles of uploading the same document to Google Translate be any different? Exposing the dangers of naivety is one thing and is, to a certain extent, commendable, but pointing fingers away from those who should know better, and thus implicating someone else for individual failings, goes a little against the grain … for me. I feel strongly about the twist of perspectives here and have posted a response to the theme on my own blog. It has already shown up in your comments as a link (here again): http://metajugglamum.wordpress.com/2010/10/29/breach-of-confidentiality-or-breach-of-common-sense/
Nevertheless an interesting and thought-provoking post! Thank you!
MJM.
@Terena, thank you for your comment. The fact that Google is not upfront about this is a major part of the problem.
@Kevin, Trados notifies the user of the potential confidentiality problem the first time Google Translate is chosen as a TM. I think it matters less whether Google uploads only the source text or both source and target (though in the second case, in addition to obtaining information, Google is also making use of translators’ work without them knowing it). It might be worthwhile to look into that.
@Metajugglamum, I absolutely agree with you that as professionals we must take responsibility for the consequences of using certain services and tools. The aim of this post was to bring the issue to people’s attention and not to point fingers at Google, though as Terena has pointed out, Google has made it pretty hard for people to know just what it is doing with their information. Thank G-d for TOSs.
I think another issue is, if someone uploads confidential content that, per a previous NDA with a client, they are not allowed to upload (but do it anyway), they really cannot give away the “rights” to this information. Meaning, that Google is assuming that people who agree to these terms of service have the AUTHORITY to do so, when, in fact, they may not! So, this poses yet another question. I am not sure how they could ever tell the difference, but this is a problem.
@Eve, good point. This is definitely a complex intellectual property issue.
@Leah – How about when you are not using the online “toolkit” specifically, but rather the embedded translation app over SSL? See the following statement from Google, found at http://translate.google.com/support/?hl=en
“The website translator also works securely if it’s embedded on a web page that is served from a secure server. In such cases, the content of the web page will be sent to Google for translation using a secure connection (HTTPS), and Google will not log any of the text.”
Most professional translators work with Google Translate as part of a CAT tool (Trados, etc), which allows Google to log the text. IMHO, the toolkit is not powerful or feature-rich enough for professional work.
That’s why I don’t use the CAT tool provided by Google, only Google Translate, so Google doesn’t see my translation and can’t use it.
For more on translations issues, like rates: http://www.leblogdelamirabelle.net/my-translators-blog/tradshirts
P.S.: Are you using the Akismet plugin? If so, you should really delete it and replace it with another anti-spam plugin. Akismet deletes the comments of a LOT of legitimate commentators. If I submit your website now as spam to Akismet, you’ll be blocked from posting to other websites with the Akismet plugin. This could be very damaging to your business, since anybody with bad intentions could hinder their competition that way!!
Thanks for sharing this. We’ll think again before using Google to find out what a text ‘is about’ in a language we don’t know. There just might be a company name in Hindi!
Definitely not true. The TARGET text is not uploaded, neither in SDL Studio nor in MemoQ. GT does of course receive input in the form of the requests (the SOURCE segments), and that is what the warning in SDL Studio (here erroneously called Trados) and MemoQ is about. You are revealing the source document, which can be considered a privacy breach in certain cases.
Does Google Translate benefit by knowing what the ‘question’ is, even if it doesn’t get the translator’s ‘answer’? Possibly it does, although I’ve not yet read a decent explanation on how that is supposed to work. I doubt that it can be of much use, so I have no objections against the use of Google Translate as a reference. It can be very helpful to find specific lingo of the trade you’re dealing with, particularly if your segments are long – thus providing context for GT to work with.
This article is scaremongering. Reality is unfolding in front of our eyes. Better get used to it and adapt.
“By turning on Google Translate, we give Google access to both source and target texts. Google then replicates those matches whenever anyone else tries to translate even remotely similar text strings.”
The truth is that this is about the only way they can display search results. Think about it. What it is saying is just stating the obvious. And the fact that this is in fact a problem only means that existing copyright laws are totally inappropriate in the Internet age.
[...] use of our content and give up all rights to the privacy of information.” (-> cited from http://aqtext.com/blog/google-translate/; June 1st, [...]
It is really a nice and helpful piece of information. I am happy that you shared this helpful information with us. Please keep us informed like this. Thanks for sharing.
OmegaT can connect to Apertium and I know Apertium does not store any information you provide, so that is an option.