Facebook and Google Want to Translate Your Site
Translations for Facebook Connect
Earlier this week Facebook announced a new service built on the Facebook Connect API called Translations for Facebook Connect. In general, the idea behind this tool is to allow developers the ability to translate a web site (into a language currently supported by Facebook, 65+ right now,) by crowdsourcing the translation itself. Leaning on their own experience letting users translate the Facebook user interface, Facebook is essentially opening its translation workflow process to the world.
From the How It Works portion of the announcement:
After you choose what languages you want your site or application to support, you can get help from the Facebook community to translate your site, as we did, or you can do the translation yourself, or make a specific person the administrator of the process. […] Once you register content for translation, your connected Facebook users can start translating your sites' content just as users helped translate Facebook.
Facebook has also created a new client-side featureset, using their XFBML framework, to allow developers to automatically submit content wrapped in a
fb:intl tag for translation and to allow translators to translate the content inline. Facebook has put together a simple demo to show this in action [Note: the link is gone and Facebook blocked it from being archived in the Wayback machine].
The Facebook Developer Wiki article on internationalization outlines the process from start to finish. Here you can see that the process isn't just one of machine translation, but that a site owner must designate content strings (pages, words, phrases, UI elements, images, etc.) that are to be translated into a language selected by the site owner. The article also provides samples of how a translator might perform bulk translation or inline translation with screenshots showing orange underlines and menus that appear on a right-click to perform the translations.
The most important part of this tool, however, isn't the technology but instead a fundamental understanding of the process of translation. Facebook has a pretty good article on Internationalization Best Practices that at least presents a primer to those who have never been through this process, gleaned from Facebook's own experience translating its site. This addresses, albeit without a lot of detail, some of the pitfalls those of us in the world of localization have experienced, such as the tendency for translated phrases to contain more characters than the original phrase (which can mess with a pixel-precise layout), or choosing general phrases to leverage throughout the site instead of translating similar but different bits of words over and over.
Unlike machine translation, humans are performing the translations, greatly increasing the likelihood that they will understand context. However, there are many conditions that must be satisfied or assumptions that must be made to implement this set of tools…
Does the web site attract users who are fluent in another language (the desired language)? If the site isn't already in the native tongue of a user, why would he/she be there?
Are these users willing to help translate your site for free? Given the concept of "you get what you pay for," there has to be a consideration for the quality and skill of the translators who take it upon themselves to do this work.
Is the translator a subject matter expert? You may not want to have just anyone translate your site about widgets. Widgets may require some very technical language that a casual reader may not grasp and may not understand in context. Words that are mundane in daily use may have a very specific meaning in the world of widgets and might not suffer a loose translation well (or perhaps aren't even supposed to be translated).
Is the translator part of your target audience? It's one thing to understand Portugese because your are from Portugal. It's another thing entirely to translate into Portuguese for use by Brazilians. Understanding the region, dialect, and local idioms can be very important for a proper translation (which is why we call it localization).
Are the site owners comfortable letting unknown third parties translate their message? Tag lines, marketing lingo and other carefully crafted content usually doesn't translate easily. Even with an approval process in place, it may take many passes to get site owners confident with a brands translation in language they do not speak.
Does the site already have Facebook Connect enabled? If not, then time must be taken to add the appropriate code to each page of the site to be translated.
Is time budgeted to identify content for translation? Somebody has to walk through the entire site (or at least any pages to be translated) and mark up the content for translation using what may be arcane XFBML tags.
Does the user even allow Facebook Connect? If your ideal translator doesn't want to log in to Facebook Connect on your site, then all this is moot. The user/translator has to trust both Facebook and your site, and feel altruistic enough to want to help.
Google Web Site Translator Gadget
The Google Translator Toolkit powers this machine translation tool, and while Google states that the toolkit learns from corrections to translations provided by users, it's still just machine translation without human intervention. It's also far easier to implement on a site and doesn't require much overhead. As an aid to the user, when the mouse hovers over the translated content, the block of copy is highlighted and the original content is displayed using a "tooltip" (or the title attribute of the element).
The Google Translate human intervention cycle
Because Google Translate has been around for so long, many users are accustomed to the quirks and know better than to rely on it to translate marketing content or poetic prose. Where it excels is in quickly allowing a site visitor to get the gist of a page, understand an address, or otherwise grab content that isn't so intertwined with a need for context that a user can glean some use.
Please understand that I am not referring to the "Translate" feature that was added to the Google Toolbar for Firefox as of yesterday (October 1). That is a client-side function that operates independently of the web site owner and I am only addressing translation features that a web site owner might want to implement to translate his/her site.
Machine translation is a risky endeavor. Even the most robust natural language processors have trouble with elements of language that humans understand naturally, such as context, ambiguity, syntactic irregularity, multiple meanings of words, etc. An example often used in this discussion is evident in these sentences:
Time flies like an arrow.
Fruit flies like an apple.
The words "flies" and "like" have completely different meaning between sentences. Only our knowledge of the metaphor and the bug allow us to understand the differences without further context. As my brand manager for me, I took the introductory paragraph from my web site (excuse my ego, in case you hadn't noticed it already):
I am a founder, partner, and Senior Usability Engineer at Algonquin Studios, responsible for bridging the gap between the worlds of design and technology. With experience in both, I bring a unique perspective to projects, allowing both design and implementation to merge seamlessly.
I translated it into German and then back to English:
I am a founder, partner and senior usability engineer at AlgonquinStudios, responsible for bridging the gap between the world of design and technology. With experience in both, I bring a unique perspective on projects, up so that both design and implementation to proceed smoothly integrated.
We can see it does pretty well for the first half of the paragraph and then it falls apart. This experiment is inherently unfair, of course, but it does demonstrate some of the risks with machine translation. It becomes particularly problematic when proper names that correspond to common words are translated.
Which to Use?
Machine translation is a risky proposition at best. You have no control over what content will come back to your end user. Adding a Google Translate feature to your site can give the appearance to some users that you have effectively signed off on the content. If that's not a concern, perhaps because of the nature of your site (fan site, personal site, etc.) then it's certainly your cheapest option. If you have a business site then this may not be the right path for you. You may still want to link to the Google Translate page to let end users perform translations on their own. At that point you have set an expectation that this is not a serice you provide and you are not responsible for how bizarrely an idiom may be translated.
Crowdsourcing your translation may sound like a better idea simply because you now have humans making decisions about what makes sense in the translation, but you have to take into account the caveats above. Are you prepared to pay for (or spend) the time preparing a site for translation and then babysitting the process, hoping the entire time someone with enough linguistic skill will come along and translate it?
This is where I would lean on a third option. Hire professionals who have done this before. If you are talking about translating your corporate brand and message, then you shouldn't leave it to machines or the waiting game of altruism. If you just want to translate a personal site, a fan site, or perhaps a strong community site (like Facebook), then either other option may work well for you depending on what you can commit. It may even be worth considering Google for the short term option while waiting for the Facebook option to pay off.
Just added Google translate into my mambo-driven site (www.ivyhurst.com) and it seems to work pretty well. I'm impressed, for sure!