by Jake Romer
2019-11-03T23:00:00-0600
Topics: Tech
Hallo, vrienden!
Great news for our friends in the Netherlands – Bike Index is now available in Dutch 🎉
As part of its ongoing mission to eliminate bicycle theft worldwide, Bike Index has partnered with BikeFair, a Dutch bike marketplace dedicated to bringing safety and transparency to second-hand bike sales. Making Bike Index accessible to Dutch users has been a critical component of that partnership.
Together with our recent integration with Dutch stolen goods registries stopheling.nl and verlorenofgevonden.nl, the internationalization project will enable Dutch bicyclists to register and search for their bikes using Bike Index, and to use Bike Index's new Promoted Alerts service, which uses targeted Facebook ads to more effectively recover lost and stolen bikes.
As a resource guide for other open-source projects that may need to undertake a similar project (or contributors to Bike Index – we are open-source!), what follows is a brief outline of the considerations involved in internationalizing a Rails app, our particular constraints and desiderata, and the decisions we made in our implementation.
Unless your Rails app has been internationalized since its inception, internationalizing it minimally entails three broad efforts:
For Bike Index, we did some research into the approaches taken by other internationalized open-source Rails projects – in particular, Discourse and GitLab. This work was useful in developing a mental model of the work to be done, although naturally we deviated from them where different needs or constraints demanded it.
There are a number of ways to detect a user's locale, among them a locale query param (settable via a UI element) and the ACCEPT_LANGUAGE header set on a request (settable via the user's browser preferences).
To minimize complexity, we implemented only (1) through (3).
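As a rough illustration of how two of these signals might be combined, here is a minimal sketch in plain Ruby. It is not Bike Index's implementation: the helper names (parse_accept_language, detect_locale), the list of available locales, and the precedence order (query param first, then header, then a default) are all assumptions made for the example.

```ruby
# Hypothetical helpers for illustration only; not taken from Bike Index.
AVAILABLE_LOCALES = %w[en nl].freeze # assumed set of supported locales
DEFAULT_LOCALE = "en"                # assumed fallback

# Parse an Accept-Language header value into lowercase language tags,
# ordered by descending quality value (q defaults to 1.0 when omitted).
def parse_accept_language(header)
  return [] if header.nil? || header.strip.empty?

  header.split(",").each_with_index.map { |entry, index|
    tag, *params = entry.strip.split(";")
    q_param = params.map(&:strip).find { |p| p.start_with?("q=") }
    quality = q_param ? q_param.delete_prefix("q=").to_f : 1.0
    [tag.to_s.strip.downcase, quality, index]
  }.reject { |tag, quality, _| tag.empty? || quality <= 0 }
    .sort_by { |_, quality, index| [-quality, index] }
    .map(&:first)
end

# Pick the first supported locale, checking the explicit query param
# before the browser's language preferences (an assumed precedence).
def detect_locale(param_locale:, accept_language:)
  candidates = [param_locale, *parse_accept_language(accept_language)]
  candidates.each do |candidate|
    next if candidate.nil?

    full_tag = candidate.to_s.downcase
    primary = full_tag.split("-").first # "nl-be" -> "nl"
    [full_tag, primary].each do |tag|
      return tag if AVAILABLE_LOCALES.include?(tag)
    end
  end
  DEFAULT_LOCALE
end
```

In a Rails controller, the result of a helper like this would typically be passed to I18n.with_locale in an around_action, so the chosen locale applies for the duration of a single request.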
Translation management is a "buy vs. build" decision point. The central questions to engage with here are
That left us pricing a variety of translation management services we'd seen used elsewhere and researching their feature sets, including Transifex, LingoHub, and Phrase.
All involved committing to a monthly subscription that ranged from $19 to $180 per month, in addition to the cost of translation itself, which we estimated at $8,000-$10,000 for an initial Dutch translation.
Some more digging surfaced Translation.io, which is lightweight, focused on Rails (and Laravel) projects, and pushed all the right buttons for us:
As a non-profit, we're relatively price-sensitive and don't want to use funds inefficiently, so the potential savings gave Translation.io a big leg up in our deliberations.
Its most significant feature gap relative to its alternatives – automated GitHub PRs to sync translations – could be implemented with a shell script integrated into our build pipeline, so we had a clear winner.
The key decision for this stage is what format to use for translation files, the choices being YAML (the Rails default) and GetText (broadly popular beyond the Rails ecosystem).
GetText has several advantages over the Rails default, especially for large projects – the most compelling arguably being that strings don't need to be externalized from templates to a translation file. Instead, the source string lives in the template but is merely wrapped in a special method.
But, as is often the case in a Rails context, the defaults are collectively better optimized for the needs of a moderately-scaled project like Bike Index than the alternatives, even if those alternatives are in one sense or another individually better.
There is pre-existing tooling that both mitigates the disadvantages of the Rails default i18n framework and amplifies its benefits, so we chose to not stray too far from Rails conventions in order to leverage as much open-source prior art as possible. Additionally, the YAML approach allows non-developers (marketing?) to edit source copy without diving into the source code.
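To make the contrast between the two formats concrete, here is a small, self-contained sketch in plain Ruby. It imitates the two lookup styles without using Rails or the gettext gem: the t and _ helpers below are simplified stand-ins (the explicit locale: keyword in particular is an assumption for the example), and the keys and Dutch strings are illustrative rather than copied from Bike Index.

```ruby
require "yaml"

# Rails-style: the template refers to a key, and the copy lives in a YAML
# file. It is inlined here as a string; in a Rails app it would normally
# sit under config/locales/.
TRANSLATIONS = YAML.safe_load(<<~YAML)
  en:
    bikes:
      register: "Register your bike"
  nl:
    bikes:
      register: "Registreer je fiets"
YAML

# Simplified stand-in for a key-based lookup with an English fallback.
def t(key, locale:)
  path = key.split(".")
  TRANSLATIONS.dig(locale.to_s, *path) || TRANSLATIONS.dig("en", *path)
end

# GetText-style: the source string stays in the template and doubles as
# the lookup key, so nothing has to be externalized up front.
CATALOG = {
  "nl" => { "Register your bike" => "Registreer je fiets" }
}.freeze

# Simplified stand-in for a gettext-style wrapper; untranslated strings
# fall back to the source text itself.
def _(source, locale:)
  CATALOG.dig(locale.to_s, source) || source
end

puts t("bikes.register", locale: :nl)
puts _("Register your bike", locale: :nl)
```

With the YAML style, changing the English copy means editing only the locale file, which is what lets non-developers adjust wording. With the GetText style, the English copy is edited in the template, and the catalog entry keyed on the old string has to be updated to match.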
String externalization is by far the most time-consuming and labor-intensive part of a translation project.
We automated as much as possible using a variety of code-gen and text wrangling tools:
Some learnings emerged over the course of scanning through and extracting strings from ~15,000 lines of template, controller, and React code. Check out our internationalization docs if you'd like to read more about them!
An expanded version of this post - and others by Jake - can be found on his blog.