Marta Bartnicka, inspired by Bogna Czyżewska
I have put together a few examples of placing user interface strings in code. After working several years in the localization industry, every professional translator gets the feeling that there must be some reason why programmers usually manage strings wrong: in a way that makes it hard not only to translate the UI, but – in the first place – to extract the strings and put them back.
The reason exists, and can be found in pretty much every programming course, be it academic, bootcamp, or self-study. And while print ("Hello, World!") is a cool way to get started, you can really do better with UI strings management – not only for the future translators, but also for your future self maintaining the code (and resources; we’ll get to that).
NOTE: Programming examples in this article are written in Cython++.
- 0. Level disaster: just add them in code
- 1. Level in the know: use resource files
- 2. Level I got the power: concatenate
- 3. Level enlightened: use variables
- 4. Level God of I18n: handle risks
- Bonus
0. Level disaster: just add them in code
Because this is what you learn in your first programming lesson!
English should be OK for everyone
Code
if situation == welcome:
    print "Welcome!";
else:
    print "Goodbye!"
Resources
What???
Need translation
Code
if situation == welcome:
    if language == en_US:
        print "Welcome!";
    if language == es_ES:
        print "¡Hola!";
    if language == pl_PL:
        print "Witaj!";
else:
    if language == en_US:
        print "Goodbye!";
    if language == es_ES:
        print "¡Adiós!";
    if language == pl_PL:
        print "Do widzenia!"
Resources
Busy typing the conditions, no time for that!!!!!1111
1. Level in the know: use resource files
Whatever made you separate UI strings from code, you are going in the right direction!
English should be OK for everyone
Code
// from_resources picks string from a resource file by key
if situation == welcome:
    print (from_resources (UI.resources, "welcome_msg"))
else:
    print (from_resources (UI.resources, "goodbye_msg"))
Resources
UI.resources
welcome_msg = Welcome!
goodbye_msg = Goodbye!
Need translation
Code
// from_resources picks string from a resource file by key and language
// other than that, no changes to code!
if situation == welcome:
    print (from_resources (UI.resources, "welcome_msg"))
else:
    print (from_resources (UI.resources, "goodbye_msg"))
Resources
UI.resources for English
language_code = en_US
welcome_msg = Welcome!
goodbye_msg = Goodbye!
NOTE: Only the resource file goes to translation! This applies to all further examples.
UI.resources for Spanish
language_code = es_ES
welcome_msg = ¡Hola!
goodbye_msg = ¡Adiós!
UI.resources for Polish
language_code = pl_PL
welcome_msg = Cześć!
goodbye_msg = Do widzenia!
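The article does not spell out how from_resources works, so here is a minimal Python sketch of one possible shape. It assumes the three resource files have already been loaded into one dictionary per language code; the names RESOURCES, DEFAULT_LANGUAGE and greet, the explicit language parameter, and the fallback to English are all assumptions made for illustration.

```python
# Hypothetical in-memory stand-in for the three UI.resources files above.
RESOURCES = {
    "en_US": {"welcome_msg": "Welcome!", "goodbye_msg": "Goodbye!"},
    "es_ES": {"welcome_msg": "¡Hola!", "goodbye_msg": "¡Adiós!"},
    "pl_PL": {"welcome_msg": "Cześć!", "goodbye_msg": "Do widzenia!"},
}

DEFAULT_LANGUAGE = "en_US"  # assumed fallback; the article does not specify one


def from_resources(resources, key, language=DEFAULT_LANGUAGE):
    """Pick a string by key for the given language, falling back to English."""
    strings = resources.get(language, resources[DEFAULT_LANGUAGE])
    return strings.get(key, resources[DEFAULT_LANGUAGE][key])


def greet(situation, language):
    # The branching logic never mentions a concrete language.
    key = "welcome_msg" if situation == "welcome" else "goodbye_msg"
    return from_resources(RESOURCES, key, language)


print(greet("welcome", "pl_PL"))
print(greet("goodbye", "es_ES"))
```

Adding a fourth language in this sketch means adding one more dictionary (one more resource file); greet stays untouched.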
2. Level I got the power: concatenate
This was in your second programming lesson. Or in the third.
English should be OK for everyone
Code
// from_resources picks string from a resource file by key
// user_name picks user name
if situation == welcome:
    print (from_resources (UI.resources, "welcome_msg") + ", " + user_name + "!")
else:
    print (from_resources (UI.resources, "goodbye_msg") + ", " + user_name + "!")
Resources
UI.resources
welcome_msg = Welcome
goodbye_msg = Goodbye
Results:
Welcome, Tina Turner!
Goodbye, Tina Turner!
Need translation
Code
// from_resources picks string from a resource file by key and language
// other than that, no changes to code!
// user_name picks user name
if situation == welcome:
    print (from_resources (UI.resources, "welcome_msg") + ", " + user_name + "!")
else:
    print (from_resources (UI.resources, "goodbye_msg") + ", " + user_name + "!")
Resources
UI.resources for English
language_code = en_US
welcome_msg = Welcome
goodbye_msg = Goodbye
Results:
Welcome, Tina Turner!
Goodbye, Tina Turner!
UI.resources for Spanish
language_code = es_ES
welcome_msg = Hola
goodbye_msg = Adiós
Results:
Hola, Concha Buika!
Adiós, Concha Buika!
This has wrong punctuation for Spanish
UI.resources for Polish
language_code = pl_PL
welcome_msg = Cześć
goodbye_msg = Do widzenia
Results:
Cześć, Maryla Rodowicz!
Do widzenia, Maryla Rodowicz!
In Polish, this sounds as creepy as a talking robot
3. Level enlightened: use variables
Do you want to define a UI string as a whole? Yes, please!
English should be OK for everyone
Code
// from_resources picks string from a resource file by key and inserts variable into placeholder
// user_name picks user name
if situation == welcome:
    print (from_resources (UI.resources, "welcome_msg", user_name))
else:
    print (from_resources (UI.resources, "goodbye_msg", user_name))
Resources
UI.resources
welcome_msg = Welcome, {0}!
goodbye_msg = Goodbye, {0}!
Results:
Welcome, Tina Turner!
Goodbye, Tina Turner!
Need translation
Code
// from_resources picks string from a resource file by key and language, and inserts variable into placeholder
// other than that, no changes to code!
// user_name picks user name
if situation == welcome:
    print (from_resources (UI.resources, "welcome_msg", user_name))
else:
    print (from_resources (UI.resources, "goodbye_msg", user_name))
Resources
UI.resources for English
language_code = en_US
welcome_msg = Welcome, {0}!
goodbye_msg = Goodbye, {0}!
Results:
Welcome, Tina Turner!
Goodbye, Tina Turner!
UI.resources for Spanish
language_code = es_ES
welcome_msg = ¡Hola {0}!
goodbye_msg = ¡Adiós {0}!
Results:
¡Hola Concha Buika!
¡Adiós Concha Buika!
Punctuation is adjusted to Spanish standards
UI.resources for Polish
language_code = pl_PL
welcome_msg = {0}: witamy!
goodbye_msg = {0}: do widzenia!
Results:
Maryla Rodowicz: witamy!
Maryla Rodowicz: do widzenia!
The order of the phrase is reversed to create a message that is acceptable in Polish
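The placeholder variant can be sketched in Python as below. This is only an illustration under assumptions: the resources are held in a hypothetical RESOURCES dictionary, the language is passed explicitly, and Python's built-in str.format stands in for whatever placeholder mechanism your framework offers.

```python
# Hypothetical in-memory copy of the resource files from this level.
RESOURCES = {
    "en_US": {"welcome_msg": "Welcome, {0}!", "goodbye_msg": "Goodbye, {0}!"},
    "es_ES": {"welcome_msg": "¡Hola {0}!", "goodbye_msg": "¡Adiós {0}!"},
    "pl_PL": {"welcome_msg": "{0}: witamy!", "goodbye_msg": "{0}: do widzenia!"},
}


def from_resources(resources, key, language, *args):
    """Pick the whole message for the language and fill in its placeholders."""
    template = resources[language][key]
    # Word order and punctuation come from the translated template,
    # not from the code that calls this function.
    return template.format(*args)


print(from_resources(RESOURCES, "welcome_msg", "en_US", "Tina Turner"))
print(from_resources(RESOURCES, "welcome_msg", "es_ES", "Concha Buika"))
print(from_resources(RESOURCES, "welcome_msg", "pl_PL", "Maryla Rodowicz"))
```

Because the code only supplies the value and the template decides where it goes, the Polish translator can move {0} to the front without anyone touching the code.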
4. Level God of I18n: handle risks
Because the world is complicated.
NOTE: For clarity, examples in this chapter only show resources. Code to pick strings from resources is the same as in the previous example.
Concatenate strings only when necessary
You already know that putting two sub-sentence fragments together, or adding variable text into a phrase, can cause trouble in translation. Please don’t concatenate just to save a few bytes.
Let’s imagine menu options: New file, New table, New tree (whatever they are in a given use case).
It’s technically possible to organize resources like this:
option_1_new = New
option_2_file = file
option_2_table = table
option_2_tree = tree
And then, you could build options from resources like this:
option_1_new + option_2_file
option_1_new + option_2_table
option_1_new + option_2_tree
(Yes, you also need to insert the space somewhere between the strings.)
It’s not a problem to translate these strings into Polish for example:
option_1_new = Nowe
option_2_file = plik
option_2_table = tabela
option_2_tree = drzewo
So what’s wrong with this solution in translations?
Gender.
The nouns „file”, „table” and „tree” may have different genders in various languages, and in many of these languages the gender of a noun also implies gender of an adjective.
In Polish, the only correct translation of the entire options would be:
Nowy plik
Nowa tabela
Nowe drzewo
The translation I randomly chose for „New” – „Nowe” – only matches the last option, „tree” („drzewo”), but is incorrect for the other two.
There is no single translation of „New” that would fit in all cases, and there is no way to translate the nouns in a gender-neutral way. You only get a reasonable translation when the phrases „New file”, „New table” and „New tree” are translated as a whole.
Now, was it really worth it? Knowing that translation may be impacted, does it pay off to extract „New” into a separate string to reuse it 2 (two) more times? I say no.
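To make the gender problem concrete, here is a small Python sketch that builds the Polish options both ways and compares them. The key names option_new_file, option_new_table and option_new_tree, as well as the helper build_concatenated, are hypothetical; the translations are the ones given above.

```python
# Fragment-based Polish resources, as in the concatenation example above.
CONCATENATED_PL = {
    "option_1_new": "Nowe",
    "option_2_file": "plik",
    "option_2_table": "tabela",
    "option_2_tree": "drzewo",
}

# Whole-phrase Polish resources (key names are hypothetical).
WHOLE_PL = {
    "option_new_file": "Nowy plik",
    "option_new_table": "Nowa tabela",
    "option_new_tree": "Nowe drzewo",
}


def build_concatenated(resources, noun_key):
    """Glue the shared adjective to a noun, the way the code would."""
    return resources["option_1_new"] + " " + resources[noun_key]


for noun in ("file", "table", "tree"):
    glued = build_concatenated(CONCATENATED_PL, "option_2_" + noun)
    whole = WHOLE_PL["option_new_" + noun]
    status = "ok" if glued == whole else "wrong gender"
    print(f"{glued:<12} | {whole:<12} | {status}")
```

Only the „tree” row matches; the other two show an adjective that disagrees with its noun, which no single translation of option_1_new can fix.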
If the list of objects attached to „New” were longer, some 10 for example, then it would actually make sense to cut the string in two. And there is a workaround that accommodates gender-aware languages and allows „New” to be translated in a neutral way that fits all cases; it’s a bit stiff and unnatural, but it works and is both understandable and acceptable for such cases:
option_1_new = New item:
option_2_file = file
option_2_table = table
option_2_tree = tree
option_2_collection = collection
option_2_list = list
...
Polish translation:
option_1_new = Nowy element:
option_2_file = plik
option_2_table = tabela
option_2_tree = drzewo
option_2_collection = kolekcja
option_2_list = lista
...
This solution makes sense for a list of 10 or so objects, but would look rather awkward for a list of 3.
NOTE: Besides gender, other factors also get in the way of building sentences and phrases from fragments: grammatical number, inflection and more. Polish is just one good example, because it is a morphologically rich language.
Be prepared that something may still not work in some language
In the earlier code and resources sample, the phrase „Welcome, X” does not translate well to any form of welcome that is actually used in Poland; to get something merely acceptable, translators had to reverse the order of string and variable and change the punctuation.
A question that experienced localization professionals would ask is: Do we really need to mention the user name in this message?
- If yes, then we communicate to translators that variable is mandatory, and they use their magic to rephrase the message so that it sounds, at least, acceptable in their language.
- If not (for example: the name of a user is mentioned earlier/later on the same screen, page or message), then we should allow the variable to be deleted in translation – in the languages where the entire message is hard to translate. In other words, we make the variable not mandatory:
UI.resources for English
language_code = en_US
welcome_msg = Welcome, {0}!
goodbye_msg = Goodbye, {0}!
Results:
Welcome, Tina Turner!
Goodbye, Tina Turner!
UI.resources for Spanish
language_code = es_ES
welcome_msg = ¡Hola {0}!
goodbye_msg = ¡Adiós {0}!
Results:
¡Hola Concha Buika!
¡Adiós Concha Buika!
UI.resources for Polish
language_code = pl_PL
welcome_msg = Witamy!
goodbye_msg = Do widzenia!
Results:
Witamy!
Do widzenia!
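A non-mandatory variable only works if the code tolerates a translation that leaves the placeholder out. The Python sketch below shows one way this could look: str.format ignores positional arguments that the template does not use, and a hypothetical check_translation helper flags a missing placeholder only when the variable was declared mandatory. The helper names and the mandatory flag are assumptions, not part of any particular localization tool.

```python
import string

# Hypothetical in-memory copy of the resources above; Polish omits {0}.
RESOURCES = {
    "en_US": {"welcome_msg": "Welcome, {0}!"},
    "es_ES": {"welcome_msg": "¡Hola {0}!"},
    "pl_PL": {"welcome_msg": "Witamy!"},
}


def has_placeholder(template, index=0):
    """Return True if the template references positional placeholder `index`."""
    parsed = string.Formatter().parse(template)
    return any(field == str(index) for _, field, _, _ in parsed)


def check_translation(template, mandatory):
    """Accept a translation unless a mandatory placeholder is missing."""
    return has_placeholder(template) or not mandatory


def from_resources(resources, key, language, *args):
    # str.format ignores positional arguments the template does not use,
    # so a translation without {0} is rendered as-is.
    return resources[language][key].format(*args)


print(from_resources(RESOURCES, "welcome_msg", "en_US", "Tina Turner"))
print(from_resources(RESOURCES, "welcome_msg", "pl_PL", "Maryla Rodowicz"))
print(check_translation(RESOURCES["pl_PL"]["welcome_msg"], mandatory=False))
```

If your formatting function raises an error on unused arguments instead, dropping the placeholder in translation would break the build, so this is worth verifying before telling translators the variable is optional.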
Ask for advice
What if none of the above solutions fit your problem? Please put together the code, resources and your use case, list the issues that you encountered in localization, and mail kontakt@localization.pl. We will respond as soon as we can to propose a solution, or at least we will point you in the right direction.
Bonus
I collected a few recent real-life use cases that are repeatable – and thus, worth mentioning.
- Collect a list of Important Names used in the UI: trademarks, references to other applications, company names, etc. I know that an app does not always have a name approved by marketing at the time you code the UI; in this case, select a Placeholder Name and use it consistently across the UI, so that it’s easy to replace it later. When you prepare the UI for localization, add a brief definition to each Important Name (or Placeholder Name) and share the list with translators. You just created the glossary – and glossaries really improve the speed and quality of translation.
- Free-style capitalization is a nice feature in English that might be used, for example, to highlight a cool behaviour of the app that you are working on. However, starting anything except Important Names (or Placeholder Names) with capital letters is bad for translation: Any Words Starting with Capital Letters could indicate a name of an application, an option, or a technology, and translators will either ask a lot of questions whether those are translatable, or – way worse – make different assumptions and Leave Random Strings in English.
- Whatever format and tool you use for extracting translatable strings, ensure that you mark 100% of non-translatable strings as non-translatable, so that they are skipped when extracting. When you do not send stuff like %2$d⏎exp: %3$d or nonSupportedView to translation, you save time in two ways: you do not have to reply to translators’ questions about what those weird strings mean, and you do not have to deal with build errors in case translators did not ask.
- Assuming that Tatanka is the name of your company, don’t hurt yourself and the translators with messages like this: Your Tatanka ID account email address hasn’t been verified. Clusters of more than 3 nouns are hard to parse for humans. Instead of dealing with questions from translators, or bad translations resulting from bad guessing, rephrase the message: The email address linked to your Tatanka ID has not been verified.
- Once more: do not concatenate for fun. [Log in] to change your profile – where [Log in] is a button with text and to change your profile is a separate string – is cute in English, and probably in a few other languages, but will not work in many other languages. This is because in those languages Log in may need to stand somewhere other than the beginning of the string, and to change your profile may be hard to translate on its own, or may have 3 different translations depending on what stands before it. Mixing the button with its label is not really worth the mess.
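As an illustration of the point above about marking non-translatable strings, here is a minimal Python sketch. It assumes a made-up resource layout in which every entry carries a translatable flag; real formats and extraction tools have their own ways of expressing this, so the entry structure, the key view_fallback_id and the function name are all hypothetical.

```python
# Hypothetical resource entries; the "translatable" flag is an assumed convention.
ENTRIES = [
    {"key": "welcome_msg", "text": "Welcome, {0}!", "translatable": True},
    {"key": "view_fallback_id", "text": "nonSupportedView", "translatable": False},
    {"key": "goodbye_msg", "text": "Goodbye, {0}!", "translatable": True},
]


def extract_for_translation(entries):
    """Return only the strings that should be sent to translators."""
    return {e["key"]: e["text"] for e in entries if e["translatable"]}


to_translate = extract_for_translation(ENTRIES)
for key, text in to_translate.items():
    print(f"{key} = {text}")
```

Internal identifiers such as nonSupportedView never reach the translators, so nobody can ask about them or translate them by mistake.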