Localization

Describe the advantages of localizing an application

Internationalization is the process of designing an application so that it can be adapted to various languages and regions without engineering changes.

An internationalized program has the following characteristics:

  • With the addition of localized data, the same executable can run worldwide.

  • Textual elements, such as status messages and the GUI component labels, are not hardcoded in the program. Instead they are stored outside the source code and retrieved dynamically.

  • Support for new languages does not require recompilation.

  • Culturally-dependent data, such as dates and currencies, appear in formats that conform to the end user's region and language.

  • It can be localized quickly.

Localization is the process of adapting software for a specific region or language by adding locale-specific components and translating text.

The primary task of localization is translating the user interface elements and documentation. Localization changes not only the language of the interface but also the display of numbers, dates, currency, and so on. Other types of data, such as sounds and images, may require localization if they are culturally sensitive. The better internationalized an application is, the easier it is to localize it for a particular language and character encoding scheme.

An internationalized program can display information differently throughout the world. For example, the program will display different messages in Paris, Tokyo, and New York. If the localization process has been fine-tuned, the program will display different messages in New York and London to account for the differences between American and British English. An internationalized program references a Locale object to identify the appropriate language and region of its end users.

A java.util.Locale object is an identifier for a particular combination of language and region. If a class varies its behavior according to Locale, it is said to be locale-sensitive. For example, the java.text.NumberFormat class is locale-sensitive; the format of the number it returns depends on the Locale. Thus NumberFormat may return a number as 14 092 011 (France), or 14.092.011 (Germany), or 14,092,011 (United States).

int i = 14_092_011;
Locale l3 = new Locale("en", "US");
System.out.print(l3 + " uses ");
System.out.println(NumberFormat.getInstance(l3).format(i));
					

en_US uses 14,092,011
					

Locale objects are only identifiers. The real work, such as formatting and detecting word boundaries, is performed by the methods of the locale-sensitive classes.
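
The comparison above can be tried with a short, self-contained sketch. The class name NumberByLocale and its format helper are illustrative names, not part of the JDK. The exact grouping separator printed for France depends on the locale data shipped with the JDK in use, so no output is listed here.

```java
import java.text.NumberFormat;
import java.util.Locale;

class NumberByLocale {
    // Formats the same value with the number formatter of the given locale.
    static String format(int value, Locale locale) {
        return NumberFormat.getInstance(locale).format(value);
    }

    public static void main(String[] args) {
        int i = 14_092_011;
        Locale[] locales = { Locale.FRANCE, Locale.GERMANY, Locale.US };
        for (Locale locale : locales) {
            // The Locale only identifies the region; NumberFormat does the formatting.
            System.out.println(locale + " uses " + format(i, locale));
        }
    }
}
```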



Define what a locale represents

A java.util.Locale consists of three parts:

  1. The language (mandatory, only lowercase letters)

    The language code is either two or three lowercase letters that conform to the ISO 639 standard.

    Examples:

    • de - German

    • en - English

    • fr - French

    • es - Spanish

    • be - Belarusian

  2. The country (optional, uppercase letters or digits)

    The region (country) code consists of either two or three uppercase letters that conform to the ISO 3166 standard, or three numbers that conform to the UN M.49 standard.

    Examples:

    • US - United States

    • CA - Canada

    • DE - Germany

    • FR - France

    • ES - Spain

    • BY - Belarus

  3. The variant, often used for a dialect (optional, case-sensitive)

It is very common to use only the language part. Sometimes it is handy to add the country part. The variant is almost never used.

This makes it possible to build a hierarchy of translations: common elements are defined at the language level, country-specific elements at the country level, and even more specific messages at the variant level:

  • de contains the translations that are valid in all German-speaking countries.

  • de_CH contains translations that differ from the basic German translation, i.e. spellings and expressions unique to Switzerland. Likewise, de_DE contains translations specific to users in Germany.
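
The parts of a Locale can be read back with its getter methods. The following is a minimal sketch; LocaleParts and describe are hypothetical names chosen for illustration. Parts that are not set are returned as empty strings.

```java
import java.util.Locale;

class LocaleParts {
    // Lists the three parts of a locale; absent parts show up as empty strings.
    static String describe(Locale locale) {
        return "language=" + locale.getLanguage()
             + ", country=" + locale.getCountry()
             + ", variant=" + locale.getVariant();
    }

    public static void main(String[] args) {
        Locale german = Locale.GERMAN;                        // language only
        Locale swissGerman = Locale.forLanguageTag("de-CH");  // language and country
        System.out.println(german + " -> " + describe(german));
        System.out.println(swissGerman + " -> " + describe(swissGerman));
    }
}
```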



Read and set the locale by using the Locale object

The four ways to create a Locale object are:

  • java.util.Locale.Builder class

    The java.util.Locale.Builder utility class can be used to construct a java.util.Locale object that conforms to the IETF BCP 47 syntax. For example, to specify the French language and the country of Canada, you could invoke the Locale.Builder constructor and then chain the setter methods as follows:

    Locale l = new Locale.Builder().setLanguage("fr").setRegion("CA").build();
    								

  • java.util.Locale constructors

    There are three constructors available in the Locale class for creating a Locale object:

    Locale(String language) {...}
    Locale(String language, String country) {...}
    Locale(String language, String country, String variant) {...}
    								

    The following example creates a Locale object for the French language in Canada:

    Locale l = new Locale("fr", "CA");
    								

  • forLanguageTag factory method

    If you have a language tag string that conforms to the IETF BCP 47 standard, you can use the forLanguageTag(String) factory method, which was introduced in the Java SE 7 release. For example:

    Locale l = Locale.forLanguageTag("fr-CA");
    								

  • Locale constants

    For your convenience the Locale class provides constants for some languages and countries. For example:

    Locale l = Locale.CANADA_FRENCH;
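
As a quick check that the four approaches describe the same locale, the sketch below builds French (Canada) in each way and compares the results. LocaleCreation and frenchCanada are illustrative names. The Locale constructors are marked as deprecated in recent JDK releases, which is why the warning is suppressed here.

```java
import java.util.Locale;

class LocaleCreation {
    // Builds the French (Canada) locale in the four ways listed above.
    @SuppressWarnings("deprecation") // the constructors are deprecated in recent JDK releases
    static Locale[] frenchCanada() {
        return new Locale[] {
            new Locale.Builder().setLanguage("fr").setRegion("CA").build(),
            new Locale("fr", "CA"),
            Locale.forLanguageTag("fr-CA"),
            Locale.CANADA_FRENCH
        };
    }

    public static void main(String[] args) {
        for (Locale l : frenchCanada()) {
            System.out.println(l + " / " + l.toLanguageTag()
                + " equals CANADA_FRENCH: " + l.equals(Locale.CANADA_FRENCH));
        }
    }
}
```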
    								

The Java platform does not require you to use the same Locale throughout your program. If you wish, you can assign a different Locale to every locale-sensitive object in your program. This flexibility allows you to develop multilingual applications, which can display information in multiple languages.

However, most applications are not multilingual and their locale-sensitive objects rely on the default Locale. Set by the Java Virtual Machine when it starts up, the default Locale corresponds to the locale of the host platform. To determine the default Locale of your Java Virtual Machine, invoke the Locale.getDefault() method:

Locale l = Locale.getDefault();
					

You can set the default locale for all locale-sensitive classes by using the Locale.setDefault(Locale l) method:

Locale.setDefault(new Locale("be", "BY"));
					

NOTE: you should not set the default Locale programmatically because it is shared by all locale-sensitive classes.
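
If the default does have to be changed, for example in a test, one cautious approach is to restore the previous value afterwards. The sketch below assumes a hypothetical helper named formatUnderDefault; it is one possible pattern, not a recommendation from the Java documentation.

```java
import java.text.NumberFormat;
import java.util.Locale;

class DefaultLocaleDemo {
    // Uses the no-argument factory, which relies on the current default locale.
    static String formatWithDefault(int value) {
        return NumberFormat.getInstance().format(value);
    }

    // Temporarily switches the default locale and always restores the previous one.
    static String formatUnderDefault(int value, Locale temporaryDefault) {
        Locale previous = Locale.getDefault();
        try {
            Locale.setDefault(temporaryDefault);
            return formatWithDefault(value);
        } finally {
            Locale.setDefault(previous);
        }
    }

    public static void main(String[] args) {
        System.out.println("Default locale: " + Locale.getDefault());
        System.out.println("Formatted under de_DE: " + formatUnderDefault(1_234_567, Locale.GERMANY));
        System.out.println("Default locale afterwards: " + Locale.getDefault());
    }
}
```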



Build a resource bundle for each locale

Conceptually each ResourceBundle is a set of related subclasses that share the same base name. The list that follows shows a set of related subclasses. MyApp is the base name. The characters following the base name indicate the language code, country code, and variant of a Locale. MyApp_en_GB, for example, matches the Locale specified by the language code for English (en) and the country code for Great Britain (GB).

MyApp.class
MyApp_de.class
MyApp_en_GB.class
MyApp_fr_CA_UNIX.class
					

To select the appropriate ResourceBundle, invoke the ResourceBundle.getBundle(...) method. The following example selects the MyApp ResourceBundle for the Locale that matches the French language, the country of Canada, and the UNIX platform.

Locale locale = new Locale("fr", "CA", "UNIX");
ResourceBundle introLabels = ResourceBundle.getBundle("MyApp", locale);
					

Note that getBundle(...) looks for classes based on the default Locale before it selects the base class (MyApp). If getBundle(...) fails to find a match in the preceding list of classes, it throws a MissingResourceException. To avoid throwing this exception, you should always provide a base class with no suffixes.
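
When it is not certain that a bundle exists, the exception can be handled explicitly. The following sketch wraps the lookup in a hypothetical findBundle helper that returns null when nothing matches. Whether a MyApp bundle is found depends on what is on the classpath when the example is run.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

class BundleLookup {
    // Returns the bundle, or null when no class or properties file matches the base name.
    static ResourceBundle findBundle(String baseName, Locale locale) {
        try {
            return ResourceBundle.getBundle(baseName, locale);
        } catch (MissingResourceException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ResourceBundle labels = findBundle("MyApp", Locale.CANADA_FRENCH);
        if (labels == null) {
            System.out.println("No MyApp bundle found on the classpath");
        } else {
            // getLocale() reports which bundle in the family was actually selected.
            System.out.println("Loaded bundle for locale: " + labels.getLocale());
        }
    }
}
```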

The abstract class ResourceBundle has two subclasses:

  • PropertyResourceBundle

    A PropertyResourceBundle is backed by a properties file. A properties file is a plain-text file that contains translatable text. Properties files are not part of the Java source code, and they can contain values for String objects only. If you need to store other types of objects, use a ListResourceBundle instead.

    A property resource bundle is a text file of "key=value" pairs such as:

    okButtonLabel=Ok
    cancelButtonLabel=Cancel
    								

    These pairs are stored in a file named baseName_locale.properties.

    NOTE: the default property resource bundle, used when no bundle matches the locale, is named MyApp.properties.

    The US English version would be in the file MyApp_en_US.properties:

    okButtonLabel=Ok
    cancelButtonLabel=Cancel
    								

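    The Belarusian version would be in the file MyApp_be_BY.properties. Before Java 9, properties files are read as ISO-8859-1, so in the actual file the Cyrillic text must be written as \uXXXX Unicode escapes; it is shown unescaped here for readability: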

    okButtonLabel=Добра
    cancelButtonLabel=Скасаваць
    								

  • ListResourceBundle

    The ListResourceBundle class manages resources with a convenient list. Each ListResourceBundle is backed by a class file. You can store any locale-specific object in a ListResourceBundle. To add support for an additional Locale, you create another source file and compile it into a class file.
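
A minimal sketch of a ListResourceBundle follows. The nested Labels class and the maxRetries key are hypothetical; in a real application the bundle would normally be a public top-level class named after the bundle family (for example MyApp_en_US) so that ResourceBundle.getBundle(...) can locate it by name. Here the bundle is instantiated directly to keep the example self-contained.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

class ListBundleDemo {
    // Hypothetical bundle holding both String and non-String resources.
    static class Labels extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                { "okButtonLabel", "Ok" },
                { "cancelButtonLabel", "Cancel" },
                { "maxRetries", Integer.valueOf(3) } // any object can be stored, not only Strings
            };
        }
    }

    public static void main(String[] args) {
        ResourceBundle labels = new Labels();
        System.out.println(labels.getString("okButtonLabel"));
        System.out.println(labels.getObject("maxRetries"));
    }
}
```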



Call a resource bundle from an application

The first argument to ResourceBundle.getBundle(String s, Locale l) is the bundle name. This argument must be the fully qualified name of the base resource bundle class. Thus, it must include the full package name as well as the class name: myPackage.MyResources.

Loading a resource bundle via ResourceBundle.getBundle(String s, Locale l) is a locale-sensitive operation. Thus, the second argument to getBundle(...) is a Locale. The getBundle(...) method uses this Locale object to identify which version of the resource bundle to load.

To find the correct, locale-specific, resource bundle, getBundle(...) builds variations of the bundle name until it finds the name of a class that can be loaded.

When you call getBundle(...), you specify the base name of the desired ResourceBundle and a desired Locale (if you do not want to rely on the default locale). Recall that a Locale is specified with a language code, an optional country code, and an optional variant string. getBundle(...) looks for an appropriate ResourceBundle class for the locale by appending this locale information to the base name for the bundle. The method looks for an appropriate class in the following order:

  1. bundleName + "_" + localeLanguage + "_" + localeCountry + "_" + localeVariant

  2. bundleName + "_" + localeLanguage + "_" + localeCountry (example: MyResources_be_BY.class)

  3. bundleName + "_" + localeLanguage

  4. bundleName + "_" + defaultLanguage + "_" + defaultCountry + "_" + defaultVariant

  5. bundleName + "_" + defaultLanguage + "_" + defaultCountry (example: MyResources_en_US.class)

  6. bundleName + "_" + defaultLanguage

  7. bundleName (example: MyResources.class)

where localeLanguage, localeCountry and localeVariant are taken from the locale specified in the getBundle(...) call. The defaultLanguage, defaultCountry and defaultVariant are taken from the default locale. As you can see, the resource bundle named bundleName is the bundle of last resort and contains the values to be used if a version of the bundle is not available for a specific locale. If no ResourceBundle subclass can be found, getBundle(...) throws a MissingResourceException.
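
The search order above can be sketched as a small helper that lists the candidate names for a requested locale and a fallback (default) locale. This is a simplified illustration of the seven steps, not the JDK's actual lookup code; CandidateNames, addNames and candidateNames are hypothetical names, and names that would repeat are listed only once.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class CandidateNames {
    // Adds the names for one locale, most specific first, skipping parts that are not set.
    private static void addNames(Set<String> names, String baseName, Locale locale) {
        String language = locale.getLanguage();
        String country = locale.getCountry();
        String variant = locale.getVariant();
        if (!language.isEmpty() && !country.isEmpty() && !variant.isEmpty()) {
            names.add(baseName + "_" + language + "_" + country + "_" + variant);
        }
        if (!language.isEmpty() && !country.isEmpty()) {
            names.add(baseName + "_" + language + "_" + country);
        }
        if (!language.isEmpty()) {
            names.add(baseName + "_" + language);
        }
    }

    // Requested locale first, then the fallback locale, then the bare base name.
    static List<String> candidateNames(String baseName, Locale requested, Locale fallback) {
        Set<String> names = new LinkedHashSet<>();
        addNames(names, baseName, requested);
        addNames(names, baseName, fallback);
        names.add(baseName);
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        Locale requested = Locale.forLanguageTag("be-BY");
        Locale fallback = Locale.US; // stands in for the default locale
        for (String name : candidateNames("MyResources", requested, fallback)) {
            System.out.println(name);
        }
    }
}
```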

Typically, a program provides a default bundle for each of its resource bundles. The default bundle contains the full set of key-value pairs in the bundle. Thus, people performing the localization on the bundle have all the information required.

If the bundle in question is a properties bundle, ResourceBundle.getBundle(...) creates a PropertyResourceBundle and initializes it with the information from a properties file. ResourceBundle.getBundle(...) derives the name of the properties file in the same manner as it derives resource bundle class names.

At each step in the search process above, getBundle(...) first checks for a class file with the given name. If no class file is found, it uses the getResourceAsStream(...) method of ClassLoader to look for a properties file with the same name as the class and a .properties extension. If such a properties file is found, its contents are used to create a Properties object, and getBundle(...) instantiates and returns a PropertyResourceBundle that exposes those properties through the ResourceBundle API.

If getBundle(...) cannot find a class or properties file for the specified locale in any of the search steps, it repeats the search using the default locale instead of the specified locale. If no appropriate ResourceBundle is found in this search either, getBundle(...) throws a MissingResourceException.

The method looks for an appropriate properties file in the following order:

  1. bundleName + "_" + localeLanguage + "_" + localeCountry + "_" + localeVariant

  2. bundleName + "_" + localeLanguage + "_" + localeCountry (example: MyApp_be_BY.properties)

  3. bundleName + "_" + localeLanguage

  4. bundleName + "_" + defaultLanguage + "_" + defaultCountry + "_" + defaultVariant

  5. bundleName + "_" + defaultLanguage + "_" + defaultCountry (example: MyApp_en_US.properties)

  6. bundleName + "_" + defaultLanguage

  7. bundleName (example: MyApp.properties)

The properties file has a .properties extension.



Format text for localization by using NumberFormat and DateFormat

NumberFormat class

By invoking the methods provided by the NumberFormat class, you can format numbers, currencies, and percentages according to Locale.

You can use the NumberFormat methods to format primitive numbers, such as double and int, as well as their wrapper objects, such as Double and Integer.

The following code example formats an int according to Locale. Invoking the getNumberInstance(Locale l) method returns a locale-specific instance of NumberFormat:

Locale l = Locale.US;
NumberFormat formatter = NumberFormat.getNumberInstance(l);
					

The format(Object o) method accepts the Integer as an argument and returns the formatted number in a String:


Integer i = 15_091_974;
System.out.println(String.format("Locale: %s; int: %s", l, formatter.format(i)));

					

The output generated by this code follows:

Locale: en_US; int: 15,091,974
					

If you are writing business applications, you will probably need to format and display currencies. You format currencies in the same manner as numbers, except that you call getCurrencyInstance(Locale l) to create a formatter. When you invoke the format(Object o) method, it returns a String that includes the formatted number and the appropriate currency sign.

This code example shows how to format currency in a locale-specific manner:


Integer i = 15_091_974;
Locale l = Locale.US;
NumberFormat formatter = NumberFormat.getCurrencyInstance(l);
System.out.println(String.format("Locale: %s; currency: %s", l, formatter.format(i)));

					

The output generated by the preceding lines of code is as follows:

Locale: en_US; currency: $15,091,974.00
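
Percentages follow the same pattern, using the getPercentInstance(Locale l) factory method. The sketch below is a minimal illustration; PercentDemo and formatPercent are hypothetical names. The value passed in is a fraction, so 0.25 is formatted as 25 percent.

```java
import java.text.NumberFormat;
import java.util.Locale;

class PercentDemo {
    // Formats a fraction (0.25 means 25 percent) for the given locale.
    static String formatPercent(double fraction, Locale locale) {
        return NumberFormat.getPercentInstance(locale).format(fraction);
    }

    public static void main(String[] args) {
        Locale l = Locale.US;
        System.out.println(String.format("Locale: %s; percent: %s", l, formatPercent(0.25, l)));
    }
}
```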
					

DateFormat class

The DateFormat class allows you to format dates and times with predefined styles in a locale-sensitive manner.

Formatting dates with the DateFormat class is a two-step process:

  • You create a formatter with the getDateInstance(...) method:

    Locale l = Locale.US;
    DateFormat formatter = DateFormat.getDateInstance(DateFormat.DEFAULT, l);
    								

  • You invoke the format(...) method, which returns a String containing the formatted date:

    
    Date d = new Date();
    System.out.println(String.format("Locale: %s; Date: %s", l, formatter.format(d)));
    
    								

The output generated by this code follows:

Locale: en_US; Date: Apr 5, 2012
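
Besides DateFormat.DEFAULT, the predefined styles SHORT, MEDIUM, LONG and FULL can be passed to getDateInstance(...). The sketch below formats a fixed date in each style; DateStyles and its format helper are illustrative names. The exact text can vary with the locale data of the JDK in use, so no output is listed.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class DateStyles {
    // Formats a date with one of the predefined DateFormat styles.
    static String format(Date date, int style, Locale locale) {
        return DateFormat.getDateInstance(style, locale).format(date);
    }

    public static void main(String[] args) {
        // A fixed date (April 5, 2012) keeps the example reproducible.
        Date d = new GregorianCalendar(2012, Calendar.APRIL, 5).getTime();
        int[] styles = { DateFormat.SHORT, DateFormat.MEDIUM, DateFormat.LONG, DateFormat.FULL };
        for (int style : styles) {
            System.out.println(format(d, style, Locale.US));
        }
    }
}
```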
					

SimpleDateFormat class

If you want to create your own customized date formats, you can use the SimpleDateFormat class.

When you create a SimpleDateFormat object, you specify a pattern String. The contents of the pattern String determine the format of the date and time.

The following pattern letters are defined:

G   Era designator                                       AD
y   Year                                                 1996; 96
M   Month in year                                        July; Jul; 07
w   Week in year                                         27
W   Week in month                                        2
D   Day in year                                          189
d   Day in month                                         10
F   Day of week in month                                 2
E   Day name in week                                     Tuesday; Tue
u   Day number of week (1 = Monday, ..., 7 = Sunday)     1
a   Am/pm marker                                         PM
H   Hour in day (0-23)                                   0
k   Hour in day (1-24)                                   24
K   Hour in am/pm (0-11)                                 0
h   Hour in am/pm (1-12)                                 12
m   Minute in hour                                       30
s   Second in minute                                     55
S   Millisecond                                          978
z   General time zone                                    Pacific Standard Time; PST; GMT-08:00
Z   RFC 822 time zone                                    -0800
X   ISO 8601 time zone                                   -08; -0800; -08:00
					

The following code formats a date according to the pattern String passed to the SimpleDateFormat constructor. The String returned by the format(...) method contains the formatted date that is to be displayed:


Date d = new Date();
String pattern = "EEE, MMMM d, yyyy";
Locale l = Locale.US;
SimpleDateFormat sdf  = new SimpleDateFormat(pattern, l);
System.out.println(String.format("Locale: %s; custom date format: %s", l, sdf.format(d)));

					

The output generated by this code follows:

Locale: en_US; custom date format: Thu, April 5, 2012
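
The same pattern String can also be used to parse text back into a Date. The following sketch parses the output shown above and re-formats it with a second pattern; PatternRoundTrip and reformat are hypothetical names, and the target pattern yyyy-MM-dd is chosen only for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class PatternRoundTrip {
    // Parses text with one pattern and formats the resulting Date with another.
    static String reformat(String text, String fromPattern, String toPattern, Locale locale) {
        try {
            Date parsed = new SimpleDateFormat(fromPattern, locale).parse(text);
            return new SimpleDateFormat(toPattern, locale).format(parsed);
        } catch (ParseException e) {
            // The text did not match fromPattern for this locale.
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        String result = reformat("Thu, April 5, 2012", "EEE, MMMM d, yyyy", "yyyy-MM-dd", Locale.US);
        System.out.println(result); // prints 2012-04-05
    }
}
```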