Localization Changes Everything

Sep 10, 2018 | Test Automation Insights

As I go through the ways that I can automate certain tasks, there are a few items I look for. Of course, the standard options are text fields, links, and buttons. These items are common points of interaction with any site, so they are an obvious focus for automation. The flow is simple: click on an item with a particular text value and confirm the result. That works most of the time, but there is one area where it will likely fail. That area is Localization.

What is localization?

Localization, sometimes loosely referred to as Internationalization, involves various methods for adapting computer software to the languages, customs, and display conventions of another area or locale. The two terms are distinct, though. Internationalization focuses on developing software so that it can be adapted to different languages and locales without requiring significant coding changes. By contrast, Localization adapts software to a particular locale by translating specific text and making the other changes that locale needs.

There are variations that can be made for different language groups. Examples include the differences between standard French and Canadian French, or between European Portuguese and Brazilian Portuguese. Localization may vary depending on how the software is developed, but a common method is using tokenized strings. This means that anywhere a tokenized string appears, an external library will be accessed and, depending on the locale, the appropriate language or standard will be used. This goes beyond converting text strings into a different language. Date formats may be different. Calendars could be displayed differently. The holidays that are highlighted would be different. Text direction could be different, changing from left-to-right to right-to-left, or even to vertical in some cases.
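To make the tokenized-string idea concrete, here is a minimal sketch in Python. The catalog, the token names, and the fallback order are all assumptions made for illustration, and the sample translations are placeholders rather than vetted copy. A real project would rely on its own localization library.

```python
# Hypothetical string catalog: one table of token -> text per locale.
# The entries are illustrative sample data only.
CATALOG = {
    "en-US": {"nav.next": "Next", "nav.back": "Back"},
    "fr-FR": {"nav.next": "Suivant", "nav.back": "Retour"},
    "fr-CA": {"nav.back": "Précédent"},  # a regional table may override only some tokens
}


def candidate_locales(locale, default="en-US"):
    """Yield the requested locale, then same-language locales, then the default."""
    yield locale
    language = locale.split("-")[0]
    for other in CATALOG:
        if other != locale and other.split("-")[0] == language:
            yield other
    yield default


def translate(token, locale):
    """Resolve a token for a locale, falling back as described above."""
    for candidate in candidate_locales(locale):
        text = CATALOG.get(candidate, {}).get(token)
        if text is not None:
            return text
    return token  # return the raw token so a missing translation is easy to spot


print(translate("nav.next", "fr-CA"))  # resolved through the fr-FR table
print(translate("nav.back", "fr-CA"))  # resolved through the fr-CA override
```

The point for automation is that the token ("nav.next") stays the same in every locale, while the text the user sees does not.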

Challenges for automation

Text literals

Text strings in the application will not be reliable markers to base tests on. At least, they will not be reliable unless the tests in question are specifically developed to target that language. The problem with an approach that looks for literal text is the need to duplicate test procedures and test steps. If my application supports fifteen different languages, and I want to verify that the title text appears correctly in each of the languages, that means I will have to look for fifteen different strings and compare them with what they should be. In short, I’d be looking at fifteen different tests to verify the accuracy of the translated strings.
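One way to keep those fifteen checks from turning into fifteen hand-written procedures is to drive a single check from a table of expected values. The sketch below assumes a hypothetical fetch_title function that returns the rendered title for a locale. Here it is replaced by a stand-in with one deliberate error, and the expected titles are sample data.

```python
# Expected title per locale (sample data, not from a real application).
EXPECTED_TITLES = {
    "en": "Welcome",
    "fr": "Bienvenue",
    "de": "Willkommen",
    "es": "Bienvenido",
}


def check_titles(fetch_title, expected=EXPECTED_TITLES):
    """Run the same comparison for every locale and collect the mismatches."""
    mismatches = []
    for locale, want in expected.items():
        got = fetch_title(locale)
        if got != want:
            mismatches.append((locale, want, got))
    return mismatches


def fake_fetch_title(locale):
    """Stand-in for the application under test; 'de' was left untranslated."""
    rendered = {"en": "Welcome", "fr": "Bienvenue", "de": "Welcome", "es": "Bienvenido"}
    return rendered[locale]


for locale, want, got in check_titles(fake_fetch_title):
    print(f"{locale}: expected {want!r}, got {got!r}")
```

There are still fifteen comparisons to run, but adding a language means adding a row to the table rather than writing another test.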

Additionally, if I wanted to look for values to confirm actions, such as pressing a button with the label “Next”, I would need to have a way to verify that the label I am looking for is displaying correctly in the appropriate language. If that is the purpose of the test, i.e. to make sure that the label appears correctly for that locale, then that makes sense. However, if the purpose of the test is to navigate to different regions of a page to perform certain steps, that level of specificity is unnecessary and needlessly repetitive. To that end, addressing the elements by their ID values is more effective, as those values will not change, even if the text strings change.
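As a rough illustration of why the ID is the steadier handle, the sketch below finds the same button in an English and a French version of a page using only Python's standard html.parser module. The markup and the "next-btn" ID are invented for the example. In a browser-driving tool the equivalent would be an ID-based locator, such as Selenium's find_element(By.ID, ...).

```python
from html.parser import HTMLParser


class IdTextFinder(HTMLParser):
    """Collect the text inside the first element that has a given id.

    Simplification: assumes the target element contains no unclosed void
    tags such as <br>, which would throw off the depth counter.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than zero while inside the target element
        self.found = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif not self.found and dict(attrs).get("id") == self.target_id:
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)


def label_for(page, element_id):
    """Return the visible label of the element with this id, or None."""
    finder = IdTextFinder(element_id)
    finder.feed(page)
    return "".join(finder.text).strip() if finder.found else None


# Invented sample pages: same structure and id, different language.
PAGE_EN = '<form><button id="next-btn">Next</button></form>'
PAGE_FR = '<form><button id="next-btn">Suivant</button></form>'

print(label_for(PAGE_EN, "next-btn"))
print(label_for(PAGE_FR, "next-btn"))
```

The lookup succeeds on both pages without knowing the label. The label only matters when the test is specifically about the translation.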

Text direction

As I am an English speaker, I am used to text flowing from left-to-right. However, a variety of languages read from right-to-left. Will my test be able to confirm that the values displayed are running in the right direction? What if only part of the page should run in a different direction? To this end, there are a couple of attributes that I would want in my markup so that I can confirm that I am looking at the right values and that they follow the correct language rules. I would want to confirm that the correct language is declared (such as lang="ar") and that the text direction is declared correctly (such as dir="rtl"). These values can be set at the top level of a document, or on a block if the section in question is only a part of a page. HTML5 also includes the dir="auto" option, which determines the direction from the content itself: the browser looks for the first strongly directional character in the element (for example, the first such character typed into a field) and chooses LTR or RTL accordingly.
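A check along these lines could look like the following sketch, which lists every element that declares lang or dir and flags right-to-left languages that are not marked dir="rtl". The sample page and the short list of RTL language codes are assumptions for the example, and the check is deliberately simple: it ignores the fact that dir is inherited from parent elements.

```python
from html.parser import HTMLParser

# A few well-known right-to-left languages; not an exhaustive list.
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


class DirectionAudit(HTMLParser):
    """Record (tag, lang, dir) for every element that sets lang or dir."""

    def __init__(self):
        super().__init__()
        self.declared = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "lang" in attributes or "dir" in attributes:
            self.declared.append((tag, attributes.get("lang"), attributes.get("dir")))


def audit(page):
    """Return every lang/dir declaration found in the page, in document order."""
    parser = DirectionAudit()
    parser.feed(page)
    return parser.declared


def direction_problems(page):
    """Flag elements whose lang is right-to-left but whose own dir is not 'rtl'."""
    return [
        entry for entry in audit(page)
        if (entry[1] or "").split("-")[0] in RTL_LANGUAGES and entry[2] != "rtl"
    ]


# Invented sample page: English document with one Arabic block.
PAGE = (
    '<html lang="en" dir="ltr"><body><p>Hello</p>'
    '<blockquote lang="ar" dir="rtl">مرحبا</blockquote></body></html>'
)

print(audit(PAGE))
print(direction_problems(PAGE))
```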

Context and Placement

At times, it can be difficult to just look at a set of strings and see if they will be appropriate for a given area. Also, if an outside group is performing the translation, it may be weeks or months before the strings that are needed are available. If I am testing an application for localization, I don’t want to have to wait for the text to be available before I can begin my testing. There is a technique called pseudo-localization, where tokens can be placed into the code to show where the text strings will appear. This way, rather than looking at a specific language or direction, I can look at where the text strings will be applied and whether those locations are relevant. This is important if there will be areas with a mix of languages, or if some parts of the interface will remain in English while other parts are translated. By using pseudo-localization, I can see which areas are targeted for translation and can determine if that would be appropriate for the context of the page or the elements being displayed. More to the point, it will also help me identify areas where literal text will not be available to me for use in automating (or at least not as a value that I can rely upon to stay the same).
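Pseudo-localization comes in several flavors. The sketch below shows one common variant, which is an assumption on my part rather than a prescribed method: each source string is wrapped in brackets, its vowels are swapped for accented look-alikes, and it is padded to imitate the longer text that translations often produce. Any string that shows up in the interface without the brackets never went through the string catalog.

```python
# Map plain vowels to accented look-alikes so the text stays readable
# but is obviously not the original English.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîöûÀÉÎÖÛ")


def pseudo_localize(text, expansion=0.3):
    """Return a bracketed, accented, padded stand-in for a translated string."""
    padding = "~" * max(1, round(len(text) * expansion))
    return "[" + text.translate(ACCENTED) + padding + "]"


def looks_localized(text):
    """True if a displayed string carries the pseudo-localization markers."""
    return text.startswith("[") and text.endswith("]")


print(pseudo_localize("Next"))
print(pseudo_localize("Save changes"))
print(looks_localized("Cancel"))  # a hard-coded string that skipped the catalog
```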

Culturally Specific Elements

In addition to the languages and the direction of text, there are a variety of additional items that are handled uniquely in different locales. Examples of these are:

  • Dates, times, and numbers, such as the difference between mm-dd-yyyy and dd-mm-yyyy, am/pm vs. 24-hour time, and the ways that large numbers are represented.
  • The currency used, and the correct valuation (if we are starting off with dollars, what does that equal in pounds, kroner, euros, or yen, and is the correct symbol displayed?)
  • Diacritical characters: are they being inserted correctly? Are we able to use different keyboards appropriately?
  • Does the application correctly support Unicode characters and render non-Roman alphabets correctly?

All of these are part of the puzzle, and it may be a significant challenge to ensure that the values displayed are indeed correct. A linguist may be required to ensure that the correct translations are used, and in the appropriate context for a given locale.
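For the date and number items in particular, the expected values can be worked out from a small table of conventions. The sketch below is illustrative only: the patterns are hand-written assumptions that reuse the hyphenated date styles mentioned above, and a real test suite would take its expectations from the localization library the product actually uses.

```python
from datetime import date

# Hypothetical per-locale conventions, written by hand for this example.
FORMATS = {
    "en-US": {"date": "%m-%d-%Y", "thousands": ",", "decimal": "."},
    "en-GB": {"date": "%d-%m-%Y", "thousands": ",", "decimal": "."},
    "de-DE": {"date": "%d.%m.%Y", "thousands": ".", "decimal": ","},
}


def format_date(value, locale):
    """Format a date using the pattern assumed for the locale."""
    return value.strftime(FORMATS[locale]["date"])


def format_number(value, locale):
    """Format a number to two decimals with the locale's separators."""
    rules = FORMATS[locale]
    whole, fraction = f"{value:,.2f}".split(".")
    return whole.replace(",", rules["thousands"]) + rules["decimal"] + fraction


published = date(2018, 9, 10)
for locale in FORMATS:
    print(locale, format_date(published, locale), format_number(1234567.891, locale))
```

The same date reads as the 10th of September in one locale and, to an unwary reader, the 9th of October in another, which is exactly the kind of difference a text-matching test needs to account for.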

Localization efforts are often done late in the game, or as add-ons for some organizations. This can make for a challenging addition to certain products. A key to success in Localization and Internationalization is to start early where possible, so that language and locale features can be added smoothly as the product matures. This may not always be possible or feasible, but if the goal is to make a product usable by the broadest group of people possible, then Localization and Internationalization will make a lot of sense. To borrow from a Chinese proverb, “The best time to plant a tree is twenty years ago. The second best time is now.” Likewise, starting Localization may be a difficult process, but it will yield results over time.
