Wer fremde Sprachen nicht kennt, weiß nichts von seiner eigenen. (He who ignores foreign languages knows nothing of his own.)
—Goethe
As the reach of the Web expands, developers find that their web applications must be customized to match the needs of new audiences of different cultures. Internationalization is the process of adapting software so that it can be used across many cultures and locales. Localization is the process of actually modifying the product to create a version customized for a particular language, country, or locale.
The difference between internationalization and localization can be fuzzy, and it can change from situation to situation. As a simplistic example, consider a social networking site. At a minimum, internationalization would involve adapting the application to accept and display data in a wide variety of character sets (say, by using UTF-8 for all input, output, and storage). Localization would at least involve translation of user interface elements to several languages, and possibly much more.
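As a minimal sketch of the "UTF-8 everywhere" idea above, the following Python snippet encodes text to UTF-8 bytes at the storage boundary and decodes it on the way back. The helper names `to_storage` and `from_storage` are hypothetical, chosen only for illustration; a real application would usually rely on its framework and database driver to do this.

```python
def to_storage(text: str) -> bytes:
    """Encode application text as UTF-8 bytes before writing it out."""
    return text.encode("utf-8")


def from_storage(raw: bytes) -> str:
    """Decode UTF-8 bytes read from storage.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8,
    which surfaces encoding mistakes early instead of hiding them.
    """
    return raw.decode("utf-8")


# Names in several writing systems survive the round trip unchanged.
names = ["Zo\u00eb", "\u0141ukasz", "\u5c71\u7530"]
stored = [to_storage(name) for name in names]
restored = [from_storage(raw) for raw in stored]
assert restored == names
```

Keeping the encode and decode steps at the edges of the program means the rest of the code can work with text strings without caring about bytes.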
The term "internationalization" is usually abbreviated i18n, short for "i, 18 letters, and then n." Similarly, "localization" is abbreviated L10n. To avoid ambiguity (an uppercase I is easily mistaken for a lowercase l, and vice versa), i18n is conventionally written with a lowercase i, while L10n uses an uppercase L. I will use this convention throughout this chapter.
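The abbreviation rule is mechanical enough to check in a few lines. This is only an illustrative sketch; the function name `numeronym` is made up for this example.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of letters between + last letter."""
    if len(word) < 4:
        # Too short to be worth abbreviating; return it unchanged.
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization"))  # i18n
print(numeronym("Localization"))          # L10n
```

The case of the first letter is preserved, so passing a capitalized "Localization" yields the uppercase L described above.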
Although language translation gets the lion's share of attention in this field, it is but one part of i18n. A human language may have significant regional differences or variants between countries where the language is spoken. Dialects aside, there can be large differences in currency, collation (sort order), number and date format, and even writing system across regional or political divisions within a country.
These differences are encapsulated in the concept of locale. A locale is usually defined as a language plus a country or region. It includes not only language but also regional and local preferences and possibly a character encoding. A POSIX-style locale identifier looks like en_US.UTF-8 (English, United States, UTF-8 character encoding).
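To make the structure of such an identifier concrete, here is a small sketch that splits a POSIX-style locale string into its parts. It assumes the general shape language[_TERRITORY][.codeset][@modifier]; the names `PosixLocale` and `parse_posix_locale` are hypothetical, and the sketch does no validation of the individual codes.

```python
from typing import NamedTuple, Optional


class PosixLocale(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


def parse_posix_locale(identifier: str) -> PosixLocale:
    """Split an identifier such as 'en_US.UTF-8' into its components."""
    # Peel the optional parts off from right to left:
    # "@modifier" first, then ".codeset", then "_TERRITORY".
    base, _, modifier = identifier.partition("@")
    lang_territory, _, codeset = base.partition(".")
    language, _, territory = lang_territory.partition("_")
    return PosixLocale(
        language=language,
        territory=territory or None,
        codeset=codeset or None,
        modifier=modifier or None,
    )


print(parse_posix_locale("en_US.UTF-8"))
print(parse_posix_locale("de"))
```

Only the language is required in this sketch; a bare "de" parses with every other field set to None.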