Locale
In computing, a locale is a set of parameters that defines the user's language, country and any special variant preferences that the user wants to see in their user interface. Usually a locale identifier consists of at least a language identifier and a region identifier.

On Unix, Linux and other POSIX-type platforms, locale identifiers are defined similarly to the BCP 47 definition of language tags, but the locale variant modifier is defined differently, and the character set is included as a part of the identifier. It is defined in this format: [language[_territory][.codeset][@modifier]]. (For example, Australian English using the UTF-8 encoding is en_AU.UTF-8.)
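
The identifier format above can be taken apart mechanically. The following is a minimal sketch in Python, not a reference implementation: the names parse_posix_locale and PosixLocale are hypothetical, and the regular expression is only an approximation of the [language[_territory][.codeset][@modifier]] shape, so real systems may accept or reject identifiers that this sketch does not.

```python
import re
from typing import NamedTuple, Optional

# Approximate pattern for language[_territory][.codeset][@modifier].
# This is an illustrative assumption, not a rule taken from any standard.
_POSIX_LOCALE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


class PosixLocale(NamedTuple):
    """The four parts of a POSIX-style locale identifier (hypothetical type)."""
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


def parse_posix_locale(identifier: str) -> PosixLocale:
    """Split an identifier such as 'en_AU.UTF-8' into its parts."""
    match = _POSIX_LOCALE.match(identifier)
    if match is None:
        raise ValueError(f"not a POSIX-style locale identifier: {identifier!r}")
    # Parts that are absent from the identifier come back as None.
    return PosixLocale(**match.groupdict())


if __name__ == "__main__":
    print(parse_posix_locale("en_AU.UTF-8"))
```

For en_AU.UTF-8 the sketch yields the language en, the territory AU and the codeset UTF-8, with no modifier.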

General locale settings

These settings usually include the following display (output) format settings:
  • Number format setting
  • Character classification, case conversion settings
  • Date/Time format setting
  • String collation setting
  • Currency format setting
  • Paper size setting
  • other minor settings ...


Locale settings concern the formatting of output for a given locale, so time zone information and daylight saving time are not usually part of them.
Less usual, but worth mentioning, is the input format setting, which is mostly defined on a per-application basis.

Furthermore, the general settings usually include the keyboard layout setting.
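
As an illustration of the number format setting listed above, the sketch below uses Python's standard locale module to read the decimal point and thousands separator of a locale and to format a sample value. The function name number_format is hypothetical. The "C" locale is used as the default because it is always available; any other locale name (for example cs_CZ.UTF-8) is an assumption about what is installed on the system and may raise locale.Error.

```python
import locale


def number_format(locale_name: str = "C") -> dict:
    """Return the number-format conventions of the given locale (hypothetical helper)."""
    # Remember the current numeric locale so it can be restored afterwards.
    previous = locale.setlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        conventions = locale.localeconv()
        return {
            "decimal_point": conventions["decimal_point"],
            "thousands_sep": conventions["thousands_sep"],
            # grouping=True applies the locale's digit grouping, if it has any.
            "sample": locale.format_string("%.2f", 1234.5, grouping=True),
        }
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)


if __name__ == "__main__":
    print(number_format("C"))
```

The locale is changed only inside the function and restored afterwards, because setlocale affects the whole process.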

Programming/markup language support

In these environments,

  • C
  • C++
  • Eiffel
  • Java
  • Microsoft .NET Framework
  • REBOL
  • Ruby
  • Perl
  • PHP
  • Python
  • XML

and other (nowadays) Unicode-based environments, locales are defined in a format similar to BCP 47. They are usually defined with just ISO 639 and ISO 3166-1 alpha-2 codes.
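
To show how the two notations relate, here is a hedged Python sketch that reduces a POSIX-style identifier such as en_AU.UTF-8 to a language-region tag in the BCP 47 style (en-AU). The function name to_language_tag is hypothetical, and the sketch assumes a two- or three-letter language code and a two-letter country code; it deliberately ignores scripts, variants and every other part of BCP 47.

```python
def to_language_tag(posix_identifier: str) -> str:
    """Reduce 'language[_territory][.codeset][@modifier]' to 'language[-REGION]'."""
    # The codeset and modifier are dropped; this sketch does not try to map them.
    base = posix_identifier.split("@", 1)[0].split(".", 1)[0]
    language, _, territory = base.partition("_")

    # Assumption: language codes are two or three ASCII letters.
    if not (language.isascii() and language.isalpha() and len(language) in (2, 3)):
        raise ValueError(f"unexpected language code: {language!r}")
    tag = language.lower()

    if territory:
        # Assumption: the territory is a two-letter country code.
        if not (territory.isascii() and territory.isalpha() and len(territory) == 2):
            raise ValueError(f"unexpected territory code: {territory!r}")
        tag += "-" + territory.upper()
    return tag


if __name__ == "__main__":
    print(to_language_tag("en_AU.UTF-8"))
```

Identifiers that carry no language code, such as the "C" locale, are rejected by this sketch rather than mapped to a tag.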

POSIX-type platforms

On Unix, Linux and other POSIX-type platforms, locale identifiers are defined similarly to the BCP 47 definition of language tags, but the locale variant modifier is defined differently, and the character set is included as a part of the identifier.

The following example shows the output of the locale command for the Czech language (cs) in the Czech Republic (CZ) with explicit UTF-8 encoding:

$ locale
LANG=cs_CZ.UTF-8
LC_CTYPE="cs_CZ.UTF-8"
LC_NUMERIC="cs_CZ.UTF-8"
LC_TIME="cs_CZ.UTF-8"
LC_COLLATE="cs_CZ.UTF-8"
LC_MONETARY="cs_CZ.UTF-8"
LC_MESSAGES="cs_CZ.UTF-8"
LC_PAPER="cs_CZ.UTF-8"
LC_NAME="cs_CZ.UTF-8"
LC_ADDRESS="cs_CZ.UTF-8"
LC_TELEPHONE="cs_CZ.UTF-8"
LC_MEASUREMENT="cs_CZ.UTF-8"
LC_IDENTIFICATION="cs_CZ.UTF-8"
LC_ALL=
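
In the output above only LANG is set explicitly; the quoted LC_* values are the ones the locale command derived from it. The sketch below illustrates the usual POSIX lookup order for a single category: a non-empty LC_ALL wins, then the category's own variable, then LANG. The function name effective_locale is hypothetical, and the final fallback to "C" is an assumption, since the default is implementation-defined.

```python
from typing import Mapping


def effective_locale(category: str, env: Mapping[str, str]) -> str:
    """Return the locale name that applies to one category, e.g. 'LC_TIME'."""
    # Order of precedence: LC_ALL, then the category variable, then LANG.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:  # unset and empty variables are both skipped
            return value
    # Assumed fallback; the actual default depends on the implementation.
    return "C"


if __name__ == "__main__":
    environment = {"LANG": "cs_CZ.UTF-8", "LC_ALL": ""}
    print(effective_locale("LC_TIME", environment))
```

Passing os.environ as the mapping would apply the same lookup to the current process environment.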

Lists of the language, region and character set codes used in such identifiers may be found on the Internet Assigned Numbers Authority (IANA) website. Details of the IANA registry for language tag extensions and of other IANA protocol registries are also to be found there.

Specifics for Microsoft platforms

The locale identifier (LCID) for unmanaged code on Microsoft Windows is a number such as 1033 for English (United States) or 1041 for Japanese (Japan). These numbers consist of a language code (lower 10 bits) and a culture code (upper bits) and are therefore often written in hexadecimal notation, such as 0x0409 or 0x0411. The list of those codesets is described in character encoding.

Microsoft is beginning to introduce unmanaged code application programming interfaces (APIs) for .NET that use this format. One of the first to be generally released is a function to mitigate issues with internationalized domain names (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/nls_DownlevelGetLocaleScripts.asp), but more are in Windows Vista Beta 1.
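
The bit layout of an LCID described above can be checked with a few lines of Python. This is a minimal sketch that follows the article's description only (language code in the lower 10 bits, culture code in the bits above); the function name split_lcid is hypothetical, and the sketch makes no attempt to interpret any further fields an identifier might carry.

```python
from typing import Tuple


def split_lcid(lcid: int) -> Tuple[int, int]:
    """Split an LCID into (language code, culture code) as described in the text."""
    language_code = lcid & 0x3FF  # lower 10 bits
    culture_code = lcid >> 10     # remaining upper bits
    return language_code, culture_code


if __name__ == "__main__":
    for lcid in (1033, 1041):
        language_code, culture_code = split_lcid(lcid)
        print(f"{lcid} = 0x{lcid:04X}: language 0x{language_code:02X}, culture {culture_code}")
```

Writing the value in hexadecimal makes the split easier to see: 1033 is 0x0409 and 1041 is 0x0411, so the two identifiers differ only in their language code.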

Beginning with Windows Vista, new functions that use BCP 47 locale names have been introduced to replace nearly all LCID-based APIs.

See also

  • Internationalization and localization
  • ISO 639 language codes
  • ISO 3166-1 alpha-2 country codes
  • IETF language tag
  • Common Locale Data Repository
  • Date and time representation by country
  • AppLocale


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 