ISO 639-3
ISO 639-3:2007, Codes for the representation of names of languages — Part 3: Alpha-3 code for comprehensive coverage of languages, is an international standard for language codes in the ISO 639 series. The standard describes three-letter codes for identifying languages. It extends the ISO 639-2 alpha-3 codes with the aim of covering all known natural languages. The standard was published by ISO on 2007-02-05.

It is intended for use in a wide range of applications, in particular computer systems where many languages need to be supported. It provides an enumeration of languages as complete as possible, including living and extinct, ancient and constructed, major and minor, written and unwritten. However, it does not include reconstructed languages such as Proto-Indo-European.

It is a superset of ISO 639-1 and of the individual languages in ISO 639-2. ISO 639-1 and ISO 639-2 focused on major languages, those most frequently represented in the total body of the world's literature. Since ISO 639-2 also includes language collections and Part 3 does not, ISO 639-3 is not a superset of ISO 639-2. Where B and T codes exist in ISO 639-2, ISO 639-3 uses the T-codes.

Examples:

language   639-1   639-2 (B/T)   type         639-3
English    en      eng           individual   eng
German     de      ger/deu       individual   deu
Arabic     ar      ara           macro        ara: arb + several others
Minnan     -       -             individual   nan
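
The relationship between the three parts can be sketched as a small lookup that normalizes any code from the examples above to its ISO 639-3 identifier, preferring the T-code where ISO 639-2 has both a B and a T form. This is only an illustration: the names `EXAMPLES`, `build_lookup` and `to_iso639_3` are hypothetical, and the table holds just the four example rows, not the full standard.

```python
# Rows copied from the examples table: (639-1, 639-2/B, 639-2/T, 639-3).
# None marks a code that does not exist for that language.
EXAMPLES = [
    ("en", "eng", "eng", "eng"),   # English
    ("de", "ger", "deu", "deu"),   # German: B and T codes differ
    ("ar", "ara", "ara", "ara"),   # Arabic (a macrolanguage in 639-3)
    (None, None, None, "nan"),     # Minnan: only present in 639-3
]


def build_lookup(rows):
    """Map every known code of a language to its ISO 639-3 identifier."""
    lookup = {}
    for alpha2, bib, term, alpha3 in rows:
        for code in (alpha2, bib, term, alpha3):
            if code is not None:
                lookup[code] = alpha3
    return lookup


TO_639_3 = build_lookup(EXAMPLES)


def to_iso639_3(code):
    """Return the ISO 639-3 code for a 639-1, 639-2 or 639-3 code, or None."""
    return TO_639_3.get(code.lower())


print(to_iso639_3("de"), to_iso639_3("ger"), to_iso639_3("nan"))
```

A real application would load the complete code tables from the registration authority rather than hard-coding rows.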


The final standard contains 7589 entries. The inventory of languages is based on a number of sources including: the individual languages contained in 639-2, modern languages from the Ethnologue 15th edition, historic varieties, ancient languages and artificial languages from Anthony Aristar at the Linguist List, as well as languages recommended within a public commenting period.

ISO 639-1 codes can be mapped to their ISO 639-3 equivalents using the List of ISO 639-1 codes.

Code space

Since the code is three-letter alphabetic, one upper bound for the number of languages that can be represented is 26 × 26 × 26 = 17576. Since ISO 639-2 defines special codes (4), a reserved range (520) and B-only codes (23), 547 codes cannot be used in Part 3. Therefore a tighter upper bound is 17576 − 547 = 17029.

The upper bound gets even lower if one subtracts the language collections defined in 639-2 and the ones yet to be defined in ISO 639-5.
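
The arithmetic behind the code space can be checked with a short script. It is a minimal sketch: the counts of special and B-only codes are taken from the text as plain numbers, and the reserved range is assumed to run from qaa to qtz, which the text does not spell out.

```python
from itertools import product
from string import ascii_lowercase

# Every three-letter lowercase string is a candidate identifier.
all_codes = ["".join(t) for t in product(ascii_lowercase, repeat=3)]
total = len(all_codes)  # 26 * 26 * 26

# Assumption: the reserved range is qaa..qtz (20 second letters x 26 third letters).
reserved = [c for c in all_codes if "qaa" <= c <= "qtz"]

SPECIAL_CODES = 4   # count quoted in the text
B_ONLY_CODES = 23   # count quoted in the text

unavailable = SPECIAL_CODES + len(reserved) + B_ONLY_CODES
upper_bound = total - unavailable

print(total, len(reserved), unavailable, upper_bound)
# prints: 17576 520 547 17029
```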

Macrolanguages

There are 56 languages in ISO 639-2 which are considered, for the purposes of the standard, to be "macrolanguages" in ISO 639-3.

Some of these macrolanguages had no individual language as defined by ISO 639-3 in the code set of ISO 639-2, e.g. 'ara' (Generic Arabic). Others like 'nor' (Norwegian) had their two individual parts ('nno' (Nynorsk), 'nob' (Bokmål)) already in ISO 639-2.

That means some languages (e.g. 'arb', Standard Arabic) that were considered by ISO 639-2 to be dialects of one language ('ara') are now, in ISO 639-3, considered in certain contexts to be individual languages themselves.

This is an attempt to deal with varieties that may be linguistically distinct from each other, but are treated by their speakers as two forms of the same language, e.g. in cases of diglossia.

For example:
  • http://www.sil.org/iso639-3/documentation.asp?id=ara (Generic Arabic, 639-2)
  • http://www.sil.org/iso639-3/documentation.asp?id=arb (Standard Arabic, 639-3)


See ISO 639 macrolanguage for the complete list.
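
The macrolanguage relationship can be pictured as a one-to-many mapping from a macrolanguage code to its individual language codes. The sketch below uses only the members named in this section; `MACROLANGUAGES` and `macrolanguage_of` are hypothetical names, and the real mapping is far larger (Arabic alone has several members besides 'arb').

```python
# Illustrative subset only: members are the ones named in the text.
MACROLANGUAGES = {
    "ara": {"arb"},         # Generic Arabic: Standard Arabic plus several others
    "nor": {"nno", "nob"},  # Norwegian: Nynorsk and Bokmål
}


def macrolanguage_of(code):
    """Return the macrolanguage an individual code belongs to, or None."""
    for macro, members in MACROLANGUAGES.items():
        if code in members:
            return macro
    return None


def same_macrolanguage(a, b):
    """True if two individual codes share a macrolanguage in this subset."""
    macro = macrolanguage_of(a)
    return macro is not None and macro == macrolanguage_of(b)


print(macrolanguage_of("arb"), same_macrolanguage("nno", "nob"))
```

A lookup of this kind lets software treat 'nno' and 'nob' as distinct languages in one context and as forms of Norwegian in another.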

Collective languages

"A collective language code element is an identifier that represents a group of individual languages that are not deemed to be one language in any usage context." These codes do not precisely represent a particular language or macrolanguage.

While ISO 639-2 includes three-letter identifiers for collective languages, these codes are excluded from ISO 639-3. Hence ISO 639-3 is not a superset of ISO 639-2.

ISO 639-5 defines 3-letter collective codes for language families and groups.

Usage of ISO 639-3

  • Lexical Markup Framework, ISO specification for representation of machine-readable dictionaries
  • Ethnologue, Linguist List
  • IETF language tag
  • proposed as language TLD (lcTLD) http://www.circleid.com/posts/languages_in_the_root_a_tld_launch_strategy_based_on_iso_639 http://forum.icann.org/lists/gtld-strategy-draft/msg00005.html

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.