English on the Internet
The English language is sometimes described as the lingua franca of computing. In comparison to other sciences, where Latin and Greek are the principal sources of vocabulary, computer science borrows more extensively from English. Due to the technical limitations of early computers and the lack of international standards on the Internet, computer users were limited to using English and the Latin alphabet. This historical limitation is less present today: most software products are localized in numerous languages, and the use of the Unicode character encoding has resolved problems with non-Latin alphabets. Some limitations were lifted only recently, for example in domain names, which previously allowed only ASCII characters.
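
As a rough illustration of the domain-name change: internationalized domain names leave the underlying naming system ASCII-only and map each non-ASCII label to an ASCII form beginning with "xn--". The sketch below uses the `idna` codec from Python's standard library, which implements an older revision of the standard. The domain `bücher.example` and both helper functions are illustrative assumptions, not taken from the article.

```python
# Sketch: a non-ASCII domain label is converted to an ASCII-compatible form,
# so the name that travels over the network still contains only ASCII.
# "bücher.example" is a hypothetical domain chosen for illustration.

def to_ascii_domain(name: str) -> str:
    """Return the ASCII-compatible encoding of a domain name."""
    return name.encode("idna").decode("ascii")

def to_unicode_domain(name: str) -> str:
    """Reverse the conversion, e.g. for display to the user."""
    return name.encode("ascii").decode("idna")

ascii_form = to_ascii_domain("b\u00fccher.example")
print(ascii_form)                     # xn--bcher-kva.example
print(to_unicode_domain(ascii_form))  # bücher.example
```

A purely ASCII name such as `example.org` passes through both functions unchanged.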

Influence on other languages

The computing terminology of many languages borrows from English. Some language communities actively resist that trend, while in other cases English is used extensively and more directly. This section gives some examples of the use of English terminology in other languages, and also mentions any notable differences.

Bulgarian

Both English and Russian have influenced Bulgarian computing vocabulary. However, in many cases the borrowed terminology is translated rather than transcribed phonetically. Combined with the use of the Cyrillic alphabet, this can make it difficult to recognize loanwords. For example, the Bulgarian term for motherboard is 'дънна платка' (/danna platka/, literally "bottom board").
  • компютър /compiutar/ - computer
  • твърд диск /tvard disk/ - hard disk
  • дискета /disketa/ - floppy disk; like the French disquette
  • уеб сайт /web sait/ - web site; but also "интернет страница" /internet stranitsa/

Faroese

The Faroese language has a sparse scientific vocabulary based on the language itself. Many Faroese scientific words are borrowed or modified versions of equivalents in other languages, especially the Nordic languages and English. The vocabulary is constantly evolving, so new words often die out and only a few survive and become widely used. Examples of successful words include "telda" (computer), "kurla" (at sign) and "ambætari" (server).

French

In French there are some generally accepted English loanwords, but there is also a distinct effort to avoid them. In France, the Académie française is responsible for the standardisation of the language and often coins new technological terms. Some of them are accepted in practice; in other cases the English loanwords remain predominant. In Quebec, the Office québécois de la langue française has a similar function.
  • email/mail (in Europe); courriel (mainly in Quebec, but increasingly used in French-speaking Europe); informally mèl; more formally "courrier électronique"
  • pourriel - spam
  • hameçonnage, phishing - phishing
  • télécharger - to download
  • site web - web site
  • lien - hyperlink
  • base de données - database
  • caméra web - webcam

  • amorcer, démarrer, booter - to boot
  • redémarrer, rebooter - to reboot
  • arrêter, éteindre - to shut down
  • amorçable, bootable - bootable
  • overclocking, surfréquençage, surcadençage - overclocking
  • watercooling, refroidissement à l'eau - water cooling
  • tuning PC - case modding

German

In German, English words are very often used as well:
  • noun: Computer, Website, Software, E-Mail, Blog
  • verb: downloaden, booten, crashen

Icelandic

The Icelandic language has its own vocabulary of scientific terms, but English borrowings still exist. English or Icelandicised words are mostly used in casual conversation, whereas the Icelandic words may be longer or less widespread.

Russian

  • History of computer hardware in Soviet Bloc countries
  • Computer Russification

Spanish

Through the influence of English on the software industry and the Internet in Latin America, the Castilian lexicon has borrowed significantly from English.

Frequently untranslated, with their Spanish equivalents
  • email: correo electrónico
  • messenger: mensajero
  • webcam: cámara web
  • website: página web, sitio web
  • blog: bitácora, 'blog'


Not translated
  • web
  • flog


Undecided
Many computing terms in Spanish share a common root with their English counterpart. In these cases, both terms are understood, but the Spanish is preferred for formal use:
  • mouse vs ratón
  • net vs red

Character encoding

Early computer software and hardware had very little support for alphabets other than the Latin alphabet. As a result, it was difficult or impossible to represent languages based on other scripts. The ASCII character encoding, created in the 1960s, supported only 128 different characters, and the 8-bit encodings that extended it were limited to 256. With the use of additional software it was possible to provide support for some languages, for instance those based on the Cyrillic alphabet. However, languages with large character sets, such as Chinese or Japanese, need far more characters than the 256 that a single byte can represent. Some computers created in the former USSR had native support for the Cyrillic alphabet.
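
The limits described above can be checked directly. The following is a minimal sketch, assuming KOI8-R as a representative single-byte Cyrillic encoding (the article does not name one) and using illustrative sample words. It tests which encodings can represent each word.

```python
# Sketch: which encodings can represent which sample words.
# KOI8-R and the sample words are illustrative choices, not from the article.

SAMPLES = {"English": "disk", "Russian": "диск", "Chinese": "磁盘"}
ENCODINGS = ["ascii", "koi8_r", "utf-8"]

def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

for language, word in SAMPLES.items():
    supported = [enc for enc in ENCODINGS if can_encode(word, enc)]
    print(f"{language}: {supported}")
```

Under these assumptions, the English word fits every encoding, the Russian word needs at least the Cyrillic single-byte encoding, and the Chinese word fits only UTF-8.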

The wide adoption of Unicode, and of UTF-8 on the web, resolved most of these historical limitations. ASCII remains the de facto standard for command interpreters, programming languages and text-based communication protocols.
  • Mojibake (a common mistake): incorrect, unreadable characters shown when software fails to render text according to its associated character encoding
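
Two points from this section can be sketched in a few lines: UTF-8 encodes ASCII text to exactly the same bytes as ASCII does, and decoding UTF-8 bytes with the wrong encoding produces mojibake. The sample strings and the choice of Latin-1 as the "wrong" decoder are assumptions made for illustration.

```python
# Sketch: UTF-8 is byte-for-byte identical to ASCII for ASCII text,
# and decoding UTF-8 bytes with the wrong encoding produces mojibake.
# The sample strings are illustrative.

ascii_text = "GET /index.html"
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

word = "caf\u00e9"                       # "café", with a precomposed é
utf8_bytes = word.encode("utf-8")        # the é becomes two bytes
garbled = utf8_bytes.decode("latin-1")   # wrong decoder: mojibake
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(word), len(utf8_bytes))  # 4 5
print(garbled)                     # cafÃ©
print(repaired)                    # café
```

The repair step works here only because Latin-1 maps every byte to a character; with other mismatched encodings the original bytes may not be recoverable.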

Programming language

The syntax of most programming languages uses English keywords, and it could therefore be argued that some knowledge of English is required in order to use them. However, it is important to recognize that all programming languages belong to the class of formal languages. They are very different from any natural language, including English.

Some examples of non-English programming languages:
  • Although it uses English keywords, Ruby allows the use of Japanese characters in variable names and other elements of the code.
  • Arabic: ARLOGO
  • Bangla: BangaBhasha
  • Chinese: Chinese BASIC
  • Dutch: Superlogo
  • French: LSE, WinDev, Pascal (although the English version is more widespread)
  • Hebrew: Hebrew Programming Language
  • Icelandic: Fjölnir
  • Indian languages: Hindawi Programming System
  • Russian: Glagol
  • Spanish: Lexico
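
The Ruby case in the list above, English keywords combined with non-English identifiers, can be sketched in Python 3, which likewise accepts non-ASCII letters in names. This is an illustration in a different language from the one the list mentions, and the Japanese names are invented for the example.

```python
# Sketch: Python 3, like Ruby, accepts non-ASCII letters in identifiers,
# while its keywords (def, for, return ...) remain English.
# The Japanese names below are illustrative, not from any real codebase.
import keyword

def 合計(数値リスト):
    """Return the sum of a list of numbers (合計 means 'total')."""
    結果 = 0
    for 数 in 数値リスト:
        結果 += 数
    return 結果

print(合計([1, 2, 3]))              # 6
print("合計".isidentifier())         # True
print(keyword.iskeyword("return"))  # True
print(keyword.iskeyword("戻る"))     # False
```

The function body reads as Japanese names joined by English control words, which is the mixture the Ruby entry describes.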

Communication protocols

Many application protocols, especially those depending on widespread standardisation to be effective, use text strings for requests and parameters, rather than the binary values commonly used in lower layer protocols. The request strings are generally based on English words, although in some cases the strings are contractions or acronyms of English expressions, which renders them somewhat cryptic to anyone not familiar with the protocol, whatever their proficiency in English. Nevertheless, the use of word-like strings is a convenient mnemonic device that allows a person skilled in the art (and with sufficient knowledge of English) to execute the protocol manually from a keyboard, usually for the purpose of finding a problem with the service.

Examples:
  • FTP: USER, PASS (password), PASV (passive), PORT, RETR (retrieve), STOR (store), QUIT
  • SMTP: HELO (hello), MAIL, RCPT (recipient), DATA, QUIT
  • HTTP: GET, PUT, POST, HEAD (headers), DELETE, TRACE, OPTIONS
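
To make the keyword-based request format concrete, here is a minimal sketch that builds request lines as plain ASCII text, much as a person might type them at a keyboard. It opens no network connection. The helper names, host name and mailbox addresses are hypothetical placeholders.

```python
# Sketch: application-protocol requests are plain ASCII text built from
# English-derived keywords. No network connection is made here.
# The host name and mailbox addresses are hypothetical placeholders.

CRLF = "\r\n"

def http_request(method: str, path: str, host: str) -> bytes:
    """Build a minimal HTTP/1.1 request with a single Host header."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}", "", ""]
    return CRLF.join(lines).encode("ascii")

def smtp_commands(sender: str, recipient: str, client: str) -> list[bytes]:
    """Build the opening commands of an SMTP session, one per line."""
    commands = [
        f"HELO {client}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
    return [(command + CRLF).encode("ascii") for command in commands]

print(http_request("GET", "/", "www.example.org"))
for line in smtp_commands("a@example.org", "b@example.org", "client.example.org"):
    print(line)
```

Encoding with `"ascii"` is deliberate: a non-ASCII character in any argument raises an error instead of being sent silently.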


Notably, response codes, that is, the strings sent back by the recipient of a request, are typically numeric. For instance, in HTTP (some of whose codes have been borrowed by other protocols):
  • 200 OK: the request succeeded
  • 301 Moved Permanently: the request is redirected to a new address
  • 404 Not Found: the requested page does not exist


This is because response codes also need to convey unambiguous information, but can have various nuances that the requester may optionally use to vary its subsequent actions. To convey all such "sub-codes" with alphabetic words would be unwieldy, and negate the advantage of using pseudo-English words. Since responses are usually generated by software they do not need to be mnemonic. Numeric codes are also more easily analysed and categorised when they are processed by software, instead of a human testing the protocol by manual input.
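
The point about software-friendly numeric codes can be sketched as follows. The class names keyed on the leading digit follow the usual HTTP convention, and the two helper functions are hypothetical names introduced for this example.

```python
# Sketch: numeric response codes are easy for software to categorise.
# The class names follow the usual HTTP convention keyed on the first digit.
from http import HTTPStatus

CLASSES = {1: "informational", 2: "success", 3: "redirection",
           4: "client error", 5: "server error"}

def parse_status_line(line: str) -> tuple[int, str]:
    """Split a status line such as 'HTTP/1.1 404 Not Found'."""
    _version, code, reason = line.split(" ", 2)
    return int(code), reason

def classify(code: int) -> str:
    """Map a status code to its class using only the leading digit."""
    return CLASSES.get(code // 100, "unknown")

for line in ["HTTP/1.1 200 OK", "HTTP/1.1 301 Moved Permanently",
             "HTTP/1.1 404 Not Found"]:
    code, reason = parse_status_line(line)
    # HTTPStatus supplies the standard English phrase for a numeric code.
    assert HTTPStatus(code).phrase == reason
    print(code, classify(code), "-", reason)
```

A client can branch on `classify(code)` alone and treat the English reason phrase as optional decoration for human readers.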

BIOS

Many personal computers have a BIOS chip that displays text in English during boot.

Keyboard shortcut

Keyboard shortcuts are usually defined in terms of English keywords, such as CTRL+F for find.

English on the World Wide Web

English is the largest language on the World Wide Web, with 27% of internet users. See the article on Internet linguistic patterns for more details.

English speakers

Web user percentages usually focus on raw comparisons of the first language of those who access the web. Just as important is a consideration of second- and foreign-language users; i.e., the first language of a user does not necessarily reflect which language he or she regularly employs when using the web.

Native speakers

English-language users appear to be a plurality of web users, consistently cited as around one-third of the overall total (near one billion). This reflects the relative affluence of English-speaking countries and their high Internet penetration rates.

This lead may be eroding due mainly to a rapid increase of Chinese users, which broadly parallels China's advance on other economic fronts. In fact, if first-language speakers are compared, Chinese ought, in time, to outstrip English by a wide margin (837+ million for Mandarin Chinese, 370+ million for English).

First-language users among other relatively affluent countries appear generally stable, the two largest being German and Japanese, which each have between 5% and 10% of the overall share.

As a foreign language

If a gradual decline in English first-language users is inevitable, it does not necessarily follow that English will not continue to be the language of choice for those accessing the World Wide Web. There is an enormous pool of English second-language speakers who employ the language in technical, governmental and educational spheres and access the Internet in English.

A classic example of this scenario is India, the world's second most populous country. With economic growth, the use of English has grown explosively, and it is emerging as the lingua franca in India. In 1995 it was thought that perhaps only 4% of the population was truly fluent in English (still an impressive 40 million). A decade later, by 2005, India had the world's largest English-speaking and understanding population and the second largest "Fluent English" speaking population (led only by the U.S.). It is expected to have the world's largest number of English speakers within a decade.

Chinese is rarely employed as a lingua franca outside of China by non-ethnic Chinese. Further, China is not truly monoglot: Mandarin is official, but different spoken variants of Chinese are often mutually unintelligible. There is, however, an existing written standard that serves as a common written language.

In the future, then, English and Chinese may hold roughly equal positions at the top of the ranking of web users by first language, but English will likely continue to dominate as the default choice for those accessing the World Wide Web in a second language.

Other world languages that could conceivably begin to challenge English include Spanish and Arabic, though it remains to be seen if these, too, will be largely isolated to first-language speakers on the Internet, as Chinese is.

World Wide Web content

One widely quoted figure for the amount of web content in English is 80%. Other sources show figures five to fifteen points lower, though still well over 50%.

There are two notable facts about these percentages:

The share of web content in English exceeds the share of first-language English users by as much as 2 to 1.

Given the enormous lead it already enjoys and its increasing use as a lingua franca in other spheres, English web content may continue to dominate even as English first-language Internet users decline. This is a classic positive feedback loop: new Internet users find it helpful to learn English and employ it on-line, thus reinforcing the language's prestige and forcing subsequent new users to learn English as well.

Certain other factors (some predating the medium's appearance) have propelled English into a majority web-content position. Most notable in this regard is the tendency for researchers and professionals to publish in English to ensure maximum exposure. The largest database of medical bibliographical information, for example, shows English was the majority language choice for the past forty years and its share has continually increased over the same period.

The fact that non-Anglophones regularly publish in English only reinforces the language's dominance. English has a rich technical vocabulary (largely because native and non-native speakers alike use it to communicate technical ideas) and many IT and technical professionals use English regardless of country of origin (Linus Torvalds, for instance, comments his code in English, despite being from Finland and having Swedish as his first language).
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 