Underscore



 
 
The underscore [ _ ] (also called understrike, underbar, low line, or low dash) is a character that originally appeared on the typewriter. Prior to the advent of word processing, the underscore character was the only method of underlining words. To produce an underlined word, the word was typed, the typewriter carriage was moved back to the beginning of the word, and the word was overtyped with the underscore character.

This character is sometimes used to create visual spacing within a sequence of characters where a white space character is not permitted, e.g., in computer filenames, e-mail addresses, and World Wide Web URLs. Some computer applications will automatically underline text surrounded by underscores: _underlined_ will render underlined. It is also conventionally used in this fashion on Usenet to indicate emphasis, and can be used in other ASCII-only media (e-mail, IRC, instant messaging) for this purpose. When the underscore is used for emphasis in this fashion, it is usually interpreted as indicating that the enclosed text is underlined or italicised (as opposed to bold, which is indicated by *asterisks*).

The underscore is not the same character as the dash character, although one convention for text news wires is to use an underscore when an em-dash or en-dash is desired, or when other non-standard characters such as bullets would be appropriate. A series of underscores (like _________) may be used to create a blank to be filled in on a form. It is also sometimes used to create a horizontal line, if no other method is available.

The ASCII value of this character is 95. On the standard US or UK 101/102 computer keyboard it shares a key with the hyphen on the top row, to the right of the 0 key.

Underscores as diacritic


The underscore is used as a diacritic mark, "combining low line", in some African and Native American languages.

It is not to be confused with the combining macron below.

Usage in computing


Origins of underscores in identifiers

In programs of any significant size, there is a need for descriptive (hence multi-word) identifiers, like "previous balance" or "end of file". However, spaces are not typically permitted inside identifiers, as they are treated as delimiters between tokens. Writing the words together as in "endoffile" is not satisfactory because the names often become unreadable. Therefore, the programming language COBOL allowed a hyphen ("-") to be used between words of compound identifiers, as in "END-OF-FILE".

Most programming languages, however, interpret the hyphen as a subtraction operator and do not allow the character in identifier names. The common punched card character sets of the time had no lower-case letters and no special character that would be adequate as a word separator in identifiers. However, by the late 1960s the ASCII character set standard had been established, allowing the designers of the C language to adopt the underscore character "_" as a word joiner. Underscore-separated compounds like "end_of_file" are still prevalent in C programs and libraries.

Programmers working in the tradition of linkage-oriented languages, especially the Unix C tradition (and later C++), had many concerns to address. Early Unix systems (and early personal computers in general) provided linkage models where external identifiers were limited to a short length, often as few as the initial eight characters. Many clashes were possible within the external identifier linkage space, which potentially mingles code generated by various high-level compilers, the runtime libraries required by each of these compilers, compiler-generated helper functions, and program startup code, of which some fraction was inevitably compiled from system assembly language. Within this collision domain the underscore character quickly became entrenched as the primary mechanism for differentiating the external linkage space. It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support. Furthermore, when the C/C++ compiler needed to introduce names into external linkage as part of the translation process, these names were often distinguished with some combination of multiple leading or trailing underscores.

This practice was later codified as part of the C and C++ language standards, in which the use of leading underscores was reserved for the implementation.

A second, independent collision domain was the C preprocessor. The C language preprocessor is unusual in that it does not respect any language-defined scoping model or reserved namespace, not even C language keywords. This problem was generally addressed by writing macros in macro case, which mostly mixes upper-case letters with dividing underscores:

#define OPEN_FILE_LIMIT (15)

Once again the implementation must often supply hidden macros, and once again dressing up these "hidden behind the scenes" identifiers with multiple leading or trailing underscores became accepted practice. As this practice became pervasive on both levels, the underscore gained a cognitive association with system level programming, hidden technicalities, and the messy entrails of language support.

The C language further complicated matters by not supporting a strong module-level linkage model; the concept of a module was initially rather loose. By default, and to simplify the implementation, the language drew no distinction between function names intended for linkage to other compilation units and function names intended only for use within a single compilation unit. The static keyword makes it possible to hide names from external linkage, but it was rarely employed, as it also hid these names from most runtime debugging tools.

A common early convention was to use names (often prosaic) consisting mostly of lower-case letters and underscores for names in external linkage not intended for use by other translation units, such as a local function named count_obscure_piddly_flags, and camel case or some variant for primary application calls such as EditSaveFile.

Ruby and Perl use $_ as a special variable described as the "default input and pattern matching space": many built-in operations use that variable by default when no argument is given, so it may be omitted.

See also

  • underline
  • overline
  • strikethrough