Control character



 
 
In computing and telecommunication, a control character or non-printing character is a code point (a number) in a character set that does not in itself represent a written symbol. It is in-band signaling in the context of character encoding. All entries in the ASCII table below code 32 (technically the C0 control code set) and code 127 are of this kind, including BEL (which is intended to cause an audible signal in the receiving terminal), SYN (which is a synchronization signal), and ENQ (a signal that is intended to trigger a response at the receiving end, to see if it is still present). The Extended Binary Coded Decimal Interchange Code (EBCDIC) character set contains 65 control codes, including all of the ASCII control codes as well as additional codes which are mostly used to control IBM peripherals. The Unicode standard has added many new non-printing characters, for example the zero-width non-joiner. The remainder of this article covers control codes in general and some codes that are in common use. For detailed tables of the C0 and C1 control codes used in ASCII and ISO/IEC 8859, please see their respective articles.

Other characters are printing or printable characters, except perhaps for the "space" character (see ASCII printable characters).
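
The rule stated above (every ASCII code below 32, plus code 127) can be written as a one-line test. This is only a minimal sketch; the function name is_ascii_control is a hypothetical choice for illustration and does not come from any standard.

```python
def is_ascii_control(code: int) -> bool:
    """True for the C0 codes 0..31 and for DEL (127)."""
    return 0 <= code < 32 or code == 127


# Collect every control code in the 7-bit ASCII range.
control_codes = [c for c in range(128) if is_ascii_control(c)]
```

Under this rule the space character (code 32) counts as printable, which matches the "except perhaps" caveat in the text.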

        0x00    0x10
0x00    NUL     DLE
0x01    SOH     DC1
0x02    STX     DC2
0x03    ETX     DC3
0x04    EOT     DC4
0x05    ENQ     NAK
0x06    ACK     SYN
0x07    BEL     ETB
0x08    BS      CAN
0x09    TAB     EM
0x0A    LF      SUB
0x0B    VT      ESC
0x0C    FF      FS
0x0D    CR      GS
0x0E    SO      RS
0x0F    SI      US

0x7F    DEL


History

Procedural signs in Morse code are a form of control character.

A form of control character was introduced in the 1870 Baudot code: NUL and DEL. The 1901 Murray code added the carriage return (CR) and line feed (LF), and other versions of the Baudot code included other control characters.

The bell character (BEL), which rang a bell to alert operators, was also an early teletype control character.

Control characters have also been called "format effectors".

In ASCII

The control characters in ASCII still in common use include:
  • 0 (null, \0), originally intended to be an ignored character, but now used by many programming languages to mark the end of a string.
  • 7 (bell, \a, ^G), which may cause the device receiving it to emit a warning of some kind (usually audible).
  • 8 (backspace, \b, ^H), used either to erase the last character printed or to overprint it.
  • 9 (horizontal tab, \t, ^I), which moves the printing position some spaces to the right.
  • 10 (line feed, \n), used as the end-of-line marker in most UNIX systems and variants.
  • 12 (form feed, \f), which causes a printer to eject paper to the top of the next page, or a video terminal to clear the screen.
  • 13 (carriage return, \r, ^M), used as the end-of-line marker in Mac OS, OS-9, FLEX (and variants). A carriage return/line feed pair is used by CP/M-80 and its derivatives including DOS and Windows, and by Application Layer protocols such as HTTP.
  • 27 (escape, \e (gcc only)).
  • 127 (delete), originally intended to be an ignored character, but now used to erase a character (especially the one to the right of the cursor).
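
Two of the uses listed above, the end-of-line markers (LF, CR, and the CR/LF pair) and the null terminator, can be illustrated with a short sketch. The helper names split_lines and c_string are hypothetical and chosen only for this example; the code assumes the text holds no other line separators.

```python
def split_lines(text: str) -> list[str]:
    """Split text on CR LF pairs, lone LF, or lone CR."""
    # Collapse the two-character pair first so it is not counted as two breaks.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.split("\n")


def c_string(buffer: bytes) -> bytes:
    """Return the bytes before the first NUL, as a null-terminated string reader would."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]


# One string that mixes all three end-of-line conventions.
mixed = "a\r\nb\nc\rd"
lines = split_lines(mixed)
```

Handling the CR/LF pair before the lone CR is the design point here: reversing the order would turn each pair into two line breaks.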


Occasionally one might encounter modern uses of other codes, such as code 4 (End of transmission), used to end a Unix shell session or PostScript printer transmission. For the full list of control characters, see ASCII.

Even though many control characters are rarely used, the concept of sending device-control information intermixed with printable characters is so useful that device makers found a way to send hundreds of device instructions. Specifically, they used ASCII code 27 (escape), followed by a series of characters called a "control sequence" or "escape sequence". The mechanism was invented by Bob Bemer, the father of ASCII.

Typically, code 27 was sent first in such a sequence to alert the device that the following characters were to be interpreted as a control sequence rather than as plain characters, then one or more characters would follow to specify some detailed action, after which the device would go back to interpreting characters normally. For example, the sequence of code 27, followed by the printable characters "[2;10H", would cause a DEC VT-102 terminal to move its cursor to the 10th cell of the 2nd line of the screen. Several standards exist for these sequences, notably ANSI X3.64. However, the number of non-standard variations in use is large, especially among printers, where technology has advanced faster than any standards body can keep up with.
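
The cursor-positioning example above can be sketched as a small helper that builds the same byte sequence. The function name cursor_position is hypothetical, and whether a given terminal honours the sequence depends on the device; only the sequence itself comes from the text.

```python
ESC = "\x1b"  # ASCII code 27 (escape)


def cursor_position(row: int, col: int) -> str:
    """Build code 27 followed by "[row;colH", as in the VT-102 example."""
    return f"{ESC}[{row};{col}H"


# The sequence from the text: 2nd line, 10th cell.
sequence = cursor_position(2, 10)
```

Writing this string to a terminal that follows the same convention would move the cursor rather than print the characters.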

Display

Since control characters are non-printing, how does one display or refer to them? There are a number of techniques, which one may illustrate with the bell character in ASCII encoding:
  • Code point: decimal 7, hexadecimal 0x07
  • An abbreviation, often three capital letters: BEL
  • A special character: Unicode U+2407, "symbol for bell" (note that this uses the abbreviation, specially formatted)
  • Caret notation in ASCII, where code point 00xxxxx is represented as a caret followed by the capital letter at code point 10xxxxx: ^G
  • An escape sequence, as in printf codes: \a
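
The techniques in the list above can be gathered into one lookup, sketched below. The function name describe and the dictionary keys are hypothetical. The abbreviations follow the usual C0 names, the escape table holds only the escapes mentioned in this article, and the picture characters rely on the Unicode Control Pictures block (U+2400 to U+241F for codes 0 to 31, U+2421 for DEL).

```python
# Abbreviations for the 32 C0 codes, in code order.
C0_NAMES = ("NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI "
            "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US").split()

# printf-style escapes named in this article (not an exhaustive list).
C_ESCAPES = {0: "\\0", 7: "\\a", 8: "\\b", 9: "\\t", 10: "\\n", 12: "\\f", 13: "\\r"}


def describe(code: int) -> dict:
    """Return several printable ways of referring to one ASCII control code."""
    if not (0 <= code < 32 or code == 127):
        raise ValueError("not an ASCII control code")
    return {
        "decimal": code,
        "hex": f"0x{code:02X}",
        "name": "DEL" if code == 127 else C0_NAMES[code],
        # Control Pictures block: one symbol per control code.
        "picture": chr(0x2421) if code == 127 else chr(0x2400 + code),
        # Caret notation flips the bit worth 64: 7 -> "G", 127 -> "?".
        "caret": "^" + chr(code ^ 0x40),
        "escape": C_ESCAPES.get(code),  # None when the article lists no escape
    }


bell = describe(7)
```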


How control characters map to keyboards

ASCII-based keyboards have a key labelled "Control", "Ctrl", or (rarely) "Cntl" which is used much like a shift key, being pressed in combination with another letter or symbol key. In one implementation, the control key generates the code 64 places below the code for the (generally) uppercase letter it is pressed in combination with (i.e., subtract 64 from the decimal ASCII code of the (generally) uppercase letter). The other implementation is to take the ASCII code produced by the key and bitwise AND it with 31, forcing bits 5 and 6 to zero. For example, pressing "control" and the letter "g" or "G" (code 103 or 71 in base 10, which is 01100111 or 01000111 in binary) produces the code 7 (Bell, 7 in base 10, or 00000111 in binary). The NULL character (code 0) is represented by Ctrl-@, "@" being the code immediately before "A" in the ASCII character set. In either case, this produces one of the 32 ASCII control codes between 0 and 31. This approach is not able to represent the DEL character because of its value (code 127), and so Ctrl-? is used to represent this character, although this key combination does not follow the same logic as for the other control characters.
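
Both implementations can be sketched in a few lines. The function names are hypothetical; the mask used is 31 (binary 0011111), which is the value that clears bits 5 and 6 of a 7-bit code and so always yields a result between 0 and 31.

```python
def ctrl_by_subtraction(key: str) -> int:
    """Control code taken as 64 below the upper-case form of the key."""
    return ord(key.upper()) - 64


def ctrl_by_mask(key: str) -> int:
    """Control code obtained by clearing bits 5 and 6 (bitwise AND with 31)."""
    return ord(key) & 0x1F


# "g" is 103 (0b1100111) and "G" is 71 (0b1000111); both map to 7 (BEL).
bell_codes = [ctrl_by_subtraction("g"), ctrl_by_subtraction("G"),
              ctrl_by_mask("g"), ctrl_by_mask("G")]
```

Masking "?" (code 63) gives 31 rather than 127, which illustrates why Ctrl-? for DEL does not follow the same logic as the other combinations.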

When the control key is held down, letter keys produce the same control characters regardless of the state of the shift or caps lock keys. In other words, it does not matter whether the key would have produced an upper-case or a lower-case letter. The interpretation of the control key with the space, graphics character, and digit keys (ASCII codes 32 to 63) varies between systems. Some will produce the same character code as if the control key were not held down. Other systems translate these keys into control characters when the control key is held down. The interpretation of the control key with non-ASCII ("foreign") keys also varies between systems.

Control characters are often rendered into a printable form known as caret notation by printing a caret (^) and then the ASCII character that has a value of the control character plus 64. Control characters generated using letter keys are thus displayed with the upper-case form of the letter. For example, ^G represents code 7, which is generated by pressing the G key when the control key is held down.

Keyboards also typically have a few single keys which produce control character codes. For example, the key labelled "Backspace" typically produces code 8, "Tab" code 9, "Enter" or "Return" code 13 (though some keyboards might produce code 10 for "Enter").

Modern keyboards have many keys that do not correspond to any ASCII printable or control character, for example cursor control arrows and word processing functions. These keyboards communicate these keys to the attached computer by one of four methods: appropriating some otherwise unused control character for the new use; using some encoding other than ASCII; using multi-character control sequences; or using an additional mechanism outside of generating characters to handle these events. "Dumb" computer terminals typically use control sequences. Keyboards attached to stand-alone personal computers made in the 1980s typically use one (or both) of the first two methods. Modern computer keyboards generate scancodes that identify the specific physical keys that are pressed; computer software then determines how to handle the keys that are pressed, including any of the four methods described above.

The design purpose

The control characters were designed to fall into a few groups: printing and display control, data structuring, transmission control, and miscellaneous.

Printing and display control

Printing control characters were first used to control the physical mechanism of printers, the earliest output device. An early implementation of this idea was the out-of-band ASA carriage control characters. Later, control characters were integrated into the stream of data to be printed. The carriage return character (CR), when sent to such a device, causes it to move the printing position to the edge of the paper at which writing begins (it may, or may not, also move the printing position to the next line). The line feed character (LF/NL) causes the device to put the printing position on the next line. Depending on the device and its configuration, it may or may not also move the printing position to the start of that line (the left edge for Western languages, the right edge for Hebrew and Arabic). The vertical and horizontal tab characters (VT and HT/TAB) cause the output device to move the printing position to the next tab stop in the direction of reading. The form feed character (FF/NP) starts a new sheet of paper, and may or may not move to the start of the first line.

The backspace character (BS) moves the printing position one character space backwards. On printers, this is most often used so the printer can overprint characters to make other, not normally available, characters. On terminals and other electronic output devices, there are often software (or hardware) configuration choices which allow either a destructive backspace (i.e., a BS, SP, BS sequence), which erases, or a non-destructive one, which does not. The shift out and shift in characters (SO and SI) selected alternate character sets, fonts, underlining or other printing modes. Escape sequences were often used to do the same thing.
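
The behaviour of CR, LF, BS and HT described above can be modelled with a toy printer. This is a minimal Python sketch and does not describe any real device. The function `render`, its `lf_returns` switch (whether LF also returns to the start of the line) and the tab stops every 8 columns are all assumptions made for illustration.

```python
def render(stream, tab_width=8, lf_returns=False):
    """Simulate a printer: return rows of cells, where each cell is a string
    holding every character struck at that position (overprinting)."""
    page = [[]]
    row, col = 0, 0
    for ch in stream:
        if ch == "\r":                      # CR: back to the starting edge
            col = 0
        elif ch == "\n":                    # LF: next line
            row += 1
            if lf_returns:                  # device-dependent, see text
                col = 0
            while len(page) <= row:
                page.append([])
        elif ch == "\b":                    # BS: one position backwards
            col = max(0, col - 1)
        elif ch == "\t":                    # HT: advance to next tab stop
            col = (col // tab_width + 1) * tab_width
        else:                               # printable: strike and advance
            cells = page[row]
            while len(cells) <= col:
                cells.append("")
            cells[col] += ch
            col += 1
    return page


if __name__ == "__main__":
    print(render("a\b_"))        # an underlined "a" made by overprinting
    print(render("ab\ncd"))      # LF alone keeps the column
    print(render("ab\r\ncd"))    # CR then LF starts at the edge
```

Because each cell accumulates characters instead of replacing them, the sketch models paper. A terminal with a destructive backspace would overwrite the cell instead.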

With the advent of computer terminals that did not physically print on paper and so offered more flexibility regarding screen placement, erasure, and so forth, printing control codes were adapted. Form feeds, for example, usually cleared the screen, there being no new paper page to move to. More complex escape sequences were developed to take advantage of the flexibility of the new terminals, and indeed of newer printers. The concept of a control character had always been somewhat limiting, and was extremely so when used with new, much more flexible, hardware. Control sequences (sometimes implemented as escape sequences) could match the new flexibility and power and became the standard method. However, there were, and remain, a large variety of standard sequences to choose from.

Data structuring

The separators (File, Group, Record, and Unit: FS, GS, RS and US) were made to structure data, usually on a tape, in order to simulate punched cards. End of medium (EM) warns that the tape (or other medium) is ending. While many systems use CR/LF and TAB for structuring data, it is possible to encounter the separator control characters in data that needs to be structured. The separator control characters are not overloaded; there is no general use of them except to separate data into structured groupings. Their numeric values (28 to 31) are contiguous with the space character (32), which can be considered a member of the group, as a word separator.
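
A minimal Python sketch of how the record and unit separators can structure data without giving special meaning to commas, tabs or line breaks. The helper names `encode` and `decode` are hypothetical, and the sketch assumes that field values never contain RS or US themselves.

```python
# ASCII separator codes 28-31; space (32) follows directly after US.
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"


def encode(records):
    """Join fields with US and records with RS."""
    return RS.join(US.join(fields) for fields in records)


def decode(data):
    """Split a string produced by encode() back into records of fields."""
    return [record.split(US) for record in data.split(RS)]


if __name__ == "__main__":
    table = [["Smith, J.", "line one\nline two"], ["Doe", "x\ty"]]
    packed = encode(table)
    print(repr(packed))
    print(decode(packed) == table)
```

Because the separators have no other general use, the fields may contain commas, tabs and newlines without any quoting.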

Transmission control

The transmission control characters were intended to structure a data stream, and to manage re-transmission or graceful failure, as needed, in the face of transmission errors.

The start of heading (SOH) character was intended to mark a non-data section of a data stream: the part of a stream containing addresses and other housekeeping data. The start of text character (STX) marked the end of the header, and the start of the textual part of a stream. The end of text character (ETX) marked the end of the data of a message. A widely used convention is to make the two characters preceding ETX a checksum or CRC for error-detection purposes. The end of transmission block character (ETB) was used to indicate the end of a block of data, where data was divided into such blocks for transmission purposes.
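
The framing described above can be sketched in Python as follows. The two-character checksum used here (the sum of the text bytes modulo 256, written as two hexadecimal digits) is only an assumption chosen for illustration; real protocols define their own checksum or CRC. The function names are hypothetical, and the header and text are assumed to contain no control characters.

```python
SOH, STX, ETX = b"\x01", b"\x02", b"\x03"


def checksum(text):
    """Illustrative checksum: byte sum modulo 256 as two hex digits."""
    return b"%02X" % (sum(text) % 256)


def build_frame(header, text):
    """SOH header STX text checksum ETX."""
    return SOH + header + STX + text + checksum(text) + ETX


def parse_frame(frame):
    """Return (header, text), or raise ValueError if the frame is damaged."""
    if not (frame.startswith(SOH) and frame.endswith(ETX)):
        raise ValueError("missing SOH or ETX")
    header, stx, rest = frame[1:-1].partition(STX)
    if not stx:
        raise ValueError("missing STX")
    text, received = rest[:-2], rest[-2:]   # last two characters before ETX
    if received != checksum(text):
        raise ValueError("checksum mismatch")
    return header, text


if __name__ == "__main__":
    frame = build_frame(b"TO:A", b"hello")
    print(frame)
    print(parse_frame(frame))
```

A receiver that gets a `ValueError` from `parse_frame` would typically ask for the frame to be sent again.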

The escape character (ESC) can be used in software user interfaces to exit from a screen, menu, or mode, or in device-control protocols (e.g., printers and terminals) to signal that what follows is a special command sequence rather than normal data.

The substitute character (SUB) was intended to request a translation of the next character from a printable character to another value, usually by setting bit 5 to zero. This is handy because some media (such as sheets of paper produced by typewriters) can transmit only printable characters. However, on MS-DOS systems with files opened in text mode, "end of text" or "end of file" is marked by this Ctrl-Z character, instead of the Ctrl-C or Ctrl-D, which are common on other operating systems.

The cancel character (CAN) signalled that the previous element should be discarded. The negative acknowledge character (NAK) is a definite flag that usually notes a reception problem and often requests that the current element be sent again. The acknowledge character (ACK) is normally used as a flag to indicate that no problem was detected with the current element.
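
One way ACK and NAK might drive retransmission is sketched below in Python. The function `send_block`, the `channel` callable (which stands in for a real link and returns the receiver's reply code) and the limit of three attempts are all assumptions for this example, not part of any standard.

```python
ACK, NAK = 0x06, 0x15   # ASCII codes 6 and 21


def send_block(block, channel, max_attempts=3):
    """Send a block until it is acknowledged; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        reply = channel(block)
        if reply == ACK:            # no problem detected by the receiver
            return attempt
        if reply != NAK:            # anything else is a protocol error
            raise ValueError(f"unexpected reply {reply!r}")
        # NAK: reception was a problem, so loop and send the block again
    raise RuntimeError("block not acknowledged")


if __name__ == "__main__":
    replies = iter([NAK, ACK])      # fake link: fails once, then succeeds
    print(send_block(b"data", lambda block: next(replies)))
```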

When a transmission medium is half duplex (that is, it can transmit in only one direction at a time), there is usually a master station that can transmit at any time, and one or more slave stations that transmit when they have permission. The enquire character (ENQ) is generally used by a master station to ask a slave station to send its next message. A slave station indicates that it has completed its transmission by sending the end of transmission character (EOT).

The device control codes (DC1 to DC4) were originally generic, to be implemented as necessary by each device. However, a universal need in data transmission is to request the sender to stop transmitting when a receiver cannot take more data at the moment. Digital Equipment Corporation invented a convention which used code 19 (the device control 3 character, DC3, also known as control-S or XOFF) to "S"top transmission, and code 17 (the device control 1 character, DC1, also known as control-Q or XON) to start transmission. It has become so widely used that many people do not realize it is not part of official ASCII. This technique, however implemented, avoids additional wires in the data cable devoted only to transmission management, which saves money. A sensible protocol for the use of such transmission flow control signals must be used, however, to avoid potential deadlock conditions.
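
The XON/XOFF convention can be illustrated with a toy simulation in Python. The `Receiver` class, its `high_water` buffer limit and the `send` function are hypothetical. On a real link the sender would wait for XON to arrive over the wire; here it simply asks the receiver to drain its buffer.

```python
from collections import deque

XON, XOFF = 0x11, 0x13   # codes 17 (DC1, control-Q) and 19 (DC3, control-S)


class Receiver:
    def __init__(self, high_water=4):
        self.buffer = deque()
        self.high_water = high_water
        self.consumed = bytearray()

    def accept(self, byte):
        """Store one byte; return XOFF once the buffer is full, else None."""
        self.buffer.append(byte)
        return XOFF if len(self.buffer) >= self.high_water else None

    def drain(self):
        """Consume everything buffered, then tell the sender to resume."""
        self.consumed.extend(self.buffer)
        self.buffer.clear()
        return XON


def send(data, receiver):
    """Send data byte by byte, pausing on XOFF; return the number of pauses."""
    pauses = 0
    for byte in data:
        if receiver.accept(byte) == XOFF:
            pauses += 1
            if receiver.drain() != XON:     # stand-in for waiting on the link
                raise RuntimeError("expected XON")
    receiver.drain()                        # flush whatever is left
    return pauses


if __name__ == "__main__":
    r = Receiver(high_water=4)
    print(send(b"0123456789", r), bytes(r.consumed))
```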

The data link escape character (DLE) was intended to be a signal to the other end of a data link to cause the following code to be interpreted as raw data, not a control code.
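
A minimal Python sketch of this idea, often called byte stuffing: the sender puts DLE in front of any data byte that would otherwise look like a control code, and the receiver treats the byte after a DLE as raw data. The function names and the exact rule (escaping every byte below 32) are assumptions for illustration; actual protocols differ in which bytes they escape.

```python
DLE = 0x10   # ASCII code 16


def escape(data):
    """Prefix every control code in the data (including DLE) with DLE."""
    out = bytearray()
    for byte in data:
        if byte < 0x20:
            out.append(DLE)
        out.append(byte)
    return bytes(out)


def unescape(stream):
    """Return (data, controls): raw data bytes and unescaped control codes."""
    data, controls = bytearray(), []
    it = iter(stream)
    for byte in it:
        if byte == DLE:
            following = next(it, None)
            if following is None:
                raise ValueError("stream ends after DLE")
            data.append(following)          # taken as raw data
        elif byte < 0x20:
            controls.append(byte)           # a genuine control code
        else:
            data.append(byte)
    return bytes(data), controls


if __name__ == "__main__":
    payload = b"a\x03\x10b"                 # data containing ETX and DLE
    stream = b"\x02" + escape(payload) + b"\x03"
    print(stream, unescape(stream))
```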

Miscellaneous codes

Code 7 (BEL) is intended to cause an audible signal in the receiving terminal.

Many of the ASCII control characters were designed for devices of the time that are not often seen today. For example, code 22, "synchronous idle" (SYN), was originally sent by synchronous modems (which have to send data constantly) when there was no actual data to send. (Modern systems typically use a start bit to announce the beginning of a transmitted word.)

Code 0 (ASCII code name NUL) is a special case. On paper tape, it corresponds to a position with no holes punched. It is convenient to treat this as a fill character with no other meaning.

Code 127 (DEL, a.k.a. "rubout") is likewise a special case. Its code is all-bits-on in binary, so overpunching it on a paper tape essentially erased the character cell, whatever that cell held before. Paper tape was a common storage medium when ASCII was developed, with a computing history dating back to WWII code breaking equipment at Bletchley Park. Paper tape became obsolete in the 1970s, so this clever aspect of ASCII rarely saw any use. Some systems (such as the original Apples) converted it to a backspace. But because its code is in the range occupied by other printable characters, and because it had no official assigned glyph, many computer equipment vendors used it as an additional printable character (often an all-black "box" character useful for erasing text by overprinting with ink).
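
The paper tape reasoning behind NUL and DEL can be checked with a few lines of Python. The helpers `overpunch` and `holes` are hypothetical names. The model assumes a 7-bit tape row in which a punch can only add holes, so overpunching behaves like a bitwise OR.

```python
NUL, DEL = 0x00, 0x7F


def overpunch(existing, new):
    """Punching can add holes but never remove them: a bitwise OR."""
    return existing | new


def holes(code):
    """Draw a 7-bit tape row, 'o' for a hole and '.' for blank paper."""
    return format(code, "07b").replace("1", "o").replace("0", ".")


if __name__ == "__main__":
    for name, code in (("NUL", NUL), ("A", ord("A")), ("DEL", DEL)):
        print(f"{name:3} {holes(code)}")
    # Any 7-bit character overpunched with DEL becomes DEL.
    print(all(overpunch(c, DEL) == DEL for c in range(128)))
```

Unpunched tape reads as NUL, and any character can be turned into DEL but not back, which is why both codes were treated as fill to be ignored.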

Many file systems do not allow control characters in filenames, as they may have reserved functions.

See also

  • C0 and C1 control codes
  • Escape sequence
  • In-band signaling


External links

  • , Information Technology - Control functions for coded character sets
  • C0 Set of ISO 646 (PDF)