Control character
In computing and telecommunication, a control character or non-printing character is a code point (a number) in a character set that does not in itself represent a written symbol. It is in-band signaling in the context of character encoding.
All entries in the ASCII table below code 32 (technically the C0 control code set), as well as code 127, are of this kind, including BEL (which is intended to cause an audible signal in the receiving terminal), SYN (which is a synchronization signal), and ENQ (a signal that is intended to trigger a response at the receiving end, to see if it is still present). The Extended Binary Coded Decimal Interchange Code (EBCDIC) character set contains 65 control codes, including all of the ASCII control codes as well as additional codes which are mostly used to control IBM peripherals. Unicode makes a distinction between control characters (the C0 and C1 control codes) and formatting characters (such as the zero-width non-joiner).

Other characters are printing, printable, or graphic characters, except perhaps for the "space" character (see ASCII printable characters).
        +0x00   +0x10
0x00    NUL     DLE
0x01    SOH     DC1
0x02    STX     DC2
0x03    ETX     DC3
0x04    EOT     DC4
0x05    ENQ     NAK
0x06    ACK     SYN
0x07    BEL     ETB
0x08    BS      CAN
0x09    TAB     EM
0x0A    LF      SUB
0x0B    VT      ESC
0x0C    FF      FS
0x0D    CR      GS
0x0E    SO      RS
0x0F    SI      US
0x7F    DEL

History

Procedural signs in Morse code are a form of control character.

A form of control character was introduced in the 1870 Baudot code: NUL and DEL.
The 1901 Murray code added the carriage return (CR) and line feed (LF), and other versions of the Baudot code included other control characters.

The bell character (BEL), which rang a bell to alert operators, was also an early teletype control character.

Control characters have also been called "format effectors".

In ASCII

The control characters in ASCII still in common use include:
  • 0 (null, NUL, \0, ^@), originally intended to be an ignored character, but now used by many programming languages to mark the end of a string.
  • 7 (bell, BEL, \a, ^G), which may cause the device receiving it to emit a warning of some kind (usually audible).
  • 8 (backspace, BS, \b, ^H), used either to erase the last character printed or to overprint it.
  • 9 (horizontal tab, HT, \t, ^I), which moves the printing position some spaces to the right.
  • 10 (line feed, LF, \n, ^J), used as the end of line marker in most UNIX systems and variants.
  • 12 (form feed, FF, \f, ^L), which causes a printer to eject paper to the top of the next page, or a video terminal to clear the screen.
  • 13 (carriage return, CR, \r, ^M), used as the end of line marker in Mac OS, OS-9, FLEX (and variants). A carriage return/line feed pair is used by CP/M-80 and its derivatives including DOS and Windows, and by Application Layer protocols such as HTTP.
  • 27 (escape, ESC, \e [GCC only], ^[).
  • 127 (delete, DEL, ^?), originally intended to be an ignored character, but now used in some systems to erase a character.
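
The code values in the list above can be checked directly in a language that shares the C-style escapes. The following is a minimal Python sketch, not a reference table: the names COMMON_CONTROLS and code_of are chosen here for illustration, and ESC and DEL are written in hexadecimal because Python has no \e escape.

```python
# Abbreviation -> the character, written with a C-style escape where one exists.
# COMMON_CONTROLS is an illustrative name, not part of any standard.
COMMON_CONTROLS = {
    "NUL": "\0",
    "BEL": "\a",
    "BS": "\b",
    "HT": "\t",
    "LF": "\n",
    "FF": "\f",
    "CR": "\r",
    "ESC": "\x1b",  # no \e escape in Python, so use the hexadecimal form
    "DEL": "\x7f",  # DEL has no letter escape either
}


def code_of(abbreviation: str) -> int:
    """Return the decimal ASCII code for one of the abbreviations above."""
    return ord(COMMON_CONTROLS[abbreviation])


for name in COMMON_CONTROLS:
    print(f"{name:3} {code_of(name):3d} 0x{code_of(name):02X}")
```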


Occasionally one might encounter modern uses of other codes, such as code 4 (End of transmission), used to end a Unix shell session or PostScript printer transmission. For the full list of control characters, see ASCII.

Even though many control characters are rarely used, the concept of sending device-control information intermixed with printable characters is so useful that device makers found a way to send hundreds of device instructions. Specifically, they used ASCII code 27 (escape), followed by a series of characters called a "control sequence" or "escape sequence". The mechanism was invented by Bob Bemer, the father of ASCII.

Typically, code 27 was sent first in such a sequence to alert the device that the following characters were to be interpreted as a control sequence rather than as plain characters, then one or more characters would follow to specify some detailed action, after which the device would go back to interpreting characters normally. For example, the sequence of code 27, followed by the printable characters "[2;10H", would cause a DEC VT102 terminal to move its cursor to the 10th cell of the 2nd line of the screen. Several standards exist for these sequences, notably ANSI X3.64. However, a large number of non-standard variations are in use, especially among printers, where technology has advanced faster than standards bodies can keep up.
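
The cursor-positioning example above can be sketched in a few lines of Python. The function name cursor_to is hypothetical; the sequence it builds is the one quoted in the text (code 27 followed by "[row;columnH"), and whether a given terminal honors it depends on the terminal.

```python
ESC = "\x1b"  # ASCII code 27


def cursor_to(row: int, column: int) -> str:
    """Build the sequence ESC [ row ; column H described in the text."""
    return f"{ESC}[{row};{column}H"


sequence = cursor_to(2, 10)
# Show the sequence as code values so the leading 27 is visible.
print([ord(ch) for ch in sequence])
```

Writing the returned string to a terminal that follows this convention would move the cursor instead of printing the characters "[2;10H".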

In Unicode

In Unicode, "control characters" are those defined in the C0 and C1 control code sets. Their General Category is "Cc". Formatting codes are distinct, in General Category "Cf". The Cc control characters have no Name in Unicode. They may be indicated informally as "<control>".
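
The General Category distinction can be inspected with Python's standard unicodedata module. This is a small sketch under the assumption that the interpreter's bundled Unicode database is acceptable for the purpose; the variable names are illustrative.

```python
import unicodedata

# Collect every code point whose General Category is "Cc".
cc_code_points = [
    cp for cp in range(0x110000) if unicodedata.category(chr(cp)) == "Cc"
]

print(len(cc_code_points))               # number of Cc control characters
print(unicodedata.category("\x07"))      # BEL, a C0 control code
print(unicodedata.category("\u200c"))    # zero-width non-joiner, a format character
# Cc characters have no Name, so name() falls back to the supplied default.
print(unicodedata.name("\x07", None))
```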

Display

There are a number of techniques to display non-printing characters, which may be illustrated with the bell character in ASCII encoding:
  • Code point: decimal 7, hexadecimal 0x07
  • An abbreviation, often three capital letters: BEL
  • A special character: Unicode U+2407 (␇), "symbol for bell" (note that this uses the abbreviation, specially formatted)
  • Caret notation in ASCII, where code point 00xxxxx is represented as a caret followed by the capital letter at code point 10xxxxx: ^G
  • An escape sequence, as in printf codes: \a
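
Two of the display techniques listed above are easy to compute from the code value. The sketch below is illustrative only: control_picture and caret_notation are hypothetical helper names, and the caret form is computed by flipping the bit worth 64, which turns 00xxxxx into 10xxxxx and also maps DEL (127) to "?".

```python
def control_picture(code: int) -> str:
    """Return the Unicode 'control picture' symbol for an ASCII control code."""
    if 0 <= code <= 31:
        return chr(0x2400 + code)  # U+2400..U+241F mirror codes 0..31
    if code == 127:
        return "\u2421"            # SYMBOL FOR DELETE
    raise ValueError("not an ASCII control code")


def caret_notation(code: int) -> str:
    """Return the caret form, e.g. ^G for code 7."""
    if not (0 <= code <= 31 or code == 127):
        raise ValueError("not an ASCII control code")
    return "^" + chr(code ^ 0x40)  # flip the bit worth 64


BEL = 7
print(BEL, hex(BEL), control_picture(BEL), caret_notation(BEL))
```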

How control characters map to keyboards

ASCII-based keyboards have a key labelled "Control", "Ctrl", or (rarely) "Cntl" which is used much like a shift key, being pressed in combination with another letter or symbol key. In one implementation, the control key generates the code 64 places below the code for the (generally) uppercase letter it is pressed in combination with (i.e., subtract 64 from the decimal ASCII code value of the (generally) uppercase letter). The other implementation is to take the ASCII code produced by the key and bitwise AND it with 31, forcing the two high-order bits of the 7-bit code to zero. For example, pressing "control" and the letter "g" or "G" (code 103 or 71 in base 10, which is 01100111 or 01000111 in binary) produces the code 7 (Bell, 7 in base 10, or 00000111 in binary). The NUL character (code 0) is represented by Ctrl-@, "@" being the code immediately before "A" in the ASCII character set. For convenience, many terminals accept Ctrl-Space as an alias for Ctrl-@. In either case, this produces one of the 32 ASCII control codes between 0 and 31. This approach is not able to represent the DEL character because of its value (code 127), but Ctrl-? is often used for this character, as subtracting 64 from a '?' gives −1, which if masked to 7 bits is 127.
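
The two implementations described above can be compared side by side. This is a minimal sketch, assuming single ASCII key characters as input; the function names ctrl_by_subtraction and ctrl_by_mask are hypothetical.

```python
def ctrl_by_subtraction(key: str) -> int:
    """Subtract 64 from the uppercase key code, then keep 7 bits."""
    return (ord(key.upper()) - 64) & 0x7F


def ctrl_by_mask(key: str) -> int:
    """Clear the two high-order bits of the 7-bit key code (AND with 31)."""
    return ord(key) & 0x1F


for key in ("g", "G", "@", " ", "?"):
    print(repr(key), ctrl_by_subtraction(key), ctrl_by_mask(key))
```

Both approaches agree for letters and for "@". They differ at the edges: masking maps the space key to code 0, which matches the Ctrl-Space alias mentioned above, while only the subtraction approach (masked to 7 bits) turns "?" into 127.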

When the control key is held down, letter keys produce the same control characters regardless of the state of the shift or caps lock keys. In other words, it does not matter whether the key would have produced an upper-case or a lower-case letter. The interpretation of the control key with the space, graphics character, and digit keys (ASCII codes 32 to 63) varies between systems. Some will produce the same character code as if the control key was not held down. Other systems translate these keys into control characters when the control key is held down. The interpretation of the control key with non-ASCII ("foreign") keys also varies between systems.

Control characters are often rendered into a printable form known as caret notation by printing a caret (^) and then the ASCII character that has a value of the control character plus 64. Control characters generated using letter keys are thus displayed with the upper-case form of the letter. For example, ^G represents code 7, which is generated by pressing the G key when the control key is held down.

Keyboards also typically have a few single keys which produce control character codes. For example, the key labelled "Backspace" typically produces code 8, "Tab" code 9, "Enter" or "Return" code 13 (though some keyboards might produce code 10 for "Enter").

Many keyboards include keys that do not correspond to any ASCII printable or control character, for example cursor control arrows and word processing functions. The associated keypresses are communicated to computer programs by one of four methods: appropriating otherwise unused control characters; using some encoding other than ASCII; using multi-character control sequences; or using an additional mechanism outside of generating characters. "Dumb" computer terminals typically use control sequences. Keyboards attached to stand-alone personal computers made in the 1980s typically use one (or both) of the first two methods. Modern computer keyboards generate scancodes that identify the specific physical keys that are pressed; computer software then determines how to handle the keys that are pressed, including any of the four methods described above.

The design purpose

The control characters were designed to fall into a few groups: printing and display control, data structuring, transmission control, and miscellaneous.

Printing and display control

Printing control characters were first used to control the physical mechanism of printers, the earliest output device. An early implementation of this idea was the set of out-of-band ASA carriage control characters. Later, control characters were integrated into the stream of data to be printed.
  • The carriage return character (CR), when sent to such a device, causes it to put the printing position at the edge of the paper at which writing begins (it may, or may not, also move the printing position to the next line).
  • The line feed character (LF/NL) causes the device to put the printing position on the next line. It may (or may not), depending on the device and its configuration, also move the printing position to the start of the next line (whichever side comes first: left in Western languages and right in Hebrew and Arabic).
  • The vertical and horizontal tab characters (VT and HT/TAB) cause the output device to move the printing position to the next tab stop in the direction of reading.
  • The form feed character (FF/NP) starts a new sheet of paper, and may or may not move to the start of the first line.
  • The backspace character (BS) moves the printing position one character space backwards. On printers, this is most often used so the printer can overprint characters to make other, not normally available, characters. On terminals and other electronic output devices, there are often software (or hardware) configuration choices which will allow a destructive backspace (i.e., a BS, SP, BS sequence) which erases, or a non-destructive one which does not.
  • The shift out and shift in characters (SO and SI) selected alternate character sets, fonts, underlining or other printing modes. Escape sequences were often used to do the same thing.
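
The effect of CR, BS and HT on the printing position can be illustrated with a toy single-line model. This is only a sketch under stated assumptions: render_line is a hypothetical name, tab stops are assumed to sit every 8 columns, CR is assumed not to advance the line, and every other character (including LF) is simply treated as printable.

```python
def render_line(data: str, tab_width: int = 8) -> str:
    """Simulate one output line, tracking the printing position."""
    cells: list[str] = []
    position = 0
    for ch in data:
        if ch == "\r":                      # CR: back to the starting edge
            position = 0
        elif ch == "\b":                    # BS: one space backwards
            position = max(0, position - 1)
        elif ch == "\t":                    # HT: advance to the next tab stop
            position = (position // tab_width + 1) * tab_width
        else:                               # anything else overwrites the cell
            while len(cells) <= position:
                cells.append(" ")
            cells[position] = ch
            position += 1
    return "".join(cells)


print(repr(render_line("abc\bX")))     # non-destructive BS, then overwrite
print(repr(render_line("abc\b \b")))   # the BS, SP, BS erase sequence
print(repr(render_line("hello\rJ")))   # CR followed by an overwrite
```

On a display, overwriting a cell replaces its contents, which is why the BS, SP, BS sequence erases; on paper the same sequence would overprint instead.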

With the advent of computer terminals that did not physically print on paper and so offered more flexibility regarding screen placement, erasure, and so forth, printing control codes were adapted. Form feeds, for example, usually cleared the screen, there being no new paper page to move to. More complex escape sequences were developed to take advantage of the flexibility of the new terminals, and indeed of newer printers. The concept of a control character had always been somewhat limiting, and was extremely so when used with new, much more flexible, hardware. Control sequences (sometimes implemented as escape sequences) could match the new flexibility and power and became the standard method. However, there were, and remain, a large variety of standard sequences to choose from.

Data structuring

The separators (File, Group, Record, and Unit: FS, GS, RS and US) were made to structure data, usually on a tape, in order to simulate punched cards.
End of medium (EM) warns that the tape (or other recording medium) is ending.
While many systems use CR/LF and TAB for structuring data, it is possible to encounter the separator control characters in data that needs to be structured. The separator control characters are not overloaded; there is no general use of them except to separate data into structured groupings. Their numeric values are contiguous with the space character, which can be considered a member of the group, as a word separator.
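
As a rough illustration of the separators in use, the sketch below joins fields with US and records with RS. The helper names encode_records and decode_records are hypothetical, and the layout is just one plausible convention.

```python
# ASCII separator control characters, codes 28 to 31.
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"

def encode_records(records):
    """Join the fields of each record with US and the records with RS."""
    return RS.join(US.join(fields) for fields in records)

def decode_records(data):
    """Split a string produced by encode_records back into lists of fields."""
    return [record.split(US) for record in data.split(RS)]

records = [["Ada", "Lovelace"], ["Alan", "Turing"]]
encoded = encode_records(records)
decoded = decode_records(encoded)
```

Because the separators have no other general use, the fields themselves may contain tabs, commas or line breaks without any quoting.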

Transmission control

The transmission control characters were intended to structure a data stream, and to manage re-transmission or graceful failure, as needed, in the face of transmission errors.

The start of heading (SOH) character was to mark a non-data section of a data stream: the part of a stream containing addresses and other housekeeping data. The start of text character (STX) marked the end of the header, and the start of the textual part of a stream. The end of text character (ETX) marked the end of the data of a message. A widely used convention is to make the two characters preceding ETX a checksum or CRC for error-detection purposes. The end of transmission block character (ETB) was used to indicate the end of a block of data, where data was divided into such blocks for transmission purposes.
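
The framing described above might look like the following sketch. The checksum used here (the sum of the header and text bytes modulo 256, written as two hexadecimal digits) is an assumption chosen for illustration, as are the function names; real protocols define their own check values. The sketch also assumes the header never contains STX.

```python
SOH, STX, ETX = b"\x01", b"\x02", b"\x03"

def checksum(data):
    """Hypothetical check value: byte sum modulo 256 as two hex digits."""
    return b"%02X" % (sum(data) % 256)

def build_frame(header, text):
    """SOH header STX text, then two check characters just before ETX."""
    return SOH + header + STX + text + checksum(header + text) + ETX

def parse_frame(frame):
    """Return (header, text) or raise ValueError if the frame is damaged."""
    if frame[:1] != SOH or frame[-1:] != ETX:
        raise ValueError("missing SOH or ETX")
    header, sep, rest = frame[1:-1].partition(STX)
    if not sep or len(rest) < 2:
        raise ValueError("missing STX or checksum")
    text, received = rest[:-2], rest[-2:]
    if received != checksum(header + text):
        raise ValueError("checksum mismatch")
    return header, text

frame = build_frame(b"A1", b"hello")
```

A receiver that gets ValueError from parse_frame would typically ask for the message to be sent again.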

The escape character (ESC) was intended to "quote" the next character: if that character was another control character, it would be printed instead of performing its control function. ESC is almost never used for this purpose today.

The substitute character (SUB) was intended to request a translation of the next character from a printable character to another value, usually by setting bit 5 to zero. This is handy because some media (such as sheets of paper produced by typewriters) can transmit only printable characters. However, on MS-DOS systems with files opened in text mode, "end of text" or "end of file" is marked by this Ctrl-Z character, instead of the Ctrl-C or Ctrl-D, which are common on other operating systems.
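
The text-mode end-of-file convention can be modelled in a few lines. This is a simplified sketch, not a description of any specific runtime library; the function name read_text_until_sub is hypothetical.

```python
SUB = b"\x1a"  # code 26, Ctrl-Z

def read_text_until_sub(data):
    """Return the bytes before the first SUB, treating SUB as end of file."""
    head, _, _ = data.partition(SUB)
    return head

visible = read_text_until_sub(b"hello\x1aignored")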

The cancel character (CAN) signalled that the previous element should be discarded. The negative acknowledge character (NAK) is a definite flag that usually indicates a problem with reception and, often, that the current element should be sent again. The acknowledge character (ACK) is normally used as a flag to indicate that no problem was detected with the current element.
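
A sender that reacts to ACK and NAK could be sketched as a simple retry loop. The transmit callback, the retry limit and the function name are assumptions made for illustration; the callback is expected to send one block and return the receiver's one-byte reply.

```python
ACK, NAK = 0x06, 0x15

def send_with_retry(block, transmit, max_attempts=3):
    """Resend block until the reply is ACK; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        reply = transmit(block)
        if reply == ACK:
            return attempt
        # Any other reply, including NAK, is treated as a request to resend.
    raise RuntimeError("no ACK received after %d attempts" % max_attempts)
```

With a receiver that answers NAK once and then ACK, the function returns 2.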

When a transmission medium is half duplex (that is, it can transmit in only one direction at a time), there is usually a master station that can transmit at any time, and one or more slave stations that transmit when they have permission. The enquire character (ENQ) is generally used by a master station to ask a slave station to send its next message. A slave station indicates that it has completed its transmission by sending the end of transmission character (EOT).

The device control codes (DC1 to DC4) were originally generic, to be implemented as necessary by each device. However, a universal need in data transmission is to ask the sender to stop transmitting when a receiver cannot take more data right away. Digital Equipment Corporation invented a convention which used code 19 (the device control 3 character, DC3, also known as control-S or XOFF) to "S"top transmission, and code 17 (the device control 1 character, DC1, also known as control-Q or XON) to start transmission. It has become so widely used that most people do not realize it is not part of official ASCII. This technique, however implemented, avoids additional wires in the data cable devoted only to transmission management, which saves money. However, a sensible protocol for the use of such flow control signals must be used to avoid potential deadlock conditions.
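
The receiving side of this convention can be sketched as a bounded buffer that asks the sender to pause and resume. The Receiver class and its high_water and low_water thresholds are hypothetical; actual devices choose their own limits.

```python
XON, XOFF = 0x11, 0x13  # DC1 (17, control-Q) and DC3 (19, control-S)

class Receiver:
    """Toy receiver that signals XOFF when nearly full and XON when drained."""

    def __init__(self, high_water=4, low_water=1):
        self.buffer = []
        self.high_water = high_water
        self.low_water = low_water
        self.paused = False

    def receive(self, byte):
        """Store a byte; return XOFF if the sender should pause, else None."""
        self.buffer.append(byte)
        if not self.paused and len(self.buffer) >= self.high_water:
            self.paused = True
            return XOFF
        return None

    def consume(self):
        """Process one buffered byte; return XON if the sender may resume."""
        self.buffer.pop(0)
        if self.paused and len(self.buffer) <= self.low_water:
            self.paused = False
            return XON
        return None
```

Keeping separate high and low thresholds avoids sending XOFF and XON in rapid alternation when the buffer hovers near a single limit.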

The data link escape character (DLE) was intended to be a signal to the other end of a data link that the following character is a control character such as STX or ETX. For example, a packet may be structured in the following way: (DLE) (STX) (payload) (DLE) (ETX).
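
One way to read the DLE convention is as a frame that opens with DLE STX and closes with DLE ETX. The sketch below follows that reading. Doubling any DLE byte that occurs inside the payload is an additional assumption (a common byte-stuffing practice that the text above does not state), and the function names are hypothetical.

```python
DLE, STX, ETX = 0x10, 0x02, 0x03

def frame(payload):
    """Wrap payload as DLE STX ... DLE ETX, doubling DLE bytes inside it."""
    stuffed = payload.replace(bytes([DLE]), bytes([DLE, DLE]))
    return bytes([DLE, STX]) + stuffed + bytes([DLE, ETX])

def unframe(packet):
    """Recover the payload from a packet built by frame()."""
    if packet[:2] != bytes([DLE, STX]) or packet[-2:] != bytes([DLE, ETX]):
        raise ValueError("not a DLE STX ... DLE ETX packet")
    body = packet[2:-2]
    # Every DLE in the body was doubled by frame(), so undo the doubling.
    return body.replace(bytes([DLE, DLE]), bytes([DLE]))

packet = frame(b"A")
```

With the doubling in place, a payload may itself contain the bytes DLE and ETX without being mistaken for the end of the packet.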

Miscellaneous codes

Code 7 (BEL) is intended to cause an audible signal in the receiving terminal.

Many of the ASCII control characters were designed for devices of the time that are not often seen today. For example, code 22, "synchronous idle" (SYN), was originally sent by synchronous modems (which have to send data constantly) when there was no actual data to send. (Modern systems typically use a start bit to announce the beginning of a transmitted word; this is a feature of asynchronous communication. Synchronous communication links were more often seen with mainframes, where they were typically run over corporate leased lines to connect a mainframe to another mainframe or perhaps a minicomputer.)

Code 0 (ASCII code name NUL) is a special case. On paper tape, it corresponds to a position with no holes punched. It is convenient to treat this as a fill character with no meaning otherwise. Since the position of a NUL character has no holes punched, it can be replaced with any other character at a later time, so it was typically used to reserve space, either for correcting errors or for inserting information that would be available at a later time or in another place.

Code 127 (DEL, a.k.a. "rubout") is likewise a special case. Its 7-bit code is all-bits-on in binary, which essentially erased a character cell on a paper tape when overpunched. Paper tape was a common storage medium when ASCII was developed, with a computing history dating back to WWII code breaking equipment at Biuro Szyfrów. Paper tape became obsolete in the 1970s, so this clever aspect of ASCII rarely saw any use after that. (However, non-erasable programmable ROMs are typically implemented as arrays of fusible elements, each representing a bit, which can only be switched one way, usually from one to zero. In such PROMs, the DEL and NUL characters can be used in the same way that they were used on punched tape: one to reserve meaningless fill bytes that can be written later, and the other to convert written bytes to meaningless fill bytes. For PROMs that switch one to zero, the roles of NUL and DEL are reversed; also, DEL will only work with 7-bit characters, which are rarely used today; for 8-bit content, the character code 255, commonly defined as a nonbreaking space character, can be used instead of DEL.) Some systems (such as the original Apples) converted it to a backspace. But because its code is in the range occupied by other printable characters, and because it had no official assigned glyph, many computer equipment vendors used it as an additional printable character (often an all-black "box" character useful for erasing text by overprinting with ink).
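
The roles of NUL and DEL on write-once media come down to bitwise arithmetic, as the following sketch shows. Punching can only add holes, so overpunching is modelled as a bitwise OR; programming a PROM that switches ones to zeros is modelled as a bitwise AND. The function names are hypothetical.

```python
NUL, DEL = 0x00, 0x7F

def overpunch(existing, new):
    """Punch `new` over a 7-bit tape cell; holes can be added, never removed."""
    return (existing | new) & 0x7F

def burn(existing, new):
    """Program a PROM byte whose bits can only switch from one to zero."""
    return existing & new

reserved_then_filled = overpunch(NUL, ord("A"))   # NUL reserves space
rubbed_out = overpunch(ord("A"), DEL)             # DEL erases any character
prom_filled = burn(0xFF, ord("A"))                # 0xFF is the blank PROM byte
prom_erased = burn(ord("A"), NUL)                 # NUL takes the erasing role
```

On tape every character ORed with DEL becomes DEL, while in such a PROM every byte ANDed with NUL becomes NUL, which is the role reversal described above.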

Many file systems do not allow control characters in filenames, as they may have reserved functions.

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 