Plain text



 
 
In computing, plain text is a term used for an ordinary "unformatted" sequential file readable as textual material without much processing.

The encoding has traditionally been either ASCII, one of its many derivatives such as ISO/IEC 646, or sometimes EBCDIC. Plain text files traditionally use no other encodings, and they contain neither (character-based) structural tags such as heading marks nor typographic markers such as boldface or italics.

Unicode is today gradually replacing the older ASCII derivatives limited to 7- or 8-bit codes. It will probably serve much the same purposes, but this time permitting almost any human language as well as important punctuation and symbols such as mathematical relations (≤ ≠ ≥ ≈) and multiplication (× •), which are not included in the more restricted ASCII set.
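
As a minimal sketch of the difference in repertoire, the following Python snippet checks whether a string can be represented in a given encoding. The helper name `encodable` is hypothetical and only for illustration; `ascii`, `cp037` (one EBCDIC code page) and `utf-8` are codec names from Python's standard library.

```python
def encodable(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


plain = "a <= b"      # only characters from the ASCII set
symbolic = "a ≤ b"    # contains U+2264, which ASCII lacks

ascii_ok = encodable(plain, "ascii")
ascii_fails = encodable(symbolic, "ascii")

# The same text is representable in a Unicode encoding such as UTF-8.
utf8_bytes = symbolic.encode("utf-8")

# ASCII and EBCDIC assign different byte values to the same letter.
ascii_a = "A".encode("ascii")
ebcdic_a = "A".encode("cp037")
```

The relation sign takes three bytes in UTF-8, while every ASCII character still takes one.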

Usage


The purpose of using plain text today is primarily a "lowest common denominator" independence from programs that require their very own special encoding or formatting (with due sacrifices and limitations). Plain text files can be opened, read, and edited with most text editors. Examples include Notepad (Windows), edit (DOS), ed, vi or vim (Unix, Linux), SimpleText (Mac OS), or TextEdit (Mac OS X). Other computer programs are also capable of reading and importing plain text. It can also be handled by simple tools such as the text-printing commands type (DOS and Windows) and cat (Unix).

Plain text files are almost universal in programming; a source code file containing instructions in a programming language is almost always a plain text file. Plain text is also commonly used for configuration files, which are read for saved settings at the startup of a program.
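
To illustrate how little machinery a plain text configuration file needs, here is a minimal sketch of a reader for a hypothetical `key = value` format with `#` comments. The format and the function name `parse_settings` are assumptions for this example and do not describe any particular program.

```python
def parse_settings(text: str) -> dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # nothing to read on this line
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical file contents, as a program might read them at startup.
example = "# saved settings\nname = demo\n\nwidth=80\n"
settings = parse_settings(example)
```

All values come back as strings; converting `width` to a number is left to the program that uses it.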

Related terms

The related term plaintext is most commonly used in a cryptographic context, while cleartext usually refers to a lack of protection from eavesdropping. The three terms are often confused, especially by those new to computers, cryptography, or data communications.

Philosophy

Plain text is in fact the technical user's way of regarding a file or a sequence of bytes. In an absolute sense there is no plain text, since bits are stored as states of latches, charges on transistor gates, or microscopic magnetic or mechanical dots on a disk, and humans do not have the senses needed to read these. The information must therefore appear as text (on screen or on paper) in order to be text in this absolute sense of the word.

Plain text is a way to represent generic text without attributes such as fonts, subscripts, and boldface; due to this simplicity, it is readable and processable by almost any computer program. In a way, an HTML, SGML, or XML file is regarded as plain text, since no control codes (see below) are used, even though real structural tags are included in these formats. For the SGML or XML author, these tags are "human readable", since an author who knows the format understands the structure by reading it. This may illuminate how complicated the usage of terms within computer science can be: it is all a relative viewpoint.

In plain text, people write formatting with conventions such as *bold*, /italic/ and _underline_. Some use "^W" to cross out the preceding word, one "^W" per word (Like this crossed example^W^W^W^W).
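
The "^W" convention can be sketched as a small stack operation: each "^W" removes the word before it. The function name `apply_caret_w` is hypothetical, and the sketch assumes that words are separated by whitespace and that runs of whitespace may be collapsed to single spaces.

```python
import re

# A token is either the marker "^W" or a run of non-space characters
# that stops before the next "^W", whitespace, or the end of the text.
TOKEN = re.compile(r"\^W|\S+?(?=\^W|\s|$)")


def apply_caret_w(text: str) -> str:
    """Return text with every '^W' applied as 'delete the previous word'."""
    kept = []
    for token in TOKEN.findall(text):
        if token == "^W":
            if kept:          # a "^W" with nothing before it is ignored
                kept.pop()
        else:
            kept.append(token)
    return " ".join(kept)


corrected = apply_caret_w("keep this typo^W word")
erased = apply_caret_w("Like this crossed example^W^W^W^W")
```

In practice the marker is left in the text for readers to interpret; the sketch only shows the meaning a reader is expected to infer.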

Encoding


Character encodings

Text was once commonly encoded in ASCII, typically stored in 8 bits per letter or other character: 7 bits encoded the character, allowing 128 values, and the 8th could be used as a parity bit when transferring a file. This allowed only the ordinary Latin alphabet, control codes, parentheses, and punctuation, which especially annoyed Portuguese and Swedish computer users. When data transfer became more reliable, the remaining 128 values were therefore put to use, but differently everywhere, and in a way that made multilingual texts impossible to encode. At last Unicode was defined, which currently allows for 1,114,112 code values, covering virtually every modern writing system and many extinct ones. For example, Unicode encodes Chinese, Hebrew, and Cyrillic as well as Latin. Some of these scripts may be quite complicated to process correctly, but the text still contains no structural data, such as bold start and end markers, and is therefore plain text.
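
The arithmetic behind these figures can be checked with a short sketch. It assumes even parity for the 8th bit (the text above does not say which parity scheme was used), and the function name `with_even_parity` is hypothetical.

```python
def with_even_parity(code: int) -> int:
    """Set bit 8 of a 7-bit ASCII code so the total count of 1 bits is even."""
    if not 0 <= code < 128:
        raise ValueError("ASCII codes use only 7 bits (0 to 127)")
    parity = bin(code).count("1") % 2   # 1 if the count of 1 bits is odd
    return code | (parity << 7)


ASCII_VALUES = 2 ** 7                   # 7 bits give 128 values
EXTRA_VALUES = 2 ** 8 - ASCII_VALUES    # the "remaining" 128 values of a byte

# Unicode code points run from U+0000 to U+10FFFF inclusive.
UNICODE_CODE_POINTS = 0x10FFFF + 1

a_with_parity = with_even_parity(ord("A"))  # "A" = 0b1000001, two 1 bits
c_with_parity = with_even_parity(ord("C"))  # "C" = 0b1000011, three 1 bits
```

"A" already has an even number of 1 bits and is sent unchanged, while "C" gets its 8th bit set.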

Control codes

The ASCII codes before SPACE (= 32 = 20H) are not intended as displayable characters, but instead as control characters. They are used with a variety of interpreted meanings; for example, the code NUL (= 0, sometimes denoted Ctrl-@) is used as the string end marker in the programming language C and its successors. The most troublesome of these are the codes LF (= LINE FEED = 10 = 0AH) and CR (= CARRIAGE RETURN = 13 = 0DH). Windows and OS/2 require the sequence CR,LF to represent a newline, while Unix and relatives use just LF, and Classic Mac OS (but not Mac OS X) uses just CR. This was once a slight problem when transferring files between Windows and Unix systems, but today most computer programs handle it seamlessly.
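
One way a program can treat the three conventions uniformly is to convert every line ending to LF before processing. The following is a minimal sketch working on raw bytes; the function name `normalize_newlines` is hypothetical.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CR,LF (Windows, OS/2) and bare CR (Classic Mac OS) to LF."""
    # Replace the CR,LF pair first so it does not become two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


mixed = b"windows\r\nclassic mac\runix\n"
unified = normalize_newlines(mixed)
```

The order of the two replacements matters: handling bare CR first would turn each CR,LF pair into two newlines.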

See also

  • E-text
  • MIME Content-type
  • Formatted text
  • Filename extension
  • File format
  • Binary file
  • Text file
  • Editor wars
  • File system
  • Configuration file
  • Source code