Markup language
Encyclopedia
A markup language is a modern system for annotating
Annotation
An annotation is a note that is made while reading any form of text. This may be as simple as underlining or highlighting passages.Annotated bibliographies give descriptions about how each source is useful to an author in constructing a paper or argument...

 a text in a way that is syntactically distinguishable from that text. The idea and terminology evolved from the "marking up" of manuscripts, i.e. the revision instructions by editors, traditionally written with a blue pencil on authors' manuscripts. Examples are typesetting instructions such as those found in troff
Troff
troff is a document processing system developed by AT&T for the Unix operating system.-History:troff can trace its origins back to a text formatting program called RUNOFF, written by Jerome H. Saltzer for MIT's CTSS operating system in the mid-1960s...

 and LaTeX
LaTeX
LaTeX is a document markup language and document preparation system for the TeX typesetting program. Within the typesetting system, its name is styled as . The term LaTeX refers only to the language in which documents are written, not to the editor used to write those documents. In order to...

, and structural markers such as XML
XML
Extensible Markup Language is a set of rules for encoding documents in machine-readable form. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards....

 tags. Markup is typically omitted from the version of the text which is displayed for end-user consumption. Some markup languages, like HTML have presentation semantics
Presentation semantics
In computer science, particularly in human-computer interaction, presentation semantics specify how a particular piece of a formal language is represented in a distinguished manner accessible to human senses, usually human vision. For example, saying that .....

, meaning their specification prescribes how the structured data is to be presented, but other markup languages, like XML, have no predefined semantics.

A well-known example of a markup language in widespread use today is HyperText
Hypertext
Hypertext is text displayed on a computer or other electronic device with references to other text that the reader can immediately access, usually by a mouse click or keypress sequence. Apart from running text, hypertext may contain tables, images and other presentational devices. Hypertext is the...

 Markup Language (HTML
HTML
HyperText Markup Language is the predominant markup language for web pages. HTML elements are the basic building-blocks of webpages....

), one of the document formats of the World Wide Web. HTML is mostly an instance of SGML (though, strictly, it does not comply with all the rules of SGML) and follows many of the markup conventions used in the publishing industry in the communication of printed work between authors, editors, and printers.

Types

There are three general categories of electronic markup: Presentational, procedural, and descriptive.
  • Presentational markup is that used by traditional word-processing systems, binary codes embedded in document text that produced the WYSIWYG
    WYSIWYG
    WYSIWYG is an acronym for What You See Is What You Get. The term is used in computing to describe a system in which content displayed onscreen during editing appears in a form closely corresponding to its appearance when printed or displayed as a finished product...

     effect. Such markup is usually designed to be hidden from human users, even those who are authors or editors.

  • Procedural markup is embedded in text and provides instructions for programs that are to process the text. Well-known examples include troff
    Troff
    troff is a document processing system developed by AT&T for the Unix operating system.-History:troff can trace its origins back to a text formatting program called RUNOFF, written by Jerome H. Saltzer for MIT's CTSS operating system in the mid-1960s...

    , LaTeX
    LaTeX
    LaTeX is a document markup language and document preparation system for the TeX typesetting program. Within the typesetting system, its name is styled as . The term LaTeX refers only to the language in which documents are written, not to the editor used to write those documents. In order to...

    , and PostScript
    PostScript
    PostScript is a dynamically typed concatenative programming language created by John Warnock and Charles Geschke in 1982. It is best known for its use as a page description language in the electronic and desktop publishing areas. Adobe PostScript 3 is also the worldwide printing and imaging...

    ; it is expected that the processor runs through the text from beginning to end, following the instructions as encountered. Text with such markup is often edited with the markup visible and directly manipulated by the author. Popular procedural-markup systems usually include programming constructs, such that macros or subroutines can be defined and invoked by name.

  • In Descriptive markup, the markup is used to label parts of the document rather than to provide specific instructions as to how they should be processed. The objective is to decouple the inherent structure of the document from any particular treatment or rendition of it. Such markup is often described as "semantic". An example of descriptive markup would be HTML's <cite> tag, which is used to label a citation.


There is considerable blurring of the lines between the types of markup. In modern word-processing systems, presentational markup is often saved in descriptive-markup-oriented systems such as XML
XML
Extensible Markup Language is a set of rules for encoding documents in machine-readable form. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards....

, and then processed procedurally by implementations. The programming constructs in descriptive-markup systems such as TeX
TeX
TeX is a typesetting system designed and mostly written by Donald Knuth and released in 1978. Within the typesetting system, its name is formatted as ....

 may be used to create higher-level markup systems which are more descriptive, such as LaTeX
LaTeX
LaTeX is a document markup language and document preparation system for the TeX typesetting program. Within the typesetting system, its name is styled as . The term LaTeX refers only to the language in which documents are written, not to the editor used to write those documents. In order to...

.

In recent years, a number of small and largely unstandardized markup languages have been developed to allow authors to create formatted text via web browsers, for use in wiki
Wiki
A wiki is a website that allows the creation and editing of any number of interlinked web pages via a web browser using a simplified markup language or a WYSIWYG text editor. Wikis are typically powered by wiki software and are often used collaboratively by multiple users. Examples include...

s and web forums. These are sometimes called Lightweight markup language
Lightweight markup language
A lightweight markup language is a markup language with a simple syntax, designed to be easy for a human to enter with a simple text editor, and easy to read in its raw form....

s. The markup language used by Wikipedia
Wikipedia
Wikipedia is a free, web-based, collaborative, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation. Its 20 million articles have been written collaboratively by volunteers around the world. Almost all of its articles can be edited by anyone with access to the site,...

 is one such.

History

The term markup is derived from the traditional publishing practice of "marking up" a manuscript
Manuscript
A manuscript or handwrite is written information that has been manually created by someone or some people, such as a hand-written letter, as opposed to being printed or reproduced some other way...

, which involves adding handwritten annotations in the form of conventional symbolic printer
Printing
Printing is a process for reproducing text and image, typically with ink on paper using a printing press. It is often carried out as a large-scale industrial process, and is an essential part of publishing and transaction printing....

's instructions in the margins and text of a paper manuscript or printed proof
Proofreading
Proofreading is the reading of a galley proof or computer monitor to detect and correct production-errors of text or art. Proofreaders are expected to be consistently accurate by default because they occupy the last stage of typographic production before publication.-Traditional method:A proof is...

. For centuries, this task was done primarily by skilled typographers known as "markup men" or "copy markers" who marked up text to indicate what typeface
Typeface
In typography, a typeface is the artistic representation or interpretation of characters; it is the way the type looks. Each type is designed and there are thousands of different typefaces in existence, with new ones being developed constantly....

, style, and size should be applied to each part, and then passed the manuscript to others for typesetting
Typesetting
Typesetting is the composition of text by means of types.Typesetting requires the prior process of designing a font and storing it in some manner...

 by hand. Markup was also commonly applied by editors, proofreaders, publishers, and graphic designers, and indeed by document authors.

GenCode

The first well-known public presentation of markup languages in computer text processing was made by William W. Tunnicliffe
William W. Tunnicliffe
William W. Tunnicliffe is credited by Charles Goldfarb as being the first person to articulate the idea of separating the definition of formatting from the structure of content in electronic documents .In September 1967, during a meeting at the Canadian Government Printing Office, Tunnicliffe...

 at a conference in 1967, although he preferred to call it generic coding. It can be seen as a response to the emergence of programs such as RUNOFF
RUNOFF
RUNOFF was the first computer text formatting program to see significant use. It was written in 1964 for the CTSS operating system by Jerome H. Saltzer in MAD assembler....

 that each used their own control notations, often specific to the target typesetting device. In the 1970s, Tunnicliffe led the development of a standard called GenCode for the publishing industry and later was the first chair of the International Organization for Standardization
International Organization for Standardization
The International Organization for Standardization , widely known as ISO, is an international standard-setting body composed of representatives from various national standards organizations. Founded on February 23, 1947, the organization promulgates worldwide proprietary, industrial and commercial...

 committee that created SGML, the first standard descriptive markup language. Book designer Stanley Rice published speculation along similar lines in 1970. Brian Reid, in his 1980 dissertation at Carnegie Mellon University
Carnegie Mellon University
Carnegie Mellon University is a private research university in Pittsburgh, Pennsylvania, United States....

, developed the theory and a working implementation of descriptive markup in actual use.

However, IBM
IBM
International Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...

 researcher Charles Goldfarb
Charles Goldfarb
Charles F. Goldfarb is known as the father of SGML and is a co-inventor of the concept of markup languages. In 1969 Charles Goldfarb, leading a small team at IBM, developed the first markup language, called Generalized Markup Language, or GML. In an , Dr...

 is more commonly seen today as the "father" of markup languages. Goldfarb hit upon the basic idea while working on a primitive document management system intended for law firms in 1969, and helped invent IBM GML later that same year. GML was first publicly disclosed in 1973.

In 1975, Goldfarb moved from Cambridge, Massachusetts
Cambridge, Massachusetts
Cambridge is a city in Middlesex County, Massachusetts, United States, in the Greater Boston area. It was named in honor of the University of Cambridge in England, an important center of the Puritan theology embraced by the town's founders. Cambridge is home to two of the world's most prominent...

 to Silicon Valley
Silicon Valley
Silicon Valley is a term which refers to the southern part of the San Francisco Bay Area in Northern California in the United States. The region is home to many of the world's largest technology corporations...

 and became a product planner at the IBM Almaden Research Center. There, he convinced IBM's executives to deploy GML commercially in 1978 as part of IBM's Document Composition Facility product, and it was widely used in business within a few years.

SGML, which was based on both GML and GenCode, was developed by Goldfarb in 1974. Goldfarb eventually became chair of the SGML committee. SGML was first released by ISO as the ISO 8879 standard in October 1986.

Some early examples of computer markup languages available outside the publishing industry can be found in typesetting tools on Unix
Unix
Unix is a multitasking, multi-user computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, and Joe Ossanna...

 systems such as troff
Troff
troff is a document processing system developed by AT&T for the Unix operating system.-History:troff can trace its origins back to a text formatting program called RUNOFF, written by Jerome H. Saltzer for MIT's CTSS operating system in the mid-1960s...

 and nroff
Nroff
nroff is a Unix text-formatting program; it produces output suitable for simple fixed-width printers and terminal windows...

. In these systems, formatting commands were inserted into the document text so that typesetting software could format the text according to the editor's specifications. It was a trial and error
Trial and error
Trial and error, or trial by error, is a general method of problem solving, fixing things, or for obtaining knowledge."Learning doesn't happen from failure itself but rather from analyzing the failure, making a change, and then trying again."...

 iterative process to get a document printed correctly. Availability of WYSIWYG
WYSIWYG
WYSIWYG is an acronym for What You See Is What You Get. The term is used in computing to describe a system in which content displayed onscreen during editing appears in a form closely corresponding to its appearance when printed or displayed as a finished product...

 ("what you see is what you get") publishing software supplanted much use of these languages among casual users, though serious publishing work still uses markup to specify the non-visual structure of texts, and WYSIWYG editors now usually save documents in a markup-language-based format.

TeX

Another major publishing standard is TeX
TeX
TeX is a typesetting system designed and mostly written by Donald Knuth and released in 1978. Within the typesetting system, its name is formatted as ....

, created and continuously refined by Donald Knuth
Donald Knuth
Donald Ervin Knuth is a computer scientist and Professor Emeritus at Stanford University.He is the author of the seminal multi-volume work The Art of Computer Programming. Knuth has been called the "father" of the analysis of algorithms...

 in the 1970s and '80s. TeX
TeX
TeX is a typesetting system designed and mostly written by Donald Knuth and released in 1978. Within the typesetting system, its name is formatted as ....

 concentrated on detailed layout of text and font descriptions in order to typeset mathematical books in professional quality. This required Knuth to spend considerable time investigating the art of typesetting
Typesetting
Typesetting is the composition of text by means of types.Typesetting requires the prior process of designing a font and storing it in some manner...

. However, TeX has a steep learning curve, so that it is mainly used in academia
Academia
Academia is the community of students and scholars engaged in higher education and research.-Etymology:The word comes from the akademeia in ancient Greece. Outside the city walls of Athens, the gymnasium was made famous by Plato as a center of learning...

, where it is the de facto standard in many scientific disciplines. A TeX macro package known as LaTeX
LaTeX
LaTeX is a document markup language and document preparation system for the TeX typesetting program. Within the typesetting system, its name is styled as . The term LaTeX refers only to the language in which documents are written, not to the editor used to write those documents. In order to...

 provides a descriptive markup system on top of TeX, and is widely used.

Scribe, GML and SGML

The first language to make a clear and clean distinction between structure and presentation was Scribe
Scribe (markup language)
Scribe is a markup language and word processing system which pioneered the use of descriptive markup. Scribe was revolutionary when it was proposed, because it involved for the first time a clean separation of structure and format.-Beginnings:...

, developed by Brian Reid and described in his doctoral thesis in 1980. Scribe was revolutionary in a number of ways, not least that it introduced the idea of styles separated from the marked up document, and of a grammar
Grammar
In linguistics, grammar is the set of structural rules that govern the composition of clauses, phrases, and words in any given natural language. The term refers also to the study of such rules, and this field includes morphology, syntax, and phonology, often complemented by phonetics, semantics,...

 controlling the usage of descriptive elements. Scribe influenced the development of Generalized Markup Language (later SGML) and is a direct ancestor to HTML and LaTeX
LaTeX
LaTeX is a document markup language and document preparation system for the TeX typesetting program. Within the typesetting system, its name is styled as . The term LaTeX refers only to the language in which documents are written, not to the editor used to write those documents. In order to...

.

In the early 1980s, the idea that markup should be focused on the structural aspects of a document and leave the visual presentation of that structure to the interpreter led to the creation of SGML. The language was developed by a committee chaired by Goldfarb. It incorporated ideas from many different sources, including Tunnicliffe's project, GenCode. Sharon Adler, Anders Berglund
Anders Berglund
Anders Berglund is a Swedish organizer, composer, conductor, pianist and musician.-Career:Berglund is known as the Melodifestivalen conductor 16 times....

, and James A. Marke were also key members of the SGML committee.

SGML specified a syntax for including the markup in documents, as well as one for separately describing what tags
Tag (metadata)
In online computer systems terminology, a tag is a non-hierarchical keyword or term assigned to a piece of information . This kind of metadata helps describe an item and allows it to be found again by browsing or searching...

 were allowed, and where (the Document Type Definition (DTD
Document Type Definition
Document Type Definition is a set of markup declarations that define a document type for SGML-family markup languages...

) or schema
XML schema
An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself...

). This allowed authors to create and use any markup they wished, selecting tags that made the most sense to them and were named in their own natural languages. Thus, SGML is properly a meta-language, and many particular markup languages are derived from it. From the late '80s on, most substantial new markup languages have been based on SGML system, including for example TEI
Text Encoding Initiative
The Text Encoding Initiative is a text-centric community of practice in the academic field of digital humanities. The community runs a mailing list, meetings and conference series, and maintains a technical standard, a wiki and a toolset....

 and DocBook
DocBook
DocBook is a semantic markup language for technical documentation. It was originally intended for writing technical documents related to computer hardware and software but it can be used for any other sort of documentation....

. SGML was promulgated as an International Standard by International Organization for Standardization
International Organization for Standardization
The International Organization for Standardization , widely known as ISO, is an international standard-setting body composed of representatives from various national standards organizations. Founded on February 23, 1947, the organization promulgates worldwide proprietary, industrial and commercial...

, ISO 8879, in 1986.

SGML found wide acceptance and use in fields with very large-scale documentation requirements. However, it was generally found to be cumbersome and difficult to learn, a side effect of attempting to do too much and be too flexible. For example, SGML made end tags
Tag (metadata)
In online computer systems terminology, a tag is a non-hierarchical keyword or term assigned to a piece of information . This kind of metadata helps describe an item and allows it to be found again by browsing or searching...

 (or start-tags, or even both) optional in certain contexts, because it was thought that markup would be done manually by overworked support staff who would appreciate saving keystrokes.

HTML

By 1991, it appeared to many that SGML would be limited to commercial and data-based applications while WYSIWYG
WYSIWYG
WYSIWYG is an acronym for What You See Is What You Get. The term is used in computing to describe a system in which content displayed onscreen during editing appears in a form closely corresponding to its appearance when printed or displayed as a finished product...

 tools (which stored documents in proprietary binary formats) would suffice for other document processing
Document processing
Document Processing involves the conversion of typed and handwritten text on paper-based & electronic documents into electronic information utilising one of, or a combination of, Intelligent Character Recognition , Optical Character Recognition and experienced Data Entry Clerks....

 applications. The situation changed when Sir Tim Berners-Lee
Tim Berners-Lee
Sir Timothy John "Tim" Berners-Lee, , also known as "TimBL", is a British computer scientist, MIT professor and the inventor of the World Wide Web...

, learning of SGML from co-worker Anders Berglund and others at CERN
CERN
The European Organization for Nuclear Research , known as CERN , is an international organization whose purpose is to operate the world's largest particle physics laboratory, which is situated in the northwest suburbs of Geneva on the Franco–Swiss border...

, used SGML syntax to create HTML
HTML
HyperText Markup Language is the predominant markup language for web pages. HTML elements are the basic building-blocks of webpages....

. HTML resembles other SGML-based tag languages, although it began as simpler than most and a formal DTD was not developed until later. Steven DeRose
Steven DeRose
Steven J DeRose is a computer scientist with a significant history of contributions to Computational Linguistics and to key standards related to document processing, mostly around ISO's Standard Generalized Markup Language and W3C's Extensible Markup Language .His contributions include the...

 argues that HTML's use of descriptive markup (and SGML in particular) was a major factor in the success of the Web, because of the flexibility and extensibility that it enabled (other factors include the notion of URLs and the free distribution of browsers). HTML is quite likely the most used markup language in the world today.

Some would restrict the term "markup language" to systems that directly support non-hierarchical structures (see Hierarchical model
Hierarchical model
A hierarchical database model is a data model in which the data is organized into a tree-like structure. The structure allows representing information using parent/child relationships: each parent can have many children, but each child has only one parent...

). By this definition HTML, XML, and even SGML (apart from its rarely used CONCUR option) would be disqualified and called "container languages" instead. However, the term "container language" is not in widespread use, and such hierarchical languages are almost universally considered markup languages. There is active research on non-hierarchical markup models, some expressed within XML and related languages (for example, using the Text Encoding Initiative
Text Encoding Initiative
The Text Encoding Initiative is a text-centric community of practice in the academic field of digital humanities. The community runs a mailing list, meetings and conference series, and maintains a technical standard, a wiki and a toolset....

 Guidelines and derivatives such as the Open Scripture Information Standard
Open Scripture Information Standard
Open Scripture Information Standard is an XML application , that defines tags for marking up Bibles, theological commentaries, and other related literature....

 and CLIX
Clix
Clix can be:* CLIX , a version of UNIX developed by Intergraph* Clix , a Portuguese triple play brand* Axe Clix, a deodorant* iriver clix, rebrand of the iriver U10, a multimedia player...

), and some not (for example, MECS
MECS
MECS is the Multi-Element Code System, a markup system developed by the Wittgenstein Archives at the University of Bergen. It is very similar to SGML and XML except that it allows elements to overlap....

 and the Layed Markup and Annotation Language or LMNL). Much of this research is published in the proceedings of the Extreme Markup and Balisage
Balisage
Balisage is, most commonly in military applications, the use of dim lighting to enable navigation while not giving away one's position to the enemy....

 conferences, generally held in Montreal
Montreal
Montreal is a city in Canada. It is the largest city in the province of Quebec, the second-largest city in Canada and the seventh largest in North America...

.

XML

XML (Extensible Markup Language) is a meta markup language that is now widely used. XML was developed by the World Wide Web Consortium
World Wide Web Consortium
The World Wide Web Consortium is the main international standards organization for the World Wide Web .Founded and headed by Tim Berners-Lee, the consortium is made up of member organizations which maintain full-time staff for the purpose of working together in the development of standards for the...

, in a committee created and chaired by Jon Bosak
Jon Bosak
Jon Bosak led the creation of the XML specification at the W3C. From 1996–2008, he worked for Sun Microsystems.-XML:Tim Bray, who was one of the editors of the XML specification, has this to say in his note on Bosak in his annotated version of the specification:In a 1999 posting to the xml-dev...

. The main purpose of XML was to simplify SGML by focusing on a particular problem — documents on the Internet. XML remains a meta-language like SGML, allowing users to create any tags needed (hence "extensible") and then describing those tags and their permitted uses.

XML adoption was helped because every XML document can be written in such a way that it is also an SGML document, and existing SGML users and software could switch to XML fairly easily. However, XML eliminated many of the more complex and human-oriented features of SGML to simplify implementation environments such as documents and publications. However, it appeared to strike a happy medium between simplicity and flexibility, and was rapidly adopted for many other uses. XML is now widely used for communicating data
Database transaction
A transaction comprises a unit of work performed within a database management system against a database, and treated in a coherent and reliable way independent of other transactions...

 between applications. Like HTML, it can be described as a 'container' language.

XHTML

Since January 2000 all W3C Recommendation
W3C recommendation
A W3C Recommendation is the final stage of a ratification process of the World Wide Web Consortium working group concerning a technical standard. This designation signifies that a document has been subjected to a public and W3C-member organization's review. It aims to standardise the Web technology...

s for HTML have been based on XML rather than SGML, using the abbreviation XHTML
XHTML
XHTML is a family of XML markup languages that mirror or extend versions of the widely-used Hypertext Markup Language , the language in which web pages are written....

 (Extensible HyperText Markup Language). The language specification requires that XHTML Web documents must be well-formed XML documents – this allows for more rigorous and robust documents while using tags familiar from HTML.

One of the most noticeable differences between HTML and XHTML is the rule that all tags must be closed: empty HTML tags such as
must either be closed with a regular end-tag, or replaced by a special form:  /> (the space before the '/' on the end tag is optional, but frequently used because it enables some pre-XML Web browsers, and SGML parsers, to accept the tag). Another is that all attribute values in tags must be quoted. Finally, all tag and attribute names must be lowercase in order to be valid; HTML, on the other hand, was case-insensitive.

Other XML-based applications

Many XML-based applications now exist, including Resource Description Framework
Resource Description Framework
The Resource Description Framework is a family of World Wide Web Consortium specifications originally designed as a metadata data model...

 (RDF), XForms
XForms
XForms is an XML format for the specification of a data processing model for XML data and user interface for the XML data, such as web forms...

, DocBook
DocBook
DocBook is a semantic markup language for technical documentation. It was originally intended for writing technical documents related to computer hardware and software but it can be used for any other sort of documentation....

, SOAP and the Web Ontology Language
Web Ontology Language
The Web Ontology Language is a family of knowledge representation languages for authoring ontologies.The languages are characterised by formal semantics and RDF/XML-based serializations for the Semantic Web...

 (OWL). For a partial list of these see List of XML markup languages.

Features

A common feature of many markup languages is that they intermix the text of a document with markup instructions in the same data stream or file. This is not necessary; it is possible to isolate markup from text content, using pointers, offsets, IDs, or other methods to co-ordinate the two. Such "standoff markup" is typical for the internal representations that programs use to work with marked-up documents. However, embedded or "inline" markup is much more common elsewhere. Here, for example, is a small section of text marked up in HTML:

<h1> Anatidae </h1>
<p>
The family <i>Anatidae</i> includes ducks, geese, and swans,
but <em>not</em> the closely related screamers.
</p>

The codes enclosed in angle-brackets <like this> are markup instructions (known as tags
Tag (metadata)
In online computer systems terminology, a tag is a non-hierarchical keyword or term assigned to a piece of information . This kind of metadata helps describe an item and allows it to be found again by browsing or searching...

), while the text between these instructions is the actual text of the document. The codes h1, p, and em are examples of semantic markup, in that they describe the intended purpose or meaning of the text they include. Specifically, h1 means "this is a first-level heading", p means "this is a paragraph", and em means "this is an emphasized word or phrase". A program interpreting such structural markup may apply its own rules or styles for presenting the various pieces of text, using diffent typefaces, boldness, font size, indention, colour, or other styles, as desired.
A tag such as "h1" (header level 1) might be presented in a large bold sans-serif typeface, for example, or in a monospaced (typewriter-style) document it might be underscored – or it might not change the presentation at all.

In contrast, the i tag in HTML is an example of presentational markup; it is generally used to specify a particular characteristic of the text (in this case, the use of an italic typeface) without specifying the reason for that appearance.

The Text Encoding Initiative
Text Encoding Initiative
The Text Encoding Initiative is a text-centric community of practice in the academic field of digital humanities. The community runs a mailing list, meetings and conference series, and maintains a technical standard, a wiki and a toolset....

 (TEI) has published extensive guidelines for how to encode texts of interest in the humanities and social sciences, developed through years of international cooperative work. These guidelines are used by projects encoding historical documents, the works of particular scholars, periods, or genres, and so on.

Alternative usage

While the idea of markup language originated with text documents, there is an increasing usage of markup languages in other areas which involve the presentation of various types of information, including playlist
Playlist
In its most general form, a playlist is simply a list of songs. They can be played in sequential or shuffled order. The term has several specialized meanings in the realms of radio broadcasting and personal computers.-In radio:...

s, vector graphics
Vector graphics
Vector graphics is the use of geometrical primitives such as points, lines, curves, and shapes or polygon, which are all based on mathematical expressions, to represent images in computer graphics...

, web service
Web service
A Web service is a method of communication between two electronic devices over the web.The W3C defines a "Web service" as "a software system designed to support interoperable machine-to-machine interaction over a network". It has an interface described in a machine-processable format...

s, content syndication
Web syndication
Web syndication is a form of syndication in which website material is made available to multiple other sites. Most commonly, web syndication refers to making web feeds available from a site in order to provide other people with a summary or update of the website's recently added content...

, and user interface
User interface
The user interface, in the industrial design field of human–machine interaction, is the space where interaction between humans and machines occurs. The goal of interaction between a human and a machine at the user interface is effective operation and control of the machine, and feedback from the...

s. Most of these are XML applications because it is a well-defined and extensible language.

The use of XML has also led to the possibility of combining multiple markup languages into a single profile, like XHTML+SMIL
XHTMLplusSMIL
XHTML+SMIL is a W3C Note that describes an integration of SMIL semantics with XHTML and CSS. It is based generally upon the HTML+TIME submission. The language is also known as HTML+SMIL....

 and XHTML+MathML+SVG

Because markup languages, and more generally data description languages (not necessarily textual markup), are not programming languages (they are data without instructions), they are more easily manipulated than programming languages – for example, web pages are presented as HTML documents, not C code, and thus can be embedded within other web pages, displayed when only partially received, and so forth. This leads to the web design principle of the "Rule of Least Power", which advocates using the least (computationally) powerful language that satisfies a task to facilitate such manipulation and reuse.

See also

  • ColdFusion Markup Language
    ColdFusion Markup Language
    ColdFusion Markup Language, more commonly known as CFML, is a scripting language for web development that runs on the JVM, the .NET framework, and Google App Engine...

  • CSS
    Cascading Style Sheets
    Cascading Style Sheets is a style sheet language used to describe the presentation semantics of a document written in a markup language...

     (Cascading Style Sheets)
  • Curl (declarative markup and functional programming)
  • Lightweight markup language
    Lightweight markup language
    A lightweight markup language is a markup language with a simple syntax, designed to be easy for a human to enter with a simple text editor, and easy to read in its raw form....

  • List of markup languages
  • Parameter Value Language
    Parameter Value Language
    In computer programming, Parameter Value Language is a markup language similar to XML. It is commonly employed for entries in the Planetary Database System used by NASA to store mission data, among other uses.- External links :*...

  • Programming language
    Programming language
    A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

     (contrast)
  • Scalable Vector Graphics
  • Struxt
    Struxt
    Struxt is a human-readable data format designed to be structurally equivalent to XML yet representationally similar to C-style programming languages.'Struxt stands for "Structured Text".- Features :...

     (C-style equivalent to XML)
  • UDO (markup language)
    UDO (markup language)
    UDO is a lightweight markup language. It requires the installation of a special UDO "converter program" that can convert UDO documents to Apple-QuickView, ASCII, HTML, Texinfo, Linuxdoc-SGML, Manualpage, Pure-C-Help, Rich Text Format, ST-Guide, LaTeX, Turbo Vision Help or Windows Help formats....

  • User interface markup language
    User interface markup language
    A user interface markup language is a markup language that renders and describes graphical user interfaces and controls. Many of these markup languages are dialects of XML and are dependent upon a pre-existing scripting language engine, usually a JavaScript engine, for rendering of controls and...

  • Vector graphics markup language
  • YAML
    YAML
    YAML is a human-readable data serialization format that takes concepts from programming languages such as C, Perl, and Python, and ideas from XML and the data format of electronic mail . YAML was first proposed by Clark Evans in 2001, who designed it together with Ingy döt Net and Oren Ben-Kiki...

     (YAML is not a markup language, but it's close)
  • Wikitext
    Wikitext
    Wikitext language, or wiki markup, is a lightweight markup language used to write pages in wiki websites, such as Wikipedia, and is a simplified alternative/intermediate to HTML. Its ultimate purpose is to be converted by wiki software into HTML, which in turn is served to web browsers.There is no...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK