Quotation mark glyphs
Different typefaces, character encodings and computer languages use various encodings and glyphs for quotation marks. This article lists some of these glyphs along with their Unicode code points and HTML entities. The Unicode standard defines two general character categories, "Pi" (punctuation, initial quote) and "Pf" (punctuation, final quote), which cover most curved and angle quotation marks; the straight typewriter quotes, the low-9 quotes and the CJK corner brackets are assigned to other punctuation categories.
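
The general category of each quotation mark can be inspected programmatically. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the function name `quote_categories` and the constant `QUOTE_CODE_POINTS` are illustrative, and the code points are the ones listed in this article's table.

```python
import unicodedata

# Code points of the quotation marks listed in this article's table.
QUOTE_CODE_POINTS = [
    0x0022, 0x0027, 0x00AB, 0x00BB,
    0x2018, 0x2019, 0x201A, 0x201B,
    0x201C, 0x201D, 0x201E, 0x201F,
    0x2039, 0x203A, 0x300C, 0x300D, 0x300E, 0x300F,
]


def quote_categories(code_points=QUOTE_CODE_POINTS):
    """Map each code point to its Unicode general category (e.g. 'Pi')."""
    return {cp: unicodedata.category(chr(cp)) for cp in code_points}


if __name__ == "__main__":
    for cp, category in quote_categories().items():
        print(f"U+{cp:04X} {chr(cp)} {category}")
```

With the Unicode data shipped in current Python versions, the curved left quotes report "Pi" and the curved right quotes "Pf", while the straight ASCII quotes report "Po" and the low-9 quotes "Ps".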

Typewriter quotation marks

"Ambidextrous" quotation marks were introduced on typewriters to reduce the number of keys on the keyboard, and were inherited by computer keyboards and character sets. Some earlier computer systems had character sets with proper opening and closing quotes. However, the ASCII character set, which has been used on a wide variety of computers since the 1960s, contains only a straight single quote, which doubles as the apostrophe (', U+0027), and a straight double quote (", U+0022).

Many systems, such as the personal computers of the 1980s and early 1990s, actually drew these quotes like curved closing quotes on screen and in printouts, so text would appear approximately like this:
”Good morning, Dave”, said HAL.
’Good morning, Dave’, said HAL.


These same systems often drew the grave accent (`, U+0060) as an open quote glyph (actually a high-reversed-9 glyph, to preserve some usability as a grave). This gives a proper appearance at the cost of semantic correctness. Nothing similar was available for the double quote, so many people resorted to using two single quotes for double quotes, which would look like the following:
‛‛Good morning, Dave’’, said HAL.
‛Good morning, Dave’, said HAL.


The typesetting application TeX still uses this convention for input files. However, the appearance of these characters has varied greatly from font to font. On systems that provide straight quotes and grave accents, as most do today (and as Unicode specifies), the result is poor, as shown here:
``Good morning, Dave'', said HAL.
`Good morning, Dave', said HAL.


The Unicode slanted/curved quotes described below are shown here for comparison:
“Good morning, Dave”, said HAL.
‘Good morning, Dave’, said HAL.
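
Converting text typed in the grave-and-apostrophe convention into the Unicode curved quotes is a simple substitution. The following is a minimal sketch in Python; the function name `tex_quotes_to_unicode` is hypothetical, and the approach is a plain string replacement rather than anything TeX itself does internally.

```python
def tex_quotes_to_unicode(text: str) -> str:
    """Convert `...' and ``...'' style quoting to Unicode curved quotes."""
    # Replace the two-character forms first so that they are not consumed
    # one character at a time by the single-quote replacements below.
    text = text.replace("``", "\u201c").replace("''", "\u201d")
    # Every remaining grave becomes a left single quote, and every remaining
    # straight quote a right single quote (which also serves as apostrophe).
    return text.replace("`", "\u2018").replace("'", "\u2019")
```

This sketch does not try to resolve ambiguous runs such as three consecutive apostrophes, and it treats every grave accent as a quote, so it is unsuitable for text that uses the grave for other purposes.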

Quotation marks in English

English curved quotes, also called “book quotes” or “curly quotes”, resemble small figures six and nine raised above the baseline (like 6...9 and 66...99), but solid, i.e., with the counters filled. In many typefaces, the shapes are the same as those of an inverted (upside-down) comma and a normal comma. They are preferred in formal writing and printed typography.

Quotation marks in electronic documents

In e-mail and on Usenet, curved quotes can be used only by declaring a MIME type with a character set outside the ISO-8859 series, such as a Unicode encoding or one of the Windows-125x series. In most cases (the exceptions being when UTF-7 is used or the 8BITMIME extension is present), this also requires the use of a content-transfer encoding. A few mail clients send curved quotes using the windows-1252 codes but mark the text as ISO-8859-1, causing problems for decoders that do not make the dubious assumption that C1 control codes in ISO-8859-1 text were meant to be windows-1252 printable characters.
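
The effect of such mislabelling can be reproduced with two decoders. The following is a minimal sketch in Python; the function `decode_mail_text` and its `lenient` flag are hypothetical and only illustrate the assumption some decoders make, not the behaviour of any particular mail client.

```python
def decode_mail_text(data: bytes, declared_charset: str, lenient: bool = False) -> str:
    """Decode message bytes according to the charset declared by the sender.

    With lenient=True, text labelled ISO-8859-1 is decoded as windows-1252,
    which is the assumption described above.
    """
    charset = declared_charset.lower()
    if lenient and charset in ("iso-8859-1", "latin-1", "latin1"):
        charset = "windows-1252"
    # Note: Python's windows-1252 codec rejects the few bytes that the
    # code page leaves undefined (for example 0x81).
    return data.decode(charset)


# Bytes as a windows-1252 sender would write them for curved double quotes.
raw = b"\x93Good morning, Dave\x94"

strict = decode_mail_text(raw, "ISO-8859-1")                 # C1 controls
relaxed = decode_mail_text(raw, "ISO-8859-1", lenient=True)  # curved quotes
```

Here `strict` begins and ends with the invisible control characters U+0093 and U+0094, whereas `relaxed` contains U+201C and U+201D.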

Curved and straight quotes are also sometimes referred to as smart quotes (“…”) and dumb quotes ("…") respectively; the names refer to a function found in several word processors that automatically converts straight quotes typed by the user into curved quotes.
This function, known as “educating quotes”, was developed for systems that lack separate open- and close-quote keyboard keys.
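
The idea behind such a function can be illustrated with a context rule: a straight quote at the start of the text, or after whitespace or an opening bracket, is taken to open a quotation, and any other straight quote to close one. The following is a naive sketch in Python under that assumption; the name `educate_quotes` is hypothetical, and real word processors use more elaborate heuristics.

```python
import re

# Matches a position that is at the start of the text or preceded by
# whitespace, an opening bracket, or an already converted opening quote.
_OPENING_CONTEXT = r"(?<![^\s(\[{" + "\u2018\u201c" + "])"


def educate_quotes(text: str) -> str:
    """Naively replace straight quotes with curved quotes based on context."""
    # Double quotes: opening positions first, everything left over is closing.
    text = re.sub(_OPENING_CONTEXT + '"', "\u201c", text)
    text = text.replace('"', "\u201d")
    # Single quotes: same rule; leftovers also cover apostrophes inside words.
    text = re.sub(_OPENING_CONTEXT + "'", "\u2018", text)
    return text.replace("'", "\u2019")
```

For example, `educate_quotes('"Good morning, Dave", said HAL.')` yields the sentence with U+201C and U+201D around the quotation. The rule is purely positional, so it cannot tell an apostrophe at the start of a word from an opening single quote.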

Supporting curved quotes has been a problem in information technology, primarily because the widely used ASCII character set did not include a representation for them (as discussed above).

Word processors have traditionally offered curved quotes to users, because in printed documents curved quotes are preferred to straight ones. Before Unicode was widely accepted and supported, this meant representing the curved quotes in whatever 8-bit encoding the software and underlying operating system were using. However, the character sets for Windows and Macintosh used two different pairs of values for curved quotes, and ISO 8859-1 (historically the default character set for the Unixes and older Linux systems) has no curved quotes at all, making cross-platform compatibility quite difficult to implement.
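
The differing byte values can be seen by encoding the four curved quotes with each legacy character set. The following is a small sketch using Python's built-in codecs `cp1252` (Windows), `mac_roman` (Macintosh) and `iso-8859-1`; the helper name `quote_bytes` is illustrative.

```python
# Left single, right single, left double, right double.
CURVED_QUOTES = "\u2018\u2019\u201c\u201d"


def quote_bytes(codec: str):
    """Return the four curved quotes encoded with `codec`, or None if the
    character set has no representation for them."""
    try:
        return CURVED_QUOTES.encode(codec)
    except UnicodeEncodeError:
        return None


windows = quote_bytes("cp1252")        # bytes 0x91 0x92 0x93 0x94
macintosh = quote_bytes("mac_roman")   # bytes 0xD4 0xD5 0xD2 0xD3
latin1 = quote_bytes("iso-8859-1")     # None: no curved quotes available
```

A file written on one platform and read with the other platform's character set therefore shows unrelated characters where the quotes should be.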

Compounding the problem is the “smart quotes” feature mentioned above, which some word processors (including Microsoft Word and OpenOffice.org) use by default. With this feature turned on, users may not realise that the ASCII-compatible straight quotes they type on their keyboards end up as something entirely different.

Further, the “smart quotes” feature converts leading apostrophes (such as in the words ’tis, ’em, and ’til) into opening single quotation marks, which are essentially upside-down apostrophes. A blatant example of this error appears in the advertisements for the television show ’Til Death.
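
One way to repair this after the fact is to look for an opening single quote directly in front of a word that is known to begin with an apostrophe. The following is a minimal sketch in Python; the function name `fix_leading_apostrophes` and the short word list `ELIDED_WORDS` are hypothetical, and a practical list would need to be much longer.

```python
import re

# Hypothetical, deliberately short list of words written with a leading
# apostrophe; extend as needed.
ELIDED_WORDS = ("tis", "twas", "em", "til", "cause")


def fix_leading_apostrophes(text: str, words=ELIDED_WORDS) -> str:
    """Turn a wrongly converted opening quote (U+2018) back into an
    apostrophe (U+2019) when it precedes one of the listed words."""
    alternatives = "|".join(re.escape(word) for word in words)
    # The lookahead requires a whole word, so "time" does not match "ti...".
    pattern = re.compile("\u2018(?=(?:" + alternatives + r")\b)", re.IGNORECASE)
    return pattern.sub("\u2019", text)
```

The sketch cannot distinguish these cases from a genuine quotation that happens to start with one of the listed words, so its output still needs review.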

Unicode support has since become the norm for operating systems. As a result, transferring content containing curved quotes (or any other non-ASCII characters) from a word processor to another application or platform is often less troublesome, provided all steps in the process (including the clipboard, if applicable) are Unicode-aware. But many applications still use the older character sets, or output data using them, and thus problems still occur.

There are other considerations for including curved quotes in the widely used markup languages HTML, XML, and SGML. If the encoding of the document supports direct representation of the characters, they can be used, but doing so can cause difficulties if the document needs to be edited by someone whose editor does not support the encoding. For example, many simple text editors handle only a few encodings, or assume that any file opened uses the platform default encoding, so the quote characters may appear as “garbage”.

HTML includes a set of entities for curved quotes: &lsquo; (left single), &rsquo; (right single), &sbquo; (low 9 single), &ldquo; (left double), &rdquo; (right double), and &bdquo; (low 9 double). XML does not define these by default, but specifications based on it can do so, and XHTML does. In addition, while the HTML 4, XHTML and XML specifications allow numeric character references in either hexadecimal or decimal, SGML and older versions of HTML (and many old implementations) support only decimal references. Thus, to represent curly quotes in XML and SGML, it is safest to use decimal numeric character references: &#8220; and &#8221; for the double curly quotes, and &#8216; and &#8217; for the single curly quotes. Both numeric and named references function correctly in almost every modern browser. While numeric references can make a page more compatible with outdated browsers, named references are safer for systems that handle multiple character encodings (e.g. RSS aggregators and search results).
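
Decimal numeric character references can be produced mechanically from text that contains the characters directly. The following is a minimal sketch in Python using the standard `xmlcharrefreplace` error handler and `html.unescape`; the function name `to_decimal_references` is illustrative.

```python
import html


def to_decimal_references(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character
    reference such as &#8220;, leaving ASCII characters untouched."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = to_decimal_references("\u201cGood morning, Dave\u201d")

# Named and numeric references decode to the same characters.
from_named = html.unescape("&ldquo;Good morning, Dave&rdquo;")
from_numeric = html.unescape(encoded)
```

Note that this sketch does not escape markup-significant ASCII characters such as `&` and `<`; that has to be handled separately.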

Quotation marks in Unicode

| View | Code | Unicode name | HTML | Comments |
|---|---|---|---|---|
| " | U+0022 | Quotation mark | &quot; | Typewriter (“programmer’s”) quote, ambidextrous |
| ' | U+0027 | Apostrophe | &#39; | Typewriter (“programmer’s”) straight single quote, ambidextrous |
| « | U+00AB | Left-pointing double angle quotation mark | &laquo; | Double angle quote (chevron, guillemet, duck-foot quote), left |
| » | U+00BB | Right-pointing double angle quotation mark | &raquo; | Double angle quote, right |
| ‘ | U+2018 | Left single quotation mark | &lsquo; | Single curved quote, left |
| ’ | U+2019 | Right single quotation mark | &rsquo; | Single curved quote, right |
| ‚ | U+201A | Single low-9 quotation mark | &sbquo; | Low single curved quote, left |
| ‛ | U+201B | Single high-reversed-9 quotation mark | &#8219; | Also called single reversed comma, quotation mark |
| “ | U+201C | Left double quotation mark | &ldquo; | Double curved quote, or “curly quote”, left |
| ” | U+201D | Right double quotation mark | &rdquo; | Double curved quote, right |
| „ | U+201E | Double low-9 quotation mark | &bdquo; | Low double curved quote, left |
| ‟ | U+201F | Double high-reversed-9 quotation mark | &#8223; | Also called double reversed comma, quotation mark |
| ‹ | U+2039 | Single left-pointing angle quotation mark | &lsaquo; | Single angle quote, left |
| › | U+203A | Single right-pointing angle quotation mark | &rsaquo; | Single angle quote, right |
| 「 | U+300C | Left corner bracket | &#12300; | CJK |
| 」 | U+300D | Right corner bracket | &#12301; | CJK |
| 『 | U+300E | Left white corner bracket | &#12302; | CJK |
| 』 | U+300F | Right white corner bracket | &#12303; | CJK |

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 