Portable Document Format
Portable Document Format (PDF) is an open standard for document exchange. This file format, created by Adobe Systems in 1993, is used for representing documents in a manner independent of application software, hardware, and operating systems.
Each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics, and other information needed to display it.

In 1991 Adobe Systems co-founder John Warnock outlined a system called "Camelot" that evolved into the Portable Document Format (PDF).

While the PDF specification had been available free of charge since at least 2001, PDF was originally a proprietary format controlled by Adobe. It was officially released as an open standard on July 1, 2008, and published by the International Organization for Standardization as ISO 32000-1:2008. In 2008, Adobe published a Public Patent License to ISO 32000-1 granting royalty-free rights for all patents owned by Adobe that are necessary to make, use, sell and distribute PDF-compliant implementations.

History

PDF's adoption in the early days of the format's history was slow. Adobe Acrobat, Adobe's suite for reading and creating PDF files, was not freely available; early versions of PDF had no support for external hyperlinks, reducing its usefulness on the Internet; the larger size of a PDF document compared to plain text required longer download times over the slower modems common at the time; and rendering PDF files was slow on the less powerful machines of the day. Additionally, there were competing formats such as DjVu (still developing), Envoy, Common Ground Digital Paper, Farallon Replica and even Adobe's own PostScript format (.ps); in those early years, PDF was popular mainly in desktop publishing workflows.

Adobe soon started distributing its Acrobat Reader (now Adobe Reader) program at no cost, and continued supporting the original PDF, which eventually became the de facto standard for printable documents on the web (a standard web document).

Adobe's PDF specifications

Adobe changed the PDF specification several times and continues to develop new specifications with new versions of Adobe Acrobat. There have been nine versions of PDF with corresponding Acrobat releases:
  • 1993 – PDF 1.0 / Acrobat 1.0
  • 1994 – PDF 1.1 / Acrobat 2.0
  • 1996 – PDF 1.2 / Acrobat 3.0
  • 2000 – PDF 1.3 / Acrobat 4.0
  • 2001 – PDF 1.4 / Acrobat 5.0
  • 2003 – PDF 1.5 / Acrobat 6.0
  • 2005 – PDF 1.6 / Acrobat 7.0
  • 2006 – PDF 1.7 / Acrobat 8.0
  • 2006 – PDF 1.7 / Acrobat 8.2
  • 2008 – PDF 1.7, Adobe Extension Level 3 / Acrobat 9.0
  • 2009 – PDF 1.7, Adobe Extension Level 5 / Acrobat 9.1


The ISO standard ISO 32000-1:2008 and Adobe PDF 1.7 are technically consistent. Adobe declared that it is not producing a PDF 1.8 Reference. The future versions of the PDF Specification will be produced by ISO technical committees. However, Adobe published documents specifying what extended features for PDF, beyond ISO 32000-1 (PDF 1.7), are supported in its newly released products. This makes use of the extensibility features of PDF as documented in ISO 32000-1 in Annex E. Adobe declared all extended features in Adobe Extension Level 3 and 5 have been accepted for a new proposal of ISO 32000-2 (a.k.a. PDF 2.0).

The specifications for PDF are backward inclusive. The PDF 1.7 specification includes all of the functionality previously documented in the Adobe PDF Specifications for versions 1.0 through 1.6. Where Adobe removed certain features of PDF from their standard, they too are not contained in ISO 32000-1.

PDF documents conforming to ISO 32000-1 carry the PDF version number 1.7. Documents containing Adobe extended features still carry the PDF base version number 1.7 but also contain an indication of which extension was followed during document creation.
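
As a rough illustration of the version numbering described above, the following Python sketch reads the base version from the `%PDF-x.y` header comment at the start of a file and looks for an `/ExtensionLevel` entry. The function name `pdf_version_info` and the sample bytes are hypothetical. The regular-expression scan is a simplification: it assumes the header sits at the very start of the data and that the extensions dictionary is stored uncompressed, neither of which is guaranteed for real files.

```python
import re


def pdf_version_info(data: bytes):
    """Return (header_version, extension_level or None) from raw PDF bytes."""
    # The file header is a comment of the form %PDF-x.y
    header = re.match(rb"%PDF-(\d)\.(\d)", data)
    if header is None:
        raise ValueError("not a PDF: missing %PDF-x.y header")
    version = f"{header.group(1).decode()}.{header.group(2).decode()}"

    # Assumption: the extension level is visible as plain text in the file.
    ext = re.search(rb"/ExtensionLevel\s+(\d+)", data)
    level = int(ext.group(1)) if ext else None
    return version, level


# Hypothetical sample: a 1.7 file that declares Adobe Extension Level 3
sample = (
    b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog "
    b"/Extensions << /ADBE << /BaseVersion /1.7 /ExtensionLevel 3 >> >> >>\n"
    b"endobj\n"
)
print(pdf_version_info(sample))
```

A file without an extensions dictionary yields `None` as the second element, which matches the case of a plain ISO 32000-1 document.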

Adobe's versions

Each entry gives the version (edition, year of publication; Acrobat Reader version support), followed by the new features.
  • 1.0 (first edition, 1993; Acrobat Reader: Carousel)
  • 1.1 (first edition, revised, 1996; Acrobat Reader 2.0): Passwords, encryption (MD5, RC4 40-bit), device-independent color, threads and links
  • 1.2 (first edition, revised, 1996; Acrobat Reader 3.0): Interactive page elements (radio buttons, checkboxes, etc.); interactive, fill-in forms (AcroForm); Forms Data Format (FDF) for interactive form data that can be imported, exported, transmitted and received from the Web; mouse events; external movie reproduction; external or embedded sound reproduction; zlib/deflate compression of text or binary data; Unicode; advanced color features and image proxying
  • 1.3 (second edition, 2000; Acrobat Reader 4.0): Digital signatures; ICC and DeviceN color spaces; JavaScript actions; embedded file streams of any type (e.g. used for attachments); new annotation types; new features of the Adobe PostScript Language Level 3 imaging model; masked images; alternate representations for images; smooth shading; enhanced page numbering; Web capture (a facility for capturing information from the World Wide Web and converting it to PDF); representation of logical structure independently of graphical structure; additional support for CIDFonts; data structures for mapping strings and numbers to PDF objects; information for prepress production workflows support; new functions for several function object types that represent parameterized classes of functions
  • 1.4 (third edition, 2001; Acrobat Reader 5.0): JBIG2; transparency; RC4 encryption key lengths greater than 40 bits (40–128 bits); enhancements to interactive forms and Forms Data Format (FDF), XML form submissions, embedded FDF files, Unicode specification of field export values, remote collaboration and digital signatures in FDF files; accessibility to disabled users; metadata streams using XML (Extensible Metadata Platform, XMP); tagged PDF; inclusion of printer's marks; display and preview of production-related page boundaries; new predefined CMaps; alternate presentations; importing content from one PDF document into another; EmbeddedFiles entry in the PDF document's name dictionary, a standard location for the embedded data; OCR text layer
  • 1.5 (fourth edition, 2003; Acrobat Reader 6.0): JPEG 2000; enhanced support for embedding and playback of multimedia; object streams; cross reference streams; XML Forms Data Format (XFDF) for interactive form submission (replaced the XML format in PDF 1.4); support for forms, rich text elements and attributes based on Adobe's XML Forms Architecture (XFA) 2.02; public-key security handlers using PKCS#7 (introduced in PDF 1.3 but not documented in the Reference until 1.5), public-key encryption, permissions (usage rights (UR) signatures, which do not require document encryption), PKCS#7 with SHA-1, RSA up to 4096 bits; security handler can use its own encryption and decryption algorithms; document sections selectively viewed or hidden by authors or readers (for items such as CAD drawings, layered artwork, maps, and multi-language documents); Alternate Presentations (the only type is slideshow) invoked by means of JavaScript actions (Adobe Reader supports only SVG 1.0); support for MS Windows 98 dropped. Viewing and printing newer-version PDFs, such as those at the IRS, with older versions of Reader requires downloading them in Google Docs "Quick View" simplified PDF format.
  • 1.6 (fifth edition, 2004; Acrobat Reader 7.0): 3D artwork, e.g. support for the Universal 3D file format; OpenType font embedding; support for XFA 2.2 rich text elements and attributes; AES encryption; PKCS#7 with SHA256, DSA up to 4096 bits; NChannel color spaces; additional support for embedded file attachments, including cross-document linking to and from embedded files; enhancements and clarifications to digital signatures related to usage rights and modification detection and prevention signatures
  • 1.7 (ISO 32000-1:2008) (sixth edition (ISO first), 2006; Acrobat Reader 8): Increased presentation of 3D artwork; XFA 2.4 rich text elements and attributes; multiple file attachments (portable collections); document requirements for a PDF consumer application; new string types: PDFDocEncoded string, ASCII string, byte string; PKCS#7 with SHA384, SHA512 and RIPEMD160
  • 1.7 Extension Level 3 (2008; Acrobat Reader 9): 256-bit AES encryption; incorporation of XFA Datasets into a file conforming to PDF/A-2; improved attachment of Flash applications, video (including Flash video with H.264), audio, and other multimedia, two-way scripting bridge between Flash and conforming applications; XFA 2.5 and 2.6 rich text conventions
  • 1.7 Extension Level 5 (2009; Acrobat Reader 9.1): XFA 3.0
  • 1.7 Extension Level 8 (2011; Acrobat Reader X (10)): Specification not published as of May 2011

Specialized subsets of PDF

The following specialized subsets of the PDF specification have been standardized as ISO standards (or are in the standardization process):
  • PDF/X (since 2001 - series of ISO 15929 and ISO 15930 standards) - a.k.a. "PDF for Exchange" - for Graphic technology - Prepress digital data exchange (working in ISO Technical Committee 130), based on PDF 1.3, PDF 1.4 and later also PDF 1.6
  • PDF/A (since 2005 - series of ISO 19005 standards) - a.k.a. "PDF for Archive" - Document management - Electronic document file format for long-term preservation (working in ISO Technical Committee 171), based on PDF 1.4 and later also ISO 32000-1 (PDF 1.7)
  • PDF/E (since 2008 - ISO 24517) - a.k.a. "PDF for Engineering" - Document management - Engineering document format using PDF (working in ISO Technical Committee 171), based on PDF 1.6
  • PDF/VT (since 2010 - ISO 16612-2) - a.k.a. "PDF for exchange of variable data and transactional (VT) printing" - Graphic technology - Variable data exchange (working in ISO Technical Committee 130), based on PDF 1.6 as restricted by PDF/X-4 and PDF/X-5
  • PDF/UA (under development in 2011 - ISO/DIS 14289-1) - a.k.a. "PDF for Universal Access" - Document management applications - Electronic document file format enhancement for accessibility (working in ISO Technical Committee 171), based on ISO 32000-1 (PDF 1.7)


There is also PDF/H, a.k.a. "PDF Healthcare", a Best Practices Guide (BPG), supplemented by an Implementation Guide (IG), published in 2008. PDF Healthcare is not a standard or proposed standard, but only a guide for use with existing standards and other technologies. It is supported by the standards development organizations ASTM and AIIM. The PDF/H BPG is based on PDF 1.6.

PDF 1.7

The final revised documentation for PDF 1.7 was approved by ISO Technical Committee 171 in January 2008 and published as ISO 32000-1:2008 on July 1, 2008. PDF is now a published ISO standard, titled Document management—Portable document format—Part 1: PDF 1.7.

ISO 32000-1:2008 is the first ISO standard for the full function PDF. The previous ISO PDF standards (PDF/A, PDF/X, etc.) are for more specialized uses. The ISO 32000-1 includes all of the functionality previously documented in the Adobe PDF Specifications for versions 1.0 through 1.6. Adobe removed certain features of PDF from previous versions; these features are not contained in PDF 1.7 either.

The ISO 32000 document was prepared by Adobe Systems Incorporated based upon PDF Reference, sixth edition, Adobe Portable Document Format version 1.7, November 2006. It was reviewed, edited and adopted, under a special fast-track procedure, by ISO Technical Committee 171 (ISO/TC 171), Document management application, Subcommittee SC 2, Application issues, in parallel with its approval by the ISO member bodies.

According to the ISO PDF standard abstract:

ISO 32000-1:2008 specifies a digital form for representing electronic documents to enable users to exchange and view electronic documents independent of the environment in which they were created or the environment in which they are viewed or printed. It is intended for the developer of software that creates PDF files (conforming writers), software that reads existing PDF files and interprets their contents for display and interaction (conforming readers) and PDF products that read and/or write PDF files for a variety of other purposes (conforming products).

PDF 2.0

A new version of PDF standard is under development under the name ISO/CD 32000-2 - Document management—Portable document format—Part 2: PDF 2.0 (as of July 2011). PDF 2.0 was accepted by ISO as a new proposal in 2009 (ISO/NP 32000-2). Adobe has submitted the Adobe Extension Level 5 and Adobe Extension Level 3 specifications to ISO for inclusion into the next version of the ISO 32000 specification. Adobe declared they have all been accepted for part 2 of ISO 32000.

Technical foundations

Anyone may create applications that can read and write PDF files without having to pay royalties to Adobe Systems; Adobe holds patents to PDF, but licenses them for royalty-free use in developing software complying with its PDF specification.

PDF combines three technologies:
  • A subset of the PostScript page description programming language, for generating the layout and graphics.
  • A font-embedding/replacement system to allow fonts to travel with the documents.
  • A structured storage system to bundle these elements and any associated content into a single file, with data compression where appropriate.

PostScript

PostScript is a page description language run in an interpreter to generate an image, a process requiring many resources. It can handle not just graphics, but standard features of programming languages such as if and loop commands. PDF is largely based on PostScript but simplified to remove flow control features like these, while graphics commands such as lineto remain.

Often, the PostScript-like PDF code is generated from a source PostScript file. The graphics commands that are output by the PostScript code are collected and tokenized; any files, graphics, or fonts to which the document refers also are collected; then, everything is compressed to a single file. Therefore, the entire PostScript world (fonts, layout, measurements) remains intact.
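
The idea of collecting the graphics commands that a program emits, rather than keeping the program itself, can be sketched in Python. This is an analogy only, not Adobe's conversion process: the hypothetical function `flatten_polyline` plays the role of a PostScript loop, and its output is a flat list of PDF-style path operators (`m` for moveto, `l` for lineto, `S` for stroke) with no flow control left.

```python
def flatten_polyline(points):
    """Run the 'program' (a loop) and record only the drawing commands it emits."""
    commands = []
    for index, (x, y) in enumerate(points):
        # The first point starts the path; every later point draws a line to it.
        operator = "m" if index == 0 else "l"
        commands.append(f"{x} {y} {operator}")
    commands.append("S")  # stroke the finished path
    return commands


# Points that a looping program might compute on the fly
points = [(i * 10, (i // 2) * 10) for i in range(5)]
print("\n".join(flatten_polyline(points)))
```

The resulting command list contains no loop or condition, so a consumer only has to execute it from top to bottom.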

As a document format, PDF has several advantages over PostScript:
  • PDF contains tokenized and interpreted results of the PostScript source code, for direct correspondence between changes to items in the PDF page description and changes to the resulting page appearance.
  • PDF (from version 1.4) supports true graphic transparency; PostScript does not.
  • PostScript is an interpretive programming language with an implicit global state, so instructions accompanying the description of one page can affect the appearance of any following page. Therefore, all preceding pages in a PostScript document must be processed in order to determine the correct appearance of a given page, whereas each page in a PDF document is unaffected by the others. As a result, PDF viewers allow the user to quickly jump to the final pages of a long document, whereas a PostScript viewer needs to process all pages sequentially before being able to display the destination page (unless the optional PostScript Document Structuring Conventions have been carefully complied with).

File structure

A PDF file consists primarily of objects, of which there are eight types:
  • Boolean values, representing true or false
  • Numbers
  • Strings
  • Names
  • Arrays, ordered collections of objects
  • Dictionaries, collections of objects indexed by Names
  • Streams, usually containing large amounts of data
  • The null object
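
As a minimal sketch of the eight object types listed above, the Python code below maps each one to a Python value and writes it out in PDF syntax. The classes `Name` and `Stream` and the function `serialize` are hypothetical helpers for illustration; the sketch ignores details such as escaping special characters in names, hexadecimal strings, and stream filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    value: str  # written as /value


@dataclass
class Stream:
    dictionary: dict  # stream dictionary (keys are Name objects)
    data: bytes       # raw stream bytes


def serialize(obj) -> bytes:
    """Write one object in PDF syntax (simplified)."""
    if obj is None:
        return b"null"
    if isinstance(obj, bool):  # must be tested before int
        return b"true" if obj else b"false"
    if isinstance(obj, int):
        return str(obj).encode("ascii")
    if isinstance(obj, float):
        return f"{obj:.4f}".rstrip("0").rstrip(".").encode("ascii")
    if isinstance(obj, bytes):  # literal string, with basic escaping
        escaped = obj.replace(b"\\", b"\\\\").replace(b"(", b"\\(").replace(b")", b"\\)")
        return b"(" + escaped + b")"
    if isinstance(obj, Name):
        return b"/" + obj.value.encode("ascii")
    if isinstance(obj, list):
        return b"[" + b" ".join(serialize(item) for item in obj) + b"]"
    if isinstance(obj, dict):
        parts = [serialize(k) + b" " + serialize(v) for k, v in obj.items()]
        return b"<< " + b" ".join(parts) + b" >>"
    if isinstance(obj, Stream):
        # The stream dictionary records the length of the data that follows.
        header = {**obj.dictionary, Name("Length"): len(obj.data)}
        return serialize(header) + b"\nstream\n" + obj.data + b"\nendstream"
    raise TypeError(f"unsupported object: {obj!r}")


page = {Name("Type"): Name("Page"), Name("Rotate"): 90, Name("MediaBox"): [0, 0, 612, 792]}
print(serialize(page).decode("ascii"))
```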


Objects may be either direct (embedded in another object) or indirect. Indirect objects are numbered with an object number and a generation number. An index table called the xref table gives the byte offset of each indirect object from the start of the file. This design allows for efficient random access to the objects in the file, and also allows for small changes to be made without rewriting the entire file (incremental update). Beginning with PDF version 1.5, indirect objects may also be located in special streams known as object streams. This technique reduces the size of files that have large numbers of small indirect objects and is especially useful for Tagged PDF.
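
To make the role of the xref table concrete, here is a small Python sketch that parses a classic (uncompressed) cross-reference section into a lookup from object number and generation number to byte offset. The function name `parse_xref_section` and the sample text are hypothetical, and the sketch does not handle cross-reference streams or files with several incremental updates.

```python
def parse_xref_section(text: str) -> dict:
    """Map (object number, generation) -> byte offset for in-use entries."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "xref":
        raise ValueError("expected an 'xref' keyword")
    table = {}
    i = 1
    while i < len(lines):
        header = lines[i].split()
        if len(header) != 2:  # e.g. the 'trailer' keyword ends the section
            break
        first, count = int(header[0]), int(header[1])
        for k in range(count):
            # Each entry: 10-digit offset, 5-digit generation, 'n' (in use) or 'f' (free)
            offset, generation, flag = lines[i + 1 + k].split()
            if flag == "n":
                table[(first + k, int(generation))] = int(offset)
        i += 1 + count
    return table


sample = (
    "xref\n"
    "0 3\n"
    "0000000000 65535 f \n"
    "0000000017 00000 n \n"
    "0000000081 00000 n \n"
    "trailer\n"
    "<< /Size 3 >>\n"
)
print(parse_xref_section(sample))
```

With such a table a reader can seek straight to the offset of the object it needs instead of scanning the file, which is the random access the text describes.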

There are two layouts for PDF files: non-linear (not "optimized") and linear ("optimized"). Non-linear PDF files consume less disk space than their linear counterparts, though they are slower to access because portions of the data required to assemble pages of the document are scattered throughout the PDF file. Linear PDF files (also called "optimized" or "web optimized" PDF files) are constructed in a manner that enables them to be read in a Web browser plugin without waiting for the entire file to download, since they are written to disk in a linear (as in page order) fashion. PDF files may be optimized using Adobe Acrobat software or QPDF.
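
A quick way to guess whether a file uses the linear layout is to look for the linearization parameter dictionary, which is marked by a `/Linearized` key and is expected near the very start of the file. The Python sketch below is only a heuristic under that assumption: the function name `looks_linearized` and the sample bytes are hypothetical, and a full check would also validate the hint tables and offsets.

```python
def looks_linearized(data: bytes) -> bool:
    """Heuristic: a linearized file declares /Linearized within its first 1024 bytes."""
    return b"/Linearized" in data[:1024]


# Hypothetical file beginnings
linear_start = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 12345 /N 3 >>\nendobj\n"
plain_start = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(looks_linearized(linear_start))
print(looks_linearized(plain_start))
```

With QPDF, which is mentioned above, a command along the lines of `qpdf --linearize input.pdf output.pdf` writes a linearized copy (the file names here are placeholders).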

Imaging model

The basic design of how graphics are represented in PDF is very similar to that of PostScript, except for the use of transparency, which was added in PDF 1.4.

PDF graphics use a device independent
Device independent
A program or file is device independent when its function is universal on different types of device.For the World Wide Web, this means writing simple common denominator Hypertext Markup Language and Cascading Style Sheets so that most Web user agents on most devices can render it acceptably.For...

 Cartesian coordinate system
Cartesian coordinate system
A Cartesian coordinate system specifies each point uniquely in a plane by a pair of numerical coordinates, which are the signed distances from the point to two fixed perpendicular directed lines, measured in the same unit of length...

 to describe the surface of a page. A PDF page description can use a matrix
Matrix (mathematics)
In mathematics, a matrix is a rectangular array of numbers, symbols, or expressions. The individual items in a matrix are called its elements or entries. An example of a matrix with six elements isMatrices of the same size can be added or subtracted element by element...

 to scale
Scale (ratio)
The scale ratio of some sort of model which represents an original proportionally is the ratio of a linear dimension of the model to the same dimension of the original. Examples include a 3-dimensional scale model of a building or the scale drawings of the elevations or plans of a building. In such...

, rotate, or skew graphical elements. A key concept in PDF is that of the graphics state, which is a collection of graphical parameters that may be changed, saved, and restored by a page description. PDF has (as of version 1.6) 24 graphics state properties, of which some of the most important are:
  • The current transformation matrix (CTM), which determines the coordinate system
  • The clipping path
  • The color space
  • The alpha constant, which is a key component of transparency
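
As a rough illustration of how a transformation matrix and a saved/restored graphics state fit together, the following Python sketch models a state holding a matrix and an alpha constant. The six-number matrix form (a, b, c, d, e, f) is the compact affine form PDF uses; the class and function names are hypothetical and only two of the many graphics state properties are modelled.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GraphicsState:
    # Affine matrix in six-number form (a, b, c, d, e, f); identity by default.
    ctm: tuple = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
    alpha: float = 1.0


def concat(m, n):
    """Return the matrix that applies m first, then n."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def apply(m, x, y):
    """Map the point (x, y) through matrix m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def scale(sx, sy):
    return (sx, 0.0, 0.0, sy, 0.0, 0.0)


def rotate(theta):
    c, s = math.cos(theta), math.sin(theta)
    return (c, s, -s, c, 0.0, 0.0)


class StateStack:
    """Hypothetical helper: a current state plus a stack of saved states."""

    def __init__(self):
        self.current = GraphicsState()
        self._saved = []

    def save(self):
        self._saved.append(self.current)

    def restore(self):
        self.current = self._saved.pop()

    def transform(self, m):
        # The new matrix is applied before the existing one.
        self.current = replace(self.current, ctm=concat(m, self.current.ctm))
```

Saving before a transformation and restoring afterwards returns the coordinate system to what it was, which is why page descriptions commonly bracket local changes this way.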

Vector graphics

Vector graphics in PDF, as in PostScript, are constructed with paths. Paths are usually composed of lines and cubic Bézier curves, but can also be constructed from the outlines of text. Unlike PostScript, PDF does not allow a single path to mix text outlines with lines and curves. Paths can be stroked, filled, or used for clipping. Strokes and fills can use any color set in the graphics state, including patterns.
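
A cubic Bézier segment is defined by a start point, two control points, and an end point. The sketch below evaluates the standard cubic Bernstein form and flattens a segment into a polyline, which is one simple way a renderer might approximate the curve; the function names and the fixed segment count are illustrative choices, not part of PDF.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on the cubic Bézier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    w0, w1, w2, w3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    return (
        w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0],
        w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1],
    )


def flatten(p0, p1, p2, p3, segments=8):
    """Approximate the curve by segments + 1 evenly spaced sample points."""
    return [cubic_bezier(p0, p1, p2, p3, i / segments) for i in range(segments + 1)]
```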

PDF supports several types of patterns. The simplest is the tiling pattern in which a piece of artwork is specified to be drawn repeatedly. This may be a colored tiling pattern, with the colors specified in the pattern object, or an uncolored tiling pattern, which defers color specification to the time the pattern is drawn. Beginning with PDF 1.3 there is also a shading pattern, which draws continuously varying colors. There are seven types of shading pattern of which the simplest are the axial shade (Type 2) and radial shade (Type 3).

Raster images

Raster images in PDF (called Image XObjects) are represented by dictionaries with an associated stream. The dictionary describes properties of the image, and the stream contains the image data. (Less commonly, a raster image may be embedded directly in a page description as an inline image.) Images are typically filtered for compression purposes. Image filters supported in PDF include the general purpose filters
  • ASCII85Decode, a deprecated filter used to put the stream into 7-bit ASCII
  • ASCIIHexDecode, similar to ASCII85Decode but less compact
  • FlateDecode, a commonly used filter based on the zlib/deflate algorithm (a.k.a. gzip, but not zip) defined in RFC 1950 and RFC 1951; introduced in PDF 1.2; it can use one of two groups of predictor functions for more compact zlib/deflate compression: Predictor 2 from the TIFF 6.0 specification and predictors (filters) from the PNG specification (RFC 2083)
  • LZWDecode, a deprecated filter based on LZW compression; it can use one of two groups of predictor functions for more compact LZW compression: Predictor 2 from the TIFF 6.0 specification and predictors (filters) from the PNG specification
  • RunLengthDecode, a simple compression method for streams with repetitive data using the run-length encoding algorithm

and the image-specific filters
  • DCTDecode, a lossy filter based on the JPEG standard
  • CCITTFaxDecode, a lossless bi-level (black/white) filter based on the Group 3 or Group 4 CCITT (ITU-T) fax compression standard defined in ITU-T T.4 and T.6
  • JBIG2Decode, a lossy or lossless bi-level (black/white) filter based on the JBIG2 standard, introduced in PDF 1.4
  • JPXDecode, a lossy or lossless filter based on the JPEG 2000 standard, introduced in PDF 1.5
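
To make the general purpose filters more concrete, here is a minimal Python sketch of four decoders built on the standard library. It assumes the stream bytes have already been extracted from the file, ignores predictor functions entirely, and uses hypothetical function names; the end-of-data markers ("~>" for ASCII85, ">" for ASCIIHex, length byte 128 for run-length) follow the PDF specification as best recalled here.

```python
import base64
import binascii
import zlib


def ascii_hex_decode(data: bytes) -> bytes:
    # Stop at the '>' end-of-data marker, drop whitespace, pad an odd digit with 0.
    digits = b"".join(data.split(b">", 1)[0].split())
    if len(digits) % 2:
        digits += b"0"
    return binascii.unhexlify(digits)


def ascii85_decode(data: bytes) -> bytes:
    # Strip the trailing '~>' marker ourselves, then let the stdlib decode.
    body = data.strip()
    if body.endswith(b"~>"):
        body = body[:-2]
    return base64.a85decode(body)


def run_length_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        n = data[i]
        i += 1
        if n == 128:  # end of data
            break
        if n < 128:  # copy the next n + 1 bytes literally
            out += data[i:i + n + 1]
            i += n + 1
        else:  # repeat the next byte 257 - n times
            out += bytes([data[i]]) * (257 - n)
            i += 1
    return bytes(out)


def flate_decode(data: bytes) -> bytes:
    return zlib.decompress(data)


DECODERS = {
    "ASCIIHexDecode": ascii_hex_decode,
    "ASCII85Decode": ascii85_decode,
    "RunLengthDecode": run_length_decode,
    "FlateDecode": flate_decode,
}


def decode_stream(data: bytes, filters) -> bytes:
    """Apply a chain of filters in the order they are listed."""
    for name in filters:
        data = DECODERS[name](data)
    return data
```

Filters can be chained, so a stream that was deflated and then ASCII85-encoded would be decoded with `decode_stream(raw, ["ASCII85Decode", "FlateDecode"])`.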


Normally all image content in a PDF is embedded in the file. But PDF allows image data to be stored in external files by the use of external streams or Alternate Images. Standardized subsets of PDF, including PDF/A and PDF/X, prohibit these techniques.

Text

Text in PDF is represented by text elements in page content streams. A text element specifies that characters should be drawn at certain positions. The characters are specified using the encoding of a selected font resource.
Fonts

A font object in PDF is a description of a digital typeface. It may either describe the characteristics of a typeface, or it may include an embedded font file. The latter case is called an embedded font while the former is called an unembedded font. The font files that may be embedded are based on widely used standard digital font formats: Type 1 (and its compressed variant CFF), TrueType, and (beginning with PDF 1.6) OpenType. Additionally PDF supports the Type 3 variant in which the components of the font are described by PDF graphic operators.
Standard Type 1 Fonts (Standard 14 Fonts)

There are fourteen typefaces known as standard 14 fonts that have a special significance to PDF documents:
  • Times (v3) (in regular, italic, bold, and bold italic)
  • Courier (in regular, oblique, bold and bold oblique)
  • Helvetica (v3) (in regular, oblique, bold and bold oblique)
  • Symbol
  • Zapf Dingbats

These fonts are sometimes also referred to as the "base fourteen fonts". These fonts, or suitable substitute fonts with the same metrics, must always be available in all PDF readers and so need not be embedded in a PDF. PDF viewers must know about the metrics of these fonts. Other fonts may be substituted if they are not embedded in a PDF.
Encodings

Within text strings, characters are shown using character codes (integers) that map to glyphs in the current font using an encoding. There are a number of predefined encodings, including WinAnsi, MacRoman, and a large number of encodings for East Asian languages, and a font can have its own built-in encoding. (Although the WinAnsi and MacRoman encodings are derived from the historical properties of the Windows and Macintosh operating systems, fonts using these encodings work equally well on any platform.) A PDF font can specify a predefined encoding, use the font's built-in encoding, or provide a lookup table of differences to a predefined or built-in encoding (not recommended with TrueType fonts). The encoding mechanisms in PDF were designed for Type 1 fonts, and the rules for applying them to TrueType fonts are complex.
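
The "lookup table of differences" can be pictured as a list of overrides applied on top of a base encoding. The Python sketch below assumes the layout used by the PDF Differences array, in which an integer code is followed by one or more glyph names assigned to consecutive codes; the tiny base table and the function names are made up for illustration and are not a real predefined encoding.

```python
def apply_differences(base, differences):
    """Return a copy of base (code -> glyph name) with the overrides applied."""
    encoding = dict(base)
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item  # start a new run of consecutive codes
        else:
            if code is None:
                raise ValueError("glyph name appears before any starting code")
            encoding[code] = item
            code += 1
    return encoding


def glyph_names(encoding, codes):
    """Map each character code in a string to a glyph name."""
    return [encoding.get(c, ".notdef") for c in codes]


# Hypothetical three-entry base table standing in for a predefined encoding.
BASE = {65: "A", 66: "B", 67: "C"}
ENCODING = apply_differences(BASE, [66, "Euro", "bullet", 200, "fi"])
```

With these made-up tables, code 65 still maps to "A" while codes 66 and 67 are overridden, and code 200 gains a mapping it did not have in the base table.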

For large fonts or fonts with non-standard glyphs, the special encodings Identity-H (for horizontal writing) and Identity-V (for vertical) are used. With such fonts it is necessary to provide a ToUnicode table if semantic information about the characters is to be preserved.

Transparency

The original imaging model of PDF was, like PostScript's, opaque: each object drawn on the page completely replaced anything previously marked in the same location. In PDF 1.4 the imaging model was extended to allow transparency. When transparency is used, new objects interact with previously marked objects to produce blending effects. The addition of transparency to PDF was done by means of new extensions that were designed to be ignored in products written to the PDF 1.3 and earlier specifications. As a result, files that use a small amount of transparency might view acceptably in older viewers, but files making extensive use of transparency could be viewed incorrectly in an older viewer without warning.

The transparency extensions are based on the key concepts of transparency groups, blending modes, shape, and alpha. The model is closely aligned with the features of Adobe Illustrator version 9. The blend modes were based on those used by Adobe Photoshop at the time. When the PDF 1.4 specification was published the formulas for calculating blend modes were kept secret by Adobe. They have since been published.
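
As a hedged illustration of how blend modes and alpha interact, the sketch below composites one source color component over a backdrop. It is deliberately simplified: it assumes a fully opaque backdrop, a single component in the range 0 to 1, and no transparency groups or shape. The Multiply and Screen formulas are the commonly published ones; the function names are illustrative.

```python
def normal(cb, cs):
    """Normal mode: the source color simply replaces the backdrop."""
    return cs


def multiply(cb, cs):
    return cb * cs


def screen(cb, cs):
    return cb + cs - cb * cs


def composite(cb, cs, alpha, blend=normal):
    """Blend source cs over an opaque backdrop cb using the given alpha."""
    return (1.0 - alpha) * cb + alpha * blend(cb, cs)
```

With alpha equal to 1 and the normal blend mode, the result is just the source color, which reproduces the original opaque imaging model described above.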

The concept of a transparency group in the PDF specification is independent of existing notions of "group" or "layer" in applications such as Adobe Illustrator. Those groupings reflect logical relationships among objects that are meaningful when editing those objects, but they are not part of the imaging model.

Interactive elements

PDF files may contain interactive elements such as annotations and form fields.

Interactive Forms is a mechanism to add forms to the PDF file format.

PDF currently supports two different methods for integrating data and PDF forms. Both formats coexist in the PDF specification:
  • AcroForms (also known as Acrobat forms), introduced in the PDF 1.2 format specification and included in all later PDF specifications.
  • Adobe XML Forms Architecture (XFA) forms, introduced in the PDF 1.5 format specification. The XFA specification is not included in the PDF specification; it is only referenced as an optional feature. Adobe XFA Forms are not compatible with AcroForms.

AcroForms

AcroForms were introduced in the PDF 1.2 format. AcroForms permit using objects (text boxes, radio buttons, etc.) and some code (JavaScript).

Alongside the standard PDF action types, interactive forms (AcroForms) support submitting, resetting, and importing data. The "submit" action transmits the names and values of selected interactive form fields to a specified uniform resource locator (URL). Interactive form field names and values may be submitted in any of the following formats (depending on the settings of the action's ExportFormat, SubmitPDF, and XFDF flags):
  • HTML Form format (HTML 4.01 Specification since PDF 1.5; HTML 2.0 since 1.2)
  • Forms Data Format (FDF)
  • XML Forms Data Format (XFDF) (external XML Forms Data Format Specification, Version 2.0; supported since PDF 1.5; it replaced the "XML" form submission format defined in PDF 1.4.)
  • PDF (the entire document can be submitted rather than individual fields and values). (defined in PDF 1.4)
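
For the HTML Form format, the field names and values travel as ordinary URL-encoded form data. A minimal Python sketch of that encoding, using only the standard library and made-up field names, might look like this:

```python
from urllib.parse import parse_qs, urlencode


def encode_html_form(fields):
    """Encode field names and values as application/x-www-form-urlencoded text."""
    return urlencode(fields)


def decode_html_form(body):
    """Server-side counterpart: recover a dict of single-valued fields."""
    return {name: values[0] for name, values in parse_qs(body).items()}
```

The other formats in the list carry the same name and value pairs in a PDF-like or XML container instead.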


AcroForms can keep form field values in external stand-alone files containing key:value pairs. The external files may be in Forms Data Format (FDF) or XML Forms Data Format (XFDF). Usage rights (UR) signatures define the rights to import form data files in FDF, XFDF and text (CSV/TSV) formats, and to export form data files in FDF and XFDF formats.
Forms Data Format (FDF)


The Forms Data Format (FDF) is based on PDF: it uses the same syntax and has essentially the same file structure, but is much simpler than PDF, since the body of an FDF document consists of only one required object. Forms Data Format is defined in the PDF format specification (since PDF 1.2). The Forms Data Format can be used when submitting form data to a server, receiving the response, and incorporating it into the interactive form. It can also be used to export form data to stand-alone files that can be imported back into the corresponding PDF interactive form. Beginning in PDF 1.3, FDF can be used to define a container for annotations that are separate from the PDF document to which they apply. FDF is typically used to encapsulate information such as X.509 certificates, requests for certificates, directory settings, timestamp server settings, and embedded PDF files for network transmission. FDF uses the MIME content type application/vnd.fdf and the filename extension .fdf; on Mac OS it uses the file type 'FDF '. Support for importing and exporting FDF stand-alone files is not widely implemented in free or freeware PDF software. For example, there is no support in Evince, Okular, KPDF or Sumatra PDF. Import support for stand-alone FDF files is implemented in Adobe Reader; export and import support (including saving of FDF data in PDF) is implemented, for example, in Foxit Reader and PDF-XChange Viewer Free; saving of FDF data in a PDF file is also supported in pdftk.
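
Because FDF reuses PDF syntax and needs only one object in its body, a stand-alone file carrying field values can be very small. The Python sketch below writes such a file as text. The overall layout (header comment, one object holding an /FDF dictionary with a /Fields array of /T name and /V value pairs, and a trailer pointing at it) is a commonly seen minimal form and should be treated as an assumption to check against the specification; the function names are hypothetical, and only simple Latin text values are handled.

```python
def escape_pdf_string(text):
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_fdf(fields):
    """Return minimal FDF text for a dict of field name -> value."""
    entries = "".join(
        f"<< /T ({escape_pdf_string(name)}) /V ({escape_pdf_string(value)}) >>\n"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n" + entries + "] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )
```

A file produced this way could then be imported by a viewer that supports stand-alone FDF files, or merged into a form with a tool such as pdftk.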
XML Forms Data Format (XFDF)


XML Forms Data Format (XFDF) is the XML version of Forms Data Format, but XFDF implements only a subset of FDF containing forms and annotations. There are no XFDF equivalents for some entries in the FDF dictionary, such as the Status, Encoding, JavaScript, Pages, EmbeddedFDFs, Differences and Target keys. In addition, XFDF does not allow the spawning, or addition, of new pages based on the given data, as can be done when using an FDF file. The XFDF specification is referenced (but not included) in the PDF 1.5 specification (and in later versions). It is described separately in the XML Forms Data Format Specification. The PDF 1.4 specification allowed form submissions in XML format, but this was replaced by submissions in XFDF format in the PDF 1.5 specification. XFDF conforms to the XML standard. XFDF can be used the same way as FDF: for example, form data is submitted to a server, modifications are made, and the new form data is then sent back and imported into an interactive form. It can also be used to export form data to stand-alone files that can be imported back into the corresponding PDF interactive form. Support for importing and exporting XFDF stand-alone files is not widely implemented in free or freeware PDF software. Import of XFDF is implemented in Adobe Reader 5 and later versions; import and export is implemented in PDF-XChange Viewer Free; embedding of XFDF data in a PDF form is implemented in pdftk (pdf toolkit).
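
The same field data can be expressed in XFDF with ordinary XML tooling. The following Python sketch builds a small document with the standard library. The namespace URI and the xfdf, fields, field, and value element names reflect the XFDF format as commonly documented and should be read as assumptions here; the function name is hypothetical and annotations are not covered.

```python
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"  # assumed XFDF namespace URI


def build_xfdf(fields):
    """Return XFDF text for a dict of field name -> value."""
    ET.register_namespace("", XFDF_NS)
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    fields_el = ET.SubElement(root, f"{{{XFDF_NS}}}fields")
    for name, value in fields.items():
        field = ET.SubElement(fields_el, f"{{{XFDF_NS}}}field", {"name": name})
        ET.SubElement(field, f"{{{XFDF_NS}}}value").text = value
    return ET.tostring(root, encoding="unicode")
```

Since the result is plain XML, characters such as "&" and "<" in field values are escaped by the XML library rather than by PDF string rules.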

Adobe XML Forms Architecture (XFA)

In the PDF 1.5 format, Adobe Systems introduced a new, proprietary format for forms, namely Adobe XML Forms Architecture (XFA) forms. XFA 2.02 is referenced in the PDF 1.5 specification (and also in later versions) but is described separately in the Adobe XML Forms Architecture (XFA) Specification, which has several versions. Adobe XFA Forms are not compatible with AcroForms. Adobe Reader contains "disabled features" for use of XFA Forms, which will activate only when opening a PDF document that was created using enabling technology available only from Adobe. XFA Forms are not compatible with Adobe Reader prior to version 6.

XFA forms can be created and used as PDF files or as XDP (XML Data Package) files. The format of an XFA resource in PDF is described by the XML Data Package Specification. The XDP may be a standalone document or it may in turn be carried inside a PDF document. XDP provides a mechanism for packaging form components within a surrounding XML container. An XDP can also package a PDF file, along with XML form and template data. PDF may contain XFA (in XDP format), but also XFA may contain PDF. When the XFA (XML Forms Architecture) grammars used for an XFA form are moved from one application to another, they must be packaged as an XML Data Package.

When the PDF and XFA are combined, the result is a form in which each page of the XFA form overlays a PDF background. This architecture is sometimes referred to as XFAF (XFA Foreground). The alternative is to express all of the form, including boilerplate, directly in XFA. It is sometimes called full XFA.

Starting with PDF 1.5, the text contents of variable text form fields, as well as markup annotations may include formatting information (style information). These rich text strings are XML documents that conform to the rich text conventions specified for the XML Forms Architecture specification 2.02, which is itself a subset of the XHTML 1.0 specification, augmented with a restricted set of CSS2 style attributes.
In PDF 1.6, PDF supports the rich text elements and attributes specified in the XML Forms Architecture (XFA) Specification, 2.2.
In PDF 1.7, PDF supports the rich text elements and attributes specified in the XML Forms Architecture (XFA) Specification, 2.4.

Logical structure and accessibility

A PDF may contain structure information to enable better text extraction and accessibility. When published, PDF/UA, now ISO/AWI 14289, will provide definitive information on how the contents of PDF files are to be tagged with accurate structure information. Tagged PDFs also allow a page-limited reflow of documents for smaller devices, sometimes called "reflowable PDFs".

Security and signatures

A PDF file may be encrypted for security, or digitally signed for authentication.

The standard security provided by Acrobat PDF consists of two different methods and two different passwords, a "user password" and an "owner password". A PDF document may be protected by a password required to open it (the "user" password), and the document may also specify operations that should be restricted even when the document is decrypted: printing, copying text and graphics out of the document, modifying the document, or adding or modifying text notes and AcroForm fields (using the "owner" password). However, all restrictions (except the document-open password protection, if applicable) imposed through the "owner" or "user" passwords are trivially circumvented by many commonly available "PDF cracking" tools, some of them freely available online. Once circumvented, these restrictions no longer let the author control what can and cannot be done with the PDF file after it is distributed. A warning to this effect is also displayed when applying such restrictions using Adobe Acrobat software to create or edit PDF files.
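
One way to see why these restrictions are easy to bypass is that they are stored as a set of flags that a viewer chooses to honour. The Python sketch below tests such flags in an integer permissions value. The bit positions (bit 3 for printing, 4 for modifying, 5 for copying, 6 for annotations and form fields, counting from 1) are recalled from the PDF specification's standard security handler rather than taken from the text above, so treat them, and the constant names, as assumptions.

```python
# Assumed bit positions in the permissions integer (bit 1 is the lowest bit).
PRINT = 1 << 2      # bit 3: print the document
MODIFY = 1 << 3     # bit 4: modify the contents
COPY = 1 << 4       # bit 5: copy or extract text and graphics
ANNOTATE = 1 << 5   # bit 6: add or modify annotations and form fields


def allowed(permissions, flag):
    """True if the given operation is permitted by the flags."""
    return bool(permissions & flag)


def describe(permissions):
    names = {"print": PRINT, "modify": MODIFY, "copy": COPY, "annotate": ANNOTATE}
    return {name: allowed(permissions, flag) for name, flag in names.items()}
```

Nothing in the file format forces a program to call a check like `allowed` before printing or copying, which is consistent with the behaviour of viewers that ignore the flags.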

Even without removing the password, most freeware or open source PDF readers will ignore the permission "protections" and will allow the user to print or copy excerpts of the text as if the document were not limited by password protection.

Some solutions, like Adobe's LiveCycle Rights Management, are more robust means of information rights management, which can both restrict who can open documents and reliably enforce permissions in ways that the standard security handler does not.

Usage rights

Beginning with PDF 1.5, usage rights (UR) signatures are used to enable additional interactive features that are not available by default in a particular PDF viewer application. The signature is used to validate that the permissions have been granted by a bona fide granting authority. For example, it can be used to allow a user to:
  • save the PDF document along with modified form and/or annotation data
  • import form data files in FDF, XFDF and text (CSV/TSV) formats
  • export form data files in FDF and XFDF formats
  • submit form data
  • instantiate new pages from named page templates
  • apply a digital signature to an existing digital signature form field
  • create, delete, modify, copy, import, and export annotations


For example, Adobe Systems grants permissions to enable additional features in Adobe Reader, using public-key cryptography. Adobe Reader will verify that the signature uses a certificate from an Adobe-authorized certificate authority. The PDF 1.5 specification declares that other PDF viewer applications are free to use this same mechanism for their own purposes.

File attachments

PDF files can have document-level and page-level file attachments, which the reader can access and open or save to their local filesystem. PDF attachments can be added to existing PDF files, for example using pdftk. Adobe Reader provides support for attachments, and poppler-based readers like Evince or Okular also have some support for document-level attachments.

Metadata

PDF files can contain two types of metadata. The first is the Document Information Dictionary, a set of key/value fields such as author, title, subject, creation and update dates. This is stored in the optional Info trailer of the file. A small set of fields is defined, and can be extended with additional text values if required.
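
Dates in the Document Information Dictionary are stored as text strings rather than as a native date type. The Python sketch below parses the date layout used by PDF, D:YYYYMMDDHHmmSS followed by an optional time zone such as Z or +01'00', into a datetime. The layout is recalled from the PDF specification and not stated in the text above, so treat it as an assumption; the function name is hypothetical and missing trailing fields fall back to their lowest values.

```python
import re
from datetime import datetime, timedelta, timezone

_PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Zz])|([+-])(\d{2})'?(\d{2})?'?)?"
)


def parse_pdf_date(text):
    """Parse a PDF date string; returns a naive datetime if no zone is given."""
    m = _PDF_DATE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a PDF date string: {text!r}")
    year = int(m.group(1))
    month, day = int(m.group(2) or 1), int(m.group(3) or 1)
    hour, minute, second = (int(m.group(i) or 0) for i in (4, 5, 6))
    tz = None
    if m.group(7):
        tz = timezone.utc
    elif m.group(8):
        offset = timedelta(hours=int(m.group(9)), minutes=int(m.group(10) or 0))
        tz = timezone(offset if m.group(8) == "+" else -offset)
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)
```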

Later, in PDF 1.4, support was added for Metadata Streams, using the Extensible Metadata Platform (XMP) to add XML standards-based extensible metadata as used in other file formats. This allows metadata to be attached to any stream in the document, such as information about embedded illustrations, as well as to the whole document (by attaching it to the document catalog), using an extensible schema.

Subsets

Proper subsets of PDF have been, or are being, standardized under ISO for several constituencies:
  • PDF/X for the printing and graphic arts as ISO 15930 (work done in ISO TC130)
  • PDF/A for archiving in corporate/government/library/etc environments as ISO 19005 (work done in ISO TC171)
  • PDF/E for exchange of engineering drawings (work done in ISO TC171)
  • PDF/UA for universally accessible PDF files


A PDF/H variant (PDF for Healthcare) is being developed. However, it may consist more of a set of "best practices" than of a specific format or subset.

Mars

Adobe was exploring an XML-based next-generation PDF code-named Mars. Information about the Mars file format is published by Adobe at http://www.adobe.com/go/mars and also http://labs.adobe.com/wiki/index.php/Mars.

The format of graphic elements of Mars is sometimes described simply as "SVG", but according to the version 0.8 draft specification of November 2007 (§3 Mars SVG Support) the format is actually merely similar to SVG: it contains both additions to and subtractions from SVG, so it is in general neither viewable by nor creatable with standard SVG tools, and some things will look noticeably different between SVG viewers and Mars viewers.

The Mars format was effectively dropped in 2008.

Accessibility

PDF files can be created specifically to be accessible for disabled people. Current PDF file formats can include tags (XML), text equivalents, captions, audio descriptions, et cetera. Some software can automatically produce tagged PDFs, but this feature is not always enabled by default. Leading screen readers, including JAWS, Window-Eyes, Hal, and Kurzweil 1000 and 3000, can read tagged PDFs; current versions of the Acrobat and Acrobat Reader programs can also read PDFs aloud. Moreover, tagged PDFs can be re-flowed and magnified for readers with visual impairments. Problems remain with adding tags to older PDFs and those that are generated from scanned documents. In these cases, accessibility tags and re-flowing are unavailable, and must be created either manually or with OCR techniques. These processes are inaccessible to some disabled people. PDF/UA, the PDF/Universal Accessibility Committee, an activity of AIIM, is working on a specification for PDF accessibility based on ISO 32000.

One of the significant challenges with PDF accessibility is that PDF documents have three distinct views, which, depending on the document's creation, can be inconsistent with each other. The three views are (i) the physical view, (ii) the tags view, and (iii) the content view. The physical view is what is displayed and printed (what most people consider a PDF document). The tags view is what screen readers read (useful for people with poor eyesight). The content view is what is displayed when the document is re-flowed in Acrobat (useful for people with mobility disability). For a PDF document to be accessible, the three views must be consistent with each other.

Viruses and exploits

PDF attachments carrying viruses were first discovered in 2001. The virus, named "OUTLOOK.PDFWorm" or "Peachy", uses Microsoft Outlook to send itself as an attachment to an Adobe PDF file. It was activated with Adobe Acrobat, but not with Acrobat Reader.

From time to time, new vulnerabilities are discovered in various versions of Adobe Reader, prompting the company to issue security fixes. Other PDF readers are also susceptible. One aggravating factor is that a PDF reader can be configured to start automatically if a web page has an embedded PDF file, providing a vector for attack. If a malicious web page contains an infected PDF file that takes advantage of a vulnerability in the PDF reader, the system may be compromised even if the browser is secure. Some of these vulnerabilities are a result of the PDF standard allowing PDF documents to be scripted with JavaScript. Disabling JavaScript execution in the PDF reader can help mitigate such future exploits, although it will not provide protection against exploits in other parts of the PDF viewing software. Security experts say that JavaScript is not essential for a PDF reader, and that the security benefit that comes from disabling JavaScript outweighs any compatibility issues caused. One way of avoiding PDF file exploits is to have a local or web service convert files to another format before viewing.
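A simple precaution in the spirit of the paragraph above is to check a file for scripting-related names before opening it. The following is a minimal heuristic sketch, not a malware scanner: the list of names and the helper names are choices made for this example, and the scan will miss anything hidden inside compressed streams.

```python
import re

# Names that commonly accompany scripted or auto-running content.
# This list is an assumption made for the example, not an exhaustive one.
SUSPICIOUS_NAMES = (b"/JavaScript", b"/JS", b"/OpenAction", b"/Launch")


def decode_name_escapes(data: bytes) -> bytes:
    """Undo #xx hex escapes, which PDF names may use (e.g. /J#61vaScript)."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda match: bytes([int(match.group(1), 16)]),
        data,
    )


def count_suspicious_names(data: bytes) -> dict:
    """Count occurrences of each suspicious name in raw PDF bytes."""
    decoded = decode_name_escapes(data)
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require a non-alphanumeric character after the name so that a
        # longer name starting with the same letters does not match.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, decoded))
    return counts


# Hypothetical fragment of a PDF with an obfuscated JavaScript action.
SAMPLE = (
    b"1 0 obj << /Type /Catalog /OpenAction 2 0 R >> endobj\n"
    b"2 0 obj << /S /J#61vaScript /JS (app.alert(1)) >> endobj\n"
)
report = count_suspicious_names(SAMPLE)
```

A non-zero count does not prove that a file is malicious, and a zero count does not prove that it is safe; the result is only a hint about whether the file relies on scripting at all.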

On March 30, 2010 security researcher Didier Stevens reported an Adobe Reader and Foxit Reader exploit which runs a malicious executable if the user allows it to launch when asked.

Usage restrictions and monitoring

PDFs may be encrypted so that a password is needed to view or edit the contents. The PDF Reference defines both 40-bit and 128-bit encryption, both making use of a complex system of RC4 and MD5. The PDF Reference also defines ways in which third parties can define their own encryption systems for use in PDF.
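To make the two primitives concrete, here is a minimal sketch of the RC4 stream cipher, with MD5 used only to produce a 5-byte (40-bit) key. The key input `b"hypothetical key material"` is a stand-in: the actual key derivation in the PDF Reference combines several inputs and is deliberately not reproduced here.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    # Key-scheduling: permute the state array under control of the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: XOR each input byte with one keystream byte.
    output = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        output.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(output)


# Stand-in key: the first 5 bytes of an MD5 digest give 40 bits.
# This is NOT the PDF Reference's key derivation, only an illustration.
key_40_bit = hashlib.md5(b"hypothetical key material").digest()[:5]

ciphertext = rc4(key_40_bit, b"stream contents")
recovered = rc4(key_40_bit, ciphertext)
```

Applying `rc4` twice with the same key returns the original bytes, which is why a reader that knows the key can decrypt with the same routine used to encrypt.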

PDF files may also contain embedded DRM restrictions that provide further controls that limit copying, editing or printing. The restrictions on copying, editing, or printing depend on the reader software to obey them, so the security they provide is limited.
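The reason these restrictions are only advisory can be sketched as follows. The permissions are stored as a bit field that the reader software inspects; the bit positions below (3 for printing, 4 for modifying, 5 for copying, 6 for annotating, counting from 1) are given as we recall them from the PDF Reference and should be treated as an assumption, as should the function names.

```python
# Bit positions (1-based) of some user access permissions.
# Assumption: positions as recalled from the PDF Reference.
PERMISSION_BITS = {"print": 3, "modify": 4, "copy": 5, "annotate": 6}


def decode_permissions(p: int) -> dict:
    """Decode a signed 32-bit permissions value into named flags."""
    unsigned = p & 0xFFFFFFFF  # the value is stored as a signed integer
    return {
        name: bool(unsigned & (1 << (bit - 1)))
        for name, bit in PERMISSION_BITS.items()
    }


def may_print(p: int, honor_restrictions: bool = True) -> bool:
    """Decide whether to print; nothing forces a reader to honor the flag."""
    if not honor_restrictions:
        return True
    return decode_permissions(p)["print"]


locked_down = decode_permissions(-64)   # bits 3 to 6 all clear
print_only = decode_permissions(-60)    # only bit 3 set among bits 3 to 6
```

A cooperative reader calls something like `may_print(p)` and refuses; a reader that passes `honor_restrictions=False`, or simply never checks, prints anyway, which is the limitation described above.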

The PDF Reference has the technical details. Like HTML files, PDF files may submit information to a web server. This could be used to track the IP address of the client PC, a process known as phoning home. After update 7.0.5 to Acrobat Reader, the user will be notified "via a dialogue box that the author of the file is auditing usage of the file, and be offered the option of continuing."

Through its LiveCycle Policy Server product, Adobe provides a method to set security policies on specific documents. This can include requiring a user to authenticate and limiting the timeframe in which a document can be accessed or the amount of time a document can be opened while offline. Once a PDF document is tied to a policy server and a specific policy, that policy can be changed or revoked by the owner. This controls documents that are otherwise "in the wild." Each document open and close event can also be tracked by the policy server. Policy servers can be set up privately, or Adobe offers a public service through Adobe Online Services. As with other forms of DRM, adherence to these policies and restrictions may or may not be enforced by the reader software being used.

Default display settings

PDF documents can contain display settings, including the page display layout and zoom level. Adobe Reader will use these settings to override the user's default settings when opening the document. The free Adobe Reader cannot remove these settings.

Content

A PDF file is often a combination of vector graphics, text, and bitmap graphics. The basic types of content in a PDF are:
  • text stored as content streams (i.e., as drawing operators rather than plain text)
  • vector graphics for illustrations and designs that consist of shapes and lines
  • raster graphics for photographs and other types of image
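To show why text in a content stream is "not text" in the plain-file sense, the sketch below scans a toy, uncompressed content stream for strings shown with the `Tj` operator. The stream itself and the helper name `shown_strings` are made up for the example, and the regular expression deliberately ignores escaped parentheses, hexadecimal strings, `TJ` arrays and font encodings, all of which a real extractor would need to handle.

```python
import re

# Toy content stream: a text object (BT ... ET) followed by a filled
# rectangle, which is an example of vector graphics in the same stream.
CONTENT_STREAM = b"""\
BT
/F1 12 Tf
72 712 Td
(Hello, ) Tj
(world) Tj
ET
0 0 100 50 re
f
"""


def shown_strings(stream: bytes) -> list:
    """Return the literal strings passed to the Tj (show text) operator."""
    matches = re.findall(rb"\(([^()\\]*)\)\s*Tj", stream)
    return [match.decode("latin-1") for match in matches]


text = "".join(shown_strings(CONTENT_STREAM))
```

The words only become readable after the operators are interpreted, which is one reason text extraction from PDF files is less straightforward than reading a plain text file.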


In later PDF revisions, a PDF document can also support links (within the document or to a web page), forms, JavaScript (initially available as a plugin for Acrobat 3.0), or any other types of embedded content that can be handled using plug-ins.

PDF 1.6 supports interactive 3D documents embedded in the PDF: 3D drawings can be embedded using U3D or PRC and various other data formats.

Two PDF files that look similar on a computer screen may be of very different sizes. For example, a high resolution raster image takes more space than a low resolution one. Typically, higher resolution is needed for printing documents than for displaying them on screen. Other things that may increase the size of a file are embedding full fonts, especially for Asiatic scripts, and storing text as graphics.
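The effect of resolution on size can be estimated with simple arithmetic. The sketch below computes the uncompressed size of an 8 by 10 inch colour image at a screen-like and a print-like resolution; the dimensions, the resolutions of 72 and 300 dots per inch, and the 3 bytes per pixel are illustrative assumptions, and images in real PDF files are usually compressed.

```python
def raster_bytes(width_in: float, height_in: float, dpi: int,
                 bytes_per_pixel: int = 3) -> int:
    """Uncompressed size in bytes of a raster image of the given size."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


screen_size = raster_bytes(8, 10, 72)    # 576 x 720 pixels
print_size = raster_bytes(8, 10, 300)    # 2400 x 3000 pixels
ratio = print_size / screen_size         # grows with the square of the dpi
```

Under these assumptions the print-resolution image needs about 21.6 million bytes against about 1.2 million for the screen-resolution one, a factor of roughly 17, even though both look alike when scaled to fit a window.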

Implementations

PDF-viewing software is generally provided free of charge, and many versions are available from a variety of sources (List of PDF software).

There are many software options for creating PDFs, including the PDF printing capabilities built in to Mac OS X and most Linux distributions, the multi-platform OpenOffice.org, Microsoft Office 2007 (if updated to SP2), WordPerfect since version 9, Scribus, Free PDF XP and numerous PDF print drivers for Microsoft Windows, the pdfTeX typesetting system, the DocBook PDF tools, applications developed around Ghostscript and Adobe Acrobat itself. Google's online office suite Google Docs also allows uploading and saving in the Portable Document Format.

Raster image processors (RIPs) are used to convert PDF files into a raster format suitable for imaging onto paper and other media in printers, digital production presses and prepress in a process known as rasterisation. RIPs capable of processing PDF directly include the Adobe PDF Print Engine from Adobe Systems and Jaws and the Harlequin RIP from Global Graphics.

Editing PDFs (structure)

There is also specialized software for editing PDF files, though the choices are much more limited and often expensive. As of version 0.46, Inkscape also allows PDF editing through an intermediate translation step involving Poppler.

Enfocus PitStop Pro, a plugin for Acrobat, allows manual and automatic editing of PDF files, while the free Enfocus Browser makes it possible to edit the low-level structure of a PDF.

See List of PDF software for a more complete list of PDF editors.

Annotating PDFs

Adobe Acrobat is one example of proprietary software that allows the user to annotate, highlight, and add notes to already created PDF files. One Unix application available as free software (under the GNU General Public License) is PDFedit. Another GPL-licensed application native to the Unix environment is Xournal. Xournal allows for annotating in different fonts and colours, and provides a ruler for quickly underlining and highlighting lines of text or paragraphs. Xournal also has a shape recognition tool for squares, rectangles and circles. In Xournal, annotations may be moved, copied and pasted. The freeware Foxit Reader, available for Microsoft Windows, allows annotating documents. Tracker Software's PDF-XChange Viewer allows annotations and markups without restrictions in its freeware edition. Preview, the integrated PDF viewer in Apple's Mac OS X, also enables annotations. For mobile annotation, iAnnotate PDF for the iPad and Aji Annotate for the iPhone, both produced by Aji, allow annotation of PDFs as well as exporting summaries of the annotations.

There are also web annotation systems which allow annotating PDF and other document formats, e.g. A.nnotate, crocodoc and WebNotes.

In cases where PDFs are expected to have all of the functionality of paper documents, ink annotation is required. Many programs which accept ink input from the mouse are not responsive enough for handwriting using an input tablet or tablet PC. Existing solutions on the PC include Bluebeam PDF Revu and PDF Annotator.

Other applications and functionalities

Several applications embracing the PDF standard are now available as an online service, including Scribd for viewing and storing, Pdfvue for online editing, and Zamzar for PDF conversion.

In 1993 the Jaws RIP from Global Graphics became the first shipping prepress RIP that interpreted PDF natively without conversion to another format. The company released an upgrade to their Harlequin RIP with the same capability in 1997.

Agfa-Gevaert introduced and shipped Apogee, the first prepress workflow system based on PDF, in 1997.

Many commercial offset printers have accepted the submission of press-ready PDF files as a print source, specifically the PDF/X-1a subset and variations of the same. Submitting press-ready PDF files replaces the problematic need to receive collected native working files.

PDF was selected as the "native" metafile format for Mac OS X, replacing the PICT format of the earlier Mac OS. The imaging model of the Quartz graphics layer is based on the model common to Display PostScript and PDF, leading to the nickname "Display PDF". The Preview application can display PDF files, as can version 2.0 and later of the Safari web browser. System-level support for PDF allows Mac OS X applications to create PDF documents automatically, provided they support the Print command. The files are then exported in PDF 1.3 format according to the file header. When taking a screenshot under Mac OS X versions 10.0 through 10.3, the image was also captured as a PDF; in 10.4 and 10.5 the default behaviour is set to capture as a PNG file, though this behaviour can be set back to PDF if required.
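The version stated in the file header can be read from the first bytes of a file. The sketch below is a minimal example that assumes the header sits at the very start of the data and has the form `%PDF-M.N`; the function name and the sample bytes are made up for illustration, and the header is only what the file declares about itself.

```python
import re


def pdf_header_version(data: bytes):
    """Return (major, minor) from a %PDF-M.N header, or None if absent."""
    match = re.match(rb"%PDF-(\d)\.(\d+)", data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical first bytes of a file written by a system print-to-PDF path.
sample_version = pdf_header_version(b"%PDF-1.3\n%\xc4\xe5\xf2\xe5\n1 0 obj\n")
not_a_pdf = pdf_header_version(b"<html>not a PDF</html>")
```

With a real file, the first few bytes can be read with `open(path, "rb").read(16)` and passed to the function.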

Some desktop printers also support direct PDF printing, interpreting PDF data without external help. Currently, all PDF-capable printers also support PostScript, but most PostScript printers do not support direct PDF printing.

The Free Software Foundation considers one of their high priority projects to be "developing a free, high-quality and fully functional set of libraries and programs that implement the PDF file format and associated technologies to the ISO 32000 standard." The GNUpdf library has, however, not been released yet, while Poppler has enjoyed wider use in applications such as Evince, which comes with the GNOME desktop environment, at the expense of relying on the GPLv2-licensed Xpdf code base that can't be used by GPLv3 programs. There are also commercial development libraries available as listed in List of PDF software.

The Apache PDFBox project of the Apache Software Foundation is an open-source Java library for working with PDF documents. PDFBox is licensed under the Apache License.

See also

  • Comparison of OpenXPS and PDF
  • DjVu
  • List of ISO standards
  • List of PDF software
  • PAdES, PDF Advanced Electronic Signature
  • OpenXPS
  • Open XML Paper Specification
  • Web document
  • XSL Formatting Objects




The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 