Open format

An open file format is a published specification for storing digital data, usually maintained by a standards organization, which can therefore be used and implemented by anyone. For example, an open format can be implemented by both proprietary and free and open source software, using the typical software licenses used by each. In contrast to open formats, closed formats are considered trade secrets. Open formats are also called free file formats if they are not encumbered by any copyrights, patents, trademarks or other restrictions (for example, if they are in the public domain), so that anyone may use them at no monetary cost for any desired purpose.

Sun Microsystems

Sun Microsystems defines the criteria for open formats as follows:
  • The format is based on an underlying open standard
  • The format is developed through a publicly visible, community driven process
  • The format is affirmed and maintained by a vendor-independent standards organization
  • The format is fully documented and publicly available
  • The format does not contain proprietary extensions

US government

Within the framework of the Open Government Initiative, the federal government of the United States adopted the Open Government Directive, according to which: "An open format is one that is platform independent, machine readable, and made available to the public without restrictions that would impede the re-use of that information".

State of Minnesota

The State of Minnesota defines the criteria for open, XML-based file formats as follows:
  • The format is interoperable among diverse internal and external platforms and applications
  • The format is fully published and available royalty-free
  • The format is implemented by multiple vendors
  • The format is controlled by an open industry organization with a well-defined inclusive process for evolution of the standard

Commonwealth of Massachusetts

The Commonwealth of Massachusetts "defines open formats as specifications for data file formats that are based on an underlying open standard, developed by an open community, affirmed and maintained by a standards body and are fully documented and publicly available."

The Enterprise Technical Reference Model (ETRM) classifies four formats as "Open Formats":
  1. OASIS Open Document Format For Office Applications (OpenDocument) v. 1.1
  2. Ecma-376 Office Open XML Formats (Open XML)
  3. Hypertext Document Format v. 4.01
  4. Plain Text Format

The Linux Information Project

According to The Linux Information Project, the term open format should refer to "any format that is published for anyone to read and study but which may or may not be encumbered by patents, copyrights or other restrictions on use", as opposed to a free format, which is not encumbered by any copyrights, patents, trademarks or other restrictions.

Multimedia

  • ALAC — lossless audio codec, previously a proprietary format of Apple Inc.
  • CMML — timed metadata and subtitles
  • DAISY Digital Talking Book — a talking book format
  • FLAC — lossless audio codec
  • JPEG 2000 — an image format standardized by ISO/IEC
  • Matroska (mkv) — container for all types of multimedia formats (audio, video, images, subtitles)
  • MNG — moving pictures, based on PNG
  • Musepack — an audio codec
  • Ogg — container for Vorbis, FLAC, Speex (audio formats) and Theora (a video format)
  • PNG — a raster image format standardized by ISO/IEC
  • SMIL — a media playlisting format and multimedia integration language
  • Speex — speech codec
  • SVG — a vector image format standardized by W3C
  • VRML/X3D — realtime 3D data formats standardized by ISO/IEC
  • WavPack — "Hybrid" (lossless/lossy) audio codec
  • WebM — a video/audio format
  • XSPF — a playlist format for multimedia

Text

  • ASCII — a plain text character encoding
  • DVI — device-independent file format (TeX)
  • ePub — open e-book standard by the International Digital Publishing Forum (IDPF)
  • LaTeX — document markup language
  • Office Open XML — a formatted text format (ISO/IEC 29500:2008); see Licensing for details
  • OpenDocument — a formatted text format (ISO/IEC 26300:2006)
  • OpenXPS — open standard for a page description language and a fixed-document format
  • PDF — open standard for document exchange (ISO 15930-1:2001, ISO 19005-1:2005, ISO 32000-1:2008). PDF started out as a proprietary standard, but was later submitted for standardization
  • PostScript — a page description language and programming language. PostScript started out as a proprietary standard, but was later submitted for standardization
  • Rich Text Format — a formatted text format (proprietary, published specification, defined and maintained only by Microsoft)
  • Unicode — a text character format
  • UTF-8 — text encoding with support for all common languages and scripts
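
As a small illustration of the ASCII and UTF-8 entries above, the following Python sketch compares the bytes produced by the two encodings. The helper name encode_report and the sample strings are assumptions made for this example and do not come from any specification. It shows that pure ASCII text yields identical bytes under both encodings, while a non-ASCII character takes more than one byte in UTF-8.

```python
def encode_report(text: str) -> dict:
    """Return byte-level facts about `text` under ASCII and UTF-8.

    `encode_report` is a hypothetical helper written for this example.
    """
    utf8 = text.encode("utf-8")
    try:
        ascii_bytes = text.encode("ascii")
    except UnicodeEncodeError:
        ascii_bytes = None  # the text contains characters outside ASCII
    return {
        "utf8_length": len(utf8),
        "is_ascii": ascii_bytes is not None,
        "same_bytes": ascii_bytes == utf8,
    }


# Sample inputs (illustrative only).
plain = encode_report("open format")
accented = encode_report("caf\u00e9")  # "cafe" with an acute accent on the e
```

Because both encodings are openly specified, any implementation that follows the specifications should produce the same bytes for the same text.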

Archiving and compression

  • 7z — for archiving and/or compression
  • bzip2 — for compression
  • gzip — for compression
  • MAFF — for web page archiving, based on ZIP
  • PAQ — for compression
  • SQX — for archiving and/or compression
  • tar — for archiving
  • xz — for compression
  • ZIP — for archiving and/or compression; the base format is in the public domain, but newer versions have some patented features
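
Several of the formats listed above can be read and written with the Python standard library alone. The sketch below round-trips one payload through gzip, bzip2, xz, tar and ZIP entirely in memory. The payload text and the member name example.txt are assumptions made for this example. It also reflects the distinction drawn in the list: gzip, bzip2 and xz compress a single stream, tar only bundles files, and ZIP does both.

```python
import bz2
import gzip
import io
import lzma
import tarfile
import zipfile

# Illustrative payload; repetitive so that it compresses well.
payload = b"open formats can be implemented by anyone\n" * 100

# Stream compressors: each one compresses a single byte stream.
roundtrips = {
    "gzip": gzip.decompress(gzip.compress(payload)),
    "bzip2": bz2.decompress(bz2.compress(payload)),
    "xz": lzma.decompress(lzma.compress(payload, format=lzma.FORMAT_XZ)),
}

# tar only archives (bundles files); it is written uncompressed here.
tar_buffer = io.BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w") as archive:
    info = tarfile.TarInfo(name="example.txt")  # hypothetical member name
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
tar_buffer.seek(0)
with tarfile.open(fileobj=tar_buffer, mode="r") as archive:
    roundtrips["tar"] = archive.extractfile("example.txt").read()

# ZIP both archives and compresses (DEFLATE is used here).
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("example.txt", payload)
with zipfile.ZipFile(zip_buffer) as archive:
    roundtrips["zip"] = archive.read("example.txt")

gzip_size = len(gzip.compress(payload))
```

Each entry in roundtrips should equal the original payload, which is the practical meaning of a lossless, openly documented format: independent tools can recover exactly what was stored.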

Other

  • CSS — style sheet format usually used with (X)HTML, standardized by W3C
  • CSV — comma-separated values, commonly used for spreadsheets or simple databases
  • DjVu — file format for scanned images or documents
  • EAS3 — binary file format for floating-point data
  • ELF — Executable and Linkable Format
  • FreeOTFE — container for encrypted data
  • Hierarchical Data Format — multi-platform data format for storing multidimensional arrays, among other data structures
  • HTML/XHTML — markup language for web pages (ISO/IEC 15445:2000)
  • iCalendar — calendar data format
  • JSON — object notation, a subset of YAML and of ECMAScript syntax
  • LTFS — Linear Tape File System
  • NetCDF — for scientific data
  • NZB — for multipart binary files on Usenet
  • PHP — scripting and markup language for web development
  • RSS — syndication
  • SDXF — the Structured Data eXchange Format
  • SFV — checksum format
  • TrueCrypt — container for encrypted data
  • WebDAV — Internet filesystem format
  • XML — a general-purpose markup language, standardized by W3C
  • YAML — human-readable data serialization format
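
To make the CSV and JSON entries above concrete, here is a minimal Python sketch that writes the same small table in both formats and reads it back using only the standard library. The records and field names are assumptions chosen for this example.

```python
import csv
import io
import json

# Illustrative records; the field names are assumptions for this sketch.
records = [
    {"format": "PNG", "kind": "raster image"},
    {"format": "SVG", "kind": "vector image"},
]

# JSON: serialize to text, then parse the text back into Python objects.
json_text = json.dumps(records)
from_json = json.loads(json_text)

# CSV: write a header row plus one row per record, then read it back.
csv_buffer = io.StringIO()
writer = csv.DictWriter(csv_buffer, fieldnames=["format", "kind"])
writer.writeheader()
writer.writerows(records)
csv_buffer.seek(0)
from_csv = [dict(row) for row in csv.DictReader(csv_buffer)]
```

Note that CSV carries no type information, so every value comes back as a string; the round trip is exact here only because the sample values are strings to begin with.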

See also

  • Open standard
  • Free software
  • Open source
  • List of open source codecs
  • Open system
  • Free protocol
  • Vendor lock-in
  • Embrace, extend and extinguish
  • Network effect


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.