Metadata
The term metadata is ambiguous: it is used for two fundamentally different concepts (types). Although the expression "data about data" is often used, it does not apply to both in the same way. Structural metadata, the design and specification of data structures, cannot be about data, because at design time the application contains no data. In this case the correct description would be "data about the containers of data". Descriptive metadata, on the other hand, is about individual instances of application data, the data content. In this case, a useful description (resulting in a disambiguating neologism) would be "data about data contents" or "content about content", thus metacontent. Descriptive metadata, guide metadata and the National Information Standards Organization concept of administrative metadata are all subtypes of metacontent.

Metadata (metacontent) is traditionally found in the card catalogs of libraries. As information has become increasingly digital, metadata is also used to describe digital data using metadata standards specific to a particular discipline. By describing the contents and context of data files, the quality of the original data or files is greatly increased. For example, a webpage may include metadata specifying what language it is written in, what tools were used to create it, and where to go for more on the subject, allowing browsers to automatically improve the experience of users.

Definition

Metadata (metacontent) is defined as data providing information about one or more aspects of the data, such as:
  • Means of creation of the data
  • Purpose of the data
  • Time and date of creation
  • Creator or author of data
  • Placement on a computer network where the data was created
  • Standards used
  • The basic information of a piece of music

For example, a digital image may include metadata that describes how large the picture is, the color depth, the image resolution, when the image was created, and other data. A text document's metadata may contain information about how long the document is, who the author is, when the document was written, and a short summary of the document.
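
As a minimal sketch of the kind of record described above, the following Python snippet models the metadata of a text document as a small data structure kept separate from the document content. The class name `DocumentMetadata`, its field names and the sample values are illustrative assumptions, not part of any metadata standard.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class DocumentMetadata:
    """Hypothetical record holding data about a text document."""
    author: str
    created: date
    word_count: int
    summary: str


def describe(text: str, author: str, created: date, summary: str) -> DocumentMetadata:
    # The length is derived from the content itself; the other fields are
    # supplied by whoever catalogues the document.
    return DocumentMetadata(
        author=author,
        created=created,
        word_count=len(text.split()),
        summary=summary,
    )


content = "Metadata is data that describes other data."
record = describe(content, "A. Writer", date(2011, 5, 1), "One-sentence note on metadata.")
print(asdict(record))
```

Part of the record (the length) can be computed from the content, while the rest has to be supplied from outside it.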

Metadata is data. As such, metadata can be stored and managed in a database, often called a metadata registry or metadata repository. However, without context it is impossible to identify metadata just by looking at it, because a user would not know whether a given piece of data is metadata or just data.

Libraries

Metadata has been used in various forms as a means of cataloging archived information. The Dewey Decimal System employed by libraries for the classification of library materials is an early example of metadata usage. Library catalogues used 3x5 inch cards to display a book's title, author, subject matter, and a brief plot synopsis, along with an abbreviated alphanumeric identification system which indicated the physical location of the book within the library's shelves.
Such data helps classify, aggregate, identify, and locate a particular book. Another form of older metadata collection is the US Census Bureau's use of what is known as the "Long Form". The Long Form asks questions whose answers are used to create demographic data and to find patterns of distribution.
The term metadata was coined in 1968 by Philip Bagley, one of the pioneers of computerized document retrieval. Since then the fields of information management, information science, information technology, librarianship and GIS have widely adopted the term. In these fields the word metadata is defined as "data about data".
While this is the generally accepted definition, various disciplines have adopted their own more specific explanations and uses of the term.

For the purposes of this article, an "object" refers to any of the following:
  • A physical item such as a book, CD, DVD, map, chair, table, flower pot, etc.
  • An electronic file such as a digital image, digital photo, document, program file, database table, etc.

Photographs

Metadata may be written into a digital photo file to identify who owns it, to record copyright and contact information, the camera that created the file and the exposure settings, and to add descriptive information such as keywords about the photo, making the file searchable on a computer or on the Internet. Some metadata is written by the camera, and some is entered by the photographer or by software after the file has been downloaded to a computer.

Photographic metadata standards are governed by the organizations that develop them. They include, but are not limited to:
  • IPTC Information Interchange Model (IIM), from the International Press Telecommunications Council
  • IPTC Core Schema for XMP
  • XMP – Extensible Metadata Platform (an Adobe standard)
  • Exif – Exchangeable image file format, maintained by CIPA (Camera & Imaging Products Association) and published by JEITA (Japan Electronics and Information Technology Industries Association)
  • Dublin Core (Dublin Core Metadata Initiative – DCMI)
  • PLUS (Picture Licensing Universal System)

Video

Metadata is particularly useful for video, whose contents are not directly understandable by a computer but for which efficient search is desirable. Such metadata includes transcripts of conversations and text descriptions of scenes.

Web pages

Web pages often include metadata in the form of meta tags. Description and keywords meta tags are commonly used to describe the Web page's content. Search engines can use this data when adding pages to their search index.
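
To illustrate how such tags can be read by a program, here is a minimal sketch that collects the `name`/`content` pairs of `<meta>` elements using Python's standard `html.parser` module. The class name `MetaTagParser` and the sample page are assumptions made for this example; real indexing software is considerably more involved.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical page used only for illustration.
PAGE = """<html><head>
<title>Example</title>
<meta name="description" content="An introduction to metadata.">
<meta name="keywords" content="metadata, cataloging, Dublin Core">
</head><body></body></html>"""

parser = MetaTagParser()
parser.feed(PAGE)
print(parser.meta["description"])
print(parser.meta["keywords"].split(", "))
```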

Creation of metadata

Metadata can be created either by automated information processing or by manual work. Elementary metadata captured by computers can include information about when a file was created, who created it, when it was last updated, file size and file extension.
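
The automated case can be sketched with a few standard-library calls. The snippet below reads elementary metadata that the operating system already keeps for a file. The function name `elementary_metadata` and the dictionary keys are assumptions for illustration; creation time and file owner are recorded differently across platforms, so the sketch leaves them out.

```python
import os
import tempfile
from datetime import datetime, timezone


def elementary_metadata(path):
    """Return metadata the operating system already tracks for a file."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "extension": os.path.splitext(path)[1],
        "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello")
    sample_path = handle.name

captured = elementary_metadata(sample_path)
print(captured)
os.remove(sample_path)
```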

Metadata types

Metadata is applied in a wide variety of fields, and there are specialised and well-accepted models for specifying types of metadata. Bretherton & Singley (1994) distinguish between two distinct classes: structural/control metadata and guide metadata. Structural metadata is used to describe the structure of computer systems such as tables, columns and indexes. Guide metadata is used to help humans find specific items and is usually expressed as a set of keywords in a natural language. According to Ralph Kimball, metadata can be divided into two similar categories: technical metadata and business metadata. Technical metadata corresponds to internal metadata, business metadata to external metadata. Kimball adds a third category named process metadata. NISO, on the other hand, distinguishes between three types of metadata: descriptive, structural and administrative. Descriptive metadata is the information used to search for and locate an object, such as title, author, subjects, keywords and publisher; structural metadata describes how the components of the object are organised; and administrative metadata refers to technical information, including file type. Two sub-types of administrative metadata are rights management metadata and preservation metadata.
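
Structural metadata of the "tables, columns and indexes" kind can be inspected directly in most database systems. The following sketch uses Python's built-in `sqlite3` module with an in-memory database; the table `book` and the index `idx_book_author` are hypothetical names chosen for the example. Note that the table holds no rows, yet its structural metadata is fully available.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# Hypothetical table; it is created but never populated.
connection.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL, author TEXT)"
)
connection.execute("CREATE INDEX idx_book_author ON book (author)")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in connection.execute("PRAGMA table_info(book)")]

# sqlite_master lists the schema objects themselves.
indexes = [
    row[0]
    for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'book'"
    )
]

row_count = connection.execute("SELECT COUNT(*) FROM book").fetchone()[0]
print(columns)
print(indexes)
print(row_count)
connection.close()
```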

Metadata structures

Metadata (metacontent), or more correctly, the vocabularies used to assemble metadata (metacontent) statements, is typically structured according to a standardized concept using a well-defined metadata scheme, including metadata standards and metadata models. Tools such as controlled vocabularies, taxonomies, thesauri, data dictionaries and metadata registries can be used to apply further standardization to the metadata.

Metadata syntax

Metadata (metacontent) syntax refers to the rules created to structure the fields or elements of metadata (metacontent). A single metadata scheme may be expressed in a number of different markup or programming languages, each of which requires a different syntax. For example, Dublin Core may be expressed in plain text, HTML, XML and RDF.
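
As a rough illustration of one scheme expressed in two syntaxes, the sketch below renders the same small record first as HTML `<meta>` elements using the `DC.` name prefix and then as XML elements in the Dublin Core element namespace. The record values, the function names and the wrapper element `metadata` are assumptions made for the example, and the output is not claimed to be a complete or validated Dublin Core encoding.

```python
import xml.etree.ElementTree as ET
from html import escape

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record using three Dublin Core element names.
record = {"title": "Topology", "creator": "J. Doe", "language": "en"}


def to_html_meta(fields):
    """Express the record as HTML <meta> elements with the DC. prefix."""
    return "\n".join(
        f'<meta name="DC.{name}" content="{escape(value)}">'
        for name, value in fields.items()
    )


def to_xml(fields):
    """Express the same record as XML elements in the Dublin Core namespace."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


print(to_html_meta(record))
print(to_xml(record))
```

The element names and values are identical in both outputs; only the syntax that carries them differs.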

A common example of (guide) metacontent is the bibliographic classification, the subject, the Dewey Decimal class number. There is always an implied statement in any "classification" of some object. To classify an object as, for example, Dewey class number 514 (Topology) (e.g. a book has this number on the spine), the implied statement is: "<book><subject heading><514>". This is a subject-predicate-object triple, or more importantly, a class-attribute-value triple. The first two elements of the triple (class, attribute) are pieces of some structural metadata having a defined semantic. The third element is a value, preferably from some controlled vocabulary, some reference (master) data. The combination of the metadata and master data elements results in a statement which is a metacontent statement, i.e. "metacontent = metadata + master data". All these elements can be thought of as "vocabulary". Both metadata and master data are vocabularies which can be assembled into metacontent statements.

There are many sources of these vocabularies, both meta and master data: UML, EDIFACT, XSD, Dewey/UDC/LoC, SKOS, ISO-25964, Pantone, Linnaean Binomial Nomenclature etc. Using controlled vocabularies for the components of metacontent statements, whether for indexing or finding, is endorsed by ISO-25964: "If both the indexer and the searcher are guided to choose the same term for the same concept, then relevant documents will be retrieved." This is particularly relevant when considering that the behemoth of the internet, Google, is simply indexing and then matching text strings; no intelligence or "inferencing" is occurring.
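
The class-attribute-value triple discussed above can be sketched as a small data structure whose third element is checked against a controlled vocabulary. Everything named here is an assumption for illustration: the `Statement` type, the `classify` helper, the labels `book` and `subject heading`, and the two-entry excerpt of class numbers.

```python
from typing import NamedTuple

# Hypothetical controlled vocabulary: a tiny excerpt of class numbers.
DEWEY_CLASSES = {"514": "Topology", "516": "Geometry"}


class Statement(NamedTuple):
    """A class-attribute-value triple."""
    cls: str        # structural metadata: the kind of object described
    attribute: str  # structural metadata: the property being stated
    value: str      # master data: a term from a controlled vocabulary


def classify(cls: str, attribute: str, value: str) -> Statement:
    # Reject values outside the controlled vocabulary so that indexer and
    # searcher are guided to the same term for the same concept.
    if value not in DEWEY_CLASSES:
        raise ValueError(f"{value!r} is not in the controlled vocabulary")
    return Statement(cls, attribute, value)


statement = classify("book", "subject heading", "514")
print(statement, "->", DEWEY_CLASSES[statement.value])
```

Validating the value at the moment the statement is assembled is one simple way to keep free-text variants out of the index.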

Hierarchical, linear and planar schemata

Metadata schemata can be hierarchical in nature, where relationships exist between metadata elements and elements are nested so that parent-child relationships exist between them.
An example of a hierarchical metadata schema is the IEEE LOM schema, in which metadata elements may belong to a parent metadata element.
Metadata schemata can also be one-dimensional, or linear, where each element is completely discrete from other elements and classified according to one dimension only.
An example of a linear metadata schema is the Dublin Core schema.
Metadata schemata are often two-dimensional, or planar, where each element is completely discrete from other elements but classified according to two orthogonal dimensions.
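
The three shapes can be contrasted with plain Python mappings. This is only a sketch: the element names are hypothetical and are not taken from the actual IEEE LOM or Dublin Core element sets, and the choice of (element, language) as the two planar dimensions is an assumption.

```python
# Linear: every element is discrete and sits on a single dimension.
linear = {"title": "Topology", "creator": "J. Doe", "date": "2011"}

# Hierarchical: elements are nested under parent elements.
hierarchical = {
    "general": {"title": "Topology", "language": "en"},
    "lifecycle": {"contribute": {"role": "author", "entity": "J. Doe"}},
}

# Planar: each element is addressed by two orthogonal dimensions,
# here (element, language).
planar = {
    ("title", "en"): "Topology",
    ("title", "de"): "Topologie",
    ("creator", "en"): "J. Doe",
}


def depth(schema):
    """Nesting depth of a dict-based schema; a flat mapping has depth 1."""
    if not isinstance(schema, dict):
        return 0
    return 1 + max((depth(child) for child in schema.values()), default=0)


print(depth(linear), depth(hierarchical))
print(planar[("title", "de")])
```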

Metadata hypermapping

In all cases where the metadata schemata exceed the planar depiction, some type of hypermapping is required to enable the display and viewing of metadata according to a chosen aspect and to serve special views. Hypermapping frequently applies to the layering of geographical and geological information overlays.

Granularity

Granularity is a term that applies to data as well as to metadata. The degree to which metadata is structured is referred to as its granularity. Metadata with a high granularity allows for deeper structured information and enables greater levels of technical manipulation. A lower level of granularity means that metadata can be created at considerably lower cost, but it will not provide information as detailed. The major impact of granularity is not only on creation and capture but, even more, on maintenance. As soon as the metadata structures become outdated, access to the data they refer to becomes outdated as well. Hence the choice of granularity should take into account the effort to create the metadata as well as the effort to maintain it.

Metadata standards

International standards apply to metadata. Much work is being accomplished in the national and international standards communities, especially ANSI (American National Standards Institute) and ISO (International Organization for Standardization), to reach consensus on standardizing metadata and registries.

The core standard is ISO/IEC 11179-1:2004 and subsequent standards (see ISO/IEC 11179). All registrations published so far under this standard cover only the definition of metadata; they do not address the structuring of metadata storage or retrieval, nor any administrative standardisation. It is important to note that this standard refers to metadata as data about containers of data and not to metadata (metacontent) as data about data contents. It should also be noted that this standard originally described itself as a "data element" registry, describing disembodied data elements, and explicitly disavows the capability of containing complex structures. Thus the original term "data element" is more applicable than the later applied buzzword "metadata".

Data Virtualization

Data virtualization has emerged as a software technology that completes the virtualization stack in the enterprise. Metadata is used in data virtualization servers, which are enterprise infrastructure components alongside database and application servers. Metadata in these servers is saved in a persistent repository and describes business objects in various enterprise systems and applications.

SVN Checkout Metadata

Hidden .svn files and directories created by an SVN checkout in the web root folder can reveal crucial information about the code repository.

Statistics and census services

Standardization work has had a large impact on efforts to build metadata systems in the statistical community. Metadata standards have been applied at the Census Bureau, the Environmental Protection Agency, the Bureau of Labor Statistics, Statistics Canada, and many other agencies, where a metadata registry in particular can have a significant impact.

Library and information science

Libraries employ metadata in library catalogues, most commonly as part of an Integrated Library Management System (ILMS). Metadata is obtained by cataloguing resources such as books, periodicals, DVDs, web pages or digital images. This data is stored in the ILMS using the MARC metadata standard. The purpose is to direct patrons to the physical or electronic location of the items or areas they seek, as well as to provide a description of the items in question.

More recent and specialized instances of library metadata include the establishment of digital libraries, including e-print repositories and digital image libraries. While often based on library principles, their focus on non-librarian use, especially in providing metadata, means they do not follow traditional or common cataloging approaches. Given the custom nature of the included materials, metadata fields are often specially created, e.g. taxonomic classification fields, location fields, keywords or copyright statements. Standard file information such as file size and format is usually included automatically.

Standardization for library operation has been a key topic in international standardization (ISO) for decades. Standards for metadata in digital libraries include Dublin Core, METS, MODS, DDI, the ISO standard Digital Object Identifier (DOI), the IETF standard Uniform Resource Name (URN), the PREMIS schema, Ecological Metadata Language, and OAI-PMH. Leading libraries around the world have published information on their metadata standards strategies.

United States

Problems involving metadata in litigation in the United States are becoming widespread. Courts have looked at various questions involving metadata, including the discoverability of metadata by parties. Although the Federal Rules of Civil Procedure have only specified rules about electronic documents, subsequent case law has elaborated on the requirement of parties to reveal metadata. In October 2009, the Arizona Supreme Court ruled that metadata records are public record.

Document metadata has proven particularly important in legal environments where litigants have requested metadata, which can include sensitive information detrimental to a party in court.

Using metadata removal tools to "clean" documents can mitigate the risks of unwittingly sending sensitive data. This process partially (see data remanence) protects law firms from potentially damaging leaks of sensitive data through electronic discovery.

Metadata in healthcare

Australian medical researchers began much of the work on metadata definition for applications in health care. That approach was the first recognised attempt to adhere to international standards in the medical sciences instead of first defining a proprietary standard under the WHO umbrella.

The medical community has not yet accepted the need to follow metadata standards, despite this research.

Metadata and data warehousing

A data warehouse (DW) is a repository of an organization's electronically stored data. Data warehouses are designed to manage and store the data, whereas business intelligence (BI) focuses on the usage of data to facilitate reporting and analysis.

The purpose of a data warehouse is to house standardized, structured, consistent, integrated, correct, cleansed and timely data, extracted from various operational systems in an organization. The extracted data is integrated in the data warehouse environment in order to provide an enterprise-wide perspective, one version of the truth. Data is structured specifically to address the reporting and analytic requirements.

An essential component of a data warehouse/business intelligence system is the metadata and the tools to manage and retrieve metadata. Ralph Kimball describes metadata as the DNA of the data warehouse, as metadata defines the elements of the data warehouse and how they work together.

Kimball
Ralph Kimball
Ralph Kimball is an author on the subject of data warehousing and business intelligence. He is widely regarded as one of the original architects of data warehousing and is known for long-term convictions that data warehouses must be designed to be understandable and fast...

 et al. refers to three main categories of metadata: Technical metadata, business metadata and process metadata. Technical metadata is primarily definitional while business metadata and process metadata are primarily descriptive. Keep in mind that the categories sometimes overlap.
  • Technical metadata defines the objects and processes in a DW/BI system, as seen from a technical point of view. The technical metadata includes the system metadata which defines the data structures such as: Tables, fields, data types, indexes and partitions in the relational engine, and databases, dimensions, measures, and data mining models. Technical metadata defines the data model and the way it is displayed for the users, with the reports, schedules, distribution lists and user security rights.

  • Business metadata is content from the data warehouse described in more user-friendly terms. The business metadata tells you what data you have, where it comes from, what it means and what its relationship is to other data in the data warehouse. Business metadata may also serve as documentation for the DW/BI system. Users who browse the data warehouse are primarily viewing the business metadata.

  • Process metadata is used to describe the results of various operations in the data warehouse. Within the ETL process, all key data from tasks are logged on execution. This includes start time, end time, CPU seconds used, disk reads, disk writes and rows processed. When troubleshooting the ETL or query process, this sort of data becomes valuable. Process metadata is the fact measurement when building and using a DW/BI system. Some organizations make a living out of collecting and selling this sort of data to companies; in that case the process metadata becomes the business metadata for the fact and dimension tables. Process metadata is of interest to business people, who can use the data to identify the users of their products, which products they are using and what level of service they are receiving.
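The logging of process metadata described in the last item can be illustrated with a minimal Python sketch. It records start time, end time, CPU seconds and rows processed for one task; disk reads and writes are left out because they are platform-specific. The names TaskRunLog, run_logged and drop_empty_names are hypothetical and do not come from Kimball or from any particular ETL tool.

```python
import time
from dataclasses import dataclass


@dataclass
class TaskRunLog:
    """One process-metadata record for a single ETL task execution."""
    task_name: str
    start_time: float
    end_time: float
    cpu_seconds: float
    rows_processed: int


def run_logged(task_name, task, rows):
    """Run task(rows) and return the result together with its run log."""
    start_wall = time.time()
    start_cpu = time.process_time()
    result = task(rows)
    log = TaskRunLog(
        task_name=task_name,
        start_time=start_wall,
        end_time=time.time(),
        cpu_seconds=time.process_time() - start_cpu,
        rows_processed=len(result),
    )
    return result, log


def drop_empty_names(rows):
    """Hypothetical transform step: keep only rows with a non-empty name."""
    return [row for row in rows if row.get("name")]


source_rows = [{"name": "Ann"}, {"name": ""}, {"name": "Bob"}]
clean_rows, run_log = run_logged("drop_empty_names", drop_empty_names, source_rows)
print(run_log)
```

In a real warehouse such records would typically be written to a log table, where they can be queried when troubleshooting.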

Metadata on the Internet

The HTML format used to define web pages allows for the inclusion of a variety of types of metadata, from basic descriptive text, dates and keywords to more advanced metadata schemes such as the Dublin Core, e-GMS, and AGLS standards. Pages can also be geotagged with coordinates. Metadata may be included in the page's header or in a separate file. Microformats allow metadata to be added to on-page data in a way that users do not see, but computers can readily access.
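As a rough illustration of header metadata, the following Python sketch collects the name/content pairs of the meta elements in a page using the standard library's html.parser module. The sample page, its values and the class name MetaCollector are made up for this example; the DC.title and geo.position names follow commonly used conventions for Dublin Core and geotagging in HTML.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name] = content


# Hypothetical sample page; none of these values come from a real site.
page = """<html><head>
<title>Sample page</title>
<meta name="description" content="A hypothetical sample page">
<meta name="keywords" content="metadata, example">
<meta name="DC.title" content="Sample page">
<meta name="geo.position" content="50.63;5.57">
</head><body><p>Visible text.</p></body></html>"""

collector = MetaCollector()
collector.feed(page)
for name, content in collector.meta.items():
    print(f"{name}: {content}")
```

None of the collected values appear in the rendered page body, which is the sense in which this metadata is invisible to users but accessible to programs.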

Many search engines are cautious about using metadata in their ranking algorithms, because metadata has been exploited through the practice of search engine optimization (SEO) to improve rankings. See the Meta element article for further discussion.

Metadata in the broadcast industry

In the broadcast industry, metadata are linked to audio and video broadcast media to:
  • identify the media: clip or playlist names, duration, timecode, etc.
  • describe the content: notes regarding the quality of video content, rating, description (for example, during a sport event, keywords like goal or red card will be associated with some clips)
  • classify the media: metadata make it possible to sort the media or to find video content quickly and easily (a TV news programme could urgently need archive content for a subject).
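The three uses above can be sketched in a few lines of Python. This is only an illustration: the Clip fields, the clip names and the timecodes are hypothetical and are not taken from any broadcast system.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    """Identification and descriptive metadata for one media clip."""
    name: str
    duration_s: float
    timecode_in: str
    keywords: set = field(default_factory=set)


def find_clips(clips, keyword):
    """Return the clips tagged with the given keyword."""
    return [clip for clip in clips if keyword in clip.keywords]


# Hypothetical archive entries logged during a football match.
archive = [
    Clip("clip_001", 12.0, "00:14:03:10", {"goal"}),
    Clip("clip_002", 8.5, "00:27:41:02", {"red card"}),
    Clip("clip_003", 15.0, "01:02:19:20", {"goal", "replay"}),
]

goal_clips = find_clips(archive, "goal")
print([clip.name for clip in goal_clips])
```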


These metadata can be linked to the video media thanks to the video servers. Recent broadcast sporting events such as the FIFA World Cup or the Olympic Games use these metadata to distribute their video content to TV stations through keywords. It is often the host broadcaster who is in charge of organizing metadata through its International Broadcast Centre and its video servers. Those metadata are recorded with the images and are entered by metadata operators (loggers), who associate, live, the metadata available in metadata grids through software (such as Multicam(LSM) or IPDirector, used during the FIFA World Cup or Olympic Games).

Geospatial metadata

Metadata that describe geographic objects (such as datasets, maps, features, or simply documents with a geospatial component) have a history dating back to at least 1994 (see the MIT Library page on FGDC Metadata). This class of metadata is described more fully on the Geospatial metadata page.

Ecological & environmental metadata

Ecological and environmental metadata are intended to document the who, what, when, where, why, and how of data collection for a particular study. Metadata should be generated in a format commonly used by the most relevant science community, such as Darwin Core, Ecological Metadata Language, or Dublin Core. Metadata editing tools exist to facilitate metadata generation (e.g. Metavist, Mercury: Metadata Search System, Morpho). Metadata should describe provenance of the data (where it originated, as well as any transformations the data underwent) and how to give credit for (cite) the data products.

Metadata on CDs and DVDs

CDs such as music recordings carry a layer of metadata about the recordings, such as dates, artist, genre and copyright owner. The metadata, not normally displayed by CD players, can be accessed and displayed by specialized music playback and/or editing applications.

Cloud applications

With the availability of cloud applications, which include those to add metadata to content, metadata is increasingly available over the Internet.

Metadata storage

Metadata can be stored either internally, in the same file as the data, or externally, in a separate file. Metadata that is embedded with content is called embedded metadata. A data repository typically stores the metadata detached from the data. Both ways have advantages and disadvantages:
  • Internal storage allows metadata to be transferred together with the data it describes; thus, metadata is always at hand and can be manipulated easily. This method creates high redundancy and does not allow metadata to be held together in one place.
  • External storage allows metadata to be bundled, for example in a database, for more efficient searching. There is no redundancy, and metadata can be transferred simultaneously when using streaming. However, as most formats use URIs for that purpose, the method by which the metadata is linked to its data should be treated with care. What if a resource does not have a URI (resources on a local hard disk, or web pages that are created on-the-fly using a content management system)? What if metadata can only be evaluated if there is a connection to the Web, especially when using RDF? How can it be detected that a resource has been replaced by another with the same name but different content?
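External storage and the last question above can be illustrated with a small Python sketch. It keeps the metadata in a separate "sidecar" file next to the data and records a SHA-256 checksum of the data in it. The checksum is an assumption added for this example, one possible way to notice a replaced resource, and is not something the text prescribes. The file names and the functions write_sidecar and still_describes are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def write_sidecar(data_path, metadata):
    """Store metadata externally, next to the data file, with a content checksum."""
    record = dict(metadata)
    record["sha256"] = hashlib.sha256(data_path.read_bytes()).hexdigest()
    sidecar_path = data_path.with_name(data_path.name + ".meta.json")
    sidecar_path.write_text(json.dumps(record, indent=2))
    return sidecar_path


def still_describes(data_path, sidecar_path):
    """Return True if the data file still has the content the sidecar was written for."""
    record = json.loads(sidecar_path.read_text())
    return record["sha256"] == hashlib.sha256(data_path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as folder:
    data_file = Path(folder) / "clip.dat"
    data_file.write_bytes(b"original content")
    sidecar = write_sidecar(data_file, {"title": "Sample clip"})
    unchanged = still_describes(data_file, sidecar)

    # Same name, different content: the stored checksum no longer matches.
    data_file.write_bytes(b"replacement content")
    replaced_detected = not still_describes(data_file, sidecar)

print(unchanged, replaced_detected)
```

The sketch links metadata to data by file name only, so it also shows the weakness noted above: if the data file is renamed or moved, the link is lost.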


Moreover, there is the question of data format: storing metadata in a human-readable format such as XML can be useful because users can understand and edit it without specialized tools. On the other hand, these formats are not optimized for storage capacity; it may be useful to store metadata in a binary, non-human-readable format instead to speed up transfer and save memory.
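The trade-off can be made concrete with a short Python sketch that serializes the same small record once as XML and once as packed binary. The record, its field names and the fixed three-integer binary layout are assumptions chosen for the example, not a real metadata format.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical metadata record for one audio track.
record = {"track": 7, "year": 1997, "duration_s": 215}

# Human-readable form: one XML element per field.
root = ET.Element("metadata")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
xml_bytes = ET.tostring(root, encoding="utf-8")

# Binary form, assuming a fixed layout of three little-endian unsigned 16-bit integers.
binary_bytes = struct.pack(
    "<HHH", record["track"], record["year"], record["duration_s"]
)

print(xml_bytes.decode("utf-8"))
print(len(xml_bytes), "bytes as XML;", len(binary_bytes), "bytes as packed binary")

# Reading the binary form back requires knowing the layout in advance.
track, year, duration_s = struct.unpack("<HHH", binary_bytes)
```

The binary form is much smaller, but it cannot be read or edited without a tool that knows the layout, whereas the XML form carries its field names with it.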

Database management

Each relational database system has its own mechanisms for storing metadata. Examples of relational-database metadata include:
  • Tables of all tables in a database, their names, sizes and number of rows in each table.
  • Tables of columns in each database, what tables they are used in, and the type of data stored in each column.

In database terminology, this set of metadata is referred to as the catalog. The SQL standard specifies a uniform means to access the catalog, called the information schema, but not all databases implement it, even if they implement other aspects of the SQL standard. For an example of database-specific metadata access methods, see Oracle metadata. Programmatic access to metadata is possible using APIs such as JDBC or SchemaCrawler.
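As a small illustration of database-specific catalog access, the following Python sketch uses the standard library's sqlite3 module. SQLite is used here only as a convenient example of a system that does not provide the information schema: its catalog is exposed through the sqlite_master table and the PRAGMA table_info statement instead. The album and track tables are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
)
connection.execute(
    "CREATE TABLE track (id INTEGER PRIMARY KEY, album_id INTEGER, name TEXT)"
)

# SQLite's catalog table holds one row per table, index, view or trigger.
table_names = [
    row[0]
    for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = {
    table: [
        (row[1], row[2])
        for row in connection.execute(f"PRAGMA table_info({table})")
    ]
    for table in table_names
}

for table, cols in columns.items():
    print(table, cols)
connection.close()
```

On a database that does implement the information schema, the same questions would instead be answered by querying views such as information_schema.tables and information_schema.columns.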

See also

  • Vocabulary OneSource
  • Agris: International Information System for the Agricultural Sciences and Technology
  • Classification scheme
  • Crosswalk (metadata)
  • Data Dictionary (aka metadata repository)
  • Dublin Core
  • Folksonomy
  • GEOMS – Generic Earth Observation Metadata Standard
  • ISO/IEC 11179
  • IPDirector
  • Knowledge tag
  • Meta element
  • Multicam(LSM)
  • Metadata from Wikiversity
  • Metadata discovery
  • Metadata facility for Java
  • Metadata Access Point Interface
  • Metadata publishing
  • Metadata registry
  • METAFOR Common Metadata for Climate Modelling Digital Repositories
  • Mercury: Metadata Search System
  • Microcontent
  • Microformat
  • Ontology (computer science)
  • Official statistics
  • Preservation Metadata
  • SDMX
  • Semantic Web
  • SGML
  • The Metadata Company
  • Universal Data Element Framework
  • XSD
  • DataONE


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 