Metadata



 
 
Metadata (meta data, or sometimes metainformation) is "data about other data", of any sort in any media. An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items and hierarchical levels, for example a database schema. In data processing, metadata is definitional data that provides information about or documentation of other data managed within an application or environment. The term should be used with caution as all data is about something, and is therefore metadata.

For example, metadata would document data about data elements or attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.), and data about data (where it is located, how it is associated, ownership, etc.). Metadata may include descriptive information about the context, quality and condition, or characteristics of the data. It may be recorded with high or low granularity.
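As a rough illustration of the kinds of metadata listed above, the following sketch records metadata about data elements (name, data type, size) and about a record (its fields, location and owner), then uses that metadata to check a record. The class names, the `violations` method and the sample values are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    """Metadata about one data element (hypothetical structure)."""
    name: str
    data_type: type
    max_size: int


@dataclass(frozen=True)
class RecordMetadata:
    """Metadata about a record: its fields, where it is stored, who owns it."""
    fields: tuple
    location: str
    owner: str

    def violations(self, record: dict) -> list:
        """Compare a record with its metadata and list any mismatches."""
        problems = []
        for field in self.fields:
            value = record.get(field.name)
            if not isinstance(value, field.data_type):
                problems.append(f"{field.name}: expected {field.data_type.__name__}")
            elif len(str(value)) > field.max_size:
                problems.append(f"{field.name}: longer than {field.max_size}")
        return problems


# Assumed sample metadata for a library catalogue record.
book_meta = RecordMetadata(
    fields=(FieldMetadata("title", str, 40), FieldMetadata("pages", int, 5)),
    location="catalog.db",
    owner="library",
)
print(book_meta.violations({"title": "Dune", "pages": "412"}))
```

The record itself never states that "pages" must be an integer; that knowledge lives only in the metadata.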

Purpose

Metadata provides context for data.

Metadata is used to facilitate the understanding, use, and management of data. The metadata required for effective data management varies with the type of data and context of use. In a library, where the data is the content of the titles stocked, metadata about a title would typically include a description of the content, the author, the publication date and the physical location.

Examples of metadata


Book

Examples of metadata regarding a book would be the title, author, date of publication, subject, a unique identifier (such as an International Standard Book Number), its dimensions, number of pages, and the language of the text.

Photograph

Metadata for a photograph would typically include the date and time at which it was taken and details of the camera settings (such as focal length, aperture, exposure). Many digital cameras record metadata in exchangeable image file format (EXIF).

Audio

Metadata for a digital audio file, such as an MP3 of a song, might include the album name, song title, genre, year, composer, contributing artist, track number and album art.

Web page

The HTML used to mark up web pages allows for the inclusion of a variety of types of metadata, from simple descriptive text, dates and keywords to highly granular information such as the Dublin Core and e-GMS standards. Pages can be geotagged with coordinates. Metadata may be included in the page's header or in a separate file. Microformats allow on-page data to be marked up as metadata. The Hypertext Transfer Protocol used to transfer web pages also includes metadata.
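A minimal sketch of reading metadata from a page's header, using only Python's standard `html.parser` module. The sample page, the `MetaTagCollector` class and the `extract_meta` helper are assumptions made for illustration; the `DC.creator` name follows the common convention for embedding Dublin Core elements in `meta` tags.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.metadata[name] = content


# Hypothetical page: the header carries metadata, the body carries the data.
PAGE = """<html><head>
<title>Example page</title>
<meta name="description" content="A short example page">
<meta name="keywords" content="metadata, example">
<meta name="DC.creator" content="A. Author">
</head><body><p>Body text is the data.</p></body></html>"""


def extract_meta(html_text):
    """Return a dict of the metadata found in the page's <meta> tags."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    return collector.metadata


print(extract_meta(PAGE))
```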

Levels

The hierarchy of metadata descriptions can go on forever, but usually context or semantic understanding makes extensively detailed explanations unnecessary.

The role played by any particular datum depends on the context. For example, when considering the geography of London, "E8 3BJ" would be a datum and "Post Code" would be a metadatum. But, when considering the data management of an automated system that manages geographical data, "Post Code" might be a datum and then "data item name" and "6 characters, starting with A–Z" would be metadata.

In any particular context, metadata characterizes the data it describes, not the entity described by that data. So, in relation to "E8 3BJ", the datum "is in London" is a further description of the place in the real world which has the post code "E8 3BJ", not of the code itself. Therefore, although it is providing information connected to "E8 3BJ" (telling us that this is the post code of a place in London), this would not normally be considered metadata, as it is describing "E8 3BJ" as a place in the real world and not as data.
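The two levels in the post code example might be sketched as follows. The dictionary layout and the `matches_format` helper are hypothetical; the format rule is the one quoted in the text, not a full description of real post codes.

```python
import re

# Level 1: a datum and the metadatum that names it.
datum = "E8 3BJ"
datum_metadata = {"data item name": "Post Code"}

# Level 2: "Post Code" is now the datum, described by its own metadata.
post_code_metadata = {
    "data item name": "Post Code",
    "format": "6 characters, starting with A-Z",
}


def matches_format(value: str) -> bool:
    """Apply the format rule quoted in the text to a candidate value."""
    return len(value) == 6 and re.match(r"[A-Z]", value) is not None


# A fact about the place rather than about the code, so it is kept apart
# from the metadata in this context.
place_facts = {"E8 3BJ": "is in London"}

print(matches_format(datum))
```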

Definitions


Etymology

Meta is a classical Greek preposition (μετ’ αλλων εταιρων) and prefix (μεταβασις) conveying the following senses in English, depending upon the case of the associated noun: among; along with; with; by means of; in the midst of; after; behind. In epistemology, the word means "about (its own category)"; thus metadata is "data about the data".

Varying definitions

The term was introduced intuitively, without a formal definition. Because of that, today there are various definitions. The most common one is the literal translation:
  • "Data about data are referred to as metadata."


Example: "12345" is data, and with no additional context is meaningless. When "12345" is given a meaningful name (metadata) of "ZIP code", one can understand (by further placing "ZIP code" within the context of a postal address) that "12345" refers to the General Electric plant in Schenectady, New York.

Because, for most people, the difference between data and information is merely a philosophical one of no relevance in practical use, other definitions are:
  • Metadata is information about data.
  • Metadata is information about information.
  • Metadata contains information about that data or other data


There are more sophisticated definitions, such as:
  • "Metadata is structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities."
  • "[Metadata is a set of] optional structured descriptions that are publicly available to explicitly assist in locating objects."


These are used more rarely because they tend to concentrate on one purpose of metadata (to find "objects", "entities" or "resources") and ignore others, such as using metadata to optimize compression algorithms, or to perform additional computations using the data.

The metadata concept has been extended into the world of systems to include any "data about data": the names of tables, columns, programs, and the like. Different views of this "system metadata" are detailed below, but beyond that is the recognition that metadata can describe all aspects of systems: data, activities, people and organizations involved, locations of data and processes, access methods, limitations, timing and events, as well as motivation and rules.

Fundamentally, then, metadata is "the data that describe the structure and workings of an organization's use of information, and which describe the systems it uses to manage that information". To do a model of metadata is to do an "Enterprise model" of the information technology industry itself.

Metadata and Markup

In the context of the web and the work of the W3C in providing the markup technologies HTML, XML and SGML, the concept of metadata has a specific context that is perhaps clearer than in other information domains. With markup technologies there is metadata, markup and data content. The metadata describes characteristics about the data, while the markup identifies the specific type of data content and acts as a container for that document instance. This page in Wikipedia is itself an example of such usage, where the textual information is data; how it is packaged, linked, referenced, styled and displayed is markup; and aspects and characteristics of that markup are metadata set globally across Wikipedia.

In the context of markup the metadata is architected to allow optimization of document instances to contain only a minimum amount of metadata, while the metadata itself is likely referenced externally such as in a schema definition (XSD) instance. Markup also provides specialised mechanisms that handle referential data, again avoiding confusion over what is metadata or data, and allowing optimizations. The reference and ID mechanisms in markup allow reference links between related data items, and links to data items that can then be repeated about a data item, such as an address or product details. These are then all themselves simply more data items and markup instances rather than metadata.

Similarly there are concepts such as classifications, ontologies and associations for which markup mechanisms are provided. A data item can then be linked to such categories via markup, thereby providing a clean delineation between what is metadata and what are actual data instances. Therefore the concepts and descriptions in a classification would be metadata, but the actual classification entry for a data item is simply another data instance.

Some examples can illustrate the points here. Items in bold are data content, in italic are metadata, normal text items are all markup.

The two examples show in-line use of metadata within markup relating to a data instance (XML) compared to simple markup (HTML).

A simple HTML instance example:

<span style="normalText">Example</span>

And then an XML instance example with metadata:

<PersonMiddleName nillable="true">John</PersonMiddleName>

Where the inline assertion that a person's middle name may be an empty data item is metadata about the data item. Such definitions however are usually not placed inline in XML. Instead these definitions are moved away into the schema definition that contains the metadata for the entire document instance. This again illustrates another important aspect of metadata in the context of markup. The metadata is optimally defined only once for a collection of data instances. Hence repeated items of markup are rarely metadata, but rather more markup data instances themselves.
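To make the three roles concrete, the sketch below splits one XML element into markup (the tag), inline metadata (the attribute) and data content (the text), using the standard `xml.etree.ElementTree` module. The element name `PersonMiddleName` and the `split_element` helper are assumed for illustration.

```python
import xml.etree.ElementTree as ET


def split_element(xml_text):
    """Separate an element into markup, inline metadata and data content."""
    element = ET.fromstring(xml_text)
    return {
        "markup": element.tag,              # the container
        "metadata": dict(element.attrib),   # assertions about the data item
        "data": element.text,               # the data content itself
    }


# Hypothetical element name; the inline attribute mirrors the example above.
parts = split_element('<PersonMiddleName nillable="true">John</PersonMiddleName>')
print(parts)
```

In practice, as the text notes, an assertion like `nillable` would normally live once in the schema rather than being repeated on every instance.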

Hierarchies of metadata

When structured into a hierarchical arrangement, metadata is more properly called an ontology or schema. Both terms describe "what exists" for some purpose or to enable some action. For instance, the arrangement of subject headings in a library catalog serves not only as a guide to finding books on a particular subject in the stacks, but also as a guide to what subjects "exist" in the library's own ontology and how more specialized topics are related to or derived from the more general subject headings.

Metadata is frequently stored in a central location and used to help organizations standardize their data. This information is typically stored in a metadata registry.

Difference between data and metadata

Usually it is not possible to distinguish between (plain) data and metadata because:
  • Something can be data and metadata at the same time. The headline of an article is both its title (metadata) and part of its text (data).
  • Data and metadata can change their roles. A poem, as such, would be regarded as data, but if there is a song that uses it as lyrics, the whole poem could be attached to an audio file of the song as metadata. Thus, the labeling depends on the point of view.


These considerations apply no matter which of the above definitions is considered, except where explicit markup is used to denote what is data and what is metadata.

Use

Metadata has many different applications; this section lists some of the most common.

Metadata is used to speed up and enrich searching for resources. In general, search queries using metadata can save users from performing more complex filter operations manually. It is now common for web browsers (with the notable exception of Mozilla Firefox), P2P applications and media management software to automatically download and locally cache metadata, to improve the speed at which files can be accessed and searched.

Metadata may also be associated with files manually. This is often the case with documents which are scanned into a document storage repository such as FileNet or Documentum. Once the documents have been converted into an electronic format, a user brings the image up in a viewer application, manually reads the document and keys values into an online application to be stored in a metadata repository.

Metadata provides additional information to users of the data it describes. This information may be descriptive ("These pictures were taken by children in the school's third grade class.") or algorithmic ("Checksum=139F").
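A small sketch of descriptive and algorithmic metadata side by side. It assumes SHA-256 from Python's `hashlib` as the checksum algorithm, which is a substitution made for the example and not the algorithm behind the "139F" value quoted above; the function names are hypothetical.

```python
import hashlib


def make_metadata(payload: bytes, description: str) -> dict:
    """Attach a descriptive note and a computed checksum to some data."""
    return {
        "description": description,                        # descriptive
        "checksum": hashlib.sha256(payload).hexdigest(),   # algorithmic
    }


def is_intact(payload: bytes, metadata: dict) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    return hashlib.sha256(payload).hexdigest() == metadata["checksum"]


photo = b"raw image bytes"
photo_meta = make_metadata(photo, "Pictures taken by the third grade class")
print(is_intact(photo, photo_meta))
```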

Metadata helps to bridge the semantic gap. By telling a computer how data items are related and how these relations can be evaluated automatically, it becomes possible to process even more complex filter and search operations. For example, if a search engine understands that "Van Gogh" was a "Dutch painter", it can answer a search query on "Dutch painters" with a link to a web page about Vincent Van Gogh, although the exact words "Dutch painters" never occur on that page. This approach, called knowledge representation, is of special interest to the semantic web and artificial intelligence.
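The "Dutch painters" query can be sketched with a handful of subject/predicate/object statements. This is a toy illustration under assumed names (`FACTS`, `PAGES`, `find_pages`) with placeholder URLs, not a description of how any particular search engine works.

```python
# Relations between data items, stated as (subject, predicate, object).
FACTS = {
    ("Vincent van Gogh", "nationality", "Dutch"),
    ("Vincent van Gogh", "occupation", "painter"),
    ("Claude Monet", "nationality", "French"),
    ("Claude Monet", "occupation", "painter"),
}

# Placeholder page addresses keyed by subject.
PAGES = {
    "Vincent van Gogh": "https://example.org/van-gogh",
    "Claude Monet": "https://example.org/monet",
}


def find_pages(nationality: str, occupation: str) -> list:
    """Return pages whose subject matches both relations."""
    by_nation = {s for s, p, o in FACTS if p == "nationality" and o == nationality}
    by_job = {s for s, p, o in FACTS if p == "occupation" and o == occupation}
    return sorted(PAGES[s] for s in by_nation & by_job if s in PAGES)


print(find_pages("Dutch", "painter"))
```

The page itself need not contain the words "Dutch painter"; the match comes entirely from the stated relations.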

Certain metadata is designed to optimize lossy compression. For example, if a video has metadata that allows a computer to tell foreground from background, the latter can be compressed more aggressively to achieve a higher compression rate.

Some metadata is intended to enable variable content presentation. For example, if a picture has metadata that indicates the most important region (the one where there is a person), an image viewer on a small screen, such as a mobile phone's, can narrow the picture to that region and thus show the user the most interesting details. A similar kind of metadata is intended to allow blind people to access diagrams and pictures, by converting them for special output devices or reading their description using text-to-speech software.

Other descriptive metadata can be used to automate workflows. For example, if a "smart" software tool knows content and structure of data, it can convert it automatically and pass it to another "smart" tool as input. As a result, users save the many copy-and-paste operations required when analyzing data with "dumb" tools.

Metadata is becoming an increasingly important part of electronic discovery. Application and file system metadata derived from electronic documents and files can be important evidence. Recent changes to the Federal Rules of Civil Procedure make metadata routinely discoverable as part of civil litigation. Parties to litigation are required to maintain and produce metadata as part of discovery, and spoliation of metadata can lead to sanctions.

Metadata has become important on the World Wide Web because of the need to find useful information from the mass of information available. Manually-created metadata adds value because it ensures consistency. If a web page about a certain topic contains a word or phrase, then all web pages about that topic should contain that same word or phrase. Metadata also ensures variety, so that if a topic goes by two names each will be used. For example, an article about "sport utility vehicles" would also be tagged "4 wheel drives", "4WDs" and "four wheel drives", as this is how SUVs are known in some countries.

Examples of metadata for an audio CD include those provided by the MusicBrainz project and All Media Guide's Allmusic. Similarly, MP3 files have metadata tags in a format called ID3.
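As an illustration of metadata stored inside an audio file, the sketch below reads the older ID3v1 form of the tag: a fixed 128-byte block at the end of the file that begins with the bytes `TAG`, followed by 30-byte title, artist and album fields, a 4-byte year, a 30-byte comment and a 1-byte genre code. The `read_id3v1` function is a hypothetical helper, and the newer ID3v2 format is not handled.

```python
def read_id3v1(data: bytes):
    """Return the ID3v1 fields found at the end of the data, or None."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are padded with NUL bytes (or spaces) to a fixed width.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],  # numeric genre code
    }
```

A caller would typically pass the whole file contents, for example the bytes read from a file opened in binary mode.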

Types of metadata

Metadata can be classified by:
  • Content. Metadata can either describe the resource itself (for example, name and size of a file) or the content of the resource (for example, "This video shows a boy playing football").
  • Mutability. With respect to the whole resource, metadata can be either immutable (for example, the "Title" of a video does not change as the video itself is being played) or mutable (the "Scene description" does change).
  • Logical function. There are three layers of logical function: at the bottom the subsymbolic layer that contains the raw data itself, then the symbolic layer with metadata describing the raw data, and on the top the logical layer containing metadata that allows logical reasoning using the symbolic layer


Types of metadata include:
  1. descriptive metadata
  2. administrative metadata
  3. structural metadata
  4. technical metadata
  5. use metadata


To successfully develop and use metadata, several important issues should be treated with care:

Metadata risks

Microsoft Office files include metadata beyond their printable content, such as the original author's name, the creation date of the document, and the amount of time spent editing it. Unintentional disclosure can be awkward or even, in professional practices requiring confidentiality, raise malpractice concerns. Some of a Microsoft Office document's metadata can be seen by clicking File then Properties from the program's menu. Other metadata is not visible except through external analysis of a file, such as is done in forensics. The author of the Microsoft Word-based Melissa computer virus in 1999 was caught due to Word metadata that uniquely identified the computer used to create the original infected document.

Metadata lifecycle

Even in the early phases of planning and designing it is necessary to keep track of all metadata created. It is not economical to start attaching metadata only after the production process has been completed. For example, if metadata created by a digital camera at recording time is not stored immediately, it may have to be restored afterwards manually with great effort. Therefore, it is necessary for different groups of resource producers to cooperate using compatible methods and standards.
  • Manipulation. Metadata must adapt if the resource it describes changes. It should be merged when two resources are merged. These operations are seldom performed by today's software; for example, image editing programs usually do not keep track of the Exif metadata created by digital cameras.
  • Destruction. It can be useful to keep metadata even after the resource it describes has been destroyed, for example in change histories within a text document or to archive file deletions due to digital rights management. None of today's metadata standards consider this phase.


Storage

Metadata can be stored either internally, in the same file as the data, or externally, in a separate file. Metadata that is embedded with content is called embedded metadata. A data repository typically stores the metadata detached from the data. Both ways have advantages and disadvantages:

  • Internal storage allows transferring metadata together with the data it describes; thus, metadata is always at hand and can be manipulated easily. This method creates high redundancy and does not allow holding metadata together.
  • External storage allows bundling metadata, for example in a database, for more efficient searching. There is no redundancy and metadata can be transferred simultaneously when using streaming. However, as most formats use URIs for that purpose, the method of how the metadata is linked to its data should be treated with care. What if a resource does not have a URI (resources on a local hard disk or web pages that are created on-the-fly using a content management system)? What if metadata can only be evaluated if there is a connection to the Web, especially when using RDF? How can one recognize that a resource has been replaced by another with the same name but different content?
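
The last question in the list, recognizing that a resource was replaced by another with the same name, can be approached by recording a digest of the content in the externally stored metadata. The following Python sketch is one possible approach rather than an established mechanism; the sidecar naming convention (`<name>.meta.json`) and all function names are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def sidecar_path(resource: Path) -> Path:
    # Hypothetical convention: metadata lives next to the file as "<name>.meta.json".
    return resource.with_name(resource.name + ".meta.json")


def content_digest(resource: Path) -> str:
    # SHA-256 of the file's bytes; it changes whenever the content changes.
    return hashlib.sha256(resource.read_bytes()).hexdigest()


def write_metadata(resource: Path, metadata: dict) -> None:
    # Store the metadata externally, together with a digest of the content it describes.
    record = {"sha256": content_digest(resource), "metadata": metadata}
    sidecar_path(resource).write_text(json.dumps(record, indent=2), encoding="utf-8")


def metadata_is_current(resource: Path) -> bool:
    # True only if the sidecar exists and still describes the present content.
    sidecar = sidecar_path(resource)
    if not sidecar.exists():
        return False
    record = json.loads(sidecar.read_text(encoding="utf-8"))
    return record["sha256"] == content_digest(resource)
```

Because the link is made through the content rather than the name alone, a file that keeps its name but changes its bytes no longer matches its stored metadata. The cost is that every check re-reads the whole resource.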


Moreover, there is the question of data format: storing metadata in a human-readable format such as XML can be useful because users can understand and edit it without specialized tools. On the other hand, these formats are not optimized for storage capacity; it may be useful to store metadata in a binary, non-human-readable format instead to speed up transfer and save memory.
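
The trade-off between the two formats can be made concrete with a small Python sketch that encodes the same record once as XML and once as packed binary. The record, its field names, and the fixed binary layout are hypothetical and chosen only for illustration.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical metadata record for an image; the field names are illustrative only.
record = {"width": 4000, "height": 3000, "iso": 200, "timestamp": 1262304000}

# Human-readable form: an XML element with one child per field.
root = ET.Element("metadata")
for name, value in record.items():
    ET.SubElement(root, name).text = str(value)
xml_bytes = ET.tostring(root, encoding="utf-8")

# Binary form: four unsigned 32-bit little-endian integers in a fixed order.
LAYOUT = struct.Struct("<4I")
binary_bytes = LAYOUT.pack(
    record["width"], record["height"], record["iso"], record["timestamp"]
)

# Both forms carry the same values, but the binary form can only be decoded
# by a reader that already knows the layout and the field order.
decoded = dict(zip(record, LAYOUT.unpack(binary_bytes)))

print(len(xml_bytes), "bytes as XML")
print(len(binary_bytes), "bytes as packed binary")  # 16 bytes: four 4-byte integers
```

The XML form is several times larger but self-describing, while the binary form is compact but meaningless without the layout definition.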

Criticisms

Some critics argue that metadata is too expensive, time-consuming, and subjective, that it requires both domain and metadata-creation expertise, and that it depends heavily on context. Doctorow and Shirky outline some of these criticisms.

Opponents of metadata sometimes use the term metacrap to refer to the unsolved problems of metadata in some scenarios. These people are also referred to as "Meta Haters."

Critics may also observe that perceived ineffectiveness of metadata is not due to lack of maturity in the industry. Definitive theoretical knowledge and over a dozen industry tools to handle metadata can be referenced as far back as 1982. This makes metadata theory and tooling far older, yet observably less successful, than other technology areas such as object-oriented programming, data warehousing, and the commercialized Internet.

Types

In general, there are two distinct classes of metadata: structural or control metadata and guide metadata. Structural metadata is used to describe the structure of computer systems such as tables, columns and indexes. Guide metadata is used to help humans find specific items and is usually expressed as a set of keywords in a natural language.

Metadata can be divided into three distinct categories:
  • Administrative
  • Descriptive
  • Structural


Information Technology and Software Engineering metadata


General IT metadata
In contrast, David Marco, another metadata theorist, defines metadata as "all physical data and knowledge from inside and outside an organization, including information about the physical data, technical and business processes, rules and constraints of the data, and structures of the data used by a corporation." Others have included web services, systems and interfaces. In fact, the entire Zachman framework (see Enterprise Architecture) can be represented as metadata.

Notice that such definitions expand metadata's scope considerably, to encompass most or all of the data required by the Management Information Systems capability. In this sense, the concept of metadata has significant overlaps with the ITIL concept of a Configuration Management Database (CMDB), and also with disciplines such as Enterprise Architecture and IT portfolio management.

This broader definition of metadata has precedent. Third generation corporate repository products (such as those eventually merged into the CA Advantage line) not only store information about data definitions (COBOL copybooks, DBMS schema), but also about the programs accessing those data structures, and the Job Control Language and batch job infrastructure dependencies as well. These products (some of which are still in production) can provide a very complete picture of a mainframe computing environment, supporting exactly the kinds of impact analysis required for ITIL-based processes such as Incident and Change Management. ITIL includes the Data Management volume, which recognizes the role of these metadata products on the mainframe, posing the CMDB as the distributed computing equivalent. CMDB vendors, however, have generally not expanded their scope to include data definitions, and metadata solutions are also available in the distributed world. Determining the appropriate role and scope for each is thus a challenge for large IT organizations requiring the services of both.

Since metadata is pervasive, centralized attempts at tracking it need to focus on the most highly leveraged assets. Enterprise Assets may only constitute a small percentage of the entire IT portfolio.

Some practitioners have successfully managed IT metadata using the Dublin Core metamodel.

IT metadata management products
First generation data dictionary/metadata repository tools would be those only supporting a specific DBMS, such as IDMS's IDD (integrated data dictionary), the IMS Data Dictionary, and ADABAS's Predict.

Second generation would be ASG's DATAMANAGER product which could support many different file and DBMS types.

Third generation repository products became briefly popular in the early 1990s along with the rise of widespread use of RDBMS engines such as IBM's DB2.

Fourth generation products link the repository with more Extract, transform, load tools and can be connected with architectural modeling tools.

Fifth generation products are taking things to a new level by integrating distributed computing, specialized hardware, extreme visualization, and analytics, in a sense that now allows vertical uses of metadata in all sorts of things such as applications, messaging buses etc.

Relational database metadata
Each relational database system has its own mechanisms for storing metadata. Examples of relational-database metadata include:
  • Tables of all tables in a database, their names, sizes and number of rows in each table.
  • Tables of columns in each database, what tables they are used in, and the type of data stored in each column.


In database terminology, this set of metadata is referred to as the catalog. The SQL standard specifies a uniform means to access the catalog, called the INFORMATION_SCHEMA, but not all databases implement it, even if they implement other aspects of the SQL standard. For an example of database-specific metadata access methods, see Oracle metadata.
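
As a small illustration of database-specific catalog access, the following Python sketch uses the standard-library `sqlite3` module. SQLite is one of the systems that does not provide INFORMATION_SCHEMA; it exposes its catalog through the `sqlite_master` table and the `table_info` pragma instead. The table and column names are hypothetical.

```python
import sqlite3

# In-memory database with one illustrative table (the names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
)

# SQLite's catalog of tables: one row per schema object in sqlite_master.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Column metadata comes from a SQLite-specific pragma rather than a standard view.
# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(tables)   # ['employee']
print(columns)  # [('id', 'INTEGER'), ('name', 'TEXT'), ('salary', 'REAL')]
conn.close()
```

On a system that does implement INFORMATION_SCHEMA, the same information would typically be read from views such as `information_schema.tables` and `information_schema.columns`.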

Data warehouse metadata
Data warehouse metadata systems are sometimes separated into two sections:
  1. back room metadata that are used for Extract, transform, load functions to get OLTP data into a data warehouse
  2. front room metadata that are used to label screens and create reports


Kimball lists the following types of metadata in a data warehouse:

  • source system metadata
    • source specifications, such as repositories, and source logical schemas
    • source descriptive information, such as ownership descriptions, update frequencies and access methods
    • process information, such as job schedules and extraction code
  • data staging metadata
    • data acquisition information, such as data transmission scheduling and results, and file usage
    • dimension table management, such as definitions of dimensions, and surrogate key assignments
    • transformation and aggregation, such as data enhancement and mapping, DBMS load scripts, and aggregate definitions
    • audit, job logs and documentation, such as data lineage records, data transform logs
  • DBMS metadata, such as:
    • DBMS system table contents
    • processing hints


Michael Brackett defines metadata (what he calls "Data resource data") as "any data about the organization's data resource". Adrienne Tannenbaum defines metadata as "the detailed description of instance data. The format and characteristics of populated instance data: instances and values, dependent on the role of the metadata recipient". These definitions are characteristic of the "data about data" definition.

Business Intelligence metadata
Business Intelligence is the process of analyzing large amounts of corporate data, usually stored in large databases such as a Data Warehouse, tracking business performance, detecting patterns and trends, and helping enterprise business users make better decisions. Business Intelligence metadata describes how data is queried, filtered, analyzed, and displayed in Business Intelligence software tools, such as Reporting tools, OLAP tools, and Data Mining tools.

Examples:
  • Data Mining metadata: The descriptions and structures of DataSets, Algorithms, Queries
  • OLAP metadata: The descriptions and structures of Dimensions, Cubes, Measures (Metrics), Hierarchies, Levels, Drill Paths
  • Reporting metadata: The descriptions and structures of Reports, Charts, Queries, DataSets, Filters, Variables, Expressions


Business Intelligence metadata can be used to understand how corporate financial reports reported to Wall Street are calculated, and how the revenue, expense and profit are aggregated from individual sales transactions stored in the data warehouse. A good understanding of Business Intelligence metadata is required to solve complex problems such as compliance with corporate governance standards, for example Sarbanes-Oxley (SOX) or Basel II.

File system metadata
Nearly all file systems keep metadata about files out-of-band. Some systems keep metadata in directory entries; others keep it in specialized structures like inodes or even in the name of a file. Metadata can range from simple timestamps, mode bits, and other special-purpose information used by the implementation itself, to icons and free-text comments, to arbitrary attribute-value pairs.
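
The timestamps and mode bits mentioned above can be read without opening the file's content. The following Python sketch uses the standard `os.stat` call; the `describe` helper and the choice of fields it reports are illustrative assumptions, not a fixed interface.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def describe(path: str) -> dict:
    # os.stat returns the metadata the file system keeps alongside the file's content.
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "mode": stat.filemode(info.st_mode),  # e.g. "-rw-r--r--" on Unix-like systems
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "is_directory": stat.S_ISDIR(info.st_mode),
    }


# Usage: create a scratch file and inspect its metadata without reading its content.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
    scratch = handle.name

summary = describe(scratch)
print(summary)
os.remove(scratch)
```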

With more complex and open-ended metadata, it becomes useful to search for files based on the metadata contents. The Unix find utility was an early example, although it is inefficient when scanning hundreds of thousands of files on a modern computer system. Apple Computer's Mac OS X operating system supports cataloguing and searching for file metadata through a feature known as Spotlight, as of version 10.4. Microsoft developed similar functionality with the Instant Search system in Windows Vista; it is also present in SharePoint Server. Linux implements file metadata using extended file attributes.
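
A find-style search over file metadata can be sketched in a few lines of Python. This is a simplified illustration, not a reimplementation of the find utility; the function name and its two criteria (minimum size and recent modification) are assumptions chosen for the example.

```python
import os
import time
from typing import Iterator, Optional


def find_by_metadata(root: str, min_size: int = 0,
                     modified_within_days: Optional[float] = None) -> Iterator[str]:
    """Yield paths under root whose size and modification time match the criteria."""
    cutoff = None
    if modified_within_days is not None:
        cutoff = time.time() - modified_within_days * 86400
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(directory, filename)
            try:
                info = os.stat(path)
            except OSError:
                continue  # the file vanished or is unreadable; skip it
            if info.st_size < min_size:
                continue
            if cutoff is not None and info.st_mtime < cutoff:
                continue
            yield path
```

Every call walks the whole tree and queries each file in turn, which is the inefficiency noted above. Indexing approaches avoid this by keeping a searchable catalogue of the metadata up to date instead of rescanning.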

Program metadata
Metadata is casually used to describe the controlling data used in software architectures that are more abstract or configurable. Most executable file formats include what may be termed "metadata" that specifies certain, usually configurable, behavioral runtime characteristics. However, it is difficult if not impossible to precisely distinguish program "metadata" from general aspects of stored-program computing architecture; if the machine reads it and acts upon it, it is a computational instruction, and the prefix "meta" has little significance.

In Java, the class file format contains metadata used by the Java compiler and the Java virtual machine to dynamically link classes and to support reflection. The Java Platform, Standard Edition since J2SE 5.0 has included a metadata facility to allow additional annotations that are used by development tools.

In MS-DOS, the COM file format does not include metadata, while the EXE file and Windows PE formats do. These metadata can include the company that published the program, the date the program was created, the version number and more.

In the Microsoft .NET executable format, extra metadata is included to allow reflection at runtime.

Existing software metadata
The Object Management Group (OMG) has defined a metadata format for representing entire existing applications for the purposes of software mining, software modernization and software assurance. This specification, called the OMG Knowledge Discovery Metamodel (KDM), is the OMG's foundation for "modeling in reverse". KDM is a common language-independent intermediate representation that provides an integrated view of an entire enterprise application, including its behavior (program flow), data, and structure. One of the applications of KDM is Business Rules Mining.

The Knowledge Discovery Metamodel includes a fine-grained low-level representation (called "micro KDM"), suitable for performing static analysis of programs.

Document metadata
Most programs that create documents, including Microsoft SharePoint, Microsoft Word and other Microsoft Office products, save metadata with the document files. These metadata can contain the name of the person who created the file (obtained from the operating system), the name of the person who last edited the file, how many times the file has been printed, and even how many revisions have been made on the file. Other saved material, such as deleted text (saved in case of an undelete command), document comments and the like, is also commonly referred to as "metadata", and the inadvertent inclusion of this material in distributed files has sometimes led to undesirable disclosures.
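
To show how such author and revision fields can be inspected, the following Python sketch reads them from a document package. It assumes the zip-based Office Open XML (.docx) packaging, which the text above does not specifically discuss, and in which these properties are stored in a part named `docProps/core.xml`. The function name and the sample values are hypothetical, and the usage builds a minimal stand-in package in memory rather than a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(package) -> dict:
    """Return author-related metadata from a .docx path or file-like object."""
    with zipfile.ZipFile(package) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    fields = {
        "creator": "dc:creator",
        "last_modified_by": "cp:lastModifiedBy",
        "revision": "cp:revision",
    }
    # findtext returns None when the element is absent.
    return {key: root.findtext(tag, namespaces=NS) for key, tag in fields.items()}


# Usage: a minimal stand-in package containing only the core-properties part.
core_xml = (
    '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}">'
    "<dc:creator>A. Author</dc:creator>"
    "<cp:lastModifiedBy>B. Editor</cp:lastModifiedBy>"
    "<cp:revision>7</cp:revision>"
    "</cp:coreProperties>"
).format(**NS)

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", core_xml)
buffer.seek(0)

properties = read_core_properties(buffer)
print(properties)
```

Inspecting a file this way before distributing it is a simple check for the kind of inadvertent disclosure described above.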

Document metadata is particularly important in legal environments, where litigation can request this sensitive information, which can include many private and potentially damaging details. Such data has been linked to multiple lawsuits that have drawn corporations into legal complications.

Many law firms today use "metadata management software", also known as "metadata removal tools". This software can be used to clean documents before they are sent outside the firm. This process, known as metadata management, protects law firms from potentially damaging leaks of sensitive data through electronic discovery.

For a list of executable formats, see object file.

Digital library metadata

There are three categories of metadata that are frequently used to describe objects in a digital library:

  1. descriptive - Information describing the intellectual content of the object, such as MARC cataloguing records, finding aids or similar schemes. It is typically used for bibliographic purposes and for search and retrieval.
  2. structural - Information that ties each object to others to make up logical units (e.g., information that relates individual images of pages from a book to the others that make up the book).
  3. administrative - Information used to manage the object or control access to it. This may include information on how it was scanned, its storage format, copyright and licensing information, and information necessary for the long-term preservation of the digital objects.


Standards for metadata in digital libraries include Dublin Core, METS, the PREMIS schema, and OAI-PMH.
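
To give a sense of what a descriptive record in one of these standards can look like, the sketch below serializes a few Dublin Core elements (title, creator, date, format) to XML with Python's standard library. The wrapper element name "record", the function name and the sample values are assumptions made for illustration; real repositories typically wrap Dublin Core in a container defined by another schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a mapping of Dublin Core element names to an XML string."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(record, "{%s}%s" % (DC_NS, name))
        element.text = value
    return ET.tostring(record, encoding="unicode")


xml_text = dublin_core_record(
    {
        "title": "A Sample Book",      # illustrative values only
        "creator": "Example, Alice",
        "date": "1999",
        "format": "image/tiff",
    }
)
print(xml_text)
```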

Image metadata

Examples of image file formats that carry metadata include Exchangeable image file format (EXIF) and Tagged Image File Format (TIFF).

Having metadata about images embedded in TIFF or EXIF files is one way of acquiring additional data about an image. Tagging pictures with subjects, related emotions, and other descriptive phrases helps Internet users find pictures easily rather than having to search through entire image collections. A prime example of an image tagging service is Flickr, where users upload images and then describe the contents. Other patrons of the site can then search for those tags. Flickr uses a folksonomy: a free-text keyword system in which the community defines the vocabulary through use rather than through a controlled vocabulary.
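
The tagging-and-search behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Flickr or any particular service is implemented: the class name TagIndex, its methods and the photo identifiers are all hypothetical. The point it illustrates is that the vocabulary is simply whatever tags users have applied, with no predefined list.

```python
from collections import defaultdict


class TagIndex:
    """Toy free-text tag index: the vocabulary grows as users add tags."""

    def __init__(self):
        self._photos_by_tag = defaultdict(set)

    @staticmethod
    def _normalize(tag):
        # Treat "Sunset" and " sunset " as the same tag.
        return tag.strip().lower()

    def tag(self, photo_id, *tags):
        """Attach one or more free-text tags to a photo."""
        for raw in tags:
            self._photos_by_tag[self._normalize(raw)].add(photo_id)

    def search(self, *tags):
        """Return the photos that carry every one of the given tags."""
        matches = [
            self._photos_by_tag.get(self._normalize(raw), set()) for raw in tags
        ]
        return set.intersection(*matches) if matches else set()

    def vocabulary(self):
        """Return the tags in use, as defined by the community so far."""
        return sorted(self._photos_by_tag)


index = TagIndex()
index.tag("photo-1", "Sunset", "beach")
index.tag("photo-2", "sunset", "city")
print(index.search("sunset", "beach"))
print(index.vocabulary())
```

A controlled vocabulary would differ in one respect: the tag method would reject any term that is not on an approved list.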

Users can also tag photos for organizational purposes using, for example, Adobe's Extensible Metadata Platform (XMP).

Digital photography is increasingly making use of technical metadata tags describing the conditions of exposure. Photographers shooting in Camera RAW file formats can use applications such as Adobe Bridge or Apple's Aperture to work with camera metadata for post-processing.

Geospatial metadata

Metadata that describe geographic objects (such as datasets, maps, features, or simply documents with a geospatial component) have a history going back to at least 1994. This class of metadata is described more fully on the Geospatial metadata page.

Meta-metadata

Since metadata are also data, it is possible to have metadata about metadata: "meta-metadata". Machine-generated meta-metadata, such as the inverted index created by a free-text search engine, is generally not considered metadata, though.
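
For readers unfamiliar with the search-engine index mentioned above (usually called an inverted index), the following minimal Python sketch shows the idea: each word is mapped to the documents that contain it. The function name, the tokenization rule and the sample documents are assumptions chosen for brevity; production search engines use far more elaborate structures.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the sorted ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        # Simplistic tokenization: runs of ASCII letters and digits.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            postings[word].add(doc_id)
    return {word: sorted(doc_ids) for word, doc_ids in postings.items()}


documents = {
    "doc1": "Metadata is data about data",
    "doc2": "A schema describes data",
}
index = build_inverted_index(documents)
print(index["data"])
print(index["schema"])
```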

See also

  • Data Dictionary
  • Dublin Core
  • Folksonomy
  • Meta tag
  • Metadata discovery
  • Metadata facility for Java
  • Metadata publishing
  • Metadata registry
  • Microcontent
  • Microformats
  • Official statistics
  • Semantic Web
  • The Metadata Company
  • Universal Data Element Framework


External links

  • Cory Doctorow's opinion on the limitations of metadata on the Internet, 2001
  • - AnonWatch
  • - NISO, 2004