Tag (metadata)
In online computer systems terminology, a tag is a non-hierarchical keyword or term assigned to a piece of information (such as an Internet bookmark, digital image, or computer file). This kind of metadata helps describe an item and allows it to be found again by browsing or searching. Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system.

Tagging was popularized by websites associated with Web 2.0 and is an important feature of many Web 2.0 services. It is now also part of some desktop software.

History and context

Labeling and tagging are carried out to perform functions such as aiding in classification, marking ownership, noting boundaries, and indicating online identity. They may take the form of words, images, or other identifying marks. An analogous example of tags in the physical world is museum object tagging. In the organisation of information and objects, the use of textual keywords as part of identification and classification long predates computers. However, computer-based searching made the use of keywords a rapid way of exploring records. Online and Internet databases and early websites deployed them as a way for publishers to help users find content. In 2003, the social bookmarking website Delicious provided a way for its users to add "tags" to their bookmarks (as a way to help find them later); Delicious also provided browseable aggregated views of the bookmarks of all users featuring a particular tag. Flickr allowed its users to add free-form tags to each of their pictures, constructing flexible and easy metadata that made the pictures highly searchable. The success of Flickr and the influence of Delicious popularized the concept, and other social software websites – such as YouTube, Technorati, and Last.fm – also implemented tagging. "Labels" in Gmail are similar to tags.

Websites that include tags often display collections of tags as tag clouds. A user's tags are useful both to them and to the larger community of the website's users.

Tags may be a "bottom-up" type of classification, compared to hierarchies, which are "top-down". In a traditional hierarchical system (taxonomy), the designer sets out a limited number of terms to use for classification, and there is one correct way to classify each item. In a tagging system, there are an unlimited number of ways to classify an item, and there is no "wrong" choice. Instead of belonging to one category, an item may have several different tags. Some researchers and applications have experimented with combining structured hierarchy and "flat" tagging to aid in information retrieval.

Within a blog

Many blog systems allow authors to add free-form tags to a post, along with (or instead of) placing the post into categories. For example, a post may display that it has been tagged with baseball and tickets. Each of those tags is usually a web link leading to an index page listing all of the posts associated with that tag. The blog may have a sidebar listing all the tags in use on that blog, with each tag leading to an index page. To reclassify a post, an author edits its list of tags. All connections between posts are automatically tracked and updated by the blog software; there is no need to relocate the page within a complex hierarchy of categories.
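
The per-tag index pages described above can be thought of as an inverted index from each tag to the posts that carry it. The following is a minimal sketch of that idea, not the implementation of any particular blog system; the function name build_tag_index and the sample post titles are hypothetical.

```python
from collections import defaultdict


def build_tag_index(posts):
    """Map each tag to the titles of the posts that carry it.

    `posts` is assumed to be a dict of post title -> list of tags.
    """
    index = defaultdict(list)
    for title, tags in posts.items():
        for tag in tags:
            index[tag].append(title)
    return dict(index)


# Hypothetical sample data for illustration only.
posts = {
    "Opening day": ["baseball", "tickets"],
    "Season preview": ["baseball"],
}
index = build_tag_index(posts)
print(index["baseball"])
```

Reclassifying a post then amounts to editing its tag list and rebuilding (or incrementally updating) the index; the post itself never moves.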

For an event

An official tag is a keyword adopted by events and conferences for participants to use in their web publications, such as blog entries, photos of the event, and presentation slides. Search engines can then index them to make relevant materials related to the event searchable in a uniform way. In this case, the tag is part of a controlled vocabulary.

Triple tags

A triple tag or machine tag uses a special syntax to define extra semantic information about the tag, making it easier or more meaningful for interpretation by a computer program. Triple tags comprise three parts: a namespace, a predicate, and a value. For example, "geo:long=50.123456" is a tag for the geographical longitude coordinate whose value is 50.123456. This triple structure is similar to the Resource Description Framework model for information.
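
As an illustration of the namespace:predicate=value form, a program might split a triple tag into its three parts roughly as follows. This is a sketch under a simplifying assumption: the namespace and predicate consist only of word characters, and the value is everything after the first equals sign. Real services may apply different rules, and the names TripleTag and parse_triple_tag are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class TripleTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed grammar (not taken from any specification):
# word characters, a colon, word characters, "=", then a non-empty value.
_TRIPLE = re.compile(r"^(\w+):(\w+)=(.+)$")


def parse_triple_tag(tag: str) -> Optional[TripleTag]:
    """Return the three parts of a triple tag, or None for an ordinary tag."""
    match = _TRIPLE.match(tag)
    if match is None:
        return None
    return TripleTag(*match.groups())


tag = parse_triple_tag("geo:long=50.123456")
print(tag)
```

The value is kept as a string; whether to interpret it as a number, a date, or plain text would depend on the namespace and predicate.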

The triple tag format was first devised for geolicious in November 2004, to map Delicious bookmarks, and gained wider acceptance after its adoption by Mappr and GeoBloggers to map Flickr photos. In January 2007, Aaron Straup Cope at Flickr introduced the term machine tag as an alternative name for the triple tag, adding some questions and answers on purpose, syntax, and use.

Specialized metadata for geographical identification is known as geotagging; machine tags are also used for other purposes, such as identifying photos taken at a specific event or naming species using binomial nomenclature.

Hashtags

Short messages on services such as Twitter or identi.ca may be tagged by including one or more hashtags: words or phrases prefixed with the symbol #, with multiple words concatenated, such as those in:
#Wikipedia is my favourite kind of #encyclopedia


Then, a person can search for the string #Wikipedia and this tagged word will appear in the search engine results. These hashtags also show up in a number of trending topics websites, including Twitter's own front page. Such tags are case-insensitive, with CamelCase often used for readability.
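
A rough sketch of how a program might pull hashtags out of a message and compare them case-insensitively is given below. It assumes a hashtag is a # followed by one or more word characters; actual services apply their own, more detailed rules. The function names extract_hashtags and has_hashtag are hypothetical.

```python
import re

# Assumption: a hashtag is "#" followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(message):
    """Return the hashtags in a message, lowercased and without the "#"."""
    return [name.lower() for name in HASHTAG.findall(message)]


def has_hashtag(message, tag):
    """Case-insensitive check for a hashtag, with or without a leading "#"."""
    return tag.lstrip("#").lower() in extract_hashtags(message)


message = "#Wikipedia is my favourite kind of #encyclopedia"
print(extract_hashtags(message))
print(has_hashtag(message, "#WIKIPEDIA"))
```

Lowercasing before comparison is what makes #Wikipedia, #wikipedia, and #WIKIPEDIA count as the same tag, while the original CamelCase spelling can still be displayed for readability.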

Definitions for some hashtags are available at hashtag.org. The use of hashtags on Twitter was first proposed by Chris Messina.

One phenomenon specific to the Twitter ecosystem is the micro-meme: an emergent topic for which a hashtag is created, used widely for a few days, and then disappears.

Other sites, such as Hashable, have adopted the hashtag for other purposes.

The feature has been added to other, non-short-message-oriented services, such as the user comment systems on YouTube and Gawker Media; in the case of the latter, hashtags for blog comments and directly-submitted comments are used to maintain a more constant rate of user activity even when paid employees are not logged into the website. Real-time search aggregators such as Google Real-Time Search also support hashtags in syndicated posts, meaning that hashtags inserted into Twitter posts can be hyperlinked to incoming posts falling under that same hashtag; this has further enabled a view of the "river" of Twitter posts which can result from search terms or hashtags.

Advantages and disadvantages

In a typical tagging system, there is no explicit information about the meaning or semantics of each tag, and a user can apply new tags to an item as easily as applying older tags. Hierarchical classification systems can be slow to change, and are rooted in the culture and era that created them. The flexibility of tagging allows users to classify their collections of items in the ways that they find useful, but the personalized variety of terms can present challenges when searching and browsing.

When users can freely choose tags (creating a folksonomy, as opposed to selecting terms from a controlled vocabulary), the resulting metadata can include homonyms (the same tags used with different meanings) and synonyms (multiple tags for the same concept), which may lead to inappropriate connections between items and inefficient searches for information about a subject. For example, the tag "orange" may refer to the fruit or the color, and items related to a version of Apple's operating system may be tagged "Mac OS X", "Lion", "software", or a variety of other terms. Users can also choose tags that are different inflections of words (such as singular and plural), which can contribute to navigation difficulties if the system does not include stemming of tags when searching or browsing. Larger-scale folksonomies address some of the problems of tagging, in that users of tagging systems tend to notice the current use of "tag terms" within these systems, and thus use existing tags in order to easily form connections to related items. In this way, folksonomies collectively develop a partial set of tagging conventions.
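
To make the inflection problem concrete, the sketch below groups tags that differ only in letter case or a trailing plural "s". The stripping rule is deliberately naive and is an assumption made for illustration; it is not a real stemming algorithm, and it does nothing about homonyms or synonyms. The names normalize and group_variants are hypothetical.

```python
from collections import defaultdict


def normalize(tag):
    """Lowercase a tag and strip one trailing plural "s" (a naive rule)."""
    key = tag.strip().lower()
    # Leave short words and words ending in "ss" (e.g. "glass") alone.
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def group_variants(tags):
    """Group the original tag spellings under their normalized form."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize(tag)].add(tag)
    return dict(groups)


groups = group_variants(["Photo", "photos", "glass", "Oranges", "orange"])
print(groups)
```

A system that searches or browses by the normalized key rather than the raw tag would treat "Photo" and "photos" as one tag, at the cost of occasionally merging words that merely look alike.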

Complex system dynamics

Despite the apparent lack of control, research has shown that a simple form of shared vocabularies emerges in social bookmarking systems. Collaborative tagging exhibits a form of complex systems dynamics (or self-organizing dynamics). Thus, even if no central controlled vocabulary constrains the actions of individual users, the distribution of tags that describe different resources (e.g., websites) converges over time to stable power law distributions. Once such stable distributions form, simple vocabularies can be extracted by examining the correlations that form between different tags. This informal collaborative system of tag creation and management has been called a folksonomy.

Spamming

Tagging systems open to the public are also open to tag spam, in which people apply an excessive number of tags or unrelated tags to an item (such as a YouTube video) in order to attract viewers. This abuse can be mitigated using human or statistical identification of spam items. The number of tags allowed may also be limited to reduce spam.

Syntax

Some tagging systems provide a single text box for entering tags, so a separator is needed to tokenize the string. Two popular separators are the space character and the comma. To enable the use of separators in the tags, a system may allow for higher-level separators (such as quotation marks) or escape characters. Systems can avoid the use of separators by allowing only one tag to be added to each input widget at a time, although this makes adding multiple tags more time-consuming.
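
The two separator conventions can be sketched as follows. For space-separated input, this example borrows Python's standard shlex.split, whose shell-style rules happen to support both quotation marks and backslash escapes; that choice is an assumption made for illustration and not how any specific tagging system works. The function names are hypothetical.

```python
import shlex


def split_space_separated(text):
    """Split on spaces, honouring quotation marks and backslash escapes.

    Uses shell-style rules, so an unbalanced quote raises ValueError and
    an apostrophe is treated as a quotation mark.
    """
    return shlex.split(text)


def split_comma_separated(text):
    """Split on commas, trimming whitespace and dropping empty entries."""
    return [part.strip() for part in text.split(",") if part.strip()]


quoted = split_space_separated('baseball "new york" tickets')
escaped = split_space_separated(r"new\ york tickets")
by_comma = split_comma_separated("baseball, new york , tickets,")
print(quoted, escaped, by_comma)
```

With commas as the separator, multi-word tags need no quoting, which is one reason some systems prefer it; the trade-off is that a tag can no longer contain a comma without some escape mechanism.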

Within HTML, tags can be marked up with the rel-tag microformat, which uses the rel attribute with the value "tag" (i.e., rel="tag") to indicate that the linked-to page acts as a tag for the current context.
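
As a rough illustration, a program could collect the links marked with rel="tag" from a page using Python's standard html.parser module. The class name RelTagCollector and the example.com URLs are hypothetical; this is a sketch of the idea, not a complete microformat parser.

```python
from html.parser import HTMLParser


class RelTagCollector(HTMLParser):
    """Collect the href of every <a> element whose rel includes "tag"."""

    def __init__(self):
        super().__init__()
        self.tag_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing.
        rel_values = (attributes.get("rel") or "").split()
        if "tag" in rel_values and attributes.get("href"):
            self.tag_links.append(attributes["href"])


# Hypothetical markup for illustration only.
html = (
    '<p>Filed under '
    '<a href="https://example.com/tags/baseball" rel="tag">baseball</a> and '
    '<a href="https://example.com/about">about</a>.</p>'
)
collector = RelTagCollector()
collector.feed(html)
print(collector.tag_links)
```

Only the first link is collected, because the second has no rel attribute and so makes no claim to be a tag.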

See also

  • Ontology
  • Semantic Web
  • Knowledge tags
  • Folksonomy
  • Microformats


External links

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 