URL shortening
URL shortening is a technique on the World Wide Web in which a Uniform Resource Locator (URL) may be made substantially shorter in length and still direct to the required page. This is achieved by using an HTTP redirect on a domain name that is short, which links to the web page that has a long URL. For example, the URL http://en.wikipedia.org/wiki/URL_shortening can be shortened to http://bit.ly/urlwiki or http://tinyurl.com/urlwiki. This is especially convenient for messaging technologies such as Twitter and Identi.ca, which severely limit the number of characters that may be used in a message. Short URLs allow otherwise long web addresses to be referred to in a tweet. In November 2009, the shortened links on one URL shortening service were accessed 2.1 billion times.

Many URL shortening services use the top-level domain of a country that allows foreign sites to use its extension, such as .ly (Libya) or .to (Tonga). The short URL consists of the provider's site address followed by a short alphanumeric sequence, which the service redirects to the long URL from anywhere in the world.

Another use of URL shortening is to disguise the underlying address. Although this may be desired for legitimate business or personal reasons, it is open to abuse. Some URL shortening service providers have found themselves on spam blacklists because sites trying to bypass those very same blacklists used their redirect services. Some websites prevent short, redirected URLs from being posted.

Purposes

There are several reasons to use URL shortening. The free hosting space that access Internet service providers offer their customers may generate an aesthetically unpleasing address. Some web developers on mainstream sites tend to pass descriptive attributes in the URL to represent data hierarchies, command structures, transaction paths or session information, and this often results in a URL that contains a large number of characters, is awkward to reproduce and is impossible to remember. A URL that is hundreds of characters long can only really be copied successfully by copy-and-paste. Trying to type one by hand is time-consuming and may result in errors. Thus a short URL is more useful to write in an e-mail message or an internet forum post.

On Twitter and some instant-messaging services, there is a limit on the total number of characters that can be used in a message. Using a URL shortener can make it easier to include a URL within a short message. Some shortening services, such as tinyurl.com and bit.ly, can generate URLs that are human-readable, although the resulting strings are longer than those generated by a length-optimized service. Finally, URL shortening sites provide detailed information on the clicks a link receives, which can be simpler than setting up an equally powerful server-side analytics engine.

Registering a short URL

An increasing number of websites are registering their own short URLs to make sharing via Twitter and SMS easier. This can normally be done online, at the web pages of a URL shortening service. Short URLs often circumvent the intended use of top-level domains for indicating the country of origin; domain registration in many countries requires proof of physical presence within that country, although a redirected URL has no such guarantee.

Techniques

In URL shortening, every long URL is associated with a unique key, which is the part of the short URL after the service's domain name; for example, http://tinyurl.com/m3q2xt has a key of m3q2xt. Not all redirection is treated equally; the redirection instruction sent to a browser can contain in its header the HTTP status 301 (permanent redirect) or 302 (temporary redirect).
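The key lookup and the choice between the two status codes can be sketched as follows. This is a minimal illustration, not how any particular service is implemented: the in-memory table LINKS, the function name build_redirect and the example.com target are all assumptions made for the example.

```python
from http import HTTPStatus

# Hypothetical in-memory table mapping keys to long URLs.
# A real service would keep this in a database.
LINKS = {"m3q2xt": "http://example.com/a/very/long/path?with=query"}


def build_redirect(path, links=LINKS, permanent=True):
    """Return (status_code, headers) for a request path such as '/m3q2xt'."""
    key = path.lstrip("/")          # the key is the part after the domain name
    target = links.get(key)
    if target is None:
        return HTTPStatus.NOT_FOUND.value, {}
    # 301 tells the browser the move is permanent, 302 that it is temporary.
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return status.value, {"Location": target}
```

Calling build_redirect("/m3q2xt") yields status 301 with a Location header holding the long URL, and passing permanent=False yields 302 instead.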

There are several techniques for implementing URL shortening. Keys can be generated in base 36, assuming 26 letters and 10 numbers. In this case, each character in the sequence will be 0, 1, 2, ..., 9, a, b, c, ..., y, z. Alternatively, if uppercase and lowercase letters are differentiated, then each character can represent a single digit within a number of base 62 (26 + 26 + 10). To form the key, a hash function can be used, or a random number can be generated so that the key sequence is not predictable. Alternatively, users may propose their own keys. For example, http://en.wikipedia.org/w/index.php?title=TinyURL&diff=283621022&oldid=283308287 can be shortened to http://bit.ly/tinyurlwiki.
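As a rough sketch of the key-generation ideas described here, the following code converts an integer (for example a database row number, which is an assumption of this example) into a base 36 or base 62 key, and separately draws an unpredictable random key. The function names and the digit ordering of ALPHABET are illustrative choices, not those of any named service.

```python
import secrets
import string

# 10 digits + 26 lowercase + 26 uppercase letters = 62 symbols.
# The first 36 symbols alone form a base 36 alphabet.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode(number, alphabet=ALPHABET):
    """Write a non-negative integer as a key in base len(alphabet)."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def decode(key, alphabet=ALPHABET):
    """Inverse of encode: turn a key back into its integer."""
    base = len(alphabet)
    number = 0
    for char in key:
        number = number * base + alphabet.index(char)
    return number


def random_key(length=6, alphabet=ALPHABET):
    """Draw an unpredictable key from a cryptographically strong source."""
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

With this alphabet, encode(62) gives "10" in base 62, and encode(35, ALPHABET[:36]) gives "z" in base 36. Sequential integers produce predictable keys, which is why the random variant is shown alongside; a service using random keys would also need to check for collisions.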

As of 2011, not all protocols can be shortened, although protocols such as http, https, ftp, ftps, mailto, news, mms, rtmp, rtmpt, ed2k, pop, imap, nntp, ldap, gopher, dict and dns are being addressed by such services as URL Shortener. Typically, data: and javascript: URLs are not supported for security reasons. Some URL shortening services support the forwarding of mailto URLs, as an alternative to address munging, to avoid unwanted harvesting by web crawlers or bots. This may sometimes be done using short, CAPTCHA-protected URLs, but this is not common.
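One way a service might enforce such a restriction is an allowlist of URL schemes, checked before a short link is created. The sketch below is an assumption about how this could be done; the set ALLOWED_SCHEMES and the function name is_shortenable are hypothetical, and the set is only a small subset of the protocols listed above.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real service would choose its own set.
ALLOWED_SCHEMES = {"http", "https", "ftp", "mailto"}


def is_shortenable(url, allowed=ALLOWED_SCHEMES):
    """Accept a URL only if its scheme is explicitly allowed.

    Anything not on the list, including data: and javascript: URLs
    and strings with no scheme at all, is rejected.
    """
    scheme = urlsplit(url.strip()).scheme.lower()
    return scheme in allowed
```

An allowlist is used here rather than a blocklist so that unknown schemes are rejected by default.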

Tinyarro.ws, urlrace.com, and qoiob.com use Unicode characters to achieve the shortest URLs possible, since more condensed URLs are possible with a given number of characters compared to those using a standard Latin alphabet.
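The effect of alphabet size can be illustrated with simple arithmetic: an alphabet of A symbols gives A to the power n distinct keys of n characters. The sketch below compares alphabets under that assumption. The figure of 1000 usable Unicode characters is purely illustrative, and length is counted in characters, not bytes.

```python
def key_space(alphabet_size, length):
    """Number of distinct keys of exactly `length` characters."""
    return alphabet_size ** length


def min_key_length(alphabet_size, url_count):
    """Smallest key length that can give each of url_count URLs its own key."""
    length = 1
    while alphabet_size ** length < url_count:
        length += 1
    return length


print(key_space(36, 6))              # 2176782336
print(key_space(62, 6))              # 56800235584
print(min_key_length(62, 10**9))     # 6
print(min_key_length(1000, 10**9))   # 3
```

Under these assumptions, a billion URLs need six-character keys in base 62 but only three characters from a 1000-symbol alphabet.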

History

The idea of URL shortening dates back at least to 2001. The first notable URL shortening service, TinyURL, was launched in 2002. Its popularity influenced the creation of at least 100 similar websites, although most are simply domain alternatives. Initially Twitter automatically translated long URLs using TinyURL, although it began using bit.ly in 2009.

In May 2009, the service .tk, which previously generated memorable domains via URL redirection, launched tweak.tk, which generates very short URLs. On 14 August 2009, WordPress announced the wp.me URL shortener for use when referring to any WordPress.com blog post. In November 2009, shortened links on bit.ly were accessed 2.1 billion times. Around that time, bit.ly and TinyURL were the most widely used URL-shortening services.

On 10 August 2009, however, tr.im announced that it was curtailing the generation of new shortened URLs, but gave an assurance that existing tr.im short URLs would "continue to redirect, and will do so until at least December 31, 2009". A blog post on the site attributed this move to several factors, including a lack of suitable revenue-generating mechanisms to cover ongoing hosting and maintenance costs, a lack of interest among possible purchasers of the service, and Twitter's default use of the bit.ly shortener. This blog post also questioned whether other shortening services could successfully make money from URL shortening in the longer term. A few days later, tr.im appeared to alter its stance, announcing that it would resume all operations "going forward, indefinitely, while we continue to consider our options in regards to tr.im's future", but by July 11, 2011, the tr.im service had failed.

In December 2009, the URL shortener TO./ NanoURL was launched by .TO. This service created a URL address which looked like http://to./xxxx, where xxxx represents a combination of random numbers and letters. NanoURL generated the shortest URLs of all URL shortening services, because it was hosted directly on a top-level domain (that of Tonga). This rare form of URL may cause problems with some browsers, however, where the string is interpreted as a search term and passed to a search engine instead of being opened. As of 2011, the service is no longer available.

On 14 December 2009, Google announced a service called Google URL Shortener at goo.gl, which originally was only available for use through Google products (such as Google Toolbar and FeedBurner). It does, however, have two extensions (Standard and Lite versions) for Google Chrome. On 21 December 2009, Google also announced a service called YouTube URL Shortener, youtu.be, and since September 2010, Google URL Shortener has become available via a direct interface.

Linkrot

The convenience offered by URL shortening also introduces potential problems, which have led to criticism of the use of these services. Short URLs, for example, will be subject to linkrot if the shortening service stops working; all URLs related to the service will become broken. It is a legitimate concern that many existing URL shortening services may not have a sustainable business model in the long term. This worry was highlighted by a statement from tr.im in August 2009 (see above). In late 2009, the Internet Archive started the "301 Works" project together with, initially, twenty collaborating companies whose short URLs will be preserved by the project. The URL shortening service ur1.ca provides its entire database as a file download, so if its website stops working, other websites may be able to provide ways to correct broken links to URLs shortened with its service.

Closure by Internet service provider

URL shortening sites are sometimes shut down by their hosting Internet service provider (ISP) because the links are being used for illicit purposes. For example, upon closing operations, "u.nu" announced:
The last straw came on September 3, 2010,
when the server was disconnected without notice by our hosting provider
in response to reports of a number of links to child pornography sites.
The disconnection of the server caused us serious problems, and to be
honest, the level and nature of the abuse has become quite demoralizing.
Given the choice between spending time and money to find a different home,
or just giving up, the latter won out.

International law

Shortened internet links typically use foreign country domain names, and are therefore under the jurisdiction of that nation. Libya, for instance, exercised its control over the .ly domain in October 2010 to shut down vb.ly for violating Libyan pornography laws. Failure to predict such problems with URL shorteners and investment in URL shortening companies may reflect a lack of due diligence.

Blocking

Some websites prevent short, redirected URLs from being posted.

In 2009, the Twitter network replaced TinyURL with Bit.ly as its default shortener of links longer than twenty-six characters. In April 2009, TinyURL was reported to be blocked in Saudi Arabia. Yahoo! Answers blocks postings that contain TinyURLs, and Wikipedia does not accept links from any URL shortening services in its articles.

Privacy and security

Users may be exposed to privacy issues through the URL shortening service's ability to track a user's behavior across many domains.

On the security side, a short URL obscures the target address and, as a result, can be used to redirect to an unexpected site. Examples of this are rickrolling and redirecting to shock sites or to affiliate websites. Short URLs can also unexpectedly redirect a user to scam pages or pages containing malware or XSS attacks, which use the redirect to bypass URL blacklists. TinyURL tries to disable spam-related links from redirecting. ZoneAlarm, however, has warned its users: "TinyURL may be unsafe. This website has been known to distribute spyware." TinyURL countered this problem by offering an option to preview a link before using a shortened URL. This option is enabled in the browser via the TinyURL website and requires the use of cookies. A preview may also be obtained by simply prefixing "preview." to the host name of the URL: for example, http://tinyurl.com/8kmfp could be retyped as http://preview.tinyurl.com/8kmfp to see where the link will lead. Security professionals suggest that users should always preview a short URL before accessing it, following an instance where the URL shortening service cli.gs was compromised, exposing millions of users to security uncertainties.
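The manual rewrite described above can be automated. The following sketch turns a tinyurl.com link into its preview form by replacing the host name; the function name tinyurl_preview is hypothetical, and the code relies only on the URL pattern given in the text, not on any documented API.

```python
from urllib.parse import urlsplit, urlunsplit


def tinyurl_preview(short_url):
    """Rewrite http://tinyurl.com/<key> as http://preview.tinyurl.com/<key>."""
    parts = urlsplit(short_url)
    host = parts.hostname or ""
    if host not in ("tinyurl.com", "www.tinyurl.com"):
        raise ValueError("not a tinyurl.com link: %r" % short_url)
    # Keep the scheme, path, query and fragment; swap only the host name.
    return urlunsplit(
        (parts.scheme, "preview.tinyurl.com", parts.path, parts.query, parts.fragment)
    )


print(tinyurl_preview("http://tinyurl.com/8kmfp"))
# http://preview.tinyurl.com/8kmfp
```

Links from other services are rejected with a ValueError, since the preview convention described here is specific to TinyURL.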

Some URL shortening services have started filtering their links through services like Google Safe Browsing. Many sites that accept user-submitted content, however, block links to certain domains in order to cut down on spam, and for this reason known URL redirection services are often themselves added to spam blacklists.
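A site that blocks links to certain domains might apply a check along the following lines. This is a simplified sketch: the domain set SHORTENER_DOMAINS is an illustrative sample drawn from services named in this article, not an actual blacklist, and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Illustrative sample only; real blacklists are far larger.
SHORTENER_DOMAINS = {"bit.ly", "tinyurl.com", "goo.gl"}


def uses_blocked_shortener(url, blocked=SHORTENER_DOMAINS):
    """Return True if the URL's host name is a listed shortening service."""
    try:
        host = (urlsplit(url).hostname or "").lower()
    except ValueError:      # malformed input such as an unclosed IPv6 bracket
        return False
    if host.startswith("www."):
        host = host[4:]
    return host in blocked


def find_blocked_links(text, blocked=SHORTENER_DOMAINS):
    """List the whitespace-separated words in a post that are blocked links."""
    return [word for word in text.split() if uses_blocked_shortener(word, blocked)]
```

For a post containing both http://bit.ly/urlwiki and http://en.wikipedia.org/wiki/URL_shortening, find_blocked_links returns only the bit.ly link.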

Due to such problems, websites such as FindHiddenURL have appeared. Such websites take a shortened URL and provide the user with the original hidden link, along with a description of the link.

Additional layer of complexity

Short URLs, although making it easier to access what might otherwise be a very long URL or user-space on an ISP server, add an additional layer of complexity to the process of retrieving web pages. Every access requires more requests (at least one more DNS lookup and HTTP request), thereby increasing latency (the time taken to access the page) and also the risk of failure, since the shortening service may become unavailable. Another operational limitation of URL shortening services is that browsers do not resend POST bodies when a redirect is encountered. This can be overcome by making the service a reverse proxy, or by elaborate schemes involving cookies and buffered POST bodies, but such techniques present security and scaling challenges, and are therefore not used on extranets or Internet-scale services.

See also

  • Clean URL – http://example.com/index.asp?mod=profiles&id=193 becomes http://example.com/user/john-doe
  • Country code top-level domain
  • Domain hack – an unconventional domain name that spells out the full "name" or title of the domain e.g. http://del.icio.us or http://goo.gl
  • Domain name system
  • Generic top-level domain
  • Link rot
  • List of Internet top-level domains
  • Semantic URL
  • Top-level domain
  • URL redirection
  • Vanity domain
  • Vanity URL


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 