URL redirection
URL redirection, also called URL forwarding, and the very similar technique of domain redirection, also called domain forwarding, are techniques on the World Wide Web for making a web page available under many URLs.

Similar domain names

A user might mis-type a URL, for example "exmaple.com" instead of "example.com". Organizations often register these misspelled domains and redirect them to the correct location: example.com. The addresses example.com and example.net could both redirect to a single domain, or web page, such as example.org. This technique is often used to "reserve" other top-level domains (TLDs) with the same name, or make it easier for a true ".edu" or ".net" to redirect to a more recognizable ".com" domain.

Moving a site to a new domain

A web page may be redirected for several reasons:
  • a web site might need to change its domain name;
  • an author might move his or her pages to a new domain;
  • two web sites might merge.


With URL redirects, incoming links to an outdated URL can be sent to the correct location. These links might be from other sites that have not realized that there is a change or from bookmarks/favorites that users have saved in their browsers.

The same applies to search engines. They often have the older/outdated domain names and links in their database and will send search users to these old URLs. By using a "moved permanently" redirect to the new URL, visitors will still end up at the correct page. Also, in the next search engine pass, the search engine should detect and use the newer URL.

Logging outgoing links

The access logs of most web servers keep detailed information about where visitors came from and how they browsed the hosted site. They do not, however, log which links visitors left by. This is because the visitor's browser has no need to communicate with the original server when the visitor clicks on an outgoing link.

This information can be captured in several ways. One way involves URL redirection. Instead of sending the visitor straight to the other site, links on the site can direct to a URL on the original website's domain that automatically redirects to the real target. This technique bears the downside of the delay caused by the additional request to the original website's server. As this added request will leave a trace in the server log, revealing exactly which link was followed, it can also be a privacy issue.
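
The mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tracker address http://www.example.com/out, the query parameter name url, and the function names are all hypothetical, and a plain list stands in for the server's access log.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical tracking endpoint on the original site (an assumption for this sketch).
TRACKER = "http://www.example.com/out"


def tracking_link(target):
    """Wrap an external URL so the click first passes through the tracker."""
    return TRACKER + "?" + urlencode({"url": target})


def handle_click(request_url, log):
    """Record which link was followed, then answer with a redirect to the real target."""
    params = parse_qs(urlsplit(request_url).query)
    target = params["url"][0]
    log.append(target)  # stands in for a line in the server's access log
    return 302, {"Location": target}
```

A page would link to tracking_link("http://externalsite.com/page") instead of the external address itself. A real deployment would probably also restrict which targets it accepts, since an unrestricted redirector can be abused.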

The same technique is also used by some corporate websites to implement a statement that the subsequent content is at another site, and therefore not necessarily affiliated with the corporation. In such scenarios, displaying the warning causes an additional delay.

Short aliases for long URLs

Web applications often include lengthy descriptive attributes in their URLs which represent data hierarchies, command structures, transaction paths and session information. This practice results in a URL that is aesthetically unpleasant and difficult to remember, and which may not fit within the size limitations of microblogging sites. URL shortening services provide a solution to this problem by redirecting a user to a longer URL from a shorter one.
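
As a rough illustration of how such a service might work, the sketch below keeps an in-memory table from short codes to long URLs and answers lookups with a 301 redirect. The class name, the base-62 code scheme and the choice of status code are assumptions made for this example; actual shortening services differ in all of these.

```python
import string

# 62 symbols: digits, then lowercase, then uppercase letters.
ALPHABET = string.digits + string.ascii_letters


def encode_id(n):
    """Turn a non-negative counter into a short base-62 code."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


class Shortener:
    """Hypothetical in-memory shortener: code -> long URL."""

    def __init__(self):
        self._targets = {}
        self._next_id = 1

    def shorten(self, long_url):
        code = encode_id(self._next_id)
        self._next_id += 1
        self._targets[code] = long_url
        return code

    def resolve(self, code):
        """Return (status, headers) for a request to the short URL."""
        target = self._targets.get(code)
        if target is None:
            return 404, {}
        return 301, {"Location": target}
```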

Meaningful, persistent aliases for long or changing URLs

Sometimes the URL of a page changes even though the content stays the same. Therefore URL redirection can help users who have bookmarks. This is routinely done on Wikipedia whenever a page is renamed.

Manipulating search engines

Some years ago, redirect techniques were used to fool search engines. For example, one page could show popular search terms to search engines but redirect the visitors to a different target page. There are also cases where redirects have been used to "steal" the page rank of one popular page and use it for a different page, usually involving the 302 HTTP status code of "moved temporarily."

Search engine providers noticed the problem and took appropriate actions. Usually, sites that employ such techniques to manipulate search engines are punished automatically by reducing their ranking or by excluding them from the search index.

As a result, today, such manipulations usually result in less rather than more site exposure.

Satire and criticism

In the same way that a Google bomb can be used for satire and political criticism, a domain name that conveys one meaning can be redirected to any other web page, sometimes with malicious intent. The website shadyurl.com offers a satirical service that will create an apparently "suspicious and frightening" redirection URL for even benign webpages. For example, an input of en.wikipedia.org generates 5z8.info/hookers_e4u5_inject_worm.

Manipulating visitors

URL redirection is sometimes used as a part of phishing attacks that confuse visitors about which web site they are visiting. Because modern browsers always show the real URL in the address bar, the threat is lessened. However, redirects can also take users to sites that attempt to attack in other ways. For example, a redirect might take a user to a site that tricks them into downloading what claims to be antivirus software but is in fact a trojan.

Removing referer information

When a link is clicked, the browser sends along in the HTTP request a field called referer which indicates the source of the link. This field is populated with the URL of the current web page, and will end up in the logs of the server serving the external link. Since sensitive pages may have sensitive URLs (for example, http://company.com/plans-for-the-next-release-of-our-product), it is not desirable for the referer URL to leave the organization. A redirection page that performs referrer hiding could be embedded in all external URLs, transforming for example http://externalsite.com/page into http://redirect.company.com/http://externalsite.com/page. This technique also eliminates other potentially sensitive information from the referer URL, such as the session ID, and can reduce the chance of phishing by indicating to the end user that they passed a clear gateway to another site.

Techniques

There are several techniques to implement a redirect. In many cases, the Refresh meta tag is the simplest one. However, there exist several strong opinions discouraging this method.

Manual redirect

The simplest technique is to ask the visitor to follow a link to the new page, usually using an HTML anchor such as:

Please follow <a href="http://www.example.com/">this link</a>.

This method is often used as a fall-back for automatic methods — if the visitor's browser does not support the automatic redirect method, the visitor can still reach the target document by following the link.

HTTP status codes 3xx

In the HTTP protocol used by the World Wide Web, a redirect is a response with a status code beginning with 3 that induces a browser to go to another location, with annotation describing the reason, which allows for the correct subsequent action (such as changing links in the case of code 301, a permanent change of address).

The HTTP standard defines several status codes for redirection:
  • 300 multiple choices (e.g. offer different languages)
  • 301 moved permanently
  • 302 found (originally temporary redirect, but now commonly used to specify redirection for unspecified reason)
  • 303 see other (e.g. for results of cgi-scripts)
  • 307 temporary redirect


All of these status codes require that the URL of the redirect target be given in the Location: header of the HTTP response. The 300 multiple choices will usually list all choices in the body of the message and show the default choice in the Location: header.

Within the 3xx range, there are also some status codes that are quite different from the above redirects (they are not discussed here with their details):
  • 304 not modified
  • 305 use proxy


This is a sample of an HTTP response that uses the 301 "moved permanently" redirect:


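HTTP/1.1 301 Moved Permanently
Location: http://www.example.org/
Content-Type: text/html
Content-Length: 174

<html>
<head>
<title>Moved</title>
</head>
<body>
<h1>Moved</h1>
<p>This page has moved to <a href="http://www.example.org/">http://www.example.org/</a>.</p>
</body>
</html>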
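
A response of this shape could be assembled programmatically along the following lines. This is a minimal Python sketch, not a complete server: the function name build_redirect_response is hypothetical, the HTML body is an assumed fallback page for clients that do not follow the redirect, and the reason phrase is looked up from the standard library's http.HTTPStatus.

```python
from html import escape
from http import HTTPStatus


def build_redirect_response(location, status=301):
    """Return the raw bytes of a 3xx response pointing at `location`."""
    if "\r" in location or "\n" in location:
        # Refuse values that would inject extra header lines.
        raise ValueError("line breaks are not allowed in a Location value")
    link = escape(location)  # safe to embed in HTML text and attributes
    body = (
        "<html>\n<head>\n<title>Moved</title>\n</head>\n<body>\n"
        "<h1>Moved</h1>\n"
        f'<p>This page has moved to <a href="{link}">{link}</a>.</p>\n'
        "</body>\n</html>\n"
    ).encode("utf-8")
    head = (
        f"HTTP/1.1 {status} {HTTPStatus(status).phrase}\r\n"
        f"Location: {location}\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body
```

Computing Content-Length from the encoded body avoids hand-counting bytes, which is easy to get wrong when the target URL changes.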





Using server-side scripting for redirection

Often, web authors don't have sufficient permissions to produce these status codes: the HTTP header is generated by the web server program and not read from the file for that URL. Even for CGI scripts, the web server usually generates the status code automatically and allows custom headers to be added by the script. To produce HTTP status codes with CGI scripts, one needs to enable non-parsed headers.

Sometimes, it is sufficient to print the "Location: 'url'" header line from a normal CGI script. Many web servers choose one of the 3xx status codes for such replies.

Frameworks for server-side content generation typically require that HTTP headers be generated before response data. As a result, the web programmer who is using such a scripting language to redirect the user's browser to another page must ensure that the redirect is the first or only part of the response. In the ASP scripting language, this can also be accomplished using response.buffer=true and response.redirect "http://www.example.com/". Using PHP, one can use the header function as follows:


header('HTTP/1.1 301 Moved Permanently');
header('Location: http://www.example.com/');
exit;


According to the original HTTP/1.1 specification (RFC 2616), the Location header must contain an absolute URI. When redirecting from one page to another within the same site, it is a common mistake to use a relative URI. As a result, most browsers tolerate relative URIs in the Location header, although some display a warning to the end user. Later revisions of the HTTP specification (RFC 7231 and its successors) permit relative references, which are resolved against the URL of the request.
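
A server-side script that wants to be safe with strict clients can resolve a relative target against the request URL before emitting the header. The sketch below uses urljoin from Python's standard library; the helper name absolute_location is hypothetical.

```python
from urllib.parse import urljoin


def absolute_location(request_url, location):
    """Resolve a possibly relative redirect target against the requested URL."""
    return urljoin(request_url, location)
```

For a request to http://www.example.com/old/page.html, a target of /newpage.html resolves to http://www.example.com/newpage.html, while an already absolute target is returned unchanged.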


On the Apache web server, redirects can be configured with mod_alias or with mod_rewrite. The mod_alias directives are simpler, but they do not offer the flexibility that mod_rewrite offers:

Redirect permanent /oldpage.html http://www.example.com/newpage.html
Redirect 301 /oldpage.html http://www.example.com/newpage.html



To redirect requests for any non-canonical domain name using mod_rewrite, in .htaccess or within a <Directory> section in an Apache config file:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^([^.:]+\.)*oldsite\.example\.com\.?(:[0-9]*)?$ [NC]
RewriteRule ^(.*)$ http://newsite.example.net/$1 [R=301,L]


Use of .htaccess for this purpose usually does not require administrative permissions. However, .htaccess can be disabled by your host, and so may not work (or continue to work) if they do so.

In addition, some server configurations may require the addition of the line:

Options +FollowSymLinks

ahead of the "RewriteEngine on" directive, in order to enable the mod_rewrite module.

When you have access to the main Apache config files (such as httpd.conf), it is best to avoid the use of .htaccess files.

If the code is placed into an Apache config file and not within any <Directory> container, then the RewriteRule pattern must be changed to include a leading slash:

RewriteEngine on

RewriteCond %{HTTP_HOST} ^([^.:]+\.)*oldwebsite\.com\.?(:[0-9]*)?$ [NC]
RewriteRule ^/(.*)$ http://www.preferredwebsite.net/$1 [R=301,L]
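
To see which requests the last pair of rules would catch, the same two patterns can be tried out in Python, whose re module accepts this syntax unchanged. This is only an approximation for experimenting with the regular expressions: the function name rewrite is hypothetical, the [NC] flag is modelled with re.IGNORECASE, and query-string handling and other Apache behaviour are not reproduced.

```python
import re

# RewriteCond %{HTTP_HOST} ... [NC]  ->  case-insensitive match on the Host header
HOST_RE = re.compile(r"^([^.:]+\.)*oldwebsite\.com\.?(:[0-9]*)?$", re.IGNORECASE)
# RewriteRule ^/(.*)$ ...            ->  capture the path after the leading slash
PATH_RE = re.compile(r"^/(.*)$")


def rewrite(host, path):
    """Return (status, new_url) if the request would be redirected, else None."""
    if not HOST_RE.match(host):
        return None
    m = PATH_RE.match(path)
    if m is None:
        return None
    return 301, "http://www.preferredwebsite.net/" + m.group(1)
```

The host pattern accepts the bare domain, any subdomain, an optional trailing dot and an optional port, but not a different domain that merely ends in the same letters.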

Refresh Meta tag and HTTP refresh header

Netscape introduced a feature to refresh the displayed page after a certain amount of time. This method is often called meta refresh. It is possible to specify the URL of the new page, thus replacing one page after some time by another page:
  • HTML <meta> tag
  • An exploration of dynamic documents
  • Meta refresh



A timeout of 0 seconds means an immediate redirect. A meta refresh with a timeout of 0 seconds is reportedly treated by Google as a 301 permanent redirect, which allows PageRank to be transferred from static HTML files.

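This is an example of a simple HTML document that uses this technique:

<html>
<head>
<meta http-equiv="Refresh" content="0; url=http://www.example.com/" />
</head>
<body>
<p>Please follow <a href="http://www.example.com/">this link</a>.</p>
</body>
</html>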





  • This technique is usable by all web authors because the meta tag is contained inside the document itself.
  • The meta tag must be placed in the "head" section of the HTML file.
  • The number "0" in this example may be replaced by another number to achieve a delay of that many seconds.
  • This is a proprietary extension to HTML introduced by Netscape but supported by most web browsers. The manual link in the "body" section is for users whose browsers do not support this feature.


This is an example of achieving the same effect by issuing an HTTP refresh header:


HTTP/1.1 200 OK
Refresh: 0; url=http://www.example.com/
Content-type: text/html
Content-length: 62

Please follow <a href="http://www.example.com/">this link</a>!

This response is easier to generate by CGI programs because one does not need to change the default status code.
Here is a simple CGI program that effects this redirect:

#!/usr/bin/perl

print "Refresh: 0; url=http://www.example.com/\r\n";
print "Content-type: text/html\r\n";
print "\r\n";
print "Please follow <a href=\"http://www.example.com/\">this link</a>!";

Note: Usually, the HTTP server adds the status line and the Content-length header automatically.
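
For comparison, the same CGI output can be produced from Python. The sketch below only builds the text a CGI script would write to standard output; the function name refresh_cgi_output and its delay parameter are hypothetical, and as noted above the web server is assumed to add the status line and Content-length itself.

```python
from html import escape


def refresh_cgi_output(url, delay=0):
    """Build CGI output that asks the browser to load `url` after `delay` seconds."""
    link = escape(url, quote=True)  # safe to embed in the href attribute
    return (
        f"Refresh: {delay}; url={url}\r\n"
        "Content-type: text/html\r\n"
        "\r\n"
        f'Please follow <a href="{link}">this link</a>!'
    )
```

A script would write the returned string to standard output, for example with sys.stdout.write(refresh_cgi_output("http://www.example.com/")).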

This method is considered by the W3C to be a poor method of redirection, since it does not communicate any information about either the original or new resource to the browser (or search engine). The W3C's Web Content Accessibility Guidelines (7.4) discourage the creation of auto-refreshing pages, since most web browsers do not allow the user to disable or control the refresh rate. W3C articles on the issue include "W3C Web Content Accessibility Guidelines (1.0): Ensure user control of time-sensitive content changes" and "Use standard redirects: don't break the back button!"

------------------------------
A delayed refresh is written the same way. The number at the start of the CONTENT attribute is the delay in seconds, as in CONTENT="2, and can be adjusted to suit your needs. Reportedly, a page that refreshes in under 4 seconds will not be given priority listing on search engines; some authors prefer that such pages not be listed.

Place in your head (the target URL shown here is a placeholder):

<META HTTP-EQUIV="refresh" CONTENT="2; URL=http://www.example.com/">
------------------------------

JavaScript redirects

JavaScript offers several ways to display a different page in the current browser window. Quite frequently, they are used for a redirect. However, there are several reasons to prefer the HTTP header or the refresh meta tag (whenever possible) over JavaScript redirects:
  • Security considerations
  • Some browsers don't support JavaScript
  • Many web crawlers don't execute JavaScript.

Frame redirects

A slightly different effect can be achieved by creating a single HTML frame that contains the target page:

<frameset rows="100%">
  <frame src="http://www.example.com/">
</frameset>
<noframes>
  <body>Please follow <a href="http://www.example.com/">link</a>!</body>
</noframes>


One main difference from the above redirect methods is that for a frame redirect, the browser displays the URL of the frame document and not the URL of the target page in the URL bar.

This technique is commonly called cloaking. It may be used so that the reader sees a more memorable URL or, with fraudulent intentions, to conceal a phishing site as part of website spoofing.

Redirect loops

It is quite possible that one redirect leads to another redirect. For example, the URL http://www.wikipedia.com/wiki/URL_redirection (note the differences in the domain name) is first redirected to http://www.wikipedia.org/wiki/URL_redirection and again redirected to the correct URL: http://en.wikipedia.org/wiki/URL_redirection. This is appropriate: the first redirection corrects the wrong domain name, the second redirection selects the correct language section, and finally, the browser displays the correct page.

Sometimes, however, a mistake can cause the redirection to point back to the first page, leading to an infinite loop of redirects. Browsers usually break that loop after a few steps and display an error message instead.

The HTTP standard states:

A client SHOULD detect infinite redirection loops, since such loops generate network traffic for each redirection.


Previous versions of this specification recommended a maximum of five redirections; some clients may exist that implement such a fixed limitation.
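
The kind of client-side protection described here can be sketched as follows. The code walks a redirect table held in a dictionary rather than making real HTTP requests; the function name, the RedirectError exception and the default limit of five hops are assumptions chosen for the example.

```python
class RedirectError(Exception):
    """Raised when a redirect chain loops or grows too long."""


def follow_redirects(start, redirects, max_hops=5):
    """Follow `redirects` (URL -> next URL) from `start`; return (final_url, chain)."""
    chain = [start]
    current = start
    while current in redirects:
        current = redirects[current]
        if current in chain:
            raise RedirectError("redirect loop detected at " + current)
        chain.append(current)
        if len(chain) - 1 > max_hops:
            raise RedirectError("more than %d redirects" % max_hops)
    return current, chain


# The two-step chain used as an example in this section.
EXAMPLE_TABLE = {
    "http://www.wikipedia.com/wiki/URL_redirection": "http://www.wikipedia.org/wiki/URL_redirection",
    "http://www.wikipedia.org/wiki/URL_redirection": "http://en.wikipedia.org/wiki/URL_redirection",
}
```

Remembering every URL already visited catches a loop as soon as it closes, while the hop limit bounds chains that never repeat but never end either.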


Services

There exist services that can perform URL redirection on demand, with no need for technical work or access to the webserver your site is hosted on.

URL redirection services

A redirect service is an information management system, which provides an internet link that redirects users to the desired content. The typical benefit to the user is the use of a memorable domain name, and a reduction in the length of the URL or web address. A redirecting link can also be used as a permanent address for content that frequently changes hosts, similarly to the Domain Name System.

Hyperlinks involving URL redirection services are frequently used in spam messages directed at blogs and wikis. Thus, one way to reduce spam is to reject all edits and comments containing hyperlinks to known URL redirection services; however, this will also remove legitimate edits and comments and may not be an effective method to reduce spam.

Recently, URL redirection services have taken to using AJAX as an efficient, user friendly method for creating shortened URLs.

A major drawback of some URL redirection services is the use of delay pages, or frame based advertising, to generate revenue.

History

The first redirect services took advantage of top-level domains (TLD) such as ".to" (Tonga), ".at" (Austria) and ".is" (Iceland). Their goal was to make memorable URLs. The first mainstream redirect service was V3.com, which boasted 4 million users at its peak in 2000. V3.com's success was attributed to having a wide variety of short memorable domains including "r.im", "go.to", "i.am", "come.to" and "start.at". V3.com was acquired by FortuneCity.com, a large free web hosting company, in early 1999. In 2001, .tk (Tokelau) emerged as a TLD used for memorable names. As the sales price of top-level domains started falling from $70.00 per year to less than $10.00, the demand for memorable redirection services eroded.

With the launch of TinyURL in 2002 a new kind of redirecting service was born, namely URL shortening. Its goal was to make long URLs short, to be able to post them on internet forums. Since 2006, with the 140 character limit on the extremely popular Twitter service, these short URL services have seen a resurgence.

Referrer Masking

Redirection services can hide the referrer by placing an intermediate page between the page the link is on and its destination. Although these are conceptually similar to other URL redirection services, they serve a different purpose, and they rarely attempt to shorten or obfuscate the destination URL (their only intended side-effect is to hide referrer information and provide a clear gateway between other websites).

This type of redirection is often used to prevent potentially-malicious links from gaining information using the referrer, for example a session ID in the query string. Many large community websites use link redirection on external links to lessen the chance of an exploit that could be used to steal account information, as well as make it clear when a user is leaving a service, to lessen the chance of effective phishing.

Here is a simplistic example of such a service, written in PHP.

<?php
// Escape the user-supplied target before it is echoed into the page.
$url = htmlspecialchars($_GET['url']);
header('Location: http://' . $url);
?>
<html>
<head>
<title>Redirecting...</title>
</head>
<body>
<p>Attempting to redirect to <a href="http://<?php echo $url; ?>">http://<?php echo $url; ?></a>.</p>
</body>
</html>



The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 