Cross-site scripting
Encyclopedia
Cross-site scripting is a type of computer security
Computer insecurity
Computer insecurity refers to the concept that a computer system is always vulnerable to attack, and that this fact creates a constant battle between those looking to improve security, and those looking to circumvent security.-Security and systems design:...

 vulnerability typically found in Web application
Web application
A web application is an application that is accessed over a network such as the Internet or an intranet. The term may also mean a computer software application that is coded in a browser-supported language and reliant on a common web browser to render the application executable.Web applications are...

s that enables attackers to inject
Code injection
Code injection is the exploitation of a computer bug that is caused by processing invalid data. Code injection can be used by an attacker to introduce code into a computer program to change the course of execution. The results of a code injection attack can be disastrous...

 client-side script into Web page
Web page
A web page or webpage is a document or information resource that is suitable for the World Wide Web and can be accessed through a web browser and displayed on a monitor or mobile device. This information is usually in HTML or XHTML format, and may provide navigation to other web pages via hypertext...

s viewed by other users. A cross-site scripting vulnerability may be used by attackers to bypass access control
Access control
Access control refers to exerting control over who can interact with a resource. Often but not always, this involves an authority, who does the controlling. The resource can be a given building, group of buildings, or computer-based information system...

s such as the same origin policy
Same origin policy
In computing, the same origin policy is an important security concept for a number of browser-side programming languages, such as JavaScript. The policy permits scripts running on pages originating from the same site to access each other's methods and properties with no specific restrictions, but...

. Cross-site scripting carried out on websites accounted for roughly 80.5% of all security vulnerabilities documented by Symantec as of 2007. Their effect may range from a petty nuisance to a significant security risk, depending on the sensitivity of the data handled by the vulnerable site and the nature of any security mitigation implemented by the site's owner.

Background

Cross-site scripting holes are web-application vulnerabilities which allow attackers to bypass client-side security mechanisms normally imposed on web content by modern web browser
Web browser
A web browser is a software application for retrieving, presenting, and traversing information resources on the World Wide Web. An information resource is identified by a Uniform Resource Identifier and may be a web page, image, video, or other piece of content...

s. By finding ways of injecting malicious scripts into web pages, an attacker can gain elevated access-privileges to sensitive page content, session cookies, and a variety of other information maintained by the browser on behalf of the user. Cross-site scripting attacks are therefore a special case of code injection
Code injection
Code injection is the exploitation of a computer bug that is caused by processing invalid data. Code injection can be used by an attacker to introduce code into a computer program to change the course of execution. The results of a code injection attack can be disastrous...

.

The expression "cross-site scripting" originally referred to the act of loading the attacked, third-party web application from an unrelated attack site, in a manner that executes a fragment of JavaScript
JavaScript
JavaScript is a prototype-based scripting language that is dynamic, weakly typed and has first-class functions. It is a multi-paradigm language, supporting object-oriented, imperative, and functional programming styles....

 prepared by the attacker in the security context
Same origin policy
In computing, the same origin policy is an important security concept for a number of browser-side programming languages, such as JavaScript. The policy permits scripts running on pages originating from the same site to access each other's methods and properties with no specific restrictions, but...

 of the targeted domain (a reflected or non-persistent XSS vulnerability). The definition gradually expanded to encompass other modes of code injection, including persistent and non-JavaScript vectors (including Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...

, ActiveX
ActiveX
ActiveX is a framework for defining reusable software components in a programming language-independent way. Software applications can then be composed from one or more of these components in order to provide their functionality....

, VBScript
VBScript
VBScript is an Active Scripting language developed by Microsoft that is modeled on Visual Basic. It is designed as a “lightweight” language with a fast interpreter for use in a wide variety of Microsoft environments...

, Flash
Adobe Flash
Adobe Flash is a multimedia platform used to add animation, video, and interactivity to web pages. Flash is frequently used for advertisements, games and flash animations for broadcast...

, or even pure HTML
HTML
HyperText Markup Language is the predominant markup language for web pages. HTML elements are the basic building-blocks of webpages....

, and SQL
SQL
SQL is a programming language designed for managing data in relational database management systems ....

 Queries), causing some confusion to newcomers to the field of information security
Information security
Information security means protecting information and information systems from unauthorized access, use, disclosure, disruption, modification, perusal, inspection, recording or destruction....

.

XSS vulnerabilities have been reported and exploited since the 1990s. Prominent sites affected in the past include the social-networking sites Twitter
Twitter
Twitter is an online social networking and microblogging service that enables its users to send and read text-based posts of up to 140 characters, informally known as "tweets".Twitter was created in March 2006 by Jack Dorsey and launched that July...

,
Facebook
Facebook
Facebook is a social networking service and website launched in February 2004, operated and privately owned by Facebook, Inc. , Facebook has more than 800 million active users. Users must register before using the site, after which they may create a personal profile, add other users as...

,
MySpace
MySpace
Myspace is a social networking service owned by Specific Media LLC and pop star Justin Timberlake. Myspace launched in August 2003 and is headquartered in Beverly Hills, California. In August 2011, Myspace had 33.1 million unique U.S. visitors....

, and Orkut
Orkut
Orkut is a social networking website that is owned and operated by Google Inc. The service is designed to help users meet new and old friends and maintain existing relationships...

. In recent years, cross-site scripting flaws surpassed buffer overflow
Buffer overflow
In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety....

s to become the most common publicly-reported security vulnerability, with some researchers viewing as many as 68% of websites as likely open to XSS attacks.

Types

There is no single, standardized classification of cross-site scripting flaws, but most experts distinguish between at least two primary flavors of XSS: non-persistent and persistent. Some sources further divide these two groups into traditional (caused by server-side code flaws) and DOM
Document Object Model
The Document Object Model is a cross-platform and language-independent convention for representing and interacting with objects in HTML, XHTML and XML documents. Aspects of the DOM may be addressed and manipulated within the syntax of the programming language in use...

-based
(in client-side code).

Non-persistent

Example of non-persistent XSS
Non-persistent XSS vulnerabilities in Google could allow malicious sites to attack Google users who visit them while logged in.


The non-persistent (or reflected) cross-site scripting vulnerability is by far the most common type. These holes show up when the data provided by a web client, most commonly in HTTP query parameters or in HTML form submissions, is used immediately by server-side scripts to generate a page of results for that user, without properly sanitizing the request.

Because HTML documents have a flat, serial structure that mixes control statements, formatting, and the actual content, any non-validated user-supplied data included in the resulting page without proper HTML encoding, may lead to markup injection. A classic example of a potential vector is a site search engine: if one searches for a string, the search string will typically be redisplayed verbatim on the result page to indicate what was searched for. If this response does not properly escape
Escape character
In computing and telecommunication, an escape character is a character which invokes an alternative interpretation on subsequent characters in a character sequence. An escape character is a particular case of metacharacters...

 or reject HTML control characters, a cross-site scripting flaw will ensue.

A reflected attack is typically delivered via email or a neutral web site. The bait is an innocent-looking URL, pointing to a trusted site but containing the XSS vector. If the trusted site is vulnerable to the vector, clicking the link can cause the victim's browser to execute the injected script.

Persistent

Example of persistent XSS
A persistent cross-zone scripting
Cross-zone scripting
Cross-zone scripting is a browser exploit taking advantage of a vulnerability within a zone-based security solution. The attack allows content in unprivileged zones to be executed with the permissions of a privileged zone - i.e. a privilege escalation within the client executing the script...

 vulnerability coupled with a computer worm
Computer worm
A computer worm is a self-replicating malware computer program, which uses a computer network to send copies of itself to other nodes and it may do so without any user intervention. This is due to security shortcomings on the target computer. Unlike a computer virus, it does not need to attach...

 allowed execution of arbitrary code and listing of filesystem contents via a QuickTime movie on MySpace
MySpace
Myspace is a social networking service owned by Specific Media LLC and pop star Justin Timberlake. Myspace launched in August 2003 and is headquartered in Beverly Hills, California. In August 2011, Myspace had 33.1 million unique U.S. visitors....

.


The persistent (or stored) XSS vulnerability is a more devastating variant of a cross-site scripting flaw: it occurs when the data provided by the attacker is saved by the server, and then permanently displayed on "normal" pages returned to other users in the course of regular browsing, without proper HTML escaping. A classic example of this is with online message boards where users are allowed to post HTML formatted messages for other users to read.

For example, suppose there is a dating website where members scan the profiles of other members to see if they look interesting. For privacy reasons, this site hides everybody's real name and email. These are kept secret on the server. The only time a member's real name and email are in the browser is when the member is signed in, and they can't see anyone else's.

Suppose that Mallory, a hacker, joins the site and wants to figure out the real names of the men she sees on the site. To do so, she writes a script that runs from men's browsers when they visit her profile. The script then sends a quick message to her own server, which collects this information.

To do this, for the question "Describe your Ideal First Date", Mallory gives a short answer (to appear normal) but the text at the end of her answer is her script to steal names and emails. If the script is enclosed inside a <script> element, it won't be shown on the screen. Then suppose that Bob, a member of the dating site, reaches Mallory’s profile, which has her answer to the First Date question. Her script is run automatically by the browser and steals a copy of Bob’s real name and email directly from his own machine.

Persistent XSS can be more significant than other types because an attacker's malicious script is rendered automatically, without the need to individually target victims or lure them to a third-party website. Particularly in the case of social networking sites, the code would be further designed to self-propagate across accounts, creating a type of a client-side worm
Computer worm
A computer worm is a self-replicating malware computer program, which uses a computer network to send copies of itself to other nodes and it may do so without any user intervention. This is due to security shortcomings on the target computer. Unlike a computer virus, it does not need to attach...

.

The methods of injection can vary a great deal; in some cases, the attacker may not even need to directly interact with the web functionality itself to exploit such a hole. Any data received by the web application (via email, system logs, etc.) that can be controlled by an attacker could become an injection vector.

Traditional versus DOM-based vulnerabilities

Example of DOM-based XSS
Before the bug was resolved, Bugzilla error pages were open to DOM
Document Object Model
The Document Object Model is a cross-platform and language-independent convention for representing and interacting with objects in HTML, XHTML and XML documents. Aspects of the DOM may be addressed and manipulated within the syntax of the programming language in use...

-based XSS attack in which arbitrary HTML and scripts could be injected using forced error messages.

Traditionally, cross-site scripting vulnerabilities would occur in server-side code responsible for preparing the HTML response to be served to the user. With the advent of web 2.0
Web 2.0
The term Web 2.0 is associated with web applications that facilitate participatory information sharing, interoperability, user-centered design, and collaboration on the World Wide Web...

 applications a new class of XSS flaws emerged: DOM-based vulnerabilities. DOM-based vulnerabilities occur in the content processing stages performed by the client, typically in client-side JavaScript
Client-side JavaScript
Client-side JavaScript is JavaScript that runs on the client-side. While JavaScript was originally created to run this way, the term was coined because the language is no longer limited to just client-side, since server-side JavaScript is now available.-Environment:The most common Internet media...

. The name refers to the standard model for representing HTML or XML contents which is called the Document Object Model
Document Object Model
The Document Object Model is a cross-platform and language-independent convention for representing and interacting with objects in HTML, XHTML and XML documents. Aspects of the DOM may be addressed and manipulated within the syntax of the programming language in use...

 (DOM). JavaScript programs manipulate the state of a web page and populate it with dynamically-computed data primarily by acting upon the DOM.

A typical example is a piece of JavaScript accessing and extracting data from the URL via the location.* DOM, or receiving raw non-HTML data from the server via XMLHttpRequest
XMLHttpRequest
XMLHttpRequest is an API available in web browser scripting languages such as JavaScript. It is used to send HTTP or HTTPS requests directly to a web server and load the server response data directly back into the script. The data might be received from the server as XML text or as plain text...

, and then using this information to write dynamic HTML without proper escaping, entirely on client side.

Exploit scenarios

Attackers intending to exploit cross-site scripting vulnerabilities must approach each class of vulnerability differently. For each class, a specific attack vector is described here. The names below are technical terms, taken from the cast of characters
Alice and Bob
The names Alice and Bob are commonly used placeholder names for archetypal characters in fields such as cryptography and physics. The names are used for convenience; for example, "Alice sends a message to Bob encrypted with his public key" is easier to follow than "Party A sends a message to Party...

 commonly used in computer security.

Non-persistent:
  1. Alice often visits a particular website, which is hosted by Bob. Bob's website allows Alice to log in with a username/password pair and stores sensitive data, such as billing information.
  2. Mallory observes that Bob's website contains a reflected XSS vulnerability.
  3. Mallory crafts a URL to exploit the vulnerability, and sends Alice an email, enticing her to click on a link for the URL under false pretenses. This URL will point to Bob's website (either directly or through an iframe
    IFrame
    iFrame can be:* I-frames, in video compression; see video compression picture types* iFrame * The HTML iframe element....

     or ajax
    Ajax (programming)
    Ajax is a group of interrelated web development methods used on the client-side to create asynchronous web applications...

    ), but will contain Mallory's malicious code, which the website will reflect.
  4. Alice visits the URL provided by Mallory while logged into Bob's website.
  5. The malicious script embedded in the URL executes in Alice's browser, as if it came directly from Bob's server (this is the actual XSS vulnerability). The script can be used to send Alice's session cookie to Mallory. Mallory can then use the session cookie to steal sensitive information available to Alice (authentication credentials, billing info, etc.) without Alice's knowledge.


Persistent attack:
  1. Mallory posts a message with malicious payload to a social network.
  2. When Bob reads the message, Mallory's XSS steals Bob's cookie.
  3. Mallory can now hijack
    Session hijacking
    In computer science, session hijacking is the exploitation of a valid computer session—sometimes also called a session key—to gain unauthorized access to information or services in a computer system. In particular, it is used to refer to the theft of a magic cookie used to authenticate a user to a...

     Bob's session and impersonate Bob.

Framework:

The Browser Exploitation Framework
BeEF (Browser Exploitation Framework)
The Browser Exploitation Framework is a powerful professional security tool. BeEF is pioneering techniques that provide the experienced penetration tester with practical client side attack vectors....

 could be used to attack the web site and the user's local environment.

Contextual output encoding/escaping of string input

The primary defense mechanism to stop XSS is contextual output encoding/escaping. There are several different escaping schemes that must be used depending on where the untrusted string needs to be placed within an HTML document including HTML entity encoding, JavaScript escaping, CSS escaping, and URL (or percent) encoding
Percent-encoding
Percent-encoding, also known as URL encoding, is a mechanism for encoding information in a Uniform Resource Identifier under certain circumstances. Although it is known as URL encoding it is, in fact, used more generally within the main Uniform Resource Identifier set, which includes both Uniform...

. Most web applications that do not need to accept rich data can use escaping to largely eliminate the risk of XSS in a fairly straightforward manner.

It is worth noting that although it is widely recommended, simply performing HTML entity encoding on the five XML significant characters is not always sufficient to prevent many forms of XSS. Encoding can be tricky, and the use of a security encoding library is highly recommended.

Safely validating untrusted HTML input

Many operators of particular web applications (e.g. forums and webmail) wish to allow users to utilize some of the features HTML provides, such as a limited subset of HTML markup. When accepting HTML input from users (say, <b>very</b> large), output encoding (such as &lt;b&gt;very&lt;/b&gt; large) will not suffice since the user input needs to be rendered as HTML by the browser (so it shows as "very large", instead of "<b>very</b> large"). Stopping XSS when accepting HTML input from users is much more complex in this situation. Untrusted HTML input must be run through an HTML policy engine to ensure that it does not contain XSS. HTML sanitization
HTML sanitization
HTML sanitization is the process of examining an HTML document and producing a new HTML document that preserves only whatever tags are designated "safe"...

 tools such as OWASP AntiSamy and http://htmlpurifier.org/ accomplish this task.

Cookie security

Besides content filtering, other imperfect methods for cross-site scripting mitigation are also commonly used. One example is the use of additional security controls when handling cookie
HTTP cookie
A cookie, also known as an HTTP cookie, web cookie, or browser cookie, is used for an origin website to send state information to a user's browser and for the browser to return the state information to the origin site...

-based user authentication. Many web applications rely on session cookies for authentication between individual HTTP requests, and because client-side scripts generally have access to these cookies, simple XSS exploits can steal these cookies. To mitigate this particular threat (though not the XSS problem in general), many web applications tie session cookies to the IP address of the user who originally logged in, and only permit that IP to use that cookie. This is effective in most situations (if an attacker is only after the cookie), but obviously breaks down in situations where an attacker spoofs their IP address
IP address spoofing
In computer networking, the term IP address spoofing or IP spoofing refers to the creation of Internet Protocol packets with a forged source IP address, called spoofing, with the purpose of concealing the identity of the sender or impersonating another computing system.-Background:The basic...

, is behind the same NAT
Network address translation
In computer networking, network address translation is the process of modifying IP address information in IP packet headers while in transit across a traffic routing device....

ed IP address or web proxy—or simply opts to tamper with the site or steal data through the injected script, instead of attempting to hijack the cookie for future use.

Another mitigation present in IE (since version 6), Firefox (since version 2.0.0.5), Safari (since version 4), Opera (since version 9.5) and Google Chrome, is a HttpOnly flag which allows a web server to set a cookie that is unavailable to client-side scripts. While beneficial, the feature does not fully prevent cookie theft nor can it prevent attacks within the browser.

Disabling scripts

Finally, while Web 2.0
Web 2.0
The term Web 2.0 is associated with web applications that facilitate participatory information sharing, interoperability, user-centered design, and collaboration on the World Wide Web...

 and Ajax
Ajax (programming)
Ajax is a group of interrelated web development methods used on the client-side to create asynchronous web applications...

 designers favor the use of JavaScript, some web applications are written to (sometimes optionally) operate completely without the need for client-side scripts. This allows users, if they choose, to disable scripting in their browsers before using the application. In this way, even potentially malicious client-side scripts could be inserted unescaped on a page, and users would not be susceptible to XSS attacks.

Some browsers or browser plugins can be configured to disable client-side scripts on a per-domain basis. If scripting is allowed by default, then this approach is of limited value, since it blocks bad sites only after the user knows that they are bad, which is too late. Functionality that blocks all scripting and external inclusions by default and then allows the user to enable it on a per-domain basis is more effective. This has been possible for a long time in IE (since version 4) by setting up its so called "Security Zones", and in Opera (since version 9) using its "Site Specific Preferences". A solution for Firefox and other Gecko
Gecko (layout engine)
Gecko is a free and open source layout engine used in many applications developed by Mozilla Foundation and the Mozilla Corporation , as well as in many other open source software projects....

-based browsers is the open source NoScript
NoScript
NoScript is a free and open-source extension for Mozilla Firefox, SeaMonkey, and other Mozilla-based web browsers, created and actively maintained by Giorgio Maone, an Italian software developer and member of the Mozilla Security Group...

 add-on which, in addition to the ability to enable scripts on a per-domain basis, provides some anti-XSS protection even when scripts are enabled.

The most significant problem with blocking all scripts on all websites by default is substantial reduction in functionality and responsiveness (client-side scripting can be much faster than server-side scripting because it does not need to connect to a remote server and the page or frame does not need to be reloaded). Another problem with script blocking is that many users do not understand it, and do not know how to properly secure their browsers. Yet another drawback is that many sites do not work without client-side scripting, forcing users to disable protection for that site and opening their systems to vulnerabilities. The Firefox NoScript extension enables users to allow scripts selectively from a given page while disallowing others on the same page. For example, scripts from example.com could be allowed, while scripts from advertisingagency.com that are attempting to run on the same page could be disallowed.

Emerging defensive technologies

There are three classes of XSS defense that are emerging. These include, Mozilla's Content Security Policy, Javascript Sandbox tools, and Auto-escaping templates. These mechanisms are still evolving but promise a future of heavily reduced XSS.

Scanning service

Some companies offer a periodic scan service, essentially simulating an attack from their server to a client's in order to check if the attack is successful. If the attack succeeds, the client receives detailed information on how it was performed and thus has a chance to fix the issues before the same attack is attempted by someone else. A trust seal
Trust seal
A Trust seal is a seal awarded by proprietary companies to businesses to display, in an attempt to boost customer confidence. There are several well-known trust seals from different companies, and the requirements for the displaying merchant vary, but typically involve a dedication to good security...

 can be displayed on the site that passes a recent scan. The scanner may not find all possible vulnerabilities, and therefore sites with trust seals may still be vulnerable to new types of attack, but the scan may detect some problems. After the client fixes them, the site is more secure than it was before using the service. For sites that require complete mitigation of XSS, assessment techniques like manual code review are necessary. Additionally, if javascript is executing on the page, the seal can be overwritten with a static copy of the seal.

Related vulnerabilities

Several classes of vulnerabilities or attack techniques are related to XSS: cross-zone scripting
Cross-zone scripting
Cross-zone scripting is a browser exploit taking advantage of a vulnerability within a zone-based security solution. The attack allows content in unprivileged zones to be executed with the permissions of a privileged zone - i.e. a privilege escalation within the client executing the script...

 exploits "zone" concepts in certain browsers and usually executes code with a greater privilege. HTTP header injection
HTTP header injection
HTTP header injection is a general class of web application security vulnerability which occurs when Hypertext Transfer Protocol headers are dynamically generated based on user input...

 can be used to create cross-site scripting conditions due to escaping problems on HTTP protocol level (in addition to enabling attacks such as HTTP response splitting
HTTP response splitting
HTTP response splitting is a form of web application vulnerability, resulting from the failure of the application or its environment to properly sanitize input values...

).

Cross-site request forgery
Cross-site request forgery
Cross-site request forgery, also known as a one-click attack or session riding and abbreviated as CSRF or XSRF, is a type of malicious exploit of a website whereby unauthorized commands are transmitted from a user that the website trusts...

 (CSRF/XSRF) is almost the opposite of XSS, in that rather than exploiting the user's trust in a site, the attacker (and his malicious page) exploits the site's trust in the client software, submitting requests that the site believes represent conscious and intentional actions of authenticated users.

Lastly, SQL injection
SQL injection
A SQL injection is often used to attack the security of a website by inputting SQL statements in a web form to get a badly designed website in order to dump the database content to the attacker. SQL injection is a code injection technique that exploits a security vulnerability in a website's software...

 exploits a vulnerability in the database layer of an application. When user input is incorrectly filtered any SQL statements can be executed by the application.

See also

  • Same origin policy
    Same origin policy
    In computing, the same origin policy is an important security concept for a number of browser-side programming languages, such as JavaScript. The policy permits scripts running on pages originating from the same site to access each other's methods and properties with no specific restrictions, but...

  • Metasploit Project
    Metasploit Project
    The Metasploit Project is an open-source computer security project which provides information about security vulnerabilities and aids in penetration testing and IDS signature development....

    , an open-source penetration testing tool that includes tests for XSS
  • w3af
    W3af
    w3af is an open-source web application security scanner. The project provides a vulnerability scanner and exploitation tool for Web applications...

    , an open-source web application security scanner
  • NoScript
    NoScript
    NoScript is a free and open-source extension for Mozilla Firefox, SeaMonkey, and other Mozilla-based web browsers, created and actively maintained by Giorgio Maone, an Italian software developer and member of the Mozilla Security Group...

     and NotScripts
    NotScripts
    NotScripts is a free and open-source extension for Google Chrome, Chromium, and Opera web browsers. NotScripts blocks execution of JavaScript, Java, Flash, Silverlight, and other plugins and scripted content. NotScripts has a whitelist to allow execution of scripts from certain sites. -External...

    , free open-source extensions for Mozilla Firefox and Google Chrome (respectively) that block execution of scripts
  • XSSer: automatic -framework- to detect, exploit and report XSS vulnerabilities
  • Cross-document messaging
    Cross-document messaging
    Cross-document messaging, or web messaging, is an API introduced in the WHATWG HTML5 draft specification, allowing documents to communicate with one another across different origins, or source domains. Prior to HTML5, web browsers disallowed cross-site scripting, to protect against security attacks...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK