Introduction to Secure Development - Input Validation

Jan 27 2010

If you're like most developers, you probably have barely enough time to implement the never-ending list of requirements, much less worry about the security of your code. However, the vast majority of IT security incidents can be traced back to the development process: minor programming mistakes or design flaws can turn into big headaches when a skilled attacker discovers them. This blog is focused on discussing secure development practices. It will showcase common pitfalls and will provide practical solutions that can be easily integrated into your daily work.

This is the first in a series of regular posts about secure development. I will start with posts on some of the most common programming errors and their IT security implications, and eventually move on to other aspects of secure development. In the first ten posts, we will discuss the top ten web application vulnerabilities as reported by OWASP, the Open Web Application Security Project. For now, we will focus on Java-based web applications, based on OWASP's J2EE Top 10 list. However, most of the topics are universal and apply to almost any programming language.

Let’s start with one of the most common and far-reaching problem areas for programmers: unvalidated input. Consider the following servlet URL:

.../ImageServlet?url=http://backendhost/images/bg.gif

At first glance, this seems harmless since you are just displaying an image. However, without proper validation, an attacker could abuse this feature to gain proxy access to your internal network or even browse the contents of the server's filesystem, for example:

.../ImageServlet?url=http://weblogic/console
.../ImageServlet?url=file:///etc/passwd
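As a rough sketch of how a servlet could guard this parameter, the class below accepts a URL only if its scheme, host, and path all match an explicit allow-list. The class name ImageUrlValidator, the allowed host (taken from the example URL above), and the file-name pattern are assumptions made for illustration, not part of any real application. It assumes Java 9 or later for Set.of.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper: default-deny check for the "url" request parameter.
final class ImageUrlValidator {
    // Assumed allow-lists; adjust to the real backend.
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https");
    private static final Set<String> ALLOWED_HOSTS = Set.of("backendhost");
    // Only plain image file names directly under /images/ are accepted.
    private static final Pattern ALLOWED_PATH =
            Pattern.compile("/images/[A-Za-z0-9_-]{1,64}\\.(gif|png|jpg)");

    static boolean isAllowed(String rawUrl) {
        if (rawUrl == null) {
            return false;
        }
        final URI uri;
        try {
            uri = new URI(rawUrl);
        } catch (URISyntaxException e) {
            return false; // not parseable, so reject
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        String path = uri.getRawPath(); // raw form, so %-encoded tricks fail the pattern
        if (scheme == null || host == null || path == null) {
            return false;
        }
        // Reject embedded credentials and non-default ports.
        if (uri.getUserInfo() != null || uri.getPort() != -1) {
            return false;
        }
        return ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT))
                && ALLOWED_HOSTS.contains(host.toLowerCase(Locale.ROOT))
                && ALLOWED_PATH.matcher(path).matches();
    }
}
```

With this check, the original bg.gif URL is accepted, while the weblogic console URL and the file:// URL are both rejected because neither host nor scheme is on the list.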

Input validation is especially crucial in a web application, where the stateless nature of the HTTP protocol adds a unique set of challenges. An attacker can easily modify any part of the HTTP request: the URL, query string, headers, cookies, and form fields. Common input tampering attacks include cross-site scripting (XSS), cross-site request forgery (CSRF), buffer overflows, SQL injection, cookie poisoning, and hidden field manipulation.

Often, the root cause of input validation vulnerabilities can be traced to two common practices: 1) input is validated at the client only (via JavaScript, for example) and 2) input filtering is implemented as a blacklist, which can be bypassed.

Client-side validation caused a problem last year for Time Warner Cable customers, as described in this October 2009 Wired article. “Time Warner had hidden administrative functions from its customers with Javascript code. By simply disabling Javascript in his browser, [the attacker] was able to see those functions, which included a tool to dump the router’s configuration file. That file, it turned out, included the administrative login and password in cleartext.”

A famous case of blacklist bypass was the Samy worm on MySpace in October 2005, where the attacker used a number of techniques to inject JavaScript into his MySpace page. For example, since "javascript" was blacklisted, he simply inserted a newline character in the middle: "java\nscript", which some browsers still interpreted as "javascript" and executed as script code.
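To make the weakness concrete, here is a minimal sketch contrasting a naive blacklist with a whitelist. It is not MySpace's actual filter; the class name, method names, and the set of permitted characters are all assumptions chosen for illustration.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical comparison of the two filtering styles.
final class BlacklistVsWhitelist {
    // Naive blacklist: rejects input only if it contains the literal word.
    static boolean passesBlacklist(String input) {
        return !input.toLowerCase(Locale.ROOT).contains("javascript");
    }

    // Whitelist: accepts only letters, digits, spaces, and a little punctuation.
    private static final Pattern SAFE_TEXT =
            Pattern.compile("[A-Za-z0-9 .,!?'-]{1,200}");

    static boolean passesWhitelist(String input) {
        return input != null && SAFE_TEXT.matcher(input).matches();
    }
}
```

The string "java\nscript:alert(1)" passes the blacklist, because the newline breaks up the forbidden word, but fails the whitelist, because a newline, a colon, and parentheses are not among the permitted characters.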

To protect against these types of attacks, web developers can use three simple techniques: 1) validate all input on the server, 2) do not abuse “hidden” form fields for temporary client-side storage; use server-side session variables instead, and 3) perform all input validation by whitelisting only. A robust whitelisting approach uses a default-deny policy, and only allows input through after checking the following:

  1. Data type (string, date, integer, etc.)
  2. Minimum and maximum length
  3. Whether null/blank values are allowed
  4. Numeric range (if applicable)
  5. Specific string patterns (regular expressions): phone numbers, zip codes, e-mail addresses
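The checks above could be sketched as a small server-side helper like the one below. The class and method names are hypothetical, and the limits passed in are up to the application; this is one possible arrangement under those assumptions, not a complete validation framework.

```java
import java.util.regex.Pattern;

// Hypothetical default-deny validator: input is accepted only if every check passes.
final class FieldValidator {
    // ASCII digits only, with an optional leading minus sign.
    private static final Pattern INTEGER_FORMAT = Pattern.compile("-?[0-9]{1,10}");

    static boolean isValidInteger(String raw, boolean required, int min, int max) {
        if (raw == null || raw.isEmpty()) {
            return !required;                    // check 3: null/blank allowed?
        }
        if (!INTEGER_FORMAT.matcher(raw).matches()) {
            return false;                        // checks 1 and 2: type and length
        }
        try {
            int value = Integer.parseInt(raw);
            return value >= min && value <= max; // check 4: numeric range
        } catch (NumberFormatException e) {
            return false;                        // too large for an int
        }
    }

    static boolean isValidString(String raw, boolean required,
                                 int minLength, int maxLength, Pattern pattern) {
        if (raw == null || raw.isEmpty()) {
            return !required;                    // check 3: null/blank allowed?
        }
        if (raw.length() < minLength || raw.length() > maxLength) {
            return false;                        // check 2: length limits
        }
        return pattern.matcher(raw).matches();   // check 5: whole input must match
    }
}
```

For example, isValidInteger(request.getParameter("qty"), true, 1, 100) would accept "42" and reject "0", "abc", and a missing parameter, where "qty" is an assumed parameter name.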

Almost all modern programming languages have support for regular expressions (regexes), which are one of the most powerful text-matching tools available. If you're not familiar with them, here is a good starting point. Even if you're already a regex wizard, there are many pre-made ones available online to save you time. Check out http://regexlib.com/ for regexes you can use to validate common strings, such as e-mail addresses, dates, and mailing addresses.
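In Java, regular expressions live in the java.util.regex package. The sketch below uses a deliberately simple pattern for US ZIP codes, written for this example rather than taken from regexlib.com; the class name is also an assumption. The point it illustrates is that Matcher.matches() requires the entire input to match, whereas Matcher.find() succeeds if the pattern appears anywhere, which makes find() unsuitable for whitelisting.

```java
import java.util.regex.Pattern;

// Hypothetical holder for a validation pattern.
final class CommonPatterns {
    // Five digits, optionally followed by a hyphen and four more digits.
    static final Pattern US_ZIP = Pattern.compile("[0-9]{5}(-[0-9]{4})?");

    // matches() anchors the pattern to the whole string.
    static boolean isZip(String input) {
        return input != null && US_ZIP.matcher(input).matches();
    }
}
```

Here isZip("23219") and isZip("23219-1234") return true, while isZip("23219; DROP TABLE users") returns false. Calling US_ZIP.matcher(...).find() on that last string would return true, since the first five characters do look like a ZIP code.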

I hope you've enjoyed this first post on secure development. Next week, we'll talk about injection attacks, including the ever-popular SQL injection.

About the Author

Daniel is a business and technical systems analyst with a background in IT security and software development. He has four years of experience in the IT security field, including published academic research. His main areas of expertise include secure development, network security, and authentication. In addition to security, Daniel has a software development background in languages such as Java, PHP, SQL, and Perl. He also has over 12 years of experience working with and administering various versions of Linux and related open-source software.

Disclaimer

The words and opinions expressed here are those of each article's respective author, and do not necessarily represent the views of CapTech Ventures.