screen shot of SQL Server Parameterization instructions

SQL Injection – Beyond the Basics

How to protect against SQL Injection

One of the biggest problems with SQL injection is that while documentation of this exploit has existed for over a decade, it still remains a tremendously effective attack vector for hackers. For example, during the time frame in which I wrote this article, tens of thousands of sites were compromised using SQL injection thanks to the automated Lilupophilupop attack. SQL injection also featured prominently in the news since it was the primary attack vector used in a number of well-publicized exploits against larger companies in which there were significant data breaches.

As these examples show, there's still a huge disconnect on the part of many developers and DBAs about how to properly protect against SQL injection. To help eliminate this disconnect, I'll review some common myths about dealing with SQL injection before transitioning to an overview of the core concepts that provide the key to true protection. In this discussion, I'm assuming that you have a basic understanding of how SQL injection works and that you understand that it's not a flaw or limitation of relational databases but rather a coding issue. (If you don't know how SQL injection works, see the SQLinjection web page in Wikipedia.) 

Myth 1: Scrubbing User-Supplied Data Prevents SQL Injection

Far too many developers assume that SQL injection can be defeated by merely "scrubbing" out bad input that hackers try to inject. With this flawed mindset, developers attempt to ritualistically protect databases by passing all user-supplied data through well-intentioned helper functions designed to scrub out malicious input. However, developers routinely forget to process non-obvious input such as values from a query string or cookies in web applications. More important, this approach can be easy for hackers to defeat because it doesn't address the root cause of SQL injection.

At the most basic level, most scrubbing routines attempt to replace single ticks (or the apostrophe character) with double ticks as a means of escaping any malicious input provided by end users. However, there's simply so much more involved in protecting against SQL injection than worrying about simple string-replacement routines for ticks. In fact, the biggest problem with scrubbing routines is that most SQL injection attacks are automated through the use of highly complex tooling that efficiently and intelligently probes for a multitude of vulnerabilities in a very short amount of time. (A great open-source white-hat injection testing tool is sqlmap.) 

With automated injection tools, hackers can test dictionaries of complex combinations of encoding tricks, escape routines, homoglyphs, and uncommonly used Unicode and ASCII characters as a means of trying to compromise overly simplistic string-replacement routines defined by developers who don't think like hackers. Moreover, sophisticated attacks will even attempt (in automated fashion) to combine input from multiple fields as a means of leveraging truncation and other tricks in order to create compound attacks that introduce a much wider range of possible injection vectors.

Hackers only need to find a single weakness or flaw to exploit, whereas developers have to stay one step ahead of all techniques and dictionary-like attacks that hackers can use in automated fashion to find exploits. As such, scrubbing is a fundamentally flawed approach to dealing with SQL injection.

Myth 2: Whitelisting Provides Complete Protection

Given that scrubbing, or blacklisting, is a flawed approach to security, security experts commonly recommend a better approach known as whitelisting. In whitelisting, input is constrained so that it conforms to specific enumerated values that represent known-good forms of input. Although whitelisting represents a security best practice (that comes with the added benefit of decreasing troubleshooting and debugging by means of constraining variability), it isn't sufficient as a means for protecting against SQL injection alone. That's because whitelisting does nothing to protect against the primary cause of SQL injection, which is letting end-user supplied input be elevated to the same level as the command structure. Consequently, while whitelisting is part of any viable defense-in-depth approach to security, it's still not sufficient enough to protect against SQL injection on its own.

Myth 3: Stored Procedures Provide Complete Protection

Another common myth is that stored procedures unequivocally protect against SQL injection. Sadly, while the proper use of stored procedures provides fantastic protection against SQL injection, there's nothing magical about stored procedures that prevents them from being used incorrectly or unsafely. Sometimes, organizations blindly assume that using stored procedures will automatically protect them from SQL injection, which can put them at much greater risk than those relying on scrubbing routines. Stated differently, blindly assuming that stored procedures provide protection can be fatal, as you'll see in the next section.

Defeating Injection: Separating Command from Data

The only sure way to prevent SQL injection is to ensure that command language is properly separated from data or user input -- the same technique used to defeat most forms of injection attacks. As Figure 1 demonstrates, any scenario that enables end-user input to become equal with command language provides malicious users with the ability to take control of commands. After they have control, hackers and their automated tools can hijack developer-supplied interfaces into underlying data storage (i.e., existing SQL statements) and probe for metadata about current permissions, environment, and data structures. They can then start exploiting data and permissions to the point where they can potentially take full control of entire databases, servers, or even entire networks.

Figure 1: Clear separation of data and command language versus concatenation
Figure 1: Clear separation of data and command language versus concatenation 

Therefore, to prevent data from being elevated to the same status as command language, developers must ensure the use of parameterization throughout the entire process, from when the code is assembled in their applications to when the code is executed in the underlying database. To this end, parameterization can be safely enforced through the use of either stored procedures or parameterized queries. Either approach is equally viable in defeating SQL injection, because both approaches require the use of command objects. These objects keep data and command language distinct, preventing hackers from elevating their input to the same level as the command language. However, there are some common pitfalls that can result in losing parameterization, even if it has been initially set up correctly.

Parameterization Gotchas

Data-access frameworks that exclusively rely on parameterization (e.g., LINQ to SQL, ADO.NET Entity Framework) facilitate protection from SQL injection. In a similar fashion, custom (or in-house) data-access code can be easily protected against SQL injection by ensuring the proper and exclusive use of parameterization, whether that's through the use of stored procedures or parameterized queries. In either case, however, there are ways in which parameterization can be easily used incorrectly or improperly enforced, resulting in "gotchas" that can easily defeat efforts to protect against SQL injection.

Abusing the connection or command object. A key problem with parameterization is that "wiring up" large numbers of parameters for each and every data-access operation becomes very tedious for developers, especially since non-explicit mappings (or code that takes more time to write) is frequently the source of performance problems that arise due to data-coercion issues. As such, developers occasionally take shortcuts in which they "short-circuit" the tedium of wiring-up command and parameter-collection objects, choosing instead to execute SQL statements directly against command or connection objects. By taking this approach, developers fail to send full-blown remote procedure calls (RPCs) to the server, which properly ensures parameterization. Instead, strings of concatenated commands and data are sent to the server to be executed as a batch or statement, which can lead to SQL injection attacks.

For example, consider a rudimentary data-access routine in which end users are able to pull up product details based on a supplied product name using the simple stored procedure shown in Figure 2. The stored procedure clearly separates the command language (i.e., the SELECT statement) from the data by means of parameterization.

Figure 2: A properly defined stored procedure
Figure 2: A properly defined stored procedure 

However, as a shortcut, developers might concatenate the name of the stored procedure and the user-supplied parameters to create a single string that ends up getting executed against the server as a single statement or batch operation, as shown in Figure 3. With this type of shortcut, developers can save themselves a few lines of code (and likely thousands of lines of code systemwide), but they've short-circuited parameterization by removing the protection afforded by executing a full-blown RPC against the database server. And in doing so, they've made their application wide open for SQL injection.

Figure 3: Code that short circuits the command object, thereby enabling SQL injection
Figure 3: Code that short circuits the command object, thereby enabling SQL injection 

Dynamic SQL within stored procedures. In some cases (which should be a definite and well thought-out exception instead of the rule), it might be necessary to create dynamically defined SQL within a stored procedure to satisfy certain complex business or functional requirements. In this scenario, it's possible for developers to easily re-introduce the potential for SQL injection, even if they've properly ensured parameterization throughout the process. Once again, the way that this occurs is by means of blindly concatenating user-supplied input and command language into a single string, which is then executed on the server, as shown in Figure 4.

Figure 4: Stored procedure that includes dynamic SQL, thereby enabling SQL injection
Figure 4: Stored procedure that includes dynamic SQL, thereby enabling SQL injection 

Consequently, in rare cases where dynamic SQL needs to be handled from within stored procedures, I commonly recommend that developers leverage an ingenious approach outlined by David Penton in the blog post "Not-so Dynamic Sql."  Not only does this approach protect against SQL injection but it also helps make dynamic SQL significantly easier to maintain from a readability perspective.

Likewise, it's important to point out that while SQL Server 2005 and later provide some protections against certain types of dynamic SQL utilization from within stored procedures, these protections primarily serve to mitigate the amount of damage that hackers can inflict instead of stopping SQL injection outright. Therefore, to learn more about these protections as well as more about dealing with dynamic SQL in general, I highly recommend a careful review of Erland Sommarskog's fantastic content "Giving Permissions through Stored Procedures"  and "The Curse and Blessings of Dynamic SQL." 

Protect Against SQL Injection

Protecting against SQL injection is simply a matter of ensuring proper parameterization -- in other words, ensuring that user-supplied input is never put on a par with command language. With this understanding, it's easy to see why scrubbing routines don't provide sufficient protection from SQL injection. It's also easy to discern ways in which developers and applications can easily run afoul of these basic precepts, facilitating the potential for SQL injection. Protecting against SQL injection is therefore a matter of making sure that you understand the basics and you avoid common pitfalls.

Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.