I’m a big advocate of trying to get as much query plan reuse as possible. By plan reuse, I’m talking about SQL Server’s ability to reuse an already existing plan in the procedure cache to run a newly executed query that’s similar in nature. Reusing an already compiled plan instead of compiling a completely new plan can save a considerable amount of time and resources.
The queries that most often don’t reuse an existing plan are those generated dynamically by the client or the middle tier of an application. These types of queries are referred to as ad hoc queries or ad hoc batches. Many of them don’t attempt to use parameters in any way and essentially hard-code all of the queries’ literal values. But the most common offense is not using parameters in the search arguments (SARGs) of WHERE clauses.
Fortunately, two SQL Server options can help alleviate some of the pains associated with lack of plan reuse caused by applications that weren’t written with plan reuse in mind. Before I tell you about these options, though, it’s helpful to know the basics about the procedure cache process.
Procedure Cache Basics
Each time SQL Server gets a request to run an ad hoc query, it must first look through the existing plans in the procedure cache to see if one already exists. After using a hash routine to narrow down the list, SQL Server ends up doing a character-by-character, case- and space-sensitive comparison of the new batch against existing batches. A single character or literal value difference will result in a mismatch. If SQL Server finds an exact match, it uses the existing plan to execute the new batch. If it doesn’t find an exact match, SQL Server parses and compiles a new plan and places that plan in the cache before it uses the new plan to execute the query. For more details on how this process works, see the excellent white paper “Plan Caching in SQL Server 2008”.
If there are lots of similar plans already in the procedure cache, the query lookup process can be quite resource intensive. So is compiling and optimizing a new plan for each batch that doesn’t have an exact match. This also results in yet another query plan placed into the cache, which uses up vital memory and potentially makes the next query lookup process even more resource intensive.
By default, SQL Server tries to find a way to make ad hoc queries reuse previous plans, which is referred to as auto-parameterization. When a batch has been auto-parameterized, some or all of the statements’ literal values get replaced with a parameter. So, it essentially takes an ad hoc batch that looks like
SELECT * FROM Customers WHERE CustID = 652837; SELECT * FROM Customers WHERE CustID = 843762;
and turns it into
DECLARE @p1 INT; SET @p1 = 652837; SELECT * FROM Customers WHERE CustID = @p1; DECLARE @p1 INT; SET @p1 = 843762; SELECT * FROM Customers WHERE CustID = @p1;
(Note that if you were to run this ad hoc batch and view the auto-parameterized statements in SQL Server Profiler, they won’t be the same as shown here. I added the parameter assignments to better get the point across.) Now when SQL Server attempts to find a similar query plan in the cache, it looks at the parameterized version of the query (minus the parameter assignments). It’ll find a match each time a subsequent similar parameterized batch is called and reuse the previously existing plan. The result is a far more efficient use of resources and a better performing system overall.
At this point, you might be wondering why you should worry about how well your queries are written if SQL Server auto-parameterizes them. The catch is that most queries don’t qualify to be auto-parameterized by default. There are many restrictions that can prevent a query from being considered for auto-parameterization. For example, a simple join precludes auto-parameterization from taking place. You can find out about the other restrictions in the “Plan Caching in SQL Server 2008” white paper or in SQL Server Books Online (BOL).
That doesn’t mean you should give up, though. There’s a database-level option named Parameterization. It has two modes: Simple (the default) and Forced. When you set the option to Forced, you essentially tell SQL Server to be more aggressive and attempt to auto-parameterize all the batches that get executed for that particular database. This mode can make all the difference in the world if you have a workload that’s issuing ad hoc batches that don’t reuse existing plans.
So, if it makes such a difference, why isn’t Forced the default? Well, the process of auto-parameterization can be expensive in and of itself. If batches can’t be auto-parameterized, Forced mode can add extra overhead with no benefit. You’ll need to determine which mode is best for each database. Luckily, you can change the mode without restarting SQL Server or stopping existing workloads. Another possibility is to use Forced mode in conjunction with Plan Guides and Plan Guide Templates to fine tune which types of batches will use auto-parameterization in this mode. (For more information, see BOL’s “Specifying Query Parameterization Behavior by Using Plan Guides”.)
In some situations, ad hoc batches can’t be auto-parameterized even in Forced mode, so lots and lots of similar plans still end up in the procedure cache. This is when the instance-level Optimize for ad hoc workloads option comes into play. (Note that this option is available only in SQL Server 2008 and later.)
When the Optimize for ad hoc workloads option is turned on, there’s a slight change in how plans get stored in the procedure cache from that point on. (It doesn’t affect plans already in the cache.) What happens is that the first time a plan is generated, SQL Server places only a stub of the plan instead of the entire plan in the cache. The stub contains enough information so that SQL Server can use it to find matches with future batches. Only when SQL Server finds a match between a stub and a new batch does it add the full plan to the procedure cache. Because the size of a stub is very small compared to the size of a full plan, you can save considerable space by keeping only the stubs of plans that haven’t been reused in the cache.
If you’re like most DBAs, you probably have little choice but to work with ad hoc batches. So, you might as well take advantage of these two options where applicable. In my opinion, the Optimize for ad hoc workloads option should have been turned on by default, as it has basically no downsides. So unless you’re sure all the calls to your databases will reuse plans, I suggest you enable this option in your SQL Server instances. You can find out how to do so in BOL’s “Optimize for ad hoc workloads Option” web page. Whether the Parameterizations option’s Forced mode would be helpful needs to be examined on a database-by-database basis and turned on where it makes sense. Forced mode usually makes things noticeably better or noticeably worse, so the decision whether or not to keep using it will be simple. You can find out how to turn on this mode in BOL’s “Forced Parameterization” web page.