Inside SQL Server: SQL Server 7.0 Plan Caching

SQL Server's ability to reuse compiled plans for stored procedures has been one of its most heavily marketed features since its first release. However, from a performance standpoint, reuse of compiled plans wasn't the most important reason for using stored procedures. In fact, the time saved by not needing to compile a new plan was often only a tiny percentage of the total processing time, especially for decision-support queries involving joins of large tables and multiple search conditions. SQL Server 7.0 changes that situation dramatically. The query optimizer has been completely rewritten and enhanced to provide dozens of new processing techniques. With more possibilities for the optimizer to consider, producing a query plan can take a large percentage of the overall processing time. In fact, for some queries, the time needed to compile and optimize the query can exceed the time needed to execute the query and return the results. Therefore, reusing a precompiled plan can produce big performance gains.

In all earlier releases of SQL Server, you could reuse only the plan for stored procedures. SQL Server 7.0 expands that capability by using four mechanisms—ad hoc caching, autoparameterization, the sp_executesql procedure, and the prepare and execute method—to save and reuse individual query plans.

Ad Hoc Caching

SQL Server caches plans from ad hoc queries, and if a subsequent query matches a previous one exactly, SQL Server uses the cached plan. This caching feature requires no extra work, but it is limited to exact textual matches. For example, if you submit the following three queries, the first and third use the same plan, but the second one generates a new plan.

SELECT count(*) FROM
 Northwind..products WHERE
 categoryID BETWEEN 1 and 9
go
SELECT count(*) FROM
 Northwind..products WHERE
 categoryID < 10
go
SELECT count(*) FROM
 Northwind..products WHERE
 categoryID BETWEEN 1 and 9
go

The simplest way to discover whether recompilation is taking place is to use the option SET STATISTICS TIME ON while you run the Query Analyzer to test queries. (For more information on optimizing queries, see Kalen Delaney, "SQL Server Statistics: A Useful Query Optimizer Tool," August 1999.) Before any SQL statement runs, this option tells you the time SQL Server needed to parse and compile the query. Keep in mind that query optimization is the most time-consuming part of compilation.

When I ran the above three queries after setting STATISTICS TIME ON, I got the results for parse-and-compile time that you see in Figure 1. The queries all return the same results, and the same rows satisfy all the queries. However, because the second query uses the less-than operator instead of BETWEEN, it doesn't match the other queries' text and the plan isn't reused. The second query had to recompile; you can see the value of 10 milliseconds for compile time. The third query is identical to the first and reuses the plan from the first query. You can see that its compile time is zero. (Make sure you run the three statements as separate batches, separated by go. If you submit the three statements to SQL Server as a single batch, they are all parsed together, and you don't get separate parse-and-compile data for each one.)

Note that if you want to run these tests a second time, you need to remove the plans from the cache, so you can start from scratch. You can clear all compiled plans from SQL Server's cache by executing the command DBCC FREEPROCCACHE. This command is undocumented and unsupported, and Microsoft doesn't guarantee its availability in future versions of the product. However, it's available in SQL Server 7.0. If you don't use this command, the only way to clear all plans from the cache is to stop and start your SQL Server.

Although the reuse of ad hoc plans can be a good thing, SQL Server might not always reuse the plans that you intend it to use, so I don't recommend that you plan your applications to take advantage of it. Keep in mind that for SQL Server to reuse an ad hoc plan, the query's text must be identical to the original's. You can't change the spacing, indentation, or line breaks. You can't substitute the double dot (..) for the database owner (DBO). Case differences, even on a case-insensitive server, mean that the query processor doesn't see an exact textual match.

Autoparameterization

A second method for plan reuse is autoparameterization. For simple queries, SQL Server guesses which constants might be parameters and then attempts to treat them as parameters. This method lets SQL Server treat your query as a stored procedure. If the autoparameterization attempt is successful, subsequent queries that follow the same basic template can use the same plan.

SQL Server can use four templates for autoparameterization, which means that the query must be one of the query types shown below.

INSERT table VALUES (\{constant | NULL |
	DEFAULT\}, ...)
DELETE table WHERE key-expression
UPDATE table SET colname = constant WHERE 
	key-expression
SELECT \{* | column-list\} FROM table WHERE 
	key-expression \[ORDER BY column-list\]

(Note that key-expression is an expression involving only column names, constants, AND operators, and comparison operators: <, >, =, <=, >=, and <>. It doesn't need to involve a key column.)

For example, these two queries use the same plan:

SELECT CustomerID, OrderDate, ShipName FROM
	orders 
WHERE orderID = 10255

SELECT CustomerID, OrderDate, ShipName FROM
	orders 
WHERE orderID = 10355

Internally, SQL Server parameterizes these queries as

SELECT CustomerID, OrderDate, ShipName FROM orders 
WHERE orderID = @p

If you run the two queries with the STATISTICS TIME option on, rows return for parse-and-compile time for each query. Parse-and-compile results for the first query might look like those in Figure 2.

The first line reports the time to compile the query and save its plan in the parameterized form shown above. In the second parse-and-compile line, you see that SQL Server is using the parameterized form of the query. Note that the time is zero because the plan is already compiled. One way to tell that SQL Server is using autoparameterization is that two parse-and-compile lines appear in the output for STATISTICS TIME.

Another technique you can use to determine whether the query processor has used autoparameterization and not ad hoc query plan caching is to examine the syscacheobjects system table in the Master database. This system table keeps track of the compiled objects in cache at any time. This table is accessible only by system administrators. Looking at the rows in the table, you see that the objtype column tells you whether the query plan was saved as an ad hoc plan or as a prepared, parameterized plan. The syscacheobjects table also shows two different types of plans: compiled and executable. The value of the cacheobjtype column identifies the type. A compiled plan is the part of a plan that is reentrant, and multiple users can share it. Think of an executable plan as an instance of a compiled plan that contains information describing a process that executes the query. In most cases, both compiled and executable plans remain in the memory cache, subject to memory pressure from other processes or applications.

SQL Server can let other queries of the same template use the same plan only if the template is safe. A template is safe if the plan doesn't change even if the parameters change. You typically don't see autoparameterization if a WHERE clause involves an inequality or if the clause is searching for a set of duplicate values. This limitation ensures that autoparameterization won't degrade a query's performance.

When attempting autoparameterization, SQL Server is lenient about white space in your queries, but it won't reuse a plan if you change capitalization or fully qualify an object owner. Like caching the plans for ad hoc queries, autoparameterization is convenient, but it doesn't always result in the best plan. When deciding whether a template is safe, the SQL Server query processor is more conservative than your application can be. SQL Server guesses which values are parameters, whereas your application probably knows. SQL Server uses these techniques to optimize performance, but because the optimizer is so conservative about plan reuse, I recommend that you use one of the following two mechanisms to mark parameters.

The sp_executesql Procedure

The sp_executesql stored procedure falls between ad hoc caching and stored procedures. Using sp_executesql requires that you identify parameters when you call the procedure, but it doesn't require SQL Server to keep track of your code as a permanent object in the system tables, as it does for stored procedures.

The procedure requires that you specify the base text of your query, the parameter definitions, and the values of the parameters, by using this syntax:

sp_executesql @query_text, @batch_parameter
	_definitions,param1,...paramN

Repeated calls with the same query_text use the cached plan, with the new parameter values specified. All the following cases use the same cached plan:

sp_executesql N'insert mytable
	values(@p)',N'@p float',1.0 
go
sp_executesql N'insert mytable
	values(@p)',N'@p float',2.0 
go
sp_executesql N'insert mytable
	values(@p)',N'@p float',1.0
go

ODBC and OLE DB expose this functionality via SQLExecDirect and ICommandWithParameters. Refer to your ODBC and OLE DB documentation for more details.

The Prepare and Execute Method

With the prepare and execute method, as with sp_executesql, the application identifies parameters to the batch. However, the prepare and execute method doesn't require that the full text of the batch be sent at each execution. Instead, the full text is sent once at prepare time; a handle that you can use to invoke the batch at execute time is returned. ODBC and OLE DB expose this functionality via SQLPrepare/SQLExecute and IcommandPrepare. By using these functions, you tell SQL Server to use this batch repeatedly. This function isn't available through Transact SQL (T-SQL) programming; you must use the API.

Monitoring Plan Caching

In addition to using SET STATISTICS TIME ON and examining the syscacheobjects table, you can use the SQL Server Performance Monitor to monitor plan caching. The performance monitor has a counter called SQL Server:SQL Statistics and several counters dealing with autoparameterization. You can monitor these counters to determine how many failed or unsafe autoparameterization attempts have occurred. A failed parameterization attempt means that the query doesn't match one of the basic templates. An unsafe autoparameterization attempt means that the query matched one of the templates but is unsafe because its optimal plan changes if the parameters change. If the failed and unsafe autoparameterization numbers are high, inspect your applications for situations in which the application can explicitly mark the parameters.

SQL Server's ability to automatically reuse plans can go a long way toward making queries run as fast as possible. Whenever possible, tell SQL Server which values are the queries' parameters. There are many advantages to using stored procedures to save plans and identify parameters, and I'll talk about stored procedure plan caching in a future column. But when you choose not to use stored procedures, the new autoparameterization mechanisms are a good option.

Comments

Plain text