For a while now, I've suspected that the execution plan the SQL Server 2005 query optimizer typically chooses for a common query pattern is suboptimal, but until recently I never bothered to try and prove it. Not long ago, though, I managed to confirm that my suspicions were correct. I'll describe my findings and recommend a course of action to optimize the default suboptimal plan. And in the sidebar "More Fun With Execution Plans," I discuss another interesting example of optimizer behavior.
Suboptimal Plan for an ORDER BY Query
The query pattern I refer to occurs when you request to sort rows from a table based on a column on which you have a nonclustered, noncovering index. For example, suppose you have a table T1 with columns col1 (INT), col2 (INT), and filler (CHAR(200)). The table has a large number of rows (say 1,000,000). You have a clustered index on col1 and a nonclustered index on col2. Examine the following query, and try to think of an optimal execution plan for it:
SELECT * FROM dbo.T1 ORDER BY col2;
There are more than two possible execution plans for this query, but these two are the most straightforward:
- Perform a full table scan (unordered clustered index scan), then sort the rows by col2 (call this plan Table Scan Plus Sort).
- Perform an ordered nonclustered index scan of the leaf of the index on col2, and look up each data row by performing a seek operation in the clustered index for each row (call this plan Ordered Index Scan Plus Lookups).
For large tables (with hundreds of thousands or millions of rows), the optimizer opts for Table Scan Plus Sort by default, whereas I propose that Ordered Index Scan Plus Lookups is more optimal. Maybe the costing algorithms that the SQL Server 2005 optimizer uses for this query type rely here on relics from older systems.
In terms of logical reads alone, of course, Table Scan Plus Sort involves many fewer logical reads than Ordered Index Scan Plus Lookups. For this example, let's assume that table T1 resides on 25,000 pages (in the clustered index's leaf) and that the number of pages in the nonclustered index on col2 is 2000. Both the clustered and nonclustered indexes have three levels in the balanced trees.
With Table Scan Plus Sort, the number of logical reads is 25,000. With Ordered Index Scan Plus Lookups, 2000 logical reads are required to fully scan the leaf level of the nonclustered index, plus 1,000,000 × 3 logical reads for the lookups. In total, you get more than 3,000,000 logical reads for Ordered Index Scan Plus Lookups. So in terms of logical reads alone, obviously Ordered Index Scan Plus Lookups involves many more than Table Scan Plus Sort.
However, sorting a large number of rows can be very expensive. Table Scan Plus Sort needs to sort as many rows as the number of rows in the table, whereas Ordered Index Scan Plus Lookups involves no sorting at all because the nonclustered index on col2 is scanned in order. The cost of each lookup is constant (three reads), whereas the cost of sorting isn't linear—that is, in terms of algorithmic complexity, sorting is n log(n). Even worse, if the table is too big, so that there's not enough memory for the sort operation, SQL Server will have to spill rows to disk to manage the sort operation; in this case, sorting becomes substantially more expensive.
Before I present a benchmark proving this point, run the code in Listing 1 to create the table T1 in the testdb database and populate it with 1,000,000 rows. The code first creates a helper function called fn_nums that returns a result set with a sequence of a requested number of integers. The code then uses this function to populate T1 with the requested number of rows (1,000,000).
Now request the estimated execution plan for the following query:
SELECT * FROM dbo.T1 ORDER BY col2;
You'll get the execution plan that Figure 1 shows: Table Scan Plus Sort.
To measure this plan's performance, in SQL Server Management Studio (SSMS), go to Tools, Options, SQL Server, Query results, Results to grid, and select the Discard results after execution check box; doing so eliminates the time it takes to generate the output. Then in a new query window, run the following code:
USE testdb; DBCC DROPCLEANBUFFERS; SELECT * FROM dbo.T1 ORDER BY col2;
On my test system, this query took more than two minutes to run (the runtime in seconds is displayed at the bottom-right corner of SSMS).
Forcing an Optimal Plan
To test the performance of the desired plan (Ordered Index Scan Plus Lookups), I wanted to force SQL Server to use the nonclustered index. Initially I thought that doing so would be simple—just specify an index hint. But the task proved to be more complicated than I expected.
Here's the index hint that I added to the query:
SELECT * FROM dbo.T1 WITH (index = idx_nc_col2) ORDER BY col2;
To my surprise, instead of getting Ordered Index Scan Plus Lookups, I got the execution plan that Figure 2 shows. I didn't expect this plan, but after I analyzed it, I understood what was going on. First, the leaf level of the nonclustered index is fully scanned but without reliance on order (Ordered = False). The rows returned from this scan are then sorted based on the clustering keys (i.e., first sort operator in the plan). The reason for this sort is to optimize the lookups for the data rows (Clustered Index Seek operator). If you think about it, SQL Server fetches data rows that reside in the same data page by using the Clustered Index Seek operator consecutively. After all data rows are fetched, the rows are sorted based on their col2 values (i.e., second sort operator in the plan).
Interesting as it might be, this plan is still less efficient than the one I want: Ordered Index Scan Plus Lookups. The problem I faced at that point was that by using a simple index hint, I could force the optimizer to use the nonclustered index, but not in the manner in which I wanted it to be used (namely, ordered scan followed by lookups).
Originally I thought that my only option to force the optimizer to use the desired plan was to use a new query hint in SQL Server 2005 called USE PLAN. This hint lets you provide an XML value containing a complete execution plan that you want SQL Server to run. But apparently there's a much simpler option—to use the FASTFIRSTROW query hint. This hint tells the optimizer that you want a plan that will return the first row as quickly as possible. The default plan that the optimizer chooses when no hints are specified has to first finish sorting the data before it can return the first row. Therefore, it can take some time before SQL Server can return the first row. By using an ordered scan of the nonclustered index, SQL Server can return the first row almost instantly. So although the optimizer estimates that in terms of throughput (total runtime) Ordered Index Scan Plus Lookups is less efficient than Table Scan Plus Sort, doubtless the former is faster in terms of response time (first row returned). Thus, by specifying the hint, as in the following query, you get the desired plan, which Figure 3 shows.
SELECT * FROM dbo.T1 WITH (FASTFIRSTROW) ORDER BY col2;
This query ran for about 20 seconds on my test system—several times faster than with the default plan (Table Scan Plus Sort).
Trust Your Intuition
I ran a benchmark testing the two plans using table sizes ranging from 100,000 rows to 1,000,000 rows in steps of 100,000 rows. Web Figure 4, shows the benchmark results. You can see that in all cases, Ordered Index Scan Plus Lookups (forced with the FASTFIRSTROW hint) is substantially faster than Table Scan Plus Sort.
This exercise was an important lesson for me. Trust your intuition, and when you suspect that a plan that the optimizer produces isn't optimal, benchmark to check whether your suspicions are correct. I'm not saying that it's common to get suboptimal plans in trivial cases, but it's good to know that when you do find such a plan you have tools to deal with it. In such cases, it's important to report the problem to Microsoft, as I did.