Quaere Verum - Clustered Index Scans - Part II | ITPro Today: IT News, How-Tos, Trends, Case Studies, Career Tips, More

Question: Which access methods are available to SQL Server to fully scan the table’s data when an ORDER BY clause is not specified?

1. Ordered Clustered Index Scan

Follow the linked list at the leaf level of the clustered index from head to tail. The performance of this access method varies depending on the level of logical fragmentation of the clustered index. The higher the fragmentation is the slower is the access method. Later I will demonstrate this.

2. Unordered Clustered Index Scan / Table Scan

The linked list at the leaf level of the clustered index is one mechanism by which SQL Server can locate the table’s data. But there’s another system that SQL Server uses to map data called IAM (short for Index Allocation Map). I will not provide the gory details about IAM rather just focus on the essentials required for my discussion. IAM pages are bitmaps that map extents belonging to an index or a heap (a table without a clustered index) by their file order.

SQL Server maintains one or more IAM pages to map/keep track of the extents that reside at the leaf level of a clustered index by their file order. An IAM page has a bit per extent in the file, and the bit is set to 1 if the extent it represents belongs to the object owning the IAM page and to 0 otherwise. Each IAM page maps 4 GB of space in the file.

When in need to fully scan the table’s data, technically SQL Server can tell which pages/extents belong to the table by examining the IAM pages owned by the table. By using IAM pages, the scan is performed by file order (as opposed to linked list order). The efficiency of such a scan depends on the file system fragmentation, but as I mentioned earlier, I’m going to assume a simplified scenario where there’s only one data file with no file system fragmentation. In such a case, clearly an unordered clustered index scan / table scan would seem like a better choice than an ordered clustered index scan. With no logical fragmentation the performance of the two access methods should be similar. But while logical fragmentation degrades the performance of an ordered scan, it has no effect on an unordered scan.

This is true with both a heap and a clustered table, and in fact with a scan of the leaf level of nonclustered indexes as well. With a heap, IAM is the only mechanism that is currently available to fully scan the table’s data (unordered scan). With a clustered table (or any index), there are two options: using the linked list (ordered scan) or using IAM pages (unordered scan).

Note though that when the execution plan shows an unordered index scan (Ordered: False), it doesn’t necessarily mean that SQL Server will use IAM pages to scan the data, rather it means that SQL Server DOESN’T HAVE to scan the data in linked list order.

3. Advanced Scanning

Here’s a snippet from Books Online that explains Advanced Scanning:

“One part of the SQL Server Enterprise Edition advanced scan feature allows multiple tasks to share full table scans. If the execution plan of a Transact-SQL statement requires a scan of the data pages in a table and the relational Database Engine detects that the table is already being scanned for another execution plan, the Database Engine joins the second scan to the first, at the current location of the second scan. The Database Engine reads each page one time and passes the rows from each page to both execution plans. This continues until the end of the table is reached.

At that point, the first execution plan has the complete results of a scan, but the second execution plan must still retrieve the data pages that occur before the point at which it joined the in-progress scan. The scan for second execution plan then wraps back to the first data page of the table and scans forward to the point at which it joined the first scan. Any number of scans can be combined like this. The Database Engine will keep looping through the data pages until it has completed all the scans. This mechanism is sometimes known as "merry-go-round scanning" and is the reason that the order of results from a SELECT statement without a predicate to sort on cannot be guaranteed.”

With the fundamentals out of the way, I’ll proceed to the main points I wanted to discuss…

Recent Revelations / Eye-Opener
Back to my original question: given a table T1 with a clustered index on col1, is the following query guaranteed to return the data in clustered index order?

SELECT * FROM T1;

Being familiar with the fundamentals described so far, and backed by ANSI SQL, for years I believed that the answer was a simple no. The standard allows SQL Server to return the data in any order. With a heap, clearly there’s only one reasonable way to scan the data—unordered scan based on IAM pages. With a clustered table, though SQL Server can rely on either the linked list or on IAM pages, it seemed natural to believe that it would always go for the
latter since it’s more efficient. So until recently I never bothered to verify whether such was the truth…

I want to stress that I still firmly believe that the working-premise should be that such a query is not guaranteed to return the data in clustered index order. But next I will demonstrate that things are not as simple as they might seem.

The moment of truth arrived recently when I tried to demonstrate that the above query returns unordered data. As an afterthought I wrote similar code to the one I provided earlier to create the table T1 and populate it with random values, and issued a SELECT * without an ORDER BY clause. On the tip of my tong was the sentence “notice that the data is returned unordered as SQL Server used IAM pages to locate the data.” But to my surprise, I kept getting the data back in clustered index order:

col1        filler

----------- -------

5           a

11          a

13          a

14          a

15          a

21          a

22          a

30          a

31          a

32          a

33          a

34          a

36          a

45          a

50          a

…

The textual execution plan shows:

  |--Clustered Index Scan(OBJECT:(\[testdb\].\[dbo\].\[T1\].\[idx_cl_col1\]))

Without any mention that it was an ordered operation (ORDERED FORWARD or ORDERED BACKWARD). The graphical execution plan shows a Clustered Index Scan with Ordered: False. So far I believed that an Index Scan with Ordered: False simply meant an unordered scan based on IAM pages. But the result set of the query was returned in order. I checked fragmentation, used DBCC IND and DBCC PAGE, all of which indicated that the data resides in the file in a different order than in the linked list.

I tried many tests with different ways to populate the table, with varying numbers of rows, fragmentation levels, etc.; but in all of my tests the data kept coming back in order. Until I tried specifying the NOLOCK hint (read uncommitted isolation level):

SELECT * FROM dbo.T1 WITH (NOLOCK);

And suddenly the data came back unordered:

col1        filler

----------- -------

5           a

11          a

13          a

14          a

484         a

484         a

358         a

359         a

366         a

845         a

846         a

854         a

858         a

210         a

214         a

…

Investigation with DBCC IND and DBCC PAGE verified that now SQL Server used IAM pages (allocation order) to scan the data. Not being able to explain the discrepancy, I turned for help. Lubor my mentor and Srikumar from the Storage Engine team explained.

Apparently allocation order scans (using IAM pages) are used to read data in two cases: when NOLOCK or TABLCOK are specified. So the following query also returns the data in allocation order:

SELECT * FROM dbo.T1 WITH (TABLOCK);


col1        filler

----------- -------

5           a

11          a

13          a

14          a

484         a

484         a

358         a

359         a

366         a

845         a

846         a

854         a

858         a

210         a

214         a

…

However, the original query without any hints is run in read committed isolation level and without locking the whole table to begin with.

* Disclaimer, the explanations below are not in the direct words of Lubor and Srikumar rather partially my interpretations, so if there are any inaccuracies they are mine alone.

In a heap, neither inserts nor updates can cause page splits or data movement. SQL Server looks for the first free space large enough to host a new row using bitmaps called PFS (page free space), and if no free space exists it allocates a new page. When a dynamic column is updated to a longer value due to an update, and there’s not
enough space in the original page to host the expanded row, SQL Server moves it to a new location but keeps a forwarding pointer in the original page. That’s why in a heap it’s “safe” to read the data by performing allocation order scans.

However, with data in an index things are different. Data can move within the index due to page splits, updating a key, or updating a dynamic column value to a longer value. Allocation order scans cannot tolerate data movement (due to updates from the same or other concurrent transaction). SQL Server may lose the scan position or return the same row twice due to splits.

Hence, to guarantee consistency, in all cases besides when NOLOCK or TABLOCK are specified, SQL Server scans the data in index order by following the linked list.

Continues in Part III...

Comments

Plain text