Question: I’ve recently implemented an automated index maintenance system for our SQL Servers but I’ve run into some trouble. It seems that for some of the tables, the clustered indexes are being rebuilt no matter how much fragmentation they have – how can that be so if I’m checking the fragmentation using the DMV?
Answer: This is very common problem that people have when rolling their own index maintenance.
To restate the issue: an index is rebuilt or reoganized (I’ll just use rebuild for the rest of this post) even though it has very low fragmentation. Why does this happen?
The problem can affect nonclustered and clustered indexes and unless you’re actively keeping track of when the fragmentation threshold triggers the maintenance routine to rebuild an index, you’ll never know that you’ve got the problem. It can be a big problem – especially if the maintenance routine is unnecessarily and repeatedly rebuilding large indexes. This wastes CPU and I/O resources, and also can generate a lot of transaction log.
The problem arises when using the sys.dm_db_index_physical_stats DMV on indexes that have off-row LOB data. This can happen with both clustered and nonclustered indexes because nonclustered indexes can explicitly INCLUDE non-key LOB columns, and both can have ROW_OVERFLOW data (i.e. SQLVARIANT or N/VARCHAR (1-8000) data) that has been pushed off-row when the record size exceeds 8060 bytes.
In these cases, the index will have multiple “allocation units” – one for the row data, one for off-row LOB columns, and one for ROW_OVERFLOW. The sys.dm_db_index_physical_stats DMV will return at least one row of output for each allocation unit in an index, including the fragmentation of the allocation unit. It’s entirely possible that an index may have very low fragmentation in the index rows, but the LOB or ROW_OVERFLOW data is fragmented and erroneously triggers a rebuild operation during regular maintenance.
The fix for this problem is to change the code you have that queries the sys.dm_db_index_physical_stats DMV to have a WHERE clause like:
WHERE [alloc_unit_type_desc] = ‘IN_ROW_DATA’
This will exclude the LOB and ROW_OVERFLOW allocation units from consideration and prevent unnecessary rebuilds being triggered. You might want to get more sophisticated with your logic if you *do* want to look at the LOB data to decide when to use the LOB_COMPACTION option of ALTER INDEX … REORGANIZE, but you’ll need to dump the output of querying the DMV into a temporary table to be able to do that as well as trigger rebuilds.
Instead of rolling your own index maintenance routine you should check out the free maintenance package written by Ola Hallengren – it’s very comprehensive and I recommend it to SQLskills clients.