Winning the Fragmentation Battle

Downloads
48914.zip

A few months ago, I was awakened to the insistent buzzing and chirping of my BlackBerry telling me I had a high-priority message. All the customers who used one of my databases were calling in complaining that our Web application was taking 20 to 30 seconds to load the pages they used the most. Performance had gradually worsened over the prior few weeks and was now at the point where just a few more people logging in would bring the system down. I had to find the source of the problem—and soon. Here's how I tracked down the trouble—which, as I discovered, stemmed from a combination of fragmentation in tables and database files and page-density inefficiencies—and the steps I took to correct it.

Narrowing Down the Problem

My first step was to open Performance Monitor to see whether one of the "Big Four"—CPU, memory, disk, and network—might be a factor in the slowdown. Processor:% Processor Time was in the normal range for our database server, SQL Server:Buffer Manager:Free Pages showed more than 2000 pages available, and Network Interface:Bytes Total/sec was only about 1/20 the capacity of a Gigabit Ethernet network. Physical Disk:Disk Bytes/sec was consistently 100 to 200 percent higher than normal for our server, though. Disk usage appeared to be the source of the problem.

In this case, apparently the hard disk had enough excessive activity for a sustained period to slow all the other operations. Because ADO.NET has a fairly aggressive algorithm for adding connections to the connection pool, even a small increase in the amount of time a user's process holds a connection multiplies the number of connections. More connections to the same data source usually increases the probability of conflicting locks on commonly accessed data, which in turn slows SQL Server's response to queries. Much as one slow car during rush hour slows down everyone on the road, a minor increase in response time in a database server can have a cascading slowdown effect on performance in a busy application environment.

Hard-Disk Constraints

Because there are physical limits to how much data can flow between memory and disk, maximizing disk performance is key to optimum overall database performance. Hard disks not only have a physical limit to how fast the heads can read the electromagnetic variations on the disk, but the rotational speed of the disk also limits when the heads have data to read. If a page that SQL Server needs isn't currently under the read/write head, the disk controller must wait until the section of the disk containing the page rotates under the head. The rotational speed of the disk, therefore, controls how quickly data is presented to the read/write head and how fast the head can read and write bits.

The rotating-disk limitations, then, make the order and length of data requests important. SQL Server's storage engine batches reads and writes as much as possible and orders the pages in the sequence they occur on the disk. The goal is to read the data in as close to page-number order as possible because doing so results in the least number of waits for the correct section to rotate under the heads. The Performance Monitor counters Avg Disk Bytes/Transfer, Avg Disk Bytes/Read, and Avg Disk Bytes/Write tell you how many bytes are being requested at each I/O operation. Drives dedicated to holding SQL Server's databases will never have fewer than 8196 bytes per transfer, but what you want is a consistent 65,536 (or more) bytes per transfer (65,536 bytes, or 64K, equal one extent). If you see a value that's less than 65,536 as an average, fragmentation is likely to be a problem.

Performance Monitor showed excessive disk usage, but I'd need another tool to find out why. FileMon is a free tool from Sysinternals (http://www.sysinternals.com) that reports all reads and writes and the number of bytes per operation for all files. When you set FileMon's filter to monitor only database files, you can easily see how many bytes SQL Server is reading from and writing to each file in a file group. When I noticed that FileMon reported only 8196 bytes per read for a file that had a 2.5GB table and about 1GB in indexes, I knew I'd probably found the culprit for my performance problems.

Remember that each I/O includes a delay while the controller waits for the disk to rotate under the read/write head; therefore, it always takes longer to read eight pages than one extent. Additionally, the storage engine can't order the reads for efficient access because it can't predict what requests will come in the future. Disk optimization comes from a combination of storing data in large chunks and accessing the disk as little as possible. FileMon's report that the storage engine was reading a 2.5GB table one page at a time meant that something was keeping the engine from batching the requests into extents. It was time to look at some tables to find out why.

Finding Fragmentation

SQL Server's DBCC SHOWCONTIG command is a useful tool for helping you identify the extent of fragmentation in a table. Unfortunately, running the command exacerbates the performance problem you're trying to fix because it creates large amounts of I/O. The FAST option has less impact on server performance than running the command without it, but even running DBCC SHOWCONTIG FAST slows the system too much when performance is already bad. The answer is to run Database Consistency Checker (DBCC) when you have time to perform a complete scan and specify TABLERESULTS, which outputs the scan results as a table, then save the results.

I wrote the uspBuildFraglist T-SQL stored procedure, which is excerpted in Listing 1, to automate running the DBCC SHOWCONTIG WITH TABLERESULTS command. UspBuildFraglist loops through the list of tables in the specified database and, for each table, runs DBCC SHOWCONTIG WITH TABLE RESULTS,ALL_INDEXES, which displays fragmentation information for the data and indexes of the specified table. The results are first stored in a temporary table, then moved into a permanent table in a database that I created for holding data generated by administrative and maintenance processes such as this one.

The permanent table has a column named LastScanTime for tracking when the table was scanned. The IF statement at callout A in Listing 1 checks that column for each table and skips the DBCC statement if the table was scanned within the last day. This check allows uspBuildFraglist to be run multiple times without duplicating the work done by previous scans.

A second feature of uspBuildFraglist is the delay after each scan. The code at callout B checks the sysprocesses table to see whether the scanning process is blocking any processes. If so, the procedure waits for 30 seconds. If not, the procedure waits for 5 seconds before it runs the next scan. The ability to pause between scans is the primary advantage of scanning each table individually instead of scanning the entire database all at once. It's a simple way to minimize potential blocking problems that DBCC can cause.

Although uspBuildFraglist performs a simple task, it lets you check fragmentation without aggravating performance problems. Be aware that the procedure will produce more accurate results if you run it after any jobs that affect database-file allocations—for example, database shrink jobs—or purge data.

Interpreting DBCC's Results

To interpret the DBCC command output generated via uspBuildFraglist, you must first determine whether the table being inspected has a clustered index. Because the clustered index orders the data physically on the disk according to the index columns, fragmentation usually occurs when data has been deleted from a table but no date has been inserted that would fill the empty space. A common reason for this type of fragmentation is having a clustered index on a monotonically increasing primary key— for example, an identity column. Unfortunately, the default for primary keys is to use a clustered index, and the default for many database designers is to use identity columns as primary keys.

Another common way for both clustered and non-clustered indexes to become fragmented is through allocations of extents over time. In an online transaction processing (OLTP) environment, typically new rows are added a few at a time, not in large blocks; therefore, all the tables in a file group will be allocating extents one or two at a time. That means that the extents could be scattered throughout the file. If new rows are added at the same rates for all tables, the distribution will be fairly even. If one table adds more rows than others, that table's extents might cluster together, but they would be merely little islands of continuity in a sea of disjoint allocations. SQL Server makes no attempt to keep a table's extents together; therefore, OLTP systems will always become more fragmented as they grow.

The Logical Scan Fragmentation value in the DBCC SHOWCONTIG results is the best indicator of fragmented indexes. This value shows the percentage of time that an index's pages don't follow each other in index order. Since most of the data in an index is stored at the leaf level, any value greater than 0% means more I/O to the disk and less chance that SQL Server's storage manager can read a single extent instead of eight individual pages.

Another DBCCSHOWCONTIG value to check is Avg. Page Density, which shows how much of the page, on average, is being used by data. Although it's possible for a table to have no page fragmentation because all the pages are indeed continuous, a page can have empty spaces because of deletions, updates, and page splits. Because SQL Server always reads a minimum of a full page, a table scan or leaf scan of an index will always require reading more pages than the number of bytes used divided by 8192 if the page contains empty space. Reading those extra pages causes more disk activity, and as the table grows larger, has a steadily increasing impact on read performance, especially on table scans.

Dealing with page splitting can also be problematic. The usual advice for minimizing page splits is to use a fill factor that leaves room for new additions. Page splits produce a significant, though short-term, decrease in performance as rows are moved to the new page; however, the longer-term problem is that SQL Server must read more pages because they aren't completely full. For a table that has 1 million pages of data, a 20 percent fill factor results in 1.2 million pages, which is an extra 16MB of data.The key question is whether the short-term effect of page splitting is greater than the long-term effect of reading more pages. When you consider that sustained page splitting has a similar effect to setting the fill factor to 50 percent, you can see that maintaining a high average page density can be just as important as having low fragmentation.

Another Clue: Database-File Fragmentation

What does it mean when Performance Monitor shows excessive I/Os but FileMon shows at least 65,536 bytes per I/O? It means the database file itself is fragmented on the disk. Performance Monitor reports physical disk metrics whereas FileMon reports I/O at the file-system level. To load a page of data, Windows must translate the page offset in the file to that set of bytes' physical location on the disk. NTFS tries to decrease disk fragmentation by allocating space to a file from the largest chunk of free space on the disk but doesn't attempt to optimize those allocations. (Inside the Windows NT File System, by Helen Custer, Microsoft Press, 1994, provides a good explanation of how NTFS works.) If your database has been expanded and shrunk (i.e., by using the DBCC SHRINKDB command) on a regular basis, it's likely that parts of your database files are scattered throughout the disk, and it's very unlikely that those parts are ordered to enable the storage manager's ordering by page number to do any good.

A partial solution is to use a disk-defragmentation application to order the parts so that reading from the first byte to the last byte of a file lets the disk controller read from the disk in only one direction. But this is only a partial solution because defragmentation applications don't usually move the parts so that they're contiguous. Since even a small gap between allocations on the disk forces a wait as the disk rotates under the read/write head, more gaps will always translate into slower performance than the theoretical maximum.

A solution for new databases is to allocate enough space to allow for growth when you first create the database. If the disk is newly formatted, NTFS will allocate all the space in one, contiguous area of the disk. After that, you can use DBCC commands and index rebuilds to minimize fragmentation.

A more effective solution for existing databases is to perform a full backup of the database, drop the database (which deletes the files), defrag the disk, then restore the database. The restore process will cause Windows to allocate disk space from the largest available chunks of space; therefore,the pages will likely be physically contiguous on the disk itself. The end result should be that the storage manager's ordering of page requests will be more effective because the controller can access bigger chunks of data in a single I/O.

Solving the Problems

The trail of clues I found—table and database-file fragmentation and page-density inefficiencies—led me to identify three issues that I needed to address to solve my database-performance problem. First, the largest, most heavily used table needed an index to support a foreign-key constraint. This particular problem wasn't really caused by fragmentation but instead by the table scans of the 2.5GB foreign-key table during updates of the primary-key table. The scans were causing so much disk I/O that it was impossible to tell whether other disk problems existed. Adding the index drastically decreased update times and cut disk I/O by about 80 percent during updates. To address the remaining two issues, I wrote the uspDefragTables stored procedure, which I'll discuss in more detail shortly.

The second issue I discovered was that the table was a heap, which means that its data was stored in the order in which it was inserted. This condition by itself wasn't detrimental; the table's pages were only a little fragmented since it received about 75 to 80 percent of the total inserts made to all tables in the database for the database, which had the effect of keeping the extents close together. Remember, though, that the logical and extent fragmentation numbers in DBCC SHOWCONTIG don't apply to a heap; therefore, a heap with 0 percent fragmentation can still be poorly organized for the way an application requests data.

In fact, it was the data's organization that was causing fragmentation-like effects on data retrieval. The largest table holds detail information about various topics for individuals who use the application, and the table's heap structure forces multiple individuals' detail data to share the same page. The net effect is that retrieving all the data for an individual (which can happen dozens of times in a few minutes) forces SQL Server to skip through the table and read a few rows from each of tens of pages per person. Not only does that result in more I/O, but it nearly guarantees that the extents holding those pages won't be adjacent. Additionally, because the table has thousands of inserts per minute, each extent probably holds only a small number of rows for any given individual.

The third issue, which makes the organization of the data even less efficient, is that DBCC SHOWCONTIG showed that page density was less than 50 percent. This meant that the table was using more than twice the pages necessary to store the data. Too many, half-empty pages scattered throughout the database file is a recipe for really bad performance.

The uspDefragTables stored procedure, which Listing 2 shows, solves the data-organization and page-density problems. The procedure iterates through all the rows in the table that holds the DBCC SHOWCONTIG output saved by uspBuild-Fraglist, and it looks for tables that exceed the level of fragmentation specified by the @MaxFrag parameter, as the code at callout A in Listing 2 shows.

UspDefragTables runs the DBCC DBREINDEX command on any index that has logical or extent fragmentation or an average page density that's less than 100 - @MaxFrag. For clustered indexes, this command reorganizes the entire table. For non-clustered indexes, it reorganizes the index's leaf level.

Notice that uspDefragTables specifically ignores the heap (indexID = 0) because DBCC DBREINDEX doesn't work on heaps. The best that DBCC DBREINDEX can do is create defragmented nonclustered indexes. Improving the organization of a heap requires other techniques that are often best done manually.

The most commonly recommended technique for defragmenting a heap is to create a clustered index, then drop the index immediately afterward. Dropping the index doesn't affect the physical order of the pages; therefore, you get the benefits of higher average page density and lower fragmentation without the overhead of maintaining a clustered index on a heavily inserted table. You should base your choice of columns used to build the clustered index on how your application uses the data, not how the data will be inserted over time. The best option for tables that have detail data is to use foreign-key columns, so that the clustered index will group the data by the related table's primary key. Any new data is inserted at the end of the table, but at least the existing data is stored in contiguous pages and extents.

Keeping Up with Fragmentation

It turned out that the ultimate reason for the early interruption of my sleep was because our maintenance job hadn't been run for a few weeks. (You can learn more about the maintenance procedures I wrote in the Web-exclusive sidebar "Relevant T-SQL Stored Procedures," http://www.sqlmag.com, InstantDoc ID 48913.) I'd been busy working on higher-priority tasks and hadn't had time to diagnose what was causing the maintenance job to fail. At the time, I believed the failure wasn't a critical problem because I thought it took months for the databases to become fragmented. That morning, I realized I was greatly mistaken, at least about some of the heavily inserted and updated tables.

After I added the index to prevent the table scans and built a clustered index to organize the data more efficiently (which had the effect of defragmenting the nonclustered indexes, too), I watched the fragmentation levels over the next 2 weeks. It took only a few days for the largest tables to start showing signs of fragmentation again, and by the end of a week, they were getting noticeably slower. It was clear that fragmentation would always be lurking in the shadows, and only by maintaining constant vigilance would I keep it at bay.

Morris Lewis ([email protected]) is a senior consultant for Decision Source, which specializes in business intelligence consulting, and is the author of SQL Server Security Distilled (2nd ed., Apress).

Comments

Plain text