Nine Ways to Optimize SQL Server 6.5 Queries

What a difference an index can make! I received that reminder recently when I added an index and changed a query's runtime from 40 minutes to 12 seconds. Indexing, however, is only one tool in a DBA's performance-tuning toolkit. You can get better performance by configuring everything from your hardware to your database options. Here are my top nine ways to performance-tune SQL Server 6.5.

1. Put as much RAM in your server as it will hold. The more of your database that can fit into data cache, the faster your queries will run; therefore, maximize RAM in proportion to your database size. For example, if your database is 1GB, 1GB of RAM lets you put most of your database in memory. Leave some RAM for Windows NT. I prefer to leave between 64MB and 128MB of RAM available for NT and to allocate the rest to SQL Server. And keep NT's Physical Memory Available at 4MB or more. Any less, and NT starts excessively paging virtual memory to disk.

2. Use RAID 0 or 5 arrays for the data portion of your database. RAID 0 and 5 arrays spread read requests across multiple physical disks. You know what happens in urban areas on bridges spanning bodies of water when thousands of vehicles must pass through a limited space at the same time. The same bottleneck happens with reads to your database device files. If you can find a way to channel your data through multiple paths, the server can read a block of data from each hard disk in parallel, and you can get a nearly linear improvement in performance. RAID 0 and 5 arrays give that improvement for your read operations. For example, Ron Soukup, in Inside SQL Server 6.5 (Microsoft Press, 1997), says that one 4GB hard disk might offer 80 to 90 I/O operations per second, but a RAID 0 array consisting of eight 500MB hard disks can handle as many as 400 I/Os per second. There is a point of diminishing returns, but in general, the more hard disks you have in your array, the faster the read performance is.

3. Set the Max Async I/0s option to take advantage of your hardware. Your hard disk subsystem can probably handle more than the default of eight asynchronous I/O operations per second that was set more than three years ago with the release of SQL Server 6.5. To optimize this setting, increment it in small steps while watching the Physical Disk object's AvgDiskQueueLength counter in NT's Performance Monitor. As long as the counter value consistently stays under twice the number of disks you have, you aren't overloading the hard-disk subsystem.

4. Set the table lock escalation (LE) thresholds. The LE Threshold Maximum, LE Threshold Minimum, and LE Threshold Percent settings work together to determine how many pages SQL Server will lock before it upgrades to a table lock. The default settings for these thresholds are 200, 20, and 0, respectively. For very large tables, a table lock could save the overhead of placing thousands of locks. For databases with hundreds of tables, those savings can have a significant impact on performance.

5. Create clustered indexes for queries that read ranges of values. Because clustered indexes organize the data physically in index order, they're perfect indexes for queries that retrieve a range of values. For example, if an invoice item table has a clustered index on the InvoiceID column, all the entries for invoice 0001 are at the beginning of the table and all the entries for invoice 9999 are at the end. That order means that when the invoice item table is on the many side of a one-to-many join, SQL Server can simply find the first InvoiceID and read sequentially through the records until the InvoiceID changes.

6. Create nonclustered indexes for queries that search for unique values. Instead of containing all the page's contents as the clustered index does, the nonclustered index's leaf level has one entry for each row in the table. That entry is the page number and row number where the row resides. If your query requests data from a column that isn't part of the index, SQL Server must read the row's page into data cache and retrieve the row's contents. The main benefit of a nonclustered index is that SQL Server can find unique values very quickly. Primary keys usually need nonclustered indexes because primary keys by definition have unique values, for which a nonclustered index is perfect. SQL Server doesn't always use a nonclustered index just because you create it, however. So, only create indexes that support queries that return a small percentage of the rows in a table.

7. Create composite indexes to support multiple queries. If your tables can have UPDATE and INSERT operations as well as read operations, keeping the number of indexes low will minimize the overhead of maintaining the indexes. INSERT operations force SQL Server to add entries to the index, and UPDATE operations might cause a row to shift to a new place in the index or even to a new page in the table. Furthermore, SQL Server often performs updates as a combination of a row delete followed by a row insert, which is the worst case in terms of index management overhead. One answer is to create composite indexes that SQL Server can use for multiple queries.

8. Index joined columns. Because joins find all the matching rows in the inner table based on values in the outer table, SQL Server searches the inner table for every row in the outer table. With an index, SQL Server can retrieve only the rows that match the join condition. If the inner table is several times the size of the outer table, the performance gain can be several orders of magnitude better. (For more information, see Itzik Ben-Gan and Kalen Delaney, "Advanced JOIN Techniques," December 1999.) Whether you choose a clustered or unclustered index depends mostly on the presence of other columns in the SELECT list. Use a nonclustered index if the list contains only columns that are part of the join.

9. Take advantage of covering indexes. A nonclustered index that contains all the columns referenced in a SELECT, UPDATE, or DELETE statement is a covering index and the query is a covered query. Because the nonclustered index has an entry in the leaf level for every row in the table, all the data for the query will be in the index. Therefore, the query processor can do a leaf scan instead of a table scan. In general, you'll see dramatic improvement in query performance if you can build a covering index, mainly because the index contains only a subset of the row. The attached penalty is that adding more columns to an index results in fewer entries per index page, which means your indexes require more space and more I/Os to read into data cache. As long as the index's ratio of column size to total row size is relatively small, building a covering index is worthwhile.

Comments

Plain text