As a performance consultant, I see SQL Server installations that range in size from the tiny to some of the largest in the world. But one of the most elusive aspects of today’s SQL Server installations is—without a doubt—a properly sized or configured storage subsystem. There are many reasons why this is the case.
My goal is to help you gather and analyze the necessary data to determine whether your storage is keeping up with the demand of your workloads. Of course, that goal is easier stated than accomplished, because every system has unique objectives and thresholds of what's acceptable and what's unacceptable. My plan is not to teach you how to configure your storage; that's beyond the scope of this article. Rather,I want to show you how you can use a fairly common set of industry guidelines to get an initial idea about acceptable storage performance related to SQL Server.
The most common reason for an improperly sized or configured storage subsystem is simply that insufficient money is budgeted for the storage from the start. Suffice it to say, we generally worry too much about the presence of wasted free disk space and not enough about the performance aspect. Performance comes from having many disks, which tends to give an excess of free disk space. Unfortunately, once a system is in production, changing the storage configuration is difficult. Many installations fall into this trap.
The second most common reason is that many SQL Server instances use storage on a SAN that's improperly configured for their workload and that shares resources with other I/O-intensive applications—particularly true when the shared resources include the physical disks themselves. Such a scenario not only affects performance but also can lead to unpredictable results as the other applications' workload changes.
The third most common reason is that many DBAs simply aren't as familiar with hardware-related concerns as they are with SQL Server. This imbalance can lead to less than desirable decisions when it comes to properly sizing the storage.
Three Sources of Help
Before I dive into acceptable limits, I should point out one important aspect of evaluating I/O response times: Just because you deem I/O response times unacceptable doesn't automatically mean that you need to adjust your I/O capacity. Often poor response times mean that you're simply doing too much physical I/O or that you're doing it inefficiently. If you're scanning tables or indexes when you should be doing seeks instead, you'll ultimately induce more physical I/O than you might otherwise incur. When it comes to efficient I/O processing, never underestimate the importance of a well tuned and optimized database and associated queries. A little tuning effort can usually avoid a costly and often difficult upgrade of the storage subsystem.
Related: I/O I/O It's Why My Server's Slow
When it comes to defining I/O response times related to SQL Server, there are three primary sources of data that you should watch regularly. You obtain two of these—the virtual file and wait stats—directly from SQL Server, and you get the third via the Windows Performance Monitor counters. Individually and especially together, these performance metrics can shed a lot of light on how well your disk subsystem is handling your requests for physical I/O. You just need to know what to look for and where to find it.
Physical I/O response time is the amount of time (in seconds) that an average read or write request to a physical disk takes to complete. You can get this data easily by using Performance Monitor's Logical or Physical Disk counters for Avg. Disk sec/Read and Avg. Disk sec/Write. Most experts would agree that with today’s technology, the response times when using 15,000rpm disks should be in the range that Figure 1 shows.
Keep in mind that these numbers are what most systems can expect to achieve with a properly sized and configured storage subsystem. If you're averaging slower response times, you probably have room for improvement. But remember that what's deemed unacceptable for one system might be perfectly fine for another system, due to the individual requirements. You'll need to make that determination on a case-by-case basis, but you can use these numbers as guidelines so that you have something to shoot for.
Another approach is to keep an eye on the PAGEIOLATCH_XX and WRITELOG waits. These stats will also tell you a lot about how well you're keeping up with the data and log I/O requests. High average waits on either of these stats should spark interest in terms of physical I/O response times. To get up to speed on exactly what wait stats are and how to collect the information, refer to my articles “Getting to Know Wait Stats” and “Dissecting SQL Server’s Top Waits.“
You can also obtain average physical I/O read and write response times by using the virtual file statistics as outlined in “Getting to Know Virtual File Stats.” The file stats DMV also tells you the amount of data read or written—which can be extremely helpful in determining whether you're accessing too much data in the first place, as previously mentioned.
It's a Start
Utilize these provided response times as an initial guideline. If you can meet these specs, you shouldn't have to worry about physical I/O as a bottleneck. Again, not every situation calls for such good response times and you need to adapt accordingly for each system. However, I've found that many people don’t know what to shoot for in the first place! Hopefully, this article provides a good start so that you can concentrate on a firm objective in terms of I/O response time.