Clown Cramming
Don't just throw expensive hardware at a problem
January 23, 2007
How many circus clowns can you fit into a Volkswagen Beetle? I'd bet good money that I could squeeze 10 clowns into one, as long as they were small and I had plenty of butter to grease the clowns up. (Editor's note: No clowns were harmed during the writing of this article.) Adding each additional clown would become more and more difficult, and predicting the theoretical limit of how many clowns could squeeze into the car would be extremely tough.
Obviously, 10,000 clowns wouldn't fit. What if I reduced the workload—that is, decreased the number of clowns I'm trying to squeeze through my bottleneck (i.e., the car) by a factor of 10. A 10x decrease in the workload is substantial; I could surely fit just 1000 clowns into the Beetle. Right? All the butter in the world wouldn't get that job done.
What does all this clown imagery have to do with SQL Server I/O? Oddly enough, it can help us frame a problem that a tremendously large number of SQL Server customers suffer from—namely, I/O subsystems that aren't properly equipped to handle the workload placed on them.
The Wrong Approach
Many customers assume that "adding a faster I/O subsystem" will magically solve their problems. Perhaps they mistakenly believe that stuffing 1000 clowns into a car must be easier than stuffing 10,000 clowns in it. Alas, hardware is pretty mystical and unapproachable to a lot of software people. We don't visualize the bandwidth and bottleneck problems in the same way.
Suppose your existing SQL Server– induced workload is I/O-bound and that your I/O subsystem can't keep up. An unacceptably large number of SQL Server customers—and probably customers of other major database platforms—will immediately say, "I guess it's time to buy a better SAN." They'll diligently research SAN options, then spend a lot of time and money setting up the new SAN. Whether the solution will help is a crapshoot unless the customer does the necessary due diligence to understand the I/O characteristics of the workload he or she expects the I/O subsystem to handle—contrasted against the SAN's actual performance and throughput capabilities. Spending hundreds of thousands of dollars on a new I/O subsystem that's five to twenty times faster than your current system won't necessarily have a material impact on the overall performance of your application if your new SAN purchase is the circus equivalent of saying, "10,000 clowns won't fit, so let's try 1000."
Sometimes, throwing hardware at a problem works. You'll never squeeze 100 clowns into a Beetle, but you might be able to cram 100 clowns into a school bus rated to carry 50 people. It would be a tight fit, and the bus's performance probably wouldn't be up to your expectations, but you'd be able to drive those clowns around town. Sometimes, you'll get lucky while trying to solve I/O problems by simply buying more I/O—without having a concrete grasp of your existing I/O workload requirements. However, it's never a good idea. And you risk having to explain to your boss why the new six-figure SAN hasn't changed anything.
There's no magic formula that tells you "how much SAN you need." However, if your vendor is willing to take scads of your company's hard-earned money without asking you for detailed, low-level I/O-usage patterns—and you don't know how to size the system yourself—you should consider seeking outside assistance to ensure that your new I/O capacity is sized and configured for your needs.
The Right Approach
This column's intent isn't to teach you how to be an I/O expert. An important skill among successful IT pros is knowing the limits of one's abilities, recognizing situations in which the costs of making the wrong decision are substantial, and seeking appropriate outside counsel as necessary. But I can't leave you hanging entirely.
There are countless ways to look at I/O information, and SQL Server 2005 provides a rich set of tools for monitoring performance metrics. However, I've always been a big fan of fn_virtualfilestats, a handy function that lets you quickly and easily track the number of physical I/O read and write operations that happen on a per-file level for SQL Server. It tracks the actual number of bytes transferred for reads and writes on a per-file basis. It also gives you valuable information about the total number of I/O stalls that each file in your SQL Server instance has experienced. All this information can be helpful in understanding the existing I/O patterns of your current workload and in properly sizing any new I/O subsystem purchase that you're contemplating.
Join Us Online!
Save the clowns. Size your I/O subsystem properly. And be sure to visit SQL Server Magazine performance-tuning forums (http://sqlforums.windowsitpro.com) for further advice and tips.
About the Author
You May Also Like