Failover clustering is one of the most popular high-availability options for SQL Server. If you're about to implement failover clustering in your environment, you need to devote a lot of planning and coordination to all aspects of your cluster design. You can't just draw up the plan for your cluster on a cocktail napkin. Fixing a bad design after the fact can translate to significant downtime, and downtime defeats the entire purpose of implementing highly available clusters. You'll need to work with your fellow Windows, storage, and network engineers to properly implement your failover clusters—it's a team effort.
One of the most important aspects to get right when configuring your clustered instances of SQL Server is the disk configuration. It isn't a simple prospect, and I often see it done incorrectly in client environments. To address this important topic, I have three avenues for you to take. First, this article provides a concise primer to fuel discussions about your failover clustering implementations—both new and existing. This article is written with SQL Server 2005 in mind, but most (if not all) of the concepts apply to previous versions of SQL Server as well. Second, if you need a quick failover clustering overview, see the Microsoft white paper I co-wrote, "SQL Server 2005 Failover Clustering," at the Microsoft Download Center. Finally, if you want to read about failover clustering in great detail, you can check out my new book Pro SQL Server 2005 High Availability (Apress, 2007).
Supportable Cluster Solutions
Before you even install Windows on the server, it's an absolute requirement that your entire cluster solution—down to the disk solution, host bus adapters (HBAs), and drivers—appear in the Windows Server Catalog of Tested Products list as a valid cluster solution. This requirement is clearly defined in the Microsoft articles "The Microsoft SQL Server support policy for Microsoft Clustering" and "The Microsoft support policy for server clusters, the Hardware Compatibility List, and the Windows Server Catalog."
Failing to deploy a certified cluster tends to lead to downtime, and having the wrong drivers, BIOS, or firmware can lead to other problems such as disk corruption. Always check the Windows Server Catalog or your vendor's support matrix for clustered solutions to see what is supported for a Microsoft cluster. Just because a newer HBA driver is available doesn't mean that you should update that driver on the server. Make sure that you have recent, tested, known-good backups in case you encounter a catastrophic disk problem, such as corruption introduced by a storage engineer rolling out a new driver.
Disk Configuration Basics
Microsoft's implementation of disks in a cluster is a shared nothing approach. Although any node might eventually be able to own a disk resource, only one node can own it at any given time, and it can be used only by the resources in a single cluster group. A disk resource isn't shared, nor can it be used by resources outside its cluster group or by any node other than its current owner. Other vendors have implemented a form of clustering that uses a shared disk subsystem, which lets more than one resource access the same disk simultaneously, but it requires a piece of code called the lock manager to arbitrate access to the disk resource and ensure that no conflicts occur.
As you plan your disk configuration, you must understand the difference between logical and physical disk configuration. Unfortunately, I encounter many DBAs who don't understand this point. For Windows to use a disk on a storage array, a Logical Unit Number (LUN) is defined on the physical array. The LUN is a grouping of disks achieved through some form of RAID.
I won't discuss the various RAID flavors here, but you can find information in "Know Your RAID Levels." Some hardware vendors implement proprietary versions of RAID on certain arrays, so sometimes you not only won't have control over its configuration but you also won't be able to abide by typical best practices. A LUN must be low-level formatted before it's presented to Windows. To format a LUN, you can use vendor-specific tools from the hardware manufacturer. The formatting typically occurs at the time your disk array is set up, so be sure to work closely with the engineers who set up the array to ensure that the formatting occurs properly.
Once the LUN is ready to be presented to Windows, you need to make the disk usable in Windows. This process involves formatting the disk and possibly assigning it a drive letter, depending on how you plan to use the disk. Formatting the disk in Windows is a completely separate process from the earlier low-level format, which the array itself required. After you format the disk, your logical disk will be ready to use. Disk formatting in Windows is a subject that DBAs need to be vigilant about. Many storage or Windows engineers aren't familiar with SQL Server, so they wind up low-level formatting the disks at the disk subsystem level and then formatting in Windows with the default settings. Or, they're more familiar with Oracle and assume that SQL Server behaves the same way.
The default allocation unit size in Windows is 4KB. That size might be fine for a file system—but not for SQL Server data files. SQL Server performs I/O in 8KB pages, and read-ahead operations occur in 64KB chunks (one extent). It's OK to format transaction log disks with 4KB—if they'll never contain a data file. I recommend formatting the disks with a 64KB allocation unit size. You could use a larger size than 64KB, but you might not realize any benefit. I always recommend playing it safe. If applicable, perform a sector alignment of the disks in Windows before formatting them. You could potentially see as much as a 20 percent performance gain. Some disk subsystems don't need sector alignment, so check your vendor's recommendations.
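To see why alignment matters, consider the arithmetic. The sketch below is a minimal illustration (not a real disk tool): a partition is aligned when its starting offset is an exact multiple of the array's stripe-unit size, so a 64KB read-ahead never straddles two stripe units. The offsets shown are the classic examples; on a real server you'd obtain the starting offset from your disk tools.

```python
KB = 1024

def is_aligned(starting_offset_bytes, stripe_unit_bytes=64 * KB):
    """A partition is aligned when its starting offset is an exact
    multiple of the stripe unit, so a 64KB read never spans two stripes."""
    return starting_offset_bytes % stripe_unit_bytes == 0

# The historical default offset of 63 sectors x 512 bytes (31.5KB)
# is misaligned; a 64KB or 1MB starting offset is aligned.
print(is_aligned(63 * 512))   # False -> extra I/O on every 64KB read
print(is_aligned(64 * KB))    # True
print(is_aligned(1024 * KB))  # True
```

A misaligned partition turns many single-stripe reads into two physical I/Os, which is where the performance penalty comes from.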
When formatting and defining your disks in Windows for use on a cluster, don't define two partitions and drive letters on a single LUN, as you see in Figure 1. As you can see, to Windows, disks I and J are two logical disks that happen to be carved out on one LUN.
However, when the disk is added to the Windows server cluster, it's added as one disk, as you see in Figure 2. Cluster Administrator recognizes it only as one big disk that happens to have two drive letters. You couldn't have two separate SQL Server instances sharing the drive because it can reside in only one cluster group. You might as well have just used one drive letter.
The SQL Server installation process requires that you choose a cluster group into which all clustered resources will be placed. Multiple SQL Server or SQL Server Analysis Services (SSAS) installations can't share a cluster group, meaning that any disks in a cluster group can be used only by a single instance of SQL Server or SSAS. They can't be shared. If you have more than one instance that you're planning on adding to a Windows server cluster, you'll need dedicated disks for each.
All disks for a clustered installation must be on the shared disk subsystem. I'm frequently asked if, in a clustered installation, a local disk can be used for things such as the system databases (especially tempdb) or backups. The answer is no.
During the SQL Server or SSAS installation process, you choose only one disk in the cluster group where the resources will be installed. This disk will contain the system databases and default files. To use all disks in the cluster group, after the installation is complete, you must add them as a dependency of the SQL Server or SSAS resources. This process involves taking the resource offline, which translates into downtime. Therefore, make sure that you perform this configuration step before the servers become live in your production environment.
Capacity and Performance
Two other major factors in your disk design are capacity and performance, both of which are universal concerns with any SQL Server deployment. However, these two factors can be more challenging to achieve in a clustered configuration where you might have design restrictions.
Capacity planning. Capacity planning is essential for both sizing the disks and knowing the performance your environment will require to ensure that the disks have enough I/O capacity. (For information about capacity planning, see "SQL Server Consolidation.") Although the act of sizing isn't straightforward, its outcome will be a number (both I/Os and space) that lets you meet your current needs while providing enough space for the future. As you'll see in a moment, you'll need to make some tradeoffs, so determining this number isn't simple.
Despite what many storage engineers will tell you, having tons of cache on your disk subsystem isn't the panacea for great performance. If a disk subsystem is shared among file servers and other applications (along with your SQL Server implementation), each usage has its own I/O patterns and optimizations and will take part of that cache. You can't assume your data will remain in the disk cache. You should get as much cache as you need, but never assume that it will compensate for all the problems of a poor disk design. It's nearly impossible on a shared disk subsystem to optimize for only one type of I/O. How many applications perform only writes or only reads? The reality is that you'll always have a mixture of I/O types, and you need to factor that into your disk design.
Performance. When it comes to performance, the problem with most SQL Server deployments is related to I/O. Do you have the right amount of I/O available for your workload? Back in the good old days of SCSI-based disk subsystems—when you had more control over how the disks were carved up and could easily dedicate spindles to SQL Server and isolate LUNs—I/O was "easy" to manage. Let's assume that a modern spindle gets about 150 I/Os per second. If your workload needs 2000 I/Os, you would need the equivalent of at least 14 physical disks' worth of I/O to meet your current need (and that doesn't address growth, for which you would need to factor a percentage more to allow for future capacity).
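The spindle math above can be sketched in a few lines. This is a simplification for illustration only: the 150 I/Os per second figure and the growth factor are assumptions, and real subsystems complicate the picture with caching, RAID write penalties, and mixed workloads.

```python
import math

def spindles_needed(workload_iops, iops_per_spindle=150, growth_factor=1.0):
    """Minimum number of physical disks whose combined I/O capacity
    covers the workload, rounded up to whole spindles."""
    return math.ceil((workload_iops * growth_factor) / iops_per_spindle)

print(spindles_needed(2000))                     # 14 disks for today's load
print(spindles_needed(2000, growth_factor=1.3))  # 18 disks with 30% headroom
```

Note that RAID level changes the effective numbers: RAID 1 and RAID 5 impose write penalties, so the spindle count for a write-heavy workload will be higher than this raw calculation suggests.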
With modern disk subsystems, many storage engineers will carve the storage into one (or a few) big chunks, and the disks presented to Windows are configured on top of those chunks. The result is that only a percentage of a given spindle's I/Os are available to a LUN, because the LUN is composed of slices of multiple spindles instead of dedicated disks. Not only does this situation potentially mix I/O types on a single disk, but it might also require a larger pool of disks to get performance equivalent to dedicated spindles. In my experience, storage engineers ask only how much space you need, not how many I/Os. That gap isn't a problem until you hit performance problems that turn out to be caused by insufficient I/O capacity (meaning a poor disk implementation). In this case, the DBA loses because the DBA generally isn't hardware savvy and needs help proving that the performance problem isn't actually SQL Server's fault.
Determining your I/O requirements isn't exactly straightforward. As I discussed in "SQL Server Consolidation," you need to know the I/O signatures of individual databases, especially under load. You need to think about how both your data and log files are used and how they might grow. You need to know how the database is used, and you need to understand the types of activity that will occur in it. Performance Monitor counters tell only part of the story. I always stress that effective DBAs need to understand not only SQL Server but also Windows and hardware. What you don't know can hurt you—and that's usually the reason why I'm called to assist customers: Administrators haven't thought these concerns through, and now a problem needs to be fixed.
The most challenging design aspect of any failover clustering implementation is the 26 drive-letter limitation. A SQL Server failover cluster requires that all drives associated with the clustered instance have a drive letter, and you have only 26 letters available to you. SQL Server 2005 adds support for mount points, but they don't eliminate the need for a drive letter: For SQL Server to use it in a cluster, a mount point must be mounted under a drive letter. What a mount point does, however, is let you use an existing clustered drive letter to add more space without consuming another drive letter. This functionality has obvious benefits, such as separating out disk I/O, since a mount point is a physically separate disk. Keep in mind that these are still LUNs that you'll need to manage. You don't want to use all 26 drive letters, but you also don't want a bunch of LUNs overwhelming your servers. You must achieve a delicate balance.
Having only a finite number of drive letters means that you have to be smart about your disk design. You must design for the here and now while keeping an eye toward possible future expansion. In a typical environment, no one has 26 drive letters available. A few are already gone (e.g., local system disk, floppy drive, mapped drives, CD/DVD drive), so you have at most 22 or 23 available to you. Of those 22 or 23, a few more are going to be taken by the clustered Microsoft Distributed Transaction Coordinator (MS DTC) and the quorum drive, so you're down to 20 or 21. After all that, you can think about your disk design for SQL Server.
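The drive-letter budget is simple set arithmetic, sketched below. The specific reserved letters are illustrative examples, not a fixed rule; your environment's assignments will differ.

```python
import string

all_letters = set(string.ascii_uppercase)  # 26 possible drive letters

# Typical letters already spoken for (examples, not a rule):
reserved = {"A", "C", "D", "E"}   # floppy, system disk, CD/DVD, mapped drive
cluster_overhead = {"Q", "M"}     # quorum disk, clustered MS DTC (example letters)

available = all_letters - reserved - cluster_overhead
print(len(available))  # 20 letters left for the SQL Server disk design
```

With three or four letters reserved and two more taken by the quorum and MS DTC, you land on the 20 or 21 letters the text describes.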
Consider an example in which one instance of SQL Server 2005 and one of SSAS 2005 are installed on the same Windows server cluster. If you're planning on putting 100 databases (all of which have relatively low I/O requirements) under the SQL Server instance, you won't be able to split out data and log files for each database onto an individual drive letter. To avoid disk contention, it's usually recommended to separate data and log files for a database. However, this advice doesn't necessarily mean putting all data or all log files for your databases on a single drive. Per-database separation at this scale is virtually impossible, and in reality, the recommendation to split data and log files is an impractical "best practice" nine times out of ten. When you consider that you also don't want a ton of LUNs in your disk configuration, you can see that you're going to have to make some tradeoffs.
In the context of this example, a rule is put into place that no single data/log-file disk for the SQL Server installation will have more than 25 databases. That decision translates into four disks for the current number of databases, but you add one more for expansion purposes. One disk is required for the system databases, and because only the tempdb database is heavily used, it can coexist with the other system databases on that disk, but it shouldn't be placed with any of the user databases. To split up I/O and provide for a bit of growth, you devote two disks for backups. The SSAS implementation is straightforward, and you split out the system drive from your cubes. Table 1 shows what the disk design would look like.
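The arithmetic behind that design can be summarized as follows. This is just a restatement of the example's numbers, not a sizing formula; the 25-databases-per-disk cap is the rule chosen in the example.

```python
import math

databases = 100
max_dbs_per_disk = 25   # the rule chosen for this example

# Four disks cover the current databases; one more is added for expansion.
data_log_disks = math.ceil(databases / max_dbs_per_disk) + 1

system_db_disks = 1     # system databases, including tempdb, share one disk
backup_disks = 2        # split backup I/O and allow for growth

total = data_log_disks + system_db_disks + backup_disks
print(data_log_disks)   # 5
print(total)            # 8 drive letters for the SQL Server instance alone
```

Eight letters for one instance (before SSAS and cluster overhead) shows how quickly the 26-letter budget erodes, and why the tradeoffs above matter.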
If you take the same example but utilize mount points in conjunction with the required drive letters for the data/log-file disks as well as the backup disks for SQL Server, drive utilization changes. Table 2 is the reconfiguration, including the use of mount points. I'm not advocating this approach over the previous one, which uses only drive letters; it's just different. Although you gain only five drive letters by using mount points, those five letters could make the difference between adding another SQL Server instance or two to your cluster—as well as possibly simplifying database management because your DBAs need only remember a few letters. Most of my customers haven't used mount points with their failover clusters; however, for some, mount points might be the difference between achieving a certain goal and needing to buy more servers to obtain the number of required SQL Server instances. Remember that if you're clustering SQL Server 2000, using mount points isn't an option.
At the End of the Day
Implementing failover clustering isn't a trivial task. More than nearly any other component of a clustered deployment, the disk subsystem can contribute to downtime or other availability problems down the road if you fail to design and implement it properly. Take your time and work toward a disk configuration that will serve you well for the lifespan of the solution that you're putting into place. Make sure that it can support your business not only in terms of space but also in terms of performance. Although it might be a lot of hard work and compromise, your end state will give you an available, scalable, reliable, and manageable solution for years to come.