4 Failover Clustering Hassles and How to Avoid Them

Failover clustering is a high-availability technology that minimizes service interruptions due to hardware failure or planned maintenance. In many ways, failover clustering has suffered from an image problem. It works well technically, but its perceived configuration and maintenance complexities scare off many potential users.

Failover Clustering is just too hard to set up and use.

This is the single most common complaint I hear about failover clustering. This view stems from the pre-Windows Server 2008 days of high availability when creating a cluster was a fear-inducing procedure that required many pages of wizard input and huge amounts of configuration detail. Clustering generally required an expert, and you had to perform tasks on each node of the cluster. Once you'd actually created the cluster, maintenance was the next challenge and, once more, you probably needed a cluster specialist. All of this assuming you could actually get hardware that was on the cluster-supported list.

Microsoft went back to the drawing board with Server 2008 and started from scratch on many user interface elements, including management and cluster creation. The company also simplified hardware requirements to make clustering more accessible. Windows Server 2003 had a number of different quorum models to cater to different scenarios, such as the File Share Witness, which was needed for clusters with no common storage. The File Share Witness was initially required for Exchange Cluster Continuous Replication. Server 2008 merged all the different quorum models into a single unified model that could run in different modes but was far simpler to understand.

The cluster creation experience in Server 2008 consists of launching the cluster creation wizard and specifying the servers that will be in the cluster, a name for the new cluster, and an IP address if DHCP isn't configured on the NICs. That's it: three dialog screens in total. The cluster creation process analyzes the servers being added to the cluster, ascertains the availability of common storage, selects the right quorum mode based on storage and number of nodes, and configures all of the nodes in one go. There's no need to go to each node to set up the cluster. There's also a validation stage as part of the cluster creation that checks your hardware and configurations. Assuming validation passes (which is likely, as long as your nodes are running the same processor architecture, version of Windows, and so on), your cluster is supported by Microsoft, with no need to check a Microsoft Hardware Certification List for your cluster or server hardware.
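
To make the quorum decision concrete, here is a minimal Python sketch of how a quorum mode might be chosen from the node count and the availability of common storage. It follows the general guidance for the Server 2008 quorum modes (an even number of votes needs a tie-breaker), and it is not the wizard's actual logic. The function names recommend_quorum_mode and votes_needed are hypothetical.

```python
def recommend_quorum_mode(node_count: int, has_shared_storage: bool) -> str:
    """Return a commonly recommended quorum mode (illustrative, not the wizard's code)."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 1:
        # An odd number of node votes cannot split into a tie.
        return "Node Majority"
    if has_shared_storage:
        # An even node count needs a tie-breaking vote; a witness disk supplies it.
        return "Node and Disk Majority"
    # No common storage: a file share witness supplies the extra vote instead.
    return "Node and File Share Majority"


def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


if __name__ == "__main__":
    for nodes, shared in [(3, True), (4, True), (2, False)]:
        print(nodes, shared, recommend_quorum_mode(nodes, shared))
```

The point of the extra vote is visible in votes_needed: four nodes plus a witness give five votes, so three are enough to keep the cluster running after a split.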

Ongoing management is just as simple. Any time you need to make a change, there are wizards to guide you through the modification. If you have a problem, running the validation again often gives good insight into the cause of the problem. This information is further improved with Server 2008 R2, and Server 2008 R2 also gives you full PowerShell management support for clusters.

I still experience some downtime with a failover cluster.

This is a common misunderstanding about failover clusters and can cause frustration. It's the distinction between high availability, which failover clustering provides, and fault tolerance, where failover clustering can be only a part of the solution.

Failover clustering provides a framework of capabilities that services and applications can take advantage of in different ways. At the most basic level, failover clustering keeps an eye on all the nodes in the cluster. If one node becomes unavailable, clustering moves the services and dependent resources from the dead node and distributes them through the rest of the cluster, onto the remaining healthy nodes. With this basic usage of failover clustering, you'll see some downtime when the node hosting a service or application crashes. That crash has to be detected. Then the resources that node had mounted, such as LUNs, must be mounted on a new target node, and the service or application has to be restarted. All of these steps take time, so the service will be unavailable for a while. This would be common for something like a file or print service that's hosted as part of a cluster. It's also the case with services such as Exchange Server 2007 and Exchange 2003 Single Copy Cluster. The key fact is that failover clustering technology will get the service restarted and available again as quickly as possible, providing high availability, but not 100 percent availability.
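
The basic failover sequence can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how the cluster service is implemented: the Node class, the fail_over and outage_seconds functions, and the least-loaded placement rule are all hypothetical, and the timings passed in are made-up inputs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True
    services: list = field(default_factory=list)


def fail_over(nodes, failed_name):
    """Move every service off the failed node onto the least-loaded healthy node."""
    failed = next(n for n in nodes if n.name == failed_name)
    failed.healthy = False
    survivors = [n for n in nodes if n.healthy]
    if not survivors:
        raise RuntimeError("no healthy nodes left to host the services")
    moved = {}
    for service in failed.services:
        # Assumed policy: pick the survivor currently hosting the fewest services.
        target = min(survivors, key=lambda n: len(n.services))
        target.services.append(service)
        moved[service] = target.name
    failed.services = []
    return moved


def outage_seconds(detect, remount, restart):
    """The steps run one after another, so the outage is the sum of their durations."""
    return detect + remount + restart


if __name__ == "__main__":
    cluster = [Node("A", services=["file", "print"]), Node("B", services=["sql"]), Node("C")]
    print(fail_over(cluster, "A"))
    print(outage_seconds(detect=10, remount=20, restart=30), "seconds of downtime")
```

However small each step is, the sum is never zero, which is why this basic model gives high availability rather than uninterrupted service.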

When people talk about fault tolerance, they're talking about a configuration that can tolerate a failure with no service downtime to the end user. Fault tolerant solutions typically require far more complex architectures than failover clustering, because they have to facilitate services running on multiple nodes at the same time. They also have to keep data synchronized between nodes in real time and provide failure detection and failover processes to minimize any downtime to the point that it isn't noticeable. The in-box failover clustering cannot do this for services and applications using Windows-only functionality because of the differences in implementation that are required for all the different ways applications can work.

Failover clustering provides the basic infrastructure that applications and services can build on to provide fault-tolerant solutions, but that's not to say that an application can't be fault tolerant without failover clustering. Many services are fault tolerant without failover clustering, such as Active Directory and IIS farms that use network load balancing.

A good example is Exchange 2010's Database Availability Groups (DAGs). DAGs use failover clustering behind the scenes for certain aspects of resource availability. They then add additional technology to replicate mailbox database data to multiple servers and provide client communication points in the form of Client Access Servers that present the data to the clients from the mailbox servers. If you're seeing short periods of downtime when a node fails, this probably isn't a problem—it's by design.

Why can't I have a cluster over more than one location without expensive network solutions?

Cluster-enabled services typically have a number of resources allocated to them, including an IP address. Within a single location, you can have multiple nodes connected to the same network segment, or at least network segments that can be in the same IP subnet. This means the IP address for the service can be hosted on any node in the cluster, because they all have the same network connectivity capabilities. Now imagine you want to spread a cluster with nodes in multiple locations. Multiple locations typically means different network segments and IP subnets. This is a problem because you can't have a cluster resource IP address of 192.168.1.10 hosted in a location whose subnet is 192.168.10.0. The routing just wouldn't work. The solution to this problem has been to stretch subnets across multiple locations, which typically involves very expensive network implementations, prohibiting all but the largest companies from using clustering in multi-site scenarios.

Server 2008 introduced a key change that brought multi-site clustering to everyone, and it can be summed up in one word: OR. Before Server 2008, you could allocate multiple IP addresses to a service or application as part of the resource group, but all the IP addresses had to be present; they all had to be functional on all nodes in the cluster. The Server 2008 introduction of OR means you can allocate multiple IP addresses to a service or application and specify an OR relationship. The OR lets you allocate multiple IP addresses to cater for the various IP subnets the service may run on in multiple locations. The IP address that matches the location where the service is currently active is used for client connectivity, which now means you can have multi-site clustering without the expensive network solutions.
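
The OR relationship amounts to "use whichever address fits the subnet where the service is running." A minimal Python sketch of that selection follows, using the standard ipaddress module. The function name pick_service_address, the /24 prefix lengths, and the second address 192.168.10.10 are assumptions for illustration; only 192.168.1.10 and the 192.168.10.0 subnet come from the example above.

```python
import ipaddress


def pick_service_address(candidate_ips, node_subnet):
    """Return the first candidate address that belongs to the node's subnet, or None."""
    network = ipaddress.ip_network(node_subnet)
    for ip in candidate_ips:
        if ipaddress.ip_address(ip) in network:
            return ip
    # None of the OR'd addresses is usable at this location.
    return None


if __name__ == "__main__":
    # One address per site, joined by an OR relationship (values are illustrative).
    service_ips = ["192.168.1.10", "192.168.10.10"]
    print(pick_service_address(service_ips, "192.168.1.0/24"))
    print(pick_service_address(service_ips, "192.168.10.0/24"))
```

Only one address needs to come online for the service to start, which is the difference from the earlier requirement that every address be present.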

Just because you can allocate multiple IP addresses in an OR relationship doesn't mean all your multi-site problems will be magically solved. When you have a single IP address for a service, the clients always know the address to talk to the service. If you have multiple IP addresses for a service, the solution is more complicated. You may need to use services, such as DNS, with very short Time to Live values on the hostname records, so clients don't cache old IP addresses, or use the option to register all IP providers so all IP addresses are registered in DNS. More likely, you may use some kind of middle communication tier for the clients, such as (going back to the Exchange 2010 example) the Client Access Server role.

I have high availability at the virtualization and application levels. Which should I use? It's just confusing.

I get this question a lot, but there's a basic piece of guidance that will help you make the right decision. The trend today is towards virtualizing everything you can, and the major virtualization solutions offer high availability services that work in both planned and unplanned situations. In a planned situation, for example, you might want to install a patch that requires a reboot to a Hyper-V server. You can use the Hyper-V Live Migration function to copy the memory and state of the running virtual machines (VMs) to another Hyper-V server and avoid any VM downtime.

Unplanned scenarios, where the virtualization server just crashes, don't give you time to copy the VMs' memory and states to other virtualization servers, so the VMs have to be restarted on a new virtualization server in a crash-consistent state. The services offered by the VMs will be unavailable while the guest OS boots and the services start. So, with virtualization you have the option of high availability at the virtualization level, but with unplanned server downtime, you'll have a period of unavailability.

The alternative is to enable high availability within the guest OSs using traditional technologies, such as failover clustering, with the applications. This requires that the applications support failover clustering. If they do, application-aware high availability will generally offer far less downtime than would be associated with restarting the OS (which you have to do with virtualization high availability).

Consider an Exchange mailbox server that's made highly available through the virtualization layer and one that's made highly available within the guest OS. When using virtualization high availability, you install one instance of the Exchange mailbox server role in a VM, with its configuration and virtual hard disks on shared storage. You make the VM highly available through the virtualization features (in the case of Hyper-V, failover clustering is actually used on the Hyper-V hosts). If the server hosting the VM crashes, another server will restart the VM in exactly the same way a physical box has to reboot after a crash. There would be a possibility of disk and database corruption due to improper shutdown, so it may need to run integrity checks, which can be very slow. This scenario is illustrated in the left side of Figure 1.

Figure 1

If you instead employ Exchange's high availability features, illustrated in the right side of Figure 1, which use failover clustering in the guest OSs, you have two instances of the Exchange mailbox server role (with Exchange 2010, you can have up to 16 in a cluster or DAG). It's critical that each instance be on a separate server, because you're not adding much benefit by hosting both instances on the same physical box. You should add anti-affinity rules to make sure the instances don't run on the same box. You don't need to use shared storage.
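
An anti-affinity rule boils down to a placement check: no two members of a group may share a host. The Python sketch below illustrates the idea only; it is not how Hyper-V or the cluster service evaluates such rules, and the function names, VM names, and host names are hypothetical.

```python
def violates_anti_affinity(placement, group):
    """placement maps VM name -> host name; group lists VMs that must not share a host."""
    hosts = [placement[vm] for vm in group if vm in placement]
    # A duplicate host in the list means two group members share a physical box.
    return len(hosts) != len(set(hosts))


def choose_host(vm, group, placement, hosts):
    """Return the first host not already running another member of the group, or None."""
    used = {placement[other] for other in group if other != vm and other in placement}
    for host in hosts:
        if host not in used:
            return host
    return None


if __name__ == "__main__":
    mailbox_group = ["mbx1", "mbx2"]
    placement = {"mbx1": "hostA"}
    placement["mbx2"] = choose_host("mbx2", mailbox_group, placement, ["hostA", "hostB"])
    print(placement, violates_anti_affinity(placement, mailbox_group))
```

If choose_host returns None, there is no host that satisfies the rule, and whether to leave the VM offline or place it anyway is a policy decision this sketch does not make.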

Each instance runs the Exchange software. Logs are shipped from the active copy of the database to the passive copy and replayed there, keeping the databases synchronized. If the server that's hosting the active copy crashes, the guest OS hosting the passive copy will see that the active Exchange mailbox server is no longer responding and take ownership of the mailbox server IP and name resources. It will try to copy any missing transaction logs, check with hub transport to make sure no messages have been lost, and start offering mailbox services from its own copy of the database. This way is much faster and cleaner than high availability at the virtualization layer.
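
Log shipping and replay can be modeled in miniature. The Python sketch below is a toy under stated assumptions and does not represent Exchange's replication code: the DatabaseCopy class, the ship_logs function, and the (sequence, key, value) log format are all hypothetical.

```python
class DatabaseCopy:
    """A passive database copy that applies transaction logs strictly in order."""

    def __init__(self):
        self.data = {}
        self.last_applied = 0  # sequence number of the newest log replayed

    def replay(self, log):
        seq, key, value = log
        if seq != self.last_applied + 1:
            raise ValueError(f"log {seq} is out of order")
        self.data[key] = value
        self.last_applied = seq


def ship_logs(active_logs, passive):
    """Replay every log the passive copy has not applied yet; return how many."""
    missing = [log for log in active_logs if log[0] > passive.last_applied]
    for log in sorted(missing, key=lambda entry: entry[0]):
        passive.replay(log)
    return len(missing)


if __name__ == "__main__":
    logs = [(1, "inbox", "msg-1"), (2, "sent", "msg-2"), (3, "inbox", "msg-3")]
    passive = DatabaseCopy()
    ship_logs(logs[:2], passive)      # routine shipping while the active copy is up
    print(ship_logs(logs, passive))   # at failover, only the missing log is copied
    print(passive.data)
```

Because the passive copy is already nearly current, failover only has to close a small gap instead of booting an OS and checking a database after an improper shutdown.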

In general, if you're running an application that supports high availability, such as Exchange or SQL Server, it's better to enable high availability at the application level within the guest OSs to achieve the optimum high availability. If you have an application that doesn't support high availability, enabling high availability at the virtualization layer is the next best thing.
