Clustering Terms and Technologies

Active/active configuration: In an active/active cluster configuration, both servers perform meaningful work all the time. For example, an active/active cluster might have Microsoft SQL Server running on both nodes, or SQL Server on one and Exchange on the other. Active/active clusters let you fully utilize your server resources and can support load balancing.

Active/standby configuration: In an active/standby cluster configuration, one server functions as a cold standby (i.e., it performs no meaningful work) to an active server, or one server performs meaningful work but does not run the same application at the same time as the other server. For example, an active/standby cluster might have Microsoft SQL Server installed on both nodes, but running on only one node at a time; when a failover occurs, the database moves to the other system.

Alias: When you create a cluster, you create an alias that users connect to. From the cluster's standpoint, the alias refers to the server that owns the service at a particular time. From the users' standpoint, the alias refers to the service they connect to (e.g., SQL Server), regardless of which server the service resides on. When a failover occurs, users reconnect (unless the application or client operating system handles reconnection for them) to the alias (not to a server) and continue working. If users connect directly to the primary server rather than to the alias, they cannot reestablish their connections when the primary server fails.
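As a rough illustration of why clients should connect to the alias rather than to a server, the following minimal sketch models an alias as a name that is repointed on failover. The class, the alias name SQLCLUSTER, and the node names are hypothetical and do not correspond to any vendor's product or API.

```python
class ClusterAlias:
    """Maps a stable alias name to whichever node currently owns the service."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner  # node that currently owns the service

    def resolve(self):
        """Return the node a client connecting to the alias would reach."""
        return self.owner

    def fail_over(self, new_owner):
        """Repoint the alias at the surviving node."""
        self.owner = new_owner


alias = ClusterAlias("SQLCLUSTER", owner="NODE_A")
before = alias.resolve()    # client reaches NODE_A
alias.fail_over("NODE_B")   # NODE_A fails; the alias moves
after = alias.resolve()     # same alias, now served by NODE_B
```

A client that had stored "NODE_A" directly would have nothing to reconnect to after the failure; a client that stored only the alias reconnects without knowing the owner changed.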

Application recovery kit: Most clustering solutions have application recovery kits (DLLs, services, etc.) for specific server-side applications, such as Microsoft SQL Server, Microsoft Exchange Server, and Lotus Notes. The kit enables the clustering software to fail over all resources (files, IP addresses, disk drives) associated with an application from one node to the other.

Client agent software: In the past, some NT clustering solutions (most notably, Digital Clusters for Windows NT 1.0) required agent software to run on the client side. This software let users access cluster aliases, and even let some client applications understand and handle the service disruption when the primary node failed. Client agent software provided a way to easily create cluster-aware applications.

Cluster objects and groups: Most clustering solutions employ cluster objects and groups (services). An application, a disk drive (for database volumes, files, or directories), an IP address, and so forth are examples of cluster objects. A cluster group is a collection of related objects that make up a cluster resource such as SQL Server.
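The relationship between objects and groups can be sketched as a simple data structure. This is only an illustrative model; the class names, the drive letter, the IP address, and the executable name are assumptions, not part of any clustering product.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterObject:
    kind: str  # e.g. "application", "disk", "ip_address"
    name: str


@dataclass
class ClusterGroup:
    name: str
    owner: str  # node that currently owns every object in the group
    objects: list = field(default_factory=list)

    def move_to(self, node):
        """Move the whole group; its objects always travel together."""
        self.owner = node


sql_group = ClusterGroup(
    name="SQL Server",
    owner="NODE_A",
    objects=[
        ClusterObject("application", "sqlservr"),
        ClusterObject("disk", "E:"),
        ClusterObject("ip_address", "10.0.0.50"),
    ],
)
sql_group.move_to("NODE_B")  # application, disk, and address move as one unit
```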

Clustering APIs: Most clustering vendors provide APIs so that you can design recovery kits for applications not covered by the basic clustering software. These APIs let you program to an established (possibly proprietary) standard; several vendors will support the Microsoft Wolfpack API standard.

Clustering DLLs: Clustering solutions have specific DLLs that provide cluster-aware functionality to let programmers create applications that fail over gracefully from one node to the other on system failure.

Distributed lock management (DLM): Distributed lock management (DLM) enables two servers to access the same physical disk at the same time without corrupting the data. If a device is updating a particular file or piece of data, the device gets locked so that another controller can't seize ownership and overwrite the data. NT does not currently support DLM, so disks are dedicated to one node or the other.
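The core idea of lock management can be shown with a toy, single-process lock table. A real DLM coordinates locks across machines over the interconnect; this sketch, with hypothetical names throughout, shows only the ownership rule that prevents two nodes from writing the same data at once.

```python
class LockManager:
    """Toy lock table: at most one node may hold the lock on a resource."""

    def __init__(self):
        self._owners = {}  # resource name -> node holding the lock

    def acquire(self, resource, node):
        """Grant the lock if it is free or already held by this node."""
        holder = self._owners.get(resource)
        if holder is None or holder == node:
            self._owners[resource] = node
            return True
        return False

    def release(self, resource, node):
        """Release the lock, but only if this node actually holds it."""
        if self._owners.get(resource) == node:
            del self._owners[resource]


dlm = LockManager()
got_a = dlm.acquire("orders.dat", "NODE_A")        # granted
got_b = dlm.acquire("orders.dat", "NODE_B")        # refused while NODE_A holds it
dlm.release("orders.dat", "NODE_A")
got_b_retry = dlm.acquire("orders.dat", "NODE_B")  # granted after the release
```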

Failback: Failback switches the service or object from the secondary node back to the primary node after a failover has occurred, and resets ordinary operating conditions. Failback is usually a manual process, but can be automatic.

Failover: Failover occurs when a node in the cluster or a component of the cluster configuration fails. Cluster services or resources from the faulty node relocate to the operational node. You can also trigger failover manually (e.g., by turning off a system to upgrade it), taking one server offline and shifting the work to the secondary server. System failover capability means that only the entire system can fail over; individual objects (such as applications, disk volumes, etc.) cannot. The secondary system assumes the identity and workload of the primary system (only on system failure) in addition to its identity and workload. Application failover capability means that the systems administrator can fail over one application or object at a time, keeping all other services and users intact on the nodes.
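The difference between system failover and application failover can be sketched as two operations on a table of group owners. The function names, group names, and node names here are hypothetical and serve only to contrast the two behaviors.

```python
def system_failover(owners, failed_node, survivor):
    """Move every group owned by the failed node to the survivor."""
    return {
        group: (survivor if node == failed_node else node)
        for group, node in owners.items()
    }


def application_failover(owners, group, target):
    """Move one group and leave every other group where it is."""
    updated = dict(owners)
    updated[group] = target
    return updated


owners = {"SQL Server": "NODE_A", "File Share": "NODE_A", "Exchange": "NODE_B"}

# System failover: NODE_A goes down, so everything it owned moves.
after_system = system_failover(owners, failed_node="NODE_A", survivor="NODE_B")

# Application failover: only SQL Server moves; File Share stays on NODE_A.
after_app = application_failover(owners, group="SQL Server", target="NODE_B")
```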

Fault-tolerant cluster: A fault-tolerant cluster ties together every action of the two nodes in the cluster, including the instructions running on the CPUs (i.e., the CPUs on each server run in lockstep). One server can completely fail without cluster users ever knowing the difference. The other server takes over instantly because it has been performing the same work as the primary server.

Heartbeat: A heartbeat is the signal that the nodes in a cluster send each other to verify they are alive and functioning. The nodes transmit the heartbeat over direct (crossover) LAN connections, through a hub or switch, or even via the SCSI bus. If the heartbeat ceases, one of the nodes has failed and the clustering software instructs the other node to take over. Employing more than one method to generate a heartbeat eliminates the problem of a minor failure triggering an unwanted failover.
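To illustrate why a second heartbeat path avoids unwanted failovers, the sketch below declares the peer failed only when every channel has been silent longer than a timeout. The class, the channel names, and the five-second timeout are assumptions chosen for the example, and timestamps are passed in explicitly so the logic is easy to follow.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen on each channel from the peer node."""

    def __init__(self, channels, timeout):
        self.timeout = timeout
        self.last_seen = {channel: 0.0 for channel in channels}

    def record(self, channel, now):
        """Note that a heartbeat arrived on this channel at time `now`."""
        self.last_seen[channel] = now

    def peer_failed(self, now):
        """The peer counts as failed only if all channels have timed out."""
        return all(now - seen > self.timeout for seen in self.last_seen.values())


monitor = HeartbeatMonitor(["lan", "scsi"], timeout=5.0)
monitor.record("lan", now=10.0)
monitor.record("scsi", now=10.0)
monitor.record("scsi", now=14.0)  # the LAN link goes quiet; SCSI still beats

lan_only_down = monitor.peer_failed(now=16.0)  # SCSI beat is recent: no failover
both_down = monitor.peer_failed(now=20.0)      # both channels silent: fail over
```

With a single channel, the quiet LAN link alone would have triggered a failover even though the peer node was still running.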

Interconnect: The interconnect provides a communications link between the nodes in a cluster. Status information, heartbeat, and other intercluster data travels over the interconnect. This connection can be over your LAN or directly from node to node, using Ethernet, 100Base-T, ServerNet, Fibre Channel, serial, SCSI, and so forth. Fault-tolerant clustering solutions typically use more than one interconnect simultaneously to prevent unwanted failovers.

Load balancing: Currently, NT clustering solutions do not support dynamic load balancing (performance clusters), but you can perform manual load balancing. For example, on one server, you can run Microsoft SQL Server with the accounting department's database, and on the other server, run SQL Server with the order entry department's database. Each server is fully loaded, with separate databases on one shared SCSI disk array. Each server is the primary owner for one database and the secondary owner for the other; both servers are fully utilized but cannot run both databases at the same time.
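Manual load balancing of this kind amounts to a static assignment table that an administrator sets up in advance. The sketch below is a hypothetical model of the accounting and order entry example; the table layout, function name, and node names are assumptions.

```python
# database -> (primary owner, secondary owner), fixed by the administrator
ASSIGNMENTS = {
    "accounting": ("NODE_A", "NODE_B"),
    "order_entry": ("NODE_B", "NODE_A"),
}


def current_owner(database, live_nodes):
    """Return the node that should run the database, or None if both are down."""
    primary, secondary = ASSIGNMENTS[database]
    if primary in live_nodes:
        return primary
    if secondary in live_nodes:
        return secondary
    return None


# Normal operation: each node runs one database.
normal = {db: current_owner(db, {"NODE_A", "NODE_B"}) for db in ASSIGNMENTS}

# NODE_A fails: NODE_B picks up accounting in addition to order entry.
degraded = {db: current_owner(db, {"NODE_B"}) for db in ASSIGNMENTS}
```

Nothing in the table reacts to actual load, which is what distinguishes this arrangement from the dynamic load balancing of a performance cluster.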

Mirrored-disk cluster: In a mirrored-disk cluster, the servers contain duplicate drives (no shared disk array). Data from the primary server is replicated to the secondary server over a dedicated private network connection (or proprietary high-speed interconnect). In the event of a primary server failure, the clustering software shifts object ownership to the secondary server, which uses the duplicate drives to run the applications and data.

Node: A node is one server in a cluster. A node does not include the shared disk array, if one exists.

NT service: A program that runs as an NT service can start or stop automatically with the server, provide key functions without user or administrator interaction, and integrate closely with the operating system. All the NT clustering solutions reviewed in this issue have an NT service component.

NUMA: Non-Uniform Memory Architecture (NUMA, also known as ccNUMA on Intel-based systems) is an alternative multiprocessing or symmetric multiprocessing (SMP) approach to highly available dynamic clusters. NUMA uses a high-speed system interconnect to tie local memory subsystems into one continuous block. NUMA presents many advantages for scaling beyond four or even eight processors, but it requires that software (operating system and applications) understand near and far memory. NT doesn't implement NUMA, and probably never will, because of NT's memory and security architecture.

Performance cluster: Performance clusters support dynamic load balancing. In a performance cluster, adding nodes to the cluster increases performance by distributing the compute load across multiple systems and provides fault tolerance. You can improve server throughput in a linear fashion with each computer you add for the same application. NT clustering solutions do not currently support performance clusters.

Primary server and secondary server: In a two-node cluster, one server is the primary server and the other is the secondary server. The primary server usually controls and runs the service. The secondary server is the failover system. Each server can be both primary and secondary so that the servers back each other up. Both servers can perform meaningful work on the network.

RAID: Redundant Array of Inexpensive Disks (RAID) is a strategy that uses technologies such as disk striping, disk mirroring, and disk striping with parity to offer levels of data redundancy and fault tolerance. All the clustering solutions on NT support fault-tolerant disk subsystems via hardware-based RAID, and Wolfpack will support NT's software-based RAID.

Shared-disk cluster: In a shared-disk cluster, the two servers connect to one disk array. The servers can access the disk array in one of two ways: simultaneous access or shared SCSI access. With simultaneous access, each server can read from and write to the same physical disks (not at the same time) on the same disk subsystem bus. Simultaneous access requires distributed lock management (DLM) to prevent data corruption; NT doesn't support DLM. With shared SCSI (also known as split-SCSI) access, each server is a termination point on the SCSI bus, each server probably has only one system drive installed, and all failover-enabled applications and data for both servers reside on the disk array. Only one server at a time owns the disk drives. When one server fails, the other server takes control of the SCSI bus and all assigned drives.

Shared-everything cluster: Shared-everything clusters follow one of two approaches: shared memory bus architectures or cross-bar systems. In each approach, all CPUs can access all system resources (memory, disk, network, etc.). You will not find the cross-bar architecture on the macro-scale in a clustering environment because it was designed for massively parallel systems such as mainframes. It uses an array configuration of memory blocks and CPU and cache modules (requiring physical locality), resulting in highly scalable systems.

Shared-memory cluster: Think of a shared-memory cluster as an extension of an SMP system's internal design in which all CPUs on one board access one memory system. A high-speed interconnect (e.g., NUMA) ties together two or more nodes in a performance cluster at the memory bus level to create one large, shared, memory pool that all CPUs can simultaneously access. Shared-memory clustering provides dynamic scalability, but NT does not support this technology.

Shared-nothing cluster: In shared-nothing clusters, each node in the cluster is a self-contained, fully functional server. Even if the nodes share a disk array, only one server at a time accesses the disk array, just as if the other system didn't exist. All the NT clustering solutions the Lab reviewed are shared-nothing clusters. The two types of shared-nothing architectures using interconnection networks are network bus architecture and switching fabric architecture. The network bus is a single connection from system to system, resulting in contention and bandwidth difficulties for heavily loaded systems. Vinca StandbyServer is an example that uses the network bus for disk mirroring, even though the CPU, memory, and disk systems of the two servers are separate. The switching fabric architecture uses a high-speed switching technology (such as Tandem ServerNet) to tie the nodes' CPU and memory systems together without the limitations of a bus architecture; each node remains a self-contained system.
