
Exchange Server's Client Access: Load-Balancing Your Servers

With all client connections coming through this server role, it's crucial to ensure that your servers can handle the load

The Client Access server role plays a big part in Microsoft Exchange Server 2010 by providing the access point for every Exchange client. With such a big responsibility, you need to ensure that your Client Access servers can handle the load from your users and that these servers have minimal downtime.

In my previous articles in this series, I provided an introduction to the Client Access server role ("Exchange Server's Client Access: An Introduction") and gave you an overview of how to deploy it in your organization ("Exchange Server's Client Access: Deploying Your Servers"). In this article, I focus on giving your Client Access servers a little R&R—that's Resiliency and Redundancy. I'll do this by showing you how load balancing works and then how to apply it to the Client Access role.

 

Introduction to Load Balancing

At a high level, load balancing is a way to distribute workload across multiple systems. Sometimes people refer to these systems as a farm or an array. By distributing the load, you maximize your servers' resource use while minimizing response delay and system downtime. A load-balanced array is also inherently redundant: if one of the systems in the array goes offline, its load transfers automatically to another system.

Network-based load balancing is a form of clustering, but you shouldn't confuse it with the concept of resource clustering. The difference between the two is that in resource clustering, the group of servers shares resources. These resources can take many forms, such as system services, IP addresses, or data. In network-based load balancing, however, there are no shared resources across the nodes. Instead, load balancing is TCP/IP-based. A user accesses a network service on a specific port, such as an HTTPS connection on port 443, and the load balancer ensures that one of the servers in the array responds to that connection.

Load balancers come in two primary forms: external devices and server software. Different load-balancing products not only work differently but also scale differently and provide unique features. In general, though, the premise is the same—they intercept network traffic before it reaches the service and distribute the load across the array of servers.

External load balancers sit in front of the server array and are exposed to the end user or reverse proxy instead of the individual servers. Figure 1 shows a load-balanced array of Client Access servers with an external load balancer. In this design, the load balancer has a unique name (mail.contoso.com) that users access directly. When a connection is established with the load balancer, the device determines which Client Access server will do the work. The external load balancer then brokers the connection to one of the Client Access server nodes.

Figure 1: A load-balanced array of Client Access servers with an external load balancer

 

There are several ways an external load balancer can determine which Client Access server to direct the connection to. For example, it can use predefined rules, such as a client subnet mapping, or it can take a round-robin approach. The load balancer, based on its monitoring configuration, can determine if a back-end server is healthy. Different types of load balancers can perform different types of monitoring, such as determining if a service is running or if a port is accessible on the Client Access server.

A software-based load balancer is installed on each of the servers in the array. One common software-based load balancer you might see in smaller Client Access server implementations is the Network Load Balancing (NLB) service that's built in to Windows Server. Software load balancers typically use predefined algorithms to determine which server should handle user requests. Figure 2 demonstrates a typical Windows NLB configuration.

Figure 2: A typical Windows NLB configuration for Client Access servers

 

In this case, the user connects to a virtual name and IP address that's shared by every server in the array. This shared IP address is assigned a unique MAC address that's used by every node, either in unicast or multicast mode. In unicast mode, the MAC address of one of the network adapters in each node is replaced with the shared MAC address. In multicast mode, there's an additional (multicast) MAC address added, so the network adapters retain their original MAC address as well. In either case, each server receives the network traffic for the shared IP address. To determine which node in the load-balanced array handles the network packets, a filtering algorithm is used. For Windows NLB, this algorithm is based on the IP address of the client that's connecting to the array. Every node in the array uses the same algorithm, so only one node will respond to the packet and the others drop it.
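
If you want to experiment with this, the NLB feature on Windows Server 2008 R2 and later includes a NetworkLoadBalancingClusters PowerShell module that can build an array like the one in Figure 2. The following is only a minimal sketch: the interface name (LAN), cluster IP address, and node name are placeholders you'd replace with your own values.

Import-Module NetworkLoadBalancingClusters

# Create the cluster on the first node, binding the shared virtual IP in
# multicast mode so each adapter also keeps its original MAC address
New-NlbCluster -InterfaceName "LAN" -ClusterName "mail.contoso.com" `
  -ClusterPrimaryIP 192.0.2.10 -SubnetMask 255.255.255.0 -OperationMode Multicast

# Join the second Client Access server to the array
Get-NlbCluster | Add-NlbClusterNode -NewNodeName "CAS2" -NewNodeInterface "LAN"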

Windows NLB isn't suitable for every deployment, however. Although it technically supports as many as 32 nodes, you should never use Windows NLB to load-balance more than 8 Client Access servers. Another disadvantage of Windows NLB is its limited built-in intelligence. It can monitor whether a port is up or down, but that's it. If a service crashes or something else breaks on the Client Access server, Windows NLB might still think the service is running and continue to direct clients to the server. Also, for Exchange deployments where you have a couple of all-in-one servers that participate in a Database Availability Group (DAG), you can't use Windows NLB because it's incompatible with Windows Failover Clustering.

 

Understanding Persistence

Load balancing works particularly well when no session-specific data needs to be maintained for each connection, such as when connecting to a farm of web servers that host static data. This situation doesn't apply to Client Access servers, however. Outlook Web App (OWA), Exchange Control Panel (ECP), and Exchange Web Services (EWS) require session state to be maintained. To understand how session-specific data affects these services, let's look at an example. Suppose there are two Client Access server nodes in an array (CAS1 and CAS2) and the virtual name of the array is mail.contoso.com. A user accesses https://mail.contoso.com/owa to log on to her web mail account. When she accesses mail.contoso.com, the connection is handled by CAS1. She enters her credentials and establishes an authenticated session with CAS1. What would happen if the user opened a message and the load balancer decided that CAS2 was going to handle the request? Because session data isn't shared between CAS1 and CAS2, the user isn't authenticated with CAS2. Instead of the message opening, she would be prompted for re-authentication.

To solve this problem, load balancers employ a technique called persistence or sticky connections, which ensures that after the connection is established with CAS1, that connection always goes to CAS1 while the session is open. The only acceptable time that the connection might be shifted to CAS2 is if CAS1 goes offline. In that case, the user would re-authenticate and continue working in CAS2. For an external load balancer, persistence is configured in the device itself and not on the Client Access server. Typically, external load balancers achieve persistence based on the client's IP address, the session ID of the SSL connection, or through the use of cookies. Of the three, cookie-based persistence is the most commonly employed technique when load balancing OWA.

In Windows NLB, persistence is configured in the affinity setting. You have three options when configuring affinity—Network, Single, or None. Table 1 outlines what each of these affinity settings does.
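
For example, if you're using the NLB PowerShell module, you could publish only TCP port 443 through the cluster and give the rule Single affinity so that each client IP address keeps landing on the same node. This is a rough sketch under the same assumptions as the earlier example; it replaces the default catch-all port rule first.

# Replace the default catch-all rule with an HTTPS rule that uses Single affinity
Get-NlbClusterPortRule | Remove-NlbClusterPortRule -Force
Add-NlbClusterPortRule -InterfaceName "LAN" -StartPort 443 -EndPort 443 `
  -Protocol Tcp -Affinity Single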


Load-Balancing HTTP Traffic

Load-balancing techniques and configurations differ between different Client Access server services. Let's look at the HTTP-based services that should be load-balanced and discuss how to handle each one.

OWA and ECP. OWA and ECP connections can be load balanced like a standard web application. You'll want to ensure that you use persistence when load balancing these services because session data is maintained. If you don't, users might be randomly asked to re-authenticate when they're directed to another server in the array. Cookie-based persistence is commonly used with external load balancers. Make sure you use the same load-balancer configuration for both OWA and ECP to ensure that a user who clicks the Options button in OWA doesn't get directed to another server when connecting to ECP. For both OWA and ECP, you want to load-balance TCP port 443.
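
Along the same lines, it helps to make sure the OWA and ECP virtual directory URLs point at the load-balanced name rather than at an individual server. Here's a hedged sketch for one node (CAS1), using the mail.contoso.com name from Figure 1; you'd run the equivalent commands for each server in the array.

Set-OwaVirtualDirectory "CAS1\owa (Default Web Site)" `
  -InternalUrl "https://mail.contoso.com/owa"
Set-EcpVirtualDirectory "CAS1\ecp (Default Web Site)" `
  -InternalUrl "https://mail.contoso.com/ecp"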

Exchange Web Services. EWS connections can be load-balanced in a manner similar to OWA. If an EWS client makes a call that keeps some form of stateful data in memory on the Client Access server (such as a subscription), then there needs to be assurance that the client will hit that same Client Access node on subsequent calls. Some EWS clients won't require persistence, but some will. Therefore, you'll want to provide persistence on your load balancer for EWS. However, EWS clients might not support cookie-based persistence, so you'll need to use persistence based on SSL session ID or client IP address. To ensure that EWS requests proxied between sites are persistent, EWS uses a special parameter on its virtual directory called InternalNLBBypassUrl. When you install the Client Access server, this parameter is set to the internal name of the Client Access server. This parameter is used to ensure that the correct Client Access node is used in EWS proxy scenarios, so you shouldn't change this URL.
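
You can confirm the distinction from the Exchange Management Shell: InternalUrl is the address you'd normally point at the load-balanced name, whereas InternalNLBBypassUrl should keep referencing the individual server. This check is read-only, so it's safe to run as-is.

Get-WebServicesVirtualDirectory | Format-List Identity,InternalUrl,InternalNLBBypassUrl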

Outlook Anywhere. If you used Outlook Anywhere in Exchange 2007 without persistence or with SSL session ID–based persistence, you could run into remote procedure call (RPC) problems around DSProxy. RPC connections are full-duplex connections: They require that data can be sent and received at the same time. HTTP doesn't allow for such transmissions because it's only half-duplex. So to simulate the required behavior in RPC/HTTP, two connections are established—RPC_IN_DATA for the incoming connection and RPC_OUT_DATA for the outgoing connection. Each of these connections is associated with a session ID for the clients. When the RPC component receives these connections with a matching session ID, it knows it needs to reply to RPC_IN_DATA requests over the RPC_OUT_DATA connection. If the RPC endpoint is the same for RPC_IN_DATA and RPC_OUT_DATA, it doesn't matter which Client Access server the connection is brokered through. Both the Information Store (port 6001) and the referral service (port 6002) had no problems with this in the past.

However, in Exchange 2007, DSProxy (port 6004) simply proxied these connections rather than being the actual endpoint. Because of this setup, the RPC_IN_DATA and RPC_OUT_DATA connections would sometimes be established with different domain controllers (DCs), breaking directory connections. There were some workarounds in Exchange 2007 to prevent this from happening, but the workarounds caused additional risks, such as tying Outlook profile creation to a single DC.

In Exchange 2010, this problem is resolved because DSProxy is no longer used. Instead, referrals are used for directory connections. The Name Service Provider Interface (NSPI) now exists on the Client Access server in the form of the Address Book service, so the client establishes the directory connection with the Client Access server and the Client Access server connects to a DC over LDAP. Even though persistence isn't required, you might still choose to use it to minimize session handshakes, although cookie-based persistence can't be used with Outlook Anywhere.

 

Load-Balancing MAPI Traffic

Having to load-balance MAPI traffic is a new problem we face in Exchange 2010. Until now, MAPI traffic went directly to Mailbox servers, so no form of load balancing was necessary. However, because MAPI clients connect to the Client Access server through the RPC Client Access service in Exchange 2010, you need to ensure that load balancing is in place if you want a fully redundant and highly available configuration. Exchange 2010 therefore introduces a new Active Directory (AD) object, the Client Access array, for MAPI load balancing. This object lets you address a group of Client Access servers in a site as a single array. You can't have more than one Client Access array per AD site.

MAPI traffic is based on RPC, which works differently from the HTTP traffic used by other Client Access services. RPC is used to execute code on another system on the network or in a different address space on the same system. To establish an RPC connection, a request is made on the RPC endpoint mapper port, 135. From there, a port is pulled from a dynamic range (ports 1024–65535) and that port is assigned to the RPC connection. Subsequent communications occur over one of the dynamically assigned ports.

Because of this model, RPC requires a wide range of ports to be open to clients. Therefore, when load-balancing MAPI over RPC, all of these ports—135 and 1024 through 65535—need to be specified in the load-balancer configuration. However, you also have the option of statically defining the RPC ports used. If you do this, the load balancer needs to be configured for both mailbox access and the Address Book service because they use separate RPC connections. To set a static port for mailbox access, create a DWORD value named TCP/IP Port on your Client Access server under the following registry key: HKLM\System\CurrentControlSet\Services\MSExchangeRPC\ParametersSystem. You might have to create the ParametersSystem subkey if it doesn't already exist. For this value, enter the port number you want to use. As Figure 3 shows, I'm using port 50000.

Figure 3: Setting a DWORD value to create a static port for mailbox access
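
If you'd rather script this change than edit the registry by hand, a short PowerShell sketch along these lines should do it (50000 matches the port in Figure 3; substitute the port you've chosen):

$path = "HKLM:\System\CurrentControlSet\Services\MSExchangeRPC\ParametersSystem"
# Create the ParametersSystem subkey if it doesn't already exist
if (-not (Test-Path $path)) { New-Item -Path $path -Force | Out-Null }
# Create (or overwrite) the TCP/IP Port DWORD value with the static port
New-ItemProperty -Path $path -Name "TCP/IP Port" -PropertyType DWord -Value 50000 -Force | Out-Null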

For the Address Book service, you need to edit the XML file named microsoft.exchange.addressbook.service.exe.config in the Bin folder where you installed Exchange. For a default installation, that path would be C:\Program Files\Microsoft\Exchange Server\V14\Bin\. Open this file in Notepad and find the line that reads

<add key="RpcTcpPort" value="0" />

Change the value from 0 to the number of the port that you want to use, as Figure 4 shows. Reboot the Client Access server after making these changes.

Figure 4: Editing the XML file for the Address Book service to use a static port
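
You can script this edit as well. The rough sketch below assumes the default installation path mentioned earlier, assumes the setting still has its default value of 0, and uses 50001 purely as an example port:

$config = "C:\Program Files\Microsoft\Exchange Server\V14\Bin\microsoft.exchange.addressbook.service.exe.config"
# Swap the default RpcTcpPort value (0) for the static port you've chosen
(Get-Content $config) -replace 'key="RpcTcpPort" value="0"', 'key="RpcTcpPort" value="50001"' |
  Set-Content $config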

 

If you set static ports, ensure that you make this same change on every Client Access server in the array. You can then configure your load balancer to load-balance only these ports instead of the entire range. When using static ports, you'll want to ensure that the load balancer is configured for port 135 in addition to the two static ports that you defined.

After the load balancer is configured, you're halfway to having your MAPI clients load-balanced. To complete this process, you need to create a Client Access array object that's assigned to the AD site that the load-balanced Client Access servers exist in. You can do this with the following Exchange Management Shell (EMS) cmdlet:

New-ClientAccessArray -FQDN outlook.contoso.com -Site "Baltimore"
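
To confirm that the array was created and that it's associated with the right site, you can run a quick read-only check such as:

Get-ClientAccessArray | Format-List Name,Fqdn,Site,Members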

Next, you have to configure each of the mailbox databases in that site to use that Client Access array. This step is important because this parameter is used to tell your Outlook clients which Client Access server is used for directory connections—remember, the Client Access server now contains the NSPI endpoint. If you already have the Client Access array object created and assigned to a site, all new mailbox databases created in that site are automatically configured correctly. However, if you had databases in existence before you created the Client Access array object, you can configure the database to use the array with the following EMS cmdlet:

Set-MailboxDatabase DB01 -RpcClientAccessServer outlook.contoso.com
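
If you have more than a handful of existing databases, you can update them in one pass by piping them into Set-MailboxDatabase. Keep in mind that Get-MailboxDatabase returns every database in the organization, so in a multi-site deployment you'd want to filter the list down to the databases that belong to this site first.

Get-MailboxDatabase | Set-MailboxDatabase -RpcClientAccessServer outlook.contoso.com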

 

Putting It All Together

Now that we've looked at how load balancing works and understand the factors that affect load-balancing Client Access servers, let's put all the pieces together in a series of steps. You can use the following process to load-balance your Client Access servers.

  1. Install the Client Access servers and optionally configure static ports for RPC connections by MAPI clients.
  2. Create the DNS entries for your load-balanced names. In this article, I used mail.contoso.com for web-based clients and outlook.contoso.com for MAPI clients (see the DNS sketch after this list).
  3. Install and configure the load balancer. Table 2 summarizes the ports and persistence settings that you need for each Client Access service.
  4. Create the Client Access array object using the New-ClientAccessArray cmdlet.
  5. Configure your existing mailbox databases in the site to use the new Client Access array object.
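
For step 2, if you're running Windows DNS, the records can be created with dnscmd. The DNS server name and IP addresses below are placeholders for your own DNS server, the external load balancer's virtual IP, and the virtual IP used for the Client Access array:

dnscmd dns01.contoso.com /RecordAdd contoso.com mail A 192.0.2.10
dnscmd dns01.contoso.com /RecordAdd contoso.com outlook A 192.0.2.20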

When you understand the configuration nuances of each Client Access service, setting up a load-balanced Client Access server array isn't difficult. Vendors of hardware load balancers might have some specific guidance for configuring their particular device for Exchange 2010. Therefore, ensure that you understand your vendor's recommendations in addition to the information that I've provided in this article.
