Lync Server 2013 Routing Groups

With the introduction of routing groups, determining where a user account resides in a Lync Server 2013 environment is a little more complex than determining its location in a Lync Server 2010 (or earlier) environment. In addition, because of the integration of Windows Fabric into Lync Server 2013, Microsoft highly recommends having at least three Front End servers in a pool, which affects how users are divided among those servers. To understand the effects of routing groups, Windows Fabric, and the new recommendation, I'll discuss the routing group architecture and how a server failure can affect users and the Front End servers on which they're homed.

Routing Group Architecture

In Lync Server 2010, a hash algorithm determines where a user exists in the Front End pool, and there isn't anything an administrator can do to control it. Because the hash always produces the same result for a given user, you always know on which Front End server the user is homed.

Lync Server 2013 doesn't use this hash algorithm to determine where a user is going to end up. Instead, when you provision a user, the user is immediately put into a particular routing group. Each routing group is assigned to a particular Front End server until that server is rebooted or goes down. You can see a user's routing group in the msRTCSIP-UserRoutingGroupId attribute of the user's Active Directory (AD) object.

Each routing group has a primary, secondary, and tertiary copy when at least three Front End servers are deployed. Windows Fabric spreads the primary copies of the routing groups across the Front End servers, making sure that a single Front End server doesn't contain multiple primary copies of routing groups while another Front End server has none. That way, if a Front End server goes down, user disruption is limited. In addition, Windows Fabric makes sure that a Front End server contains at most one copy of any given routing group. For example, it makes sure that a Front End server doesn't contain both the primary and secondary copies of a routing group, or both the secondary and tertiary copies.
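To make the two placement rules concrete, here is a minimal Python sketch. It is a toy round-robin model, not Windows Fabric's actual placement algorithm, and the function name place_replicas and the server and group names (FE1 through FE4, RG1 through RG3) are assumptions made up for illustration.

```python
from collections import Counter

ROLES = ["primary", "secondary", "tertiary"]


def place_replicas(servers, groups):
    """Toy round-robin placement (NOT Windows Fabric's real algorithm).

    Each routing group gets one copy per role, each on a different server.
    """
    if len(servers) < len(ROLES):
        raise ValueError("need at least three Front End servers")
    placement = {}
    for i, group in enumerate(groups):
        # Starting at offset i spreads the primary copies across servers.
        # Consecutive offsets (i, i+1, i+2) land on distinct servers, so no
        # server ever holds two copies of the same routing group.
        placement[group] = {
            role: servers[(i + k) % len(servers)]
            for k, role in enumerate(ROLES)
        }
    return placement


# Hypothetical pool: four Front End servers, three routing groups.
servers = ["FE1", "FE2", "FE3", "FE4"]
groups = ["RG1", "RG2", "RG3"]
placement = place_replicas(servers, groups)

# How many primary copies each server holds.
primaries = Counter(copies["primary"] for copies in placement.values())

for group, copies in placement.items():
    print(group, copies)
```

In this toy layout every primary copy sits on a different server, which is the property that limits how many users are disconnected when one server fails.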

Server Outage Scenario

When at least three Front End servers are deployed, routing groups are relocated if a Front End server containing a primary copy of a routing group goes down. When the Front End server goes down, users homed on it will experience a momentary disconnect. The Lync client disconnects during the shuffling of routing groups on the Front End servers, then reconnects once the secondary copy of the user's routing group takes over on another Lync Front End server.

Because of this reshuffling, the routing groups might be assigned to different Front End servers. The best way to see what might occur is with an example.

Suppose you have a Lync Server 2013 pool that contains four Front End servers, which contain the primary, secondary, and tertiary copies of three routing groups, as Figure 1 shows.

Figure 1: Front End Servers and Their Routing Groups

Let's say that the FE3.Contoso.com server goes down due to a hardware failure. In response to this failure, the primary, secondary, and tertiary copies of routing groups are rearranged on the servers that are still running. Figure 2 shows the new arrangement. Notice that the primary copies of the three routing groups are on separate Front End servers. Also notice that some of the secondary and tertiary copies were moved so that no Front End server contains more than one copy of the same routing group.

Figure 2: Location of the Routing Groups After FE3.Contoso.com Goes Down
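The reshuffle after a failure can be sketched in Python as follows. This is a simplified model under stated assumptions, not the logic Windows Fabric actually runs: surviving copies are promoted in role order, and the lost copy is rebuilt on a surviving server that doesn't already hold that routing group. The starting layout, the short server names, and the function name fail_over are hypothetical and aren't taken from the figures.

```python
ROLES = ["primary", "secondary", "tertiary"]


def fail_over(placement, failed, servers):
    """Toy failover model (an assumption, not Windows Fabric's real logic)."""
    survivors = [s for s in servers if s != failed]
    new_placement = {}
    for group, copies in placement.items():
        # Keep the surviving copies in role order, so a secondary is promoted
        # to primary (or a tertiary to secondary) when the copy above it is lost.
        kept = [copies[role] for role in ROLES if copies[role] != failed]
        # Rebuild the missing copy on a survivor that holds no copy of this group.
        spare = [s for s in survivors if s not in kept]
        while len(kept) < len(ROLES) and spare:
            kept.append(spare.pop(0))
        new_placement[group] = dict(zip(ROLES, kept))
    return new_placement


servers = ["FE1", "FE2", "FE3", "FE4"]

# Hypothetical starting layout for a four-server pool with three routing groups.
before = {
    "RG1": {"primary": "FE1", "secondary": "FE2", "tertiary": "FE3"},
    "RG2": {"primary": "FE2", "secondary": "FE3", "tertiary": "FE4"},
    "RG3": {"primary": "FE3", "secondary": "FE4", "tertiary": "FE1"},
}

after = fail_over(before, "FE3", servers)

for group, copies in after.items():
    print(group, copies)
```

In this example only RG3 loses its primary copy, so only users in that routing group would see the momentary disconnect. RG1 and RG2 lose a tertiary and a secondary copy respectively, which the model rebuilds without moving their primaries.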

After the hardware problem is fixed and the Front End server is restored to a healthy state (i.e., the Lync-related services are running and the Lync Front End server is able to act as a Session Initiation Protocol, or SIP, Registrar to users again), the routing groups will shuffle once again. However, the routing groups won't necessarily be located on the Front End servers where they resided before the service disruption. Instead, they're assigned to the Front End servers in a way that avoids user disruption (i.e., disconnecting users).

For example, Figure 3 shows what happens when FE3.Contoso.com comes back online. The primary copies of the routing groups remain on their current servers to avoid user disruption, while the secondary copies of all three groups move to FE3.Contoso.com. This arrangement becomes the new steady state for the pool.

Figure 3: Location of the Routing Groups After FE3.Contoso.com Comes Back Online
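One way to reason about which reshuffles disturb users is to compare where the primary copies sit before and after a change. The Python sketch below does that for three hypothetical layouts. Only the rule that primaries stay put and the secondaries move to FE3 on recovery comes from the text above; the tertiary locations, the short server names, and the function name groups_with_moved_primary are assumptions for illustration.

```python
def groups_with_moved_primary(before, after):
    """Return the routing groups whose primary copy changed servers.

    In this simplified view, users in those groups are the ones who would
    see a momentary disconnect.
    """
    return sorted(
        group for group in before
        if before[group]["primary"] != after[group]["primary"]
    )


# Hypothetical layouts; tertiary locations are made up for illustration.
healthy = {
    "RG1": {"primary": "FE1", "secondary": "FE2", "tertiary": "FE3"},
    "RG2": {"primary": "FE2", "secondary": "FE3", "tertiary": "FE4"},
    "RG3": {"primary": "FE3", "secondary": "FE4", "tertiary": "FE1"},
}
fe3_down = {
    "RG1": {"primary": "FE1", "secondary": "FE2", "tertiary": "FE4"},
    "RG2": {"primary": "FE2", "secondary": "FE4", "tertiary": "FE1"},
    "RG3": {"primary": "FE4", "secondary": "FE1", "tertiary": "FE2"},
}
fe3_back = {
    "RG1": {"primary": "FE1", "secondary": "FE3", "tertiary": "FE2"},
    "RG2": {"primary": "FE2", "secondary": "FE3", "tertiary": "FE4"},
    "RG3": {"primary": "FE4", "secondary": "FE3", "tertiary": "FE1"},
}

print(groups_with_moved_primary(healthy, fe3_down))   # primaries moved by the failure
print(groups_with_moved_primary(fe3_down, fe3_back))  # primaries moved by the recovery
```

The failure moves one primary copy, while the recovery moves none, which matches the point that the second reshuffle is arranged to avoid disconnecting users.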

Get Familiar with Lync 2013 Routing Groups

Windows Fabric makes shuffling routing groups possible. The good thing about the way this technology works is that it "just works." You don't need to worry about which routing groups should go where when you're installing or configuring Front End servers. Everything happens in the background. Although I concentrated on what happens during a server failure, the same process occurs if a Front End server needs to be rebooted after regular maintenance or Lync-related services are stopped and restarted.
