Regardless of the size of your organization, setting up and running a Microsoft Exchange Server infrastructure can be a complicated proposition. Exchange Server 2010 provided numerous improvements in the way of high availability, compliance features, and other niceties and necessaries, but with new features come new complexities, such as the need for careful load balancing to make sure your Client Access servers are available to serve client connections at all times.
IT pro Charlie Muir has firsthand experience of what it takes to successfully load balance an Exchange 2010 environment. When Muir started as head of service integration at international law firm Hill Dickinson, the firm had just completed a migration from Novell GroupWise to Exchange 2010 but hadn't employed load balancing. Muir was instrumental in helping the firm choose and implement KEMP Technologies' LoadMaster load balancers for its environment.
Hill Dickinson consists of eight offices, most in the United Kingdom, but with one in Greece and one in Singapore. Muir's IT staff supports about 1,400 users across these sites. I spoke with Muir about his company's implementation of KEMP load balancers, as well as the challenge of providing top-level service to users in his distributed environment and the unique atmosphere of working in IT at a law firm.
BKW: Since you've worked for different types of companies, have you noticed a difference between a law firm and other types of businesses in their expectations for the messaging infrastructure?
CM: I think it's safe to say that within the legal environment, the reliance on electronic messaging, and email specifically, is a lot higher. The legal industry as a whole has understood the importance of messaging, email, to communicate with their clients. It's not an additional technology to help them do their job: They need this technology to do their job. And thankfully, they're willing to commit the resources to implement the solution properly, which makes our job a lot easier. In an industry where messaging is an additional mechanism of communication, or it's a nice-to-have, it's hard to get those commitments from the business to do things properly.
We're in a situation now where we have done what we wanted to do, how we wanted to do it. I've been lucky enough to do that at a number of firms now, where we can take the business requirements and translate them directly into solutions. We look at designing the business to provide a performance advantage over and above our competitors, and we hope that we can do that through the use of technology.
The direct differences are that unlike the standard business model, where you have a small number of people at the top of the hierarchy over a lot of people further down, around 15 percent of our users are shareholders in the business. It's based on partnership, which is a good thing, but at times we have to cater for a larger portion of the user base who have the right to direct and to control what we do. So it makes things a little harder in some respects, and easier in others.
BKW: What was the technological situation in your environment when you came on board that led you to look for a load-balancing solution?
CM: When I arrived, the firm had just finished the GroupWise to Exchange 2010 migration, which had taken about a year. It was not necessarily done as we would have wanted. It was outsourced to a firm that did an in-place migration from GroupWise to Exchange. We had probably 10 or 12 different GroupWise post offices spread around our various offices, and they were migrated to Exchange 2010 multirole servers that sat next to each GroupWise box in each location.
So we ended up with a distributed Exchange 2010 model full of multirole servers, but then we had seven or eight different Exchange 2010 boxes that didn't have the requisite level of resilience or continuity. We had a DAG that shared data between them, but we essentially had one box per site. They were all Hub/CAS servers, they were all Mailbox servers, they all held public folder data, and it was very obvious very early on that the requirement of the business for performance and a highly available messaging environment meant that we needed to redesign that. The hard part had been done -- the data had been moved from GroupWise to Exchange, but it wasn't necessarily up to the standard that was required by the business.
We were in a lucky position to be able to reevaluate requirements from the business and design a messaging solution around those requirements. The decision was made to centralize in line with some additional network improvements that were going on at the same time. It allowed us to centralize our messaging environment into one location. We tended to feel it was better to have a highly resilient, highly available, high-performance solution out of a single office rather than a distributed, not necessarily so resilient solution spread across the country.
We built an entirely new environment alongside the old environment; there was no crossover between the two. We built a brand-new, centralized environment back here in our data center to host all the UK company mailboxes for all the various offices, the idea being that we would have local resilience built in to the solution.
We rated all of our key services at different warranty and utility levels. We decided the messaging environment would take on our Gold level service, our highest level. We looked at the amount of downtime that was acceptable to the business and what our SLAs would be for the more time-sensitive recovery of those services. It became very clear very early that we had to put all our eggs in one basket to allow us to create a solution robust enough to protect them. It's ended up that we have resilience in this office for all our Gold level services through the implementation of a brand-new, fully virtualized environment.
We've also moved from a Dell EqualLogic SAN to a Dell Compellent storage system, which has been a great step for us. And in the new environment, on the back of that new network infrastructure, we've moved to Xsigo I/O virtualization. So we've gone to a fully virtualized environment on virtualized I/O, and the Dell Compellent storage has let us be flexible about the way we store data right across our environment, the idea being that everything we have in this office is fully resilient. Any particular component of our new Exchange environment can disappear, and we still have full resilience within this building.
We also have another continuity data center. In the event there's a continuity problem, we'd be able to fail over to it. We've applied that same rationale to all our Gold level services.
One of the risks that was identified was the client connectivity from the desktop machine to the Exchange Hub/CAS servers. We have a mix of Outlook 2003 and 2010 clients out in the estate, so we needed a mechanism for us to control the access of all those clients. Whichever method a client used to connect, whether the fat client, OWA, or Outlook Anywhere, it should connect with the same level of resilience as we would expect from our Exchange environment.
Our ethos is that we would look at everything from the end-user perspective. Irrespective of how resilient the Exchange environment is, if users can't connect to it, for whatever reason, the end is always the same: Can the users use the system? If they can't connect to it, then from a user-service perspective, they've lost that service. It doesn't matter whether you prove that Exchange was still up throughout that outage.
It was clear early on, looking at best practices for the virtualization of Exchange, that load balancing for that Hub/CAS layer is a Microsoft requirement. We've got four Hub/CAS servers that are all part of the same CAS array in our local office. So we've gone for a more resilient level of warranty as part of the service. We said that we wanted two load balancers in this office fronting that Hub/CAS layer, so effectively four servers in the Hub/CAS layer that are resilient themselves. We specified them such that we could run everything on just one or two of those servers, so that at any time up to three of those boxes could be unavailable. All those boxes are spread across different virtual hosts, on different virtual I/O and storage, in different cabinets, on different power supplies, and so on. Something would have to be dire indeed for us to lose more than a couple of those Hub/CAS servers, which are fronted by two LoadMaster load balancers, themselves in separate racks on separate power supplies, to give us the best chance of not losing connectivity to our services.
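The resilience arithmetic Muir describes -- four CAS servers behind a pair of load balancers, any three of which can be down -- comes down to health-checked server selection. The following is a minimal sketch of that idea in Python, not a representation of how LoadMaster is actually implemented; the server names are hypothetical:

```python
class RoundRobinPool:
    """Round-robin selection over a server pool, skipping members
    that health checks have marked down -- the core behavior a load
    balancer provides in front of a CAS array."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(servers)
        self._i = 0  # index of the next candidate server

    def mark_down(self, server):
        """Record that a health check against this server failed."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Record that this server is passing health checks again."""
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server, or None if all are down."""
        for _ in range(len(self.servers)):
            server = self.servers[self._i]
            self._i = (self._i + 1) % len(self.servers)
            if server in self.healthy:
                return server
        return None


# Four CAS servers; lose three, and clients still land on the survivor.
pool = RoundRobinPool(["cas1", "cas2", "cas3", "cas4"])
for down in ("cas2", "cas3", "cas4"):
    pool.mark_down(down)
assert pool.pick() == "cas1"
```

In a real deployment the `mark_down`/`mark_up` transitions would be driven by periodic service checks (for example, probing the OWA and RPC endpoints), and the second LoadMaster exists so the selection logic itself is not a single point of failure.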
BKW: After you had the environment specced out, it's time to choose your load balancers. You looked at the market, and what did you see out there?
CM: We looked at a number of the market-leading solutions and also relied on a couple of key resources that we were working with at the time. We wanted to try and choose best in breed. We didn't want to get too niche; we needed to pick something that we could understand. We wanted to pick things that people we trust and engage with were happy to endorse and support if needs be.
We have a relationship with one of the UK Microsoft MVPs, Simon Butler. We regard him very highly from an Exchange messaging perspective -- so much so that even though we have the skills in-house to design and implement our environment, we wanted to make sure that we hadn't missed anything and that there weren't things happening outside of our world that we were missing. Simon spoke very highly of both the F5 units and the LoadMaster units. We also took a brief look at the Citrix NetScaler and the Barracuda load balancer boxes.
We had the requirement to implement an Exchange-specific solution to start with, with a view to widening that out to additional services that we knew were using NLB or were going to have some load-balancing requirements going forward. We didn't have specifics on what they were. We just had to make sure that the KEMP units we were looking at would keep doing the job when we applied that load. The Citrix NetScaler would have handled those requirements, but we didn't have enough information to justify the additional costs involved, so it wasn't implemented. Barracuda was a strange one in that they decided not to get back to us. I've used Barracuda before, and I like their boxes, but they didn't want to engage with us. The other units we looked at were the F5s, which from a feature perspective were very similar to KEMP.
We had some communication with KEMP through System Professional. Both were very helpful about facilitating what we were trying to do and offering advice. KEMP's LoadMaster Sizing Guide for Microsoft Exchange 2010 made a huge difference in deciding on a model and swinging the balance toward KEMP. It's very easy to use, informative, and something I would recommend everyone who manages an Exchange environment go and have a look at. It allowed us to let KEMP become the main player straightaway because we could get the answers we wanted. We could review our figures, change the projections, increase company size, and push through the sort of standard progression our email environment has followed for the past three or four years, and reassure ourselves that those units were still going to work.
We struggled quite badly with some of the other units to get a definitive answer as to what solution would work for us. We know how many users we've got, we know how much mail we push through, we know how many emails we send each day, how they connect, when they connect. Finding out which specific unit you needed from the other suppliers was difficult for us.
The cost of the KEMP units was such that we could look at them as a short-to-medium-term solution. We didn't have to work out a five- or ten-year ROI on an £80,000 solution when we could work out a two- or three-year plan for a cheaper solution that we were sure could meet the requirements. So that's what we went for.
Over the past nine months, our decision to go with KEMP has been vindicated -- I know it has because we have some queries over the validity of other things we've done, but KEMP isn't one of them, thankfully. It's not one that we have to worry about. They're unique boxes to look at, but the only time I ever know they exist is when I walk into that room maybe once a month and they're the two yellow boxes staring at me. But more than that, they just do the job. I don't have to worry about them.
BKW: Have they changed your environment in other ways? Do they perform in ways that were unexpected?
CM: They certainly have stolen a lot of our NLB load. We had a number of different systems that used Microsoft NLB to take that load-balancing load, some of them quite complicated, to be honest, so that we could keep the resilient aspect of NLB without needing a dedicated load balancer. Over the past six months since the KEMP LoadMasters were implemented, we've seen a slow move from NLB solutions to the load balancer, with more to follow.
The KEMP devices, from an administration perspective, are simple enough for us to manage internally, to the extent that we now have two 2600 devices in this office and a 2200 device in our test environment. We can give any of our infrastructure engineers access to that 2200 test unit to have a look and learn how they work. We've had engineers who have never touched load balancing -- people who understand the concept of what it does but have never seen a KEMP device or handled a load balancer before -- and within hours they're submitting releases to our change advisory board, detailing how they're going to implement it in the live environment and what the impact is going to be on the services.
There really is a large amount to understand, but it all condenses onto a single page on the web console. We can easily understand it through the Help and, if we need to, can go back to KEMP. But we really haven't had much need to go back and ask them about anything.
We front-ended our digital dictation service and made that solution resilient with the LoadMasters. We've also put our practice management and financial recording system through there as well -- all through those 2600 boxes.
BKW: Anything else you wanted to mention about how things are going?
CM: We've just gone through an update of the devices to the latest version. We've deployed them, we've configured them, and we've updated them, and they've just done what we've asked them to do in the meantime. Not a lot else to say.
For more information about load balancing Exchange 2010, see Ken St. Cyr's article "Exchange Server's Client Access: Load-Balancing Your Servers."