Years ago, Microsoft asked, “Where do you want to go today?” in its advertising for the Windows OS. Today, we can apply the same question to how we deploy Microsoft Exchange Server systems for our organizations. Several alternatives are available, including hosted email platforms, virtualized email platforms, and traditional on-premises physical hardware deployments, as well as various combinations of these options. With so many options available, Microsoft’s old marketing question has never been more relevant. To choose the best deployment approach for your organization, you need to consider each option and how well (or how poorly) it matches the needs of your company.
In this article, I explore some of the options for deploying Exchange Server, including the following:
- Installing Exchange Server on physical onsite servers—This option includes new decisions for Exchange Server 2010, such as continuing to use SAN storage, switching back to DAS, and determining the number of copies of data that are necessary.
- Virtualization—The ubiquitous deployment of virtualization for large and small solutions in the past 5 years has increased the benefits of Exchange Server virtualization, but challenges still exist.
- The cloud—Not a new concept, cloud deployment of Exchange has existed since at least Exchange Server 5.5.
- Hybrid environments—This approach combines local and cloud-based Exchange Server deployment.
The Traditional Choice
Exchange Server is considered the predominant business email system worldwide, with about 360 million seats deployed. Traditionally, Exchange is installed locally, on physical servers. Advantages of local Exchange deployment over virtualization and cloud-based solutions include the fact that no additional software is required, bare-metal processor and disk performance are available, and no additional Internet bandwidth is necessary.
No additional software. Exchange requires an OS, regardless of whether that OS is virtualized. However, whenever virtualization comes into play, some type of hypervisor (e.g., Microsoft Hyper-V, VMware ESXi, Citrix XenServer) is necessary to run the OS on the physical hardware. This extra software requires additional training, ongoing support, and maintenance, therefore adding to the overall support burden and cost associated with maintaining an environment.
Cloud-based solutions often require several pieces of additional software: single sign-on (SSO) software, so that users don’t have to log on to each cloud-based solution separately; federation services, so that a separate logon isn’t required for cloud-based solutions; and Control Panel or other administrative software for systems administration. Some of these solutions might require installation on individual end-user workstations; others might require their own servers, specialized knowledge, and administrators.
Improved processor performance. All hypervisors consume resources, which reduces the resources available to the OS image. Furthermore, a hypervisor is responsible for scheduling the resources available to the physical machine—and regardless of the number of virtual machines (VMs) present on a physical machine, the hypervisor always provides relatively small slices of processor time to each VM. This can cause the appearance of latency or jitter in a VM’s operation, which is a major reason that virtualization isn’t supported for many real-time or voice/video applications. In addition, the processor requirements of one VM can be so large as to affect the performance of other VMs. When combined, these problems can be reason enough to have dedicated physical servers assigned to certain workloads and applications.
Solutions that are based in the cloud have different performance requirements. The typical SSO application that’s distributed to each end-user workstation has a negligible effect on the performance of an individual workstation. However, the infrastructure required to deploy and maintain that application can consume significant manpower. Federated services might also require additional local servers.
Improved disk performance. With the traditional approach, you don’t have to worry about the overhead associated with disk drive virtualization. Modern hypervisors typically provide access to disk resources in three different formats: virtualized disk (VHD or VMDK formats), pass-through disk (assigning dedicated DAS and accessing that DAS without a virtualization layer), and iSCSI/Fibre Channel (accessing remote block structured disk resources as if they were local). However, one of the key design characteristics associated with good Exchange performance is an appropriately designed disk subsystem—which holds true regardless of the deployment method.
With cloud-based deployment, you depend on the company providing the hosted solution to design a disk subsystem that meets your needs, as well as the needs of all the other customers with whom you share hardware.
No additional Internet bandwidth. When comparing a virtualized solution to the traditional solution, there’s no expectation that the Internet bandwidth requirements differ between the two. Both the traditional and virtualized solutions are generally designed and deployed presuming that most usage occurs locally and doesn’t cross the LAN/WAN-to-Internet link.
But this isn’t the case with cloud-based solutions. Instead, all communication between a company and the hosting provider occurs over the Internet. For some companies, this might mean that additional or redundant Internet bandwidth is necessary.
Virtualization
Modern virtualization solutions depend on some deep concepts in OS kernel design and in the processor hardware itself, to ensure that one OS image can’t adversely affect another (and, in fact, can’t affect it at all). This applies regardless of what the VMs might be executing—whether Exchange Server, Microsoft SQL Server, file and print services, etc. Each VM is protected from the code executing on all other machines.
Comparing the cloud with virtualization raises almost the same issues as comparing the cloud with the traditional solution. Therefore, except where differences exist, I don’t mention those problems in this section. Some of the problems that I mentioned in the previous section might make you wonder why anyone would use virtualized resources at all. The answer is simple: high-performing hardware.
Processor performance that exceeds system requirements. Exchange Server was initially released requiring a processor with only a tiny percentage of a modern processor’s power (a 66MHz 80486, for you history buffs). Granted, Exchange’s feature content in those days, and the accompanying processor requirement for those features, was much less than today’s. Still, we’re fortunate that Moore’s Law has provided us with multi-core processors that supply many times the level of performance required for most servers. This benefit can make it cost effective and resource effective to run many virtualized servers on top of a single physical server platform, even with the overhead that the hypervisor adds in controlling those virtualized servers.
This can mean that when running Exchange on physical hardware, the system might simply sit idle most of the time: consuming power, generating heat, but performing no useful activity. Although having growth potential and headroom is desirable, using resources efficiently via virtualization is even better.
Cheap memory. One thing that high-performing OSs like is memory. The more memory, the better! The initial version of Exchange Server required 32MB of RAM—a huge amount of RAM in 1996, but it supported 100 to 400 users per server. Today, an Exchange server requires a minimum of 4GB of RAM, and a multi-role server typically needs 12GB to 16GB of RAM to support 100 to 400 users.
Even low-end computers come with at least 2GB of RAM, with 3GB of RAM as the standard for consumer-level 64-bit computers. At the enterprise level, 4GB or even 8GB of RAM is increasingly typical for the average workstation or laptop. High-powered laptops can often support 16GB of RAM.
On servers designed to support virtualization, 64GB or 128GB of RAM is common. Even for OS images of application products (such as Exchange Server and SQL Server) that demand a large amount of RAM, it can be very cost effective to combine multiple servers into one and still have more than enough memory to meet the needs of those applications.
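The consolidation arithmetic behind these memory figures can be sketched in a few lines of Python. This is only a rough illustration: the function name and the 4GB hypervisor allowance are assumptions made for the example, not figures from any vendor, and real sizing must also account for processor and disk load.

```python
def vms_per_host(host_ram_gb: int, vm_ram_gb: int, hypervisor_reserve_gb: int = 4) -> int:
    """Return how many VMs of a given RAM size fit in one host's memory.

    hypervisor_reserve_gb is an assumed allowance for the hypervisor itself.
    It is a placeholder for illustration, not a published requirement.
    """
    if vm_ram_gb <= 0:
        raise ValueError("vm_ram_gb must be positive")
    usable_gb = host_ram_gb - hypervisor_reserve_gb
    # Never report a negative count when the reserve exceeds the host RAM.
    return max(usable_gb // vm_ram_gb, 0)


if __name__ == "__main__":
    # Host sizes and multi-role Exchange RAM figures taken from the text above.
    for host_gb in (64, 128):
        for vm_gb in (12, 16):
            count = vms_per_host(host_gb, vm_gb)
            print(f"{host_gb}GB host, {vm_gb}GB per VM: {count} VMs")
```

Under these assumptions, a 64GB host has room for three 16GB multi-role Exchange images, and a 128GB host has room for seven.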
Cheap disk storage. The storage cost picture is very specific to Exchange Server. In Exchange Server 2007 and Exchange Server 2010, the Exchange Server product team made extensive engineering changes to optimize Exchange’s behavior on slow and inexpensive disk. In fact, an array of slow 3TB 5,400rpm disks is quite capable of providing primary or archive mailbox storage for Exchange 2010. Such disks are often deployed without RAID, in a configuration typically referred to as Just a Bunch of Disks (JBOD). Slow disk gives a great deal of flexibility to Exchange storage planners who intend to virtualize. Given properly configured hardware, it’s entirely possible to use slow disk and achieve performance that can adequately support the load of hundreds, thousands, or even millions of mailboxes with Exchange 2010.
However, using cheap disk has definite downsides. Cheap disk tends to fail more often (i.e., it has a lower mean time between failures—MTBF). A JBOD subsystem is designed differently than what a typical Exchange administrator is used to and usually requires additional monitoring. With the decreased MTBF of the disks, you need to have more spares on hand and be prepared to repair the subsystems more often, which might increase your costs.
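To get a feel for how MTBF translates into spares, here is a minimal Python sketch. It assumes a constant failure rate, so that expected failures per year are roughly the disk count times hours per year divided by MTBF. The function names, the MTBF value, and the safety factor are hypothetical, chosen only for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def expected_annual_failures(disk_count: int, mtbf_hours: float) -> float:
    """Approximate failures per year, assuming a constant failure rate."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return disk_count * HOURS_PER_YEAR / mtbf_hours


def spares_to_stock(disk_count: int, mtbf_hours: float, safety_factor: float = 2.0) -> int:
    """Round the expected failures, padded by an assumed safety factor, up to whole disks."""
    return math.ceil(expected_annual_failures(disk_count, mtbf_hours) * safety_factor)


if __name__ == "__main__":
    # Hypothetical example: 48 disks with a 500,000-hour MTBF.
    print(expected_annual_failures(48, 500_000))
    print(spares_to_stock(48, 500_000))
```

Halving the MTBF doubles the expected failures, which is why cheaper disks call for more spares on the shelf and closer monitoring.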
You can use any of the following for an Exchange disk subsystem:
- Virtualized disk resources (VMDK or VHD files)
- RAID array(s) of DAS using slow disk
- RAID array(s) of iSCSI or Fibre Channel disk using slow disk
- Pass-through disk using slow disk
Of course, any of the following older solutions that use faster disk are also still available:
- Virtualized disk resources on fast disk
- RAID array(s) of DAS using fast disk
- RAID array(s) of iSCSI or Fibre Channel disk using fast disk
- Pass-through disk using fast disk
- SAN virtualized disk
Using cheap disk brings up a few caveats that you should be aware of. First, using slow disk is possible because of the tradeoff with memory utilization. That is, you shouldn’t use slow disk and only the minimum memory for an application, because performance won’t be ideal and in fact might not even be acceptable.
Second, Microsoft supports using non-RAID solutions for mailbox storage (either a primary or archive mailbox), but only when at least three copies of the mailbox data exist across multiple servers (i.e., when using a database availability group, or DAG, with a primary copy and a minimum of two secondary copies). If you don’t have three copies of your data, for data safety and security reasons you should continue to use RAID-based solutions. Again, JBOD solutions require monitoring for each disk, as well as the awareness that these disks tend to fail more quickly than their more expensive counterparts.
Finally, you can continue to use SAN disk storage or high-performance DAS, which might let you support more mailboxes per server than using JBOD. However, JBOD can provide excellent performance for the maximum recommended active mailboxes per Exchange server (4,000 to 5,000); therefore, using expensive disk isn’t necessary for performance reasons, although companies such as NetApp and EMC offer solutions with very desirable features.
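The copy-count rule described above reduces to a simple check. The following Python sketch encodes only that one rule. The function and constant names are made up for this example, and the check ignores the memory and monitoring caveats, which still apply.

```python
# One primary copy plus a minimum of two secondary copies, per the rule above.
MIN_COPIES_FOR_JBOD = 3


def storage_layout(dag_copies: int) -> str:
    """Suggest a disk layout from the number of copies of the mailbox data.

    Illustrative helper only: it applies the three-copy rule and nothing else.
    """
    if dag_copies < 1:
        raise ValueError("mailbox data has at least one copy")
    if dag_copies >= MIN_COPIES_FOR_JBOD:
        return "JBOD permitted"
    return "use RAID"


if __name__ == "__main__":
    for copies in (1, 2, 3, 4):
        print(copies, storage_layout(copies))
```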
The cost of power. Each physical server consumes a certain amount of power. Typically, that amount of power is somewhat constant on a per-server basis. Of course, that’s not strictly true, because modern servers often can shut down underutilized processor cores. But when you compare, for example, three physical servers to a single physical server running three virtualized images, the single virtualized server will typically consume much less power. The power needs of a single server might not seem significant, but the heat generated by even a single server is noticeable. As more servers are added, the power factor becomes more significant quite quickly, as does the amount of heat generated. Lower power requirements mean lower air-conditioning requirements, and therefore even lower power requirements.
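The three-servers-versus-one comparison can be roughed out as follows. All the numbers here are hypothetical: the wattages and the 50 percent cooling overhead are placeholders to show the shape of the calculation, and should be replaced with measured values from your own server room.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def annual_kwh(server_watts: float, server_count: int = 1, cooling_overhead: float = 0.5) -> float:
    """Estimate yearly energy use, including an assumed cooling overhead.

    cooling_overhead is the extra fraction of power spent on air-conditioning
    (0.5 means 50 percent); it is an assumption, not a measured figure.
    """
    total_watts = server_watts * server_count * (1 + cooling_overhead)
    return total_watts * HOURS_PER_YEAR / 1000


if __name__ == "__main__":
    # Hypothetical draw: three 400W physical servers vs. one 550W virtualization host.
    physical = annual_kwh(400, server_count=3)
    virtualized = annual_kwh(550, server_count=1)
    print(f"physical: {physical:.0f} kWh, virtualized: {virtualized:.0f} kWh")
    print(f"savings: {physical - virtualized:.0f} kWh per year")
```

Because the cooling overhead multiplies the server draw, every watt saved on the servers is saved again, in part, on air-conditioning.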
Underutilized network resources. Today’s typical network is much faster than networks only a few years ago. In the United States, 100Mbps to the desktop is common, and the switch fabrics in many server rooms are 1Gbps or even 10Gbps. Network virtualization is a mechanism for sharing network bandwidth among multiple VMs.
Other than backup applications, it’s uncommon for today’s networks to be fully utilized. Of course, this is a generalization—some companies do have heavily utilized networks, and it’s quite possible to overcommit or saturate any network if either the physical or virtual networks are poorly designed.
In cases in which 1Gbps or 10Gbps isn’t enough for a virtualization host, because it hosts multiple VMs that require high network usage, hypervisors support assigning multiple virtual network cards (as well as multiple virtual network switches) to virtualized images. Network bandwidth that’s available to the VMs is limited only by the number of card slots available to install physical NICs.
Hardware independence of VMs. Although I discussed some of the challenges associated with virtualizing hardware, I haven’t yet covered the good part: hardware independence. When you build a VM, the NICs, CPU, IDE disk drives, etc., are all virtualized to a common set of hardware. That hardware is exactly the same regardless of whether you’re running your hypervisor on the absolutely latest hardware or a server that you built three years ago. It’s very easy to move a VM from one computer running Microsoft Hyper-V to another computer running Hyper-V or from one server running VMware ESX to another server running ESX. (Moving images between hypervisors is more challenging, but that’s not what I’m talking about here.)
If you have a catastrophic failure on a server, restoring the VMs to operation is a simple matter of installing the hypervisor on another physical server and then restoring the server images from backup (or from another cluster member). This is far simpler than rebuilding a server from a system state backup or any type of bare-metal backup. Simple restores from backup make great sense for Exchange roles that maintain little state information (e.g., a Hub Transport) or no state information (e.g., a Client Access server). However, a simple restore from backup isn’t such a good idea for a Mailbox server that contains crucial data.
Infrastructure reduction. Combining many of the strategies that I discussed in this section obviously leads to having fewer physical servers. And fewer physical servers tend to mean fewer physical switches, fewer physical racks, less power, less air-conditioning and air-handling equipment, etc. These changes can make an operation more efficient and can help offset the costs that might be associated with switching to virtualization.
The Cloud
I admit that I’m a little tired of hearing about the cloud. The name is new, but the concept isn’t—the cloud’s basic functionality has been around for more than a decade. Of course, the cloud of today is more developed than the cloud of a decade ago. Ten years ago, you could do hosted Exchange. You could do hosted websites. You could do lots of things—individually. But few integration capabilities existed, and feature content was low.
Today, Microsoft’s primary cloud offering is Business Productivity Online Standard Suite (BPOS), which includes hosted Exchange, hosted SharePoint (Windows SharePoint Services, or WSS, rather than Microsoft Office SharePoint Server, or MOSS), hosted Live Meeting, and hosted Microsoft Office Communications Server (OCS). All these products have limited functionality when compared with their on-premises counterparts. However, at a cost of $10 per month per user for the suite (US retail pricing), hosted applications can be very attractive to small businesses.
Microsoft is currently preparing its Office 365 offering, which might be available by the time you read this article. Office 365 will provide version upgrades to the BPOS product line, as well as significant additional feature content. Office 365 will add BlackBerry Enterprise Server (for free), Microsoft Office Web Apps, Exchange Server 2010 (upgraded from Exchange Server 2007), and Microsoft Lync 2010 (upgraded from OCS 2007). These features will provide IM and presence, the Lync Meeting replacement for Live Meeting, and PC-to-PC calling, as well as support for audio, video, and desktop sharing. A full SharePoint 2010 installation is also included, rather than the limited WSS, which will let SharePoint-based organizations publish professional corporate public websites.
All of Office 365’s solutions come with a much richer experience than BPOS in terms of command and control. Specifically from an Exchange perspective, Exchange 2010 adds configuration capabilities for both users and administrators within the Exchange Control Panel (ECP). The Office 365 Control Panel goes even further: Many per-server and per-organization settings that were previously available only from PowerShell or Exchange Management Console (EMC) can now be configured from within the Office 365 Control Panel. With all of its added functionality and capabilities, Office 365 might become a real contender for replacing some onsite services with cloud-based services.
In most environments, implementing the solutions included as part of BPOS or Office 365 on premises will require a minimum of one server per application, an Active Directory (AD) infrastructure, a significant Internet connection, and software costs for the Windows servers, application servers, and CALs. A matching ROI could take a significant number of users or an extended period of time to achieve. However, for organizations that already have most of the requisite infrastructure, converting capital expenditure (CapEx) to a monthly operational expenditure (OpEx) might not be a reasonable choice.
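One way to frame the CapEx-versus-OpEx question is as a break-even calculation: after how many months does the cumulative hosted subscription cost overtake the on-premises cost? The Python sketch below uses the $10 per user per month figure quoted earlier for BPOS; the on-premises capital and monthly running costs are hypothetical placeholders, and the function name is invented for this example.

```python
from typing import Optional


def break_even_months(
    users: int,
    hosted_per_user_month: float,
    onprem_capex: float,
    onprem_monthly: float,
) -> Optional[float]:
    """Months until cumulative hosted cost equals cumulative on-premises cost.

    Returns None when the hosted monthly bill never exceeds the on-premises
    monthly running cost, in which case hosting stays cheaper indefinitely.
    """
    hosted_monthly = users * hosted_per_user_month
    monthly_gap = hosted_monthly - onprem_monthly
    if monthly_gap <= 0:
        return None
    return onprem_capex / monthly_gap


if __name__ == "__main__":
    # Hypothetical on-premises figures; only the $10 rate comes from the article.
    print(break_even_months(50, 10.0, onprem_capex=30_000, onprem_monthly=200))
    print(break_even_months(500, 10.0, onprem_capex=60_000, onprem_monthly=1_000))
```

With these made-up inputs, a 50-user company would need 100 months for an on-premises deployment to pay for itself, whereas a 500-user company would reach that point in 15 months. That pattern matches the observation that a matching ROI takes either many users or a long time.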
A significant advantage to cloud-based mailbox servers is that you aren’t subject to the whims of your local Internet provider as to whether email gets received for your company—at least as far as the destination email server—or whether email gets sent by your company (i.e., from the source email server). But if your local Internet provider is down, you can’t access that email server, which makes that advantage somewhat moot. For geographically dispersed companies, having email in the cloud might remove access concerns for the “other” locations (i.e., the locations where the email servers weren’t located previously). In that case, the remote locations are no longer dependent on the central location being available from the Internet. However, for many global companies, large or sufficient bandwidth already exists between various facilities, so this configuration doesn’t really represent an advantage.
Another potential advantage to cloud-based mailbox servers is that the local organization is no longer responsible for backup or recovery; those operations become the responsibility of the hosting company. However, this setup introduces significant complexity into the decision-making process. First, an organization must consider whether the hosting company’s backup and recovery options meet the organization’s legal and corporate policy needs. Organizations are subject to a variety of data content and retention requirements based on the type of business and the country in which they’re located. These requirements include but aren’t limited to retention policies, data searchability and discovery, cleaning data after a secure data spill, containment of information such as credit card data and other Personally Identifiable Information (PII), and the physical location of data. Organizations must look closely at a hosting company’s policies and capabilities.
The availability of the hosting company’s services must also be carefully measured and monitored. Regardless of the promises from hosting companies, downtime does occur—it’s inevitable. Your agreement should include specific availability requirements and consequences if those requirements aren’t met. This type of agreement is typically known as a service level agreement (SLA) and should also include mechanisms for the escalation of issues, notification of problems, specific definitions of who owns the data located at the hosting company, and how it can be recovered in the case of any termination of services.
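When you negotiate availability requirements, it helps to convert a percentage into the downtime it actually permits. The following Python sketch does that conversion and compares measured downtime against the allowance. The function names and the sample figures are illustrative assumptions, not terms from any real SLA.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_pct: float, period_days: int = 365) -> float:
    """Downtime budget, in minutes, implied by an availability percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    unavailable_fraction = (100 - availability_pct) / 100
    return unavailable_fraction * period_days * MINUTES_PER_DAY


def sla_met(measured_downtime_minutes: float, availability_pct: float, period_days: int = 365) -> bool:
    """True when measured downtime stays within the budget for the period."""
    return measured_downtime_minutes <= allowed_downtime_minutes(availability_pct, period_days)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        print(f"{pct}% per year allows {allowed_downtime_minutes(pct):.1f} minutes of downtime")
    # Hypothetical month: 60 minutes of outage against a 99.9% target.
    print(sla_met(60, 99.9, period_days=30))
```

A 99.9 percent yearly target allows roughly 526 minutes of downtime, which is less than nine hours. Measuring against a monthly period instead makes a single long outage much harder for the provider to absorb.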
Of course, no job is complete until the paperwork is done. To ensure that your company is receiving the appropriate value for the money spent, the hosting company should provide detailed billing and reporting. Large organizations that contract for dedicated hosting services can also require regular audits of the hosting provider.
None of these hosting requirements are free. Small organizations might ignore these criteria, but medium and large organizations can’t afford to—the price is worth the guarantee of availability.
Hybrid Environments
With Office 365, it’s possible to have some capabilities in the cloud (off premises) and some capabilities on premises. In fact, using Active Directory Federation Services (ADFS) or Forefront Identity Manager (FIM) 2010, you can effectively extend your onsite AD into the cloud. For example, you can opt to have 80 percent of your Exchange mailboxes in the cloud and 20 percent of them on a local Exchange server. It’s unlikely that such an advanced scenario will be the typical solution for a small-to-midsized business (SMB). But a hybrid on-premises and off-premises solution might be a desirable option for larger organizations, especially those looking to dedicated Office 365 solutions.
Of course, onsite and offsite aren’t the only types of hybrid solutions available. It’s also possible and quite common to virtualize some Exchange servers and not others. For example, to ensure that maximum performance is eked out of available disk subsystems, and to ensure that there’s no jitter in your voicemail, you might put your Mailbox servers and Unified Messaging (UM) servers onto physical hardware but put your Hub Transport and Client Access servers onto virtualized servers.
The ultimate hybrid solution might combine all the options. For example, you could have local physical servers, local virtualized servers, and some services in the cloud.
The Best Solution
Which solution is best for your company? The answer depends on many things. To find the best solution for your environment, due diligence requires you to examine each consideration and assess its applicability to your organization.
The cloud can potentially reduce local infrastructure requirements, but it can also raise many questions or issues around data storage, data recovery, security, legal requirements, availability, reporting, and SLAs. It’s impossible to make a concrete recommendation without knowledge of a particular company’s requirements in these areas. Using cloud-based solutions makes a company extremely dependent upon Internet access and availability—which is a crucial component of buy-in to any cloud solution.
Although virtualization can also potentially reduce local infrastructure requirements (but for different reasons than cloud solutions), virtualization adds a piece of software (i.e., the hypervisor) that must be learned, supported, and maintained on every physical server. Virtualization also requires a change in design paradigms and therefore isn’t necessarily a solution for all server needs.
The traditional choice is the easiest one—the one we’re all familiar with—but this choice can lead to gross inefficiencies, such as having multiple physical servers when only one would suffice. Traditionally, the decision between onsite and offsite has been primarily about control: You can control more of your installation’s features with onsite solutions (although that gap is slowly shrinking).
Moving Toward the Future
Most companies already have experience with cloud-based services, which might include anti-spam and antivirus solutions, patch management solutions, etc. Even on an individual basis, many of us use the cloud. After all, what are cable television and land-line telephones except different kinds of clouds?
Some companies will always keep their solutions local, to meet the requirements of their business. Still, many small companies have jumped eagerly onto the cloud bandwagon for their Exchange deployments, and many medium and large organizations have already virtualized some or all of their Exchange infrastructure. Moving other services into the cloud seems to be a growing trend, even though progress toward the cloud is still slow and careful. We’re still very early on the adoption curve, but more is certainly yet to come.