Anyone dedicated to trivia will note that the code name for Microsoft Exchange Server 2007 was Exchange 12, but the next major release of its mail server has been code named Exchange 14. Microsoft skipped 13 for the same reason that many hotels don’t have a thirteenth floor—superstition! Exchange 14 is expected to ship in late 2009 and have a final name of Microsoft Exchange Server 2010. Exchange 2010 follows up the architectural changes made in Exchange 2007 with some big updates of its own to give the product better performance and make it more resilient and easier to manage. The most important changes fall broadly into the categories of an Information Store refresh, a new approach to high availability, management and administration updates, and messaging compliance improvements.
Enhancements to the Store
Exchange has always been a challenging application for storage because the I/O profile of a busy mailbox server consists of many random small I/O operations rather than the predictable I/O patterns you see in other database-centric applications. This situation can be explained by the huge variety of messages that an Exchange server handles—from the simple, one-line message sent to a single recipient to the multimegabyte message (including attachments) sent to nested distribution lists. Obviously these transactions create radically different I/O demands.
Microsoft greatly reduced disk I/O with Exchange 2007, largely by trading the extra memory made available by using the 64-bit platform to cache as much Store data as possible. This process resulted in a significant I/O reduction per active mailbox—except in the case of large mailboxes. The problem with large mailboxes is that users tend to keep thousands of items scattered around hundreds of folders. The more items and folders in a mailbox, the more work the Store has to do to organize and maintain the indexes that underpin the mailbox. Windows Desktop Search with its Microsoft Office Outlook integration lets users become even less organized: If they forget where something is in their large folder structure, it’s easy to perform a search to find the desired item.
So, although Exchange 2007 made real improvements by optimizing Store caching, human behavior meant that further work was necessary for Exchange to effectively support very large mailboxes. As it happens, Microsoft had previously assessed whether they could move the underlying Store database engine from Extensible Storage Engine (ESE) to Microsoft SQL Server. The engineering investment to make this change proved too great, which is why Exchange still uses ESE. However, the investigation reviewed some fundamental aspects of the Store database, including its schema and tables. As a result, some changes to aid performance were included in Exchange 2007, notably the increase in page size from 4KB to 8KB and smoother I/O transactions. Further performance improvements in Exchange 2010 include:
- Increased page size from 8KB to 32KB—With this change, more data can be stored in a single page, avoiding the need to scatter across the database the pages required for a single item, including any attachments.
- Header data for all mailbox items is stored in a single database table—This change makes the database more efficient because it can process a single table for a mailbox during a client session instead of accessing different tables for different mailbox folders. A side effect of this schema change is that Exchange no longer uses Single Instance Storage (SIS) to keep just one copy of message content per database. Most servers support multiple databases, so the efficiency gained from SIS is less and less as time goes on.
- The Store compresses attachments—Microsoft calculates that the CPU time spent compressing and decompressing attachments is less than the work required to manage the storage of very large uncompressed data within the database. This change also reduces the overall size of Exchange databases, which speeds up operations such as backups.
- The Store updates views (indexes) only when they're accessed—An Outlook client can create many different views for a folder on the fly (e.g., items ordered by subject), and the Store maintains these views within the database. The Store ages unused views out after 40 days, but it needs to maintain views until then. Updating views only when needed eliminates a lot of background processing.
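Two of these changes are easy to picture with a little arithmetic and a toy data structure. The sketch below is illustrative only and says nothing about how ESE is actually implemented: `pages_needed` ignores per-page overhead (an assumption), and the `LazyView` class, its method names, and the 100KB item size are hypothetical.

```python
import math

PAGE_8K = 8 * 1024    # Exchange 2007 page size
PAGE_32K = 32 * 1024  # Exchange 2010 page size


def pages_needed(item_bytes, page_bytes):
    """Pages required to hold one item, ignoring per-page overhead (an assumption)."""
    return max(1, math.ceil(item_bytes / page_bytes))


class LazyView:
    """Hypothetical folder view that is re-sorted only when a client reads it."""

    def __init__(self, sort_key):
        self.sort_key = sort_key
        self._items = []
        self._ordered = []
        self._stale = False
        self.rebuilds = 0  # counts how often the index work was actually done

    def add(self, item):
        # A new item only marks the view stale; no sorting happens here.
        self._items.append(item)
        self._stale = True

    def read(self):
        # The sort cost is paid on access, and only if something changed.
        if self._stale:
            self._ordered = sorted(self._items, key=self.sort_key)
            self._stale = False
            self.rebuilds += 1
        return list(self._ordered)


if __name__ == "__main__":
    item = 100 * 1024  # a hypothetical 100KB message with an attachment
    print(pages_needed(item, PAGE_8K), pages_needed(item, PAGE_32K))

    view = LazyView(sort_key=lambda m: m["subject"])
    for subject in ("Status", "Budget", "Agenda"):
        view.add({"subject": subject})
    print(view.rebuilds)  # three items added, view not yet rebuilt
    print([m["subject"] for m in view.read()], view.rebuilds)
```

Under these assumptions, the 100KB item occupies 13 pages at 8KB but only 4 pages at 32KB, and the view is rebuilt once on first read no matter how many items arrived beforehand.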
Microsoft’s initial performance results indicate that the new Store generates substantially fewer I/O operations than its Exchange 2007 equivalent. Reducing I/O lets servers support more mailboxes and allows additional flexibility in storage options. Traditionally, large mailbox servers have used high-end storage configurations such as SANs to deliver excellent I/O performance with maximum reliability. If Exchange 2010 delivers a smaller I/O footprint and better resilience, system designers might be tempted to use lower-cost Serial ATA (SATA) and Just a Bunch of Disks (JBOD) storage. Companies that use SANs can continue to do so, especially when they’ve made that choice because they manage storage centrally rather than on an application-by-application basis. Microsoft’s drive to support lower-cost storage through better I/O performance is a good thing, but changes will still occur in the code before Exchange 2010 ships, so we’ll have to wait a bit to know how to optimize storage for production environments.
High Availability at the Core
Exchange 2007 introduced log shipping to let administrators replicate data to local disks (local continuous replication, or LCR), to another node in a cluster (cluster continuous replication, or CCR), and to a server in another data center (standby continuous replication, or SCR). Microsoft builds on this log shipping technology to make high availability a core characteristic of Exchange 2010. Microsoft is shaking up Exchange’s high availability feature set through four key steps:
- The concept of storage groups is eliminated, so the database becomes the management unit to plan high availability around—this is a sensible step given that log replication works only for a storage group containing a single database.
- Single copy clusters are eliminated and not supported in Exchange 2010. Microsoft is moving toward the idea that maintaining multiple copies of data on multiple servers delivers better high availability than attempting to update a single copy of data. Microsoft has also removed LCR from Exchange 2010 because log replication on the same server delivers limited value.
- Exchange 2010 introduces Database Availability Groups (DAGs), which are groupings of up to 16 servers in which some or all of the databases are marked for replication to one or more other servers. Microsoft uses some components of Windows clustering (e.g., heartbeats, the file share witness) to connect servers within the DAG, which can span physical locations. The big feature is that you can replicate databases to multiple servers within the DAG through log shipping, so locations within a DAG must share sufficient network resources to be able to copy logs quickly enough so that queues of unplayed logs don't build up; think of this requirement as being similar to that of SCR today. Replication targets are chosen at the database level rather than the server level, so you can replicate different databases from a server to different servers within the DAG. For example, a server in New York that has two databases could replicate one database to a server in Los Angeles and the other to a server in Seattle. The live database is referred to as the master; if a problem occurs with the master database, a component called the Active Manager switches to one of its replicas and makes it the live master. Microsoft includes management for DAGs in Exchange 2010's version of Exchange Management Console (EMC) and adds Exchange Management Shell (EMS) commands, so you can control DAGs through the GUI or the command line.
- A new component in Exchange 2010 called the RPC Client Access Layer upgrades the Client Access server role so that all client connections flow through a predictable point in the network. With the potential for live copies of databases switching between servers, clients can become confused when they attempt to connect to a mailbox. Exchange 2007 introduced the Client Access role, which manages connections from all clients except MAPI (i.e., Outlook). In Exchange 2010, the Client Access role determines which server currently hosts the live copy of a mailbox by reference to the DAG information, which is held in Active Directory (AD), and is therefore able to redirect clients when a database has been switched.
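The interplay between database-level replication targets, failover, and client redirection can be modeled in a few lines. This is a conceptual sketch, not Exchange code: the `Dag` class, its methods, and the "promote the first replica" rule are assumptions standing in for whatever the Active Manager really does, and `locate` plays the part of the lookup that the Client Access role performs.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MAX_DAG_SERVERS = 16  # limit quoted in the text


@dataclass
class Database:
    name: str
    master: str  # server hosting the live copy
    replicas: List[str] = field(default_factory=list)


class Dag:
    """Toy model of a Database Availability Group (hypothetical API)."""

    def __init__(self, servers):
        servers = list(servers)
        if len(servers) > MAX_DAG_SERVERS:
            raise ValueError("a DAG holds at most 16 servers")
        self.servers = set(servers)
        self.databases: Dict[str, Database] = {}

    def add_database(self, name, master, replicas):
        # Targets are chosen per database, not per server.
        for server in [master, *replicas]:
            if server not in self.servers:
                raise ValueError(f"{server} is not a member of this DAG")
        self.databases[name] = Database(name, master, list(replicas))

    def fail_server(self, server):
        # Assumed failover rule: promote the first remaining replica.
        for db in self.databases.values():
            if server in db.replicas:
                db.replicas.remove(server)
            elif db.master == server:
                if not db.replicas:
                    raise RuntimeError(f"no replica left for {db.name}")
                db.master = db.replicas.pop(0)

    def locate(self, name):
        # What a client-facing layer would ask: where is the live copy now?
        return self.databases[name].master


if __name__ == "__main__":
    dag = Dag(["NewYork", "LosAngeles", "Seattle"])
    dag.add_database("DB1", master="NewYork", replicas=["LosAngeles"])
    dag.add_database("DB2", master="NewYork", replicas=["Seattle"])
    dag.fail_server("NewYork")
    print(dag.locate("DB1"), dag.locate("DB2"))
```

In the New York example from the text, losing the New York server leaves DB1 live in Los Angeles and DB2 live in Seattle, and clients that ask `locate` are pointed at the new masters.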
There are challenges with any high availability solution. Some obvious problems that deserve consideration are how third party backup software will deal with DAGs and what role offline backups play after you deploy Exchange 2010. The introduction of DAGs indicates that Microsoft is heading toward multiple database replicas as the primary solution for data availability: Because multiple replicas are available, you should be able to revert to a replica database if problems occur with your live copy. Therefore, the importance of backups, especially tape backups, is lessened. Of course, administrators will have to deal with challenges such as audit requirements that might insist on offline, secured backups; providing sufficient storage and network bandwidth to handle multiple replicas and log shipping; and the inevitable updates to operational procedures necessary for backups, restores, and the loss of a disk or server.
Client Access server sizing could be another challenge. In Exchange 2007, the vast majority of Client Access workload is generated by Internet client access, including Outlook Anywhere. In an Exchange 2010 environment, the introduction of the RPC Client Access Layer means that the Client Access server has a heavier workload, so you'll find that some current Client Access configurations are undersized for the new workload.
Improved Management and Administration
Microsoft has made many improvements to Exchange’s manageability, and certainly the combination of EMC and EMS in Exchange 2007 lets most administrators get their work done fast and efficiently. Both components are upgraded in Exchange 2010 to accommodate the new features and to support Windows PowerShell 2.0, which is based on Microsoft .NET Framework 3.5. PowerShell 2.0 supports remote management, so you can connect to a remote Exchange server and execute commands on it as easily as you can on a local server. In addition to new commands for features such as DAGs, some older commands are upgraded; for example, the Move-Mailbox command now supports an -online switch so that you can move mailboxes even when users are connected.
The introduction of role-based access control (RBAC) and a lightweight web console to perform a restricted set of operations are two important management changes in Exchange 2010. RBAC associates the necessary permissions with a role to let someone holding that role do their job effectively. We've seen the concepts of roles and associated permissions before (think of Exchange Recipient Administrator), but Exchange 2010 gives you a way to define custom roles for your organization, define the tasks that the roles perform, and associate the permissions to allow those who hold a role to do the job. For example, you could create a Help desk role with the necessary permissions to create new mailboxes, reset passwords, and perform similar common tasks, then assign that role to the users who take care of such tasks. If you grant users the role, they automatically inherit the permissions. If the role is taken away, they lose the permissions.
The big difference here is that the permissions are associated with tasks rather than AD objects such as servers and mailboxes. Thus, if you decide a role should be able to manage mailboxes, behind the scenes the role inherits the permissions required to fulfill the task. This aspect, together with the ability to set a scope of objects for a role to work with (for example, only mailboxes that belong to certain servers or only mailboxes in Germany), creates a logical and flexible approach to distributed management that should be popular with medium to large organizations. Smaller organizations will see less value in RBAC because they often have only one or two people in IT, so offloading work isn't an option.
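To make the task-plus-scope idea concrete, here is a minimal sketch of how a role check might be reasoned about. It is not the Exchange 2010 RBAC implementation: the `Role`, `Mailbox`, and `Rbac` classes, the task names, and the user names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Mailbox:
    owner: str
    country: str


@dataclass
class Role:
    name: str
    tasks: FrozenSet[str]             # what holders of the role may do
    scope: Callable[[Mailbox], bool]  # which objects they may do it to


class Rbac:
    """Toy role store: permissions follow from holding a role (hypothetical API)."""

    def __init__(self):
        self._assignments: Dict[str, List[Role]] = {}

    def grant(self, user, role):
        self._assignments.setdefault(user, []).append(role)

    def revoke(self, user, role_name):
        held = self._assignments.get(user, [])
        self._assignments[user] = [r for r in held if r.name != role_name]

    def can(self, user, task, mailbox):
        # Allowed only if some held role covers both the task and the target.
        return any(
            task in role.tasks and role.scope(mailbox)
            for role in self._assignments.get(user, [])
        )


if __name__ == "__main__":
    help_desk = Role(
        name="Help desk",
        tasks=frozenset({"create_mailbox", "reset_password"}),
        scope=lambda mbx: mbx.country == "Germany",
    )
    rbac = Rbac()
    rbac.grant("alice", help_desk)
    print(rbac.can("alice", "reset_password", Mailbox("hans", "Germany")))
    print(rbac.can("alice", "create_transport_rule", Mailbox("hans", "Germany")))
    rbac.revoke("alice", "Help desk")
    print(rbac.can("alice", "reset_password", Mailbox("hans", "Germany")))
```

The point of the design is that nobody edits permissions on individual objects: granting or revoking the role is the only operation, and the scope function limits where the role's tasks apply.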
Exchange has always included a management console, and the console includes the ability to execute tasks that are often performed by Help desk personnel, such as setting up new mailboxes or editing mailbox properties, as well as tasks that you might not want available from the Help desk, such as creating new transport rules. Exchange 2010 adds the Exchange Control Panel (ECP), a web-based interface that lets administrators assign the ability to perform specific management tasks, using RBAC, to individuals. Smaller installations probably won’t see much value in ECP, but it should be a popular feature in enterprise-class deployments.
Messaging Compliance Improvements
Microsoft created a base for messaging compliance in Exchange 2007 with messaging records management (MRM) and transport and journal rules. Unfortunately, some aspects of MRM were incomplete and difficult to deploy, such as the requirement to publish message classification definitions via XML files to each Outlook client. However, transport rules were a welcome advance, eliminating the need to write code to perform special message processing, and journal rules let Exchange efficiently capture messages. These rules depend on the architectural change Microsoft made in the transport system to force every message to flow through a Hub Transport server, even if sent to a local recipient. The Hub Transport server therefore functions as a single place where messages can be examined and processed.
Microsoft builds on MRM with some new features and by tweaking some implementation details. For example, a new records management role is defined in ECP that lets assigned individuals perform email discovery searches. Auditing will track such searches to prevent user abuse. Archiving is more granular, so you can decide to archive only messages that meet certain conditions rather than everything sent by mailboxes in a specific database or by a specific user, as is the case today. For example, you can archive messages only if the sender and recipient are in different departments or if they are located in Austria.
Exchange 2007 also introduced managed folders, each of which can have a different retention time. As it turns out, users just didn’t get their heads around managed folders, so Microsoft is pursuing a different approach by focusing on tags as the basis for message retention. Administrators can define a set of tags, such as “Important,” “Long-term archive,” or “Do not delete.” Each tag has its own retention policy (such as “Never delete these messages”). When users apply tags to messages, Exchange applies the appropriate retention policy when its management agents scan mailboxes. It’s too early to know whether tags will be any more successful than managed folders as the basis for message retention.
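The tag approach boils down to a lookup from tag to retention policy that a scanning agent applies to each message. The sketch below illustrates that logic only: the retention periods, the default for untagged messages, and the `is_expired` function are assumptions of mine, not values or APIs from Exchange 2010.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days; None means "never delete".
RETENTION_DAYS = {
    "Important": 730,
    "Long-term archive": 3650,
    "Do not delete": None,
}
DEFAULT_DAYS = 90  # assumed policy for messages the user never tagged


def is_expired(tag, received, today):
    """Return True if a message with this tag is past its retention period."""
    days = RETENTION_DAYS.get(tag, DEFAULT_DAYS)
    if days is None:
        return False  # the tag's policy is to keep the message forever
    return today - received > timedelta(days=days)


if __name__ == "__main__":
    today = date(2009, 6, 1)
    received = date(2009, 1, 1)
    for tag in (None, "Important", "Do not delete"):
        print(tag, is_expired(tag, received, today))
```

With these made-up periods, an untagged message from January is past retention by June, whereas the same message tagged "Important" or "Do not delete" is kept.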
Exchange 2010 also includes new MRM policies so that administrators can provide users with the ability to archive messages without having to move them to a PST. PSTs are horrible to deal with from an administrator’s perspective (hard to back up and restore, difficult to search thoroughly for e-discovery), so this change is a welcome one.
The Future for Exchange Clients
It's long been standard practice for Microsoft to release a new version of Outlook alongside a new version of Exchange. Exchange 2010 is part of the Office 14 wave, so Microsoft will upgrade Outlook, Outlook Web Access, and Pocket Outlook (on Windows Mobile 7.0 clients) to add new features, improve usability, and accommodate the architectural changes in Exchange 2010, including some performance improvements within Outlook to deal with the demands of very large (>2GB) mailboxes. After all, there’s no point in Exchange being able to support very large mailboxes if its premier client finds it difficult to process those mailboxes, which is often the situation today.
The biggest thing you’ll notice in the client UI is a focus on conversation views where you'll be able to process complete sets of messages that make up a conversation more efficiently than you can today. MailTips, small balloon-like messages, will appear to warn users whenever an action might not make sense, for example, when you’re about to use Reply to All on a message that includes 3,000 recipients. Other tips will tell users when recipients can’t receive messages because their mailbox is full or if they're out of the office and won’t be able to respond. OWA will also support MailTips and conversation views.
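The MailTips described here amount to a set of checks run against the recipient list before a message is sent. The following sketch shows that idea under stated assumptions: the `Recipient` fields, the `LARGE_AUDIENCE` threshold, and the wording of the tips are hypothetical and are not taken from Outlook or Exchange.

```python
from dataclasses import dataclass
from typing import List

LARGE_AUDIENCE = 25  # hypothetical threshold for the "large audience" warning


@dataclass
class Recipient:
    address: str
    mailbox_full: bool = False
    out_of_office: bool = False


def mail_tips(recipients: List[Recipient]) -> List[str]:
    """Collect warnings to show the sender before the message goes out."""
    tips = []
    if len(recipients) > LARGE_AUDIENCE:
        tips.append(f"This message will go to {len(recipients)} recipients.")
    for r in recipients:
        if r.mailbox_full:
            tips.append(f"{r.address} can't receive mail: mailbox is full.")
        if r.out_of_office:
            tips.append(f"{r.address} is out of the office.")
    return tips


if __name__ == "__main__":
    everyone = [Recipient(f"user{i}@example.com") for i in range(3000)]
    everyone[0].out_of_office = True
    for tip in mail_tips(everyone):
        print(tip)
```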
The Exchange 2010 Environment
Microsoft plans to release only a 64-bit version of Exchange 2010 for production, but they might again provide a 32-bit test version. Of course, now that Microsoft has Hyper-V in its armory, you can expect that Exchange 2010 will be a good candidate for virtualized deployments, albeit with the normal caveats that roles such as Client Access and Hub Transport are more suitable for virtualization than high-end Mailbox servers. Unified Messaging servers remain a poor choice for virtualization because of the demands of audio processing for voicemail. Given that experience with virtualization grows all the time, it’s wise to check with Microsoft for the latest news on support for your favorite application.
Exchange 2010 isn't supported for Windows Server 2003, so you'll have to deploy it on Windows Server 2008. As usual, Exchange 2010 will have other prerequisites, such as the latest version of the .NET Framework, PowerShell 2.0, and some schema updates for AD. There's no current dependency that Exchange 2010 must access AD on Server 2008, but you'll need to ensure that your forest is at least at Windows 2003 functional mode and that there's at least one Global Catalog server running Windows 2003 SP2 in each domain that supports an Exchange 2010 server. Exchange 2010 doesn't support read-only domain controllers.
Within an Exchange organization, you can mix Exchange 2010 servers with servers running Exchange 2007 SP1 or later and Exchange 2003 SP2 or later, but there's no support for earlier versions of Exchange. As with Exchange 2007, you won’t be able to upgrade an existing version of Exchange to the new release and will have to deploy new servers running Exchange 2010, then use the Move Mailbox feature to move users to the new servers. Details of deployment recommendations are still being worked out, but I expect that best practice will be to deploy servers running the Hub Transport (and Edge Transport) and Client Access roles first, followed by Mailbox servers.
Tons of New Developments
There are many other changes in Exchange 2010. Public folders persist, but some APIs (e.g., CDOEX, WebDAV, ExOLEDB) are replaced by Exchange Web Services. Unified messaging gains features such as a message waiting indicator and a personal auto attendant that can configure rules for how to answer incoming calls. You can expect Microsoft to connect Exchange better with Office Communications Server and its Windows Rights Management Services, bringing different strands of its information worker strategy closer together.
Microsoft still has tons of work to do before Exchange 2010 becomes a shrink-wrapped product, but all indications from the beta versions are that the new release will deliver some interesting and valuable functionality. As with any release, things can change before Microsoft ships the final software, including the elimination of features that don’t meet goals for functionality or quality. However, given that Exchange 2010 doesn't represent the same kind of generational change represented by the move from Exchange 2003 to Exchange 2007, I expect that the bulk of the functionality that exists in today’s builds will appear in the final release. The changes in the new version collectively represent nearly three years’ hard work by a large development group, so you can expect to be busy learning all about Exchange 2010 in the coming months.