Implementing an email discovery, compliance, archiving, and retention (DCAR) solution is similar to remodeling a house. You need a “blueprint” (plan) that provides detailed instructions for the “remodeling” (design, components, and implementation steps of the DCAR solution). You also need to be able to translate the plan into reality—as a contractor would on a remodeling job—within the budget and schedule you've allotted for the project. Like a remodeling, implementing a DCAR solution might disrupt your current messaging scheme somewhat, but in the end it will make your job as an Exchange administrator easier and add functionality to your messaging system. Fortunately, Exchange provides some built-in features—message journaling, backup and restore APIs, and message and transport security—that provide a framework for building a DCAR solution. We'll examine those features as well as some related technologies in Exchange, such as event sinks, protocol logs, and message tracking, and explore how each fits into a DCAR strategy. And in the Web exclusive sidebar “Third-Party Products in an Exchange DCAR Solution,” we'll look at some important DCAR functions that you'll need to obtain through third-party products and what to consider when choosing such products.
You'll find a growing degree of confusion about the difference between archiving and journaling. From a high-level point of view, you can successfully argue that little distinguishes the two methods; both extract messaging data from the messaging system. However, you do need to know the difference, not only to implement your solution correctly but also to evaluate whether third-party components will meet your needs.
Archiving is the process of removing content from the messaging system for long-term storage in some other system, usually some type of database. Email messages are taken from user mailboxes according to criteria, such as age. Archiving technology usually provides some kind of Web or mailbox-based extension mechanism that lets users continue to access archived content as necessary. Features generally address the discovery, archiving, and retention components of DCAR. Archiving technology is useful for mailbox management, reducing storage requirements by consolidating and compressing multiple copies of the same data and ensuring the preservation of corporate knowledge.
Journaling is the process of creating or capturing copies of email messages as they enter and traverse the messaging system, ensuring that those copies are collected in central locations. Together, the journal copies comprise a searchable form of documentation that administrators and auditors can use to see the email messages that users are sending and receiving. Journaling generally addresses the compliance component of DCAR and is one of the three common mechanisms for moving messaging data into compliance and policy solutions. Journaling usually doesn't provide any direct benefits to end users, but it can be a vital part of a complete DCAR solution.
Although Exchange has no built-in archiving functionality, it has included basic journaling capabilities since Exchange Server 5.5 Service Pack 1 (SP1). Over the years, Microsoft has increased journaling functionality in the various Exchange releases, service packs, and occasional hotfixes. Today, Exchange includes three types of journaling:
- simple journaling (also known as message-only journaling)
- blind carbon copy (Bcc) journaling
- envelope journaling
All three types work on the same basic principle: Almost every email message that enters the Exchange organization is examined to see whether it's bound for a recipient configured for journaling. If it is, the first Exchange Server categorizer through which it passes creates another copy of the email message (in a process called bifurcation) that's delivered to a specified journal mailbox or public folder. The only messages exempted from journaling are system messages such as Active Directory (AD) replication messages, public folder replication messages, and journal messages.
Note: Although you can specify a public folder as the journal destination, Microsoft recommends that you specify a mailbox. Journal messages delivered to public folders can't be stamped with the full range of data with which email messages delivered to a mailbox can be stamped.
Although you have some control over which recipients are configured for journaling, you should be aware that your ability to perform this configuration isn't very granular in any current version of Exchange. For Exchange Server 2003 and Exchange 2000 Server, journaling is enabled on a per–message-store basis. All mailboxes in enabled message-store databases are journaled, and all journal messages that mailboxes generate in that database are sent to the same journal mailbox (although you can configure separate journal mailboxes for each message-store database).
In Exchange 5.5, you can enable journaling for an entire organization or on a per-site or per-server basis. Be aware that Exchange will capture and copy only email messages that are transmitted. If someone edits a message in-mailbox, the change won't be captured. I know of at least one lawsuit that involved lawyers being blind-sided because they weren't aware that email messages in their organization had been changed to cover up evidence of wrongdoing. Opposing counsel produced records of the original, unaltered messages. I'll review the three types of journaling relative to the goal of implementing a DCAR solution.
Simple journaling has existed in Exchange since Exchange 5.5 SP1. When simple journaling is enabled, the first Exchange categorizer to handle a given email message parses the P2 header (the header information contained within the actual message) to determine whether the relevant mailboxes are in databases with journaling enabled. For email messages sent within the organization that use Messaging API (MAPI), remote procedure call over HTTP Secure (RPC over HTTPS), Microsoft Outlook Web Access (OWA), or another form of HTTP access, this server is the sender's mailbox server. Otherwise, it's the bridgehead server that receives the message through SMTP or the Exchange Message Transfer Agent (MTA) service. Journal copies of the email message are then sent to all relevant journal mailboxes. You control simple journaling through the Mailbox Store Properties dialog box, as Figure 1 shows.
Let's look at an example. Imagine an Exchange organization with four mailbox servers, EXCH01 through EXCH04. Each mailbox server has two mailbox stores, one for regular users and one for journaling. Each regular mailbox store is configured to deliver to the journal mailbox in the journal mailbox store on the same server. In addition, an SMTP bridgehead server handles all incoming and outgoing SMTP traffic.
An external email message comes into the organization addressed to four recipients: Adam, Barbara, Charlie, and Denise. By chance, these four recipients are homed on separate mailbox servers. In addition to forwarding the email message to the actual recipient mailboxes, the bridgehead forwards it to the four journal mailboxes, requiring extra bandwidth, disk I/O, and CPU in the process. That kind of traffic multiplier can cause a significant performance hit in organizations with geographically dispersed servers linked by low-bandwidth WAN connections or organizations whose servers are already running close to their peak performance.
You might wonder why extra bandwidth is required, given that the SMTP stack in Exchange 2003 and Exchange 2000 is supposed to send only one copy of a message between servers even when there are multiple recipients on the destination server. Because the journal copy of the message has extra properties stamped on it during the bifurcation process, the journal copy technically counts as a separate message. Be aware of this behavior when you design your solution.
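To make the traffic multiplier concrete, here is a minimal Python sketch of the four-server example. It is a simplified model and doesn't measure real Exchange behavior. The `HOME_SERVER` mapping and the function name are illustrative assumptions. It assumes one copy per destination server for regular recipients plus one separate journal copy per server when journaling is on.

```python
from collections import defaultdict

# Hypothetical layout from the example: each recipient's home mailbox server.
HOME_SERVER = {
    "Adam": "EXCH01",
    "Barbara": "EXCH02",
    "Charlie": "EXCH03",
    "Denise": "EXCH04",
}


def transfers_from_bridgehead(recipients, journaling_enabled):
    """Count message copies the bridgehead sends to each mailbox server.

    One copy covers all regular recipients on a server; with journaling,
    the journal copy counts as a separate message for that server.
    """
    copies = defaultdict(int)
    for server in {HOME_SERVER[name] for name in recipients}:
        copies[server] += 1  # single copy for all recipients on this server
        if journaling_enabled:
            copies[server] += 1  # bifurcated journal copy, sent separately
    return dict(copies)


recipients = ["Adam", "Barbara", "Charlie", "Denise"]
without = transfers_from_bridgehead(recipients, journaling_enabled=False)
with_journal = transfers_from_bridgehead(recipients, journaling_enabled=True)
print(sum(without.values()), sum(with_journal.values()))  # prints "4 8"
```

In this model, journaling doubles the number of copies that leave the bridgehead, which is why slow WAN links deserve attention during design.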
Simple journaling has some other limitations, mainly because it uses the P2 header information. Simple journaling can't
- capture Bcc recipients. This limitation reduces or eliminates journaling's usefulness; you can't accurately track email message recipients.
- capture the results of any address rewriting you might have configured in your organization.
- uniformly expand distribution list (DL) membership. This limitation could leave you with a journaled email message that contains the DL name instead of the list of members. How do you go about proving the list's membership at the time the email message passed through the system? What if the list constantly has members added and removed? And consider how this limitation can affect a large organization with a complicated AD replication topology.
The Exchange 5.5 version of simple journaling has an additional flaw: The journal copy captures display names rather than actual email addresses. In essence, you can't prove that an email message actually went to a particular recipient; you can guarantee only that the message was sent to a recipient who had that specific display name configured at that specific time.
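The Bcc limitation follows directly from reading only the headers stored in the message. The Python sketch below illustrates the idea and isn't Exchange code. It compares the recipients visible in the To and Cc headers with the envelope recipients, that is, the addresses the transport was actually asked to deliver to. The sample message, the addresses, and the function names are all hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical message as it would appear in the content (P2) headers.
RAW_MESSAGE = """\
From: adam@example.com
To: barbara@example.com
Cc: charlie@example.com
Subject: Quarterly numbers

See attached.
"""

# Hypothetical envelope recipients, as given in the SMTP RCPT TO commands.
ENVELOPE_RECIPIENTS = [
    "barbara@example.com",
    "charlie@example.com",
    "denise@example.com",  # Bcc recipient: in the envelope, not in the headers
]


def header_recipients(raw_message):
    """Return the set of addresses listed in the To and Cc headers."""
    msg = message_from_string(raw_message)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    return {addr.lower() for _name, addr in getaddresses(fields)}


def missed_by_header_journaling(raw_message, envelope_recipients):
    """Recipients that were delivered to but never appear in the headers."""
    visible = header_recipients(raw_message)
    return sorted(r for r in envelope_recipients if r.lower() not in visible)


print(missed_by_header_journaling(RAW_MESSAGE, ENVELOPE_RECIPIENTS))
# prints ['denise@example.com']
```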
Bcc journaling is really simple journaling on steroids. No additional UI is exposed to turn on Bcc journaling functionality. Instead, in the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSExchangeTransport\Parameters registry subkey, you need to specify JournalBCC (REG_DWORD), with a value of 0x01 to enable and 0x00 to disable.
Restart the SMTP and Exchange Information Store services to pick up the change. When the Exchange store detects this registry value on startup, it enables capture of the Bcc recipients on journaled messages and records this information in the journal copy of the captured messages.
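If you need to apply the same registry value to several servers, one option is to generate a .reg file and import it on each server. The following Python sketch only builds the file text and doesn't touch the registry itself. The function name is an assumption, and the key path is written in its usual form without embedded spaces. Test the resulting file on a non-production server first, and back up the registry before importing anything.

```python
# Registry subkey named in the text, written without embedded spaces.
KEY_PATH = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
    r"\Services\MSExchangeTransport\Parameters"
)


def journal_bcc_reg(enabled):
    """Build .reg file text that sets JournalBCC to 1 (enable) or 0 (disable)."""
    value = 1 if enabled else 0
    return (
        "Windows Registry Editor Version 5.00\r\n\r\n"
        f"[{KEY_PATH}]\r\n"
        f'"JournalBCC"=dword:{value:08x}\r\n'
    )


print(journal_bcc_reg(True))
```

After the value is in place, the SMTP and Exchange Information Store services still need the restart described above.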
Caution: Make sure that all your Exchange servers have the necessary upgrades for the level of journaling you use. If you use Bcc journaling on one server, make sure you enable it on every server that has message journaling enabled.
Microsoft included support for Bcc journaling in the original release of Exchange 2003, but the feature still suffered from the DL-expansion problem. Exchange 2003 SP1 added support for DL expansion. If you have Exchange 2000 servers in your organization, you can add both capabilities by ensuring that you upgrade your servers to Exchange 2000 SP3 and add the hotfix described in the Microsoft article “Bcc information is lost for journaled messages in Exchange 2000” (http://support.microsoft.com/?kbid=810999). No upgrades or hotfixes can give you Bcc journaling functionality in Exchange 5.5.
Microsoft introduced envelope journaling (aka advanced journaling) in Exchange 2003 SP1. It's designed to address the limitations of simple journaling and Bcc journaling by using the P1 header data. Like Bcc journaling, envelope journaling was retrofitted to Exchange 2000 servers; they must run Exchange 2000 SP3 plus the post–SP3 Update Rollup. (For more information about the rollup package, see the Microsoft article “An update rollup is available to enable the Envelope Journaling feature in Exchange 2000 Server” at http://support.microsoft.com/?kbid=834634.) Again, no updates for Exchange 5.5 provide this functionality.
In many respects, envelope journaling works much like simple journaling; it's still invoked by the first categorizer to handle a message and still configured on a per-store basis. The big difference is that whereas simple journaling looks at the P2 headers, envelope journaling looks at the P1 headers. The latter approach provides several benefits over simple journaling or Bcc journaling. Envelope journaling
- captures both display names and actual SMTP addresses
- natively captures Bcc recipients
- allows full DL membership expansion (including hidden DL members)
- captures all mail-enabled recipient objects: public folder, contacts, alternate recipients, ad hoc recipients, and query-based DLs
- captures delivery reports, nondelivery reports (NDRs), read receipts, and out-of-office messages
Instead of sending a bifurcated copy of the original email message to the journal mailbox, envelope journaling creates a separate journal report message. It then makes an exact copy of the original message as an attachment to the journal report message. This approach preserves the original message exactly as it was sent; it contains original headers, DLs are intact, and Bcc recipient information isn't displayed in the original message while still being available in the journal report message.
It sounds great, but the downside is that envelope journaling frequently results in multiple copies of the journal report message (and attached original message) being delivered to the journal mailbox. This behavior is an expected consequence of using the P1 headers. The server that performs the original categorization might not have all the information it needs to completely identify all message recipients. It generates an incomplete journal report message with the best information it has, then forwards copies to other servers for them to process. The steps of the process follow:
- When the originating store can't perform DL expansion, it sends the email message to an expansion server for DL expansion.
- The DL-expansion server generates a new journal report message that now contains the full DL membership recipient information.
- Each of the recipient stores generates an additional journal report message to show that the original message was, in fact, delivered.
The result? You now have multiple copies of the journal report (and the original, attached email message) in the journal mailbox. All these copies contain different subsets of the final message recipients. They're all keyed from the same message ID, which means that whenever you audit message delivery, you must examine all related journal-report messages.
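An audit tool therefore has to combine every journal report that shares a message ID before it can say who received a message. Here is a minimal Python sketch of that merge step. The report tuples are hypothetical sample data, and a real tool would first extract the message ID and recipient list from each journal report in the journal mailbox.

```python
from collections import defaultdict

# Hypothetical journal reports: (message ID, recipients listed in that report).
REPORTS = [
    ("<1234@example.com>", ["sales-dl@example.com"]),  # originating server
    ("<1234@example.com>", ["adam@example.com", "barbara@example.com"]),  # after DL expansion
    ("<1234@example.com>", ["barbara@example.com"]),  # recipient store
    ("<5678@example.com>", ["charlie@example.com"]),
]


def merge_reports(reports):
    """Group journal reports by message ID and union their recipient lists."""
    merged = defaultdict(set)
    for message_id, recipients in reports:
        merged[message_id].update(r.lower() for r in recipients)
    return {mid: sorted(rcpts) for mid, rcpts in merged.items()}


for message_id, recipients in merge_reports(REPORTS).items():
    print(message_id, recipients)
```

Using a set per message ID means that a recipient listed in several reports is counted only once.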
If your organization is dispersed over multiple sites linked by low-bandwidth WAN connections, envelope journaling can have a significant effect. The following steps show you how to set up envelope journaling in a way that mitigates potential negative effects:
- Ensure that all your servers run at least Exchange 2000 SP3 plus the post–SP3 Update Rollup or Exchange 2003 SP1. You'll experience inconsistent behavior if you don't upgrade all your servers.
- Download the Microsoft Exchange Server Email Journaling Advanced Configuration tool (exejcfg.exe) at http://www.microsoft.com/downloads/details.aspx?familyid=e7f73f10-7933-40f3-b07e-ebf38df3400d&displaylang=en.
- If you currently use Bcc journaling, remove the registry subkey or set its value to 0x00 (disabled) on all the servers on which you have journaling enabled. Restart the SMTP and Exchange Information Store services to activate the change.
- From the command prompt, use the exejcfg.exe tool to enable envelope journaling across the entire organization. This tool makes a simple change to AD that tells all the Exchange servers to change the heuristics they use to perform journaling.
At this point, you don't need to restart services. As your AD replication takes place, your Exchange servers will pick up the changed configuration value and switch over to the new behavior.
Backup and Restore APIs
If you're like many of your fellow Exchange administrators, your backup and restore plans for your Exchange organization are often a major driver of current operational procedures. It might not seem that your backup and restore process is directly related to DCAR, but if you stop and think for a moment, you'll realize that it is. Here are a handful of ways in which DCAR and backups are related:
- Some regulations mandate the ability to restore backups for a period of years. A viable disaster recovery plan is essential for demonstrating your organization's intent to comply.
- Unauthorized use of backup tapes can be a vector for disclosure of protected information. Poor control over backup materials can be another route to compliance nightmares.
- Backups protect the data currently in your mailboxes before it's migrated to your archiving system. Most archiving systems aren't designed to be able to inject content back into the messaging system.
- Backups are the only way to recover improperly deleted email messages when your retention policies don't work the way you expect them to, especially if you don't detect the malfunction right away.
If you use (or intend to use) some form of message journaling, you'll need to modify your backup and restore plans. Journaling mailboxes accumulate a large amount of traffic quickly, which can affect your backup and restore windows. If you follow recommended practices and establish your Exchange journaling mailboxes in separate message stores, you'll need to back up those stores.
Whatever form of backup and restore you use, you must ensure that it actually supports Exchange. Taking a backup of the database files directly from the file system (including several snapshot solutions that storage vendors offer) is a bad idea for several reasons. Not only is it extremely difficult for a solution that uses this approach to ensure that your databases are backed up in a consistent state, such a solution does nothing to address the growth of your Exchange database transaction logs. A proper Exchange-aware backup uses established APIs to perform the necessary log truncation after a successful backup of the store, which shortens restoration time (fewer transactions must replay after the files have been restored) and helps conserve disk space.
If you need a snapshot-based backup strategy and you run Exchange 2003 on Windows Server 2003, you might be able to take advantage of Microsoft Volume Shadow Copy Service (VSS)–based backups. By using VSS, the Windows OS provides a tested and supported snapshot capability, letting compatible backup systems take their backup against a consistent shadow copy of the database.
Sometimes, designing a complicated backup solution for Exchange isn't the best answer. If Windows Backup (aka NTBackup) is good enough for Microsoft's production servers, it's probably good enough for yours. You can use NTBackup to create an on-disk backup of your Exchange databases, then use your enterprise backup software to transfer those files to other media. A disk-to-disk strategy removes your reliance on slower, less reliable technologies, and, in turn, helps your backup and restore process run more quickly.
Don't gamble with your backups. Actively test your backup and restore plans by using duplicates of your production hardware and software so that you know you can restore your data when it counts. Ensure that your backup methodology is Exchange aware and that Microsoft supports it. And don't forget that Exchange 2003 features—namely the Recovery Storage Group (RSG)—can make your restoration processes run much more smoothly.
Message and Transport Security
Message security encompasses two main areas: message encryption (using cryptography to protect the actual message from inspection by unauthorized parties) and transport encryption (using cryptography to protect discrete connections between components of the messaging system).
Message encryption. Message security has clear implications for your DCAR solution. In particular, you need to consider the following questions:
- If you use Secure MIME (S/MIME), which Exchange supports, does your archiving solution support it?
- Does your archiving solution archive older certificates, so that you can still view email messages encrypted with them?
- How do you protect, back up, and restore whatever public key infrastructure (PKI) you use with S/MIME? (And although Pretty Good Privacy, or PGP, isn't optimal for DCAR, if you use it, ask yourself how you'll protect, back up, and restore your users' PGP keyrings.)
- Can your policy-compliance software handle encrypted email messages?
- Are you required to protect message integrity through every hop of your network?
- Can attackers (whether internal or external) eavesdrop on unencrypted transport links?
Exchange 2003 and Exchange 2000 come with strong support for S/MIME; the Exchange 2003 version of OWA extends this support to OWA users. However, the practical considerations of deploying and managing the requisite PKI, dealing with the content-inspection challenges, and archiving keys tend to make the use of S/MIME unattractive for most organizations unless they're required to use it (e.g., government Exchange deployments).
Transport encryption. Transport encryption, on the other hand, is easy with Exchange and Windows and tends to mesh well with any third-party components of your DCAR solution. Exchange 2000 and later natively support Secure Sockets Layer (SSL) and Transport Layer Security (TLS) for a variety of protocols; Windows 2000 and later provide built-in IPsec functionality. Don't rely on MAPI encryption to protect connections between Outlook and Exchange; either deploy IPsec policies or upgrade to Microsoft Office Outlook 2003 and Exchange 2003 so that you can use RPC over HTTPS.
In my experience, Microsoft Internet Security and Acceleration (ISA) Server 2004 is one of the best investments you can make to help provide a higher level of message security between the Internet and your Exchange organization. Placing an ISA server in your demilitarized zone (DMZ) means never having to expose your Exchange servers directly to incoming Internet traffic and greatly simplifies your firewall configuration. Plus, ISA permits SSL bridging, so that you can perform protocol-aware proxying and filtering of SMTP and HTTP connections while still providing transport encryption for every connection.
A variety of other Exchange technologies and features aren't directly related to DCAR but still provide useful hooks into your Exchange organization or make deployment and troubleshooting easier to perform:
- Event sinks—Exchange event sinks provide a powerful mechanism for extending Exchange functionality. Many DCAR components use this feature to plug into your Exchange servers and intercept email messages before they're passed off to internal Exchange components. Common uses include alternative journaling implementations, content inspection, and disclaimer injection.
- Protocol logs—Although protocol logs are disabled by default, you can easily turn on Exchange's powerful protocol-level logging on a per–virtual-server basis. These logs provide an accurate picture of all the communications that transpire through that virtual server, letting you easily track down problems or perform spot audits.
- Message tracking—Exchange's message-tracking feature is disabled by default. When enabled on all your Exchange servers, message tracking lets you quickly trace the passage of email messages through your organization. Enabling message tracking takes a small amount of overhead, but the ability to easily find out where an email message went astray more than makes up for the overhead, especially if you need to troubleshoot your DCAR implementation.
- Message hygiene—Exchange 2003, in particular, includes some impressive antispam features that can help you reduce the level of junk that makes it into your organization. The reduction in spam in turn reduces the load on your retention, archiving, and compliance components. Exchange also provides a comprehensive antivirus API that lets you stop worms, viruses, and Trojan horses.
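For spot audits of the protocol logs mentioned above, a small script can pull out the commands that failed. The Python sketch below assumes the virtual server writes its log in the W3C Extended Log File Format, where a `#Fields:` directive names the space-separated columns. The sample log lines, field selection, and function names are hypothetical.

```python
# Hypothetical log excerpt; the real field list depends on the logging
# properties selected for the virtual server.
SAMPLE_LOG = """\
#Software: hypothetical-smtp-service
#Version: 1.0
#Fields: date time c-ip cs-method cs-uri-query sc-status
2006-03-01 10:15:02 192.0.2.10 EHLO +mail.example.com 250
2006-03-01 10:15:02 192.0.2.10 MAIL +FROM:<adam@example.com> 250
2006-03-01 10:15:03 192.0.2.10 RCPT +TO:<nobody@example.org> 550
"""


def parse_w3c_log(text):
    """Yield one dict per log entry, keyed by the names in the #Fields line."""
    fields = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split()))


def failed_commands(text):
    """Return entries whose status code is numeric and 400 or higher."""
    failures = []
    for entry in parse_w3c_log(text):
        status = entry.get("sc-status", "")
        if status.isdigit() and int(status) >= 400:
            failures.append(entry)
    return failures


for entry in failed_commands(SAMPLE_LOG):
    print(entry["time"], entry["cs-method"], entry["cs-uri-query"], entry["sc-status"])
```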
Completing the Solution
As you've seen, you can use Exchange's built-in journaling, along with Exchange 2003's support for VSS and message and transport encryption plus related features such as message tracking, as the foundation of your Exchange recovery and compliance solution. However, Exchange doesn't provide certain other essential DCAR functions, such as archiving and PST management. To complete your Exchange DCAR solution, you'll want to look into third-party products that can provide these capabilities.
EXCHANGE COMPLIANCE RESOURCES
E-discovery and compliance:
Email Compliance Requirements: Getting Started, and Preventing the IT Search Party: Be Prepared for E-Discovery—on-demand Web seminars, http://www.windowsitpro.com/events
Exchange backup and recovery:
“Best Practices for Recovery Storage Groups and Exchange Server 2003”
“How can I back up my Microsoft Exchange Server storage groups and databases?”
“Exchange Server 2003 data backup and Volume Shadow Copy Service”
Microsoft's in-house Exchange 2003 backup strategy: “Backup Process Used with Clustered Exchange Server 2003 Servers at Microsoft”
“Exchange 2003 Advanced Journaling,” InstantDoc ID 45644
“What message journaling options does Microsoft Exchange Server 2003 support?”
“Troubleshooting message journaling in Exchange Server 2003 and Exchange 2000 Server,” http://support.microsoft.com/?kbid=843105
Exchange's built-in antispam features:
“Secure Email with S/MIME,” InstantDoc ID 49878
This article is adapted from Email Discovery and Compliance, Chapter 5: Implementation, Part 2—Hardware and Software (Windows IT Pro eBooks, 2006).