In "Enabling Message Journaling on Exchange Server" (December 1999), we talked about configuring Exchange Server for message journaling. Message journaling is a method for saving a copy of messages for legal or business reasons. Although message journaling is becoming a hot topic among companies that want to bring their email systems into compliance with government or business requirements, a big drawback is the scarcity of message-archiving tools. You need to be able to easily transfer all the messages from the journal recipient to another medium to prevent the message volume from overwhelming the Information Store (IS).
CommVault Systems (http://www.commvault.com) and SRA International (http://www.assentor.com) offer products that work with message journaling for long-term message archiving. However, Microsoft designed the Microsoft Exchange Server Archiving Agent (EAA) specifically for low-budget, high-volume message archiving. EAA lets you easily archive the messages that accumulate in journal recipients. You don't have to use EAA with message journaling, however; you can also use it as a freestanding utility to dump all the messages in a mailbox or public folder.
Although EAA is an effective message-archiving tool, its documentation is sparse and offers little insight into its full functionality. To help fill this void, let's look at how to plan and prepare for archiving journaled messages and how to install, configure, and use EAA.
Planning the EAA Schedule of Events
In our previous article, we discussed the pros and cons of using a mailbox vs. a public folder for the journal recipient. In summary, most people with more than one Exchange server choose the public folder option because this option lets them control the utility's impact on the network. To demonstrate EAA's capabilities, we assume you have multiple public folders across multiple Exchange servers in a distributed WAN environment.
Planning is key to archiving successfully. You need to know the size of the data in public folder replication and the time needed for archiving and replicating deletes. Before you begin archiving, you need to ensure that the public IS has replicated all public folder messages, even large messages. To control message flow, set the Message replication size limit on the Advanced tab of the public IS property page to a value you've determined appropriate. The message replication size limit refers to the size a message or group of messages must be before replication occurs. The 300KB default is usually appropriate, but where bandwidth is limited, you might want to lower the message replication size limit so that smaller batches of messages trigger the replication process, which keeps large bulk transfers from occurring.
Table 1 shows an example public folder replication schedule. We staggered most of the replication times to avoid overwhelming the NCR-HQ server, which functions as the repository for all the archives. Based on our benchmarks, these time intervals will accommodate the estimated amount of data that we need to replicate and archive each day (see Table 1, column 6). To find out how much time public folder replication is taking, go to the Server Replication Status tab on the Properties sheet of the server's public IS. This tab shows how long the last transmission from the local server to the selected server took, which tells you whether your estimates were correct. You can also determine how much total disk space a public folder replica is consuming by viewing the Total K column of the Public Folder Resources tab on the server's public IS Properties sheet.
You can determine your storage requirements by multiplying the amount of message data per day by the amount of time you want to save the messages for retrieval. For good performance, you need a dedicated set of hard disks used solely for archived message storage and appropriate media (e.g., CD-ROM, DLT) for long-term storage. The permanent storage medium must be appropriate for the length of time you need to keep the archived messages. For example, a CD-ROM's shelf life is about 20 years, whereas a DLT's shelf life is only about 5 years.
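The storage calculation described above is simple multiplication, and a short script makes it easy to rerun as volumes change. The following is a minimal sketch in Python; the function name, the optional headroom factor, and the per-server volumes are illustrative assumptions, not values from Table 1.

```python
def archive_storage_mb(daily_mb, retention_days, headroom=0.0):
    """Estimate archive disk space in MB.

    daily_mb       -- message data archived per day, in MB
    retention_days -- how long archived messages stay online for retrieval
    headroom       -- optional safety margin (0.2 adds 20 percent); an assumption,
                      not part of the rule of thumb in the text
    """
    if daily_mb < 0 or retention_days < 0 or headroom < 0:
        raise ValueError("inputs must be non-negative")
    return daily_mb * retention_days * (1.0 + headroom)


# Hypothetical daily journal volumes per regional server, in MB.
daily_volume_mb = {"Cincinnati": 150.0, "Cleveland": 100.0, "Columbus": 50.0}

total_daily_mb = sum(daily_volume_mb.values())
required_mb = archive_storage_mb(total_daily_mb, retention_days=90)
print(f"Estimated online archive space: {required_mb / 1024:.1f} GB")
```

Substitute the daily figures you measured on the Public Folder Resources tab and your own retention period. The headroom parameter is there only as a convenience if you want a margin for growth.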
Although you can configure the EAA to delete or not delete the original messages from the source public folder or mailbox after you've archived them, the best practice is to delete the messages. Because replication of deletions from the NCR-HQ server back to the regional servers requires minimal overhead on the WAN, we've scheduled these replications for all regional servers at the same time (see Table 1, column 4).
Timing is crucial, so we recommend that you configure a server monitor to automatically synchronize the servers' clocks. See Mark Ott, "Using Exchange Server Link and Server Monitors" (May 1999) for information about server monitors. Synchronization ensures proper replication of the public folder data and correct timestamps on the archived mail messages.
Preparing for EAA
To prepare for EAA, you need to configure your Exchange system, starting with public folder replication for archiving. First, use the Outlook client to create the public folder on each home server (see Table 1, columns 1 and 2). Next, to schedule public folder replication to NCR-HQ, select each remote server (i.e., Cincinnati, Cleveland, and Columbus) in the Microsoft Exchange Administrator program, then go to Folder, Public Folders, and select the public folder that is the message journaling recipient on that server. Select File, Properties, and select the Replicas tab, which you see in Screen 1. Select the central destination server (e.g., NCR-HQ), and click Add to create a replica of the folder. Repeat this process for each remote server. If you configure public folders to limit administrative access to the home site, you can create all the necessary public folder replicas from the NCR-HQ server.
To configure both the replication schedule to NCR-HQ and the replication of deletes (see Table 1, columns 3 and 4), go to the Replication Schedule tab, and select the times you want. For example, select the 1 Hour Detail View and Selected times, and configure the Cinti-Journaling public folder to replicate to NCR-HQ between 11:00 p.m. and midnight and to replicate deletes between 6:00 a.m. and 7:00 a.m., as Screen 2 shows. We repeated this process with the other servers according to the corresponding times in Table 1.
Installing and Configuring EAA
The system requirements for EAA are Windows NT 4.0 Service Pack 3 (SP3) or later, Exchange Server 5.5 SP1 or later, Microsoft Transaction Server (MTS) 1.0 or later, Microsoft Outlook 2000 or Outlook 98, and Collaboration Data Objects (CDO) 1.21. MTS is the communication glue between the NT service and the Visual Basic (VB) COM component that does the archiving. EAA also comes with an interactive version that is useful for testing without scheduling jobs to run. Although you can install EAA on any server, Microsoft recommends that you install it on the server with the journaled messages to avoid traversing the network. If you run EAA on a machine that isn't running Exchange, you need to edit the NT service dependency Registry keys that the EAA readme.txt documents.
You can download EAA from Microsoft's TechNet. To find the utility on TechNet, search for the keyword EAA. The search also locates related white papers and program updates. Microsoft released two updates: build 81 and build 82. Build 81 adds features such as the ability to specify the filename length of the archived messages for the HTML and external message (MSG) formats. This feature lets you limit message filename length to 64.3 characters to be compatible with CD-ROM format. Build 82 updates the EAA Interactive program by fixing the problems with the CDO FolderPicker routine. If you can't find these updates on TechNet, you can download them from the Exchange Administrator Web site (http://www.exchangeadmin.com).
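If you already have archives that were written before you limited the filename length, you might want to check them before burning them to CD-ROM. The following Python sketch walks an archive directory and lists files whose names exceed the limit. It assumes that "64.3" means at most 64 characters before the extension and at most 3 characters in the extension; that reading, the function names, and the example path are our assumptions rather than anything the EAA documentation states.

```python
import os

MAX_NAME = 64  # ASSUMED: maximum characters before the extension
MAX_EXT = 3    # ASSUMED: maximum characters in the extension


def too_long_for_cd(filename, max_name=MAX_NAME, max_ext=MAX_EXT):
    """Return True if the filename exceeds the assumed 64.3 limit."""
    stem, ext = os.path.splitext(filename)
    return len(stem) > max_name or len(ext.lstrip(".")) > max_ext


def find_long_names(archive_dir):
    """Walk archive_dir and return the full paths of files with over-long names."""
    offenders = []
    for root, _dirs, files in os.walk(archive_dir):
        for name in files:
            if too_long_for_cd(name):
                offenders.append(os.path.join(root, name))
    return offenders


if __name__ == "__main__":
    # Hypothetical archive location; replace it with your own output directory.
    for path in find_long_names(r"D:\Archive"):
        print(path)
```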
Before you run the EAA setup program, you must create a service account for EAA to run under; for security, don't use the existing Exchange service account. Next, log on with the new service account, and configure a MAPI profile for this account. To configure the profile, right-click the Outlook icon on the desktop, choose to manually configure information services, and click Next. Name the profile Microsoft Exchange Server Archiving Agent. You must spell out the name exactly, and the name is case-sensitive. Click Next. Add only the Microsoft Exchange Server service by clicking Add, selecting Microsoft Exchange Server, and entering the server and mailbox for the EAA service account. Don't add any personal store (PST) folders to the profile; using a PST folder as your destination format can create problems because Microsoft didn't design the EAA to archive from a PST folder to a PST folder.
If the EAAService isn't initially visible in the Control Panel, Services applet after the setup, you might need to register the service manually to make it display properly. To register the EAAService, run the command
in the directory where this file resides. The program installs by default under \Program Files\Microsoft Exchange Server Archiving Agent, as Screen 3 shows. Next, make sure you change the service startup to Automatic so that future reboots of the system will load the service on startup. Configure the Microsoft Exchange Server Archiving Agent service to log on under the service account you created.
The Microsoft Exchange Server Archiving Agent folder contains the EAA Interactive.exe file, which lets you test the application by archiving only one recipient container at a time. The file prompts you to log on to the Outlook profile (which you created earlier). After you log on, a window appears with the hierarchy of the service account mailbox and the public folders on the Exchange server, as Screen 4 shows. By navigating the hierarchy, you can copy all the messages from a public folder or mailbox, with an option to delete the originals from the Exchange server after copying. You can copy these messages to an alternative location by specifying the path on the local hard disk or a network share in the OutputDirectory window. You can set the output type to PST, MSG, or HTML format. When the Finished box displays the number of messages EAA has copied, you've finished archiving.
Which output type is best to use? PST format places all the archived messages inside a folder in one PST file in the root of the target directory you configure. The format of the PST filename is
where servername is the name of the Exchange server you're running the agent from, source folder is the name of the folder you're archiving from, and GMT-timestamp is the date and time the agent started archiving in Greenwich Mean Time (GMT). The PST format uses compressible encryption, which saves space and is useful if you require encryption for data security. The biggest drawback to the PST format is that it offers no good way to search through the messages without opening the PST file in the Outlook client. For this reason, use the PST format only when the volume of messages is small and you don't need easy search and retrieval.
The second format is the native MSG file format, in which the EAA stores each message in a separate file in the Outlook native MSG format. The MSG filename is in the format
where messageID is the message's MAPI ID (which has a 128-character default). All the messages that you archived in one job are in the servername_folder name_GMT-timestamp system folder. You can index these MSG files only with the Microsoft Site Server indexing service (i.e., Site Server Search) because the Microsoft Internet Information Server (IIS) indexing service (i.e., Index Server) doesn't automatically parse MSG files. The MSG file is the best choice when message fidelity and property preservation are crucial and you're willing to purchase and use Site Server Search for indexing and retrieval.
HTML format is probably the best choice because you can index items in that format for easy search and retrieval and view them with only a Web browser. With HTML format, if you want to open a message 10 years from now, you would need only a program that understands HTML, whereas the other two formats require a program that can read the Outlook PST or MSG format. When you select HTML format, EAA saves all the messages in a folder named servername_folder name_GMT-timestamp. EAA saves each message in the format
where messageID is the message's MAPI ID. EAA saves message attachments with the same filename as the message they're attached to but retains their native file extension to facilitate opening them in the program that created them. EAA saves the original filename of the attachment in the HTML file, with the original filename as a hyperlink. Also, EAA saves all the MAPI properties of the message in an HTML comment in the HTML file so you can easily use them for search and retrieval. Because the property names are in their native MAPI ID hexadecimal name, you need to use the MAPI ID-to-name translation table included in the CDO libraries to retrieve their original names for display. When you've established an archive location and started archiving in HTML format, you can easily set up search and retrieval by configuring the IIS index service to catalog that location, then modifying the sample search page to include those catalogs.
When we archived messages from the public folder HQ-Journaling in either MSG or HTML files, EAA by default created a new folder named ComputerName_Mailbox-Name_Inbox_Date under the root of the partition. If you use the PST file format, the PST file goes to the directory you specified. In all three formats, EAA creates a log file that has a one-line entry for each mail message with information such as message ID, sender, receiver, and size. Because this file is comma-delimited, you can easily open it in Microsoft Excel or Microsoft Access for summary analysis. You can also use the log file to perform a quick analysis of who is sending the most mail, the average number of recipients, or other statistics to meet your reporting needs.
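If you prefer a script to a spreadsheet, the same quick analysis takes only a few lines. The following Python sketch counts messages per sender and computes the average number of recipients per message. The column order, the absence of a header row, and the semicolon separator between recipients are all assumptions made for illustration; inspect a few lines of your own log file and adjust FIELDS before relying on the results.

```python
import csv
import io
from collections import Counter

# ASSUMED column order; the real EAA log layout may differ.
FIELDS = ["message_id", "sender", "recipients", "size_bytes"]


def summarize_log(lines):
    """Return (messages per sender, average recipients per message)."""
    senders = Counter()
    recipient_total = 0
    message_count = 0
    for row in csv.DictReader(lines, fieldnames=FIELDS):
        sender = row.get("sender")
        if not sender:
            continue  # skip malformed or short lines
        senders[sender] += 1
        # ASSUMPTION: multiple recipients share one field, separated by semicolons.
        recipients = (row.get("recipients") or "").split(";")
        recipient_total += len([r for r in recipients if r.strip()])
        message_count += 1
    average = recipient_total / message_count if message_count else 0.0
    return senders, average


# Made-up sample entries standing in for a real log file.
sample = io.StringIO(
    "0001,ann@example.com,bob@example.com;cy@example.com,2048\n"
    "0002,ann@example.com,bob@example.com,512\n"
    "0003,bob@example.com,ann@example.com,1024\n"
)
senders, average_recipients = summarize_log(sample)
print(senders.most_common(5))
print(f"Average recipients per message: {average_recipients:.2f}")
```

To run the sketch against a real log, pass an open file object instead of the sample, for example `open(path, newline="")`, where path is the location of your log file.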
The EAAConfig.exe file is better suited to production use than the interactive version because it lets you schedule automated archiving. As Screen 5 shows, this file lets you set the date, time, and frequency at which EAA archives any folder you choose, without intervention. EAAConfig.exe has the same output parameters as EAA Interactive.exe, and EAAConfig.exe lets you turn off logging and archive multiple folders simultaneously. The Agent.htm file in the Microsoft Exchange Server Archiving Agent folder provides more information about using EAA Interactive.exe and EAAConfig.exe.
Clicking the ellipsis (...) next to the Current Folder to Archive window invokes the service account's Outlook profile. The profile displays a window of all the public folders in the organization and the service account's mailbox. Although you can define only one archiving job at a time, you can repeat the process as many times as you want to add more jobs to the EAA list for archiving multiple journal recipients.
Selecting View Task Info lets you see all the scheduled tasks you've configured, as Screen 6 shows. Use the drop-down menu under Archival Task Id's to change the view to see a specific job number and its corresponding details. However, you can't change any of these task settings after you configure them, so if you make a configuration mistake, you must remove that scheduled task and start from scratch.
By default, the Outlook profile displays only one mailbox: the service account's mailbox. If you need to archive additional individual mailboxes (e.g., for rule compliance or an internal investigation), you can configure additional mailboxes in the profile. To configure additional mailboxes, in Outlook drill down to Tools, Services, Microsoft Exchange Server, Properties, Advanced, and add the mailboxes you want to archive. If you're using the Exchange service account, you have permission to open anyone's mailbox. Otherwise, you need to make the EAA service account a mailbox owner by adding this account to the Permissions tab of the respective mailboxes.
A Good Enough Utility
Although the EAA application is certainly not full featured, it provides bare-bones functionality for quickly archiving messages. EAA functioned perfectly in our tests, and it has archived millions of messages in production environments. The program doesn't include compression capabilities. If you need to compress files, you can use NTFS compression to conserve disk space. Despite the agent's limitations, it might be just the tool you need to archive the messages captured with the Exchange journaling feature.