The Information Store (IS) and Directory form the core of Microsoft Exchange Server. Systems administrators expend a great deal of effort to protect these databases. Administrators use RAID 5 or RAID 0+1 disk configurations to protect data; perform frequent backups; and carefully plan the stores' location on their servers' available disk volumes. I applaud systems administrators who take these precautions—the IS and Directory need as much protection as possible. But administrators mustn't limit data protection for Exchange Server to the databases. The databases' transaction logs (the IS and Directory each have a set) need protection too.
The IS and Directory use a write-ahead transactional model to protect data. Before Exchange Server commits data to one of the stores, it writes the data to a log file. Exchange Server logs every IS and Directory transaction from every source. After a transaction occurs, Exchange Server records it in the current log and writes the modified database pages to a memory cache. When system load permits, Exchange Server commits the pages in the cache to the appropriate store.
Exchange Server manipulates database pages in the cache to adjust the order of transactions and optimize the process of committing transactions to the database. Page caching also improves performance: When a transaction uses an already-cached page, Exchange Server can update the page in memory rather than generating additional disk operations. You might not think transactions often reference cached pages, but when an inbox receives multiple messages over a short time, the first transaction caches inbox data and subsequent transactions access inbox data in memory.
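The log-first, cache-second, commit-later sequence is easier to see in a toy model. The sketch below is an illustration only, not Exchange Server's engine: the ToyStore class, its method names, and the "inbox-page" key are all hypothetical. It shows three messages arriving for one inbox, producing three log records but only one page write to the database.

```python
class ToyStore:
    """Toy model of write-ahead logging with a page cache (hypothetical names)."""

    def __init__(self):
        self.log = []          # append-only transaction log
        self.cache = {}        # modified pages held in memory: page_id -> items
        self.database = {}     # pages committed to the store
        self.disk_writes = 0   # number of page writes to the database

    def apply(self, page_id, item):
        # 1. Record the transaction in the log before touching the store.
        self.log.append((page_id, item))
        # 2. Update the page in memory; load it from the database only once.
        page = self.cache.setdefault(page_id, list(self.database.get(page_id, [])))
        page.append(item)

    def flush(self):
        # 3. When load permits, commit the cached pages to the database.
        for page_id, page in self.cache.items():
            self.database[page_id] = page
            self.disk_writes += 1
        self.cache.clear()


store = ToyStore()
for n in range(3):
    store.apply("inbox-page", f"message {n}")  # three messages, one cached page
store.flush()
```

In this model the second and third messages update the cached page in memory, which is the effect the inbox example above describes.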
Exchange Server maintains data integrity by writing every transaction to a log file; Exchange Server never commits a transaction to a database unless the transaction is complete. Before Exchange Server can commit a transaction, it must perform a series of operations. For example, when a new message arrives, the IS must locate the rows for the recipient inboxes in the folders table, locate the table that contains the target inbox data, and update both tables. If the IS can't complete these steps (e.g., because a software or hardware failure occurs), it regards the transaction as incomplete and won't commit the transaction to the database. Because transaction logs are so important to Exchange Server's data integrity, you need to use RAID and backups to protect not only the IS and Directory, but also the IS and Directory transaction logs.
Understanding Transaction Logs
Exchange Server transaction logs are always 5MB. When the current transaction log file—edb.log—fills, Exchange Server renames the log edbxxxxx.log (the xxxxx portion of the name is a hexadecimal number starting with 00001) and creates a new edb.log file to accept transactions. (Microsoft calls each Exchange Server transaction log a generation.)
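The naming scheme can be sketched in a few lines. The function names below are hypothetical, and the sketch assumes only what the text states: the prefix edb, a five-digit hexadecimal generation number starting at 00001, and the .log extension.

```python
def log_file_name(generation: int) -> str:
    """Return the renamed log's filename for a generation number (1-based)."""
    if not 1 <= generation <= 0xFFFFF:
        raise ValueError("generation must fit in five hexadecimal digits")
    return f"edb{generation:05x}.log"


def log_generation(file_name: str) -> int:
    """Return the generation number encoded in a name such as edb00668.log."""
    name = file_name.lower()
    if not (name.startswith("edb") and name.endswith(".log") and len(name) == 12):
        raise ValueError(f"not a numbered transaction log: {file_name}")
    return int(name[3:8], 16)
```

Because the number is hexadecimal, a log such as edb00668.log is generation 1,640, not 668.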
On servers with heavy traffic, Exchange Server can generate hundreds of transaction logs per day. You must select a location for these logs carefully. The disk you allocate to hold these files must contain enough space to store the transaction logs until Exchange Server removes them. Exchange Server deletes transaction logs whenever it completes a full backup. If the IS or Directory attempts to create a new transaction log when the transaction log disk has less than 10MB of available space, the IS or Directory shuts down. Exchange Server reserves an additional 10MB of space in two log files—res1.log and res2.log—to protect outstanding data if the logs exhaust the allocated disk space. Exchange Server uses the first 5MB of this disk space to write outstanding transactions to the res1.log file. If Exchange Server needs to write more than 5MB of transactions (which is possible on very large systems), it uses the res2.log file. After Exchange Server uses res1.log and res2.log, the files become typical log files with edb.log names. Exchange Server can't write more than 10MB of data to the reserve files; thus, you need to monitor free space carefully on transaction log disks.
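A rough estimate of how long a log disk will last between full backups follows from the figures above. The sketch below is a simplification with hypothetical function names: it treats every log as exactly 5MB, treats the 10MB threshold as a hard floor, and assumes a constant number of logs per day, which a real server won't produce.

```python
import shutil

MB = 1024 * 1024
LOG_SIZE_MB = 5         # from the article: every log is 5MB
SHUTDOWN_FLOOR_MB = 10  # from the article: the stores shut down below this


def days_until_shutdown(free_mb: float, logs_per_day: float) -> float:
    """Estimate the days of logging left before free space reaches the floor."""
    if logs_per_day <= 0:
        raise ValueError("logs_per_day must be positive")
    usable_mb = max(free_mb - SHUTDOWN_FLOOR_MB, 0)
    return usable_mb / (logs_per_day * LOG_SIZE_MB)


def free_space_mb(path: str) -> float:
    """Return the free space, in MB, on the volume that holds path."""
    return shutil.disk_usage(path).free / MB
```

For example, a log disk with 2,010MB free on a server that generates 200 logs per day has about two days before the stores shut down, so a full backup needs to complete more often than that.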
Screen 1 shows a set of transaction logs. The edb00668.log file is the last log Exchange Server created before it created edb.log. Screen 1 shows that Exchange Server logged these two files at approximately the same time (11:54 a.m.), which indicates that the server was under a reasonably heavy load.
The header information inside a log file contains a hard-coded path to the log's database, a time stamp that shows when Exchange Server created the log, a unique database signature that prevents Exchange Server from replaying transactions to the wrong database, and the transaction log's data. You can use the ESEUTIL utility in Exchange Server 5.5 with Service Pack 1 (SP1) to dump a log file's header and view the header information. For example, you can type
C:\WINNT\SYSTEM32\ESEUTIL /ML <log_edb01969>
To make sure you have the updated version of ESEUTIL in the EXCHSRVR\BIN directory, select eseutil.exe and click File, Properties. If you have the updated version of ESEUTIL, your version number will be 5.5.2232.0. (The original version distributed with Exchange Server 5.5 is 5.5.1960.3.) If you need to replace the file in the EXCHSRVR\BIN directory, you can copy the correct version from the Exchange Server 5.5 SP1 CD-ROM.
When you view the transaction log dump, the checkpoint, date, time stamp, and database signature are easy to find. Other important information in the transaction log dump includes the path and filename of the database the log belongs to. Exchange Server uses the rest of the information to replay log transactions when necessary. For example, Figure 1 shows the log dump for one of my IS transaction logs. This file contains information about a public store and a private store.
Protecting Transaction Logs
Exchange Server writes transactions to the log in sequential order, appending new transactions to the end of the log. These writes generate I/O activity, so the disk you store the transaction log on must support a heavy I/O write load. The disks that contain the IS experience read and write activity when users access items in their mailboxes. The Directory generates a small amount of I/O activity except on servers that host multiple directory-replication connectors.
If your transaction logs have a lot of traffic, you can place the transaction logs on a dedicated hard disk to segregate the I/O activity they generate. Dedicated hard disks don't typically become swamped with transaction logs' I/O requests. If a server generates a large load, the I/O activity to the IS will probably be of more concern than the transaction log disk load. Using a dedicated hard disk also solves the transaction logs' disk-space problem. Most dedicated hard disks are 4GB or larger; free disk space is almost always available on hard disks of this size that you dedicate to logging. If you accumulate 4GB of logs (i.e., about 800 individual 5MB log files), either your server is under a tremendous load or you haven't backed up your IS and Directory recently.
You might not be able to dedicate a hard disk to log files. Even so, the arguments I made for using a dedicated hard disk still apply when you decide where to store transaction logs on your system: You need enough space for log file growth, and you need to monitor I/O activity. Don't place transaction logs on the same hard disk as other files that generate heavy I/O loads (e.g., Windows NT page files). In addition, don't place the transaction logs and the IS or Directory on the same hard disk. If the hard disk that contains the stores fails, you won't be able to access the transaction logs, and you'll lose data.
Mirroring. RAID 5 or RAID 0+1 provides the best protection for the IS and Directory. The transaction logs also need protection, but RAID 5 generates overhead that slows write activity to the logs. RAID 0+1 improves the transaction logs' I/O performance, but this level of protection doubles the amount of disk space the logs require. You can run the transaction logs on an unprotected disk, but any problem that occurs on that disk can render the logs unreadable. If your log disk has a problem and you have to restore a database, you can't replay log transactions into the restored database, and you lose data. I recommend protecting the transaction logs through mirroring. When you consider the amount of money you'd spend to recover data if one of your hard disks failed, and when you factor in the costs of productivity loss and compromised data, you see the importance of protecting your transaction logs.
Circular logging. Many Exchange Server machines serve communities with few users. Companies that migrate from Microsoft Mail or Lotus cc:Mail to Exchange Server, or that deploy Exchange Server as part of Small Business Server (SBS), usually prefer systems that use minimal disk space and require minimal administration. Microsoft designed Exchange Server for the enterprise, but the software also works well with servers that have few users. Circular logging is Microsoft's solution for companies that don't want the Exchange Server stores' transaction logs to use too much disk space. Circular logging saves disk space by reusing transaction log files. Instead of gradually accumulating a set of logs that contain the transactions that have occurred since the last full online backup, circular logging marks a log file for reuse after Exchange Server commits transactions to the database. Typically, circular logging uses no more than five or six files or 25MB to 30MB of disk space.
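The reuse behavior can be modeled as a small window of log files that slides forward as the database catches up. The sketch below is a toy model under stated assumptions: the CircularLogSet class and its five-file cap are hypothetical, and it stands in for whatever bookkeeping Exchange Server actually uses to decide when a log is safe to reuse.

```python
from collections import deque


class CircularLogSet:
    """Toy model of circular logging (hypothetical names and file cap)."""

    def __init__(self, max_files: int = 5):
        self.max_files = max_files
        self.files = deque()        # generations on disk, oldest first
        self.next_generation = 1
        self.committed_through = 0  # highest generation fully in the database

    def commit_through(self, generation: int) -> None:
        """Record that all transactions up to this generation are committed."""
        self.committed_through = max(self.committed_through, generation)

    def roll_log(self) -> None:
        """Start a new log, reusing the oldest file if it is fully committed."""
        oldest_is_committed = bool(self.files) and self.files[0] <= self.committed_through
        if oldest_is_committed and len(self.files) >= self.max_files:
            self.files.popleft()    # the old transactions are gone for good
        self.files.append(self.next_generation)
        self.next_generation += 1


logs = CircularLogSet()
for _ in range(20):
    logs.roll_log()
    # Assume the database has caught up with everything except the current log.
    logs.commit_through(logs.next_generation - 2)
```

After 20 generations the model holds only generations 16 through 20. Generations 1 through 15 no longer exist, so nothing recorded only in those logs could be replayed into a restored database.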
Circular logging sounds like the perfect solution to Exchange Server logs' disk-space consumption. The benefits are obvious: a reduced disk-space requirement and no need to monitor transaction log disk space. However, Exchange Server keeps logs so you can recover transactions when you have to restore a database from a backup. If Exchange Server reuses the logs, you can't recover the old transactions. If you use circular logging, you might lose data when you restore an Exchange Server database.
I can argue a coherent case for using circular logging on small systems. If you run Exchange Server on SBS, you probably don't support communities with many users, so your transaction load is light. Also, small servers don't typically require the amount of administrative support that larger servers do; thus, you might not have time to worry about transaction logs filling up disks. However, you need to disable circular logging on servers that create more than five transaction log files per day. To disable circular logging, select a server in Exchange Administrator and click File, Properties. Screen 2, page 123, shows the check boxes that control circular logging for the IS and Directory. If your Exchange Server machine has circular logging enabled, clear the check boxes and restart the IS and Directory to activate the change (i.e., disable circular logging).
Soft and Hard Database Recoveries
The Extensible Storage Engine (ESE), Exchange Server 5.5's database engine, performs a soft recovery each time the IS or Directory starts. Exchange Server maintains a checkpoint file (edb.chk) for each store. When the IS or Directory starts, Exchange Server reads the database's checkpoint file and determines whether it must commit any transactions in the store's log files to the database. Exchange Server replays these transactions, and the store completes its startup.
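The decision the store makes at startup can be sketched as a filter over the logs on disk. This is a simplified illustration, not the ESE algorithm: it assumes the checkpoint can be represented as a single log generation number, and the function name is hypothetical.

```python
def logs_for_soft_recovery(checkpoint_generation: int, generations_on_disk: list[int]) -> list[int]:
    """Return, in replay order, the log generations at or after the checkpoint.

    Assumption: everything in logs older than the checkpoint is already
    committed to the database, so those logs need no replay.
    """
    return sorted(g for g in generations_on_disk if g >= checkpoint_generation)
```

If the checkpoint points at generation 0x7b1 and the disk holds generations 0x7af through 0x7b2, only 0x7b1 and 0x7b2 need replaying. An empty result means the database is already current and the store can start immediately.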
Automatically checking for transactions that the database is missing ensures that when the stores start, they are consistent with the log files. Screen 3, page 123, shows the application log's record of Exchange Server replaying the contents of the edb007b1.log file to the IS. The time that replaying a transaction log requires varies from server to server and depends on CPU speed, disk I/O subsystem capacity, and the server's current workload. On small Pentium systems, Exchange Server processes log files in 15 seconds or less.
You use the IS's or Directory's transaction logs for a hard recovery after Exchange Server restores a database from a backup tape. Consider this scenario: You experience a disk failure or the IS reports Jet page errors. You decide to restore the database from a backup tape. After you restore the database, you discover that the database is missing transactions; those transactions are available in the logs. If you restore the logs Exchange Server created after the backup finished and make them available before the IS restarts, you will force Exchange Server to replay the transactions. Exchange Server will access each transaction log, verify whether the log contains data that Exchange Server needs to replay, and update the database. Exchange Server must have all the transaction logs for this process to complete successfully. An individual transaction might span two or more logs. For example, a message with a 12MB attachment extends across three logs. If one of the three logs is missing, part of the transaction won't be available. If Exchange Server encounters a missing log during replay operations, Exchange Server stops applying transactions, and you lose data.
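Before you start a hard recovery, it helps to know how far replay can get with the logs you have. The sketch below uses hypothetical function names and assumes that log generations are consecutive integers and that replay halts at the first gap, as described above. It also shows why a 12MB attachment needs at least three 5MB logs.

```python
import math

LOG_SIZE_MB = 5  # from the article: every log is 5MB


def replayable_logs(first_needed: int, available: list[int]) -> list[int]:
    """Return the unbroken run of generations that replay can apply."""
    have = set(available)
    run = []
    generation = first_needed
    while generation in have:
        run.append(generation)
        generation += 1
    return run


def min_logs_spanned(transaction_mb: float) -> int:
    """Return the fewest logs a transaction of this size can occupy."""
    return math.ceil(transaction_mb / LOG_SIZE_MB)
```

With generations 0x7b1, 0x7b2, and 0x7b4 on hand, replay applies only the first two. The missing 0x7b3 stops the process, and the transactions in 0x7b4 are lost even though that file is intact.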
Transaction logs are important. You must protect these logs in much the same way you protect the IS and Directory. Only foolhardy administrators ignore the finer details of transaction log management—the contents of these logs might save them one day. A hardware failure that reduces a database to a collection of useless bytes is horrible, but if the transaction logs that contain your data no longer exist, that hardware failure becomes much more serious.
Corrections to this Article:
- "Exchange Server Transaction Logs" incorrectly stated that the Eseutil tool is in the \exchsrv\bin directory. For Exchange Server 5.5 and later versions, you find this utility in the \windows\system32 directory.