In November 2011, I wrote an article about the need to cherish transaction logs and explained why some of the changes that Microsoft made to the Exchange database engine and schema in Exchange 2010 might have lured administrators into a false sense of security when they configured disk storage to hold transaction logs. One response was that you can avoid running out of disk space by using circular logging. This is indeed true, but a number of changes in this area in Exchange 2010 require a new understanding of how circular logging works.
Circular logging is not a new concept for Exchange; it has been a feature of the product ever since Microsoft released Exchange 4.0 in March 1996. Back then (yes, it was in the dark ages), disk space was very expensive relative to today and most servers came equipped with disks measured in hundreds of megabytes. But that was OK because the average size of a corporate mailbox was around 30MB and consumers hadn’t really figured out how to use email.
The basic idea behind circular logging is that the Information Store process uses a constrained set of transaction logs to capture transactions, reusing logs in the set as the transactions contained in the logs are committed into the database. You can quickly see the attraction of circular logging in situations where disk space is tight because you always know that Exchange will never use more than five or six logs in the set and you don’t run the danger of exhausting available space on a disk because of an accumulation of logs. Early versions of Exchange used 5MB transaction logs, so a maximum of 30MB or so would be used for the logs. Current versions of Exchange use 1MB transaction logs.
Circular logging is a property of a mailbox or public folder database. It can be manipulated by editing the properties of a selected database using the Exchange Management Console (EMC) or with the Set-MailboxDatabase or Set-PublicFolderDatabase cmdlets. For example:
Set-MailboxDatabase -Identity DB1 -CircularLoggingEnabled:$True
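To confirm what a database is currently set to, something along these lines should do. This is a minimal sketch that reuses DB1, the example database name from the command above:

```powershell
# Report whether circular logging is enabled for the example database DB1
Get-MailboxDatabase -Identity DB1 | Format-List Name, CircularLoggingEnabled
```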
A transaction is something like a new email arriving or an item being deleted from a mailbox. It’s important to realize that a single transaction log might contain all of the records necessary for several individual transactions. Equally, a transaction might well span several transaction logs. For example, if you create a message and add a 5MB PowerPoint attachment, the Information Store will take at least five transaction logs (on an Exchange 2007 or Exchange 2010 server) to capture all of the data records that constitute the creation of the message and the addition of the attachment. Sending the message will represent another transaction. Exchange never commits a transaction into a database unless all of the records that collectively make up the transaction are available and complete so there’s no danger that log replay will result in a corrupt database.
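The arithmetic behind the "at least five" figure is simple enough to sketch. The variable names here are mine, and the result is a lower bound only because the log records that wrap the attachment data add some overhead:

```powershell
# Lower bound on the number of logs needed to hold a 5MB attachment
$attachmentSize = 5MB   # size of the PowerPoint attachment in the example
$logFileSize    = 1MB   # log size used by Exchange 2007 and Exchange 2010
[math]::Ceiling($attachmentSize / $logFileSize)
```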
Like a lot of things in life that seem good on the surface, a downside exists for circular logging. Transaction logs exist to make sure that transactional data is captured twice. Data from memory is written into transaction logs as buffers fill and is subsequently committed into the database. If the transaction log is subsequently reused and overwritten with new data, the original data now exists in only one place and the transaction log can never be used to replay transactions if the database fails. In the days before Exchange included high availability features, especially when the product supported just one mailbox database, the consequences of a disk failure could be catastrophic. It was therefore deemed to be best practice to never enable circular logging for a production database unless you had a death wish or a desire to explore early unemployment.
Technology has moved on and circular logging isn’t quite the bête noire of Exchange that it once was, largely due to the introduction of continuous replication and multiple database copies in the form of CCR/SCR in Exchange 2007 and the Database Availability Group (DAG) in Exchange 2010. Circular logging is deemed acceptable when you ensure that mailbox databases are sufficiently protected through replication. Best practice for Exchange 2010 is that circular logging can be enabled for any database that has at least three copies (one active, two passive), the logic being that three copies are enough to survive the vast majority of normal outages encountered in production, such as a disk failure or even a complete server going offline. In addition, once mailbox databases are deployed in a DAG, they are protected by the magic of single page patching, a feature introduced in Exchange 2010 that enables the Information Store to “broadcast” a request for the data necessary to patch a corrupt page in an active or passive database copy.
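One way to check your databases against the three-copy guideline is sketched below. It assumes that the DatabaseCopies property returned by Get-MailboxDatabase lists every copy of a database, and the "Copies" column is simply a label I made up for the output:

```powershell
# Find databases that have circular logging enabled but fewer than three copies
Get-MailboxDatabase |
    Select-Object Name,
        @{Name = 'Copies'; Expression = { @($_.DatabaseCopies).Count }},
        CircularLoggingEnabled |
    Where-Object { $_.CircularLoggingEnabled -and $_.Copies -lt 3 }
```

Anything that shows up in the output deserves a second look before you trust circular logging to look after it.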
Circular logging has evolved to take account of new circumstances and Exchange 2010 now employs two distinct modes of circular logging for mailbox databases. The first mode is traditional circular logging, much the same as has been used since Exchange 4.0 and the only type that can be used with public folder databases. The second is referred to as “continuous replication circular logging” or CRCL, the method used by the Information Store when databases are included in a DAG. Why the difference? It’s simply because a DAG is a far more complex environment than a standalone server and the Information Store has to accommodate configurations such as lagged database copies where transaction logs are retained for a set period before they are replayed to update a database copy. In addition, traditional circular logging is a function of the Jet database engine whereas CRCL is managed by the Exchange Replication Service (MSExchangeRepl.exe).
Log truncation, the process whereby unwanted transaction logs are deleted or marked for reuse, requires more management too because Exchange has to ensure that it retains transaction logs until they are not needed by any database copy. Within a DAG, servers use the Exchange Replication Service to update each other with the current status of transaction logs and so are able to determine when it is safe to delete logs because the logs are no longer required for replication. For example, if the servers that host passive copies are down for some reason, the server holding the active copy of a database won't truncate logs. This ensures that every log that might possibly be required to bring a database copy up to date remains available.
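If logs seem to be piling up, the copy status is a reasonable place to start looking. The following is a sketch only, again using DB1 as the example database; a copy that is not healthy or that has a growing copy or replay queue is a likely reason for logs being held back:

```powershell
# Show the state of every copy of DB1, including how many logs are waiting
# to be copied to and replayed on each copy
Get-MailboxDatabaseCopyStatus -Identity DB1 |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength, LastInspectedLogTime
```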
Even though enabling circular logging seems to be a matter of setting a simple property, the switch between traditional circular logging and CRCL requires manual intervention by an administrator. However, this is only required when the first database copy is created in a DAG. If you want to create a copy of a database that has traditional circular logging enabled, you must first disable circular logging, dismount the database, mount it again, and then create the database copy. This process allows Exchange to ensure that all of the data needed to create the new copy is available and that log truncation does not occur while a database copy is being seeded (a small database will seed quickly, but a larger database might take several hours, and all logs must be maintained until the seeding process is complete). Later on, after the new database copy has been fully seeded, you can switch the database back to circular logging by updating database properties. At this point CRCL is used rather than traditional circular logging. The interesting thing is that if you now create another database copy in the DAG, you don’t have to mess around with circular logging settings and dismount the database because CRCL is in use and the Information Store is managing the database accordingly.
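Put together, the sequence for adding the first copy might look something like the sketch below. DB1 is the example database used earlier and MBX2 is a hypothetical name for the DAG member that will host the new copy, so substitute your own names and check each step before running it against production:

```powershell
# 1. Turn off traditional circular logging on the database
Set-MailboxDatabase -Identity DB1 -CircularLoggingEnabled:$False

# 2. Dismount and remount the database so that the change takes effect
Dismount-Database -Identity DB1 -Confirm:$False
Mount-Database -Identity DB1

# 3. Create the first copy on another DAG member (MBX2 is a hypothetical server name)
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX2

# 4. Wait until seeding is complete and the new copy is healthy
Get-MailboxDatabaseCopyStatus -Identity DB1 | Format-List Name, Status, CopyQueueLength

# 5. Re-enable circular logging; with copies in place, CRCL is used from here on
Set-MailboxDatabase -Identity DB1 -CircularLoggingEnabled:$True
```

Steps 4 and 5 are deliberately separate: the last command should only be run once the status output shows that the new copy has finished seeding.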
The evolution of circular logging is a good example of how a feature that has been around for years has been subtly updated to accommodate the needs of high availability. It’s nice to see that some technology ages better than other things, including myself!
Follow Tony's ramblings on Twitter @12Knocksinna