Backup and recovery implementations have always been driven by five key criteria: time, speed, size, certainty, and cost. How long does it take to complete a backup operation, and how long must data be retained? How fast can lost data be restored? How much data must be backed up? How certain can you be that backup operations have succeeded and data can actually be restored? And, finally, how much will it cost? Over the past 2 years, the emergence of continuous data protection (CDP) has considerably raised the stakes and expectations for backup and recovery infrastructures.
Decline of Traditional Backup
For a long time--before the Web began to affect business operations--large companies would perform backups nightly. Smaller companies would often back up their data once a week or even less.
As Web-based operations and a new breed of enterprise-level applications emerged, the traditional backup scenario began to break down. As companies stayed "open for business" more hours of the day and generated more data, the backup window kept shrinking even as more information had to be managed. Some companies simply couldn't back up their data in the allotted time. Moreover, once-a-day backup was seen as inadequate. Too much data was generated each day to risk losing it if a failure occurred between backup operations.
The industry's first response was to offer snapshot technology, which began to command widespread attention with the release of the EMC TimeFinder product family. Instead of relegating backup to a single daily operation, major storage vendors began to offer the ability to take snapshots of data periodically throughout the day. If data was lost or corrupted, the system could be rolled back to a point at which its integrity was still intact.
CDP: The Next Step in Backup and Restore
CDP is the next step in that evolutionary process. "When CDP was first talked about a year ago, the idea was just to take a lot more snapshots," said Rick Walsworth, vice president of marketing at Kashya, which has released a CDP appliance tuned for the Oracle and SQL Server markets.
That approach has its limits. Storing multiple data snapshots requires a lot of storage capacity, which is often expensive. Moreover, you might not have a valid snapshot of the data from before the corruption occurred, and even if you do, restoring a file to its precorruption state loses all changes made after the snapshot was taken. Finally, taking snapshots in very rapid succession might cause performance problems.
CDP moves beyond snapshots. But, as with many new technologies, hype and fact about CDP are intertwined. A number of smaller companies have barreled into the CDP sector. Some of those companies, capitalizing on the market's growing enthusiasm for CDP, are positioning their technology as delivering CDP even if that isn't exactly the case.
CDP comes in three basic flavors. The most rudimentary approach provides block-level replication and logs all changes to the disk image as they occur. This process has been described as mirroring that can be rolled back. Since it doesn't store directory or file information, this CDP type is most useful for full-restore applications.
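The "mirroring that can be rolled back" idea can be sketched in a few lines of Python. This is only an illustrative toy, not any vendor's implementation: the class name BlockJournal, its methods, and the use of a dictionary as the disk image are all assumptions made for the example, and writes are assumed to arrive in time order.

```python
from dataclasses import dataclass, field


@dataclass
class BlockJournal:
    """Toy block-level CDP: a base disk image plus a log of every block write.

    Hypothetical example class; assumes writes are logged in time order.
    """
    base: dict = field(default_factory=dict)   # block number -> block contents
    log: list = field(default_factory=list)    # (timestamp, block number, contents)

    def write(self, timestamp, block, data):
        # Each change is appended to the log as it occurs; nothing is overwritten.
        self.log.append((timestamp, block, data))

    def image_at(self, timestamp):
        # Rebuild the whole disk image as of the requested time by replaying
        # logged writes on top of the base image.
        image = dict(self.base)
        for ts, block, data in self.log:
            if ts > timestamp:
                break  # safe only because the log is in time order
            image[block] = data
        return image


journal = BlockJournal(base={0: b"boot", 1: b"data-v1"})
journal.write(10.0, 1, b"data-v2")
journal.write(20.0, 1, b"corrupted")
restored = journal.image_at(15.0)  # image as it stood just before the bad write
```

Note that the journal knows only block numbers, not files or directories, which is why this style of CDP lends itself to restoring a full image rather than an individual file.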
The second flavor, and most common approach to CDP, takes place at the file level. This CDP solution logs changes to files as they occur. Individual files can then be restored. File-level CDP, however, doesn't collect information about database logs or operations, and it might restore transactional systems to an inappropriate point in time.
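A rough Python sketch of file-level logging follows. It is a simplified illustration under stated assumptions: FileVersionLog and its methods are hypothetical names, whole-file contents are stored for each change, and changes to a given file are assumed to be recorded in time order.

```python
from bisect import bisect_right


class FileVersionLog:
    """Toy file-level CDP: keeps every recorded version of each file.

    Hypothetical example class; assumes per-file changes arrive in time order.
    """

    def __init__(self):
        self._times = {}     # path -> list of change timestamps
        self._versions = {}  # path -> list of file contents, same order

    def record(self, path, timestamp, content):
        # Log the change to this file as it occurs.
        self._times.setdefault(path, []).append(timestamp)
        self._versions.setdefault(path, []).append(content)

    def restore(self, path, timestamp):
        # Return the newest version of one file at or before the given time,
        # or None if the file had no recorded version yet.
        times = self._times.get(path, [])
        idx = bisect_right(times, timestamp)
        return self._versions[path][idx - 1] if idx else None


log = FileVersionLog()
log.record("report.doc", 10.0, "draft 1")
log.record("report.doc", 20.0, "draft 2")
previous = log.restore("report.doc", 15.0)  # the version in effect at time 15
```

Because each file is restored independently, nothing in this scheme guarantees that a database's data file and its transaction log are brought back to mutually consistent states.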
The third and most involved approach to CDP is called application-aware CDP. This type of CDP technology understands how the application works. It applies timestamps to specific I/O and application events, creating, in effect, intelligent benchmarks. If something untoward happens to data, it can be rolled back to a point in time when the data was last consistent according to the application.
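The rollback rule described above can be approximated in Python as follows. This is a minimal sketch, not a description of any shipping product: AppAwareJournal, mark_consistent, and the key/value model of application data are assumptions introduced for illustration, and writes are assumed to be logged in time order.

```python
from dataclasses import dataclass, field


@dataclass
class AppAwareJournal:
    """Toy application-aware CDP: logged writes plus consistency markers.

    Hypothetical example class; assumes writes are logged in time order.
    """
    writes: list = field(default_factory=list)   # (timestamp, key, value)
    markers: list = field(default_factory=list)  # times the application was consistent

    def write(self, timestamp, key, value):
        self.writes.append((timestamp, key, value))

    def mark_consistent(self, timestamp):
        # Called when the application reports a consistent state,
        # for example after a transaction commits.
        self.markers.append(timestamp)

    def last_consistent_point(self, timestamp):
        candidates = [m for m in self.markers if m <= timestamp]
        return max(candidates) if candidates else None

    def restore(self, timestamp):
        # Roll back to the last marker at or before the requested time,
        # rather than to the raw timestamp itself.
        point = self.last_consistent_point(timestamp)
        state = {}
        if point is None:
            return state
        for ts, key, value in self.writes:
            if ts <= point:
                state[key] = value
        return state


journal = AppAwareJournal()
journal.write(1.0, "balance", 100)
journal.mark_consistent(2.0)
journal.write(3.0, "balance", 50)   # first half of a transfer
journal.write(4.0, "savings", 50)   # second half of the transfer
journal.mark_consistent(5.0)
journal.write(6.0, "balance", -999)  # corruption
mid_transfer = journal.restore(3.5)  # falls back to the marker at 2.0
```

Restoring to time 3.5 in this example yields the state at the marker from time 2.0, so the half-completed transfer is never exposed.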
Vendors take several different approaches in their application-aware CDP products. For example, Kashya offers an appliance that sits outside the data path and monitors write activity. Other approaches monitor both read and write activity. For instance, FalconStor Software offers a software-only CDP solution.
Despite the confusion surrounding CDP, the technology has attracted considerable attention. According to a recent survey by the Enterprise Strategy Group, 63 percent of respondents indicated that they were familiar with the concept.
At this point, CDP seems to be most appropriate for email and transactional applications. Nonetheless, the idea of continuous protection is very attractive, and its use is likely to broaden over time as IT pros gain a better understanding of CDP features and how the technology can benefit them.