What Are Disaster Recovery and Business Continuity?
The terms disaster recovery and business continuity are often used interchangeably, but they refer to two different sets of processes. Both are about getting a business back on its feet after a disruptive incident.
Disaster recovery (DR) is the process of restoring essential computer-based systems and data after a disruption to normal business operations. Disruptive events can come in many forms, including natural disasters (e.g., a hurricane) and security incidents (e.g., a ransomware attack). A disaster recovery plan details how an organization will respond to disruptions so that users can resume access to critical systems.
Business continuity (BC), meanwhile, uses disaster recovery as a foundation but goes beyond restoring IT systems. It aims to reestablish operations across all essential business functions, including personnel, venues and communications systems. BC activities might include contacting emergency personnel, setting up temporary offices, providing transportation for employees or arranging for them to work from home, and rebuilding links with supply chain partners and customers.
How Do Disaster Recovery and Business Continuity Work?
Both disaster recovery and business continuity require extensive planning and analysis of an organization’s business processes. Planning efforts have two main goals:
- Ensure that all required critical applications and processes can be recovered
- Avoid overestimating and thereby overprovisioning recovery resources
These goals can be achieved by performing a risk assessment and a business impact analysis. The risk assessment should determine the events that would most likely cause a disruption in company operations. Disruptions may be caused by natural disasters, such as hurricanes, tornadoes or flooding; human error (e.g., accidental data deletions or damage to systems); malicious attacks on physical installations; or cyberattacks that target systems or data.
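One common way to rank the events a risk assessment turns up is to score each by estimated likelihood and impact and sort by the product. The sketch below is illustrative only: the event names echo the examples above, but the `RiskEvent` type, the 1-to-5 scales and every score are invented placeholders, not figures from a real assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEvent:
    name: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (frequent)
    impact: int      # hypothetical scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        # Simple likelihood-times-impact score; other weightings are possible.
        return self.likelihood * self.impact


def rank_risks(events):
    """Return the events sorted from highest to lowest risk score."""
    return sorted(events, key=lambda e: e.score, reverse=True)


# Placeholder scores for illustration only.
events = [
    RiskEvent("flooding", likelihood=2, impact=5),
    RiskEvent("accidental data deletion", likelihood=4, impact=2),
    RiskEvent("ransomware attack", likelihood=3, impact=5),
]

for event in rank_risks(events):
    print(f"{event.name}: {event.score}")
```

A ranking like this is only a starting point for discussion; the scores themselves come from the organization's own analysis.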
Once an organization identifies its critical hardware/software assets and data repositories, its DR plan must address the required resources for restoring those systems, via a recovery site. The size and scope of the recovery site will depend on the number of systems and the amount of data an organization needs for baseline operations.
The risk assessment is also important for developing the business continuity part of the recovery plan. The risk assessment will identify the types of risks an organization faces, which can determine response strategies. For example, if flooding has been identified as a likely risk for an organization, the BC plan would consider how to move personnel to recovery sites if roads become impassable.
Another key outcome of the analysis and planning effort is the decision of just how much of the company’s total infrastructure and business operations should be recovered and when. If the IT footprint is modest and the business processes are not very complex, it may be possible to restore all operations during the initial recovery efforts. However, for larger IT environments or multifaceted businesses, it may be necessary to prioritize recovery activities. For example, an organization would restore the most critical elements first, then move on to lower-priority systems and processes. With this tiered approach, DR/BC activities could require several days or even weeks before a full recovery is realized.
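The tiered approach can be pictured as grouping systems into recovery waves. This is a minimal sketch under stated assumptions: the system names and tier numbers are hypothetical, and the function `build_recovery_waves` is an illustrative helper, not part of any DR product.

```python
from collections import defaultdict


def build_recovery_waves(systems):
    """Group system names by tier; lower tier numbers are restored first."""
    waves = defaultdict(list)
    for name, tier in systems:
        waves[tier].append(name)
    # Sort the tier numbers so the most critical wave comes first.
    return [waves[tier] for tier in sorted(waves)]


# Hypothetical inventory: (system name, recovery tier).
systems = [
    ("order processing", 1),
    ("internal wiki", 3),
    ("email", 2),
    ("customer database", 1),
]

for number, wave in enumerate(build_recovery_waves(systems), start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

In practice the tier assigned to each system would come out of the business impact analysis rather than being hard-coded.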
Establishing the Recovery Site
After developing a DR/BC plan, the next steps involve putting all the pieces in place at the secondary or recovery site. The secondary site may be a facility owned by the company, a co-location site or capacity provided by a cloud service. (See Disaster Recovery and Business Continuity Implementation Options below for more details.)
The equipment at the recovery site must be sufficient to handle the processing requirements and data capacities as defined in the DR/BC plan. The list of gear will likely include the following:
- Servers, including virtual servers
- Storage systems (that may have to be identical to storage arrays in the primary data center if certain types of data replication will be used)
- Data protection software that will run at the primary site and send data to the recovery site (may include continuous data protection applications, replication or any other form of data backup)
- Application software to reboot the business after a disruption (note that it may be necessary to secure additional licenses of critical applications)
- Operating systems and virtualization software
- Communications suitable for rapidly getting data to the recovery site and for accessing applications once they’ve been recovered
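Whether a communications link is "suitable" for getting data to the recovery site can be roughed out with simple arithmetic. The sketch below estimates how long a given volume of data takes to cross a link; the function name, the 80% efficiency default and the sample figures are assumptions for illustration, and real throughput depends on protocol overhead, latency and competing traffic.

```python
def transfer_seconds(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the seconds needed to send data_gb gigabytes over a link.

    Uses decimal units (1 GB = 8,000 megabits). `efficiency` is an assumed
    fraction of the nominal link speed that is actually usable.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    megabits = data_gb * 8 * 1000
    return megabits / (link_mbps * efficiency)


# Hypothetical example: 500 GB of daily changes over a 1 Gbps link.
hours = transfer_seconds(500, 1000) / 3600
print(f"Estimated transfer time: {hours:.2f} hours")
```

If the estimate exceeds the window available for replication, the plan needs a faster link, less data to move or a different data protection method.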
DR/BC Documentation and Testing
Detailed documentation is essential for DR/BC. The documentation should contain enough detail so that personnel not involved in the planning process can carry out the recovery plans. The documentation should include lists of personnel who are expected to participate in the recovery, along with a well-defined notification system to ensure that all involved know what’s going on and efforts are well coordinated.
Documentation should also specify who in the organization can declare a disaster, what needs to be done to make the recovery site operational and how recovered systems will be certified for use.
Business continuity relies on DR for systems recovery. However, since it addresses all aspects of doing business, BC planning and documentation may be more complex. BC documentation should outline how personnel will be protected if necessary, how they will be notified (e.g., using emergency notification systems or call lists), and how they will be relocated to alternative facilities or accommodated so that they can work remotely.
DR/BC plans should be tested regularly -- at least twice a year, but more often if possible. Testing can include tabletop exercises or “live” tests. Testing should indicate any procedures that require correction or improvement, and organizations should update documentation accordingly.
What Are the Benefits of Disaster Recovery and Business Continuity?
In the aftermath of a disruption, the benefits of a rapid restoration of business functions are easy to quantify (e.g., minimized loss of transactions and revenues). Quickly getting up and running again will also demonstrate a company’s diligence and foresight, which can build confidence among customers and the public.
In addition, a fast and successful recovery will safeguard critical company data and potentially sensitive customer data. It will minimize the amount of time that the data may be vulnerable to malicious attacks.
What Are the Drawbacks of Disaster Recovery and Business Continuity?
The main perceived drawback to DR/BC is cost. A DR/BC plan and the required additional infrastructure are effectively an insurance policy against business failure. They may never be needed, but they still incur ongoing costs for software, hardware, alternative venues, personnel and so forth.
DR/BC testing can also be time-consuming and expensive to perform. It may involve many company employees, which can temporarily interfere with normal business activities.
Disaster Recovery and Business Continuity Implementation Options
Generally, there are four scenarios for DR/BC recovery facilities planning.
1) Company-Owned Recovery Site
- In this scenario, the organization owns secondary sites and equipment. The secondary site may be an active data center, in which case the primary and secondary data centers will back up each other.
- Data from the primary site must be replicated (or copied in some other manner) frequently to the secondary site.
- The secondary site must be far enough away from the primary data center location to avoid the same climate/geographic threats.
- It may be prohibitively expensive to maintain a company-owned recovery site.
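As a rough first check on the separation between sites, the great-circle distance can be computed from their coordinates with the haversine formula. This is a sketch only: the function names are illustrative, any minimum-distance threshold is a hypothetical input, and distance alone does not prove that two sites avoid the same threats (both could sit on the same flood plain or storm track).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def sites_far_enough(primary, secondary, min_km):
    """Return True if the two (lat, lon) sites are at least min_km apart."""
    return great_circle_km(*primary, *secondary) >= min_km
```

A call such as `sites_far_enough((lat1, lon1), (lat2, lon2), min_km=...)` would use a threshold the organization chooses from its own risk assessment.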
2) Co-location With Secondary Site Space Rented From a Service Provider
- A co-location recovery site may be managed by the company or contracted to the colo or DR provider.
- The company may own the equipment at the colo space or lease it from the colo or DR provider. (If the latter, equipment may not be in place until a disruption occurs, so the provider is responsible for timely response and setup.)
3) Disaster Recovery as a Service (DRaaS)
- The DRaaS provider is the secondary site that stores data and application backups.
- DRaaS clients may choose to use the DRaaS provider to recover just data, or they can resume operations by spinning up virtual servers available from the DRaaS provider.
- Because DRaaS makes extensive use of virtualization, it is a flexible approach that can be used by all sizes of companies.
- DRaaS is relatively inexpensive compared to alternatives, with the potential for significant cost savings.
4) Hybrid Approach That Combines In-House, Colo and Cloud Resources
- A hybrid approach that uses more than one type of secondary site and recovery process may be appropriate for companies with multiple sites. It can also factor in different kinds of disruptions and respond to each accordingly.
- Hybrid may provide a good way to employ a tiered approach for the restoration and recovery process.
Conclusion: DR/BC Demands Careful Planning and Testing
In a global marketplace with round-the-clock operations for many companies, weathering a disaster may be a critical factor in helping an organization remain competitive.
The time and costs related to establishing a DR/BC plan, along with its periodic testing, must be weighed against potential losses (including less tangible losses such as damage to the company’s reputation). The potential losses that would result from a disaster should inform the level of protection that DR/BC plans will strive for.
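One simple way to frame that trade-off is to compare the yearly cost of the plan with the expected yearly loss it avoids. The sketch below uses the standard idea of an annualized loss (cost of one incident multiplied by expected incidents per year); the function names and all figures are hypothetical placeholders, and intangible losses such as reputational damage are not captured by the arithmetic.

```python
def annualized_loss(single_loss: float, annual_rate: float) -> float:
    """Expected yearly loss: cost of one incident times incidents per year."""
    return single_loss * annual_rate


def net_annual_benefit(loss_without_plan: float, loss_with_plan: float,
                       annual_rate: float, annual_plan_cost: float) -> float:
    """Expected yearly loss avoided by the plan, minus the plan's yearly cost."""
    avoided = (annualized_loss(loss_without_plan, annual_rate)
               - annualized_loss(loss_with_plan, annual_rate))
    return avoided - annual_plan_cost


# Placeholder figures for illustration only.
benefit = net_annual_benefit(
    loss_without_plan=2_000_000,  # assumed cost of an unmitigated disaster
    loss_with_plan=200_000,       # assumed cost with a working DR/BC plan
    annual_rate=0.1,              # assumed one disaster every ten years
    annual_plan_cost=120_000,     # assumed yearly DR/BC spend
)
print(f"Net annual benefit: {benefit:,.0f}")
```

A positive result suggests the plan pays for itself on expected losses alone; a negative one may still be acceptable once intangible losses are considered.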