Like everyone else who saw the hurricane-spawned devastation in Louisiana, Mississippi, and Alabama, I was stunned by the sheer scale of the destruction. Of course, my heart went out to the victims, but as I watched the TV coverage, other thoughts also kept running through my head. As a resident of the hurricane-vulnerable part of the country, I was thinking what everyone on the Atlantic—from southern Virginia down to the Gulf coast—was probably thinking as they watched Katrina’s march up the Mississippi: “How long before something like that comes to my town?”
Which is why disaster recovery is on my mind.
I speak to large groups around the world, and when I get the chance, I like to ask the question, “How many of you have a disaster-recovery plan?” Typically, 20 to 40 percent of the crowd raises their hands. Next, I ask, “How many of you have ever gone in on a Saturday with a few new machines and a handful of CD-ROM discs and backup tapes to try it out?” Invariably, most of the hands go down, leaving perhaps 5 percent of the crowd with their hands up.
That always scares me. I once read that 99 percent of the wealth of the nation is stored in bits. Wipe out those bits, and you wipe out the wealth. Wipe out the wealth, and you wipe out the economy. But we needn’t talk about a national disaster. Think about what would happen to a business—large or small—that lost all its data. Lose the customer list, and the business instantly becomes a startup, all over again. Lose the bookkeeping information, and it has no idea who it’s paid or who needs to pay it. Facing such a disaster, most concerns would just fold their tents and steal off into the night. The “night” of unemployment, that is.
How about the individual? Imagine losing all your data. How would you do your taxes without your checkbook information? How would you recover your lost photos that exist only as JPGs on a now-dead hard disk?
Many people don’t bother setting up a disaster-recovery plan because it seems difficult. And sure, it’s possible to spend a ton of cash on a disaster-recovery scheme. But if you don’t have a disaster-recovery plan yet, here are a few thoughts.
First, of course, you need to back up your data. Everyone does that, at least (I hope) on their servers, so this advice isn’t groundbreaking. But keep some of those backups somewhere else. You don’t need to rent space in a cave in Kentucky to store your offsite backup media. But think about taking a tape home once a week, provided your firm trusts you to do that sort of thing. If not, put a safe in a branch office and periodically ship backups to that location. Another suggestion is to grab an old laptop, put Windows Server 2003 or Windows 2000 Server on it, hook it up to DSL and a VPN connection, and make it an Active Directory (AD) domain controller (DC). That way, if the office building falls into a hole in the earth, a copy of the directory survives somewhere else. Believe me, rebuilding an AD infrastructure with at least one remaining DC is much easier than doing it when all the DCs are destroyed. (Again, you’d have to have the trust of your company, and you’d have to worry about physical security if you did this.)
Second, test the backups now and then. You’d be surprised how difficult some backup systems are to use. Worse, they might be buggy. Backup software gets plenty of exercise backing up but not much exercise restoring, so you might not discover that the restore side is buggy until it’s too late. I’m a huge fan of external USB or FireWire drives as repositories of extra, “just in case” backups. You can quickly walk them from machine to machine, collecting essential data, and when you’re done you’ve got data files that aren’t compressed in some bizarre format. They’re ordinary files on a regular old NTFS volume.
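One way to make a restore test concrete is to restore to a scratch location and compare the result, file by file, against the data it was supposed to capture. The following is a minimal Python sketch of that idea, not a feature of any particular backup product. The function names (`file_digest`, `tree_digests`, `compare_restore`) are made up for illustration, and the sketch assumes that both the original data and the restored copy are reachable as ordinary directory trees.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_restore(original_root, restored_root):
    """Compare a restored tree with the original.

    Returns three sorted lists of relative paths:
    missing (in the original only), changed (contents differ),
    and extra (in the restored copy only).
    """
    original = tree_digests(original_root)
    restored = tree_digests(restored_root)
    missing = sorted(set(original) - set(restored))
    extra = sorted(set(restored) - set(original))
    changed = sorted(
        name
        for name in set(original) & set(restored)
        if original[name] != restored[name]
    )
    return missing, changed, extra
```

Calling `compare_restore` on the live data directory and the scratch directory you restored into gives three lists, and all three should be empty for files that have not changed since the backup ran. Files that were legitimately edited after the backup will show up as changed, so the comparison is most meaningful against data that has been quiet in the meantime.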
Third, don’t think of a disaster as something that happens only to other people. Don’t think of it as a low-probability scenario. Sure, airliners crashing into buildings and Category 4 hurricanes are horrible, unusual events. But roof leaks pouring water into server rooms aren’t terribly uncommon. Neither are squirrels crawling into tight spots and electrocuting themselves while taking out a power source, or clogging essential ventilation pipes. Well-intentioned administrators who think they’re formatting the hard disk on one server but are actually formatting the hard disk on a different, critical server (“Dang, I always forget to check the lights on the KVM switch”) are, sadly, not rare.
Fourth, keep a “system change” notebook. When you installed Microsoft Exchange Server two years back, what questions did the Setup wizard ask, and how did you answer them? Better yet, get good at scripting installations, and build scripts to rebuild your systems. Try them out on virtual machines in VMware. The snapshot feature is a great way to quickly test installation scripts. (You ask, “What about Virtual PC?” I’d love to suggest it, but the license forbids people from installing server OSs on Virtual PC virtual machines. It’s a shame.)
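The notebook doesn’t have to be paper. As a rough sketch, here is one way to keep it as a plain text file with one JSON record per line, which is easy to back up and easy to search later. Everything in it is an assumption made for illustration: the function names `record_change` and `history_for`, the field names, and the one-record-per-line layout are not part of any Microsoft tool.

```python
import json
from datetime import datetime, timezone


def record_change(log_path, server, component, answers, note=""):
    """Append one change record to the notebook file and return it.

    answers is a dict of the questions a setup program asked and
    the values that were entered.
    """
    entry = {
        "when": datetime.now(timezone.utc).isoformat(),
        "server": server,
        "component": component,
        "answers": answers,
        "note": note,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def history_for(log_path, server):
    """Return every recorded change for one server, oldest first."""
    with open(log_path, encoding="utf-8") as f:
        entries = [json.loads(line) for line in f if line.strip()]
    return [e for e in entries if e["server"] == server]
```

After an installation you would call something like `record_change("changes.jsonl", "mail01", "Exchange setup", {"install_path": "D:\\Exchsrvr"})`, where the server name and path are invented examples. When the day comes to rebuild that machine, `history_for("changes.jsonl", "mail01")` lists what was done to it and in what order. Keep a copy of the file offsite along with the backups.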
If your job responsibilities include the words “server” or “user,” you’ve got plenty on your plate. But go back and take a look at the pictures from the Mississippi coastal towns that took the Katrina storm surge right in the teeth. Imagine yourself in the shoes of the systems administrators from those towns, and think of how many of them are muttering, “I wish I’d set up and tested a good disaster-recovery plan.” And take some time to put together some anti-mutter insurance.
To my readers on the Gulf coast, I pray that you all came out in one piece, or have the good fortune to be able to put the pieces back together. Good luck to you all.