Sometimes a really good thing can have bad consequences.
In August 2012 an errant Configuration Manager Task Sequence took down an entire Australian banking system. At the time, it was the most monumentally negative press ConfigMgr had ever received. In that case, CommBank (Commonwealth Bank of Australia) had outsourced its ConfigMgr operations to HP, which eventually led HP CEO Meg Whitman to make an onsite visit to apologize.
ConfigMgr is a complex system, but at its core it only does what an administrator tells it to do. So, it behooves any organization to take methodical steps when deploying any piece of software using the systems management beast. And it takes a lot of training, which many organizations neglect to provide.
During TechEd 2014, Emory University's IT department prepared and deployed Windows 7 upgrades to the campus's computers. If you've worked with ConfigMgr at all, you know that there are checks and balances that can be employed to ensure that only specifically targeted systems will receive an OS upgrade. In Emory University's case, the checks failed, and instead of going only to the applicable computers, the Windows 7 deployment went to ALL computers, including laptops, desktops, and even servers.
I'll stop for a second to let you take that in.
Yes, even servers. By the time anyone realized exactly what had happened, the Windows 7 task sequence had repartitioned and reformatted the affected machines and installed Windows 7 on them. Emory IT powered off the ConfigMgr server, hoping to stop the deployment, but it was already too late. Even the ConfigMgr server had been repartitioned and reformatted. The way ConfigMgr works, package and deployment instructions are sent from the primary server to other servers in the infrastructure to offload processing. These servers take over, manage the deployment, and eventually report back successes or failures. So, by the time the ConfigMgr server was shut down, the deployment instructions were already in the wild. For years, many have asked Microsoft to add an "all stop" button to ConfigMgr, but ConfigMgr is not a real-time system, and that makes such a feature nearly impossible to develop.
I'm sure Emory IT had that lump-in-the-gut feeling we all get when we've done something we know will have negative repercussions, particularly the kind that take days to rectify. Emory IT moved swiftly to fix the issue, though. The original problem was identified on May 14, and in true IT form, the group had a good handle on the situation by May 16.
So, could this happen to you? Absolutely. ConfigMgr is the market leader in endpoint management and is used by the majority of organizations around the world for managing Windows systems.
What can be done to protect your company against an accident like this? Mike Terrill over at myITforum has started a series on how to minimize the risk of an operating system deployment disaster. You can read Part 1 here:
Emory IT kept the campus up to date through a web page. The original page has since been taken offline, but a cached version is still available here: Windows 7 incident
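To make the idea of a targeting check a little more concrete, here is a minimal sketch in Python of the kind of sanity test an administrator might run against a list of target machines before releasing an OS deployment. It is purely illustrative and does not use any ConfigMgr API: the `Device` record, the `check_os_deployment_targets` function, and the 10 percent threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical inventory record; not a ConfigMgr object."""
    name: str
    os_caption: str              # e.g. "Microsoft Windows Server 2008 R2"
    is_site_server: bool = False


class UnsafeDeploymentError(Exception):
    """Raised when a target list fails the pre-deployment checks."""


def check_os_deployment_targets(targets, total_managed, max_fraction=0.10):
    """Return the target list if it looks safe, otherwise raise.

    Assumed rules for this sketch:
      1. No server operating systems and no site servers in the target list.
      2. The target list may not exceed max_fraction of all managed devices.
    """
    blocked = [
        d.name for d in targets
        if d.is_site_server or "server" in d.os_caption.lower()
    ]
    if blocked:
        raise UnsafeDeploymentError(
            "Servers found in target list: " + ", ".join(blocked)
        )

    if total_managed > 0 and len(targets) / total_managed > max_fraction:
        raise UnsafeDeploymentError(
            f"{len(targets)} of {total_managed} devices targeted; "
            f"limit is {max_fraction:.0%}"
        )

    return list(targets)
```

A check like this would not have to be clever to be useful. The point is simply that a deployment which suddenly includes servers, or a far larger share of the environment than expected, gets stopped and reviewed by a human before anything is sent out.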