More often than not, the specific event that triggers a data center outage is unpredictable. Companies spend tens of millions of dollars designing redundant infrastructure and automatic failover systems to compensate for the unpredictable. But that goal is unreachable by definition.
A civil contractor accidentally drove a spike through a power main in Manchester, England, Tuesday morning, cutting off power supply to two of the three buildings on the data center campus operated by UKFast, the British service provider wrote on its website. The facilities’ backup power system failed to do what it was designed to do, and the data centers went dark.
While it took about an hour to get generator power to all the servers, it took engineers until Wednesday morning to restore all client services. Some physical equipment failed as a result of the outage and had to be replaced, and there were snags in getting some software-based infrastructure systems up and running.
UKFast provides a variety of data center services hosted at its Manchester campus, including colocation, cloud, dedicated servers, and managed services. The campus has three two-story data center buildings: MaNOC 4, MaNOC 5, and MaNOC 6 and 7 (a single building that houses two data centers). Each building has data halls on both floors. The two buildings that lost power were MaNOC 5 and the combined MaNOC 6 and 7 building.
The site’s UPS systems worked as designed when utility power was lost. The generators started but failed to synchronize, which the company, in a status update on its website, attributed to the damaged power cable:
“The UPS system supported the load for its designed time and the generators started; however, due to the physical damage to the power cable, service to the site was unstable and intermittent. As a result, the generator sets failed to synchronise and take over service.”
Engineers had to synchronize the generators manually.
Because electrical grids can be unreliable, many data center operators build redundant utility feeds to their sites, often connected to multiple grids. But this kind of redundant infrastructure is expensive, and some operators, especially smaller ones, skip it, betting their facilities’ uptime on the robustness of their backup power systems.
The French cloud service provider OVH saw three of its data centers in Strasbourg go down last month after losing utility power. The company said one of the main reasons for the outage was the lack of dual utility feeds at the site. OVH said that while the dual-feed architecture was a company standard, that particular site was an older one where the standard had not been applied.