Looking to become one of the few IT vendors that put some teeth behind their service level agreements (SLAs), incident management platform provider PagerDuty this week announced a downtime insurance program. Under the program, PagerDuty will pay out up to $3 million to customers that experience outages while using its IT incident management service.
PagerDuty CTO Andrew Miklas said most SLAs only compensate organizations based on a small percentage of the money they invested to acquire the product. Even then, that compensation usually takes the form of credits toward continuing to use the product.
Downtime Insurance, in contrast, represents one of the first efforts to tie a cash payment to an SLA, Miklas said.
“When we are talking about incident response, it’s always about peace of mind,” he said. “In accordance with that we wanted to tie the SLA to an actual business value.”
The insurance is available to any organization that signs up for the Enterprise Plan attached to the PagerDuty service. In the event of downtime, PagerDuty and the customer would jointly comb through PagerDuty logs and assess the damage to the business.
PagerDuty is willing to assume responsibility for any downtime relating to an incident, up to a maximum of $3 million.
By comparison, most other SLAs are essentially toothless, Miklas said. Not only do most SLAs hide behind a best-effort clause buried somewhere deep in a contract, they also don’t result in the provider of the IT service assuming any financial risk.
PagerDuty recently published a survey of 100 business and IT professionals, conducted by Forrester Consulting on its behalf, in which more than half of the respondents said their organization experiences a significant disruption of IT services at least once a week. Worse yet, half the time IT first learns of the disruption from internal employees or external customers.
The study suggests that one reason this occurs so often is that many IT organizations are trying to make sense of six or more IT management tools, each addressing a specific tactical issue. As a result, correlating all that information into something that resembles actionable intelligence is next to impossible.
Having access to an incident-response system essentially creates a framework around which the IT organization can develop discipline, Miklas said. That discipline serves not only to minimize potential downtime, but also to keep the rest of the organization informed about what is actually occurring and who specifically is taking care of the problem.
Naturally, the degree to which organizations have a formal process in place for managing IT incidents varies greatly. But, as the saying goes inside and outside of IT, it never hurts to expect the unexpected.