Does Microsoft Explain Cloud Outages Better Than Google?

On October 3, I wrote about the success that Google was then enjoying at achieving stellar levels of performance against its SLA for Gmail, which had delivered 99.984% availability for users in 2010 and appeared to be doing even better in 2011. I guess the pressure of fighting high-profile lawsuits must be transmitting stress from the executive suite to elsewhere in Mountain View because Gmail has not delivered the same performance since.

In fact, soon after I wrote the original piece, Gmail suffered a 50-minute outage on October 31. Not so bad, but enough to put a big hole in Google's hopes of achieving the same SLA performance as in 2010. And then Gmail had another blip on April 17, 2012 when “something” caused a 64-minute outage that Google said affected fewer than 2% of their estimated 350 million users worldwide. Again not so bad, unless you were one of the roughly 7 million people who were deprived of their email fix.
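To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It is only an illustration, not how Google calculates its SLA: it assumes a 365-day year and treats an outage as if it hit every user for its full duration, and the function and variable names are my own.

```python
# Back-of-the-envelope availability arithmetic.
# Assumptions (mine, not Google's): a 365-day year, and an outage that
# counts in full against the whole user base.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(availability_pct, period_minutes=MINUTES_PER_YEAR):
    """Minutes of downtime that still meet the given availability percentage."""
    return period_minutes * (100.0 - availability_pct) / 100.0


def share_of_budget(outage_minutes, availability_pct):
    """Fraction of the period's downtime budget consumed by one outage."""
    return outage_minutes / downtime_budget_minutes(availability_pct)


# 99.984% availability allows roughly 84 minutes of downtime in a year.
budget_2010 = downtime_budget_minutes(99.984)

# A single 50-minute outage uses up well over half of that allowance.
october_share = share_of_budget(50, 99.984)

# 2% of an estimated 350 million users (integer arithmetic avoids rounding).
affected_users = 350_000_000 * 2 // 100

print(f"Downtime budget at 99.984%: {budget_2010:.1f} minutes per year")
print(f"Share used by a 50-minute outage: {october_share:.0%}")
print(f"Users affected at 2%: {affected_users:,}")
```

Seen this way, one 50-minute incident on its own eats most of the downtime that the 2010 figure allows for an entire year, which is why even a "not so bad" outage matters so much to the headline number.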


It seems therefore that Exchange Online has been doing a little better in its competition against Gmail recently. There was the small matter of the APAC outage on March 16, but no one really counts a problem that only affected people at the other end of the world, do they? At least, that’s what it seemed like based on the lack of coverage in the mainstream IT media.

I don’t want to keep on harping about cloud email outages because it really doesn’t matter all that much if either Gmail or Exchange Online is inaccessible for an hour or so every so often. Many of the IT departments who run internal email services would be happy if they managed to deliver the quality of service that the cloud services deliver day in and day out. Any issue is magnified by the sheer number of people who use these services, which is enough to make sure that we all know about the problem. And the volume of complaints that circulate through Twitter and other social forums when things do go wrong is enough to make your head spin.

What does deserve comment, however, is the lack of communication from service providers when problems happen. Microsoft was hauled over the coals when Office 365 experienced its problems in August and September 2011, something that was perhaps a sign of immaturity in the Office 365 support operation. But within hours of each incident, Microsoft explained what had caused the problem and took the criticism on the chin when it turned out that something like a DNS configuration problem was the root cause of one major outage. The same happened when Azure experienced its little leap year problem and corporate VP Bill Laing was forced to acknowledge that a programming bug was its root cause. I like being able to read about the steps that providers have taken to determine and fix cloud problems because it provides a human side to the story.

However, Google doesn’t seem to see things in quite the same way. Despite having a very informative Gmail blog, Google is short on details when the time comes to even acknowledge that problems might occur with the service. And if you search the Internet (using Google, of course) for details about the root causes of Gmail outages in 2011 or 2012, it’s difficult to locate the reasons. I can find one root cause analysis for a Gmail failure in 2009 but nothing more recent.

Maybe this is because Google considers this information to be a trade secret that might be of some use to their competition. Perhaps it’s because their whole infrastructure is home-brewed and completely unlike that used by Office 365, where you can deploy and use on-premises versions of products like Exchange and SharePoint that work in a similar way to their cloud counterparts. In fact, I think it’s probably the huge amount of information about Microsoft technology available in the public domain (possibly more than Microsoft really likes at times) that makes it easier to understand when problems happen in Office 365 and also forces Microsoft to be upfront and open in their communications when they screw up.

I also like the new wiki launched by the Office 365 team to inform customers about new features as they are introduced. Different wikis are available for Plan P and Plan E subscribers. When you hand over control of an application to a cloud provider, it can be hard to track the changes and improvements that the provider now controls. The wiki is just another example of good communication. To be fair to Google, they have an equivalent "what's new" page for Gmail users.

In the competition between Google and Microsoft I therefore consider Microsoft to be ahead in terms of communication and the gap between the two companies to have narrowed considerably when looking at SLA performance. In fact, it’s only the relatively short track record that Office 365 has built since last June that makes me hedge, and it will be interesting to compare the two records for the whole of 2012.

Follow my ramblings via Twitter
