Can a software company really be your IT department?
Two big outages in two days. First Lync Online goes down on Monday, and then a 9-hour outage hits Exchange Online users on Tuesday. Microsoft, of course, has apologized to customers, reporting that the outages were caused by a degraded network. Rumor has it that the outage was actually due to a spam filtering issue, but we'll have to wait to hear from Microsoft on the final report. That's a bit funny to me, if true. Can you imagine a spam filter identifying ALL messages as junk? Oops.
But the outages bring up four very important points that need to be understood and addressed.
Cloud Outages Have a Greater Impact
When a company's email is unavailable it is painful, for sure, and it affects both the company experiencing the issue and its customers. But think about the Cloud. As more companies migrate services and apps to a public hosting provider, an outage no longer affects just a single company. When a provider has issues, particularly ones as severe as those of the last couple of days, every company that has put its trust in the service is impacted. An outage in the Cloud makes the issue much, much bigger.
Bottom line: When a company's email goes down it affects revenue. When the Cloud goes down, it affects the economy.
Microsoft's Monitoring Didn’t Work?
Reports flowed over Twitter yesterday that, though users were experiencing severe issues, the public-facing monitoring dashboards Microsoft uses showed all GREEN, meaning the monitoring itself reported no service disruptions. Communication about the issue was sparse and scattered, and we've yet to hear full details on the problem.
The ironic thing here is that Microsoft's notification method for Exchange Online is Yammer, yet Twitter provided a much better communication medium because, well, Yammer wasn't working.
Bottom line: Unacceptable. IT managers would be in HR filling out severance package papers if this happened inside their own business. This needs to be fixed right away.
Perception and Responsibility
When email (or any other service) goes down in a single company, IT works to minimize the impact by supplying workarounds while the issue is identified and rectified. When email (or any other service) goes down in the Cloud, there's no direct point of contact that management can pin down and hammer. Business leaders still look to IT as a source for communication, and all IT can do is shrug and point to the Cloud provider. And if the Cloud provider's communications are muddled and its alerting is not doing its job, it makes both the Cloud provider and IT look bad. I'm sure business leaders were already trying to figure out who to fire this week.
Ironically, in the past IT Pros have positioned Microsoft products as a means for job security because specialized expertise is required to run and manage them. IT Pros can still say the same thing, but for different reasons – if you get my drift. If it didn't seem as if Microsoft was trying to take our jobs at every turn, we might feel sorry for the company.
Bottom line: Microsoft took a big hit yesterday. Yes, in the full scope of the SLA it was a minor bump, but the industry perception will echo for months. That's a big problem for Microsoft and gives IT Pros enough fuel to delay any Cloud adoption for quite a while.
Is Microsoft Ready to Be IT?
Another ironic thing about the past couple of days is that Microsoft has now been faced with what we've all experienced in our careers in IT. Despite orchestration and redundancy, the system failed – and it failed big time. In the bigger scope, a day or two of inaccessibility is not a big deal, but when the Cloud fails, Microsoft fails. What pushes the needle even higher on the irony scale is that Microsoft is experiencing issues with its own products.
As IT Pros, we have years of experience with this sort of thing and have put in checks and balances to minimize disruptions, because we know that Microsoft software can be, uh, quirky. We don't roll out an update until at least the first service pack, or until other companies have uncovered RTM problems. Microsoft has spent considerable time over the past few years producing best-practices whitepapers and feeding the ITIL monster, but everyone in IT knows those are baselines, not laws. Every single environment is different and has its own challenges no matter which operations framework is used.
Bottom line: Microsoft could learn a lot from IT Pros, and it needs to. This week proves that Microsoft and IT Pros are at least on equal footing when it comes to the potential for outages, but the impact is different. Microsoft is getting a crash course in how IT really works.
So, what say you? Can Microsoft be IT?