Last month brought yet another big “cloudtage,” as Microsoft’s BPOS had a bad day on May 10 and then two days later on May 12, but I think in this case, the cloud did show something of a—apologies, I can’t resist—silver lining, in the form of some better-than-expected service response.
On May 10, BPOS’s email servers received a badly formed email message. What that basically means is that Internet email servers (“SMTP servers”) of all stripes expect every incoming message to follow a very particular format, one that’s been around since 1982’s RFC 822.
But—happens corruption or dropped packets in SMTP messages?
Well, even if you figured out what that last sentence meant, I guarantee that it took you longer to read than any other sentence in this article will, which illustrates what happened to BPOS in mid-May. Of course, you figured out the sentence because you’re a human with an expert pattern-recognition device in your cranium, which puts you leagues ahead of a mere email server. The server is equipped only with a set of protective algorithms that try to detect and discard badly formed incoming packets before they can reach the central email processing programs. No algorithm’s perfect, though, and so once in a great while a packet comes along that isn’t correctly formatted but is close enough that it gets past the email server’s filters, allowing what is essentially gunk to get into the innards of the email server’s message processing engine. Once that packet is inside the engine, there’s no telling what will happen. In the case of the BPOS servers, it caused them to chase their tails for a while, leaving BPOS customers waiting up to nine hours to get to their email. (If you think about it, there’s a potential zero-day-exploit air about this. Imagine some hacker wrote a program that created randomly malformed packets and threw each one at an Exchange server in the hacker’s lab, hoping to eventually find one that could slip past Exchange’s filters. Once that hacker had that magic packet, he could fairly quickly hammer, say, the 1,000 busiest Exchange servers on the planet with copies of it, making for a truly bad day for Exchange admins. Okay, it’s probably more likely that I’m paranoid and need to take my pills.)
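For the curious, here is a minimal sketch of the kind of filtering described above. It is purely illustrative: nothing here reflects how Exchange or BPOS actually screens mail, the function names (`loose_filter`, `strict_filter`) are made up, the header pattern is a simplification of the RFC 822 grammar, and the sample messages are invented. It only shows how a sloppy check can wave through a near-miss that a stricter one rejects.

```python
import re

# Simplified header field: a name made of printable ASCII (no space, no colon),
# then a colon, then the field body. An illustration, not the full RFC 822 grammar.
HEADER_FIELD = re.compile(r"[\x21-\x39\x3B-\x7E]+:.*")


def split_message(raw):
    """Split a raw message into (header_lines, body) at the first blank line."""
    head, _, body = raw.partition("\r\n\r\n")
    return head.split("\r\n"), body


def unfold(lines):
    """Join continuation lines (starting with space or tab) onto the previous field."""
    fields = []
    for line in lines:
        if line[:1] in (" ", "\t") and fields:
            fields[-1] += " " + line.strip()
        else:
            fields.append(line)
    return fields


def loose_filter(raw):
    """Hypothetical sloppy check: only looks for a colon in the first line."""
    return ":" in raw.split("\r\n", 1)[0]


def strict_filter(raw):
    """Hypothetical stricter check: every header field must be well formed,
    and From and Date must both be present (an assumption for this sketch)."""
    lines, _ = split_message(raw)
    fields = unfold(lines)
    if not fields or not all(HEADER_FIELD.fullmatch(f) for f in fields):
        return False
    names = {f.split(":", 1)[0].lower() for f in fields}
    return {"from", "date"} <= names


# Invented sample messages. The second is missing the colon after "Date".
good = ("From: a@example.com\r\nDate: Tue, 10 May 2011 09:00:00 -0700\r\n"
        "Subject: hi\r\n\r\nbody")
near_miss = ("From: a@example.com\r\nDate Tue, 10 May 2011 09:00:00 -0700\r\n"
             "Subject: hi\r\n\r\nbody")

for name, msg in (("good", good), ("near_miss", near_miss)):
    print(name, "loose:", loose_filter(msg), "strict:", strict_filter(msg))
```

The near-miss passes the loose check because its first line looks fine, and that is the general shape of the problem: the filter and the processing engine disagree about what counts as well formed.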
The bottom line here is that either Exchange or some front-end to Exchange had a bug that allowed a “poison packet” to essentially shut down email for BPOS customers for the better part of a day, throwing thousands and thousands of users into disarray. So why bother reporting this? Well, so far, it’s just another irritating and unacceptable thunderhead in the “Intercloudosphere.” But there was one difference.
Users report that Google Docs has been down for hours at a time without a word from Google or much in the way of phone help, and if you read my piece about my Postini experience a year ago, you’ll understand why I can readily believe it. A contact who used to work for Amazon tells me that Amazon really, really, really doesn’t want to have to answer the phone to talk to EC2 users with questions, and many EC2 users were truly incensed that Amazon’s answer to their woes ran along the lines of “hey, we never said we were perfect… didn’t we suggest some sort of backup?” That apparently means that when I move my stuff to the Amazon cloud, I’d better keep my data center running just in case there’s another outage. (Kinda hard to justify the cost savings in that model.)
So what did Microsoft do when a bad packet left egg on their face?
They apologized, and quickly.
Dave Thompson, one of the guys in charge of BPOS, made a blog post wherein he detailed what happened, took full responsibility, and explained what steps they were taking to ensure that it didn’t happen again. Now, in truth I’m not all that surprised, as I’ve run into Dave a few times in the past ten years and he’s always been candid and informative even though he was fully aware that he was talking to a journalist. Dave’s frank message was a breath of fresh air.
Then again, perhaps Dave was just being smart. As a guy who watches Windows for a living, I’ve noticed a few things recently vis-à-vis the world’s favorite OS. First of all, I find that I spend an awful lot of my non-work computing time on my iPad, and less on my Windows box. Second, it seems that most cloud services are delivered through a browser window, and, as I’ve observed here before, nobody’s going to sign up for a cloud service that works only on IE9 and not on Chrome, Safari and whatever, and that Safari support can’t be limited to Safari on Windows. Third, nobody’s selling the cloud harder than Microsoft, and if “more cloud” equals “less need for Windows desktop,” then it doesn’t take too many brain cells to see that Microsoft is currently in the middle of what may be the biggest you-bet-your-company wager that we’ve seen in a long time.
How’s it going to turn out? I have no idea whatsoever, but here, I saved you the good seat right by the flatscreen. Sit down, there’s plenty of popcorn and I’m pretty sure that this is going to be a good show.