On the Friday after the Monday and Tuesday outages of Office 365, Rajesh Jha, Microsoft VP for Office 365 engineering, apologized for the service disruption. Jha's apology and explanation appeared in a support forum post. Read the full post at the Office 365 support forum.
Jha explained what caused both outages, which, he said, affected customers hosted in the North America datacenters.
The June 23rd Lync outage, he said, resulted from "a brief loss of client connectivity in our North America datacenters due to external network failures. Even though connectivity was restored in minutes, the ensuing traffic spike caused several network elements to get overloaded, resulting in some of our customers being unable to access Lync functionality for an extended duration."
The June 24th Exchange Online outage, he said, started with an authentication request problem that affected a small number of customers and then triggered "an unexpected issue in the broader mail delivery system due to a previously unknown code flaw leading to mail flow delays for a larger set of customers."
He said that an issue with a publishing process prevented timely notifications to affected customers, "which we realize was frustrating and this has since been addressed."
He noted that customers will receive a Post-Incident Report (PIR) with "a detailed analysis of what happened, how we responded and how we will prevent similar issues in the future."