Do Exchange Online backups make sense?

Microsoft is well-known for using native data protection (aka "no backups") for Office 365. Exchange mailbox data is protected by four database copies spread across two datacenters, and the full array of features such as Single Item Recovery, single page patching, and automatic reseed is used to ensure that mailboxes stay online all the time. And generally speaking, they do, as attested by Microsoft's strong SLA record for Office 365.

But it's hard to break the habit of a lifetime and it can be a challenge to convince people who enjoy the security of good backups to leave them behind when workloads move to the cloud. It can also be an issue for those charged with data security and compliance who value a backup, especially when it's taken by an independent third party. Which is where Spanning, a subsidiary of EMC Corporation, comes in with their Spanning Backup for Office 365 product.

Spanning is a cloud-to-cloud backup service. The Office 365 offering is currently in beta, with the plan of record being a full launch at the end of June 2015. Spanning has accumulated considerable experience with backups for Google Apps and Salesforce.com and it will be interesting to see how they cope with the demands of Exchange Online and the rest of Office 365.

I caught up with the product team in early February to talk about the value of a cloud backup. Essentially, apart from the compliance angle, it's all about being able to quickly restore a mailbox back to a specific point in time, including the ability to recover items from one mailbox and move them into another. Restores can be activated by both end users and administrators.

To enable recovery, Spanning uses Exchange Web Services (EWS) to read items from mailboxes and copy them across the net to their datacenter. A mixture of full and incremental backups is used so that a mailbox can be restored to different points in time. Items copied across the Internet are indexed to allow easy search and recovery via a browser interface. All of which sounds pretty good, if you decide that backups are really your thing and that native data protection is insufficient for your organization.
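To make the full-plus-incremental idea concrete, here is a minimal sketch of a point-in-time restore. It is not Spanning's implementation: the `Backup` record, its fields, and the `restore_at` function are all hypothetical names invented for illustration, and mailbox items are reduced to simple id-to-content pairs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Backup:
    """One backup run (hypothetical structure, for illustration only)."""
    taken_at: int                                           # when the run happened
    is_full: bool                                           # full backup or incremental
    changed: Dict[str, str] = field(default_factory=dict)   # item id -> content
    deleted: List[str] = field(default_factory=list)        # ids removed since last run


def restore_at(backups: List[Backup], point_in_time: int) -> Dict[str, str]:
    """Rebuild the mailbox contents as they were at point_in_time."""
    # Only runs taken at or before the requested time are relevant.
    usable = sorted(
        (b for b in backups if b.taken_at <= point_in_time),
        key=lambda b: b.taken_at,
    )

    # Start from the most recent full backup in that window.
    start = None
    for index, backup in enumerate(usable):
        if backup.is_full:
            start = index
    if start is None:
        raise ValueError("no full backup at or before the requested time")

    mailbox = dict(usable[start].changed)

    # Replay every later incremental in order: apply changes, then deletions.
    for backup in usable[start + 1:]:
        mailbox.update(backup.changed)
        for item_id in backup.deleted:
            mailbox.pop(item_id, None)
    return mailbox
```

The more often incrementals are taken, the finer the choice of restore points, at the cost of more frequent reads against each mailbox.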

However, I have my doubts. Using EWS is obviously a great way to access information inside Exchange Online mailboxes. After all, EWS is the interface used by clients like Outlook for Mac and it is the preferred method for third-party developers to create code that works with mailbox items (you could also use MAPI but few do, despite the efforts of the MFCMAPI team to dispel the mystery surrounding this API). There's also the salient fact that EWS includes an ExportItems method explicitly designed to support the export of messages, contacts, tasks, and appointments from Exchange. That method is supposed to be a lightweight way of getting data out of Exchange, even at the rate required to extract complete mailboxes for backup purposes.
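For readers who have not seen it, the sketch below builds the SOAP body of an ExportItems request using only the Python standard library. The element names and namespaces follow Microsoft's published EWS schema as best I can reproduce them, so verify them against the documentation before relying on this. The function name, the `Exchange2010_SP1` version string, and the item ids are assumptions for illustration, and nothing here sends the request or handles authentication.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
TYPES_NS = "http://schemas.microsoft.com/exchange/services/2006/types"
MESSAGES_NS = "http://schemas.microsoft.com/exchange/services/2006/messages"


def build_export_items_request(item_ids, server_version="Exchange2010_SP1"):
    """Return an ExportItems SOAP envelope as a string (illustrative helper)."""
    # Use the conventional prefixes so the output is readable.
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("t", TYPES_NS)
    ET.register_namespace("m", MESSAGES_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # The header states which schema version the client expects.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(
        header, f"{{{TYPES_NS}}}RequestServerVersion", {"Version": server_version}
    )

    # The body lists the ids of the items to export.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    export = ET.SubElement(body, f"{{{MESSAGES_NS}}}ExportItems")
    ids = ET.SubElement(export, f"{{{MESSAGES_NS}}}ItemIds")
    for item_id in item_ids:
        ET.SubElement(ids, f"{{{TYPES_NS}}}ItemId", {"Id": item_id})

    return ET.tostring(envelope, encoding="unicode")
```

A backup product would batch many item ids into each request, which is exactly where the question of load on the mailbox servers starts to matter.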

But the nagging doubt that I have is that I don't think that EWS was designed to be used for backup. It kind of reminds me of "brick backup" products or the early days of Exchange anti-virus when agents used MAPI to log on to user mailboxes to watch for inbound viruses. None of these implementations were particularly good at scaling up and they all consumed resources as if there was no tomorrow.

Microsoft isn't generally appreciative of any process that hogs resources within Office 365 and has deployed a wide range of checks to impose throttling on processes that attempt to seize more than their fair share of resources. This is reasonable because you don't want runaway code affecting multiple tenants. With throttling in mind, it might be the case that Spanning will run into some challenges when it introduces a backup agent that connects to every mailbox to capture contents and record changes. A rough back-of-the-envelope estimate is that Spanning will double the load on Exchange Online mailbox servers, not to mention the increased network load to transport all the information back to Spanning's datacenters. Customers can ask Microsoft to loosen the throttling restraints for tenants to accommodate the demands of the backup agents and that might be a solution. After all, Microsoft has lots of servers running Exchange Online.
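A well-behaved client is expected to slow down when it is throttled rather than hammer the service. The sketch below shows one common pattern, retrying with exponential back-off while honouring any wait hint the server supplies. EWS reports throttling with a server-busy error (ErrorServerBusy in the documented response codes), but the `ServerBusy` exception, the `call_with_backoff` helper, and the default delays here are hypothetical, and there is no suggestion that Spanning's agent works this way.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ServerBusy(Exception):
    """Hypothetical error raised by a client wrapper when a request is throttled."""

    def __init__(self, retry_after_seconds: Optional[float] = None):
        super().__init__("server busy")
        # Wait hint from the server, if it sent one.
        self.retry_after_seconds = retry_after_seconds


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, pausing and retrying whenever it is throttled."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ServerBusy as busy:
            # Give up once the retry budget is spent.
            if attempt == max_attempts - 1:
                raise
            # Prefer the server's hint; otherwise double the delay each time.
            delay = busy.retry_after_seconds
            if delay is None:
                delay = base_delay * (2 ** attempt)
            sleep(min(delay, max_delay))
    raise ValueError("max_attempts must be at least 1")
```

Backing off keeps a client within its limits, but it also stretches the time needed to get through every mailbox, so throttling directly constrains how fresh a backup can be.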

But Spanning is not to blame here. According to Spanning, they haven't been told by Microsoft not to use EWS in the way that they are. In fact, Spanning uses EWS for the purpose Microsoft designed the API to serve – to access mailbox data. They're just using EWS at a scale and for a purpose that Microsoft might not have anticipated. All of which suggests that Microsoft needs to provide a better way to access very large quantities of Office 365 data, something that also became apparent in another area of Office 365 when I looked at Office 365 reporting products recently.

Spanning faces other challenges as well, such as how to deal with IRM-protected content or S/MIME encrypted messages (security is always a joy), but they are in beta at present and enough time remains between now and June to address these issues. And of course, mailbox data isn't the only thing that needs backups – SharePoint document libraries, lists, and OneDrive libraries also deserve some tender loving care.

If you're looking for a backup solution for Office 365, you can consider the Spanning beta program. Other companies in the space include CloudFinder, CodeTwo, AvePoint, and CloudAlly, none of which I have used.

I'm interested in hearing why people use backups with Office 365 as well as your experience of the performance and utility of these products. Are they really necessary or is native data protection good enough for Exchange Online?

Follow Tony @12Knocksinna
