Last November, I wrote about some configurations advanced by Hitachi and IBM that appeared to meet the storage needs of 120,000 Exchange 2013 mailboxes. In a nutshell, I didn’t think too much of the configurations because they featured Database Availability Groups (DAGs) that deployed only two copies of each mailbox database. That just doesn’t make sense in any way if you want to run such a large Exchange deployment.
Storage vendors are perfectly able to dream up whatever configuration they think meets a particular need. In this instance the need was to pass a Jetstress test and meet the requirements of the Microsoft Exchange Solution Reviewed Program (ESRP). This is an artificial test that simply means that a storage vendor is able to set up and process some synthetic transactions for a certain number of mailboxes against a set of Exchange databases on their platform for a set period. It does not come with a guarantee that the configuration will work if deployed. No quality control is applied and no technical stamp of approval is given by anyone who has ever run a large production environment. In most cases, the configurations are as useless as a snowball in a glacier.
But vendors do like to pass tests so that they are listed on Microsoft web pages. Microsoft is careful to say that ESRP is a program “designed to facilitate third-party storage testing and solution publishing for Exchange Server.” Facilitation doesn’t mean approval, endorsement, authorization, or sanction. It’s quite a mealy-mouthed word in some respects because it can lead customers to assume that a presented solution might actually work. You might have gotten the idea by now that I am no fan of ESRP.
Which brings us to the latest storage solution brief from Nimble Storage, published on June 11, 2014 (thanks to Wei Liu for the heads-up). I’ve got to admit that I know very little about this company or its unique selling point, which appears to be based on “its patented Cache Accelerated Sequential Layout (CASL™) architecture. CASL leverages the unique properties of flash and disk to deliver high performance and capacity – all within a dramatically small footprint.” My read from their material is that CASL makes sure that needed data is kept on SSD ("adaptive flash") rather than on traditional hard drives, which seems like a good thing. More information about Nimble Storage is available in this YouTube video.
After asking around, it seems like Nimble Storage does not have a big footprint in the Exchange market. Their only published case studies for Exchange are for a small 200-user bank and a health company. The relative lack of penetration in this market when compared to HP, EMC, IBM, NetApp, etc. might be the reason why Nimble Storage decided to publish their ESRP solution brief.
The proposed configuration is for 100,000 mailboxes, which seems a stretch given the size of their published customers. However, my doubts are again all about the impractical nature of the configuration in production terms. Four physical Windows Server 2012 servers running Hyper-V support twenty Exchange 2013 mailbox servers arrayed in two DAGs. Immediately you ask why two DAGs? And how much resilience do four Hyper-V servers deliver for 100,000 mailboxes? And why only 100 databases (50 active) with only two copies per database? And then we have two thousand 1GB mailboxes assigned per database, pushing up against the best-practice 2TB limit for a mailbox database. Of course, users don't fill their mailbox quota from the get-go so these databases will be much smaller in practice (at least at the start).
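To make the arithmetic behind those questions concrete, here is a rough sketch in Python. The input numbers are the ones quoted above; the variable names are mine, the even spread of mailboxes and virtual servers across hosts is an assumption, and the sizing ignores database overhead such as white space and content indexes.

```python
# Figures quoted from the solution brief as described above.
MAILBOXES = 100_000
ACTIVE_DATABASES = 50      # 100 database copies in total, two copies of each database
COPIES_PER_DATABASE = 2
MAILBOX_QUOTA_GB = 1
MAILBOX_SERVERS = 20       # virtual Exchange 2013 mailbox servers
HYPERV_HOSTS = 4           # physical Hyper-V servers

# Assumption: mailboxes and virtual servers are spread evenly.
mailboxes_per_database = MAILBOXES // ACTIVE_DATABASES
full_database_gb = mailboxes_per_database * MAILBOX_QUOTA_GB
database_copies = ACTIVE_DATABASES * COPIES_PER_DATABASE
vms_per_host = MAILBOX_SERVERS // HYPERV_HOSTS
mailboxes_per_host = MAILBOXES // HYPERV_HOSTS

print(f"Mailboxes per database:        {mailboxes_per_database}")
print(f"Database size at full quota:   {full_database_gb} GB")
print(f"Database copies in total:      {database_copies}")
print(f"Mailbox servers per host:      {vms_per_host}")
print(f"Mailboxes affected per host:   {mailboxes_per_host}")
```

On these assumptions, a single Hyper-V host carries five mailbox servers and a quarter of the mailboxes, which is why the resilience question matters so much.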
The YouTube video explaining the solution doesn't address these questions because it's far more focused on storage metrics like IOPS, capacity, and latency. These matter in their own right, but a viable design at the application level would be nice too.
So there are lots of red flags that would cause an experienced Exchange administrator to look at this configuration and discount it as the basis for a production deployment. You could argue that a single DAG spanning 16 nodes could handle the load of 100,000 mailboxes, especially as such a DAG could support up to 1,600 database copies, which is enough for 400 active databases each with 3 passive copies. But you probably won’t max out the DAG to support 100,000 mailboxes across 16 servers (6,250 mailboxes per server). Teasing out questions like this requires some awareness of real-life operations coupled with deep Exchange expertise.
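The single-DAG numbers can be checked the same way. This is only a sketch: the figure of 100 database copies per server is what the 1,600 total implies for 16 nodes, the names are mine, and an even distribution of mailboxes is assumed.

```python
# Single 16-node DAG, using the figures discussed above.
DAG_NODES = 16
COPIES_PER_SERVER = 100    # implied by 1,600 copies across 16 nodes
COPIES_PER_DATABASE = 4    # 1 active + 3 passive
MAILBOXES = 100_000

max_database_copies = DAG_NODES * COPIES_PER_SERVER
max_databases = max_database_copies // COPIES_PER_DATABASE

# Assumption: mailboxes are spread evenly across servers and databases.
mailboxes_per_server = MAILBOXES // DAG_NODES
mailboxes_per_database_at_max = MAILBOXES // max_databases

print(f"Database copies the DAG can hold:   {max_database_copies}")
print(f"Databases with four copies each:    {max_databases}")
print(f"Mailboxes per server:               {mailboxes_per_server}")
print(f"Mailboxes per database at maximum:  {mailboxes_per_database_at_max}")
```

Spreading 100,000 mailboxes over all 400 possible databases would leave only 250 mailboxes in each, which is one reason why you are unlikely to max out the DAG.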
It would be really nice if one of these ESRP solutions were actually of practical use. In other words, take a well-designed solution (probably one involving multi-role servers) and run it against a vendor’s storage to see how well things actually perform when real-life considerations are in force.
A good start would be made by basing the configuration on the output of the Exchange 2013 server role requirements calculator, tweaked by an expert to take advantage of the platform. At least then we’d know that the data presented was of some worth. The current implementation of the ESRP doesn’t help customers much at all.
Follow Tony @12Knocksinna