Any company that has been around for at least a decade has probably gone through a few iterations of storage. For many, the journey started with direct-attached storage, then moved to storage area networks (SANs) and network-attached storage (NAS) to solve the problem of stranded storage and low storage utilization. Protocols such as Fibre Channel and iSCSI provided a common set of services, centralizing storage and easing the move to virtualization. All of these centralized approaches provided a storage pool that could be shared by multiple users or computers.
Despite these technology advancements, many organizations still found it complicated to manage all the data and ensure the right people had access to the right data. That ushered in hyperconverged infrastructure (HCI), a category of systems that combine servers, storage, and a hypervisor into one platform, which is managed by software. Easy to deploy and manage, HCI developed into an ideal platform for virtual desktop infrastructure.
Yet HCI isn’t a good fit for many other workloads, particularly those that handle large amounts of data. Scale can be a major issue: the average HCI cluster is about eight nodes, while a modern Cassandra database can run around 100. And because compute and storage are coupled, they must be scaled and replaced in lockstep.
Companies wanted the ability to manage storage separately from compute. They also wanted greater agility and automation and to compose data centers out of a pool of resources so they could support more diverse workloads. In addition, companies wanted to do more with their data, including implementation of AI frameworks and advanced data analysis, and support workloads that run both in the cloud and on-premises.
For many, the time was right to consider a new model: disaggregated storage.
What Is Disaggregated Storage?
Disaggregated storage decouples memory, compute, and storage so they can be scaled and provisioned separately. That can help companies support a more diverse set of workloads simultaneously, providing consistent performance and low variability. For example, one workload might require excellent IOPS while another might not, and one might require encrypting data with a specific key while others don’t. Disaggregated storage can also support hundreds or even thousands of applications on a single storage platform.
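As a rough illustration of per-workload service levels on one shared pool (the class, names, and numbers below are hypothetical, not any vendor's API), each workload can carry its own IOPS target and encryption key while drawing on the same disaggregated storage:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WorkloadPolicy:
    """Per-workload service settings on a shared, disaggregated pool (illustrative)."""
    name: str
    iops_target: int                 # provisioned IOPS for this workload
    encryption_key_id: Optional[str] # key used for this tenant's data, if any

# One storage pool, many workloads with different requirements.
pool = {
    "oltp-db": WorkloadPolicy("oltp-db", iops_target=500_000, encryption_key_id="key-oltp"),
    "archive": WorkloadPolicy("archive", iops_target=5_000, encryption_key_id=None),
}

def policy_for(workload: str) -> WorkloadPolicy:
    """Look up the service policy the platform applies to a given workload."""
    return pool[workload]

print(policy_for("oltp-db").iops_target)  # 500000
```

The point of the sketch is only that policy travels with the workload, not with a physical box: hundreds of such entries can share one platform.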
Disaggregated storage’s approach is necessary for modern application development and container-based environments, in which scale is critical, said Scott Sinclair, a practice director and senior analyst at Enterprise Strategy Group. “With those environments, you need to scale very quickly, while at the same time developers want something they can spin up cost-effectively.”
Today’s fast networks have made disaggregated storage possible. “Back in 2003, you had one gigabit of networking and 12 gigabits of hard drive performance you could put on that node,” explained Jeff Denworth, chief marketing officer and co-founder of high-performance storage startup Vast Data. “Networks have since evolved by a factor of 400 in the last 20 years, and hard drives to SSDs have evolved by a factor of about 35. The point is that networks are now much faster than drives, where previously drives [were] much faster than networks.”
Fast networks afford more flexibility. Once companies have stopped tightly coupling storage and compute, they can build the right set of resources that suit their applications, and they can feed those applications with a disaggregated, scalable pool of storage that is much more efficient, Denworth added.
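Denworth's figures can be sanity-checked with quick arithmetic (the numbers below are the rough ones from the quote, not measurements):

```python
# Rough figures from the quote: networking went from ~1 Gb/s to ~400 Gb/s,
# while per-device media bandwidth (hard drives -> SSDs) grew about 35x.
network_2003_gbps = 1
network_now_gbps = 400
media_growth_factor = 35

network_growth = network_now_gbps / network_2003_gbps  # 400x
# Networks outgrew storage media by roughly this ratio, which is why the
# network is no longer the bottleneck between compute and storage.
relative_shift = network_growth / media_growth_factor
print(network_growth, round(relative_shift, 1))  # 400.0 11.4
```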
Why Choose Disaggregated Storage?
High performance is a chief benefit of disaggregated storage. “The difference between technologies like Fibre Channel, NVMe over Fabrics, and iSCSI vs. disaggregated storage is huge in terms of performance,” said Jai Menon, chief scientist at chip startup Fungible. “For example, if you want to have a million IOPS from a compute server, it would take eight or 10 x86 cores just to support storage -- and that’s pure overhead. Those cores can’t be running anything else. You have to move to something newer like disaggregated storage to reduce consumption significantly.”
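Menon's overhead figure implies each x86 core handles on the order of 100,000 to 125,000 IOPS of storage processing. A back-of-the-envelope calculation using only the numbers from the quote:

```python
target_iops = 1_000_000
cores_low, cores_high = 8, 10  # cores consumed just for storage I/O, per the quote

iops_per_core_high = target_iops / cores_low   # 125,000 IOPS per core
iops_per_core_low = target_iops / cores_high   # 100,000 IOPS per core

# Those 8-10 cores are pure overhead: offloading storage processing to a
# disaggregated layer frees them for application work.
print(int(iops_per_core_low), int(iops_per_core_high))  # 100000 125000
```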
Data protection and security are also important differentiators of disaggregated storage. While security features were more or less an afterthought in earlier centralized storage products, security has rapidly grown into a critical need. Today, it’s about how quickly you can back up and restore data, and about features like immutable snapshots that can’t be rewritten or changed. One way to achieve that, Sinclair said, is with virtual air gaps: the snapshot can’t be remounted, so nothing can access it to corrupt it.
Menon agreed that disaggregated storage is more security friendly than other storage technologies. Since the technology was built to support multiple workloads, it must support separate security standards for each tenant.
Fungible’s system, for example, encrypts data as it is leaving the compute server, unlike centralized technologies that encrypt data after it travels unencrypted from the compute node to the shared storage box.
Because disaggregated storage separates the control plane and the data plane, all boxes storing the data, along with the data movement, are separate from the control base. “Say I want to create a new volume of a terabyte in size,” Menon explained. “That’s a control-plane action, not actually data movement. The two are separated, so the security around who can create and delete volumes is isolated in the control plane and carefully protected there.”
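A minimal sketch of that separation (hypothetical classes, not Fungible's actual software): volume lifecycle operations live in a control plane with its own access control, while reads and writes flow only through the data plane.

```python
class ControlPlane:
    """Handles volume lifecycle; guarded by its own access control."""
    def __init__(self):
        self.volumes = {}        # volume name -> size in bytes (metadata only)
        self.admins = {"alice"}  # only admins may create or delete volumes

    def create_volume(self, user: str, name: str, size_bytes: int):
        if user not in self.admins:
            raise PermissionError(f"{user} may not create volumes")
        self.volumes[name] = size_bytes  # a metadata update -- no data moves

class DataPlane:
    """Moves bytes for volumes the control plane has provisioned."""
    def __init__(self, control: ControlPlane):
        self.control = control
        self.blocks = {}  # (volume, offset) -> bytes

    def write(self, volume: str, offset: int, data: bytes):
        if volume not in self.control.volumes:
            raise KeyError(f"unknown volume {volume}")
        self.blocks[(volume, offset)] = data

control = ControlPlane()
control.create_volume("alice", "vol1", 1 << 40)  # 1 TB volume: control-plane action
data = DataPlane(control)
data.write("vol1", 0, b"hello")                  # data-plane action
```

Compromising a data-plane node in this model yields no ability to create or delete volumes; that authority is isolated in the control plane, as Menon describes.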
Disaggregated storage can also allow organizations to tap into NVMe over TCP, which extends NVMe across an entire environment using a TCP/IP fabric to provide high performance. NVMe over TCP is much lower latency than iSCSI and allows organizations to build large-scale Ethernet fabrics, Denworth said. “With this network standard, you don’t need a special network,” he noted.
While disaggregated storage isn’t in widespread use today, it’s moving in that direction. Dedicated disaggregated storage vendors include Fungible, Lightbits Labs, and MayaData. Many large storage vendors like Pure Storage, Vast Data, and Dell Technologies are adding disaggregated storage offerings.
All this doesn’t mean that disaggregated storage will spell the end for HCI or even centralized storage. For workloads that don’t need high performance, HCI can work well. Some users also welcome the push-button nature of HCI or want to buy compute with storage.
In fact, some of these storage models may morph over time. For example, Menon believes that centralized storage systems will evolve into disaggregated storage to make use of modern protocols. It makes sense, because both put the idea of shared storage front and center, he said.
Many industry watchers believe that storage is only the first of many technologies that will become disaggregated. Memory, GPUs, and networks will eventually get there too. Organizations could then disaggregate all of these resources and compose them on demand.
“Composability is a natural extension of disaggregation,” Menon said. “For example, if you need a compute server with 10 TB of storage and three GPUs, you should have the composable software to be able to compose that on the fly. That’s the future.”
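Menon's example of composing a server with 10 TB of storage and three GPUs on the fly can be sketched as allocation from disaggregated pools (a toy model; real composable infrastructure uses fabric managers, not Python dictionaries):

```python
# Free resources available in the disaggregated pools (illustrative sizes).
pools = {"storage_tb": 100, "gpus": 16, "cpu_cores": 512}

def compose(storage_tb: int, gpus: int, cpu_cores: int) -> dict:
    """Carve a logical server out of the shared pools, on demand."""
    request = {"storage_tb": storage_tb, "gpus": gpus, "cpu_cores": cpu_cores}
    for resource, amount in request.items():
        if pools[resource] < amount:
            raise RuntimeError(f"not enough {resource} in the pool")
    for resource, amount in request.items():
        pools[resource] -= amount  # resources are reserved, not copied
    return request

server = compose(storage_tb=10, gpus=3, cpu_cores=32)
print(server)  # {'storage_tb': 10, 'gpus': 3, 'cpu_cores': 32}
print(pools)   # {'storage_tb': 90, 'gpus': 13, 'cpu_cores': 480}
```

When the logical server is torn down, the same bookkeeping runs in reverse and the resources return to the pools for the next workload.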
When it comes down to it, though, disaggregated storage is just one of several choices. “At the end of the day, we get too wrapped up in the individual architecture of storage, whether it’s HCI, [storage area network], or disaggregated,” Sinclair said. “But all the application cares about is that it’s getting the storage it needs, with the right degree of latency, reliability and availability.”