Emerging data storage is a category of high-speed storage that includes hyper-converged infrastructure, the NVMe/PCIe storage interface and edge storage, among other innovative technologies.
Data storage technology has traditionally acted as a background player in data centers but is today recognized as a strategic enterprise computing resource. The change in storage’s status began in the mid-1990s, when the idea of centralized, networked and shared storage took hold and put storage on a nearly equal footing with application servers. Storage’s role in modern data architectures grew as storage area network (SAN) and network-attached storage (NAS) technologies evolved into sophisticated data repositories. The introduction of solid-state media accelerated storage’s evolution, as flash delivered read/write speeds that affected the entire computing environment.
Solid-state media also made most of the modern incarnations of storage resources viable alternatives to traditional storage implementations.
Storage has changed a lot over the past few years, but these five innovative technologies and implementations represent some of the most novel yet practical rethinking of storage systems.
- Hyper-converged infrastructure (HCI). HCI is a server-centric approach to shared storage where storage resources are distributed rather than physically centralized.
- NVMe/PCIe storage interface. NVMe -- nonvolatile memory express -- is a data transport protocol designed to take advantage of PCIe (peripheral component interconnect express) bus architectures and flash storage.
- Storage-class memory (SCM). SCM sits between a server’s RAM and storage, combining aspects of both of those data repositories.
- Edge storage. Distributed environments, such as internet of things (IoT) architectures, may require some data processing at their endpoints, which means data must be stored there, as well.
- Computational storage. This new class of storage systems puts the data as close as possible to the processing facilities to yield higher performance.
Collectively, these five technologies have helped to elevate storage into a critical asset in most data-processing scenarios.
How Do Emerging Data Storage Options Work?
The five key storage options profiled here share some similarities, such as rethinking the degree of centralization for shared storage systems and the accelerated performance of NAND flash. Some represent evolutionary developments in storage technologies, while others are different enough from conventional implementations that they represent “forklift” data center upgrades.
Hyper-converged Infrastructure (HCI)
Before taking its place in enterprise data centers, HCI figured prominently in the hyper-scale storage implementations of companies such as Google and Amazon. The design premise of HCI is fairly simple: Instead of networking scores of servers to central storage arrays, fill up each server with storage capacity, connect the servers in what is essentially a peer-to-peer network, and let applications use the compute and storage resources they need regardless of where those resources are physically located.
But the real genius of HCI is how it scales. If more storage or compute capacity is needed, a new server can be added to the cluster. The HCI software will recognize the new server and add its resources to the pool. HCI also makes it easier to allocate storage or compute where it is needed.
Some HCI systems are software-only products, allowing companies to use existing hardware or to purchase common off-the-shelf (COTS) servers and storage media.
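The scale-out behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular HCI product works: the `Node` and `HciCluster` classes, their fields and the capacity figures are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One server in the cluster, contributing both compute and storage."""
    name: str
    cpu_cores: int
    storage_tb: float


@dataclass
class HciCluster:
    """Hypothetical pool that aggregates the resources of every member node."""
    nodes: list[Node] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        # Scaling out: the new server's resources simply join the shared pool.
        self.nodes.append(node)

    @property
    def total_cores(self) -> int:
        return sum(n.cpu_cores for n in self.nodes)

    @property
    def total_storage_tb(self) -> float:
        return sum(n.storage_tb for n in self.nodes)


cluster = HciCluster()
cluster.add_node(Node("node-1", cpu_cores=32, storage_tb=20.0))
cluster.add_node(Node("node-2", cpu_cores=32, storage_tb=20.0))
# Adding a third, larger server grows compute and storage in one step.
cluster.add_node(Node("node-3", cpu_cores=64, storage_tb=40.0))
print(cluster.total_cores, cluster.total_storage_tb)
```

Real HCI software also handles data placement, replication and failure recovery across nodes, none of which this sketch attempts to model.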
NVMe/PCIe storage interface
The NVMe specification was developed specifically for solid-state storage because the SAS and SATA protocols used for hard drives couldn’t keep up with the speed of flash media. NVMe takes advantage of the latest bus technology -- PCIe -- and taps into its greater bandwidth and multiple channels to improve data transfers and reduce the bottlenecks caused by slow transports accessing high-speed storage.
Under the hood, NVMe’s specs dwarf those of previous protocols. For example, NVMe can handle roughly 64,000 commands in a single command queue and support more than 65,000 I/O queues. That’s a massive increase compared to SAS and SATA, which each offer a single queue with a depth of 256 and 32 commands, respectively.
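To put those queue figures in perspective, the short Python calculation below multiplies queue count by queue depth for each protocol. It uses the rounded numbers quoted above and assumes a single command queue for SAS and SATA; the `QUEUE_LIMITS` table and `max_outstanding` function are names made up for this example, and real devices typically expose far fewer queues than the protocol maximum.

```python
# Queue limits as quoted in the text; treat them as rounded, theoretical maximums.
QUEUE_LIMITS = {
    "SATA": {"queues": 1, "depth": 32},
    "SAS": {"queues": 1, "depth": 256},
    "NVMe": {"queues": 65_000, "depth": 64_000},
}


def max_outstanding(protocol: str) -> int:
    """Upper bound on commands in flight: number of queues times queue depth."""
    limits = QUEUE_LIMITS[protocol]
    return limits["queues"] * limits["depth"]


for name in QUEUE_LIMITS:
    print(f"{name}: {max_outstanding(name):,} commands in flight (theoretical)")
```

The point of the comparison is the parallelism: many independent queues let multiple CPU cores submit I/O at once instead of contending for a single queue.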
Storage-class memory (SCM)
SCM is a hybrid technology that borrows the best of DRAM (extreme speed) and pairs it with lower-cost, higher-capacity NAND flash. By combining DRAM-like memory and standard flash, SCM creates a kind of bridge resource that can supplement DRAM (without being quite as fast) and provide high-speed storage that links directly to memory.
SCM can help improve performance over the traditional way data is passed from storage to DRAM to the CPU’s cache. And unlike DRAM, SCM is non-volatile, which means data will be retained when power is cut off.
A non-volatile dual in-line memory module (NVDIMM) is another form of SCM with a somewhat different implementation. NVDIMM modules include flash media storage, but rather than connecting via a server’s bus, they plug into open DIMM slots that are generally used for additional DRAM. NVDIMMs provide excellent performance and can maintain the state of application data to provide effective recovery from crashes.
Edge storage
The idea of putting storage on the edge was inspired by the expansion of IoT environments. Overall IoT performance can improve if data is processed as close to its source as possible, which means adding storage capacity to endpoint devices.
Endpoint devices can range from remote servers and PCs all the way down to single-function sensors. These devices can use the locally installed storage for edge processing to achieve more timely results.
In many scenarios, the data on the edge storage will likely be transferred to a data center or cloud storage service.
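The pattern described here, where data is stored and processed locally and then forwarded to a central repository, might look something like the following Python sketch. The `EdgeBuffer` class, its `batch_size` parameter and the `central_store` list standing in for a data center or cloud service are all hypothetical, and the "processing" is reduced to a running average for brevity.

```python
from statistics import mean
from typing import Callable


class EdgeBuffer:
    """Hypothetical endpoint store: keep readings locally, forward them in batches."""

    def __init__(self, upload: Callable[[list[float]], None], batch_size: int = 3):
        self._upload = upload
        self._batch_size = batch_size
        self._local: list[float] = []

    def record(self, reading: float) -> float:
        # Store locally first so the device can act without a network round trip.
        self._local.append(reading)
        local_average = mean(self._local)
        # Once enough readings accumulate, hand the batch to the central store.
        if len(self._local) >= self._batch_size:
            self._upload(list(self._local))
            self._local.clear()
        return local_average


# A plain list stands in for the data center or cloud storage service.
central_store: list[list[float]] = []
buffer = EdgeBuffer(upload=central_store.append, batch_size=3)
averages = [buffer.record(r) for r in (20.0, 22.0, 24.0, 30.0)]
print(averages)
print(central_store)
```

The device gets an answer from each `record` call immediately, while the network transfer happens only once per batch.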
Computational storage
Computational storage is another technical development that takes advantage of fast solid-state storage and keeps compute as close to the data as possible. Computational storage systems include the storage resources typical of a block, file or object storage array, but they also contain the processors, memory and other associated devices needed to run applications against the stored data.
This arrangement avoids much of the latency, and the resulting processing bottlenecks, that arise when data is moved back and forth over a network infrastructure. Computational storage is best implemented where rapid processing and analysis of data is critical, such as in artificial intelligence (AI) and machine learning (ML) use cases.
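A rough way to see the benefit is to count how much data has to cross the network when a filter runs on the host versus on the storage device. The Python sketch below is purely illustrative: the record layout, the fixed `RECORD_SIZE_BYTES` and both function names are assumptions made for the example, not part of any computational storage API.

```python
RECORD_SIZE_BYTES = 100  # assumed fixed record size, for illustration only

# 1,000 synthetic records; exactly one in ten has value == 7.
records = [{"id": i, "value": i % 10} for i in range(1_000)]


def host_side_filter(data):
    """Conventional path: every record crosses the network, then the host filters."""
    bytes_moved = len(data) * RECORD_SIZE_BYTES
    matches = [r for r in data if r["value"] == 7]
    return matches, bytes_moved


def storage_side_filter(data):
    """Computational storage path: the device filters, only matches are sent."""
    matches = [r for r in data if r["value"] == 7]
    bytes_moved = len(matches) * RECORD_SIZE_BYTES
    return matches, bytes_moved


host_matches, host_bytes = host_side_filter(records)
device_matches, device_bytes = storage_side_filter(records)
print(host_bytes, device_bytes)
```

Both paths return the same matching records; only the volume of data transferred differs, and the gap widens as the filter becomes more selective.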
What Are the Benefits and Drawbacks of Emerging Data Storage Options?
While the innovative storage technologies described above have added new dimensions to enterprise storage, each must be evaluated to determine whether its benefits outweigh its drawbacks in a specific environment.
Here’s how each option stacks up.
Use Cases of Emerging Data Storage Options
Some modern data storage options have narrow use cases, while others are adept at handling a variety of applications.
Hyper-converged Infrastructure. HCI implementations may range from small installations with only a handful of nodes to HCI environments supporting hundreds or thousands of servers. Based on virtual server and hypervisor technologies, HCI installations are typically used for general purpose computing, including end-user productivity applications.
NVMe/PCIe storage interface. NVMe offers a broad range of use cases, as it is implemented in end-user computing devices such as laptops as well as high-end servers that address numerous and varied applications.
Storage-class memory. SCM is still a specialized storage option, so it’s typically used by larger organizations and their high-end applications, including big data analytics, stock trading systems and in-memory database management systems.
Edge storage. Edge storage use cases are more a matter of network topology than of application type, and edge storage plays a very specific role in extended networks and IoT implementations. It may be used for many different types of applications, but the common thread is that the edge device the storage is connected to needs to access and process the data immediately. In other situations, edge storage might serve as a means of distributing data storage chores to relieve central storage repositories. Those instances typically involve endpoint devices that are equipped with sufficient compute and storage capabilities, such as PCs.
Computational storage. The benefits of computational storage can best be realized when it is applied to tasks that involve large amounts of data and are latency sensitive. Applications may include some forms of cryptography; big data processing that involves data warehouses or data lakes; applications that require AI or ML processing; flight data analysis; and highly transactional databases and other high-volume critical applications.
Emerging Data Storage Options: The Bottom Line
There are, of course, other important developments in emerging data storage, but the five described here represent some of the most cutting-edge, high-speed storage innovations. They can fill the gaps that traditional networked or direct-attached environments are unable to address.
Most of the recent changes in storage architectures are related to the advent and proliferation of solid-state storage. As solid-state continues to evolve, you can expect more innovative storage technologies built on flash foundations.