For decades, system architects have struggled with storage bottlenecks. While the nature of these bottlenecks has evolved over time, ensuring that the storage subsystem performs well enough to meet the demands of the workload remains a challenge. Computational storage technologies may hold the key to enabling optimal performance for the most resource-intensive workloads.
For a long time, one of the most common ways to improve the performance of hardware-intensive workloads has been to put compute and storage resources as close together as possible. NVMe is a great example of this. SSDs are known for their performance, but they are limited by legacy storage controllers that predate solid-state drives. NVMe increases the rate at which a workload can transfer data to or from storage by bypassing the legacy (SATA or SAS) controller entirely and communicating instead over the computer’s PCIe bus.
In the case of NVMe, performance is improved because the storage hardware (the NVMe disk) is brought closer to the CPU. Rather than being separated from the CPU by way of a legacy storage controller or perhaps even by a form of remote connectivity, the CPU has direct access to the storage through the PCIe bus. Computational storage flips this model on its head. Rather than bringing storage resources closer to the compute resources, compute resources are brought to the storage.
On the surface, this difference may seem trivial; the two options sound like different ways of describing exactly the same concept. In practice, however, the two approaches differ significantly.
When bringing storage closer to the CPU, as is the case with NVMe, the computer’s CPU runs a workload and makes calls to the storage as necessary. In the case of computational storage, however, it is not simply a matter of moving the CPU closer to the storage, but rather introducing additional CPU resources into the system.
In this approach, a server’s CPU continues to be used for running a workload. However, a secondary CPU is added to the storage device. That way, some of the storage-related processing that would normally have to be performed by the system’s primary CPU can be offloaded to the storage CPU.
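To make the offload idea concrete, here is a minimal sketch in Python. The record format, the predicate, and the notion of code running "on the drive" are all invented for illustration; real computational storage devices expose vendor- or standards-specific interfaces. The point is simply where the filtering happens and how much data crosses the bus as a result.

```python
# Toy dataset standing in for records on a drive (values are arbitrary).
RECORDS = [{"id": i, "temp": 15 + (i * 7) % 30} for i in range(1000)]

def host_side_filter(records, threshold):
    """Traditional model: every record crosses the bus, the host CPU filters."""
    transferred = len(records)  # all 1000 records move toward the host
    hot = [r for r in records if r["temp"] > threshold]
    return hot, transferred

def pushdown_filter(records, threshold):
    """Computational-storage model: the drive's own CPU applies the
    predicate, so only matching records cross the bus."""
    hot = [r for r in records if r["temp"] > threshold]  # runs "on the drive"
    transferred = len(hot)  # only the results move toward the host
    return hot, transferred

hot_a, moved_a = host_side_filter(RECORDS, 35)
hot_b, moved_b = pushdown_filter(RECORDS, 35)
assert hot_a == hot_b  # same answer either way; only the data movement differs
print(f"host-side filter moved {moved_a} records; pushdown moved {moved_b}")
```

Both functions return identical results; the difference is that the pushdown version moves only the matching records across the bus, which is exactly the work being taken off the primary CPU and the interconnect.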
Admittedly, this approach initially seems somewhat counterintuitive. After all, it is very common for servers to have more processing power than they really need, but also to have storage hardware that is only marginally capable of performing the required number of IOPS. In such a system, it is the storage that is the bottleneck, not the CPU. So why add CPU resources to the storage?
Before I answer this question, it is important to understand that computational storage is a somewhat generic concept. It can be implemented in a variety of ways, and it can serve a wide range of purposes.
With that said, here is an example of a use case for computational storage. One of the biggest storage-related challenges that large organizations are coping with right now is the proliferation of IoT devices. These connected devices can produce huge amounts of data, which is often stored in the cloud. That means organizations incur storage costs and must also ensure they have sufficient internet bandwidth available to transfer the data.
Not all of this IoT data is useful, however. Consider a security camera, for example. It might be useful to see who passes through the camera’s field of view, but there is little value in video that shows nothing but an empty room.
In this type of situation, computational storage can be put to work by running an AI application on the camera hardware. Such an application might be trained to tell the difference between an interesting event and the mundane. By using this approach, it becomes possible to differentiate between useful data and data that has no practical use. This greatly reduces the volume of data that has to be streamed to a storage device.
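As a rough sketch of the idea, the snippet below uses a trivial frame-difference test as a stand-in for the trained model described above. The frames and the threshold are synthetic; a real on-camera application would run an actual inference model. What the sketch shows is where the data reduction happens: only frames flagged as interesting ever leave the device.

```python
def frame_changed(prev, curr, threshold=10):
    """Flag a frame whose total pixel difference from the previous frame
    exceeds a threshold (a crude proxy for 'something happened')."""
    diff = sum(abs(a - b) for a, b in zip(prev, curr))
    return diff > threshold

# Ten tiny 4-"pixel" frames: a mostly empty room with one brief event.
frames = [[0, 0, 0, 0]] * 6 + [[0, 90, 120, 0]] + [[0, 0, 0, 0]] * 3

streamed = [frames[0]]  # always keep an initial reference frame
for prev, curr in zip(frames, frames[1:]):
    if frame_changed(prev, curr):
        streamed.append(curr)  # only changed frames are sent upstream

print(f"captured {len(frames)} frames, streamed {len(streamed)}")
```

Note that the frame after the event is also streamed, because the scene changing back to empty is itself a change; even so, most of the captured frames never leave the camera.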
Another example might be the transformation of a large dataset for a big data application. Rather than performing such an operation directly on the server that hosts the big data application, it may make more sense to offload the data transformation task to the storage hardware. The storage hardware could likely complete the task more efficiently than the application server could, and doing so would minimize the volume of data sent back and forth, thereby improving the application’s overall performance.
At this point, computational storage is still maturing. Even so, it seems all but certain that it will be used far more extensively in the coming years.