Storage—the workhorse technology responsible for keeping data accessible and safe—has gotten much more intelligent over time. There really hasn’t been any choice. If businesses today want to effectively manage massive and continually growing repositories of increasingly unstructured data, run compute-intensive workloads, ensure security against changing threats, satisfy new compliance and privacy regulations, and become more efficient, smarter storage makes a difference.
“We are demanding more from our storage system than just simply reading, writing and storing data. We want our storage systems to help us extract maximum value from our data while minimizing operational cost and complexity,” said Albert Chen, founder of Kalista IO, a company focused on building intelligent compute and storage systems optimized for software-defined environments and next-generation storage devices.
What Chen is talking about is smart storage. Instead of technology that focuses solely on data availability, integrity, data transfer speeds, mean time between failures (MTBF) and other traditional functions, intelligent storage focuses on ways to use that data to improve the business. It uses algorithms to automate data indexing, analytics and retrieval; proactively identifies and resolves issues and bottlenecks with as little human intervention as possible; optimizes data placement; responds to changing workloads and dynamic business objectives; and protects the integrity and security of data by identifying anomalies from normal operations.
Today’s smartest storage starts with artificial intelligence and machine learning. These technologies enable the systems to analyze data from connected storage systems and identify both patterns and anomalies. They then use this information to optimize the systems and surrounding environment.
It boils down to having an infrastructure that is self-managing, self-healing and self-optimizing, said Sandeep Singh, vice president of storage marketing at HPE. “That’s where AI-driven intelligence becomes a game-changer in helping deliver a predictive and proactive experience, and helping them make the end-to-end infrastructure invisible so customers can get beyond the IT firefighting that traditionally IT lives in a lot of the time. It’s about getting beyond the disruptions that hold companies back.”
One of the most important functions of intelligent storage today is optimization. For example, if it can pinpoint which virtual machine is consuming more than its fair share of resources, it can determine which node would be a better place for that VM to improve operations. It could also optimize and auto-schedule background tasks based on the history of application workloads and performance needs.
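The VM example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's algorithm: the `VM` record, the use of IOPS as the load metric, the `fair_share_factor` threshold and the node names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    node: str
    iops: float  # observed I/O operations per second (hypothetical metric)


def find_noisy_vm(vms, fair_share_factor=2.0):
    """Return the VM whose load most exceeds the fleet average, or None."""
    average = sum(vm.iops for vm in vms) / len(vms)
    heavy = [vm for vm in vms if vm.iops > fair_share_factor * average]
    return max(heavy, key=lambda vm: vm.iops) if heavy else None


def pick_target_node(vm, vms, nodes):
    """Return the least-loaded node other than the VM's current node."""
    load = {node: 0.0 for node in nodes}
    for other in vms:
        load[other.node] += other.iops
    candidates = [n for n in nodes if n != vm.node]
    return min(candidates, key=lambda n: load[n])


# Made-up sample data for illustration.
vms = [
    VM("db-01", "node-a", 9000.0),
    VM("web-01", "node-a", 800.0),
    VM("web-02", "node-b", 700.0),
    VM("batch-01", "node-c", 1500.0),
]
nodes = ["node-a", "node-b", "node-c"]

noisy = find_noisy_vm(vms)
if noisy is not None:
    target = pick_target_node(noisy, vms, nodes)
    print(f"Suggest moving {noisy.name} from {noisy.node} to {target}")
```

A production system would likely weigh more than one metric and account for the cost of the migration itself; the sketch only shows the "detect, then pick a better node" shape of the decision.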
Optimizing performance is a major benefit. When administrators provision workloads on the storage infrastructure, for example, it has always been hard to determine how a new workload will affect the performance of existing ones. With AI-driven insights, the system can predict, based on performance characteristics of similar workloads, how the new workload will interact with existing workloads. This will allow administrators to optimize for those workloads.
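One simple way to predict from "similar workloads" is a nearest-neighbor lookup: describe each past workload by a few characteristics, find the ones closest to the new workload, and average what was observed for them. The sketch below assumes this approach purely for illustration; the features, the history values, the scaling constants and the "added latency" outcome are all invented, and real products may use entirely different models.

```python
import math

# Hypothetical history: (read_ratio, avg_io_kb, iops) -> observed added latency in ms
HISTORY = [
    ((0.9, 8.0, 5000.0), 0.4),
    ((0.5, 64.0, 2000.0), 1.8),
    ((0.1, 512.0, 300.0), 4.5),
    ((0.8, 16.0, 4000.0), 0.6),
]


def normalize(profile, scales=(1.0, 512.0, 5000.0)):
    """Scale each feature to roughly 0..1 so no single feature dominates."""
    return tuple(value / scale for value, scale in zip(profile, scales))


def predict_impact(new_profile, history=HISTORY, k=2):
    """Average the impact of the k most similar past workloads."""
    target = normalize(new_profile)
    ranked = sorted(history, key=lambda item: math.dist(normalize(item[0]), target))
    nearest = ranked[:k]
    return sum(impact for _, impact in nearest) / len(nearest)


estimate = predict_impact((0.85, 12.0, 4500.0))
print(f"Estimated added latency: {estimate:.2f} ms")
```

The new workload here resembles the two small-block, read-heavy entries in the history, so the estimate is the average of their observed impact.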
Another key capability of intelligent storage is predicting issues and making changes that proactively avert problems, improve performance and efficiency, and optimize workloads. Higher-level tools also can predict and proactively address issues that occur outside of the storage ecosystem. That’s important, Singh said, because the vast majority of issues that ultimately arise within storage start off above the storage layer, at the virtual machine, server or even network layer. That makes it critical to be able to detect issues in the greater ecosystem well before they reach the storage layer so they can be proactively prevented.
Organizations also are increasingly relying on smarter storage to help mitigate security issues. The AI and machine learning component of intelligent storage gives algorithms the data they need to determine risk and spot behavioral patterns. With this information, the system can suggest workarounds to security administrators, potentially removing issues before they become big cybersecurity problems.
The Many Forms of Intelligent Storage
While all approaches to intelligent storage have the same goal (storage that learns to improve and protect the environment), vendors have many different ways of achieving that goal. Steve McDowell, a senior analyst for storage at Moor Insights & Strategy, divides these varied tools and approaches into two major “buckets”: traditional enterprise storage that has become smarter; and disaggregated, software-defined storage, typically from upstart vendors that have designed new storage architectures that take full advantage of the storage innovations of the past several years, such as NVMe, NVMe over Fabrics, intelligent NICs, storage-class memory and transparent cloud tiering.
Traditional enterprise storage, from vendors such as HPE, Pure Storage, Dell, Infinidat and NetApp, is designed to address nearly any storage need. Such solutions are proven, reliable and consistent, and most in this category work well in a hybrid cloud environment. The leaders in this category have added many sources of intelligence, like the predictive analytics capabilities of HPE’s InfoSight and Pure Storage’s Pure1.
Of course, there is plenty of differentiation in this category of intelligent storage solutions. Infinidat’s InfiniBox, for example, uses what it calls a Neural Cache that continuously learns so it can optimize cache loading and improve storage performance. According to a report on the technology from Silverton Consulting, InfiniBox analyzes and updates metadata to reflect new IO patterns and continually refine its approach to caching workloads.
HPE goes about it in a slightly different way, incorporating intelligence through its InfoSight service, which collects telemetry data from millions of sensors on systems implemented around the world. InfoSight continuously analyzes the data using AI, machine learning, deep learning and predictive analytics, and then applies the results of that analysis to individual customer systems. InfoSight has a library of issues that are continuously scanned, and when a match occurs on an issue, it auto-creates a case for proactively resolving the issue, Singh explained.
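The "library of issues" idea can be illustrated with a generic signature-matching loop. To be clear, this is not InfoSight's implementation or API, which the article does not describe; the `IssueSignature` type, the telemetry field names and the two sample signatures are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class IssueSignature:
    issue_id: str
    description: str
    matches: Callable[[dict], bool]  # predicate over one telemetry record


# Hypothetical signatures; the field names are invented for this example.
LIBRARY = [
    IssueSignature(
        "SIG-001",
        "Firmware version with a known bug",
        lambda t: t.get("firmware") == "1.2.3",
    ),
    IssueSignature(
        "SIG-002",
        "Capacity nearly exhausted",
        lambda t: t.get("used_pct", 0) >= 90,
    ),
]


def scan(telemetry, library=LIBRARY):
    """Return one case per (system, signature) match."""
    cases = []
    for record in telemetry:
        for sig in library:
            if sig.matches(record):
                cases.append({"system": record["system"], "issue": sig.issue_id})
    return cases


# Made-up telemetry records for illustration.
telemetry = [
    {"system": "array-01", "firmware": "1.2.3", "used_pct": 40},
    {"system": "array-02", "firmware": "1.3.0", "used_pct": 95},
    {"system": "array-03", "firmware": "1.3.0", "used_pct": 55},
]

for case in scan(telemetry):
    print(f"Opened case for {case['system']}: {case['issue']}")
```

The value of a shared library is that a signature learned from one customer's system can be checked against every other system's telemetry.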
The second category of intelligent storage solutions consists mainly of a group of relative newcomers that are using newer techniques to wring out new levels of performance and flexibility. Vendors in this category include VAST Data, Kalista IO, Lightbits Labs and Fungible.
In general, these solutions use newer technologies and processes to dynamically reconfigure storage to meet shifting requirements. Chen explained how Kalista IO’s approach to intelligent storage differs from that of traditional vendors: Unlike traditional storage technologies, Kalista IO has implemented intelligence not only within the control path, but also within the data path. That means that all aspects of user activities can benefit from the predictive and analytical powers of AI/ML, from each administration and management operation to every read and write request.
Its storage system keeps track of device- and system-level metrics such as power-on hours, latency, temperature, error rate and usage counters to predict and proactively manage availability and performance. For example, if a device starts to act erratically or exhibit behaviors that correlate to possible future failure, its prediction algorithm will flag the device, alert the user, divert traffic away and proactively evacuate data from the device.
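As a rough illustration of that flow, the sketch below scores a device against a few of the metrics named above and returns the same sequence of actions: flag, alert, divert, evacuate. It is a deliberately simple threshold rule, not Kalista IO's prediction algorithm; the limits, field names and action labels are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DeviceMetrics:
    device_id: str
    power_on_hours: int
    p99_latency_ms: float
    temperature_c: float
    error_rate: float  # errors per million operations (hypothetical unit)


# Illustrative limits only; a real system would learn these from fleet data.
LIMITS = {
    "power_on_hours": 40_000,
    "p99_latency_ms": 20.0,
    "temperature_c": 60.0,
    "error_rate": 5.0,
}


def risk_score(m):
    """Count how many metrics exceed their limit (0 to 4)."""
    return sum(getattr(m, name) > limit for name, limit in LIMITS.items())


def plan_actions(m, threshold=2):
    """Return the ordered actions to take for a device, or an empty list."""
    if risk_score(m) < threshold:
        return []
    return ["flag", "alert_user", "divert_traffic", "evacuate_data"]


# Made-up sample devices for illustration.
healthy = DeviceMetrics("ssd-0", 12_000, 3.5, 41.0, 0.2)
suspect = DeviceMetrics("ssd-7", 43_000, 35.0, 58.0, 9.1)

for device in (healthy, suspect):
    print(device.device_id, risk_score(device), plan_actions(device))
```

A learned model would replace the fixed limits with correlations drawn from devices that actually failed, but the downstream actions would look much the same.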
Since Kalista IO's storage system, Phalanx, is file- and object-aware, it can extract a lot of contextual and semantic information from each IO request and associated metadata. It then combines that information with user-, application- and system-level information to fuel the learning and predictive optimization algorithms that drive its data placement and caching decisions. Having these insights into each IO allows the system to intelligently prioritize incoming and internally generated commands. That means, for example, that large, throughput-oriented workloads from a backup application will not overwhelm small, latency-sensitive ones from database queries.
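The prioritization described here can be pictured as a priority queue in which each request carries a workload class. The sketch below is a generic illustration, not Phalanx's scheduler; the class names, the two priority levels and the `IOScheduler` type are hypothetical.

```python
import heapq
import itertools

# Lower number = served first. The classes and values are illustrative.
PRIORITY = {"latency_sensitive": 0, "throughput": 1}


class IOScheduler:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # preserves arrival order within a class

    def submit(self, name, workload_class):
        entry = (PRIORITY[workload_class], next(self._counter), name)
        heapq.heappush(self._heap, entry)

    def next_request(self):
        """Pop the highest-priority request, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


sched = IOScheduler()
sched.submit("backup-chunk-1", "throughput")
sched.submit("backup-chunk-2", "throughput")
sched.submit("db-query-1", "latency_sensitive")
sched.submit("backup-chunk-3", "throughput")
sched.submit("db-query-2", "latency_sensitive")

order = []
while (req := sched.next_request()) is not None:
    order.append(req)
print(order)
```

The database queries are served ahead of the backup chunks even though they arrived later. Strict priority like this can starve the lower class under sustained load, so a real scheduler would typically add aging or bandwidth reservations.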
While more specialized intelligent storage solutions such as Kalista IO provide fast, intelligent building blocks, they often tend to live in isolation, McDowell said. They sell their more specialized solutions into applications where performance is more important than “enterprise fit.” Vendors such as VAST, for example, are finding great success in the finance sector and in high-dollar research sectors like pharmaceuticals, where every microsecond of latency, or byte of additional throughput, can deliver a competitive edge that makes it worthwhile.
Others are finding success where their innovations in building flexible software-defined architecture are meeting data management needs that the traditional storage vendors can’t yet meet. Lightbits Labs, for example, focuses on environments that require consistently performant access to data across both local and cloud-based storage.
At the same time, more traditional storage vendors are incorporating newer technologies like NVMe-over-Fibre Channel or tiered memory using storage-class memory where possible. The difference, McDowell said, is that traditional vendors have no choice but to bolt these technologies onto existing architectures. This provides many of the benefits these technologies offer, but not as many as a storage architecture designed from the ground up to support them would.
Even Smarter Over Time
As intelligent as storage has become, there is more to gain over time, with plenty of innovation occurring across the board. Many believe that as technology advances, intelligence will be pushed further down the stack toward the edge, as previously “dumb” devices gain processing power through technologies such as computational storage, and up the stack into the application layers. That means that users can look forward to future storage devices taking a more active role in data processing and offering higher-level services. These changes are likely to ripple up the stack and across the ecosystem.
The eventual goal, Singh believes, is to acquire intelligence throughout the stack, not only in storage. Intelligence only at the storage layer can’t address many issues, such as a firmware issue that might have a downstream impact. AI-driven intelligence with a full-stack view allows for more proactive detection.
Over time, McDowell expects many of these intelligence-driven storage systems to consolidate, as traditional storage vendors begin to experiment with elements of more innovative solutions.
“The good news for IT buyers is that the type of disruption that the upstarts are delivering demonstrates the value of the emerging storage technologies, and this will ultimately impact everyone’s storage architecture, leading to even greater capabilities,” he said. “IT benefits from all of this, whether they’re buying directly from an upstart or taking advantage of how the upstarts are forcing the traditional vendors to evolve.”