Proactive analytics is changing the face of storage, and companies need to know what it is, how to get it, and whether existing storage systems are providing it.
It wasn’t all that long ago that five nines of availability was the gold standard for storage systems. For those who may not be familiar with the term, five nines of availability means that the storage is online and functional 99.999% of the time. This equates to a tolerance of about 5.26 minutes of downtime each year. More recently, however, many storage vendors have increased their availability standard to six nines of availability (99.9999% uptime). If you consider a year to be exactly 365 days, then six nines of availability equates to less than 32 seconds of downtime per year. Some vendors have even gone so far as to promise 100% availability.
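The downtime figures above follow from simple arithmetic, which the short Python sketch below reproduces. It assumes a 365-day year, as the text does, and the function name is purely illustrative. Five nines works out to roughly five and a quarter minutes per year, and six nines to a little under 32 seconds.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds, assuming a 365-day year


def allowed_downtime_seconds(nines: int) -> float:
    """Return the yearly downtime budget, in seconds, for a given number of nines."""
    unavailability = 10 ** -nines  # five nines -> 0.00001 (that is, 0.001%)
    return SECONDS_PER_YEAR * unavailability


for nines in (5, 6):
    seconds = allowed_downtime_seconds(nines)
    print(f"{nines} nines: {seconds:.1f} seconds per year ({seconds / 60:.2f} minutes)")
```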
So, what has changed that allows storage vendors to promise such extreme reliability? Storage vendors are using logging data and AI-based predictive, proactive analytics as tools for improving storage health. In other words, it isn’t necessarily that the hardware itself has become dramatically more reliable (although the hardware may have improved, too), but rather that the management layer can now provide prescriptive guidance, through proactive analytics, that helps keep the hardware healthy.
In some ways, prescriptive guidance has existed for some time. Consider, for example, an old-school RAID 5 array. Such an array provides protection against the failure of a single hard disk. Not long after these arrays were first introduced, manufacturers began to realize that a hard disk failure could happen without the administrator even realizing that anything was wrong. In fact, an administrator might not even realize that there was a problem until a second disk failed, causing the entire array to fail.
Manufacturers addressed this problem by introducing alerting mechanisms. These mechanisms varied in scope by manufacturer, but they all used various means of getting the administrators’ attention and letting them know that a disk had failed and needed to be replaced.
Today, of course, storage systems are far more complex. Sure, hard disk failure is still an issue, but there are many other things that can go wrong. Modern storage systems therefore use a combination of intelligent log parsing and predictive analytics to help ward off problems before they cause an outage.
Modern storage systems may use logging data in a few different ways. For starters, logging data is sometimes combined with real-time detection techniques as a tool for finding and correcting problems. In some cases, such mechanisms are able to automatically locate and remediate problems. For example, log entries and real-time metrics may lead a system to determine that it needs to reboot one of its nodes.
In other cases, intelligent analytics may be able to tell the administrator exactly what kind of problem is occurring, and walk the administrator through the steps necessary to correct it. Oftentimes, these types of intelligent analytics can use log entries to detect early signs of an impending problem, and help the administrator to take corrective action before there are any obvious symptoms. Some systems are even able to use the available data to tell the administrator how quickly action needs to be taken before a serious problem occurs.
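To make the idea of log-based symptom detection a bit more concrete, here is a deliberately simple Python sketch. It is not how any particular vendor does it. The log format, the component names, and the threshold of three warnings are all assumptions invented for illustration, and real products rely on far richer signals than a simple count.

```python
import re
from collections import Counter

# Hypothetical log format, invented for this example:
#   "<timestamp> <LEVEL> <component>: <message>"
LINE_RE = re.compile(r"^\S+ (?P<level>[A-Z]+) (?P<component>[\w-]+): (?P<message>.*)$")


def components_needing_attention(lines, threshold=3):
    """Return the components that logged at least `threshold` WARN or ERROR entries."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") in {"WARN", "ERROR"}:
            counts[match.group("component")] += 1
    return sorted(name for name, count in counts.items() if count >= threshold)


sample_log = [
    "2024-01-01T00:00:00 INFO node-1: heartbeat ok",
    "2024-01-01T00:05:00 WARN disk-3: read retry",
    "2024-01-01T00:06:00 WARN disk-3: read retry",
    "2024-01-01T00:07:00 ERROR disk-3: sector remapped",
]
print(components_needing_attention(sample_log))
```

A count like this only flags that something deserves a closer look. Deciding what to do about it, or how urgently, is where the vendor's analytics come in.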
Predictive, Proactive Analytics
In some ways, my last example could be thought of as predictive analytics, since the system uses log data to determine how quickly an administrator needs to act before a minor condition becomes a serious problem (although I tend to think of it more as data extrapolation). Storage vendors, however, are building numerous other forms of predictive, proactive analytics into their systems.
The most common example is probably capacity forecasting. By monitoring usage patterns, storage systems can estimate when it will become necessary to perform a capacity upgrade.
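As a rough illustration of capacity forecasting, the Python sketch below fits a straight line to daily usage samples and extrapolates to the point where the array would be full. The linear-growth assumption, the function name, and the sample figures are mine, chosen for simplicity. Real systems may account for seasonality, data reduction ratios, and other factors.

```python
from typing import Optional, Sequence


def days_until_full(used_tb: Sequence[float], capacity_tb: float) -> Optional[float]:
    """Estimate the days remaining until usage reaches capacity.

    `used_tb` holds one usage sample per day, oldest first. A least-squares
    line is fitted to the samples. Returns None if usage is flat or shrinking.
    """
    n = len(used_tb)
    if n < 2:
        raise ValueError("need at least two daily samples")
    days = range(n)
    mean_day = sum(days) / n
    mean_used = sum(used_tb) / n
    sxx = sum((d - mean_day) ** 2 for d in days)
    sxy = sum((d - mean_day) * (u - mean_used) for d, u in zip(days, used_tb))
    slope = sxy / sxx  # estimated growth in TB per day
    if slope <= 0:
        return None
    intercept = mean_used - slope * mean_day
    day_full = (capacity_tb - intercept) / slope  # day index at which the line hits capacity
    return max(0.0, day_full - (n - 1))  # measured from the most recent sample


# Hypothetical figures: usage growing by 1 TB per day on a 20 TB array.
print(days_until_full([10, 11, 12, 13, 14], capacity_tb=20))
```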
Another common use for proactive analytics is estimating the lifespan of disks. By tracking use patterns and comparing them against the manufacturer’s specifications, a storage system can predict disk failures, thereby giving organizations a chance to replace aging disks before they fail.
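One simple way to picture this kind of estimate is to compare how much data has been written to a disk against the write endurance the manufacturer rates it for. The Python sketch below does exactly that. The parameter names and the numbers are hypothetical, and a real system would presumably weigh many more health indicators than write volume alone.

```python
def estimated_days_remaining(rated_tbw: float, written_tb: float, daily_write_tb: float) -> float:
    """Estimate the days left before a disk reaches its rated write endurance.

    rated_tbw:      total terabytes written the manufacturer rates the disk for
    written_tb:     terabytes written to the disk so far
    daily_write_tb: observed average terabytes written per day
    """
    if daily_write_tb <= 0:
        return float("inf")  # no writes observed, so no wear-based estimate
    remaining_tb = max(0.0, rated_tbw - written_tb)
    return remaining_tb / daily_write_tb


# Hypothetical disk: rated for 600 TB written, 450 TB used so far, 0.5 TB written per day.
print(estimated_days_remaining(rated_tbw=600, written_tb=450, daily_write_tb=0.5))
```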
Of course, there are also far more advanced predictive analytic capabilities being built into modern storage systems. Systems may be able to recommend upgrades or custom configurations, for example, based on an organization’s own unique use patterns.
Mechanisms such as log analytics and predictive analytics are helping to make storage systems more reliable than they have ever been before. As these mechanisms evolve, self-healing will inevitably become even more capable and more prevalent than it is today.