Since Y2K (remember that?), the big storage story has been consolidation, simplification, and unification. Storage administrators have been building larger, more flexible pools of storage capacity with Network Attached Storage (NAS) and Storage Area Network (SAN) devices, while centralizing control of those devices through Storage Resource Management (SRM) software. The idea is for an enterprise's storage infrastructure to function as a single pool that storage administrators can fill efficiently. The payoff is that utilization rates rise as administration costs fall.
But a countervailing impulse is also at work. Analysts and observers increasingly recognize that storage technology shouldn't be seen simply as part of a generalized infrastructure. Rather, storage solutions should be seen as application-specific. This viewpoint shows up in the development of content-specific storage technology and of storage hardware and software intended for specific applications.
The fundamental idea behind application-specific storage is that not all data is equal. Transactional data, fixed and reference data, audio and video data, and backup data play different roles in most contexts, and their storage requirements aren't identical. Perhaps the highest-profile acknowledgement of the differentiation and stratification of data types was EMC's introduction of Centera. Centera is a disk-based, write once, read many (WORM) device specifically created to store fixed, active reference data.
Although its competitors have contested EMC's claim that Centera was the first content-addressed storage solution, EMC still markets Centera as the first solution in which the address of a particular object also contains information about the object. Due to the proliferation of audio and video files, EMC officials believe that in time, content-addressed storage could make up as much as 75 percent of all stored data. Regardless, Centera is one of many efforts to create hardware and software storage solutions for very specific purposes.
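The idea of an address that carries information about the object can be illustrated with a minimal Python sketch. This is not Centera's implementation or API; the ContentAddressedStore class, its in-memory dictionary, and the choice of SHA-256 as the digest are all assumptions made for illustration. The address is computed from the object's bytes, so identical content always maps to the same address and a stored object is never overwritten.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory, write-once store keyed by a digest of the content.

    Hypothetical illustration only; the digest (SHA-256) is an assumption.
    """

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        address = hashlib.sha256(data).hexdigest()
        # Write once: if the address already exists, keep the stored copy.
        self._objects.setdefault(address, bytes(data))
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Because the address encodes the content, a read can verify integrity.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object does not match its address")
        return data


store = ContentAddressedStore()
address = store.put(b"scanned invoice 0001")
same_address = store.put(b"scanned invoice 0001")  # no second copy is stored
```

In this sketch, storing the same bytes twice returns the same address, which is one reason the approach suits fixed reference data that is written once and read many times.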
Glen Otero, president of Callident, a provider of performance computing on Linux clusters, said, "The access patterns to different data are very different and depend largely on the specific application context, which drives the development of an appropriate solution." Callident is teaming with Promicro Systems, a high-performance computing solutions provider, to create a Linux computing cluster solution for the biotechnology industry that employs a Serial ATA mass storage solution for the cluster data. As Otero explained, scientists working in the life sciences have very complicated storage needs. "Their data is very heterogeneous and comes from more than one source. The data comes in chunks of different sizes. They need to combine internal and external sources of data. Data from an Oracle database may be combined with data from GenBank." GenBank, a daily-updated public database of nucleotide sequences from more than 130,000 organisms, is the product of an international collaboration.
Otero also pointed out that scientists need access to data sets ranging from terabytes to as much as half a petabyte, although not all of this data needs to be immediately accessible. Furthermore, he said that scientists don't often need enterprisewide access to experimental data. As Otero put it, "There are no clear cut out-of-the-box hierarchical storage management solutions for life scientists."
The storage industry is also recognizing the need for more sophisticated storage solutions in areas outside science and technology. For example, Randy Thorburn, vice president of sales and marketing at Avail Solutions, has created storage solutions with the retail industry in mind and argues that backup and restore technology has to become more intelligent. Thorburn said, "Until now, backup has consisted of putting all data into one blob and then moving that blob around." Thorburn contends that storage administrators should back up different data types in different ways. He believes that backup storage has to become more like a library in which the locations of different data files are clearly delineated.
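One way to picture the library-style backup Thorburn describes is a catalog that records, for every file, its data type, the policy applied to it, and where the copy lives. The Python sketch below is a hypothetical illustration, not a description of any Avail Solutions product; the policy names, the BackupCatalog class, and the sample paths and locations are all invented for the example.

```python
from dataclasses import dataclass, field

# Hypothetical per-type policies; a real site would define its own.
POLICIES = {
    "transactional": "frequent incremental backup to disk",
    "reference": "single copy to write-once storage",
    "media": "weekly full backup to tape",
}
DEFAULT_POLICY = "nightly full backup"


@dataclass
class BackupCatalog:
    """Toy catalog that tracks what was backed up, how, and where."""

    entries: dict = field(default_factory=dict)

    def record(self, path: str, data_type: str, location: str) -> str:
        # Choose the policy by data type instead of treating all data as one blob.
        policy = POLICIES.get(data_type, DEFAULT_POLICY)
        self.entries[path] = {
            "type": data_type,
            "policy": policy,
            "location": location,
        }
        return policy

    def locate(self, path: str) -> str:
        # Like a library index: look up where a given file's backup is kept.
        return self.entries[path]["location"]


catalog = BackupCatalog()
catalog.record("/db/orders.log", "transactional", "disk-array-1")
catalog.record("/video/training.mpg", "media", "tape-07")
```

With a catalog like this, restoring a single file starts with a lookup, such as catalog.locate("/video/training.mpg"), rather than a search through an undifferentiated backup set.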
The drive toward application-specific and content-specific storage solutions is running in parallel with the dominant move toward a simpler, more unified storage infrastructure. That drive adds complexity to the storage equation. But sometimes complexity is a good thing.