The amount of data that enterprises store is increasing rapidly. The META Group estimates that corporate data will increase thirtyfold over the next 10 years, which works out to compound growth of roughly 40 percent a year. Several factors are fueling the growth. For example, in regulated industries (e.g., health care, transportation), the federal government has mandated that industry participants preserve significant amounts of data for years.
The widespread adoption of enterprise-level applications such as customer relationship management (CRM), supply chain management (SCM), and enterprise resource planning (ERP) also increases data storage needs. Implementing these applications is high on the IT agenda at many large and midsized companies, but the applications require significant amounts of data storage. Each enterprise application relies on an underlying database, and, as Mike Grovesnor, an industry analyst with the Giga Information Group, has noted, "Databases are the number-one storage hog."
Storage requirements promise only to grow. According to industry observers, for example, migrating an application to Oracle 11i (to increase performance and gain functionality) could also increase its data-storage requirements by as much as 40 percent because of the structure of the Oracle database tables.
Administrators have used several strategies to address the escalating data-storage demands that enterprise-level applications pose. The first and most common strategy has been to throw more servers and storage capacity at the problem. But even as the cost of hardware has fallen, the overall cost of managing the storage infrastructure has risen.
Storage virtualization is a second and more technically sophisticated approach to managing application-data storage needs. Virtualization makes it easier to view the storage infrastructure as an integrated whole. If one database needs additional storage capacity, you can assign space anywhere on the Storage Area Network (SAN), eliminating the need to install additional hardware and software in one corner of the enterprise if there is unused capacity in another corner. However, few companies have embraced virtualization so far.
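To make the pooling idea concrete, here is a minimal Python sketch of a virtualized pool that satisfies a request from whichever arrays have free space. The `StoragePool` class, the array names, and the "largest free space first" placement rule are all assumptions made for illustration; real SAN virtualization products use their own allocation policies.

```python
class StoragePool:
    """Presents several physical arrays as one pool of capacity (sizes in GB).

    Hypothetical illustration only; not modeled on any particular product.
    """

    def __init__(self, arrays):
        # arrays: mapping of array name -> free capacity in GB
        self.free = dict(arrays)
        self.volumes = {}  # volume name -> list of (array name, GB) extents

    def total_free(self):
        """Free capacity across the whole pool, regardless of location."""
        return sum(self.free.values())

    def allocate(self, volume, size_gb):
        """Grow `volume` by size_gb and return all extents it now holds."""
        if size_gb > self.total_free():
            raise ValueError("pool exhausted; additional hardware is needed")
        remaining = size_gb
        extents = self.volumes.setdefault(volume, [])
        # Assumed placement rule: draw from the emptiest arrays first.
        for name in sorted(self.free, key=self.free.get, reverse=True):
            if remaining == 0:
                break
            take = min(self.free[name], remaining)
            if take > 0:
                self.free[name] -= take
                extents.append((name, take))
                remaining -= take
        return extents
```

In this sketch, a database that needs 220GB can be served from two arrays that individually hold 200GB and 50GB of free space, so no new hardware is installed until the pool as a whole runs out.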
A third approach to data storage is active archiving. Pioneered by Princeton Softech, active archiving identifies data that's seldom used (reference data), moves it off the production database, and archives it on less expensive storage media (e.g., tape), from which you can retrieve or restore the data when needed. The active-archiving process preserves the relational context of the data stored on the secondary-storage system.
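The following Python sketch illustrates the general idea of moving aged rows out of a production database while keeping parent and child rows together. It is not Princeton Softech's implementation. It assumes two SQLite databases standing in for the production system and the archive store, and a hypothetical schema of `orders` and `order_items` tables; the function names are likewise invented for the example.

```python
import sqlite3

# Hypothetical schema: each order has related line items.
SCHEMA = """
CREATE TABLE IF NOT EXISTS orders (
    id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    order_date TEXT NOT NULL          -- ISO date, e.g. '1998-03-01'
);
CREATE TABLE IF NOT EXISTS order_items (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id),
    sku TEXT NOT NULL,
    qty INTEGER NOT NULL
);
"""

OLD_ORDER_IDS = "SELECT id FROM orders WHERE order_date < ?"


def open_db(path=":memory:"):
    """Open a database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def archive_old_orders(prod, archive, cutoff):
    """Move orders dated before `cutoff`, with their line items, to `archive`.

    Returns the number of orders moved.
    """
    orders = prod.execute(
        "SELECT id, customer, order_date FROM orders WHERE order_date < ?",
        (cutoff,),
    ).fetchall()
    items = prod.execute(
        "SELECT id, order_id, sku, qty FROM order_items "
        f"WHERE order_id IN ({OLD_ORDER_IDS})",
        (cutoff,),
    ).fetchall()

    # Copy parents and children to the archive first, so nothing is
    # deleted from production unless the archive write succeeded.
    with archive:
        archive.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
        archive.executemany(
            "INSERT INTO order_items VALUES (?, ?, ?, ?)", items
        )

    # Remove children before parents so no orphaned rows are left behind.
    with prod:
        prod.execute(
            f"DELETE FROM order_items WHERE order_id IN ({OLD_ORDER_IDS})",
            (cutoff,),
        )
        prod.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))
    return len(orders)
```

Because the archived line items keep their `order_id` values, a join in the archive still reconstructs each order, and the same copy-then-delete routine run in the opposite direction would restore the rows to production.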
According to Lisa Cash, president and CEO of Princeton Softech, removing seldom-used reference data from production databases provides many benefits. You can not only use the storage infrastructure more efficiently, reducing the demand for additional hardware and software, but also improve the performance of the applications accessing the database.
In some ways, the concepts behind active archiving resemble those behind Hierarchical Storage Management (HSM), a policy-based storage-management system. The HSM hierarchy uses different types of storage media based on cost and retrieval speeds. As files age, they're automatically moved to less expensive, slower devices. Cash points out, however, that HSM differs from active archiving in that HSM doesn't preserve the relational integrity of the data; it just stores information as flat files.
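An age-based HSM policy of the kind described above can be sketched in a few lines of Python. The tier names, the 90-day and 365-day thresholds, and the `FileRecord` type are assumptions chosen for illustration, not values taken from any HSM product.

```python
from dataclasses import dataclass

# Hypothetical policy: (tier name, minimum file age in days), listed from
# the fastest, most expensive tier to the slowest, cheapest one.
TIERS = [("disk", 0), ("optical", 90), ("tape", 365)]


@dataclass
class FileRecord:
    path: str
    age_days: int
    tier: str = "disk"


def target_tier(age_days, tiers=TIERS):
    """Return the slowest tier whose minimum age the file has reached."""
    chosen = tiers[0][0]
    for name, min_age in tiers:
        if age_days >= min_age:
            chosen = name
    return chosen


def plan_migrations(files, tiers=TIERS):
    """List (path, current tier, target tier) for each file that should move."""
    moves = []
    for f in files:
        dest = target_tier(f.age_days, tiers)
        if dest != f.tier:
            moves.append((f.path, f.tier, dest))
    return moves
```

Note that the policy looks only at each file's age. It knows nothing about relationships between records inside those files, which is the distinction Cash draws between HSM and active archiving.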
Although Princeton Softech has staked a claim as the leader in active archiving, it's not the only player in the field. IBM offers Row Archive Manager, which lets you selectively remove and store aged data. Computer Associates (CA) and Compuware also offer general-purpose archiving tools. However, these tools are geared to making the backup and restore process more efficient rather than to letting databases operate more efficiently.
The idea that active archiving can play an important role in making the storage infrastructure and the application infrastructure more efficient is relatively new. To make active archiving work, storage administrators, database administrators, and end users all must be involved in the implementation. With storage needs growing rapidly, you might want to explore whether active archiving is a good storage solution for your organization.