Leave That Old Data Alone

Data archiving is one of the principal pillars of a company's storage infrastructure. As organizations increasingly adopt information lifecycle management (ILM) strategies, they'll have to develop policies to govern both when to move data to lower-cost, less-available storage technologies and what data to move.

In the data-lifecycle process, one of the least-addressed questions to date has been this: When does data come to the end of its life? When should companies simply purge data rather than keep it? Such questions are particularly pressing in large companies and companies that work in regulated environments. Because these companies must maintain hot backups of data as well as near-online and archived versions of nearly everything, storage requirements are soaring to accommodate what are, in reality, many redundant copies of the same information. Compounding the issue, in regulated industries data-retention policies are often dictated by law, not by the well-reasoned decisions of management.

The solution to data-lifecycle issues can look very different in smaller companies and companies that don't operate in highly regulated environments, as reader Benno Belhumeur pointed out to me in response to an earlier column. Belhumeur is the director of IT at Durkee Brown Viveiros and Werenfels Architects (DBVW). Founded in 1994, DBVW is an award-winning architectural firm with approximately 25 employees located in Providence, Rhode Island. The firm's broad range of projects includes designing commercial and institutional buildings, restoring and adaptively reusing historic structures, creating affordable neighborhood housing, and designing private residences. DBVW is a Microsoft-only shop, and the company's two-person IT department is in the process of installing a NAS solution.

Belhumeur questions whether physically moving data for archiving purposes would address his company's needs at all. The scenario in his shop looks like this: Although DBVW still maintains paper files of its construction drawings and regularly backs up its data to tape, increased storage capacities have made offloading data for archiving unnecessary. The data captured in electronic documents, Belhumeur notes, has a useful life of perhaps 6 to 7 years. By the time data has reached the end of its life, storage capacities have increased by such a large factor that culling through what should be saved and what should be discarded isn't a cost-effective use of time. It's more efficient just to save everything, letting data stay where it is. Belhumeur estimates that such data fills as little as 5 percent of his total capacity. In short, he says, companies like his can basically keep everything around and pay a very small cost to do so.

DBVW has several other specific characteristics that have shaped its storage strategy. First, the company is legally responsible only for maintaining the drawings and specifications as delivered for construction, which are known as the "contract documents." Most of these are still delivered on paper, though many firms are beginning to move to PDFs. DBVW also retains all the email correspondence associated with specific contract documents.

DBVW isn't required to retain the electronic CAD files, although the firm prefers to keep them. Nevertheless, it's often very difficult to get old electronic files, called "sheets," to work again because of the number of dependencies that are built into them. Sheets are created for specific printers and are generally linked to other data or applications, style sheets, and so on. As Belhumeur puts it, trying to retrieve and use old electronic documents is often like having a broken external link in a Word document multiplied by 100.

The DBVW experience raises three important issues for small businesses in addressing their storage needs. First, despite the nonstop buzz in the storage arena to do this or do that, doing nothing extraordinary can be a good strategy, as long as the business has well-thought-out reasons for doing nothing, as in DBVW's case. Second, when IT establishes data-retention strategies, it must ensure that data can actually be usefully retrieved. As composite document generation becomes more commonplace, retaining data in a useful electronic form can become more challenging.

Finally, for an approach that takes advantage of increased storage capacities to leave data in place "forever," developing and deploying efficient search technology can be a critical factor in the success of this strategy. After all, if data is archived in your system, somebody might ask for it. When this happens, you'll want to be able to easily find and retrieve data in a usable form.
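
To make the search point concrete, here is a minimal sketch of one way to index data that has been left in place. It is an illustration under stated assumptions, not a description of what DBVW runs: the function names build_index and search are hypothetical, and the sketch assumes the archive contains plain-text files with a .txt extension. A real deployment would more likely rely on an existing desktop or enterprise search product that can read CAD, PDF, and email formats.

```python
import re
from collections import defaultdict
from pathlib import Path


def build_index(root):
    """Map each lowercase word to the set of files under root that contain it.

    Assumption for this sketch: the archive holds plain-text .txt files.
    """
    index = defaultdict(set)
    for path in Path(root).rglob("*.txt"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        # Record each distinct word once per file.
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index[word].add(path)
    return index


def search(index, query):
    """Return the files that contain every word in the query, sorted by path."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # A file matches only if it appears in the set for every query word.
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

Building the index once and then calling search(index, "library roof") returns only the files that mention both words. The design choice worth noting is that the index is built ahead of time, so answering a request does not require rereading years of accumulated files.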
