The Challenge of Managing Petabytes of Storage

The National Center for Atmospheric Research (NCAR), a national research facility with headquarters in Boulder, Colorado, and managed by the University Corporation for Atmospheric Research (UCAR) under primary sponsorship by the National Science Foundation (NSF), conducts wide-ranging research in chemistry, climate and weather, and solar-terrestrial interactions. NCAR also provides UCAR's 66 member universities and other affiliates with instrumentation, aircraft, and computer technology to study the earth's atmosphere.

NCAR is a data-generation superpower. Established in 1986, NCAR's storage facility recently surpassed the 1 petabyte (PB) mark. Within a year or so, NCAR will have generated and stored another petabyte of data. And by 2005, NCAR anticipates it will have around 6PB of archived data. NCAR currently averages about 70,000 robotic mounts a month. By 2005, that number will reach 900,000 a month. NCAR currently has 5 data silos made up of 5 high-performance StorageTek PowderHorn automated tape systems and plans to add 12 more automated tape systems over the next 2 to 3 years.
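To put those projections side by side, the short sketch below computes the growth factors implied by the figures quoted above. The variable names are illustrative, and the calculation assumes the "around 6PB" and "900,000 a month" figures are compared against today's roughly 1PB and 70,000 mounts a month.

```python
# Figures quoted in the article; the names are illustrative, not NCAR's.
MOUNTS_PER_MONTH_NOW = 70_000
MOUNTS_PER_MONTH_2005 = 900_000
ARCHIVE_PB_NOW = 1    # "recently surpassed the 1 petabyte mark"
ARCHIVE_PB_2005 = 6   # "around 6PB" anticipated by 2005


def growth_factor(now: float, later: float) -> float:
    """Return how many times larger `later` is than `now`."""
    return later / now


mount_growth = growth_factor(MOUNTS_PER_MONTH_NOW, MOUNTS_PER_MONTH_2005)
archive_growth = growth_factor(ARCHIVE_PB_NOW, ARCHIVE_PB_2005)

print(f"Robotic mounts per month: {mount_growth:.1f}x")
print(f"Archived data:            {archive_growth:.1f}x")
```

On these numbers, robotic mounts are projected to grow roughly twice as fast as the archive itself.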

The root cause of data generation at NCAR lies with its computing infrastructure. The facility has the 10th and 33rd most powerful supercomputers in the world. Each is a large IBM cluster supercomputer with more than 1200 processors, devoted to weather and climate research. Scientists run simulations on the supercomputers that look at climate change over vast stretches of time. The 10th-ranked supercomputer has an aggregate output of 2Gbps. NCAR streams the data through a high-speed data fabric onto drives, and then to the data silos. The fabric is currently High-Performance Parallel Interface (HIPPI), although Al Kellie, the director of the NCAR Scientific Computing Division, is considering a move to Fibre Channel in the future.

To better understand why NCAR is producing and storing data at such an accelerated rate, how NCAR manages its data storage infrastructure, and how that growth has affected NCAR's IT operations, I talked to Kellie. According to Kellie, neither the speed of access to the data nor drive and tape capacity is a problem. The primary problem is managing storage, which, according to Kellie, isn't well understood.

NCAR has greatly increased its sustained compute capability in the last 4 years. For example, in 1999, NCAR was accumulating a net of 15TB of data per month. Today NCAR accumulates data at a rate of 30TB to 50TB per month. Kellie's team is using data mining techniques to better understand data patterns at the facility. Simulations have produced 80 percent of the 17.2 million files currently in the system. But the increased number of simulations isn't the only storage problem NCAR faces. The other 20 percent of NCAR's archived information represents observation data collections. Users who are involved with these collections generate an enormous number of files. One user, for example, has 1 million files.
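The file counts behind those percentages can be worked out with a rough back-of-the-envelope sketch. It makes two assumptions the article does not state: that the 80/20 split applies to file counts, and that the archive holds about 1PB in decimal units (10^15 bytes). The variable names are illustrative.

```python
# Figures quoted in the article; the names are illustrative, not NCAR's.
TOTAL_FILES = 17_200_000
SIMULATION_PERCENT = 80
ARCHIVE_BYTES = 10**15          # assumption: about 1PB, decimal units
LARGEST_USER_FILES = 1_000_000  # the single user cited by Kellie

# Integer arithmetic keeps the file counts exact.
simulation_files = TOTAL_FILES * SIMULATION_PERCENT // 100
observation_files = TOTAL_FILES - simulation_files

# Average file size in megabytes (10**6 bytes), under the 1PB assumption.
avg_file_mb = ARCHIVE_BYTES / TOTAL_FILES / 10**6

# Share of all files held by the one user with 1 million files.
largest_user_share = LARGEST_USER_FILES / TOTAL_FILES

print(f"Simulation files:  {simulation_files:,}")
print(f"Observation files: {observation_files:,}")
print(f"Average file size: {avg_file_mb:.1f} MB")
print(f"One user's share of all files: {largest_user_share:.1%}")
```

Under those assumptions, the average file is on the order of tens of megabytes, and a single user accounts for a noticeable share of every file in the system, which suggests why file management rather than raw capacity is the pressing concern.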

According to Kellie, managing files is the most pressing concern facing all of the national laboratories. Kellie said NCAR uses the High Performance Storage System (HPSS) archival data storage system that the Department of Energy's National Energy Research Scientific Computing Center developed. HPSS has a storage capacity of 1.3PB and is managed by IBM servers. But, according to Kellie, maintaining his homegrown data storage system, which is in its fourth release, is less expensive.

Data growth has caused NCAR to involve its users in the development of policies for storing and expunging data. Kellie knows that 64 percent of the data NCAR generates has never been read, but that doesn't mean the data isn't valuable. Determining when a particular data file might become useful is often difficult, and user input about the relative value of data files is an important part of policy decision making.

NCAR is now factoring storage cost into its overall budget. In the past, the cost of storage was insignificant for NCAR; now, however, that cost is as important as the cost of compute resources. At some point, storage costs could exceed compute costs. As Kellie sees it, containing storage costs requires new tools to enable users to manage their data holdings more efficiently. According to Kellie, you can't change your storage policies without providing better tools.
