World’s Largest Windows NT Cluster Goes Live

The Advanced Cluster Computing Consortium (AC3), which includes Cornell University, Intel, Microsoft, Dell, and Giganet, announced on August 12, 1999, that it had completed the installation of a 256-processor high-performance computer cluster running Windows NT 4.0. AC3’s cluster bests a University of Illinois 192-processor NT cluster, which Windows NT Magazine covered in June 1999. AC3 built the cluster, called the AC3 Velocity, using 64 quad-processor 500 MHz Pentium III Dell PowerEdge servers, Dell PowerVault storage, Giganet’s cLAN host adapters and switches, and MPI/Pro middleware software. The project cost an estimated $3 million and produced a supercomputer with a performance of 122 gigaflops (billion floating-point operations per second). A member of the consortium estimated that this system can replace a supercomputer costing from $15 million to $30 million, depending on the applications you deploy. The AC3 Velocity is notable in many respects, but most significantly in its price-to-performance ratio, its use of commodity equipment and open architecture, and the speed with which the system went from boxes to working cluster. The AC3 consortium reached its design goal of bringing the system live within 24 working hours. Engineers from Giganet completed the installation in about 10 hours; you can read about it at http://www.tc.cornell.edu/Events/1999/VelocityInstall/index.html.
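As a rough illustration of the price-to-performance claim, the following sketch works through the arithmetic using only the figures quoted above. The variable and function names are made up for this example, and the results are back-of-the-envelope estimates rather than numbers published by AC3.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the article.
# All names here are illustrative; none come from AC3 or its vendors.

NODES = 64                      # Dell PowerEdge servers
CPUS_PER_NODE = 4               # quad-processor Pentium III systems
COST_USD = 3_000_000            # estimated project cost
QUOTED_GFLOPS = 122.0           # performance figure quoted for the cluster
REPLACED_COST_RANGE_USD = (15_000_000, 30_000_000)  # supercomputer it could replace


def usd_per_gflop(cost_usd: float, gflops: float) -> float:
    """Return the cost in dollars of one gigaflop of quoted performance."""
    return cost_usd / gflops


processors = NODES * CPUS_PER_NODE
cost_per_gflop = usd_per_gflop(COST_USD, QUOTED_GFLOPS)
gflops_per_processor = QUOTED_GFLOPS / processors
# How many times more the replaced supercomputer would cost (low, high).
cost_ratio = tuple(c / COST_USD for c in REPLACED_COST_RANGE_USD)

print(f"Processors:            {processors}")
print(f"Cost per gigaflop:     ${cost_per_gflop:,.0f}")
print(f"Gigaflops per CPU:     {gflops_per_processor:.3f}")
print(f"Replaced-system ratio: {cost_ratio[0]:.0f}x to {cost_ratio[1]:.0f}x")
```

On these numbers the cluster works out to roughly $24,600 per gigaflop, and the supercomputer it is estimated to replace would cost five to ten times as much.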
Justin Rattner, Intel Fellow and director of Intel’s Server Architecture Lab, said, “Clusters of servers, based on Intel processors and multiple industry standards, like the Virtual Interface Architecture, have emerged as one of the premier strategies to provide both high availability and scalability in the computationally-intensive Internet environment.” Todd Needham, manager of research programs at Microsoft, added, “We’re focused on working with AC3 to build and validate a model for supercomputing constructed from industry standards, high performance networking, and Windows NT that industry and large enterprises can apply to a wide range of problems.” The Cornell Theory Center (CTC) expects to move to Windows 2000 (Win2K) later this year and to scale the cluster beyond 256 processors to as many as 512. AC3 presented the cluster project as a demonstration of something that any company can do, provided it has the funds.
The consortium built the AC3 Velocity to serve the research needs of the CTC. Thomas F. Coleman, CTC director, said, “We’re finding that Windows NT-based cluster computing is an attractive environment for computer scientists and computational scientists, and is bringing new researchers from business and the social sciences into the CTC community.” CTC will use the AC3 Velocity for technical applications such as biomedical and genomics research, seismic processing, materials modeling, large-scale database and data warehousing applications, and computer science research in areas such as parallel I/O and systems reliability. Cornell pointed to some of the applications already under way on the cluster, including marketing research using very large datasets and computationally intensive techniques at the Johnson Graduate School of Management, and a new generation of software for fault-tolerant, load-balanced “data center” applications for managing massive databases.
CTC’s Cluster Computing Solutions Group had already begun experimenting with porting applications from UNIX to NT on a smaller cluster before the installation of the AC3 Velocity. That work provided performance and scaling data, and from it emerged methods for enhancing the clustering technologies for computationally intensive projects. The cluster appears as a single machine image to applications that can talk to its MPI/Pro middleware. The project used other software, including OpenMP from Kuck and Associates and the Portland Group, with compilers for Fortran 90, C, and C++. MPI/Pro includes ClusterCoNTroller, a resource management and scheduling tool developed by David Lifka at CTC and commercialized by MPI Software Technology. Other AC3 infrastructure members include Etnus, Inc., Fluent, Inc., ILOG, Inc., MPI Software Technology, Inc., the Numerical Algorithms Group, The Portland Group, Inc., Reliable Network Solutions Inc., and SAS Institute, Inc. You can find additional information about AC3 and the cluster at http://www.tc.cornell.edu/AC3/Memberships/.
