More power to you
By Lea Anne Bantsari (February 2003)
From understanding the heart to preventing accidental explosions, high-performance computing clusters power scientific research aimed at making the world a safer, healthier, and all-around better place
In universities and research institutions around the globe, scientists crank out terabytes of data annually while researching almost any topic imaginable. From global warming to human behaviour to the incidence of three-legged frogs in Minnesota, scientific research generates mountains of information, and researchers require high-powered equipment to process, store, and access this data. Until the last few years, however, that equipment was extremely expensive to purchase and maintain.
Now, the steep cost of specialized hardware is giving way to lower-cost commodity-based systems. That trend has combined with the growing popularity of open-source software to produce the Linux-based high-performance computing (HPC) cluster—a computing technique that puts fast, affordable supercomputing within reach.
Scientists have long sought ways to harness the full potential of the microprocessor to accelerate their work. Many research projects, however, demand processing power that far exceeds the capacity of a single CPU.
The birth of the "supercomputer"—a single computer with multiple CPUs—delivered the processing power needed to handle many tasks simultaneously at high speed. The drawback: These systems require extremely specialized hardware and software, as well as a dedicated support staff to ensure uptime. Despite their tremendous abilities, supercomputers' high cost usually puts them out of reach for research institutions that must compete for every dollar they acquire.
"Supercomputers have large memory, exceptional floating-point performance, and very good I/O," says Chuck Sears, director of Research Computing Services at Oregon State University's College of Oceanic and Atmospheric Sciences. "They're great for scientific computing, but they're too expensive and too specialized."
In recent years, the clustering technique has emerged as a viable alternative to the supercomputer. A cluster comprises several smaller, commodity-based computers networked together to act as a single larger computer. The advantage is a superior price/performance ratio: Several smaller machines often cost far less than a single supercomputer, while their combined processing capability can be nearly equal.
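How does a roomful of commodity machines behave like one big computer? On scientific clusters of this kind, programs are typically written against a message-passing library such as MPI (the Message Passing Interface), which lets many copies of a program, one per processor, divide a problem among themselves and combine their answers over the network. The following sketch is purely illustrative (it is not code from any of the organizations profiled here): it estimates pi by numerical integration, with each process handling its own share of the intervals.

```c
#include <stdio.h>
#include <mpi.h>

/* Estimate pi by integrating 4/(1+x^2) over [0,1], with the work
 * divided evenly among all processes in the cluster job. */
int main(int argc, char *argv[])
{
    const long n = 100000000;          /* total integration intervals */
    int rank, size;
    long i;
    double h, x, local_sum = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's ID */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total processes in the job */

    h = 1.0 / (double)n;
    /* Each process takes every size-th interval, starting at its rank,
     * so the n intervals are covered exactly once across all processes. */
    for (i = rank; i < n; i += size) {
        x = h * ((double)i + 0.5);
        local_sum += 4.0 / (1.0 + x * x);
    }
    local_sum *= h;

    /* Combine the partial sums from every node into one result on rank 0. */
    MPI_Reduce(&local_sum, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi is approximately %.12f\n", pi);

    MPI_Finalize();
    return 0;
}
```

Built with an MPI compiler wrapper such as mpicc and launched with a command like mpirun -np 64 ./pi, the same unmodified program can run on a single workstation or across dozens of cluster nodes. That transparency is precisely what makes the commodity approach attractive: adding nodes adds processing power without rewriting the software.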
An open-and-shut case for open source
Proprietary cluster packages, however, can also require specialized hardware and software—sometimes forcing users to pay a premium and work hard for interoperability. To save the most money on an HPC solution, experts say users should consider basing it on the Linux® operating system (OS). In fact, the Aberdeen Group predicts that Linux will dominate about 80 percent of the HPC market within two to three years.1 The hope is that this open-source, UNIX®-like operating system will finally address the three critical concerns that have plagued HPC proponents for years: price, practicality, and performance.
Indeed, Linux has found a niche in the world of high-performance computing because it is a natural match for anyone who needs to process massive amounts of data. Unlike proprietary systems, Linux requires no licensing fees, so organizations can distribute it freely across every server in a cluster. As the trend toward commodity-off-the-shelf (COTS) systems grows, a flexible OS such as Linux is an obvious choice for achieving maximum value. Moreover, support and tools from Linux vendors are increasingly available, which means that administering and upgrading a Linux HPC cluster, already simple tasks, are becoming even easier.
"Our Linux-based HPC cluster is so straightforward and has such simplified maintenance requirements that a local high school sophomore has installed the operating system and facilitates systems management," says Dr. Andrew Pollard, associate professor of biomedical engineering at the University of Alabama at Birmingham.
Linux also extends the lifespan of an HPC cluster by making its components longer lasting, easier to acquire, and often reusable. Because Linux clusters are built on an open-source system, they require no proprietary hardware or software; instead, they can leverage legacy or commodity-based systems and be upgraded component by component as necessary. The overall result is a more practical solution for research organizations that cannot afford to rebuild HPC systems repeatedly.
Linux believers also rate the operating system's performance as top-notch. Its flexibility allows it to interoperate with any number of components regardless of manufacturer, enabling organizations to assemble a system that suits their specific performance needs. Linux also runs on nearly any kind of processor. For many organizations, these qualities have delivered a cost-effective combination of processing speed and stability.
Linux clusters in the real world
From universities to laboratories around the globe, researchers use Linux-based HPC clusters to advance scientific discovery in many areas of study. By putting HPC clusters to work in the name of science, research institutions have turned an old adage on its ear: Knowledge may be power, but in the case of high-performance computing, it seems that power can also be turned into knowledge.
Searching for a cure
At the Buffalo Center of Excellence in Bioinformatics at the University at Buffalo (UB), a campus of the State University of New York, researchers run a 2,000-node Linux-based HPC cluster on Dell™ PowerEdge™ servers to perform more than 5 trillion calculations per second on biological research data. Researchers combine such high-tech practices as supercomputing and visualization with scientific expertise in genomics, proteomics, and bioimaging.
The goal is to unlock the secrets of diseases such as cancer, Alzheimer's, and AIDS so that scientists in the medical field can apply the knowledge to their pursuit of cures. The Linux-based HPC cluster puts these medical breakthroughs on the fast track: Analysing the same data on a single computer with one processor would take approximately 2,000 years.
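The arithmetic behind that comparison is worth a moment. As a back-of-envelope sketch, assume for illustration that each of the cluster's 2,000 nodes performs roughly on par with that single-processor machine and that the workload parallelizes with near-ideal efficiency:

\[
T_{\mathrm{cluster}} \approx \frac{T_{\mathrm{single}}}{N_{\mathrm{nodes}}} = \frac{2{,}000\ \text{years}}{2{,}000\ \text{nodes}} \approx 1\ \text{year}.
\]

In other words, two millennia of single-CPU work compresses into about a year of cluster time. The quoted 5 trillion calculations per second likewise implies roughly 2.5 billion calculations per second per node, a plausible rate for commodity processors of the day.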
Following the events of September 11, 2001, scientists around the world became increasingly interested in how to minimize the danger of fires and explosions. Whether accidental or deliberate, detonations can cause severe structural damage, as well as human injury or death.
Now, with the help of a Linux-based HPC cluster made up of Dell PowerEdge servers, researchers at the University of Utah's Center for the Simulation of Accidental Fires and Explosions (C-SAFE) study the ways in which unsafe chemical storage can result in fire disasters. The goal is to help government and corporate organizations improve handling of hazardous materials so that future accidental fires and explosions might be prevented.
Like computing infrastructures, global energy sources must be scalable and able to accommodate increases in demand to be both practical and effective. Unfortunately, many of the world's energy resources are diminishing rapidly. That's why Compagnie Generale de Geophysique (CGG), a leading supplier of products and services to the oil and gas industry, is working diligently to locate new, usable oil fields.
With the assistance of several HPC clusters running Linux on Dell PowerEdge servers, CGG tackles complex calculations at facilities in the United Kingdom and in Houston, Texas, bringing researchers closer to pinpointing new oil sites—and potentially increasing the world's known supply of this critical natural resource.
The final frontier
Under the expansive Australian sky, scientists at Swinburne University in Victoria are boldly going where no man has gone before—with the help of a cost-effective, high-performance cluster. Based on Dell PowerEdge servers running Linux, this stellar HPC cluster processes data from the Parkes Radio Telescope in an effort to map new territories in space. Additionally, the HPC cluster uses data from the Mars Orbiter Laser Altimeter to model the surface of Mars for use in educational films.
Swinburne scientists say they are pleased with the cluster's processing capacity and its price/performance ratio, which they attribute to the use of Linux in conjunction with standards-based technologies.
Heart at work
Inside the Department of Biomedical Engineering at the University of Alabama at Birmingham, a Linux-based HPC cluster deals with matters of the heart in the hopes that researchers might better understand the often-fatal complications of cardiac arrhythmia.
The Linux-based cluster, on Dell PowerEdge servers, provides heavy-duty processing power for the University's modeling and mapping projects, which involve complicated computerized dissections of the heart's function and musculature. In addition, the system's reliability and stability help to simplify maintenance and administration—easing the strain on valuable staffing resources and helping the lab to cut costs.
Good things come in packages
Linux is free for anyone to download from the Internet. But there's more to Linux than its lack of a price tag. In fact, the operating system can play a critical role in the ultimate performance and longevity of a cluster network. To make the most of a Linux HPC deployment, users need a packaged offering that includes support from a reliable hardware vendor to ensure speed, scalability, and ongoing maintenance.
Dell has aligned the strength of its PowerEdge enterprise servers with support from Dell Professional Services and the market-leading commercial Linux distribution from Red Hat. What's more, Dell recently cranked up its HPC program to enable greater scalability, optimised performance and connectivity, and remote management capabilities.