ncsa

NCSA director: GPU is future of supercomputing

The director of the National Center for Supercomputing Applications has seen the future of supercomputing and it can be summed up in three letters: GPU.

Thom Dunning, who directs the NCSA and the Institute for Advanced Computing Applications and Technologies at the famed supercomputing facilities on the campus of the University of Illinois at Urbana-Champaign, says high-performance computing will begin to move toward graphics processing units, or GPUs. Not coincidentally, this is exactly what China has done to achieve the world's fastest speeds with its "Tianhe-1A" supercomputer. That computer combines about 7,000 Nvidia GPUs with 14,000 Intel CPUs, making it the only hybrid CPU-GPU system in the world of that scale.

"What we're really seeing in the efforts in China as well as the ones we have in the U.S. is that GPUs are what the future will look like," said Dunning in a phone interview Thursday. "What we're seeing is the beginning of something that's going to be happening all over the world."

NCSA already has a small CPU-GPU hybrid system. "It's something we have been working on for a number of years. We have a CPU-GPU cluster for the NCSA academic community. Made up of Intel CPUs and Nvidia GPUs. A 50 teraflop machine," he said. (Note that Oak Ridge National Laboratory is also installing a hybrid system now.)

But it's not going to be a snap to tap into the processing potential of GPUs. "Programming these machines to do [GPU] calculations is still a very substantial effort. There will be some applications that will be rewritten to use GPUs [but] a lot of times it will be only part of an application that will use it so you won't get nearly the power and computing advantage of running it all on the GPU," he said.
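Dunning's point about partial rewrites can be illustrated with Amdahl's law, a standard way to estimate overall speedup when only part of a program is accelerated. The sketch below is not from the interview: the function name, the GPU fractions, and the 20x per-kernel speedup are all hypothetical values chosen for illustration.

```python
def amdahl_speedup(gpu_fraction: float, gpu_speedup: float) -> float:
    """Estimate whole-application speedup via Amdahl's law.

    gpu_fraction: share of the original runtime moved to the GPU (0.0 to 1.0).
    gpu_speedup:  how many times faster that share runs on the GPU.
    Both inputs are assumptions supplied by the caller, not measured figures.
    """
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    # The CPU-only share runs at its old speed; the GPU share shrinks.
    remaining_time = (1.0 - gpu_fraction) + gpu_fraction / gpu_speedup
    return 1.0 / remaining_time


if __name__ == "__main__":
    # Hypothetical scenario: the GPU runs its share of the work 20x faster.
    for fraction in (0.25, 0.5, 0.9, 1.0):
        overall = amdahl_speedup(fraction, 20.0)
        print(f"{fraction:.0%} of runtime on GPU -> {overall:.2f}x overall")
```

Under these assumed numbers, an application with only half of its runtime on the GPU ends up less than twice as fast overall, however fast the GPU portion runs, which is the gap Dunning describes.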

The catalyst to move programmers en masse toward GPUs will be when chips appear that combine both high-performance CPU and high-level GPU functions on the same piece of silicon, Dunning said. "If they start to solve some of these other problems like…

IBM: Envisioning the world's fastest supercomputer

IBM will release a radical new chip next year that will go into a University of Illinois supercomputer in a quest to build what may become the world's fastest supercomputer.

That university's supercomputer center is a storied place, home to famous supercomputers both fictional and real. The notorious HAL 9000 sentient supercomputer in "2001: A Space Odyssey" was built in Urbana, Illinois, presumably on the University of Illinois Urbana-Champaign campus.

Though not aspiring to artificial intelligence, the IBM Blue Waters project supercomputer, like the HAL 9000 series, will be able to do massively complex calculations in an instant and, like HAL, be built in Urbana-Champaign. It is being housed on the Urbana-Champaign campus in a special building constructed specifically for the computer, which will theoretically be capable of achieving 10 petaflops, about 10 times as fast as the fastest supercomputer today. (A petaflop is 1 quadrillion floating point operations per second, a key indicator of supercomputer performance.)
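To put the petaflop figure in concrete terms, the following sketch converts petaflops to raw operations per second and estimates how long a job would take at that theoretical peak rate. The function names and the sample workload size are hypothetical, and real applications typically sustain only part of a machine's peak.

```python
PETA = 10**15  # 1 quadrillion, per the definition of a petaflop above


def petaflops_to_flops(petaflops: float) -> float:
    """Convert petaflops to floating point operations per second."""
    return petaflops * PETA


def seconds_at_peak(total_operations: float, petaflops: float) -> float:
    """Time to finish a workload if the machine ran at its theoretical peak."""
    return total_operations / petaflops_to_flops(petaflops)


if __name__ == "__main__":
    blue_waters_peak = 10  # petaflops, the theoretical figure cited above
    print(f"{petaflops_to_flops(blue_waters_peak):.0e} operations per second")

    # Hypothetical workload of 10**18 operations, chosen only for illustration.
    workload = 10**18
    print(f"{seconds_at_peak(workload, blue_waters_peak):.0f} seconds at 10 petaflops")
    print(f"{seconds_at_peak(workload, 1):.0f} seconds at 1 petaflop")
```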

Part of the National Center for Supercomputing Applications (NCSA) at the University of Illinois, it will be the largest publicly accessible supercomputer in the world when it's turned on sometime in 2011.

A supercomputer is essentially a large collection of microprocessors acting in concert on a complex problem. As processor designs go, the IBM Power7 processor that will go into Blue Waters--due in the first half of 2010--is a big step for IBM. According to Bradley McCredie, an IBM Fellow in the Systems and Technology Group, Power7 fuses the flagship Power chip design with key technology from the separate "Cell" processor, which was part of IBM's "Roadrunner" system at the Los Alamos National Laboratory, a supercomputer that has often been ranked as the fastest in the world.

"We took some of that genetic material from the Cell program--ways to do floating point (calculations)--and embedded that right into the Power7 core," McCredie said in an interview with CNET.

But that's not the only thing that makes the Power7 chip special. It integrates eight processing cores in one chip package and each core can execute four tasks--called "threads"--turning an individual chip into a virtual 32-core processor. As a yardstick, Intel's high-end Xeon processors typically have two threads per processing core.
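The "virtual 32-core" figure is simply cores multiplied by threads per core. A minimal sketch of that arithmetic follows; the function name is hypothetical, and because the article gives no core count for the Xeon comparison, the Xeon core counts below are assumed values used only to show the calculation.

```python
def hardware_threads(cores: int, threads_per_core: int) -> int:
    """Number of tasks a chip can execute at once (its hardware threads)."""
    if cores < 1 or threads_per_core < 1:
        raise ValueError("cores and threads_per_core must be at least 1")
    return cores * threads_per_core


if __name__ == "__main__":
    # Power7 figures as stated above: eight cores, four threads per core.
    print("Power7:", hardware_threads(cores=8, threads_per_core=4), "threads")

    # Xeon: two threads per core per the article; core counts are assumed.
    for assumed_cores in (4, 8):
        total = hardware_threads(cores=assumed_cores, threads_per_core=2)
        print(f"Xeon with {assumed_cores} cores (assumed): {total} threads")
```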

IBM is also using novel memory technology. Widely used "static" RAM, the on-chip memory in almost all processors today, can add as much as a billion transistors to a high-end processor. IBM wanted to avoid these ballooning--and costly--transistor counts and elected to use a technology called E-DRAM (embedded DRAM), keeping the total number of transistors to 1.2 billion. "The equivalent number of transistors if we had done all of the cache in (static RAM) is well in excess of two billion," McCredie said.

And the chip's speed? Between…