16xGPU System
Published by Bogdan Alex, on July 29th, 2008, in the categories: News
We know Roadrunner is the fastest supercomputer on Earth (for the moment at least), but how much does it cost to assemble and maintain such a behemoth? Millions of dollars? Not that much for a government-funded project. I guess the guys who assembled the Roadrunner didn’t take into consideration what NVIDIA CUDA can provide for their endeavors. CUDA lets all the unified stream processors found in a GPU work as separate general-purpose processors. So if we have two NVIDIA 9800GX2 cards, that would amount to 2 x 256 processors running at around 1.4 GHz. This setup alone can turn your computer into a miniature supercomputer that can solve complex equations and run impressive simulations. How about 8 of these cards working in parallel?
MIT graduate students Nicolas Pinto, David Cox and James DiCarlo have managed to assemble an impressive 16-GPU system composed of eight 9800GX2 video cards donated by NVIDIA. That would translate into 2048 processors that would deliver more than 20 TFLOPS. The CUDA architecture will soon be adopted by ATI, as well, and that means we will get to use 1600 processors on a single Radeon 4870X2 card. Sure, the ATI unified processors are clocked at a lower frequency than those found on NVIDIA’s cards. Still, eight 4870X2 cards would net you the power of 12,800 processors (8 x 1600) clocked at around 800 MHz.
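The processor totals quoted above are plain multiplication, and a few lines of Python make them easy to check. This is only a sketch: the per-card counts are the ones given in this post, while the `PROCESSORS_PER_CARD` table and the `total_processors` helper are names made up for illustration. Keep in mind that a raw processor count says nothing about real-world throughput.

```python
# Per-card stream-processor counts as quoted in the post (not measured here).
PROCESSORS_PER_CARD = {
    "9800GX2": 256,   # dual-GPU NVIDIA card
    "4870X2": 1600,   # dual-GPU ATI card
}


def total_processors(card: str, num_cards: int) -> int:
    """Return the combined stream-processor count for identical cards."""
    if num_cards < 0:
        raise ValueError("num_cards must be non-negative")
    return PROCESSORS_PER_CARD[card] * num_cards


if __name__ == "__main__":
    # Hypothetical setups: two cards, then the eight-card builds discussed above.
    for card, count in [("9800GX2", 2), ("9800GX2", 8), ("4870X2", 8)]:
        total = total_processors(card, count)
        print(f"{count} x {card}: {total} stream processors")
```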
Ubergizmo reports that the high-throughput method the three students promote can also use other widely available technologies like IBM's Cell Broadband Engine processor (found in Sony's PlayStation 3) or Amazon's Elastic Compute Cloud (EC2) service. What puzzles me is the fact that the team is also involved in the PetaVision project on the Roadrunner, so why didn’t they use the CUDA architecture? That would have cut the costs drastically. I reckon the Roadrunner was designed before the CUDA architecture actually got released.
9 Comments on “16xGPU System”
web design company said on 08/20/2008:
supercomputers make me happy
Joe said on 12/04/2008:
Still only runs Crysis at 26FPS...
Greg said on 12/04/2008:
funny they got all this and still use a DVI to VGA adapter
Jake said on 12/05/2008:
Hahahaha I thought the exact same thing Greg.
LeMelon said on 12/05/2008:
I want benchmarks! :-(
Pretty cool though!
Blah said on 12/06/2008:
GPUs are only able to achieve that high performance for single precision floating point operations. Most scientific work is done in double precision. The double precision performance of GPUs blows. This makes sense if you think about it. If your pretty graphics only have 8 significant decimal digits of accuracy or whatever (I'm a little tipsy and I don't remember the real number, sorry), who cares? But in scientific applications, where the rounding error often accumulates or propagates as the simulation proceeds, this is a big deal. There are some special applications in computational science that could benefit from CUDA, but it isn't general purpose, certainly. That's the real reason a cluster like Roadrunner doesn't use them.
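The point about rounding error piling up can be seen with a small experiment. The sketch below adds 0.1 a million times, once in ordinary double precision and once with every intermediate result rounded to single precision. It runs on the CPU, not a GPU, and it imitates single precision by round-tripping values through Python's `struct` module, so it is an approximation of the effect and not a benchmark of any graphics card. The helper names `to_float32` and `accumulate` are made up for this example.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double-precision Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step: float, n: int, single: bool) -> float:
    """Add `step` to a running total `n` times.

    With single=True, the step and every partial sum are rounded to
    single precision, imitating a float32 accumulator.
    """
    if single:
        step = to_float32(step)
    total = 0.0
    for _ in range(n):
        total += step
        if single:
            total = to_float32(total)
    return total


if __name__ == "__main__":
    n = 1_000_000
    print("ideal result    :", n / 10)
    print("double precision:", accumulate(0.1, n, single=False))
    print("single precision:", accumulate(0.1, n, single=True))
```

The single precision total drifts visibly away from the ideal value, while the double precision total stays very close to it. A long simulation that keeps feeding its own output back in is exposed to the same kind of drift.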
Jérôme Muffat-Méridol said on 12/13/2008:
So, you have these 8 cards sitting in one computer... What happens when they start needing to talk with each other? Masses of data trying to go through the PCI bus... Ouch... I guess this can work if the problem to solve can be formulated as very independent calculations.
Cool machine, though...
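"Very independent calculations" can be pictured as handing each GPU its own slice of the work up front, so that nothing has to cross the bus between cards while they compute. The sketch below only does the bookkeeping for such a split. The `split_evenly` helper is a made-up name, the figure of 16 devices is taken from the post, and none of this is the MIT team's actual code.

```python
def split_evenly(items, num_devices):
    """Partition `items` into `num_devices` contiguous chunks.

    Chunk sizes differ by at most one, and each chunk can be processed
    without any data from the other chunks.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    base, extra = divmod(len(items), num_devices)
    chunks = []
    start = 0
    for i in range(num_devices):
        # The first `extra` devices take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    jobs = list(range(100))           # 100 hypothetical independent jobs
    for gpu, chunk in enumerate(split_evenly(jobs, 16)):
        print(f"GPU {gpu:2d}: {len(chunk)} jobs")
```

If the jobs needed each other's intermediate results, this kind of one-time split would no longer be enough and the traffic between cards would become the bottleneck the comment describes.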
blaarhg said on 12/14/2008:
A supercomputer hooked up to a VGA monitor.... come on... where are the 16 widescreen monsters?
Personne said on 12/17/2008:
What kind of motherboard is this?