Monday, 17 August 2009

Low-End vs High-End

Over the past few days I've been configuring and playing with my new toy - Janus (as all our clusters bear the names of mythological polycephalous creatures). Janus will work as a test bed and development platform for our MPI+CUDA codes. It consists of two nodes equipped with:
As you can see, these are mostly low-end solutions (in HPC terms), created with gamers, not computing, in mind. The main advantage is of course price: 2100€ per node, which is fairly cheap even if it comes with worse performance... Now the question is whether Janus is much worse than a high-end solution. Amazingly, the answer is: not that much.
While googling today I found the site of NCSA's GPU cluster, along with results of a standard test from the CUDA SDK:
../../bin/linux/release/reduction --kernel=5 --n=16384
Reducing array of type int.
Using Device 0: "Tesla C1060"
16384 elements
128 threads (max)
64 blocks
Average time: 0.025320 ms
Bandwidth: 2.588309 GB/s
which we can compare with Janus:
../../bin/linux/release/reduction --kernel=5 --n=16384
Reducing array of type int.
Using Device 0: "GeForce GTX 295"
16384 elements
128 threads (max)
64 blocks
Average time: 0.021630 ms
Bandwidth: 3.029865 GB/s
It's better! (This is the moment when we can give a big yay for Nehalem technology :] )
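As a sanity check, the bandwidth figures in both logs can be reproduced from the element count and the average time. The sketch below assumes 4-byte ints and 1 GB = 10^9 bytes; the function and variable names are mine, not from the SDK.

```python
BYTES_PER_INT = 4  # assumption: the SDK test reduces 32-bit ints


def bandwidth_gb_s(n_elements, avg_time_ms):
    """Bytes processed divided by average time, in GB/s (1 GB = 1e9 bytes)."""
    n_bytes = n_elements * BYTES_PER_INT
    return n_bytes / (avg_time_ms * 1e-3) / 1e9


# Element count and average times copied from the two logs above.
tesla = bandwidth_gb_s(16384, 0.025320)
gtx = bandwidth_gb_s(16384, 0.021630)
speedup = gtx / tesla

print(f"Tesla C1060:     {tesla:.4f} GB/s")
print(f"GeForce GTX 295: {gtx:.4f} GB/s")
print(f"GTX 295 / Tesla: {speedup:.3f}x")
```

Under those assumptions both numbers agree with the logs to within rounding of the printed time, and the GTX 295 run comes out about 17% faster.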
If we compare sheer power: the Tesla is capable of 936 Gflops while drawing 180W under load, whereas the GTX295 delivers 2*894=1788 Gflops while drawing 330W! Furthermore, the difference in price is enormous! A Tesla costs 1500€, while a GTX295 costs 400€.
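To put those figures on a common scale, here is a small sketch that computes Gflops per watt and Gflops per euro for both cards. It uses only the numbers quoted above; the dictionary layout and names are just for illustration.

```python
# Peak Gflops, power under load and price as quoted in this post.
cards = {
    "Tesla C1060": {"gflops": 936.0, "watts": 180.0, "euros": 1500.0},
    "GeForce GTX 295": {"gflops": 2 * 894.0, "watts": 330.0, "euros": 400.0},
}


def efficiency(card):
    """Return (Gflops per watt, Gflops per euro) for one card."""
    return card["gflops"] / card["watts"], card["gflops"] / card["euros"]


results = {name: efficiency(spec) for name, spec in cards.items()}

for name, (per_watt, per_euro) in results.items():
    print(f"{name}: {per_watt:.2f} Gflops/W, {per_euro:.2f} Gflops/EUR")
```

With these numbers the two cards end up close per watt (roughly 5.2 vs 5.4 Gflops/W), so the real gap is per euro (roughly 0.62 vs 4.47 Gflops/€).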
I'm slowly beginning to wonder why people use the Tesla C1060 at all. Maybe because it's easier to program a single GPU card with lots of memory (the Tesla has 4GB of GDDR3) than to put a little effort into developing MPI+CUDA codes... Time will tell which strategy prevails.

2 comments:

  1. of Janus? Nope, but if I remember to bring a camera to work I'll take a few. I bought coolers with LEDs and now our server room looks like a disco :)
