

GRAPE6, Now With GNU/Linux Frontend, At 32 TFlops
You can also get a baby-grape, see pictures on http://www.astro.umd.edu/~teuben/pics/japan/09/p70 90014.html which runs a good fraction of a TFlop, and will cost somewhere around 10k$.
I have some more pictures on http://www.astro.umd.edu/~teuben/pics/japan/08/ which shows the 1/4 size Grape6 running 32 Gflop. The final full version would cost about 1M$. Compare that to the AsciWhite at 12 Tflop for 100M$. Drawback of course is that the Grape only computes things similar to the gravitational N-body problem (also useful for pharmaceutical industries).
Btw, I also spent some time in Akihabara on Sunday. I guess we're deprived on the US east coast; the number of DVD writers you can get here is amazing. Also very popular here seem to be all kinds of embedded units, e.g. the GPS in your car to not get lost in Tokyo!
There was an ABC news story earlier in the year on the GRAPE, but at the time it was running Alphas with their Unix. They have now fully switched to Linux, and this system has been running since July 5."
general vs. special purpose hardware (Score:1)
The last time I saw cool and useful specialized hardware was the EFF's cracking machine that won the distributed contest.
We talk about, for example, Java being fast enough to compete with compiled languages, but the fact of the matter is that a general system could not achieve anywhere near 32TFlops peak performance on standard PC clusters where you really just need raw computational speed. I think some other people mentioned that SIMD will get you in the GFlops range, but that is 3 orders of magnitude below the Grapes machines.
Before Seymour Cray was killed, one of the last things he was working on was a project aiming for Petaflops performance. You can see just what a high goal that still is. (A Petaflop is 1000 Teraflops!)
I remember when transputers used to be advertised a lot in Byte and other computer magazines. I wonder if we'll ever see a return of something similar. Grapes seems pretty specialized, but something a little more general like FPGA add on boards might be a good way to get good price/performance on a PC base (i.e. using a PC cluster instead of an expensive supercomputer). The applications would be limited to computationally intensive things. But, for example, 3D rendering for movie animation might be better done on more specialized hardware.
-Kevin
Thats 32 TFLOPS "theoretical peak"! Not actual! (Score:1)
obligatory.... (Score:1)
:)
(Yes, I know that it's limited hardware. It's just sorta expected.)
Configurable GRAPE? (Score:1)
A whack of FPGAs should be pretty decent, and you can configure them for more than just the N-body gravitational problem.
BTW, Akihabara is over-rated. There's a WHACK of stuff there that we don't get in North America. But wander for a few hours, and you soon realize that the same store exists on every block, repeated over and over and over...
Besides, Akihabara doesn't usually have the best prices. I loafed around all over Tokyo, and it usually has the highest prices. Just pop over on the train to Ikebukuro or something; they'll have the same mega chain stores (just not repeated every block) and usually lower prices.
I found digital cameras weren't cheaper or better. The MD players kick ass! 320min playtime per disk now in about the size of 3 3.5" floppies stacked up.
And then there's the colour, digital cell phones about 1/2 the size of ours for about $10-50US. Woohoo!
Re:Thats 32 TFLOPS "theoretical peak"! Not actual! (Score:1)
Re:Q: What do GRAPEs have in common with chess? (Score:1)
http://www.altera.com/products/devices/apex2/ap
One thing you get with an FPGA is parallelism: you can have as many execution units as you have gates to implement them, and if you need more, you add chips. You do have to pay the I/O penalty, but it can work out to single-cycle operations without any of the pipeline stalls you get in a general-purpose processor.
The other nice thing is that most of these parts are reprogrammable, so algorithm tweaks are possible.
Grilfriend Sim 1.0 (Score:1)
So by Moore's law... (Score:1)
Actually (Score:1)
Re:How is this possible? (Score:1)
Re:Attn: Beowulf cluster troll (Score:1)
[TMB]
Re:32 Teraflops? Seems a tad high... (Score:1)
If you read the paper at http://astrogrape.org about the prototype GRAPE6 system, you would have noticed that, according to the paper, when they actually did a simulation of the evolution of a galactic nucleus containing triple massive black holes, they only got about half the theoretical peak performance of the prototype.
GRAPES in Indiana (Score:1)
We're running some tests later this month, shipping data from the Tokyo GRAPE farm across TransPac to the Indianapolis HPSS silo, testing differentiated QoS [another whole thread, involving Napster and GriPhyN]. The idea is to eventually send slices of the data on to the American Museum of Natural History Planetarium in New York, linking three "specialized instruments."
Re:linux grape joke... (Score:1)
Yow! These "*BSD is Dying" posts are getting weirder and weirder...
Re: I am a stupid uninformed loser (Score:1)
I should have waited for more of the grape website to actually load, and install chinese text support. I am dumb and apologize for wasting your time. Below is some useless crap I concocted to try to save my ass from looking like an idiot. I failed, just like I failed out of college and at just about every other aspect of my miserable fucking life. It is funny how that after a while you get used it, and sleep a lot to pass the time--because if you don't, you end up wanting to impale your head on a pointy wrought-iron fence and just be done with it.
I guess the specialized boards are doing far more floating point operations per set of data sent to them than I thought. I didn't see how they could do this without saturating the bus to the number crunching hardware -- or especially saturating the PC's CPU itself, because it would have to send the information to the boards, retrieve the results, and do I/O.
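A rough way to see why the bus need not saturate: in a direct-summation force pass the host moves an amount of data proportional to N, while the board does work proportional to N^2. The sketch below just does that arithmetic. The function name `flops_per_byte` and both default parameters (operations counted per pairwise interaction, bytes moved per particle) are illustrative assumptions, not GRAPE specifications.

```python
def flops_per_byte(n, flops_per_interaction=30.0, bytes_per_particle=32.0):
    """Ratio of floating-point work to host<->board traffic for one force pass.

    Both defaults are placeholder assumptions chosen for illustration only.
    """
    interactions = n * (n - 1)                 # every particle against every other
    flops = interactions * flops_per_interaction
    # Assume each particle is sent once and one result per particle is read back.
    bytes_moved = 2 * n * bytes_per_particle
    return flops / bytes_moved


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, round(flops_per_byte(n)))
```

Under these assumptions the work done per byte transferred grows roughly linearly with N, so larger simulations put proportionally less pressure on the bus.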
32 Teraflops? Seems a tad high... (Score:1)
-Mike (on a 24 MegaFlop Indigo2)
Imagine... (Score:1)
GFlops not TFlops (Score:1)
Newtonian Force Accelerator? (Score:1)
emulation, anyone? (Score:1)
So? If it's Turing-complete I can read slashdot on it -- or any other app.[1] Just a question of how long...
after all, linux itself was a hack to get unix onto x86...
[1] of course Slashdot will run equally slowly. But imagine your {FPS title} frame rate!
~
what's wrong here? (Score:1)
Judging by the speed of the site... (Score:1)
Re:GFlops not TFlops (Score:1)
TFlops not GFlops. Read the link.
Re:linux grape joke... (Score:1)
Re:Should I say it? (Score:1)
Re:Oh yeah! (Score:1)
Re:Imagine... (Score:2)
Re:sweet (Score:2)
Having a vineyard would be quite the cluster of GRAPEs.
Re:Configurable GRAPE? (Score:2)
Because in silicon it's pretty much twice as fast as an FPGA gets? (EE's rule of thumb, admittedly referring to microwave apps).
Not only that, but why would they want to do something other than N-body gravitational problems? _You_ might, but there are a lot of such problems to do, and that's what this is designed for.
--
Re:32 Teraflops? Seems a tad high... (Score:2)
Doug
Heard it... (Score:2)
Worldcom [worldcom.com] - Generation Duh!
Re:32 Teraflops? Seems a tad high... (Score:2)
Probably because we're not convinced that the lockout works properly on the GRAPE5s. I know it works well on the GRAPE3, but VE and MS have done some tests on the GRAPE5 where they've tried hammering it with 2 different jobs, and the results haven't been kosher.
Anyone have experience with GRAPE5 and notice this? Anything that could be changed in the API? I suppose we could write a wrapper around the g5_open and g5_close calls that does additional locking, but that seems inelegant.
[TMB]
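For what it's worth, the wrapper idea above could look something like the following. This is only a sketch in Python rather than C, under loud assumptions: `open_board` and `close_board` are hypothetical stand-ins for whatever actually calls g5_open and g5_close (their real signatures are not given in this thread), the lock file path is made up, and `fcntl.flock` is an advisory, POSIX-only lock that only helps if every job goes through the same wrapper.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_board(open_board, close_board, lock_path="/tmp/grape5.lock"):
    """Hold an exclusive advisory lock for the lifetime of one board session.

    open_board / close_board are hypothetical callables wrapping the real
    open and close calls; lock_path is an arbitrary, assumed location.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o666)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until no other job holds the lock
        open_board()
        try:
            yield
        finally:
            close_board()                # always release the board, even on error
    finally:
        os.close(fd)                     # closing the descriptor drops the lock
```

A job would then run its force calculations inside `with exclusive_board(my_open, my_close): ...`, so a second job simply waits instead of hammering the board at the same time. Whether that is less inelegant than fixing the lockout itself is another question.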
Re:32 Teraflops? Seems a tad high... (Score:2)
Hi Doug! :-)=
Just a clarification:
The boards themselves don't do the SPH calculations. What they do is return neighbour lists for each particle, which reduces the load necessary to compute the hydro forces.
[TMB]
sweet (Score:2)
Nope... the GRAPES are each one sweet system!
Re:How is this possible? (Score:2)
Re:Attn: Beowulf cluster troll (Score:3)
--
Re:32 Teraflops? Seems a tad high... (Score:3)
If you have additional physics (hydrodynamics, etc), that processing must happen on the workstation which is running the simulation. So the performance is ultimately bottlenecked by the workstation. In practice, Grape practitioners typically do not see anything close to the theoretical peak of their boards.
Bob
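The bottleneck described above is the usual Amdahl's law argument, which can be sketched in a few lines. The function name `effective_speedup` and its two parameters are assumptions introduced here for illustration: `board_fraction` is the share of the original run time that the board can take over, and `board_speedup` is how much faster the board does that share.

```python
def effective_speedup(board_fraction, board_speedup):
    """Amdahl's law: overall speedup when only part of the work is accelerated.

    board_fraction: fraction (0..1) of the original run time offloaded to the board.
    board_speedup:  factor by which the board speeds up that fraction.
    Both are hypothetical inputs; nothing here is measured GRAPE data.
    """
    if not 0.0 <= board_fraction <= 1.0:
        raise ValueError("board_fraction must be between 0 and 1")
    if board_speedup <= 0.0:
        raise ValueError("board_speedup must be positive")
    host_time = 1.0 - board_fraction            # work that stays on the workstation
    board_time = board_fraction / board_speedup
    return 1.0 / (host_time + board_time)
```

If 10% of the run time stays on the workstation, the overall gain can never exceed 10x, no matter how fast the boards are.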
Re:32 Teraflops? Seems a tad high... (Score:3)
Grape boards are highly specialized chips which do nothing but N^2 direct summation gravity force calculations (actually some of them do SPH (smoothed particle hydrodynamics) as well).
You take a pc/sun/beowulf cluster and link it to a set of grape boards. You then send particles to the boards and get accelerations back.
Doug
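To make "send particles, get accelerations back" concrete, here is a minimal software sketch of the N^2 direct summation that the boards do in hardware. It is plain Python for illustration only: the function name `direct_accelerations`, the softening length `eps`, and the default G = 1 are assumptions made here, not part of any GRAPE interface.

```python
import math


def direct_accelerations(positions, masses, G=1.0, eps=0.0):
    """O(N^2) direct-summation gravitational accelerations.

    positions: list of (x, y, z) tuples; masses: list of floats.
    eps is a hypothetical softening length (0 means pure Newtonian gravity).
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            # The acceleration of particle i points toward particle j.
            acc[i][0] += G * masses[j] * dx * inv_r3
            acc[i][1] += G * masses[j] * dy * inv_r3
            acc[i][2] += G * masses[j] * dz * inv_r3
    return acc
```

The double loop is the whole point: it is trivially simple but costs N^2 operations per step, which is exactly the part worth baking into custom hardware while the host handles everything else.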
Re:Q: What do GRAPEs have in common with chess? (Score:3)
FPGA's are, unfortunately, slow.
They're made with a worse process than custom chips. For inner loops, you want as fast as you can get. You pay for programmability, and if it's always the same task, special-purpose is best.
It's like the difference between hand-assembled code and a compiler. You get it easier with the compiler, but hand-assembling can be better when you know the specifics.
The n-body gravitational problem is going to be around for a while, so it makes sense to customize to it.
what?!?! (Score:3)
What??? A machine like that would cost one Microsoft? Either I have been sleeping thru all this time while inflation is running rampant, or M$ is not worth that much anymore.
linux grape joke... (Score:3)
Give up?
A: Nothing, it just made a little Wine.
:( (Score:3)
Re:32 Teraflops? Seems a tad high... (Score:4)
With reference to the calculations they are doing, they are simply doing
G * m_i * SumOverAll(j != i) [ m_j * (x_j - x_i) / |x_j - x_i|^3 ]
They are doing this by custom hardware.
This is not a general purpose computer.
Despite what the blurb said, there are 96 independent units doing the calculation, in each machine, to get the 32 TFlops across the system.
There is a picture of an earlier model, which is about the size of one of my filing cabinets.
Remember these are scientists, not marketing, making those claims. They expect to be asked to justify them - and they have.
--
Here we go again... (Score:5)
How slashdot slows scientific progress in the world:
1. Oh look, an interesting story on academic research on slashdot.
2. Oh look, a lovely link to those poor academics' website. Surely they have the $40k necessary to make a server that can handle the load from slashdot?
3. Oh look, the reeking Sun Ultra 5 that they were using for web duties has burst into flame, destroying the lab and scaring a small puppy that lives in the lab next door.
To hell with you slashdot for burning puppies.
Q: What do GRAPEs have in common with chess? (Score:5)
A: GRAPEs and chess-playing computers, such as the one that tackled Kasparov (Deep Blue?), both accomplish their opening-up-of-cans-of-mathematical-whoopass via the same approach: functions in the innermost loops are done via calls to special-purpose hardware cards. The rest is done with software.
So, say I take a GRAPE, and replace its special N-body gravitational daughtercard with one containing a few FPGAs programmed for, say, RC5; now I have a cracking machine. And then reprogram the FPGA to do image manipulation instead; now I have a renderer to make my own Toy Story. And then reprogram the FPGA to do, etc, etc.
Of course, I'm still lacking the software. So actually this post is mostly babbling. :-)