Software to Make Blue Gene Top 200 Teraflops
An anonymous reader writes "New Scientist has a story about the most intensive computer program ever created. It runs on IBM's big beast, Blue Gene/L, at Lawrence Livermore National Laboratory in California and carries out 207.3 teraflops (trillion calculations per second). The program, called Qbox, performs very complex quantum calculations to simulate the behaviour of thousands of atoms in three dimensions. Wow."
Slight clarification (Score:4, Funny)
It simulates interactions between 1000 molybdenum atoms under high pressure using equations that take the quantum behaviour of electrons into account.
Also, when it's not being used to dynamically model atomic structures, the IRS uses it to calculate Bill Gates's taxes.
Re:Slight clarification (Score:5, Funny)
Re:Slight clarification (Score:2)
But can it run Linux?
</obligatory>
Couldn't run vista.... (Score:2)
But could it run Linux? Yep, it does!
Re:Slight clarification (Score:3, Interesting)
Re:Slight clarification (Score:2, Informative)
Re:Slight clarification (Score:3, Insightful)
Huh? QM was a while ago, but I'm afraid you'll have to give a reference or two. You're saying that Density Functional Theory [wikipedia.org] is impossible? The authors (of DFT) did win the Nobel prize a while ago, so I'm sure I'm missing something. Mind you, any implementation is only an approximation, but that's true of almost any computational method.
Re:Slight clarification (Score:2, Insightful)
Re:Slight clarification (Score:2)
Only if it is open source. Otherwise, it belongs in the Journal of Irreproducible Results. Unless I can reproduce the numerical experiment, the predictions are as meaningful as a call to the psychic friends network.
Yeah, but... (Score:3, Funny)
Re:Yeah, but... (Score:5, Funny)
-Eric
Re:Yeah, but... (Score:4, Funny)
-Eric
Re:Yeah, but... (Score:2, Insightful)
Re:Yeah, but... (Score:2)
It's good to see a sensible human around for once.
Re:Yeah, but... (Score:3)
That's your excuse, but turning your responsibility for electing that criminal into a "disagreement" with people who voted for a competent president, especia
Re:Yeah, but... (Score:3, Insightful)
when you already control everything, you don't HAVE to be loud.
As for obnoxious, well, that's in the eye of the beholder. For example, I consider monitoring my phone calls and locking people up without due process to be pretty damn obnoxious. But that's just me.
-Eric
Re:Yeah, but... (Score:3, Funny)
Too bad for Q-box... (Score:4, Funny)
Too bad for Q-Box that their title will be stripped from them so soon. Vista's almost here.
Wait a minute, Vista? Nevermind...Q-box should have it for a long while.
Re:Too bad for Q-box... (Score:2, Funny)
More importantly... (Score:2, Funny)
More importantly, at what FPS does it play WoW?
Though I wouldn't be surprised if it needs a new graphics card for Crysis [youtube.com]...
Only the most intensive USEFUL program (Score:5, Funny)
Re:Only the most intensive USEFUL program (Score:3, Informative)
(And for those of you who are humor-impaired, I do realize that neither would use any FLOPS because they would both be optimized into L1: jmp L1).
Re:Only the most intensive USEFUL program (Score:3, Funny)
Re:Only the most intensive USEFUL program (Score:2, Funny)
Everyone knows Linux can finish that loop in 5 seconds.
"Infinite" loops (Score:2)
Not *that* infinite loop. The "infinite" loop that Linux and any other OS can finish in 5 seconds (if the CPU speed is right) is:
int n;
for (n = 1; n > 0; n++) ;
This loop will actually finish because n will overflow and become negative after it reaches the largest value that can be represented as an integer in the machine it's running.
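A caveat worth a sketch: in ISO C, overflowing a signed int is undefined behaviour, so the loop above only "finishes" when the compiler happens to wrap the value (GCC's -fwrapv flag asks for that explicitly); an optimiser is otherwise free to treat n > 0 as always true. Unsigned arithmetic is defined to wrap, so the same idea can be shown safely. The function name and the choice of a 16-bit counter below are only assumptions for illustration, picked so the loop ends quickly.

```c
#include <assert.h>
#include <stdint.h>

/* Count how many increments a 16-bit unsigned counter survives,
 * starting from `start`, before it wraps around to 0.
 * Unsigned wraparound is well defined in C; signed overflow is not. */
unsigned long iterations_until_wrap16(uint16_t start)
{
    unsigned long count = 0;
    uint16_t n;

    assert(start != 0);            /* starting at 0 would end immediately */

    for (n = start; n != 0; n++) { /* n wraps from 65535 back to 0 */
        count++;
    }
    return count;
}
```

Starting from 1, the counter takes 65535 steps before it wraps. With a 32-bit signed int and wrapping behaviour, the loop in the comment would take INT_MAX (2147483647) iterations, which is why it ends in seconds rather than never.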
Re:"Infinite" loops (Score:2)
int main(){
int n;
for (n = 1; n > 0; n++) ;
return 0;
}
yarbo@oxygen
real 0m6.761s
user 0m6.748s
sys 0m0.003s
#and just for fun
yarbo@oxygen
yarbo@oxygen
real
Re:Only the most intensive USEFUL program (Score:2)
Re:Only the most intensive USEFUL program (Score:2)
Re:Only the most intensive USEFUL program (Score:3, Insightful)
Now, spawn a thread for each processor running this, and you might have something =-)
Re:Only the most intensive USEFUL program (Score:2)
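#define NUMPROCS x /* placeholder: set x to the number of processors */
int array[NUMPROCS];
int getval(int indx) {
    return array[indx];
}
int main(void) {
    int i;
    while(1) {
        for(i=0; i<NUMPROCS; i++) {
            array[i] ^= getval(i); /* x ^ x == 0, so this is pure busy work */
        }
    }
}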
should probably be optimised for multiple processors. I'm not sure how fine-grained the optimisation is, but I doubt you have to manually launch threads to get
Re:Only the most intensive USEFUL program (Score:2)
...wow... (Score:5, Interesting)
Regardless, as a computer scientist, I say way to go to these guys, this is damn impressive.
Re:...wow... (Score:2)
it doesn't work like that (Score:4, Informative)
Re:...wow... (Score:3, Insightful)
Re:...wow... (Score:5, Informative)
0.2 TFlops per atom, yes. But there are 1000 atoms, and it's molybdenum which has 42 electrons... so that's 42,000 particles that all interact with each other. Still... that's not too many. But maybe they're considering interactions between nuclei, too. Who knows...
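For anyone who wants to check the back-of-envelope numbers, here is a minimal C sketch. The function names are made up for illustration, and counting explicit pairs is only a rough picture of "everything interacts with everything"; it is an assumption, not a description of how Qbox actually works.

```c
#include <assert.h>

/* Total electrons for `atoms` atoms of atomic number `z`. */
long electron_count(long atoms, long z)
{
    assert(atoms >= 0 && z >= 0);
    return atoms * z;
}

/* Number of distinct pairs among n particles: n(n-1)/2. */
long long pair_count(long long n)
{
    assert(n >= 0);
    return n * (n - 1) / 2;
}

/* Sustained rate per atom, in teraflops. */
double tflops_per_atom(double total_tflops, long atoms)
{
    assert(atoms > 0);
    return total_tflops / (double)atoms;
}
```

With the figures from the story (207.3 teraflops, 1000 atoms, 42 electrons each) this gives 42,000 electrons, 881,979,000 distinct electron pairs, and about 0.2073 teraflops per atom.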
As for your question about what the equations look like? They're probably very nasty integrals of sines and cosines and what not to various odd (read: strange) powers and stuff. I do fairly computationally intensive simulations on some big IBM machines and just simple equations can amount to quite a bit of calculations. Nothing like what these guys are doing, though.
Finally... what time frame is the simulation over? I'd wager VERY SHORT times, maybe nanoseconds or something like that. Even casual "molecular dynamics" simulations can only probe very short timeframes. Their coarse-grained cousins can maybe do microseconds or milliseconds.
Mike.
Re:...wow... (Score:2)
I'll bite this time. Just once. I have a very good idea what I'm talking about. I said sines and cosines because to a first approximation the wavefunctions of the atoms probably resemble that, so I'd assume the interactions would be built off of them in some fashion. I'm sure you took elementary qu
Re:...wow... (Score:3, Informative)
In quantum mechanics the state of the system is defined by a wavefunction on a 3N dimensional space. The state of a system is no longer a point, it's a *function* on a 3N dimensional space. That means that at any
Re:...wow... (Score:2)
Oh, and this is not classical physics, but QM. Thus each electron's wave function has to be represented by a (possibly substantial) set of basis functions. Not sure if anyone's been able to get Density Functional Theory (DFT) to scale that high, but if so, DFT scales as (IIRC) either N^7 or N^9. Ouch! Sure there are tricks, such as pseudopotentials th
Re:...wow... (Score:2)
As I say, modulo a polynomial. The complexity of quantum systems typically grows exponentially because we're looking at the tensor product of the subsystems.
I'd love to find out a bit more about the algorithms used here. And I'd be interested to know what kind of validation there is for the methods. I guess I can start here [wikipedia.org]. (My background is more particle physics than many-body systems.)
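To put a number on the tensor-product point: a minimal sketch, assuming n two-level subsystems and 16 bytes per complex amplitude (two doubles). Both assumptions are mine, chosen only to show the growth; methods such as the DFT mentioned elsewhere in the thread avoid storing the full many-body state, which is presumably why 1000 atoms is feasible at all.

```c
#include <assert.h>

/* Dimension of the joint state space of n two-level subsystems: 2^n. */
unsigned long long state_vector_dim(unsigned n_subsystems)
{
    assert(n_subsystems < 64);     /* keep the shift within 64 bits */
    return 1ULL << n_subsystems;
}

/* Bytes needed to store that state vector, assuming 16 bytes
 * (one complex double) per amplitude. */
unsigned long long state_vector_bytes(unsigned n_subsystems)
{
    assert(n_subsystems < 60);     /* 2^59 * 16 still fits in 64 bits */
    return state_vector_dim(n_subsystems) * 16ULL;
}
```

Ten subsystems need 1024 amplitudes; thirty already need 16 GiB, and every extra subsystem doubles that.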
Quantum Monte Carlo (Score:3, Informative)
The article is light on details but I suppose the only quantum algorithm that can handle 1000 atoms is Quantum Monte Carlo [wikipedia.org]. The problem is that the algorithm is cubic in the number of particles (and has a huge prefactor). So in essence 1000 atoms is 1000^3=10^9 times more time-consuming than one. And I'm sure they still use dramatic simplifications, even though they have the most powerful computer. They probably
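The scaling argument is easy to play with. This is a minimal sketch that ignores the prefactor entirely; the function name is hypothetical, and the exponent is a parameter so other claimed scalings from the thread can be tried too.

```c
#include <assert.h>

/* Relative cost of a problem `ratio` times larger, for an algorithm
 * whose cost grows as N^power. The prefactor cancels out of the ratio. */
double scaling_factor(double ratio, unsigned power)
{
    double result = 1.0;
    unsigned i;

    assert(ratio > 0.0);

    for (i = 0; i < power; i++) {
        result *= ratio;
    }
    return result;
}
```

For cubic scaling, going from 1 atom to 1000 costs a factor of 10^9, while merely doubling the system size costs a factor of 8.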
Re:...wow... (Score:2)
Re:...wow... (Score:2)
Re:...wow... (Score:3, Insightful)
Re:...wow... (Score:3, Insightful)
Because they are apparently simulating them under extreme conditions that are present during nuclear explosions. And nuclear tests are banned.
Re:...wow... (Score:2)
Never understood that stuff. You throw one, we throw 100, you throw 10000 and the earth is destroyed. It goes bang in a big way, and contaminates everything in the direct surrounding. What do you need a super-computer for? Why would you need to test such a thing in the first place?
Now if they would put UD on it you could topple the top ranking in cancer research.
Re:...wow... (Score:2)
Molest me not (Score:5, Funny)
The program, called Qbox, performs very complex quantum calculations to simulate the behaviour of thousands of atoms in three dimensions.
"Molest me not with this pocket calculator stuff." [earthstar.co.uk]
How to test a nuke.. without testing one (Score:4, Insightful)
Re:How to test a nuke.. without testing one (Score:2, Funny)
Call your broker, because it's a good time to invest in pencil and paper futures.
Re:How to test a nuke.. without testing one (Score:2)
For the record.. I'm not throwing stones at them. It just struck me as a somewhat amusing way to think about it. How *do* they know they got it right?
Re:How to test a nuke.. without testing one (Score:2, Interesting)
If they are modelling everything without calibration from known experimental results then anything this machine can produce is as trustworthy as internet gossip.
For instance, if you were creating a weather prediction machine (easier to explain), you would feed it with all your historical data and allow the calculations to run from a set date in the past. If the results matched up with actual observed results for the following day/week/periods then you begin to build confidence i
Re:How to test a nuke.. without testing one (Score:2)
Re:How to test a nuke.. without testing one (Score:2)
Re:How to test a nuke.. without testing one (Score:2)
However, the tests are useless in concept. I mean.. I personally would prefer they all were a dud.
Re:How to test a nuke.. without testing one (Score:2)
Tell me, do you do barmitzvahs?
Smart, sure. But is it happy? (Score:5, Funny)
Just wait... (Score:4, Informative)
Re:Just wait... (Score:2)
Sounds like a very interesting project. I guess you have no problem writing and debugging multithreaded code?
Re:Just wait... (Score:5, Interesting)
As far as writing multi-threaded code, I've spent the last 5 months rewriting the NAS CG benchmark to work efficiently on Cyclops64, which will probably play some part in my PhD thesis. (A sidenote: All of NASA's NAS implementations are written in Fortran (except Integer Sort), which would have necessitated me rewriting NAS-CG in C. Fortunately, I didn't have to start from scratch, because the Japanese had already done the hard part [phase.hpcc.jp]).
Re:Just wait... (Score:2)
Does the Cyclops64 support out of order execution?
Just kind of wondering. My programming is limited to Xscale, Intel, and AMD cpus. The big cool toys fascinate me.
Re:Just wait... (Score:3, Insightful)
Re:Just wait... (Score:2)
I also assume that the integer units are based on the Power ISA.
Re:Just wait... (Score:2)
As far as instruction re-ordering -- for parallel computation, the big performance hits occur with waits, synchronizations/barriers, and locks/mutexes. Making these cheap and reducing the number of them is the biggest way to increase performance.
Re:Just wait... (Score:2)
Is Cyclops64 using a shared memory system? Most clusters I have seen used message passing. On those systems your bottlenecks tend to be in message passing.
Yes, mutexes are a lot of fun. I tend to use mutexes in my code just long enough to make a copy of the data structure for the thread to use. Yes, it is cheating and relatively inefficient, but it is also pretty safe and keeps blocking to a minim
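The copy-under-lock habit described above might look roughly like this with POSIX threads. It is a minimal sketch: the struct, its fields, and the function name are hypothetical, and it says nothing about how Cyclops64 handles synchronisation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SAMPLE_COUNT 4

/* Hypothetical shared data guarded by a mutex. */
struct shared_state {
    pthread_mutex_t lock;
    int samples[SAMPLE_COUNT];
    int count;
};

/* Take a private snapshot of the shared data. The mutex is held only
 * for the duration of the copy, so other threads are blocked briefly;
 * the caller then works on its own copy without any locking. */
void snapshot_state(struct shared_state *shared,
                    int out_samples[SAMPLE_COUNT], int *out_count)
{
    int i;

    assert(shared != NULL && out_samples != NULL && out_count != NULL);

    pthread_mutex_lock(&shared->lock);
    for (i = 0; i < SAMPLE_COUNT; i++) {
        out_samples[i] = shared->samples[i];
    }
    *out_count = shared->count;
    pthread_mutex_unlock(&shared->lock);
}
```

The trade-off is the one the comment names: the copy costs memory and time, and the snapshot can go stale, but the critical section stays short and the worker never touches shared data outside the lock.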
Re:Just wait... (Score:2)
Re:Just wait... (Score:2)
The question mark should have the word "Roadrunner" before it.
Also for those who don't want to follow the link. Roadrunner is a supercomputer being developed at Los Alamos National Laboratory with aims to run at a sustained petaflop.
Re:Just wait... (Score:2)
Re:Just wait... (Score:5, Insightful)
Re:Just wait... (Score:2)
We use a large(ish) cluster - admittedly not even remotely in your league, but our main limitation has always been node connection bandwidth/lag.
Re:Just wait... (Score:2)
Re:Just wait... (Score:2)
Re:Just wait... (Score:2)
Re:Just wait... (Score:4, Informative)
NASA approached the problem differently. Their numerical analysis group put out a set of "paper and pencil" benchmarks (based on real world problems that one would encounter, for example, fluid dynamics). The actual implementation was left up to the individual companies. This is what we know today as the NAS benchmark suite.
I suspect the answer ends up (Score:4, Funny)
Re:I suspect the answer ends up (Score:2)
42.0
only (Score:2)
You are absolutely right! (Score:2)
And the atomic number of molybdenum [webelements.com] is... 42
Fill in Blank Please (Score:2, Funny)
Well done, you may now enter. Gaming room to the right, pron cubicles left, and crazy linux hardware center up ahead.
We hope you enjoy your stay at Geek Heaven.
HPCWire Interview (Score:4, Informative)
There's some additional info about BlueGene and what Livermore thinks of it here. What this interview neglects to mention is the millions of dollars being spent on IBM and internal developers to get this code (and any others) working on BlueGene. I was briefed by the hardware and software teams that built BlueGene and I can tell you, it's no easy task to bring apps to that platform. Kuznezov seems to trivialize it in the interview and I'm gonna have to go back and review the process again. Maybe it has changed since my briefing in early 2004, but somehow I doubt it.
Re:HPCWire Interview (Score:2)
Re:HPCWire Interview (Score:2)
wait, only 3 dimensions??? (Score:2)
Screenshot here (Score:3, Funny)
Oh, wait. Qbox. Nevermind.
m-
Why not put that power to good use (Score:2)
Re:Why not put that power to good use (Score:2)
10 print "42"
Those who don't know what the hell the parent poster and I are talking about obviously have not read their Douglas Adams [douglasadams.com]!
What are these simulations calculating? (Score:2)
Re:What are these simulations calculating? (Score:2)
Re:What are these simulations calculating? (Score:2)
Thousands of atoms (Score:2)
Sounds impressive, but that's only about 10 atoms on a side.
Re:Thousands of atoms (Score:2)
Exactly. That only goes to show how much CPUs still have to evolve. Every time someone mentions a new more powerful CPU here in /. there are people who ask "why, what's the use?". For many types of physical simulations, the most powerful CPUs in the world are still pathetically slow.
And that's also a reason why carefully optimized code in C or Fortran with the inner loops written in assembler is still needed. Java, or Ruby, or Python, or any other interpreted language
A few more iterations (Score:3, Insightful)
To the UN: We'd like you to look at these satellite images that clearly show a super computer simulating the destruction of the U.S. We have to take out these terrorists and we're willing to go it alone.
Afterward: Well it turns out that they didn't have the computing power at all, the images we had were of a mobile home park.
Wow indeed (Score:3, Interesting)
This has interesting consequences for the study of plastics, DNA, viruses and other complex molecules.
Perhaps the program can run in a loop trying every possible atomic combination to produce the best of certain attributes, as in give me the hardest material or give me an easy to manufacture room temp superconductor. It bypasses the whole invention/discovery step.
Re:Linux (Score:4, Funny)
Re:Linux (Score:2)
Re:Linux (Score:2)
> --
> *insert guitar solo here*
"She don't read Slashdot, but software makes her Blue Gene talk!"
- Dr. Hook and the Medicine Show
Specs (Score:5, Informative)
Re:Specs (Score:2)
Re:Linux (Score:2)
Umm.. Yes as do most supercomputers these days.
http://www.top500.org/stats/26/osfam/ [top500.org]
When it comes to supercomputers, Linux is the OS of choice these days. Even Mac OS X has five times the market share of Windows when you talk about the top 500 supercomputers. You see, when it comes to doing real work, Windows is just a hobbyist plaything. The big boys run Linux.
!Physics + !CS = CRAP (Score:2)
1) Software is implemented by CS majors who have little understanding of the math and physics involved. They probably implement highly computationally intensive (inefficient) algorithms well (i.e. no bubblesort).
2) Software is implemented by Physics majors who although knowing the syntax of C/fortran, don't understand how to write good programs. Their implementations are numerically correct, but highly inefficient [e.g. they use non-