NASA To Get 10,240 Node Itanium 2 Linux Cluster

starwindsurfer writes "US space agency Nasa is to get a massive supercomputing boost to help get its shuttle missions back in action after the 2003 shuttle disaster. Project Columbia, a collaboration with two technology giants, will mean Nasa's computing power will be ramped up by 10 times to do complex simulations."
This discussion has been archived. No new comments can be posted.

  • by Skyshadow ( 508 ) * on Monday August 09, 2004 @12:38PM (#9921081) Homepage
    ...but someone ought to tell them that Doom 3 runs pretty well just on moderately-new hardware...
  • Dupe? (Score:5, Informative)

    by Gothmolly ( 148874 ) on Monday August 09, 2004 @12:39PM (#9921093)
    10240x more dupes?
  • by ( 583077 ) <> on Monday August 09, 2004 @12:41PM (#9921107) Homepage
    Well, I guess they're not using it to serve that webpage.

  • Nice...but a dupe. (Score:5, Informative)

    by Agent Green ( 231202 ) on Monday August 09, 2004 @12:41PM (#9921110)

    Do the editors work for the USPTO as well?
  • by geomon ( 78680 ) on Monday August 09, 2004 @12:43PM (#9921126) Homepage Journal
    About $7.2 Million.

    Talk about a software tax!

    • Moderators must be having a bad day. I've seen several other attempts at humor moderated 'offtopic'.

      I wonder if this is a Monday phenomenon? I wonder what the distribution of 'Funny' moderation is through the week.

      • by Anonymous Coward
        I wonder if this is a Monday phenomenon? I wonder what the distribution of 'Funny' moderation is through the week.

        Sounds like the moderators are having a case of the Mondays?
  • by xmas2003 ( 739875 ) on Monday August 09, 2004 @12:43PM (#9921127) Homepage
    This should help 'em convert feet to meters ... ;-)
    • You know, the actual error was some engineers assuming figures
      from other engineers were in newtons when they really were in pound-force
      (1 pound-force ≈ 4.45 newtons)
      • Yea, it was actually a "force" measurement (you did say newtons and (I assume meant) pounds-force) - see attached snippet from one writeup ... plus the incorrect deviations from the flight path weren't noticed, which is arguably a distance measurement (there was a fair amount of miscommunication going on too, so lotta blame/mistakes on this one unfortunately) ... but I simplified to feet/meters in my attempt at humor. NASA has (obviously) done a GREAT job with the current Mars landers, but boo-boos happen
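The pound-force/newton mix-up described in this sub-thread can be made concrete with a short sketch. The conversion factor is the standard 1 lbf ≈ 4.44822 N; the thruster-impulse handoff and the specific numbers below are hypothetical illustrations, not the actual mission data:

```python
# Sketch of the units mismatch discussed above (hypothetical numbers).
LBF_TO_NEWTON = 4.44822  # 1 pound-force = 4.44822 newtons

def impulse_in_newton_seconds(impulse_lbf_s: float) -> float:
    """Convert an impulse reported in pound-force seconds to newton-seconds."""
    return impulse_lbf_s * LBF_TO_NEWTON

# One team reports 10.0 lbf*s; if another team reads that as 10.0 N*s
# without converting, every downstream number is off by a factor of ~4.45.
reported = 10.0
correct = impulse_in_newton_seconds(reported)
print(round(correct, 3))            # the value the receiving software needed
print(round(correct / reported, 3)) # the silent error factor, ~4.448
```

The same class of bug applies to any unchecked unit handoff; the fix is agreeing on (and asserting) units at the interface, not more compute.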
  • by thebra ( 707939 ) * on Monday August 09, 2004 @12:43PM (#9921135) Homepage Journal
    This is great news for intel. They will double the number of itanics shipped in a single deal!

    Hahaha, my comment is a dupe!
    • This is great news for intel. They will double the number of itanics shipped in a single deal!

      Hahaha, my comment is a dupe!
      I MOD because I care.

      From the department of ironic punishment.

  • by grunt107 ( 739510 ) on Monday August 09, 2004 @12:44PM (#9921143)
    The system will have 500 terabytes of storage, the equivalent of 800,000 CDs.

    In related news, the RIAA has filed a writ of discovery for illegal downloads of 'Major Tom' at NASA.
  • by Anonymous Coward on Monday August 09, 2004 @12:44PM (#9921144)
    But I wonder if moving from a spreadsheet to a supercomputer simulation will make it any more likely that engineers with concerns will whistleblow to non-responsive management. This is a government bureaucracy problem, not a technical problem.
    • by Anonymous Coward
      The two biggest failures at NASA, Challenger and Columbia, absolutely would not have been fixed with more computing power.

      In the case of Challenger, engineers whose opinions should have had the most weight were ignored when they expressed concerns about the seals on the solid fuel rocket boosters. The decision was made by bureaucrats who didn't have the technical savvy required to even form an opinion.

      In the case of Columbia, many engineers at NASA were concerned about possible damage to tiles and request
    • you hit the nail on the head. Along with the supercomputer they need to put in place an oversight committee of engineers that can take anonymous comments from other engineers and completely override and bypass NASA administration.

      99% of the time major failures lie in the hands of management, or the failure of management. Yes, going to space is hard and dangerous, but they KNEW that something went wrong on launch and management chose to ignore it.

      don't believe me? show me one corporation failure that was
  • Tax payer. (Score:4, Interesting)

    by BrookHarty ( 9119 ) on Monday August 09, 2004 @12:45PM (#9921148) Homepage Journal
    I'm rather mad at this idea: the system costs more than an Opteron system, costs more to run (heat/power), and is slower. But at least it runs Linux.

    Also, why is the BBC the first news tidbit about NASA's new supercomputer?

    • Re:Tax payer. (Score:5, Insightful)

      by geomon ( 78680 ) on Monday August 09, 2004 @12:47PM (#9921163) Homepage Journal
      Also, why is the BBC the first news tidbit about NASA's new supercomputer?

      Science isn't sexy news in America.

      Not unless they declare they've created a satellite system that will track and kill bin Laden.

      • by legoleg ( 514805 ) on Monday August 09, 2004 @01:03PM (#9921323)
        Just wait till the last week of October... I'm sure he'll conveniently pop up around then.
        • I'm thinking the same thing. Amazing how much activity we've had over the last two weeks on the War on Terror compared to the last five months.

          bin Laden is probably in a hole somewhere in Leavenworth Penitentiary right now ready for his arrest just before Halloween.

        • Re:Tax payer. (Score:2, Informative)

          by shiftless ( 410350 )
          Right, cause not only would it would make more sense to wait until 95% of America has it beat into their heads that Bush sucks before bringing Bin Laden out rather than bringing him out as soon as he's captured and using it to Bush's political advantage, but also there's no chance that the soldiers who supposedly captured him already would EVER talk or tell anyone about it.

          Did I miss anything? Oh, yeah:
      • Science isn't sexy news in America.

        To be fair, science isn't exactly sexy news in the UK, either. The BBC covers stuff like this because (a) it's mandated to, and (b) there's no profit motive keeping the unsexy news off the (metaphorical) frontpages. Which is nice[1].

        [1] ...provided there remain alternative broadcasters to keep the Beeb on its toes.

        • Interesting that it is 'mandated' to provide science coverage. Here in the States you have to listen to public radio to get any science news.

          Or I should qualify that by saying that you get science news that isn't related to weight loss, plastic surgery, or abortion. Those topics get frontpage coverage on the commercial outlets.

          If it bleeds, it leads.

      • by dr_dank ( 472072 )
        Science isn't sexy news in America.

        When Paris Hilton has nightvision camera sex with the Hubble Space Telescope, you'll be singing a different tune.
    • The reason SGI uses Itanium is that it performs better in a supercomputer environment than Opteron. I'm not sure which CPU might be faster in a "normal" computer (1...4 CPUs), it may or may not be Opteron, but in a single-OS-image supercomputer Intel performs better.

      Yeah, in my home computer I also got an AMD; I think there's more bang for the buck with AMD in that area.
    • I think you may be wrong when you consider floating point performance, which I suspect is the key driver here. From what I have seen googling around a little, Itanium 2 is better than Opteron on floating point computations.
    • Bear in mind it's not a COTS ( Commodity Off The Shelf ) system like a lot of clusters, it's not even a typical cluster like the kind I have at work ( oodles of dual CPU nodes connected via Ethernet ). SGI Altix machines scale up to 256 CPUs and 4 TB of memory in a single system image, which is one very large SMP machine ( actually, CC-NUMA ). I'm not sure how many CPUs per node ( OS image ) this machine will have, as the article didn't coherently state it, but bear in mind that a cluster of these is more
    • Re:Tax payer. (Score:5, Insightful)

      by flabbergast ( 620919 ) on Monday August 09, 2004 @02:10PM (#9921999)
      Because you can't buy an opteron system with NUMA link (3.2 GB/s between bricks) and you can't simply build a 500 TB data cluster by purchasing some CAT5, 100 250GB drives, 10 Gigabit ethernet cards and call it a day. SGI thrives because it can put together a clustered supercomputer and has the technology to build a 500TB data center. 20 Altix racks, 128 Altix bricks/rack (4 processors/brick X 128 = 512 proc) and has globally shared memory thanks to numalink. This means that even though each brick can run independently, you can also build a 512 proc system with a single Linux system image that has the combined memory of all the bricks (thanks SHUB and NUMAlink!). So, when you can build a 512 processor, global shared memory system out of Opterons, then you go ahead and sell it. This is a clustered supercomputer where each cluster is a supercomputer.
    • You're missing the point. The machines in the cluster, SGI Altixes, do things that -can't- be done with any available Opteron systems; each node is a 512 processor machine. This means you can spawn off 512 threads and give each one its own CPU without having to move off the system. ...and the Itanium2 wipes the floor with pretty much anything else on the market when it comes to floating point calculations.
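The single-system-image point made in these replies can be sketched in miniature: on a shared-memory machine, workers simply read one array in a common address space, with no message passing between separate OS instances. This toy uses 8 Python threads as a stand-in for the 512 CPUs discussed above (illustrative only; Python's GIL means this shows the programming model, not real speedup):

```python
# Miniature of the shared-memory model: every worker sees the same array
# directly, instead of receiving a copy over a cluster interconnect.
import threading

N_WORKERS = 8             # stand-in for the 512 CPUs per Altix node
data = list(range(1000))  # one shared array, visible to every worker
partials = [0] * N_WORKERS

def worker(idx: int) -> None:
    # Each worker sums its own disjoint slice of the shared array;
    # no copies and no network traffic are needed.
    partials[idx] = sum(data[idx::N_WORKERS])

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(partials))  # 499500, same as summing the array serially
```

On a commodity cluster the same job would require partitioning `data`, shipping the pieces to each node, and gathering the partial sums back; that plumbing is exactly what the Altix's shared memory makes unnecessary.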
  • by cbreaker ( 561297 ) on Monday August 09, 2004 @12:50PM (#9921192) Journal
    .. or a very good writer.

    "They can also be modelled over a time period of weeks or months instead of over just a few days."

    Ohh sweet, so then what used to take days now takes months?

    And at one point in the article, it says "20 nodes" and then at another part it says "512 nodes." So like, what is it?

    You know what, I don't even care.
    • And at one point in the article, it says "20 nodes" and then at another part it says "512 nodes." So like, what is it?

      Read the article:
      "It is using an off-the-shelf system and taken that and built a powerful system around 512-processors which are then hooked together to give considerable power."

      512 processors * 20 nodes = 10240
    • Yes it takes weeks or months instead of days, but it does it an order of magnitude more expensively! This same technology has been employed for years at the DMV, and look at the results you get there. This is the Government we're talking about...
    • by Anonymous Coward
      Article not read by a technical person.

      The article is stating that the weather pattern studies would now be able to simulate activity periods of weeks or months rather than just days - NOT that the simulation runs themselves would take months rather than days!
    • "They can also be modelled over a time period of weeks or months instead of over just a few days."

      Ohh sweet, so then what used to take days now takes months?

      Think about a weather system: the more days they can model, the better. That's what this means; in a reasonable amount of time, weeks or months of activity can be simulated.
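The confusion over node and processor counts in this sub-thread reduces to simple arithmetic. A quick sanity check of the figures quoted above (assuming the article's decimal units; the per-CD size is back-calculated, not stated in the article):

```python
# Sanity-checking the numbers quoted in the thread.
processors_per_node = 512
nodes = 20
total_processors = processors_per_node * nodes
print(total_processors)  # 10240 -- matches the headline count

# The article's "500 terabytes = 800,000 CDs" implies this many MB per CD,
# which lands in the right ballpark for a ~650 MB data CD.
mb_per_cd = (500 * 1_000_000) / 800_000  # 500 TB expressed in MB, over CD count
print(mb_per_cd)  # 625.0
```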
  • by ChaosMt ( 84630 ) on Monday August 09, 2004 @12:54PM (#9921227) Homepage
    I can understand the BBC making this mistake, but slashdot?! I'm sure this was also noted in the dupe.
    • It is a cluster of supercomputers :-)

      Seriously, the way the Altix is laid out... I believe it is a cluster of 512 processor supercomputers.

      This isn't uncommon. Look at ASCI BLUE, or some of the other large IBM SP2 based systems.

    • At least, the value of this information has increased by a mod point since first instance (when I put it up :)

      Maybe this is why we ever need more computing power. Call it repetitive information gain strain syndrome.

  • Isn't the computer running the space shuttle built around one of those physical disk memory systems?

    So this super computer will be used in part to emulate the computer running on the space shuttle - probably one of the oldest designs still in regular use.

    So little memory the launch, orbit, and descent programs cannot be loaded simultaneously.

    • If it ain't broke, don't fix it. The only way to have a device that's very, very reliable is to keep it very, very simple.

      The shuttle's computer, btw, is a scaled back version of the IBM s/370 mainframe processor.
      • Re:Irony emulator (Score:5, Interesting)

        by kenaaker ( 774785 ) on Monday August 09, 2004 @01:49PM (#9921771)
        I worked on the space shuttle simulator (lo, these many years ago), and the shuttle computers are derivatives of the computers that IBM originally used in the B52's. They were called AP-101's, and if I remember correctly were Harvard Architecture systems with separate instruction and data store memories. I think they had 128K (32 bit?) words for instructions and 64K (16 bit?) words for data.

        The simulator originally ran on IBM System 360 mod 75's (serial numbers 1, 4, and 5). When I was working on it, the simulator was running on an IBM 3033 (370 architecture) machine running MVS, and had a hardware interface that attached 3 AP101's to the system IO channels. The shuttle hardware outside of the AP101's and environment were modelled in the 3033, even including the "slosh dynamics" of the fuel in the external tank. The simulator was written in 370 Assembler with macros for the programming control structures.

        One of the funniest things about running the simulator came out of the major failure tests. The simulator had a distinct "abend" that indicated that the vehicle had a position that was below the surface of the earth.

  • ...I literally started to salivate at the prospect of a cluster of this magnitude.
  • by Hamlin ( 543598 ) on Monday August 09, 2004 @01:16PM (#9921423)
    if they'd gone with G5 Xserves they could have had 23,888 Dual 2GHz systems with 17.916 Petabytes of storage (assuming they just went stock on the high-end systems).

    Okay and one question about the article. Was he saying 1000 Gb of RAM per system or 1000GB per system?
    • by halfelven ( 207781 ) on Monday August 09, 2004 @01:55PM (#9921834)
      It makes you believe this supercomputer is made out of commodity components.
      That's blatantly false.

      The SGI systems are highly proprietary equipment providing very large bandwidth between the nodes, extremely low latency, and tight integration. They're not regular Beowulf clusters. They really are single systems with hundreds or thousands of CPUs, all of them running the same single instance of the OS (as opposed to typical clusters, which run one OS instance per node).
      Because of the tight integration, the software does not have to obey the same constraints as when running on commodity clusters. In particular, the requirement for total parallelization no longer applies.
      As a result, problems which cannot be translated into 100% parallel algorithms, and which therefore do not run efficiently on commodity clusters, are easily tackled on SGI supercomputers.
      That's why they can charge a high price on their systems - because they can solve problems that are not accessible to "normal" computers.

      That being said, the system at NASA is indeed a cluster, but it's a "small" cluster (a handful of nodes), each node being a supercomputer with hundreds of CPUs. It's a hybrid that provides the best of both worlds.
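The parent's point about "total parallelization" can be quantified with Amdahl's law, the standard formula for this trade-off: if any fraction of a program is inherently serial, adding CPUs stops helping long before you reach 10,240 of them. (The law itself is standard; the specific fractions below are made-up illustrations.)

```python
# Amdahl's law: maximum speedup given an inherently serial fraction.
def amdahl_speedup(parallel_fraction: float, n_cpus: int) -> float:
    """Upper bound on speedup for a program whose parallelizable share is
    parallel_fraction, run on n_cpus processors."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cpus)

# Even a program that is 95% parallel tops out near 20x on 10,240 CPUs --
# nowhere near 10,240x. This is why low-latency shared memory, which lets
# you parallelize the stubborn parts too, is worth paying for.
print(round(amdahl_speedup(0.95, 10_240), 2))
```

This is the quantitative reason a tightly coupled machine can beat a cheaper commodity cluster on problems that resist full parallelization.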
  • by linuxislandsucks ( 461335 ) on Monday August 09, 2004 @01:41PM (#9921701) Homepage Journal
    where is SUN Microsystems?

    well someone had to ask :)
    • In Sun's head, Dr. Jekyll and Mr. Hyde are still fighting for supremacy. They've got this split personality syndrome and are still scratching their collective head figuring out which way to go. By the time they make up their mind, i'm afraid the sun will be setting. (pun intended)
  • What's the price tag on such a system ?
    Or, what's the price for just one 512 processor box ?
  • Just think how fast that scrollbar will fly across when they install Fedora Core 2! It'll probably take 2 seconds!

  • There are limits (Score:5, Insightful)

    by sakusha ( 441986 ) on Monday August 09, 2004 @02:13PM (#9922032)
    There is a limit to what computer power can do for you. I'd rather see the money being spent on human resources: people who know what they're doing. There's an old saying in the business world, I wish I knew who first said it, "for any technological problem, the limiting factor is never technology, but rather, human resources." In other words, if your technology has problems, throwing more tech at it is unlikely to solve the problems. Only more human intelligence applied to the situation will improve things.
    Having the fastest supercomputer in the world won't help you one bit if nobody thinks to run a simulation of what happens when a chunk of foam blows a hole in a wing. I keep thinking about Frank Borman's statements to the Apollo 13 Commission, he said it wasn't a failure of technology, it was a failure of imagination, nobody ever imagined there could be a problem. Computers have no imagination. They give answers, but nobody's asking the right questions.
    • You are only correct to a point. The human imagination can come up with millions of things that can go wrong. There is no way those same people could calculate each possibility thoroughly. That is where supercomputers come in. They are to mathematically test those theories through simulations. A piece of foam falling off the hull is only one of infinite possibilities. The idea is to find a middle ground. NASA employs a lot of people, I don't think $160 million for 10k processors that can do the wo
  • SCO Tax (Score:3, Funny)

    by Nonillion ( 266505 ) on Monday August 09, 2004 @03:32PM (#9922825)
    They better pay their $7,157,760 ($699/CPU) in SCO tax or McBride is going to be stomping around saying "NASA is screwing us!"
  • Imagine a Beowulf cluster off... oh, wait...
  • by Hugh-know-who ( 716929 ) on Monday August 09, 2004 @04:28PM (#9923330)
    The Columbia disaster was not due to a lack of computing power, but rather to a culture of denial. The failure of mid-level and senior management to listen to their people prevented any action being taken until it was too late. In a way, this mirrors the broader American culture of the late 20th and early 21st Centuries, typified by a complete refusal of individuals - particularly but not exclusively individuals in powerful positions - to take responsibility for their own actions, inactions or failures of any kind.
