NASA Space Hardware Technology

Self-Healing Computers For NASA Spacecraft

Posted by Soulskill
from the it-worked-for-the-borg dept.
Roland Piquepaille writes "As you can guess, hardwired computer systems are much faster than general-purpose ones because they are designed to do a single task. But when they fail, they need to be totally reconfigured. This can be just a costly problem in a lab on Earth, but it can be vital in space. This is why a University of Arizona (UA) team is working with NASA to design self-healing computer systems for spacecraft. The UA engineers are working on hybrid hardware/software systems using Field Programmable Gate Arrays (FPGAs) to develop these reconfigurable processing systems. As the lead researcher said, 'Our objective is to go beyond predicting a fault to using a self-healing system to fix the predicted fault before it occurs.'"

  • by 1u3hr (530656) on Saturday April 26, 2008 @05:29AM (#23206230)
    "Just a moment....Just a moment.
    I've just picked up a fault in the AE-35 Unit.
    It's going to go 100 percent failure within 72 hours."
  • Not new (Score:5, Informative)

    by Anonymous Coward on Saturday April 26, 2008 @05:45AM (#23206262)
    I used to work for JPL, in a group that was researching the feasibility and applications of FPGAs for this exact purpose. That was around 7-8 years ago, which significantly predates this "news," given the pace of technology. IIRC, they called it "evolvable hardware."
    • by Taelron (1046946)
      Thank you, I could have sworn I heard this exact approach discussed and tested back around 2000... Is this another one of those ideas that was too difficult at the time and shelved, only to be dusted off and tried again with new technology?
    • by jd (1658)
      I first saw mention of circuits that could bypass failed areas in a mid 1980s article by Sir Clive Sinclair, who argued it could be used to produce wafer-scale technology. The errors in the wafer would be unimportant, as they'd all be bypassed. Of course, this isn't what I'd call "self-healing" (where circuit switches go along with some sort of effort to repair the original damage if possible), but actual repair - beyond perhaps some sort of robot-wielded silver pen to re-connect broken tracks on a circuit
    • This capability was first discussed back in the 70's and 80's and was theorized about back in the 60's, considerably predating your "anecdote". This is news, no matter what the 'pace of technology'* is, because they haven't quite managed to make it work yet.

      * Largely a meaningless set of buzzwords. Even in computers not every portion of the field progresses at the same pace.
    • Langley [nasa.gov],
      Paper E3 [klabs.org],
      Paper 161 [klabs.org] and even a 110MB video of students
      programming FPGAs at NASA [starbridgesystems.com]
  • "If two units go down and can't fix themselves, the three remaining units split up the tasks. All of this is done autonomously without human aid."

    The idea is simple, and I think therein lies its ability to succeed. Regardless of how difficult the programming is, the end result is conceptually very basic, tried and true: system redundancy and a support network. Mighty fine.
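
    A minimal sketch of the task-splitting idea quoted above, assuming five units with one task each and a simple "give it to the least-loaded survivor" policy. The unit names, task names, and the `redistribute` helper are all hypothetical; the article does not describe the actual scheduling logic the UA team uses.

```python
def redistribute(assignments, failed):
    """Return a new unit -> tasks map with tasks from failed units
    handed to whichever surviving unit currently has the fewest tasks.

    assignments: dict mapping unit name -> list of task names (hypothetical)
    failed:      set of unit names that could not repair themselves
    """
    survivors = sorted(u for u in assignments if u not in failed)
    if not survivors:
        raise RuntimeError("no healthy units left")

    # Survivors keep the tasks they already run.
    new = {u: list(assignments[u]) for u in survivors}

    # Collect the tasks that lost their unit, in a deterministic order.
    orphaned = [t for u in sorted(failed) if u in assignments
                for t in assignments[u]]

    # Hand each orphaned task to the least-loaded survivor (ties by name).
    for task in orphaned:
        target = min(survivors, key=lambda u: (len(new[u]), u))
        new[target].append(task)
    return new


units = {
    "u1": ["nav"],
    "u2": ["comms"],
    "u3": ["thermal"],
    "u4": ["imaging"],
    "u5": ["telemetry"],
}
print(redistribute(units, failed={"u4", "u5"}))
```

    With two of five units down, the three survivors end up carrying all five tasks, which is the scenario the quote describes.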
  • by jmickle (941634) on Saturday April 26, 2008 @06:22AM (#23206344)
    Well, at least you can't get a robot pregnant......
  • by flnca (1022891) on Saturday April 26, 2008 @06:25AM (#23206350) Journal
    What will Starbridge Systems [starbridgesystems.com] think about that? Didn't they develop a dynamically reconfigurable computer that ran Windows NT as a test application on 10,000+ FPGAs back in the '90s? IIRC, they also had a software framework able to automatically implement software fragments in hardware using FPGA auto-configuration.

    Self-repairing computer systems for spacecraft have been under discussion for decades, and every now and then we get to hear about a new project. This project certainly is a good idea; hopefully it will work.

    BTW, Motorola (now Freescale) developed self-repairing processors for military applications a couple of years ago.
    • I think they had troubles with that system. It kept repairing itself to run Unix.
    • BTW, Motorola (now Cyberdyne) developed self-repairing processors for military applications a couple of years ago.
      There we go, fixed that for you
      • by flnca (1022891)
        Why? Soldier bots like in the Terminator movies aren't that bad an idea. Better than real soldiers dying on the battlefield. And a good deterrent too. But in the wrong hands ... yeah ...

        The Terminator movies aren't that far-fetched, after all. The right type of AI, robot planes, tanks, and soldiers, and mankind is no more ... ;-)

        Then we can only hope that time travel is invented and someone gets sent back through time to prevent that from happening. ;-)
        • Why what? I didn't specifically say it was a bad idea ;) Robot soldiers dying on the battlefield seems a bit stupid, when both sides have them at least. In the cases where only one side has them it would be a massacre. In cases where both sides have them, what's the point? Why not just nuke all the robots? Why not just fight our wars over a game of Starcraft or something rather than spend billions developing robots to play our elaborate game?
  • ...is being implemented by Jackson Roykirk in the Nomad project. What could possibly go wrong?
  • hmm (Score:4, Funny)

    by thatskinnyguy (1129515) on Saturday April 26, 2008 @06:42AM (#23206380)
    For the sake of all humanity in the impending robot wars, let's stop this right now.
  • NASA has been working on this in one form or another for many years now. How is this NEW news now?
    • I attend UCF, and will be starting my graduate degree in the Fall semester after attending graduation next week. This is old news.

      When I started attending UCF for my EE, this had already been done. I have recently completed (last Thursday) Dr. Wu's class on Genetic Algorithms (Evolutionary Computation). This work was used by (grad) students as a starting point for their research for the class project.

      Let me express how this is old news.

      2003 - http://www.springerlink.com/index/M26H2CEEAGWG4FD5.pdf [springerlink.com]

      1993 - h [ieee.org]
  • I fail to see what is new in their approach. Both of these fields have been explored before, and their approach is essentially based on redundancy, only the available standby gates are in the FPGA. I read their paper; it seems the biggest part they are still lacking is problem determination. Their approach is also prone to failure when the reconfiguration hardware, the processor, or the analog components are the faulty ones. Although it could have some potential, its reliability has
    • Re: (Score:2, Interesting)

      by arktemplar (1060050)
      I had mentioned this some time back as well, but polymorphic processors like MOLEN (TU Delft is doing this one) might be useful for this sort of stuff. The theory behind it is simple, and extends to modern multicore systems as well: basically, break up the instruction set into microinstructions (all processors that I know of do this part), then have any one of the many computational units available do whatever work is required to implement those microinstructions. The translation is done by the core pr
  • Well, I don't know, but somehow I think this article is missing the "whatcouldpossiblygowrong" tag. I know, it has been posted already, but self-healing computers just call up HAL in my mind every time I read about them...
  • I like this.

    I'm hoping NASA involvement will help produce spinoffs for the domestic user eventually. We're all probably familiar with this happening in the past. Military interest might be nice for research too.

    This could address some of the things that bother me about the most common modern architecture paradigms.

    Such as when you're performing one type of task the hardware for other types can remain un[der]utilised. Like my graphics card is sitting on its ass when the CPU is running emulation or ray-traci
  • Really repairing problems, or just auto-rebooting like the Mars rover until the batteries run out?
  • by RMES (1279770)
    This is good stuff because a solution of this nature will soon also be required in aircraft and perhaps other terrestrial vehicles.
  • by Animats (122034) on Saturday April 26, 2008 @11:46AM (#23207436) Homepage

    It's Roland the Plogger again, pushing his ad-laden blog. The actual research summary is here [arizona.edu]. The real paper won't be out until July.

    This isn't new. JPL has been trying various levels of self-healing for years.

    The original article describes a cluster of five machines, set up so that if one fails, others take over tasks running on the failed machine. That's what the better server management systems do. I went to a talk last week by Amazon's CTO, and he described how their platform does that.

    The project web site makes things clearer. There are two levels of recovery. The upper level works like cluster failover. The lower level tries to reconfigure the FPGAs to use different cells in the FPGA to work around faults. That's likely to be a delicate process; you'd need substantial on-chip test resources to reliably do gate-level fault isolation on an FPGA that's been hit hard by a cosmic ray. It's not clear how fine-grained this is; this may be more like having multiple units like GPU shaders replicated in an FPGA, with the ability to turn off the failed ones. Sort of like the way Sony ships PS3 machines whose Cell processor has eight SPEs, at least seven of which work.
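
    The two recovery levels described above can be sketched roughly as follows. This is only an illustration under loud assumptions: the `Unit` class, the spare-cell pool, and `handle_fault` are hypothetical names, and a real system would need the on-chip fault isolation mentioned above before it could call anything like `report_fault`.

```python
class Unit:
    """One FPGA-based unit with some active cells and a pool of spares."""

    def __init__(self, name, cells, spares):
        self.name = name
        self.active = set(range(cells))                     # cells in use
        self.spares = list(range(cells, cells + spares))    # unused cells
        self.healthy = True

    def report_fault(self, cell):
        """Lower level: retire a faulty cell and route around it via a spare.

        Returns True if the unit can keep running, False if it ran out
        of spare cells and has to be taken out of service.
        """
        if cell not in self.active:
            return True                  # fault is in a cell we don't use
        self.active.discard(cell)
        if self.spares:
            self.active.add(self.spares.pop(0))
            return True
        self.healthy = False
        return False


def handle_fault(units, name, cell):
    """Upper level: fall back to cluster-style failover if remapping fails."""
    if units[name].report_fault(cell):
        return ("remapped", name)
    for other in units.values():
        if other.healthy and other.name != name:
            return ("failover", other.name)
    return ("lost", None)


units = {"a": Unit("a", cells=2, spares=1), "b": Unit("b", cells=2, spares=1)}
print(handle_fault(units, "a", 0))   # first fault: a spare cell absorbs it
print(handle_fault(units, "a", 1))   # second fault: no spares, hand off to b
```

    The sketch says nothing about granularity; whether a "cell" is a single logic cell or a whole replicated block is exactly the open question raised above.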

    The available info isn't enough to tell whether this is a good idea or not. About typical for Roland the Plogger.

  • I for one welcome our self sustaining indestructible mechanical overlords.
  • When you take a set of systems and let them vote on which among them have the "most right" answer, that's a committee.

    Take two sets, and that's a congress.

    Get enough members into these sets and they'll reset each other over and over, accomplishing nothing useful. As a design principle it's brilliant as they'll never figure out that accomplishing nothing was the original goal anyway.

  • I immediately had happy thoughts of adaptable ship computers a la Star Trek: The Next Generation. They were always re-routing pathways through the systems on the spur of the moment to reach more resources or get past damage. :) Yes, I am a geek.

