Self-Healing Computers For NASA Spacecraft
Roland Piquepaille writes "As you can guess, hardwired computer systems are much faster than general-purpose ones because they are designed to do a single task. But when they fail, they need to be totally reconfigured. This can be just a costly problem in a lab on Earth, but it can be vital in space. This is why a University of Arizona (UA) team is working with NASA to design self-healing computer systems for spacecraft. The UA engineers are working on hybrid hardware/software systems using Field Programmable Gate Arrays (FPGAs) to develop these reconfigurable processing systems. As the lead researcher said, 'Our objective is to go beyond predicting a fault to using a self-healing system to fix the predicted fault before it occurs.'"
Re: (Score:1)
The 9000 Series has a perfect operational record (Score:5, Funny)
"I've just picked up a fault in the AE-35 Unit.
It's going to go 100 percent failure within 72 hours."
Re:The 9000 Series has a perfect operational recor (Score:5, Funny)
Re: (Score:1)
Comment removed (Score:5, Funny)
Re: (Score:2)
Getting hazy, can't divide three by two.
My answers I can not see 'em,
they're stuck in my Pentium.
It would be fleet, my answers sweet,
on a workable FPU.
Re:The 9000 Series has a perfect operational recor (Score:2, Insightful)
Re:The 9000 Series has a perfect operational recor (Score:5, Interesting)
Re: (Score:2)
2001? Superman III, more like... (Score:2)
In case anyone doesn't get it, the above is a reference to the Stanley Kubrick film 2001: A Space Odyssey.
If we're talking about self-healing computers, surely the one in Superman III [imdb.com] is a better example. Anyone remember that?
:-/
It could heal itself and (most memorably) even turn one of the baddies into a robot to defend itself, but its graphics were on the level of an Atari 2600.
Re:The 9000 Series has a perfect operational recor (Score:4, Funny)
Re:The 9000 Series has a perfect operational recor (Score:5, Funny)
The first thing I thought when reading the story was: "I know, I'll post a comment about the AE-35 unit."
Then I read down, and yours was the top comment. It just reminds me that I don't belong in the company of normal people. The Slashdot social leper colony is my true home. I know my place!
Re: (Score:1)
Re: (Score:2)
Re: (Score:1)
From TFA:
...will hopefully be a worthy [pred|succ]essor to LCARS [lcarscom.net].
Ha! (Score:2)
ZEN: Auto-repair circuits are working at maximum capacity. Damage exceeds rectification capability.
DAYNA: Damage? What damage?
ZEN: That information is not available.
Re: (Score:2)
That must have been one of the cooler episodes.
Hope it makes it into the remake: http://www.theregister.co.uk/2008/04/24/blakes_seven/ [theregister.co.uk]
Re: (Score:1)
Re:The 9000 Series has a perfect operational recor (Score:2)
Not new (Score:5, Informative)
Re: (Score:2)
Re: (Score:2)
Re: (Score:3, Interesting)
I think the goal of this project isn
Re: (Score:2)
Re: (Score:2)
* Largely a meaningless set of buzzwords. Even in computers not every portion of the field progresses at the same pace.
Much prior related NASA research at Langley, JPL,. (Score:2)
Paper E3 [klabs.org],
Paper 161 [klabs.org] and even a 110MB video of students
programming FPGAs at NASA [starbridgesystems.com]
Re: (Score:2)
Beauty in Simplicity (Score:1)
The idea is simple, and I think therein lies its ability to succeed. Regardless of how difficult the programming is, the end result is conceptually very basic, tried and true: system redundancy and a support network. Mighty fine.
Re:Beauty in Simplicity (Score:5, Informative)
Re: (Score:2)
The future of pr0n! (Score:4, Funny)
Re: (Score:1, Informative)
Re: (Score:1)
Re: (Score:2, Funny)
Re: (Score:1)
Re: (Score:2)
-- Otaku Joe
Doesn't this already exist? (Score:5, Interesting)
Self-repairing computer systems for spacecraft have been under discussion for decades, and every now and then we hear about a new project. This project certainly is a good idea; hopefully it will work.
BTW, Motorola (now Freescale) developed self-repairing processors for military applications a couple of years ago.
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
The Terminator movies aren't that far-fetched, after all. The right type of AI, robot planes, tanks, and soldiers, and mankind is no more.
Then we can only hope that time travel is invented and someone gets sent back through time to prevent that from happening.
Re: (Score:2)
The first use of this technology... (Score:2, Funny)
hmm (Score:4, Funny)
Re: (Score:1)
This Really Isn't anything New (Score:1)
Re: (Score:2)
When I started attending UCF for my EE, this had already been done. I have recently completed (last Thursday) Dr. Wu's class on Genetic Algorithms (Evolutionary Computation). This work was used by (grad) students as a starting point for their research for the class project.
Let me express how this is old news.
2003 - http://www.springerlink.com/index/M26H2CEEAGWG4FD5.pdf [springerlink.com]
1993 - h [ieee.org]
Reconfigurable Computing, Fault Tolerance (Score:2, Informative)
Re: (Score:2, Interesting)
Well... (Score:1)
How long for Spinoffs? (Score:1)
I'm hoping NASA involvement will help produce spinoffs for the domestic user eventually. We're all probably familiar with this happening in the past. Military interest might be nice for research too.
This could address some of the things that bother me about the most common modern architecture paradigms.
For example, when you're performing one type of task, the hardware for other types can remain un[der]utilised. Like my graphics card is sitting on its ass when the CPU is running emulation or ray-traci
reboot (Score:1)
RMES (Score:1)
Roland the Plogger again (Score:4, Informative)
It's Roland the Plogger again, pushing his ad-laden blog. The actual research summary is here [arizona.edu]. The real paper won't be out until July.
This isn't new. JPL has been trying various levels of self-healing for years.
The original article describes a cluster of five machines, set up so that if one fails, others take over tasks running on the failed machine. That's what the better server management systems do. I went to a talk last week by Amazon's CTO, and he described how their platform does that.
The project web site makes things clearer. There are two levels of recovery. The upper level works like cluster failover. The lower level tries to reconfigure the FPGAs to use different cells in the FPGA to work around faults. That's likely to be a delicate process; you'd need substantial on-chip test resources to reliably do gate-level fault isolation on an FPGA that's been hit hard by a cosmic ray. It's not clear how fine-grained this is; this may be more like having multiple units like GPU shaders replicated in an FPGA, with the ability to turn off the failed ones. Sort of like the way Sony ships the PS3's Cell processor with eight SPEs, at least seven of which work.
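To make the two levels concrete, here's a rough Python sketch of how they might fit together. Everything in it is an assumption for illustration (the `Node` class, the `needed` threshold, the least-loaded placement rule); it is not taken from the UA project. The lower level just switches off a failed replicated unit, and the upper level moves tasks to other nodes once a node no longer has enough working units.

```python
from dataclasses import dataclass, field

# Hypothetical model, not the UA/NASA design: all names and thresholds
# here are illustrative assumptions.

@dataclass
class Node:
    name: str
    units: int                      # replicated units on this node's FPGA
    needed: int                     # units required to keep running tasks
    failed: set = field(default_factory=set)
    tasks: list = field(default_factory=list)

    def working(self) -> int:
        return self.units - len(self.failed)

    def mark_failed(self, unit: int) -> bool:
        """Lower level: disable a faulty unit; True if the node can still run."""
        self.failed.add(unit)
        return self.working() >= self.needed


def fail_over(nodes, dead):
    """Upper level: move tasks off a dead node to the least-loaded healthy nodes."""
    healthy = [n for n in nodes if n is not dead and n.working() >= n.needed]
    if not healthy:
        raise RuntimeError("no healthy node left")
    while dead.tasks:
        target = min(healthy, key=lambda n: len(n.tasks))
        target.tasks.append(dead.tasks.pop(0))


def report_fault(nodes, node, unit):
    """Try the lower level first; fall back to cluster failover if it is not enough."""
    if not node.mark_failed(unit):
        fail_over(nodes, node)


nodes = [Node(f"n{i}", units=8, needed=7) for i in range(5)]
nodes[0].tasks = ["nav", "telemetry"]
report_fault(nodes, nodes[0], 3)    # 7 of 8 units still work: tasks stay put
report_fault(nodes, nodes[0], 5)    # 6 of 8: tasks move to other nodes
```

The hard part the sketch skips entirely is the fault isolation itself: `report_fault` assumes something else has already worked out which unit is bad.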
The available info isn't enough to tell whether this is a good idea or not. About typical for Roland the Plogger.
Obligatory (Score:1)
Replacing the executive with a committee (Score:1)
When you take a set of systems and let them vote on which among them have the "most right" answer, that's a committee.
Take two sets, and that's a congress.
Get enough members into these sets and they'll reset each other over and over, accomplishing nothing useful. As a design principle it's brilliant as they'll never figure out that accomplishing nothing was the original goal anyway.
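For anyone who hasn't seen the committee in code, here's a minimal sketch of majority voting over redundant outputs. It's generic Python, not anything from TFA; the function name and the strict-majority rule are my own assumptions.

```python
from collections import Counter

def majority_vote(outputs):
    """Return (value, dissenters) for the strict-majority output, or raise.

    dissenters lists the indices of the systems that disagreed with the
    majority, i.e. the candidates for a reset.
    """
    if not outputs:
        raise ValueError("no outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no strict majority")
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters
```

With three systems answering `[42, 42, 41]`, the vote returns 42 and flags system 2. With an even split there is no majority and the function raises, which is one reason these schemes usually use an odd number of members.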
a la Star Trek NG (Score:1)