Bug NASA Space Science

NASA Finds Cause of Voyager 2 Glitch

Posted by kdawson
from the blame-cosmic-rays dept.
astroengine writes "Earlier this month, engineers suspended Voyager 2's science measurements because of an unexpected problem in its communications stream. A glitch in the flight data system, which formats information for radioing to Earth, was believed to be the problem. Now NASA has found the cause of the issue: it was a single memory bit that had erroneously flipped from a 0 to a 1. The cause of the error is yet to be understood, but NASA plans to reset Voyager's memory tomorrow, clearing the error."
This discussion has been archived. No new comments can be posted.

  • Really? (Score:3, Insightful)

    by atomicthumbs (824207) <atomicthumbs@g[ ]l.com ['mai' in gap]> on Wednesday May 19, 2010 @12:44AM (#32261370) Homepage

    The cause of the error is yet to be understood

    Let me guess: cosmic ray. Is it really that hard? What else causes a single bit-flip error in space?

  • Re:Really? (Score:3, Insightful)

    by srothroc (733160) on Wednesday May 19, 2010 @12:49AM (#32261406) Homepage
    Age? Voyager is hardly brand new.
  • Re:Really? (Score:5, Insightful)

    by pclminion (145572) on Wednesday May 19, 2010 @12:50AM (#32261412)

    Let me guess: cosmic ray. Is it really that hard? What else causes a single bit-flip error in space?

    When you have a probe billions of miles from Earth, with no hope of ever physically retrieving it, and something weird happens, I don't think the first thing you do is start making assumptions.

  • Re:So.... reboot? (Score:2, Insightful)

    by the roAm (827323) on Wednesday May 19, 2010 @12:51AM (#32261424)

    Because if it had been something else, rebooting could have done more harm than good.

  • Re:So.... reboot? (Score:5, Insightful)

    by Brett Buck (811747) on Wednesday May 19, 2010 @12:59AM (#32261466)

    Why don't they just always try that first?

    Because sometimes it doesn't come back on again.

    Brett

  • Re:Really? (Score:4, Insightful)

    by Peach Rings (1782482) on Wednesday May 19, 2010 @01:01AM (#32261486) Homepage

    It's pretty amazing that they even were able to track the problem down to a particular bit. No general purpose operating system has anything even remotely having dreams of approaching that level of reliability and stability. It's nice to see the strengths of bare-metal hacking demonstrated in this bleary age of big-button-pushing Java and .NET.

  • by blind biker (1066130) on Wednesday May 19, 2010 @01:02AM (#32261488) Journal

    This is why you DO WANT nuclear energy in space! OK, Voyager 1 and 2 have RTGs, but even those are considered politically incorrect these days, especially such massive ones as in the Voyagers.

    More nuclear power in spacecraft, I say. To provide propulsion (ion drive, or even better, explosive drive) and energy when far from the Sun. Fuck PC.

  • Hero (Score:5, Insightful)

    by LoudMusic (199347) on Wednesday May 19, 2010 @01:04AM (#32261500)

    NASA is my hero. They do cool shit all the time. Even when their stuff breaks, it's cool. Then they fix it and it's even more cool.

  • Re:Really? (Score:5, Insightful)

    by BitZtream (692029) on Wednesday May 19, 2010 @01:11AM (#32261536)

    It's also extremely important to note that not a single item you own is made to the specifications the Voyagers were built to, even though they were built over 30 years ago.

    It's also rather important to note that as unstable as most OSes are, they are several million times more complex than the code Voyager 1 and 2 run.

    Finally, joke about Windows all you want ... if you do a default installation of Windows and you don't install any additional drivers or software, it is extremely stable and will just sit there for ages happy to do nothing but tick away.

    It's also entirely feasible to find one stuck or flipped bit even using Java and .NET; you just have to actually understand the inner workings of the code, which is not something pretty much any developer working in these environments has time to do these days.

    Both things may be computers that run code and use electricity to do so, but that's about where the shared bits end. These guys have been using the same code for 30+ years ... they kinda know how it works and all its quirks at this point.

    With all that said ... you're still right, it's freaky impressive.

  • Just incredible! (Score:3, Insightful)

    by mcrbids (148650) on Wednesday May 19, 2010 @01:35AM (#32261636) Journal

    Voyager is anything but brand new. Voyager is probably older than most Slashdotters, having been launched in 1977. Think about it: 1977 - when advanced microchips were not as powerful as the chip driving the shatty calculator you buy today at the dollar store. 1977 was a different time, when information technology usually didn't even involve transistors yet, and vacuum tube testers (for your TV) were still found at the local drug store.

    And yet, some 33 years later, Voyager 2 is still chugging on, after visiting ALL of the outer planets, still going waaayayyyyyyy past its original design limits, still providing meaningful information on its way out roughly towards the star Sirius. It's now twice as far away from the Sun as Pluto is.

    Like the Mars rovers, this is truly good engineering at work.

  • Re:Really? (Score:1, Insightful)

    by Anonymous Coward on Wednesday May 19, 2010 @03:01AM (#32262048)

    I've diagnosed a single bit-error in a system with 1GB RAM based on the corruption it created in files. That bit error was even intermittent, with only a selection of surrounding bit patterns triggering it, and then not all the time. It is not magic, folks. You look at the data which deviates from the expectations and look for patterns. Then you use your knowledge of how the system works to establish theories about the possible causes, and finally you run tests to see if the kind of deviation occurs that the suspected cause would trigger.
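
    The "look at the data which deviates from the expectations" step can be sketched in a few lines of Python. This is only a rough illustration, assuming you have a known-good copy of the data to compare against; the function name `find_flipped_bits` and the sample bytes are made up for the example and do not come from any real diagnostic tool.

```python
def find_flipped_bits(expected: bytes, observed: bytes) -> list:
    """Return (byte_offset, bit_index) for every bit that differs.

    bit_index 0 is the least significant bit of the byte.
    """
    if len(expected) != len(observed):
        raise ValueError("buffers must be the same length")
    flipped = []
    for offset, (e, o) in enumerate(zip(expected, observed)):
        diff = e ^ o  # XOR leaves a 1 wherever the two bytes disagree
        for bit in range(8):
            if diff & (1 << bit):
                flipped.append((offset, bit))
    return flipped


# Hypothetical data: a known-good buffer and a copy with one bit set by mistake.
good = bytes([0x00, 0x10, 0xFF, 0x42])
bad = bytearray(good)
bad[1] ^= 0x04  # simulate a single bit going from 0 to 1

print(find_flipped_bits(good, bytes(bad)))  # [(1, 2)]
```

    An intermittent fault like the one described would need many such comparisons, collected over time, before a pattern in the offsets shows up.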

  • by Hurricane78 (562437) <deleted@slashd[ ]org ['ot.' in gap]> on Wednesday May 19, 2010 @03:39AM (#32262264)

    tubes had that nice desirable sweet distortion...

    There, fixed that for ya...

  • Re:Hero (Score:1, Insightful)

    by Anonymous Coward on Wednesday May 19, 2010 @03:48AM (#32262306)

    NASA is my hero. They do cool shit all the time. Even when their stuff breaks, it's cool.

    While the fireworks involved were indeed spectacular, I hear the crews of Challenger and Columbia would like to have a word with you for that bold statement...

  • Bah Kids today (Score:3, Insightful)

    by jellomizer (103300) on Wednesday May 19, 2010 @08:43AM (#32264260)

    You probably haven't had much experience with these older computer systems. They did what they needed to do and that is it. The hardware was wired to do what it needed to do. Every bit had a purpose. If that bit failed, you knew that something was wrong, making it fairly easy to find the bit that was bad.

    1K can be represented in a 32x32 square. These systems had only a few K of memory to view, and millions of dollars of funding. Finding a missing bit is actually very easy, especially if you go through the design specs and see what bit does what.
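
    To make the 32x32 picture concrete, here is a minimal Python sketch. It assumes "1K" means 1024 bits (128 bytes), and the stuck bit at byte 37 is purely hypothetical; `bit_grid` is a name invented for this example.

```python
def bit_grid(memory: bytes, width: int = 32) -> list:
    """Lay out memory bits, most significant bit first, as rows of '0'/'1'."""
    bits = "".join(f"{byte:08b}" for byte in memory)
    return [bits[i:i + width] for i in range(0, len(bits), width)]


memory = bytearray(128)  # 128 bytes = 1024 bits, all zero
memory[37] |= 0x01       # hypothetical stuck bit, for illustration only

grid = bit_grid(bytes(memory))
for row_number, row in enumerate(grid):
    if "1" in row:
        # Report the row and column of the one bit that is set.
        print(row_number, row.index("1"))  # 9 15
```

    With a memory this small, the whole state fits on one screen, which is a large part of why a single bad bit can be pinned down.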

    General-purpose computing was a tradeoff that I think for the most part has benefited us. If every computer had to be specialized at the bit level to do one or a few things and do them well, we would have a lot of very secure and extremely reliable computers... However, only a few large organizations would be able to afford them, as they would need a full custom design of their processes. And in terms of power they would be a lot less than they are today.

    The General Purpose computers while are very complex and can cause a lot of problems.

  • by Chris Burke (6130) on Wednesday May 19, 2010 @10:34AM (#32265734) Homepage

    I can't help but think that they purposely set the limits low so that when the machines operate better than anticipated, NASA (or anyone else for that matter) can take a higher degree of credit than if they were more realistic with the expectations.

    That's one way to look at it.

    Another way to look at it is that it is impossible in most cases to precisely predict how long a specific instance of a part will last before failure, and you can at best describe it probabilistically. So first, you're going to design it to last as long as possible. Then, you're going to take your estimated Mean Time Before Failure and back off by a couple standard deviations so that there's a high probability that the part will last at least that long, rather than a 50% chance of it lasting longer than the mean (assuming normal distribution for part failure).

    To put it simply: Designing something so that you can be fairly certain it will last as long as you need it necessarily means designing it so that if things go well it can last much longer. That's not sandbagging, it's called margin and it's needed to usefully meet the requirements. The requirement is "A device that lasts for at least X years". Not "A device that on average lasts X years".
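
    The margin argument can be put into numbers with a short Python sketch. The figures here (a 10-year mean life with a 2-year standard deviation) are invented for illustration and are not real part data; the normal distribution is the same simplifying assumption made above, and `rated_life` is a hypothetical helper.

```python
from statistics import NormalDist


def rated_life(mean_years: float, sd_years: float, k: float = 2.0) -> float:
    """Back the quoted lifetime off from the mean by k standard deviations."""
    return mean_years - k * sd_years


# Hypothetical part: lifetime ~ Normal(mean = 10 years, sd = 2 years).
life = NormalDist(mu=10.0, sigma=2.0)

spec = rated_life(life.mean, life.stdev)  # 10 - 2 * 2 = 6 years
p_reach_spec = 1 - life.cdf(spec)         # chance of lasting at least 6 years
p_reach_mean = 1 - life.cdf(life.mean)    # chance of lasting at least 10 years

print(f"rated life: {spec:.1f} years")
print(f"P(lasts >= rated life): {p_reach_spec:.3f}")
print(f"P(lasts >= mean life):  {p_reach_mean:.3f}")
```

    Under these assumptions a part rated for 6 years has roughly a 98% chance of meeting its rating, and about half of such parts will still be running at 10 years, which is the "lasts much longer than spec" effect being described.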

    This doesn't apply that much to the Mars rovers though. They were engineered as robustly as possible within the weight limits to be sure they could survive at all in a largely unknown environment. The 90 day mission had nothing to do with the design of any particular component except for the solar panels, and that only because they didn't know the Martian wind would blow the dust off for them.
