Bug | NASA | Space | Science

NASA Finds Cause of Voyager 2 Glitch 283

Posted by kdawson
from the blame-cosmic-rays dept.
astroengine writes "Earlier this month, engineers suspended Voyager 2's science measurements because of an unexpected problem in its communications stream. A glitch in the flight data system, which formats information for radioing to Earth, was believed to be the problem. Now NASA has found the cause of the issue: it was a single memory bit that had erroneously flipped from a 0 to a 1. The cause of the error is yet to be understood, but NASA plans to reset Voyager's memory tomorrow, clearing the error."


Comments:
  • by atomicthumbs (824207) <atomicthumbs@gmail.com> on Wednesday May 19, 2010 @01:48AM (#32261402) Homepage
    The Voyager probes use a radioisotope thermoelectric generator, so that wouldn't work anyway.
  • Re:Really? (Score:5, Informative)

    by 0123456 (636235) on Wednesday May 19, 2010 @02:27AM (#32261602)

    It's pretty amazing that they were even able to track the problem down to a particular bit.

    To be fair, Voyager doesn't have many bits in its memory :). Tracking down a bad bit is much easier when you have 4k of RAM than when you have 4GB of RAM.
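
    Not that the article says how they did it, but in principle: dump the memory, XOR it against the known-good copy held on the ground, and any 1 bits in the difference pinpoint the flip. A minimal Python sketch:

    def find_flipped_bits(reference, downlinked):
        """Return (byte_offset, bit_index) for every bit that differs."""
        flips = []
        for offset, (ref, dl) in enumerate(zip(reference, downlinked)):
            diff = ref ^ dl                  # XOR leaves a 1 in each differing bit
            for bit in range(8):
                if diff & (1 << bit):
                    flips.append((offset, bit))
        return flips

    # Toy 4-byte "memory" with a single 0 -> 1 flip in byte 2, bit 2:
    good = bytes([0b00000000, 0b10101010, 0b11110000, 0b00001111])
    bad  = bytes([0b00000000, 0b10101010, 0b11110100, 0b00001111])
    print(find_flipped_bits(good, bad))      # [(2, 2)]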

  • Just don't brick it! (Score:3, Informative)

    by WGFCrafty (1062506) on Wednesday May 19, 2010 @02:53AM (#32261714)
    The Voyagers are my favorite probes!

    I wonder how many bits they'll have to send to change the one wrong one, and how long that will take.

    Leave it to stoner astrophysicist Carl Sagan to oversee one of the more amazing feats of space travel!

    Radioisotope thermoelectric generator [wikipedia.org]s are awesome!
    Anyone know how much fuel is remaining? They've been putting out heat in the name of knowledge for a long time now.



    Personally, I want about 6 of the units in Voyager 2, screw solar!
  • Re:Unbelievable (Score:3, Informative)

    by ducomputergeek (595742) on Wednesday May 19, 2010 @03:00AM (#32261750)

    http://en.wikipedia.org/wiki/Error_detection_and_correction#Error-correcting_code [wikipedia.org]

    Apparently V1 and V2 got the beta version of ECC.

  • Re:Just incredible! (Score:5, Informative)

    by fdrebin (846000) on Wednesday May 19, 2010 @03:10AM (#32261792)

    1977 was a different time, when information technology usually didn't even involve transistors yet, and vacuum tube testers (for your TV) were still found at the local drug store.

    Tube testers were pretty darned hard to find almost anywhere in 1977 (you could find them in old used-electronics stores). I do recall testing tubes in drugstores in the early '70s.

    Solid state, and even (*gasp*) integrated circuits were in widespread use. Why, by gosh by golly, we even had *8080*'s then.

    I was a senior in college in physics+EE; a handful of my fellow students and I managed to coerce one of the EE profs to take a few hours and teach us about tubes (they had been removed from the curriculum). For the most part the interest was for us audio nerds... tubes had that nice, desirable sweet sound... (but I digress)

    /F

  • by eclectro (227083) on Wednesday May 19, 2010 @04:03AM (#32262058)

    Political incorrectness is not what is stopping RTGs from being launched, but rather the lack of supply of plutonium-238 [discovery.com]. It's difficult to protest launches carrying radioactive elements because they have all been successful. And if one were to crash, the RTGs are sealed, so there would not be any leakage. Unfortunately, environmentalists want to protest anything radioactive, even though such criticisms may no longer be valid.

  • Re:Really? (Score:3, Informative)

    by God of Lemmings (455435) on Wednesday May 19, 2010 @04:15AM (#32262148)
    I would imagine that it was relatively easy. Not only does Voyager have a small amount of memory (about 541 kb), but about 10% of the command system's memory is dedicated to fault protection. Read here: Jet Propulsion Laboratory [nasa.gov]
  • by dltaylor (7510) on Wednesday May 19, 2010 @04:27AM (#32262204)

    So who misused the emacs macro?

    For those of you who don't get the (obligatory) xkcd reference:

    http://xkcd.com/378/ [xkcd.com]

  • Re:Really? (Score:3, Informative)

    by Tapewolf (1639955) on Wednesday May 19, 2010 @04:46AM (#32262296)

    In any case, I don't know what memory technology Voyager uses. The (slightly) more modern Space Shuttles used magnetic core memory for essential systems, which is not affected by cosmic rays. If it isn't magnetic core, then it is likely to be static RAM; that too is not easily flipped by a cosmic ray.

    I got curious and looked it up: http://voyager.jpl.nasa.gov/faq.html [nasa.gov]

    ...apparently it uses plated wire memory [wikipedia.org], which I had not heard of before, but it seems to be a relative of core store.

  • New-fangled memory (Score:5, Informative)

    by dfsmith (960400) on Wednesday May 19, 2010 @04:48AM (#32262302) Homepage Journal
    One of the upgrades the Voyagers had over the Viking computers was CMOS memory (instead of plated wires). Read all about it at http://history.nasa.gov/computers/contents.html [nasa.gov]. Apparently, there was some debate at the time over whether these new-fangled memories would be reliable.
  • Re:What, no ECC? (Score:3, Informative)

    by ledow (319597) on Wednesday May 19, 2010 @05:01AM (#32262346) Homepage

    The spacecraft is in an incredibly hostile environment. Who's to say that there *wasn't* ECC, and it's just that its Hamming code wasn't enough to compensate for the error? It would make sense: as the hardware ages and the device leaves the solar system, the errors creep closer and closer to the limits of the error correction, until one day - bam - even with error correction, one slips through the net and ends up as a bad bit in memory.

    Technically, this is possible (but incredibly rare) with even the best error correction in the world. Error correction is a statistical function: it only says that the *chances* of an error getting through uncorrected are something like 1 in 2^8, or 1 in 2^16, or whatever.
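
    To make that concrete - this is just a toy Hamming(7,4) sketch in Python, not whatever code Voyager actually uses - a single flipped bit is always corrected, but flip two bits and the decoder confidently "corrects" to the wrong word:

    def encode(d):                      # d: four data bits
        p1 = d[0] ^ d[1] ^ d[3]
        p2 = d[0] ^ d[2] ^ d[3]
        p3 = d[1] ^ d[2] ^ d[3]
        return [p1, p2, d[0], p3, d[1], d[2], d[3]]  # codeword positions 1..7

    def decode(c):                      # c: seven received bits
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # parity over positions 1,3,5,7
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # parity over positions 2,3,6,7
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # parity over positions 4,5,6,7
        syndrome = s1 + 2 * s2 + 4 * s3 # 1-based position of a single-bit error
        if syndrome:
            c = c[:]
            c[syndrome - 1] ^= 1        # flip the suspect bit back
        return [c[2], c[4], c[5], c[6]] # recovered data bits

    data = [1, 0, 1, 1]
    cw = encode(data)

    one_err = cw[:]; one_err[5] ^= 1
    print(decode(one_err) == data)      # True: single flip corrected

    two_err = cw[:]; two_err[1] ^= 1; two_err[5] ^= 1
    print(decode(two_err) == data)      # False: double flip silently mis-corrected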

    And, from my coding theory class, Voyager's signal was originally something ludicrous like a (24,12,8) code even when it was nearby. (This presentation, especially the final slide, appears to confirm that: http://www-math.cudenver.edu/~wcherowi/courses/m6409/mariner9talk.pdf [cudenver.edu]).

    ECC is a probability function - the probability of a bit error going undetected is significantly reduced compared to, say, just sending the data and hoping for the best. But reduced does not mean eliminated. Not all errors can be detected, and only a portion of those can be corrected. That still leaves room for an error that goes uncorrected and undetected and ends up in RAM without anyone noticing until they do a full bit-by-bit check - the same as your 25-years-newer hard drive, Ethernet connection, computer bus, etc. There's no such thing as guaranteed data delivery - we just make the chances of an error slipping through so infinitesimally small that they don't affect normal, everyday operation. For instance, a corrupt download would still match its SHA-1 checksum only about once in every 2^160 tries. Tiny, but not zero, considering how many downloads occur each day.
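
    For the download case, the check itself is trivial (the file name and digest below are made up, just to show the shape of it):

    import hashlib

    def sha1_matches(path, expected_hex):
        h = hashlib.sha1()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):   # hash in 64 KB chunks
                h.update(chunk)
        return h.hexdigest() == expected_hex

    # e.g. sha1_matches("download.iso", "<published 40-hex-digit digest>")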

    Voyager didn't have the luxury of megabytes of RAM to hold extra checksum data, megahertz of CPU to check everything that came in at line speed, or a broadcast technology that could keep a gigabit data rate going all the time. They made compromises and, later, changed the ECC algorithms as more and more errors could theoretically creep in. We just had a run of bad luck that left a single bit out of place, that's all. And that's assuming it's not a hardware failure anyway. I think Voyager did pretty damn well, running for decades past its planned operational lifetime. A one-bit error by random chance is pretty minor - let's just hope it wasn't inside anything too critical, like the communications routines.

  • Re:Just incredible! (Score:3, Informative)

    by vlm (69642) on Wednesday May 19, 2010 @07:25AM (#32263034)

    I wonder what a brand new ancient rad-hard cpu costs.

    They're all kind of "ancient," by some definition. The BAE RAD6000 is at least 14 years old, and they go for about a quarter of a million dollars. The most recent launch was this February.

    http://en.wikipedia.org/wiki/IBM_RAD6000 [wikipedia.org]

    Some might consider the RAD750 to be "ancient," being about 9 years old. They retail for about $200K. The TSSM is going to launch in a decade with one, at which point that CPU will be 19 years old.

    http://en.wikipedia.org/wiki/RAD750 [wikipedia.org]

    The cost and licensing of the fault-tolerant GPL LEON series are very confusing; the price is somewhere between GPL/free and "if you have to ask, you can't afford it."

    http://en.wikipedia.org/wiki/LEON [wikipedia.org]

    To some extent you can just go to the Wikipedia category of radiation-hardened microprocessors and pick and choose.

    http://en.wikipedia.org/wiki/Category:Radiation-hardened_microprocessors [wikipedia.org]

  • Re:Just incredible! (Score:5, Informative)

    by vlm (69642) on Wednesday May 19, 2010 @07:35AM (#32263096)

    1977 - when advanced microchips were not as powerful as the chip driving the shatty calculator you buy today at the dollar store.

    Classic, ever repeated confusion of what "power" is. Unless you mean volts times amps, power is what you can do with it. An old mainframe can run a department of a small multinational corporation, maybe a large university, or perhaps a division of state government. We know this, because they did in fact do so, very profitably. You claim a dollar store calculator is more powerful. That means a dollar store calculator should be able to run, say, an entire multinational corporation, maybe multiple universities, or an entire state government. Oh wait, a dollar store calculator can, at best, slowly calculate someone's income tax, possibly correctly. I guess the old mainframe is more powerful after all.

    When I worked at a mainframe shop in the late '90s I heard a lot of similar tiresome comments... "Ha ha, mainframes, bet you didn't know my laptop can run NOPs faster than your mainframe can run floating point FFTs, ha ha ha, mainframes." At which point you simply tell them to put up or shut up, hand them a bus and tag cable, and have their infinitely "more powerful" laptop process 5% of the NYSE volume like our mainframes did, while supporting about 100K trader desks, a couple of TB of tape robot storage, etc.

  • by PeterBrett (780946) on Wednesday May 19, 2010 @08:01AM (#32263246) Homepage

    Something is so very, very wrong with your reasoning. If NASA couldn't fix the problem we wouldn't just have a bit of space junk spewing out garbage transmissions, we'd have a bit of NUCLEAR space junk spewing out garbage transmissions.

    Oh no! What a terrible thing! There's nothing like that in space at the moment, how could we possibly manage?

    The Van Allen belts contain high enough concentrations of radiation that they make Chernobyl's fallout look like spilt milk. The sun regularly pumps out solar flares that would kill unshielded humans in seconds. Compared to that, I find it very very difficult to be at all concerned by a tiny spacecraft literally billions of kilometres away.

    That is a very bad idea for two reasons (assuming you're referring to Project Orion and not completely off your tree). 1. Nuclear bombs are very heavy and very destructive; not only do you have the cost of getting them up there, but you also have the very real possibility of them being detonated at slightly the wrong angle or slightly the wrong distance, vaporising the craft (we are talking about NUCLEAR fucking bombs, people), or any of the myriad other unpredicted problems you will encounter in deep space. 2. Once out in space, you do not need continual propulsion; deploying an explosive drive means sending up two propulsion systems rather than just putting more fuel into the first.

    Oh dear, where do I start? Firstly, no, nuclear explosives (they're only bombs if you're dropping them on someone) are not necessarily "very heavy". They can easily be built small and light enough to fit into an artillery shell; if a serious Orion development programme were resumed, you'd be looking at 5-10 kg per charge, possibly less. In the Orion design, the pusher plate and damping structure are by far the most massive components. Secondly, nuclear explosions behave very differently in a vacuum than in air; most of the destructive power of a nuclear detonation on Earth comes from the way the massive energy release affects the atmosphere. Thirdly, it's bloody hard to get a nuclear explosive to detonate: it only detonates successfully if a very long and complex chain of events occurs in precisely the right way. I think you massively overestimate the risk. Honestly, mining with conventional explosives is far riskier than propulsion using nuclear explosives will ever be. Finally, one of the biggest advantages of the proposed Orion propulsion system is that its mass efficiency is very high, meaning it's possible to continue thrusting for a long period of time, so the whole point is that you want to use it "out in space."

    I recommend reading 'Project Orion' by George Dyson [amazon.co.uk] if you want to know more about the practicalities of the Orion propulsion system.

    Two massive hurdles prevent the use of nuclear reactors in space: weight, and the ability to operate them safely by remote. First, nuclear reactors are very, very heavy with all that radiation shielding.

    Which you don't need in space; you design the reactor so the majority of the radiation produced is directed away from the spacecraft. Look up NASA's SP-100 design.

    Secondly, we cannot guarantee that remote systems will operate; it's hard enough to keep a well-maintained reactor on the ground operating without constant human intervention (which is why they have constant human intervention), let alone one that will be completely unmaintained and far, far from any human help.

    No, modern reactors run on almost completely automated systems, even down to choosing which rods should optimally be replaced next. Human intervention is only required when modifying output to match grid loads (and even then, that's largely automated too). Even if something goes wrong, modern reactor safety systems have so much redundancy and so many fail-safe assumptions ...

  • Re:Just incredible! (Score:4, Informative)

    by Anonymous Coward on Wednesday May 19, 2010 @09:44AM (#32264270)

    Exactly. The IBM 360 had a truly incredible I/O capacity, powered by multiple parallel processing elements called "channels." You programmed them with "channel command words," or CCWs. They were independent of the main CPU. When a channel needed memory, it got locked down (pfixed) and allocated to the channel, so the channel could piss into memory at high speed. Really large, thick cables connected the CPU with peripheral devices. These cables had lots of wires in them, because lots of bits were flowing IN PARALLEL. Look up the transfer rate of a 2701 drum drive, still maintained and used for paging devices as late as the 1980s by companies that could not find anything faster.

    When DEC tried to claim that they could replace 360's with VAX's, guess what happened? They didn't have massively parallel I/O processors. They didn't have a massive transfer capability. They generated an interrupt on every character typed by every user, for God's sake. They were not I/O engines. They failed, utterly. Not that VAX wasn't a good machine, but no way could it replace a 360.

    How did a small 360 support hundreds of users? Why, through an innovation called "CICS." What happened was, the mainframe would fill a 3270 CRT terminal screen with a "form." You would fill in the form, locally, using the "smart" 3270's field-editing and checking capability, with no interaction with the mainframe. When you were finished filling in your form, you'd hit TRANSMIT. At which point, the variable data on your form would be glued together by the 3270 in one record and sent up for processing by the mainframe (along with everyone else's form data). A few seconds later, you'd get another form in response. Lather, rinse, repeat.

    Oh wait. That's exactly how most business Web applications work. Except the screens are prettier.
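
    For anyone who wants the analogy spelled out, here's a toy sketch of the same fill-the-form-locally, transmit-one-record, get-the-next-screen pattern as a modern web handler (Python, with made-up field names - nothing to do with actual CICS or 3270 data streams):

    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import parse_qs

    FORM = (b'<form method="POST">'
            b'Account: <input name="account"> '
            b'Amount: <input name="amount"> '
            b'<button>TRANSMIT</button></form>')

    class ScreenHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Send the blank "screen"; all field editing happens locally in the browser.
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(FORM)

        def do_POST(self):
            # One record arrives with all the filled-in fields glued together.
            length = int(self.headers["Content-Length"])
            record = parse_qs(self.rfile.read(length).decode())
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            # Process it, then hand back the next "screen". Lather, rinse, repeat.
            self.wfile.write(("Processed: %r<br>" % record).encode() + FORM)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), ScreenHandler).serve_forever()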

