Voyager 1 Resumes Sending Updates To Earth (nasa.gov) 54
quonset writes: Just over two weeks ago, NASA figured out why its Voyager 1 spacecraft stopped sending useful data. They suspected corrupted memory in its flight data system (FDS) was the culprit. Today, for the first time since November, Voyager 1 is sending useful data about its health and the status of its onboard systems back to NASA. How did NASA accomplish this feat of long distance repair? They broke up the code into smaller pieces and redistributed them throughout the memory.
From NASA: "... So they devised a plan to divide the affected code into sections and store those sections in different places in the FDS. To make this plan work, they also needed to adjust those code sections to ensure, for example, that they all still function as a whole. Any references to the location of that code in other parts of the FDS memory needed to be updated as well. The team started by singling out the code responsible for packaging the spacecraft's engineering data. They sent it to its new location in the FDS memory on April 18. A radio signal takes about 22 1/2 hours to reach Voyager 1, which is over 15 billion miles (24 billion kilometers) from Earth, and another 22 1/2 hours for a signal to come back to Earth. When the mission flight team heard back from the spacecraft on April 20, they saw that the modification worked: For the first time in five months, they have been able to check the health and status of the spacecraft. During the coming weeks, the team will relocate and adjust the other affected portions of the FDS software. These include the portions that will start returning science data.
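As a rough sanity check on the signal delay quoted above, here is a back-of-the-envelope sketch. It assumes the radio signal travels at the vacuum speed of light and uses the rounded 24 billion km figure from the quote; because the quote says "over" that distance, the result is a lower bound that lands slightly under NASA's "about 22 1/2 hours".

```python
# Speed of light in km/s (exact by definition of the metre).
SPEED_OF_LIGHT_KM_S = 299_792.458


def one_way_light_time_hours(distance_km: float) -> float:
    """Hours for a radio signal to cover distance_km in a vacuum."""
    return distance_km / SPEED_OF_LIGHT_KM_S / 3600.0


# Rounded distance from the NASA quote; the real distance is somewhat larger.
distance_km = 24e9

one_way = one_way_light_time_hours(distance_km)
round_trip = 2 * one_way
print(f"{one_way:.1f} h one way, {round_trip:.1f} h round trip")
```

That matches the timeline in the quote: a patch sent on April 18 could not be confirmed before April 20.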
Pretty cool that they can control it when signals (Score:3)
Re: Pretty cool that they can control it when sign (Score:3)
Re: Pretty cool that they can control it when sign (Score:4, Funny)
640 bytes ought to be enough for anyone.
Re:Pretty cool that they can control it when signa (Score:4, Insightful)
Re: (Score:1)
Re: (Score:3)
If you read the fine article you'll get answers to all your questions. It's an interesting story.
Yes some of the onboard memory is faulty.
Re: (Score:2)
What I find more amazing (Score:5, Interesting)
They still have people at hand who can make sense of that code. Even the junior programmers of the original crew are by now retired or dead.
Re:What I find more amazing (Score:5, Interesting)
It's likely well documented and well written.
Even so, it's the job of many software engineers to dive into an existing, gnarly codebase and start fixing things. If you select the best of them who really like space stuff and are nerdy about old hardware, it will be possible to find people who can get up to speed.
There are whole communities of people now, including youngsters, doing retrocomputing for fun. And NASA will have had continuity: sure, all the original devs are gone, but they have always brought on new people.
Re:What I find more amazing (Score:5, Insightful)
As I've been one of those software engineers that got thrown into the slings and arrows of old, badly written, worse documented and worst supported code, I admire the coding (and working) standards of the NASA engineers of the 70s.
Re: (Score:3)
The slings and arrows of outrageous coding...
Re:What I find more amazing (Score:5, Interesting)
It's likely well documented and well written.
It's all in assembly, but the program is small.
There are three RCA-1802 CPUs onboard.
They are 8-bit processors. The instruction set is straightforward. It's not difficult to program.
The RCA-1802 is notable for being the first CMOS CPU and was likely picked for that reason since CMOS uses much less power than NMOS, which was the prevailing technology at the time.
There are rad-hard RCA-1802s, but I don't know if Voyager used those. Rad-hard semiconductors are fabricated with depleted boron, which has a much smaller neutron cross-section.
Re:What I find more amazing (Score:5, Informative)
There are three RCA-1802 CPUs onboard.
This is the Voyager 1 craft; neither Voyager had any integrated CPUs.
Galileo was the first probe to use an integrated CPU.
Viking and Voyager used similar but distinctly different architecture designs, but yes, the primary difference was that Viking used TTL while Voyager used CMOS.
This however is referring to the logic gate ICs.
What we on Earth would call the 7400 IC series, although TI used a different numbering for their rad-hardened line.
There are indeed three separate processing systems/computers for different functions.
Finally, just to be clear, those processors were and are "CPUs" by every definition. They just are not integrated into a single chip. The chips are essentially logic gates, counters, shifters, etc. combined on PCBs and wired together in the same manner.
Re: (Score:2)
Processor ISAs of this vintage are typically quite easy to program in assembly language.
There's not usually much reason to program embedded software in higher-level languages, C is okay as a high-level language but primarily because the compilers do NOT perform overly complex optimizations and you can more-or-less predict what assembly it will emit. Remember that a lot of machine interfacing requires extremely precise control and management of very limited resources. A lot of embedded programming I've done,
Re: (Score:2)
I'd rather do this kind of work in assembler. Especially shuffling code around a memory with "holes" in it. A general purpose compiler is of limited use for most control applications. A ladder logic compiler/interpreter might be worth it for this old space probe. But a good linker, that's worth its weight in gold when your memory map is swiss cheese.
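To illustrate what a linker has to do with a "swiss cheese" memory map, here is a minimal first-fit sketch. Everything in it is hypothetical: the memory size, the damaged addresses, and the section names and lengths are made up for illustration and are not taken from Voyager's FDS.

```python
def free_regions(size, bad):
    """Return (start, length) runs of usable addresses in a memory of `size` words."""
    regions, start = [], None
    for addr in range(size + 1):          # one extra step closes a trailing run
        usable = addr < size and addr not in bad
        if usable and start is None:
            start = addr
        elif not usable and start is not None:
            regions.append((start, addr - start))
            start = None
    return regions


def place_sections(sections, regions):
    """First-fit: map each section name to a start address, or raise if it cannot fit."""
    free = list(regions)
    placement = {}
    for name, length in sections:
        for i, (start, avail) in enumerate(free):
            if length <= avail:
                placement[name] = start
                free[i] = (start + length, avail - length)  # shrink the region
                break
        else:
            raise MemoryError(f"no free region holds {name} ({length} words)")
    return placement


# Hypothetical 32-word memory with two damaged areas.
bad_words = set(range(8, 12)) | {20}
regions = free_regions(32, bad_words)
placement = place_sections(
    [("telemetry", 6), ("science", 7), ("status", 3)], regions
)
print(regions)
print(placement)
```

First-fit is only one possible policy. A real linker would also have to patch every address that refers into a moved section, which is the part the article says the team did by hand.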
Re: (Score:3, Interesting)
It's likely well documented and well written.
Unlike today, where most programmers, when told they need to document their code, respond, "We don't do that here."
Re: (Score:3)
And their motto is "Quality: We've heard of it."
Re:What I find more amazing (Score:4, Interesting)
Re: What I find more amazing (Score:3)
Youngster? I prefer ute
And there are still people who want to do it (Score:3)
When it seems like every other dev these days is into "Appzz" or some web stack crap, it's nice to know some people are doing real hardcore coding on a system that's made a real difference.
Re: (Score:2)
Re: (Score:2)
The classics never die. They just get kinda moldy.
Re:And there are still people who want to do it (Score:5, Interesting)
I can see the appeal of apping appz.
It's nice to know that 3 years from now, your code will be considered hopelessly outdated and obsolete and you'll never have to maintain it...
Re: (Score:2)
You think being able to update circa-1977 assembly language is amazing? I sometimes have to decipher and edit other people's perl!
Re: (Score:2)
Friends don't let friends read, let alone edit, someone else's perl code.
Re: (Score:2)
They still have people at hand who can make sense of that code.
I can't remember if it was in the book, the movie, or both, but there is a great scene in Andy Weir's "The Martian" where NASA tries to find people who know how the Mars Pathfinder comms work after Mark Watney retrieves it and powers it back up to phone home 38 years after it first touched down on the Red Planet.
See Doc -- It's Quieter in the Twilight (Score:2)
Everyone on staff is of an age that they will be retired when the power budget runs out and the heater on the r
awesome (Score:4, Insightful)
Re: (Score:2)
Actually, I have similar problems maintaining our underfunded, undocumented legacy apps, some pre-PC, but things like budget tracking aren't nearly as glamorous as interstellar exploration.
Software engineering (Score:2)
These are the kind of times when that phrase actually means something rather than being a feelgood title for some script kiddies who can barely write a hello world program without help.
Re: (Score:1)
Yes, but at least my probe is web-scale and will thus get me a web-scale job! Fixing it at Uranus will be some other sucker's job.
Compensation (Score:4, Funny)
Are we going to dock its pay? I mean, I've been expecting regular TPS reports and it hasn't delivered. It needs to be put on a performance review plan ASAP.
Re: (Score:2)
In memory patching (Score:5, Interesting)
To add detail to the article:
When you're modifying the code in memory (as opposed to say a bootable disk or flash drive) there are a few things you have to watch out for. This is most true when you have no physical access to the device, as in the case of Voyager 1.
So if you've identified that some part of the memory is unusable, you need to ensure nothing will try to use it.
Phase I: Setup
Step 1: Allocate a chunk of memory for a temporary region that will do nothing more than simply exist.
Step 2: Set the instructions in that region to return execution to a known-working part of the code with some non-operational instructions in the middle (NOOPs)
Phase II: Re-vector
Step 3: Change all instructions that send execution to the bad memory to now go to the good region
Step 4: Change some of the NOOPs to log the information so you can tell you're now executing new-region stuff, not old-region stuff
At this point you have logs showing that the bad memory is not being used. You know the new region (not large enough) is being used. So now you have to get the code that existed in the damaged memory put in some places (not enough room for one place) and then jump to it:
Phase III: Recreate the code
Step 5: Allocate new regions of memory. Fill them with NOOPs and a branch (or jump AKA JMP) to the next chunk
Step 6: Add a new region of memory with a new routine to go read the stuff you set up in Step 5 to ensure that if something WERE to execute, it would sequentially go through all those new regions allocated in Step 5 and return just fine.
Step 7: Run Step 6, and if it doesn't pass, fix it.
Step 8: Replace the NOOPs in the new regions with the instructions from the damaged original areas of memory.
Effectively at this point you've replaced the original code but instead of one region of memory you're using multiple regions with branch instructions. Note that I'm using generic comments here like "branch instruction" where if we were doing machine code in the 1970s it would be a JMP or a JSR or BEQ or whatever. That's not important to the concept. The important thing is that IF your steps are successful, at each point your system is fully recoverable. IF a step fails it is still recoverable. You only go to the NEXT step upon success of the preceding step.
Phase IV: Activate the new code
Step 9: Vector the original jump instructions from Step 3 to now point to the new code from Step 5.
Step 10: Lose two days of sleep waiting for success.
You can shortcut a lot of these steps if you have physical reset access. Not an option here. You can shortcut a lot of these steps if you have A/B memory (now also used on Android devices and immutable operating systems). Not an option here either. That means that whatever you do, you should leave the device in a state where it is still usable enough to fix what you broke. That's why you need 10 steps.
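The core of Phases III and IV can be sketched on a toy machine. This is not the FDS instruction set: the tuple-based instructions ("OP", "JMP", "NOP", "RET", "BAD"), the addresses, and the fragment sizes are all assumptions made up for illustration. The point is only the ordering: build and test the fragmented copy first, and re-vector the original jump last.

```python
def run(memory, entry, max_steps=1000):
    """Execute from `entry` until RET; return the payloads of the OPs executed."""
    pc, trace = entry, []
    for _ in range(max_steps):
        instr = memory[pc]
        kind = instr[0]
        if kind == "RET":
            return trace
        if kind == "JMP":
            pc = instr[1]
            continue
        if kind == "BAD":
            raise RuntimeError(f"executed damaged word at {pc}")
        if kind == "OP":
            trace.append(instr[1])
        pc += 1                      # OP and NOP fall through to the next word
    raise RuntimeError("step limit exceeded")


def relocate(memory, body, fragments):
    """Copy `body` into fragments [(start, length)], chaining them with JMPs.

    The last word of each fragment is reserved for the JMP to the next
    fragment (or the final RET). Returns the new entry address.
    """
    fragments = [f for f in fragments if f[1] >= 1]
    remaining = list(body)
    for i, (start, length) in enumerate(fragments):
        chunk, remaining = remaining[:length - 1], remaining[length - 1:]
        for offset, instr in enumerate(chunk):
            memory[start + offset] = instr
        tail = start + len(chunk)
        if not remaining:
            memory[tail] = ("RET",)
            return fragments[0][0]
        if i + 1 >= len(fragments):
            raise MemoryError("fragments too small for the routine")
        memory[tail] = ("JMP", fragments[i + 1][0])
    raise MemoryError("no usable fragments")


# Toy memory: a vector at address 0 jumps to a routine at 4..8.
body = [("OP", "a"), ("OP", "b"), ("OP", "c"), ("OP", "d")]
memory = [("NOP",)] * 24
memory[0] = ("JMP", 4)
memory[4:8] = body
memory[8] = ("RET",)
expected = run(memory, 0)

memory[5] = ("BAD",)                                 # simulate the failed chip
new_entry = relocate(memory, body, [(10, 3), (16, 4)])
assert run(memory, new_entry) == expected            # test the copy first
memory[0] = ("JMP", new_entry)                       # only then re-vector
print(run(memory, 0))
```

Until the last assignment, the vector at address 0 still points at the old location, so a failed upload of the new fragments leaves the machine no worse off than before.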
But hey what if you had A/B.
1. Copy A to B.
2. Reallocate regions of memory so B can operate in areas where the physical memory is undamaged. Add branches (jumps) to make a bunch of little regions act as one big region. A jump is one of the least-intensive CPU operations because it loads the program counter with a specific address (to go execute code at) instead of merely incrementing. In pseudocode LOAD PC=new-address is computationally simpler than LOAD PC=PC+current-instruction-size. (Some people use "length" instead of size. Whatever. 1970s octets were all the same size and length... this was no TOPS-10/20 system with 36-bit weirdness.)
3. Boot up on B. If fail, reboot on A. Requires a bootloader equivalent (BIOS on DOS, Fastboot on Android, UEFI on newer systems, etc.) Not an option here.
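For comparison, the A/B fallback in step 3 comes down to very little logic. A minimal sketch, assuming each slot exposes a self-test that the bootloader can call; the slot names and the `boot` helper are hypothetical and do not correspond to any real bootloader API.

```python
def boot(slots, preferred="B", fallback="A"):
    """Try the preferred slot, then the fallback; return the name that booted."""
    for name in (preferred, fallback):
        self_test = slots[name]
        try:
            if self_test():
                return name
        except Exception:
            pass  # a crashing image counts as a failed boot
    raise RuntimeError("neither slot passed its self-test")


def healthy():
    return True


def crashes():
    raise RuntimeError("jumped into damaged memory")


good_update = boot({"A": healthy, "B": healthy})
bad_update = boot({"A": healthy, "B": crashes})
print(good_update, bad_update)
```

The untouched A image is what makes a bad update survivable, which is exactly the safety net the ten-step procedure has to recreate by hand.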
Well, and after all that's done, what do you do to clean up?
Step 11: Have Voyager 1 send you a new data dump of all of memory so you have a new clear copy.
Step 12: Put in some pseudo-reset options so if this happens again you have SOME of the capabilities of uploading code wit
Re: (Score:2)
And don't worry, they probably used vi, it's optimized for low baud rates...
Re: (Score:1)
Remember also that they will do extensive testing on the ground before it is sent up to the spacecraft. NASA maintains identical ground hardware that software validation is done against to assure themselves that they're actually doing what they think they are doing. Software emulators may also be used, but for hardware-level stuff there really isn't any substitute for the real thing.
Their methods are not fast, but speed is much less important than producing bug-free code with something like this.
Re: (Score:2)
Part of the reason they are acting so carefully is that they do not have a ground-based analog of the Voyager craft to test with, according to what I read a while back.
Re: (Score:2)
familiar (Score:4, Funny)
> A radio signal takes about 22 1/2 hours to reach Earth from Voyager 1
Engineers having Comcast should be used to that.
Re: familiar (Score:2)
But then you're not counting the cable guy.
Heads up Microsoft Windows 10/11 team (Score:3)
When these NASA folks are done with Voyager 1 maybe Microsoft should consider hiring them to help them figure out how to get a system to take a patch. They might be the only ones who are able to do it. Imagine the power they would have being able to touch the system instead of being 15+ billion miles away.
Re: (Score:2)
One hash per year, but hey a hash is a hash.
Really kickass (Score:2)
The idea that they could rebuild the torched code in working, available RAM and then change the addresses in code to point to it is... just top notch. Hats way, way off to the designers.
\o/ (Score:1)
Is there a vscode plugin for remote debugging of Voyager1?
Legends never die (Score:1)
That's it.
Sterilize! (Score:2)
I'm pleased to see that Voyager has recovered from its damage, and can now begin fulfilling its revised prime directive: seeking out and sterilizing imperfect biological infestations.