Pluto Probe Back To Normal, Cause of Snafu Found
Tablizer writes: NASA has provided an update to the problem with the New Horizons probe that will fly by Pluto next week. "The investigation into the anomaly that caused New Horizons to enter 'safe mode' on July 4 has concluded that no hardware or software fault occurred on the spacecraft. The underlying cause of the incident was a hard-to-detect timing flaw in the spacecraft command sequence that occurred during an operation to prepare for the close flyby. No similar operations are planned for the remainder of the Pluto encounter."
No hardware or software fault? (Score:5, Insightful)
The underlying cause of the incident was a hard-to-detect timing flaw in the spacecraft command sequence that occurred during an operation to prepare for the close flyby.
So a "flaw" in the command sequence isn't a software fault? Sure sounds like one to me. Glad to hear the craft is functioning again though.
Re:No hardware or software fault? (Score:5, Informative)
I'm pretty sure that "fault" has a specific meaning in NASA parlance. There was obviously a software bug, but it probably didn't "fault".
Re: (Score:3)
Can't blame NASA though. When the commands are transmitted over 3 billion miles, the signal would degrade so much that it is possible some critical command or a command argument was not correctly received.
Re: (Score:2)
I'm not sure that's a sound argument. You should be checksumming such that you're confident that what you're doing is what you were asked to do, and working in transactions, such that if you've not received a whole command group, you're not running any of it. I'd think it was only in desperate circumstances you'd issue a command that says do this, or in fact do anything plausible if you don't fully receive this, because you're about to fly into something hard...
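To make the checksum-plus-transaction idea concrete, here is a minimal sketch in Python. It assumes a CRC-32 trailer on each packet and a group size known in advance. The function names and the command strings in the usage note are made up for illustration and say nothing about how New Horizons actually frames its uplink.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the payload (hypothetical framing)."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def unframe(packet: bytes):
    """Return the payload if the trailing CRC-32 matches, otherwise None."""
    if len(packet) < 4:
        return None
    payload = packet[:-4]
    (received_crc,) = struct.unpack(">I", packet[-4:])
    return payload if zlib.crc32(payload) == received_crc else None


def accept_group(packets, expected_count):
    """All-or-nothing: return the commands only if the whole group verifies."""
    if len(packets) != expected_count:
        return None  # part of the group never arrived
    commands = [unframe(p) for p in packets]
    if any(c is None for c in commands):
        return None  # at least one packet was corrupted in transit
    return commands
```

With this, accept_group([frame(b"POINT_CAMERA"), frame(b"START_RECORD")], 2) hands back both commands, while a flipped bit or a missing packet makes the receiver refuse the entire group rather than run half of it.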
Re: (Score:2)
Can't blame NASA though. When the commands are transmitted over 3 billion miles, the signal would degrade so much that it is possible some critical command or a command argument was not correctly received.
Nonsense - that's one of the easiest problems to solve in all of computer science: you just tack on a hashcode, checksum, parity bit, etc., and the receiver verifies that it got the right message. If it doesn't verify, the receiver doesn't follow it, and when the sender doesn't get an acknowledgment, it re
Re: (Score:1)
Yep. Not particularly strenuous CRC formulae can detect errors that may happen in a data stream running the entire age of the universe.
Re: (Score:2)
Yep. Not particularly strenuous CRC formulae can detect errors that may happen in a data stream running the entire age of the universe.
Yep. Collision free and works with Ada and decision voting. Trivial when you thunk about it. (goddamned rocket scientists act like they know stuff)
Re: (Score:2)
when the sender doesn't get an acknowledgment, it retransmits the message
When your round-trip communication time is on the order of 10 hours you might want to modify that strategy.
(not that it is hard to do so, just transmit the message multiple times with a sequence number so the client can detect the repeats)
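A rough sketch of that repeat-with-sequence-number scheme, in Python. Everything here (the send_repeated helper, the Receiver class, the choice of three copies) is an assumption for illustration, not a description of any real deep-space protocol.

```python
def send_repeated(seq: int, command: str, copies: int = 3):
    """Sender side: emit the same (sequence number, command) pair several times."""
    return [(seq, command)] * copies


class Receiver:
    """Executes each sequence number at most once, ignoring repeats."""

    def __init__(self):
        self.seen = set()
        self.executed = []

    def receive(self, seq: int, command: str) -> bool:
        """Return True if the command was executed, False if it was a repeat."""
        if seq in self.seen:
            return False
        self.seen.add(seq)
        self.executed.append(command)
        return True
```

Because the receiver keys on the sequence number, losing one or two copies in transit costs nothing, and the sender never has to wait out a multi-hour round trip for an acknowledgment before repeating itself.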
Re: (Score:3)
Come on NASA, was it a "fault", "snafu", "glitch", or "bug"? Come clean now!
Personally, I suspect it was a snag.
Re: (Score:1)
Belgian Flatcoated Retriever Club?
Sticking with the idea that "Pluto" is a dog, eh?
Re: (Score:3)
Of course not. Pluto is a dwarf dog.
Re: (Score:1)
It's hard to determine the breed. Pluto does look a bit like a Rhodesian Ridgeback without the ridges. Otherwise a rather large Hungarian Vizsla except for the eyes.
Hmmm.....
Re:No hardware or software fault? (Score:4, Insightful)
I'm guessing it was an unanticipated race condition. Everything works correctly, everything passes all tests, but for some extremely rare constellation of input values software module "B" is able to complete its calculations and report its results before "A" can-- which has a probability of occurrence so low that it rounds to zero-- and that screws the pooch. If the probability of this happening again approaches zero, it would be fair for NASA to say there was no error in the programming, but instead an unexpected glitch in operations that is unlikely to ever recur.
You can never test for every possible corner condition. More than that, in probably every real world situation, the longer the time since the last hard reboot, the more likely it is that the software will encounter some corner conditions. That Pluto bird has been running for quite a while.
Re: (Score:2)
Or maybe they just needed time to scrub all the pictures of the invading Vogon fleet. Don't want people to panic...
Re: (Score:3)
This is not a manned mission, and not even the nuttiest nutter thinks that man is going to Pluto. You are trolling the wrong article.
Re: No hardware or software fault? (Score:5, Funny)
Did someone who likes space shoot your mother or something? You always pop up on these threads.
As a small child he wanted to be an astronaut. Then he heard it meant reading and, um, stuff. Bitter now, and still a child.
Re: (Score:2)
Did someone who likes space shoot your mother or something? You always pop up on these threads.
As a small child he wanted to be an astronaut. Then he heard it meant reading and, um, stuff. Bitter now, and still a child.
Which, on reflection, seems mean. I should have pointed out that much of the blame for his thwarted ambitions lies with his kindergarten teacher, who told him everyone is good at something, and you should choose a career doing what you're best at.
Being an astronaut seemed like the only job where you didn't have to walk to a toilet, or wipe.
Re: (Score:2)
I came to grips with the idea of never being an astronaut when I was about 7. I'd read that there was a height limit of 6 feet for astronauts, which even in the early 70s might have been out-of-date information, but since my father was taller than that, I figured I would be also.
Re: (Score:2)
I came to grips with the idea of never being an astronaut when I was about 7. I'd read that there was a height limit of 6 feet for astronauts, which even in the early 70s might have been out-of-date information, but since my father was taller than that, I figured I would be also.
Ouch! I'm only 5'10" and had never considered that anyone might want to be shorter. But then I never desired to be an astronaut - that I recall.
Re: (Score:3)
Yeah, we don't like that "science" stuff around here! We'd much rather be completely ignorant of the universe around us than have you "space nutters" actually discovering things!
Signed,
The Flat Earth Society
Re:No hardware or software fault? (Score:4, Insightful)
There's a gap between "flawless" and "faulty" whose length, as it so happens, is remarkably similar to the distance that New Horizons has travelled so far.
Re: (Score:2)
The article doesn't elaborate, so I'm guessing this refers to a command sequence sent from the ground. If these are generated by software, it still could have been a software fault, but not on the spacecraft.
Re: (Score:1)
I'm guessing this refers to a command sequence sent from the ground.
init 6 ?
Re: (Score:2)
No, this was Safe Mode: Click Start, Shut Down, select Restart, then hold F8 down during reboot.
Re: (Score:1)
init 1 then...
Re: (Score:2)
> The operator fell asleep waiting for the response [...] and missed the F8
Happens to me all the time.
Re: (Score:2)
I'm guessing this refers to a command sequence sent from the ground.
init 6 ?
init 1, apparently.
Re: (Score:2)
I believe they meant that the software (or hardware) on the spacecraft behaved as expected, but the error was rather due to a handling mistake, sending the commands with the wrong timing. If you asked me, such a handling mistake should be caught by the on-board software and handled properly (which means telling the operator right away to RTFM). I would thus qualify this as a software issue, regardless of what they say.
The official statement is simply putting the "you're holding it wrong" response to a wh
Re: (Score:2)
I believe they meant that the software (or hardware) on the spacecraft behaved as expected, but the error was rather due to a handling mistake, sending the commands with the wrong timing. If you asked me, such a handling mistake should be caught by the on-board software and handled properly (which means telling the operator right away to RTFM). I would thus qualify this as a software issue, regardless of what they say.
The official statement is simply putting the "you're holding it wrong" response to a whole new level.
Well, ok, one could argue that any obscure corner case should be handled appropriately. But at some point, you have to launch the thing.
Re: (Score:2)
If you asked me, such a handling mistake should be caught by the on-board software and handled properly (which means telling the operator right away to RTFM).
Well, that's what happened. Commands were sent, probe responded with a WTF!? and halted, people double-checked things - Oh, there's the problem, probe was reset back to normal.
Unfortunately, the round-trip time to the probe is nearly 9 hours, and nobody wants to be that guy that broke it good and proper, so they double check everything before replyin
Re: (Score:2)
So a "flaw" in the command sequence isn't a software fault?
I don't see why it must be.
Imagine you wrote a shell script to first create a temp folder, then recursively delete the source data folder, followed by copying the source folder to the new temp folder.
Oops, your data is gone!
Is that a fault with the delete command doing exactly as you instructed it to?
Or is that a fault in the sequence of commands in your script?
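For anyone who wants to see it fail, here is that scenario sketched in Python rather than shell. The function names, the "data" folder, and the readings.txt file are all invented for the example. Every library call does exactly what it is documented to do; only the order differs between the two versions.

```python
import shutil
import tempfile
from pathlib import Path


def make_source(root: Path) -> Path:
    """Create a small source folder with one file (hypothetical test data)."""
    source = root / "data"
    source.mkdir()
    (source / "readings.txt").write_text("42\n")
    return source


def backup_wrong_order(source: Path, temp: Path) -> bool:
    """Create temp, delete source, then try to copy: the data is already gone."""
    temp.mkdir(parents=True, exist_ok=True)
    shutil.rmtree(source)
    try:
        shutil.copytree(source, temp / source.name)
    except FileNotFoundError:
        return False  # nothing left to copy
    return True


def backup_right_order(source: Path, temp: Path) -> bool:
    """Same three steps, but copy before deleting."""
    temp.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, temp / source.name)
    shutil.rmtree(source)
    return True
```

Neither rmtree nor copytree is buggy in the first version. The flaw lives entirely in the order the steps were issued, which is the distinction being drawn between a software fault and a bad command sequence.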
Re: (Score:2)
Imagine you wrote a shell script...
That's software.
Re: (Score:1)
That's software.
That's software doing exactly as instructed, and as expected.
The question is: should software that is working perfectly be considered a software fault?
A developer or operator fault most certainly. But there was no part of the software doing anything it wasn't told. No part that had any expectation of working differently than it did.
Here we call that operator error.
"I right clicked this file and selected delete. When it asked if I was sure I clicked Yes. Now I'm shocked, appalled, and confused why that file g
Re: (Score:2)
Congratulations, you can read! Now go practice reading the rest of the post, it describes how the "software" is not faulty yet gives an unwanted outcome due to command timing.
Re: (Score:2)
I believe the use of the word "fault" here means that there is nothing broken on the spacecraft, hardware or software. It behaved as it was supposed to, it was just fed a bad command sequence. i.e. any software fault was in the auditing software on the ground. Even then it may not be a "fault" (i.e. breakage) but just some conditions that aren't accounted for in the audit.
Sounds like an unexpected timing race condition (Score:2)
Re: No hardware or software fault? (Score:2)
"no hardware or software fault occurred *on the spacecraft*"
There may have been a hardware or software fault on the ground, that resulted in an invalid command sequence. The desired behaviour in this case may be to enter a safe mode, so that you have a known means to recover (rather than bricking).
GIGO (Score:2)
Maybe the software was working the way it should but not the way the humans intended it to? Like the killer robots/AI of sci-fi.
Kilometers? (Score:1)
No, the plans were drawn in miles!
The question is... (Score:1)
If Pluto is Mickey's Dog, then how can Goofy be Mickey's best friend?
Truly NASA is the only one who can answer this important conundrum.
Re: (Score:2)
Pshaw! I'm sure StartsWithABang has an informed opinion on the subject.
Not exactly a SNAFU (Score:4, Informative)
While NASA has had some spectacular bugs in the past, they aren't common enough to start throwing around SNAFU.
Situation Normal: All Fucked Up
Re: (Score:1)
If you're going to use highly technical terms, please follow the relevant RFCs: http://www.rfc-editor.org/rfc/... [rfc-editor.org]
DRINK when someone uses the word "anomaly"... (Score:4, Funny)
I always take a shot when someone uses the word "anomaly" in a space story. The legacy of STTNG continues.
I would have done dry run of entire sequence (Score:2)
Re:I would have done dry run of entire sequence (Score:5, Insightful)
Re: (Score:3)
The main thing is to make sure that you can recover from unexpected failures. It looks like NASA did well getting that right here.
Re:I would have done dry run of entire sequence (Score:4, Interesting)
The fun part is when you do build and launch multiple (whatevers) and they all go down with the same "rare" fault.
Have Spacesuit, Will Travel (Score:2)
I'm just sayin'. Those creepies wouldn't want to be observed before they hit Tombaugh Station.
1 sec time change? (Score:5, Funny)
Re: (Score:1)
They are not used to a time zone change to Pluto Local Time. The Plutonians* were not willing to help.
* "Plutocrats"? "Plutoids"? Reminds me of a joke about Hillary allegedly selling nuke mines to Putie. She's the "Plutonium Plutocrat".
Translation.... (Score:2)
Someone sat on the keyboard.
100% failure in 72 hours (Score:5, Funny)
It can only be attributable to human error. They checked out the AE-35 Unit and it had no problems at all.
I've still got the greatest enthusiasm and confidence in the mission.
I smell tomfoolery (Score:2)
I've been a sys admin for a very long time and this sounds very similar to many Mad Libs-style answers I've provided to uninitiated management immediately following an irreparable mistake.
Obviously somebody running the simulation... (Score:1)
And if they find the Gamilons? (Score:2)
I'll start digging.
(and showing myself out)