ESA: European Mars Lander Crash Caused By 1-Second Glitch (space.com) 110
An anonymous reader quotes a report from Space.com: The European Space Agency (ESA) on Nov. 23 said its Schiaparelli lander's crash landing on Mars on Oct. 19 followed an unexplained saturation of its inertial measurement unit (IMU), which delivered bad data to the lander's computer and forced a premature release of its parachute. Polluted by the IMU data, the lander's computer apparently thought it had either already landed or was just about to land. The parachute system was released, the braking thrusters were fired only briefly and the on-ground systems were activated. Instead of being on the ground, Schiaparelli was still 2.3 miles (3.7 kilometers) above the Mars surface. It crashed, but not before delivering what ESA officials say is a wealth of data on entry into the Mars atmosphere, the functioning and release of the heat shield and the deployment of the parachute -- all of which went according to plan. In its Nov. 23 statement, ESA said the saturation reading from Schiaparelli's inertial measurement unit lasted only a second but was enough to play havoc with the navigation system. ESA said the sequence of events "has been clearly reproduced in computer simulations of the control system's response to the erroneous information." ESA's director of human spaceflight and robotic exploration, David Parker, said in a statement that ExoMars teams are still sifting through the voluminous data harvest from the Schiaparelli mission, and that an external, independent board of inquiry, now being created, would release a final report in early 2017.
This never happened to me before... (Score:5, Funny)
Man, if I had a nickel for every time some kind of sensory saturation forced a premature release...
Re:This never happened to me before... (Score:5, Funny)
Then you'd still be broke. This is Slashdot; you're not fooling anyone.
Re: (Score:2)
Man, if I had a nickel for every time some kind of sensory saturation forced a premature release...
Then you'd still be broke. This is Slashdot; you're not fooling anyone.
Visit the lair of any Slashdot poster, buried deep in the basement of his parent's house, and you will find that the height and majesty of the tissue mountain on the nightstand next to his bed thoroughly discredits your hypothesis. If you need further confirmation, shine a black light at his laptop and prepare yourself to be blinded by the glow.
Re: (Score:2)
Visit the lair of any Slashdot poster, buried deep in the basement of his parent's house, and you will find that the height and majesty of the tissue mountain on the nightstand next to his bed thoroughly discredits your hypothesis. If you need further confirmation, shine a black light at his laptop and prepare yourself to be blinded by the glow.
Just because your house is like that doesn't mean everyone's is.
Re: (Score:1)
I believe he's referring to the manual override.
Re: (Score:2)
What...you never put lipstick on your hand for those special evenings?
Fairness forces me to note that you, too, are a Slashdot denizen.
Re:This never happened to me before... (Score:4, Insightful)
He never said there was another person involved. He's just complaining about never managing to make it to the end of the pornhub clip.
Re: (Score:3)
And then, what? Should it take an "instantaneous" decision about what to do next?
Sanity checking code (Score:3)
So say they're doing some kind of weighted average of an altitude computation from the inertial navigation unit and an altitude computation from the doppler radar altimeter.
They should have some code in there saying: If these two values that we're averaging are wildly off from each other, let's not take the average. Instead, let's go into some exception handling code which uses some kind of heuristic (and a little time perhaps) to determine which of the two instruments should become the solely trusted sourc
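A rough Python sketch of the disagreement check described above. The function name, the equal weighting, and the 500 m threshold are all made up for illustration; nothing here is taken from the actual lander software, and choosing which instrument to trust afterwards is left to the caller.

```python
def fuse_altitude(imu_alt_m, radar_alt_m, imu_weight=0.5, max_disagreement_m=500.0):
    """Return (altitude_m, status).

    Averages the two estimates only when they roughly agree.
    The weight and the threshold are hypothetical values.
    """
    if abs(imu_alt_m - radar_alt_m) > max_disagreement_m:
        # Sources are wildly off from each other: refuse to average and
        # let the caller run its exception handling instead.
        return None, "disagreement"
    fused = imu_weight * imu_alt_m + (1.0 - imu_weight) * radar_alt_m
    return fused, "fused"


if __name__ == "__main__":
    print(fuse_altitude(3700.0, 3650.0))  # sources agree, so they are averaged
    print(fuse_altitude(-10.0, 3700.0))   # sources disagree, so no average is taken
```

The point of returning a status instead of a best guess is that the caller is forced to handle the disagreement explicitly rather than silently acting on a bad average.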
Re: (Score:1)
It's basic engineering. You need to consider and have plans for as many worst-case scenarios as possible, while at the same time staying robust against spurious fault events. So I agree with you totally.
Re: (Score:2)
Re: (Score:2)
...detected the thump of the parachute strings going-taught, determined that meant it was on the ground, and cut the parachute...
Just think about that again for a moment.
It's "taut" by the way, but apart from that I see a bright new career in space exploration in your future.
Cheater? (Score:2, Funny)
They're blaming lag?
Kalman filter (Score:5, Insightful)
https://en.wikipedia.org/wiki/... [wikipedia.org]
How in hell did they test their Kalman filter to allow such bad data to reach the decision logic? (I assume they used one.)
Filter or not (Score:5, Informative)
When the altitude stops changing for a whole second the filter is going to have to be a long one! And that ain't desirable for responsive control.
The real question is how could the sensory processor have overloaded in the first place? My money is on simple code bloat. I.e., they used a bunch of generic libraries that use further libraries that use further libraries that use further libraries that use further libraries that use further libraries ...
Re: (Score:1)
Bloat itself typically just makes things consistently slow.
To get stalling you either need buffers upon buffers put there "to make things faster" or you need a runaway process that hangs.
To not realize you have them, you need abstractions that hide them away.
Abstraction is bloat (Score:2)
Hidden malloc() calls are a good example of the bloat problem I'm referring to.
Re: Abstraction is bloat (Score:2)
It was only a functional example (Score:2)
This is not a bugginess issue.
Point is, the layers create bloat. Any hidden dynamic memory allocations that occur, by whatever system call, are just one more part of the bloat.
Re: It was only a functional example (Score:2)
Re: (Score:1)
I seriously doubt they cobbled the software together with a bunch of generic libraries; but if they did they got what they deserved.
Re:Filter or not (Score:5, Insightful)
A more elegant solution is to use both the sensitive accelerometer and an accelerometer with a greater max threshold. That way you keep the higher max limit without giving up low-gain sensitivity. But spacecraft tend to be both weight- and budget-constrained...
More troubling to me was that there wasn't some basic sanity checking going on. Like a calculation that says "3 seconds ago I was at 4 km high. Now I think I'm on the ground. Does it make sense that I could've traveled that far in that little time? No? Then the instruments saying I'm on the ground are probably wonky, and I should give other instruments a higher priority in calculating my altitude for a bit." Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.
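The "does it make sense that I could've traveled that far in that little time?" check might look something like this in Python. It is only a sketch: the function name and the 500 m/s limit are assumptions picked for the example, not real mission parameters.

```python
def altitude_change_is_plausible(prev_alt_m, prev_t_s, new_alt_m, new_t_s,
                                 max_vertical_speed_mps=500.0):
    """Return True if the implied vertical speed is physically believable.

    max_vertical_speed_mps is a hypothetical bound for illustration.
    """
    dt_s = new_t_s - prev_t_s
    if dt_s <= 0:
        # Timestamps out of order or repeated: cannot judge, so reject.
        return False
    implied_speed_mps = abs(new_alt_m - prev_alt_m) / dt_s
    return implied_speed_mps <= max_vertical_speed_mps


if __name__ == "__main__":
    # "3 seconds ago I was at 4 km high. Now I think I'm on the ground."
    print(altitude_change_is_plausible(4000.0, 0.0, 0.0, 3.0))     # False
    print(altitude_change_is_plausible(4000.0, 0.0, 3800.0, 3.0))  # True
```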
Re: (Score:3)
More troubling to me was that there wasn't some basic sanity checking going on. (...) Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.
Well it's not exactly like the lander can abort, it's do or die. So you got inconsistent or unlikely data, but what's good and what's bad? Is it a glitch, is it defective, did a misfire flip us around or put us in a spin or block the sensor? Can we salvage it or is the mission fucked no matter what? That's really the million dollar question, is there a contingency plan that could work and if so what should trigger it.
I'm guessing that with combinatorics you'll have potentially very many possible failure mod
Re: (Score:2)
Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.
You sir, are not a typical coder. I would go so far as to say your error checking and thought processes are superior to (random very high number) 90% of the programmers I have seen. Mind you, I still think that is a bare minimum for calling yourself a programmer but each instance of proper coherency checking requires notice so that others can learn from it.
Re: (Score:2)
This is not how Kalman filter works. Even if it gets totally wrong data for one second it outputs "correct" values based on previous data.
So, in our case the altitude output would have changed for this one second and the output values would have been quite close to the real altitude.
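For what "output based on previous data" means in practice, here is a sketch of just the prediction ("coasting") step, in Python. It assumes the bad measurements are skipped entirely and a simple constant-acceleration model; a filter that actually ingests the bad samples would still be pulled toward them to some degree. The function name and parameters are illustrative only.

```python
def coast(altitude_m, velocity_mps, dt_s, accel_mps2=0.0):
    """Propagate altitude and vertical velocity forward by dt_s with no measurement.

    This is plain dead reckoning from the last trusted state, which is what
    a filter's prediction step does when no update is applied.
    """
    new_altitude_m = altitude_m + velocity_mps * dt_s + 0.5 * accel_mps2 * dt_s ** 2
    new_velocity_mps = velocity_mps + accel_mps2 * dt_s
    return new_altitude_m, new_velocity_mps


if __name__ == "__main__":
    # Descending at 60 m/s from 3700 m, coasting through one second of bad data.
    print(coast(3700.0, -60.0, 1.0))
```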
Re: (Score:2)
Re: (Score:2)
You want sanity checking (based on physical possibility/impossibility) on individual input data streams to the Kalman filter prior to allowing them to get into the filter's weighted averaging. If a given single measurement stream (the position measurement by integrated acceleration) is indicating impossible changes in position over various near-past time ranges, exclude the whole measurement-type from the averaging immediately.
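A related, standard way to keep junk out of the averaging is innovation gating: compare each measurement with the filter's own prediction and drop it if it is too many standard deviations away. Note this is a statistical check rather than the physical-possibility check described above, and the sketch below is a one-dimensional toy in Python with hypothetical names and noise values, not anything from the lander.

```python
class GatedScalarKalman:
    """Scalar Kalman filter that rejects measurements failing a sigma gate."""

    def __init__(self, x0, p0, process_var, meas_var, gate_sigma=3.0):
        self.x = x0            # state estimate (e.g. altitude in metres)
        self.p = p0            # estimate variance
        self.q = process_var   # process noise variance added per step
        self.r = meas_var      # measurement noise variance
        self.gate = gate_sigma

    def step(self, z, predicted_change=0.0):
        """Predict, then update with z only if it passes the gate.

        Returns True if the measurement was used, False if it was rejected.
        """
        # Prediction: apply the modelled change and grow the uncertainty.
        self.x += predicted_change
        self.p += self.q

        innovation = z - self.x
        s = self.p + self.r  # innovation variance
        if innovation * innovation > (self.gate ** 2) * s:
            # Measurement is implausible given the prediction: coast instead.
            return False

        k = self.p / s  # Kalman gain
        self.x += k * innovation
        self.p *= (1.0 - k)
        return True


if __name__ == "__main__":
    kf = GatedScalarKalman(x0=3700.0, p0=100.0, process_var=1.0, meas_var=25.0)
    print(kf.step(-50.0, predicted_change=-60.0), kf.x)   # rejected, estimate coasts
    print(kf.step(3575.0, predicted_change=-60.0), kf.x)  # accepted, estimate updated
```

A gate like this only drops individual samples; excluding a whole measurement type after repeated rejections would need an extra counter on top.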
Re: (Score:2)
To be fair, either it doesn't occur to programmers, or it doesn't occur to managers to allow sufficient development and test time to consider stuff like that.
I got my part of the lander done on time on budget. Gold star for me.
Re: (Score:1)
When the altitude stops changing for a whole second the filter is going to have to be a long one! And that ain't desirable for responsive control.
The real question is how could the sensory processor have overloaded in the first place? ...
... When I heard about the crash landing I literally said to a friend of mine, "I bet the subroutine that cuts the parachute loose so it doesn't land on top of the payload detected the thump of the parachute strings going-taught, determined that meant it was on the ground, and cut the parachute." ...
Mechanical devices can have really long "bounce" times, when it includes a parachute and riser lines it can easily be over a second.
Not only was their mechanical testing lacking, their simulation software should have also picked this up. And they had a similar failure when the landing gear opened, in a previous lander.
It sounds like they had a lot of scientists, but no engineers!
Re:Kalman filter (Score:4, Insightful)
I find it more weird that *one* sensor misbehaving led to the entire mission failing.
I have more robustness in my thrust measuring rig made of wood beams and zipties :|
Re: (Score:2)
This. So much this. I don't know about the EU, but when NASA builds spacecraft, it tends to put in multiple redundancies where it can and add a little logic to determine if and when a sensor fails given other data. If you're going to send up a multi-million dollar craft for a project that will last months, have a backup plan for each and every thing that could possibly go wrong so long as it doesn't significantly add to the expense.
We know that rocket scientists can fire an object into orbit and hit a
Re: (Score:3)
"Check for integer overflow" is a checkbox in Simulink.
How was this not caught on the Hardware in the Loop test benches?
Jesus people, is this amateur hour.
Re: (Score:1)
Re: (Score:3)
"How in hell did they test their Kalman filter to allow such bad data to reach the decision logic? (I assume they used one.)"
1) A Kalman Filter probably is not really appropriate here because the parachute has just been deployed and you wouldn't have state statistics available to filter the input data. Doesn't mean they didn't use one with ad hoc statistics. That's not as uncommon as perhaps it should be.
2) Presumably the IMU is expected to tell you the probe has run into the planet (i.e. landed) and it's
Re: (Score:2)
Wrong landing sequence. This spacecraft was intended to parachute down to some hundreds of metres, then fire up retro-rockets and jettison the parachutes, then descend to a few metres on the retro-rockets, then drop to the ground. So, the signal from the IMU would vary between free-fall and various substantial
They didn't learn the lesson (Score:5, Informative)
Re:They didn't learn the lesson (Score:5, Insightful)
To be fair, few people did. Multiple cases of overflows and bad data problems have occurred and still continue to occur not just in space programs around the world but in other industries too.
Teach it Phenomenology! (Score:3)
"Obligatory" Dark Star reference.
Ariane 5 (Score:2, Interesting)
Brings to mind the failure of the first Ariane 5 [wikipedia.org] launcher because control software spat an Ada stack trace over a line which was supposed to only contain kinematic data.
Re: (Score:3, Informative)
> ...control software spat an Ada stack trace over a line...
Eh, no. The failure of the INS's control software caused the INS to send diagnostic data (rather than sensor data) to the control systems, which then did what they _thought_ they were being commanded to do.
None of the code in the system was modified in flight.
What the? (Score:5, Informative)
So they didn't correlate the IMU data with ranging radar or even barometric altitude information so as to avoid this?
I know weight and volume are at a premium on such craft, but a barometric sensor (even one capable of operating in Mars's rarefied atmosphere) is the size of a thumbnail and weighs just a fraction of a gram.
Sigh!
Re: (Score:2)
You can even correlate it with your own kinematic model. The scenario which the vehicle followed is impossible. It can't land one second after dropping the parachute, and so timing alone should have made it reject the invalid data.
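The timing argument can be written as a guard on state transitions. A minimal Python sketch, assuming a made-up phase name and a made-up minimum duration (neither reflects the real descent timeline):

```python
# Hypothetical minimum times (seconds) that must elapse in a phase before the
# next transition is believed. The values are placeholders for illustration.
MIN_PHASE_DURATION_S = {
    "post_chute_release": 20.0,
}


def transition_allowed(phase, time_in_phase_s):
    """Return True if enough time has passed in `phase` to accept leaving it.

    Phases without an entry in the table are not time-gated.
    """
    return time_in_phase_s >= MIN_PHASE_DURATION_S.get(phase, 0.0)


if __name__ == "__main__":
    # A "touchdown" claim one second after dropping the parachute is rejected.
    print(transition_allowed("post_chute_release", 1.0))   # False
    print(transition_allowed("post_chute_release", 25.0))  # True
```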
Re:What the? (Score:4, Insightful)
even barometric altitude information
I'm interested to know how you calibrate your barometric altitude information, and even more so what vacuum followed by a sudden atmospheric entry will do to such a sensor.
If I'm going to take a guess I'd say no, an instrument capable of operating over that range of pressures, temperatures, vibration, etc. is not the size of a thumbnail weighing a gram.
Radar (Score:2)
Yes, you have to wonder why on a mission of this expense and complexity the height above the ground is essentially done by mathematical dead reckoning. Would adding a ranging radar really have added so much to the weight and/or required package size that it was infeasible to include it? Obviously they must have considered it and I'd be interested to know why in the end it was not seen as a viable part of the solution.
Re: (Score:2)
The article says that the radar was working. But the data from the radar seems to have been ignored at this point.
Re: (Score:2)
Yes, you have to wonder why on a mission of this expense and complexity the height above the ground is essentially done by mathematical dead reckoning.
Because it works really well. The other replier, MichaelSmith indicated it had radar as well.
Re: (Score:2)
What? No landing sensor? (Score:1)
Am I missing something, or is this a stupid design?
Re: (Score:2)
Re: (Score:3)
Even one that works at the velocity encountered during atmospheric entries?
Sounds like you're suggesting putting a Pitot tube on a space probe ...
Re: (Score:2)
Will a barometric sensor work properly while descending through gases emitted from thrusters that are trying to slow the vehicle?
Re: (Score:2)
How do you know the barometric pressure profile before you enter the atmosphere? Mars has a trickily variable atmosphere.
There was a large dust storm developing at the time, which is a (potentially) global event. How much does that affect barometric pressure? (On Mars, not necessarily on Earth.)
Oops (Score:5, Funny)
Should've used metric seconds.
IMU (Score:2)
What kind of IMUs are normally used in these craft? The same kind used in aircraft and weapons?
Re: (Score:2)
Meanwhile.. (Score:2)
... $1000 quadcopters back here on Earth ship with multiple IMUs for redundancy, since the bloody things are about as trustworthy as your average politician.
Having made that glib remark, I'm sure it either did have redundancy, or if it didn't that was for a good reason (e.g. risk of failure deemed too low to warrant the weight penalty in adding redundancy). I would also like to think that they're using somewhat more reliable IMUs than those found in quads.
Single point of failure? (Score:1)
State machine/discrete event control? (Score:2)
Re: (Score:2)
Re: (Score:2)
Calculations were prescient (Score:3)
"[T]he erroneous information generated an estimated altitude that was negative," ESA said.
Which resulted in an actual altitude that was negative.
The Martians' gravity weapon test worked! (Score:2)
A brief burst was enough.
Open Source? (Score:3)
I wonder whether making the source code of these probes available to the public for vetting would help spot bugs like these? I am also curious whether releasing the code would be problematic for any reason?
Re: (Score:2)
I wonder whether making the source code of these probes available to the public for vetting would help spot bugs like these? I am also curious whether releasing the code would be problematic for any reason?
Dunno. But I suppose it might be that the code is written by a contractor and they hope to make money out of the code in other contexts.
Re: (Score:1)
Not gonna be open source (Score:1)
Software to land a probe on Mars is quite similar, if not identical, to software to put a (nuclear) warhead on a target. That's an important strategic capability for "first world" nations - otherwise you're in the category of Saddam firing Scuds, which are basically V2s with newer parts, and quite literally cannot hit the broad side of a barn (albeit from 100 km away).
So, the hard parts of solving the problem (after you've done the basic college physics part) are likely to not be open source. Things like
Re: (Score:2)
Probably not, as the public wouldn't spend the months needed to study the hardware and interface specifications needed to understand what's going on in the software. Seriously, this is a tightly integrated system not a standalone program - without understanding the system, you can't tell a bug from working as intended.
All you flight software noobies.. (Score:1)
Lots of just plain old ignorant comments here. I say this in a nonpejorative sense - if you've not worked on flight software, there's no way you could know.
1) Space is unforgiving, hardware designs change very, very slowly. Project schedules move fast and have limited budgets. Just because you can buy a MEMS based IMU for your quadcopter does not mean that you can get one for a spacecraft that will work reliably from -40 to +80C, withstand the vibe tests, the pyroshock, etc. Oh, yeah, and it (and the surr
Re: (Score:2)
It's really really hard, granted.
In my experience in the systems engineering industry, there was rarely any re-use of design or code from one project to the next similar project. Silo-ism and misaligned incentives.
Imagine if the reliability of this kind of EDL system and its software could be improved by evolution where different space agencies and subcontractors shared and re-used their ideas for improving solutions to the complex problem.
Imagine all the landers... living for today.
Re: (Score:1)
... ...
4) Fault handling is tricky - you can easily go down a rat maze of low probability events generating code (and hardware) to handle obscure corner cases, thereby increasing your test costs and time, and potentially introducing other faults. For a lot of plausible error scenarios, it's likely you're going to fail for other reasons, so there's no point in trying to do things like estimate state from other sensors.
That's true, but it can also encourage a habit of laziness in the designs. And an acceleration spike when the parachute opens or the landing gear locks is not something that has low probability. It sounds like a lot of "not my job".
no backup (Score:2)
Re: (Score:2)
Could be wrong, but I believe there are follow-up missions and the Schiaparelli probe was intended as a great-if-everything-works-but-if-it-doesn't-we'll-probably-learn-a-lot proof of concept mission.
Re: (Score:2)
I thought all the space pizza was on Io? http://www.space.com/18272-jup... [space.com]
Re: (Score:2)
It's a result of the developed world's attitude these days, the attitude of being highly skeptical at best for any big collective action, especially ones with government involvement.
Stuff like this gives them just cause to be skeptical. Funny how the people who are right are the ones getting blamed here.