Mars Software Earth NASA Space Science Technology

ESA: European Mars Lander Crash Caused By 1-Second Glitch (space.com) 110

An anonymous reader quotes a report from Space.com: The European Space Agency (ESA) on Nov. 23 said its Schiaparelli lander's crash landing on Mars on Oct. 19 followed an unexplained saturation of its inertial measurement unit (IMU), which delivered bad data to the lander's computer and forced a premature release of its parachute. Polluted by the IMU data, the lander's computer apparently thought it had either already landed or was just about to land. The parachute system was released, the braking thrusters were fired only briefly and the on-ground systems were activated. Instead of being on the ground, Schiaparelli was still 2.3 miles (3.7 kilometers) above the Mars surface. It crashed, but not before delivering what ESA officials say is a wealth of data on entry into the Mars atmosphere, the functioning and release of the heat shield and the deployment of the parachute -- all of which went according to plan. In its Nov. 23 statement, ESA said the saturation reading from Schiaparelli's inertial measurement unit lasted only a second but was enough to play havoc with the navigation system. ESA said the sequence of events "has been clearly reproduced in computer simulations of the control system's response to the erroneous information." ESA's director of human spaceflight and robotic exploration, David Parker, said in a statement that ExoMars teams are still sifting through the voluminous data harvest from the Schiaparelli mission, and that an external, independent board of inquiry, now being created, would release a final report in early 2017.
  • by hyades1 ( 1149581 ) <hyades1@hotmail.com> on Thursday November 24, 2016 @03:05AM (#53352927)

    Man, if I had a nickel for every time some kind of sensory saturation forced a premature release...

    • by rsmith-mac ( 639075 ) on Thursday November 24, 2016 @03:19AM (#53352971)

      Man, if I had a nickel for every time some kind of sensory saturation forced a premature release...

      Then you'd still be broke. This is Slashdot; you're not fooling anyone.

      • Man, if I had a nickel for every time some kind of sensory saturation forced a premature release...

        Then you'd still be broke. This is Slashdot; you're not fooling anyone.

        Visit the lair of any Slashdot poster, buried deep in the basement of his parent's house, and you will find that the height and majesty of the tissue mountain on the nightstand next to his bed thoroughly discredits your hypothesis. If you need further confirmation, shine a black light at his laptop and prepare yourself to be blinded by the glow.

        • Visit the lair of any Slashdot poster, buried deep in the basement of his parent's house, and you will find that the height and majesty of the tissue mountain on the nightstand next to his bed thoroughly discredits your hypothesis. If you need further confirmation, shine a black light at his laptop and prepare yourself to be blinded by the glow.

          Just because your house is like that doesn't mean everyone's is.

      • by Anonymous Coward

        I believe he's referring to the manual override.

      • What...you never put lipstick on your hand for those special evenings?

        Fairness forces me to note that you, too, are a Slashdot denizen.

      • by silentcoder ( 1241496 ) on Thursday November 24, 2016 @07:17AM (#53353655)

        He never said there was another person involved. He's just complaining about never managing to make it to the end of the pornhub clip.

  • Cheater? (Score:2, Funny)

    by Anonymous Coward

    They're blaming lag?

  • Kalman filter (Score:5, Insightful)

    by little1973 ( 467075 ) on Thursday November 24, 2016 @03:29AM (#53353007)

    https://en.wikipedia.org/wiki/... [wikipedia.org]

    How in hell did they test their Kalman filter to allow such bad data to reach the decision logic? (I assume they used one.)

    • Filter or not (Score:5, Informative)

      by evanh ( 627108 ) on Thursday November 24, 2016 @03:47AM (#53353055)

      When the altitude stops changing for a whole second the filter is going to have to be a long one! And that ain't desirable for responsive control.

      The real question is how could the sensory processor have overloaded in the first place? My money is on simple code bloat. I.e., they used a bunch of generic libraries that use further libraries that use further libraries that use further libraries that use further libraries that use further libraries ...

      • by Anonymous Coward

        Bloat itself typically just makes things consistently slow.
        To get stalling you either need buffers upon buffers put there "to make things faster" or you need a runaway process that hangs.
        To not realize you have them, you need abstractions that hide them away.

      • by Anonymous Coward

        I seriously doubt they cobbled the software together with a bunch of generic libraries; but if they did they got what they deserved.

      • Re:Filter or not (Score:5, Insightful)

        by Solandri ( 704621 ) on Thursday November 24, 2016 @12:17PM (#53354777)
        Dynamic range. Sensors which can measure from 0 - 100 g's are not as sensitive in the 0-1 g range you may be more concerned about. So you instead opt for accelerometers which max out at 10 g's and try to deal with the periods of max acceleration in software.

        A more elegant solution is to use both the sensitive accelerometer and an accelerometer with a greater max threshold. That way you keep the higher max limit without giving up low-gain sensitivity. But spacecraft tend to be both weight- and budget-constrained...

        More troubling to me was that there wasn't some basic sanity checking going on. Like a calculation that says "3 seconds ago I was at 4 km high. Now I think I'm on the ground. Does it make sense that I could've traveled that far in that little time? No? Then the instruments saying I'm on the ground are probably wonky, and I should give other instruments a higher priority in calculating my altitude for a bit." Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.
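The sanity check described above can be sketched in a few lines. This is a minimal illustration, not the lander's actual logic: the function name `altitude_is_plausible` and the 500 m/s rate limit are assumptions picked for the example.

```python
def altitude_is_plausible(prev_alt_m, new_alt_m, dt_s, max_rate_m_s=500.0):
    """Return True if moving from prev_alt_m to new_alt_m in dt_s is physically credible.

    max_rate_m_s is a hypothetical bound on vertical speed during this phase.
    """
    if dt_s <= 0:
        return False  # no elapsed time, nothing can be concluded
    if new_alt_m < 0:
        return False  # an estimate below the surface is never credible
    rate_m_s = abs(prev_alt_m - new_alt_m) / dt_s
    return rate_m_s <= max_rate_m_s


# "3 seconds ago I was at 4 km high. Now I think I'm on the ground."
print(altitude_is_plausible(4000.0, 0.0, 3.0))     # False
print(altitude_is_plausible(4000.0, 3700.0, 3.0))  # True
```

A failed check only says the new estimate is suspect. What to trust instead is a separate design question, as the replies below point out.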
        • by Kjella ( 173770 )

          More troubling to me was that there wasn't some basic sanity checking going on. (...) Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.

          Well it's not exactly like the lander can abort, it's do or die. So you got inconsistent or unlikely data, but what's good and what's bad? Is it a glitch, is it defective, did a misfire flip us around or put us in a spin or block the sensor? Can we salvage it or is the mission fucked no matter what? That's really the million dollar question, is there a contingency plan that could work and if so what should trigger it.

          I'm guessing that with combinatorics you'll have potentially very many possible failure mod

        • Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.

          You sir, are not a typical coder. I would go so far as to say your error checking and thought processes are superior to (random very high number) 90% of the programmers I have seen. Mind you, I still think that is a bare minimum for calling yourself a programmer but each instance of proper coherency checking requires notice so that others can learn from it.

      • This is not how Kalman filter works. Even if it gets totally wrong data for one second it outputs "correct" values based on previous data.

        So, in our case the altitude output would have changed for this one second and the output values would have been quite close to the real altitude.

        • This is not as simple as that. The inertial sensor is not outputting bogus or noisy data that can be easily discarded from previous data. It is saturating because the actual acceleration or rotation of the spacecraft is higher than any value the sensor can measure. Any integration algorithm used to compute the position of the spacecraft, including a Kalman filter or not, is going to have trouble in those conditions. Of course, there are methods to estimate what could be the correct measurement value during
          • You want sanity checking (based on physical possibility/impossibility) on individual input data streams to the Kalman filter prior to allowing them to get into the filter's weighted averaging. If a given single measurement stream (the position measurement by integrated acceleration) is indicating impossible changes in position over various near-past time ranges, exclude the whole measurement-type from the averaging immediately.
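The idea of excluding impossible measurements before they reach the filter's averaging is often implemented as innovation gating. Below is a minimal scalar sketch under stated assumptions: the class name `GatedKalman1D`, the noise variances, and the 3-sigma gate are all hypothetical, and nothing here is taken from Schiaparelli's real navigation software.

```python
class GatedKalman1D:
    """Scalar Kalman filter for altitude that rejects implausible measurements."""

    def __init__(self, x0, p0, q, r, gate_sigma=3.0):
        self.x = x0                   # altitude estimate (m)
        self.p = p0                   # variance of the estimate
        self.q = q                    # process noise variance added per step
        self.r = r                    # measurement noise variance
        self.gate_sigma = gate_sigma  # gate width in standard deviations

    def predict(self, delta_alt_m):
        """Propagate the estimate by the expected altitude change for this step."""
        self.x += delta_alt_m
        self.p += self.q

    def update(self, z):
        """Fuse measurement z; return False (and ignore z) if it falls outside the gate."""
        innovation = z - self.x
        s = self.p + self.r           # innovation variance
        if innovation * innovation > (self.gate_sigma ** 2) * s:
            return False              # measurement excluded from the average
        k = self.p / s                # Kalman gain
        self.x += k * innovation
        self.p *= (1.0 - k)
        return True


kf = GatedKalman1D(x0=4000.0, p0=100.0, q=1.0, r=25.0)
kf.predict(-80.0)
print(kf.update(3915.0))  # True: consistent with the prediction
kf.predict(-80.0)
print(kf.update(-10.0))   # False: a negative altitude is far outside the gate
```

Gating only helps while the prediction itself is trustworthy. If the rejected stream is also what drives the prediction, as with a saturated inertial sensor, the variance keeps growing and some other source has to take over.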

      • When the altitude stops changing for a whole second the filter is going to have to be a long one! And that ain't desirable for responsive control.

        The real question is how could the sensory processor have overloaded in the first place? ...

        ... When I heard about the crash landing I literally said to a friend of mine, "I bet the subroutine that cuts the parachute loose so it doesn't land on top of the payload detected the thump of the parachute strings going taut, determined that meant it was on the ground, and cut the parachute." ...

        Mechanical devices can have really long "bounce" times, when it includes a parachute and riser lines it can easily be over a second.

        Not only was their mechanical testing lacking, their simulation software should have also picked this up. And they had a similar failure when the landing gear opened, in a previous lander.

        It sounds like they had a lot of scientists, but no engineers!

    • Re:Kalman filter (Score:4, Insightful)

      by gTsiros ( 205624 ) on Thursday November 24, 2016 @04:14AM (#53353131)

      I find it more weird that *one* sensor misbehaving led to the entire mission failing.

      I have more robustness in my thrust measuring rig made of wood beams and zipties :|

      • by Ramze ( 640788 )

        This. So much this. I don't know about the EU, but when NASA builds spacecraft, it tends to put in multiple redundancies where it can and add a little logic to determine if and when a sensor fails given other data. If you're going to send up a multi-million dollar craft for a project that will last months, have a backup plan for each and every thing that could possibly go wrong so long as it doesn't significantly add to the expense.

        We know that rocket scientists can fire an object into orbit and hit a

    • "Check for integer overflow" is a checkbox in Simulink.

      How was this not caught on the Hardware in the Loop test benches?

      Jesus people, is this amateur hour?
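For readers unfamiliar with what such an overflow check does, here is a small sketch of the two usual policies when a value does not fit a 16-bit signed integer: raise an error, or clamp to the limit. Whether integer overflow played any part in this crash is the commenter's assumption, and the function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value):
    """Convert to a 16-bit signed integer, raising OverflowError if it does not fit."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in int16")
    return result


def to_int16_saturating(value):
    """Convert to a 16-bit signed integer, clamping out-of-range values to the limits."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


print(to_int16_checked(123.9))       # 123
print(to_int16_saturating(40000.0))  # 32767
```

Neither policy is automatically safe. A raised error needs a handler that does something sensible in flight, and a clamped value is still wrong data that downstream logic must be told about.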

    • "How in hell did they test their Kalman filter to allow such bad data to reach the decision logic? (I assume they used one.)"

      1) A Kalman Filter probably is not really appropriate here because the parachute has just been deployed and you wouldn't have state statistics available to filter the input data. Doesn't mean they didn't use one with ad hoc statistics. That's not as uncommon as perhaps it should be.

      2) Presumably the IMU is expected to tell you the probe has run into the planet (i.e. landed) and it's

      • Presumably the IMU is expected to tell you the probe has run into the planet (i.e. landed) and it's time to get rid of the 'chute before it lands on your probe and also time to shut down the thrusters

        Wrong landing sequence. This spacecraft was intended to parachute down to some hundreds of metres, then fire up retro-rockets and jettison the parachutes, then descend to a few metres on the retro-rockets, then drop to the ground. So, the signal from the IMU would vary between free-fall and various substantial

  • by LordHighExecutioner ( 4245243 ) on Thursday November 24, 2016 @03:34AM (#53353017)
    Overflows and bad data problems happened to ESA before [wikipedia.org].
  • by Bongo ( 13261 ) on Thursday November 24, 2016 @03:43AM (#53353041)

    "Obligatory" Dark Star reference.

  • Ariane 5 (Score:2, Interesting)

    Brings to mind the failure of the first Ariane 5 [wikipedia.org] launcher because control software spat an Ada stack trace over a line which was supposed to only contain kinematic data.

    • Re: (Score:3, Informative)

      by Anonymous Coward

      > ...control software spat an Ada stack trace over a line...

      Eh, no. The failure of the INS's control software caused the INS to send diagnostic data (rather than sensor data) to the control systems, which then did what they _thought_ they were being commanded to do.

      None of the code in the system was modified in flight.

  • What the? (Score:5, Informative)

    by NewtonsLaw ( 409638 ) on Thursday November 24, 2016 @04:03AM (#53353105)

    So they didn't correlate the IMU data with ranging radar or even barometric altitude information so as to avoid this?

    I know weight and volume are at a premium on such craft but a barometric sensor (even one capable of operating in Mars's rarefied atmosphere) is the size of a thumbnail and weighs just a fraction of a gram.

    Sigh!

    • You can even correlate it with your own kinematic model. The scenario which the vehicle followed is impossible. It can't land one second after dropping the parachute, and so timing alone should have made it reject the invalid data.

    • Re:What the? (Score:4, Insightful)

      by thegarbz ( 1787294 ) on Thursday November 24, 2016 @05:26AM (#53353321)

      even barometric altitude information

      I'm interested to know how you calibrate your barometric altitude information, and even more so what vacuum followed by a sudden atmospheric entry will do to such a sensor.

      If I'm going to take a guess I'd say no, an instrument capable of operating in that range of pressures, temperatures, vibration, etc. is not the size of a thumbnail weighing a gram.

    • by Viol8 ( 599362 )

      Yes, you have to wonder why on a mission of this expense and complexity the height above the ground is essentially done by mathematical dead reckoning. Would adding a ranging radar really have added so much to the weight and/or required package size that it was infeasible to include it? Obviously they must have considered it and I'd be interested to know why in the end it was not seen as a viable part of the solution.

      • The article says that the radar was working. But the data from the radar seems to have been ignored at this point.

      • by khallow ( 566160 )

        Yes, you have to wonder why on a mission of this expense and complexity the height above the ground is essentially done by mathematical dead reckoning.

        Because it works really well. The other replier, MichaelSmith indicated it had radar as well.

      • by ddtmm ( 549094 )
        The radar unit plugs in to the lander's headphone jack. Unfortunately, the headphone jack was removed on the new landers.
    • If the landing struts are subject to a compressive force, you've probably landed. If not, you haven't. Why wouldn't the computer make use of this?

      Am I missing something, or is this a stupid design?
      • by Xolotl ( 675282 )
        It was supposed to have released the parachute and made the last part of the descent using retro-rockets for final braking, and only then would you get compression of the struts. In this case it released the parachute at 3+km rather than a few tens of metres ... oops.
    • by Ihlosi ( 895663 )
      I know weight and volume are at a premium on such craft but a barometric sensor (even one capable of operating in Mars's rarefied atmosphere) is the size of a thumbnail and weighs just a fraction of a gram.

      Even one that works at the velocity encountered during atmospheric entries?

      Sounds like you're suggesting putting a Pitot tube on a space probe ...

    • Will a barometric sensor work properly while descending through gases emitted from thrusters that are trying to slow the vehicle?

    • So they didn't correlate the IMU data with ranging radar or even barometric altitude information so as to avoid this?

      How do you know the barometric pressure profile before you enter the atmosphere? Mars has a trickily variable atmosphere.

      There was a large dust storm developing at the time, which is a (potentially) global event. How much does that affect barometric pressure? (On Mars, not necessarily on Earth.)

  • Oops (Score:5, Funny)

    by wonkey_monkey ( 2592601 ) on Thursday November 24, 2016 @04:38AM (#53353191) Homepage

    Should've used metric seconds.

  • by dohzer ( 867770 )

    What kind of IMUs are normally used in these craft? The same kind used in aircraft and weapons?

  • ... $1000 quadcopters back here on Earth ship with multiple IMUs for redundancy, since the bloody things are about as trustworthy as your average politician.

    Having made that glib remark, I'm sure it either did have redundancy, or if it didn't that was for a good reason (e.g. risk of failure deemed too low to warrant the weight penalty in adding redundancy). I would also like to think that they're using somewhat more reliable IMUs than those found in quads.
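The redundancy mentioned above is commonly handled by voting across sensors. A minimal sketch, assuming three independent readings of the same quantity: the function name `vote` and the spread threshold are illustrative, and the thread does not say how many IMUs Schiaparelli carried.

```python
import statistics


def vote(readings, max_spread):
    """Return the median reading and the indices of readings that disagree with it.

    max_spread is a hypothetical tolerance in the same units as the readings.
    """
    med = statistics.median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - med) > max_spread]
    return med, suspects


# Two accelerometers agree; the third is pegged at an implausible value.
value, suspects = vote([9.7, 9.9, 160.0], max_spread=1.0)
print(value, suspects)  # 9.9 [2]
```

Voting protects against one unit failing on its own. It does not help if every unit saturates for the same physical reason, which is the case the dynamic-range comments describe.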

  • Why wasn't the IMU sensor doubled by other ways of detection? There was no fallback in case it malfunctioned.
  • No basic sanity checks? As in "This phase must last at least X seconds", or "No switching to landing behavior if altitude measurement from 1 second ago still said '2 miles above surface'"?
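Both guards suggested above fit in one small function. This is a sketch under stated assumptions: the name `may_enter_landing_mode`, the 20-second minimum, and the 100 m ceiling are hypothetical values chosen only to make the example concrete.

```python
def may_enter_landing_mode(time_in_phase_s, recent_altitudes_m,
                           min_phase_s=20.0, max_recent_alt_m=100.0):
    """Allow the switch to landing behaviour only if both guards pass.

    time_in_phase_s:    seconds spent in the current descent phase
    recent_altitudes_m: altitude estimates from the last short window, oldest first
    """
    if time_in_phase_s < min_phase_s:
        return False  # "This phase must last at least X seconds"
    # Refuse if any recent estimate still put the vehicle high above the surface.
    return all(alt <= max_recent_alt_m for alt in recent_altitudes_m)


print(may_enter_landing_mode(5.0, [3.0, 2.0]))          # False: phase too short
print(may_enter_landing_mode(60.0, [3700.0, 0.0]))      # False: was at 3.7 km a moment ago
print(may_enter_landing_mode(60.0, [40.0, 12.0, 3.0]))  # True
```

With an empty history the altitude guard passes by default, so a real design would have to decide what a missing altitude source should mean.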
  • by tomhath ( 637240 ) on Thursday November 24, 2016 @08:04AM (#53353759)
    FTFA:

    "[T]he erroneous information generated an estimated altitude that was negative," ESA said.

    Which resulted in an actual altitude that was negative.

  • A brief burst was enough.

  • by Midnight Thunder ( 17205 ) on Thursday November 24, 2016 @09:19AM (#53353981) Homepage Journal

    I wonder whether making the source code of these probes available to the public, for vetting would help spot bugs like these? I am also curious whether releasing the code would be problematic for any reason?

    • by pr100 ( 653298 )

      I wonder whether making the source code of these probes available to the public, for vetting would help spot bugs like these? I am also curious whether releasing the code would be problematic for any reason?

      Dunno. But I suppose it might be that the code is written by a contractor and they hope to make money out of the code in other contexts.

    • Very unlikely. They didn't do it in the past. However, this story will make another good anecdote for a software engineering lecture.
    • by Anonymous Coward

      Software to land a probe on mars is quite similar, if not identical, to software to put a (nuclear) warhead on a target. That's an important strategic capability for "first world" nations - otherwise you're in the category of Saddam firing Scuds, which are basically V2s with newer parts, and quite literally cannot hit the broad side of a barn (albeit from 100 km away).

      So, the hard parts of solving the problem (after you've done the basic college physics part) are likely to not be open source. Things like

    • I wonder whether making the source code of these probes available to the public, for vetting would help spot bugs like these?

      Probably not, as the public wouldn't spend the months needed to study the hardware and interface specifications needed to understand what's going on in the software. Seriously, this is a tightly integrated system not a standalone program - without understanding the system, you can't tell a bug from working as intended.

  • by Anonymous Coward

    Lots of just plain old ignorant comments here. I say this in a nonpejorative sense - if you've not worked on flight software, there's no way you could know.

    1) Space is unforgiving, hardware designs change very, very slowly. Project schedules move fast and have limited budgets. Just because you can buy a MEMS based IMU for your quadcopter does not mean that you can get one for a spacecraft that will work reliably from -40 to +80C, withstand the vibe tests, the pyroshock, etc. Oh, yeah, and it (and the surr

    • It's really really hard, granted.

      In my experience in the systems engineering industry, there was rarely any re-use of design or code from one project to the next similar project. Silo-ism and misaligned incentives.

      Imagine if the reliability of this kind of EDL system and its software could be improved by evolution where different space agencies and subcontractors shared and re-used their ideas for improving solutions to the complex problem.

      Imagine all the landers... living for today.

    • ...
      4) Fault handling is tricky - you can easily go down a rat maze of low probability events generating code (and hardware) to handle obscure corner cases, thereby increasing your test costs and time, and potentially introducing other faults. For a lot of plausible error scenarios, it's likely you're going to fail for other reasons, so there's no point in trying to do things like estimate state from other sensors. ...

      That's true, but it can also encourage a habit of laziness in the designs. And an acceleration spike when the parachute opens or the landing gear locks is not something that has low probability. It sounds like a lot of "not my job".

  • The Schiaparelli EDM lander is an example of the typical one-off missions that humanity does these days. It's worth noting that they could have built and launched two or more of these vehicles for much less than the first and already be correcting the erroneous code on a second spacecraft. Then they wouldn't have to wait years for a replacement mission and have a much better chance of mission success.
    • Could be wrong, but I believe there are follow up missions and the Schiaparelli probe was intended as a great-if-everything-works-but-if-it-doesn't-we'll-probably-learn-a-lot proof of concept mission.
