Japan Space Science

Design, Hardware, Software Errors Doomed Japanese Hitomi Spacecraft (scientificamerican.com) 101

Reader Required Snark writes: The Japanese space agency JAXA said its recently launched X-ray observation satellite Hitomi has been destroyed. After a successful launch on February 17, contact with the satellite was lost on March 28. Of the 10-year expected life span, only three days of observations were collected. Preliminary inquiry points to multiple failures in design, hardware and software. After the launch it was discovered that the star tracker stabilization didn't work in a low magnetic flux area over the South Atlantic. When the backup gyroscopic spin stabilization took control, the spin increased instead of stopping. An internal magnetic limit feature in the gyroscope failed, causing the spin to get worse. Finally, a thruster-based control started, but because of a software failure the spin increased further. The solar panels broke off, leaving the satellite without a long-term power supply. It seems that untested software had been uploaded for thrust control just before the breakup. This is a major loss for astronomical research. Two previous attempts by Japan to launch a high-resolution X-ray calorimeter had also failed, and the next planned sensor of this type is not scheduled until 2028 by the ESA. Just building a replacement unit would take 3 to 5 years and cost $50 million, without the cost of a satellite or launch.
This discussion has been archived. No new comments can be posted.

  • by Dan East ( 318230 ) on Saturday April 30, 2016 @11:35AM (#52018669) Journal

    Design, Hardware, Software Error

    Oh, is that all?

    • Re: (Score:3, Interesting)

      by phrostie ( 121428 )

      it sounds like everyone is starting from scratch every time a project like this is built.
      regardless of success or fail, wouldn't it be best for everyone to release the engineering and software so that the next one is an improvement over what went before.
      it also might reduce the life cycle of the next project.

      just my .01999 USD

      • Re: (Score:2, Informative)

        by Anonymous Coward

        On the first launch of the Ariane 5 rocket, it used parts of the control software of the Ariane 4, a very reliable rocket with a success rate of more than 97%. The launch ended with the destruction of the rocket 37 seconds into the flight [wikipedia.org] due to an arithmetic overflow. It had not been taken into account that the bigger rocket would cause bigger values in the control software.

        • On the first launch of the Ariane 5 rocket, it used parts of the control software of the Ariane 4, a very reliable rocket with a success rate of more than 97%. The launch ended with the destruction of the rocket 37 seconds into the flight [wikipedia.org] due to an arithmetic overflow. It had not been taken into account that the bigger rocket would cause bigger values in the control software.

          It was great: they used software coded in Ada that detected the overflow and raised an exception, disabling the faulty part. The work was then taken over by the backup system which, being identical, did exactly the same thing. Whoops.
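The pattern described in this comment can be sketched in a few lines. This is only an illustration, not the Ariane flight code: the 16-bit target range, the `run_channel` helper and the channel names are assumptions made up for the example. It shows why a backup that runs identical code on identical input offers no protection against a software fault.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16(value: float) -> int:
    """Convert a float to a signed 16-bit integer, raising on overflow."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return result


def run_channel(name: str, reading: float) -> str:
    """One hypothetical processing channel; it shuts down on a conversion error."""
    try:
        to_int16(reading)
        return f"{name}: ok"
    except OverflowError:
        return f"{name}: shut down"


def run_redundant(reading: float) -> list[str]:
    # Both channels run the same code on the same input, so a value that
    # stops the primary stops the backup as well.
    return [run_channel("primary", reading), run_channel("backup", reading)]


if __name__ == "__main__":
    print(run_redundant(1000.0))   # value within the assumed range
    print(run_redundant(50000.0))  # value beyond the assumed range
```

Redundancy of this kind guards against independent hardware faults; a shared software assumption fails in every copy at once.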

          • by laing ( 303349 )
            The main issue with Ariane 5 was the in-band error code being sent from the accelerometer when the force during liftoff exceeded its specification. The in-band error code was interpreted by the flight control system as a valid data value. Things went very wrong from then on.
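The in-band error code problem this comment describes can also be illustrated with a small sketch. Everything here is hypothetical (the `ERROR_SENTINEL` value, the `Reading` type and both `steer_*` functions are invented for the example and do not reflect any real flight system): when an error code travels in the same field as the measurement, the consumer may act on it as if it were data, whereas a separate validity flag lets the consumer reject it.

```python
from dataclasses import dataclass
from typing import Optional

ERROR_SENTINEL = -9999  # hypothetical in-band "sensor failed" code


def steer_in_band(raw: int) -> int:
    """Naive consumer: treats every incoming value as a measurement."""
    # The sentinel is indistinguishable from data here, so the error code
    # produces a large and meaningless correction command.
    return -raw


@dataclass
class Reading:
    value: Optional[int]  # measurement, or None when the sensor failed
    valid: bool           # out-of-band validity flag


def steer_out_of_band(reading: Reading, last_command: int) -> int:
    """Consumer that checks the validity flag before using the value."""
    if not reading.valid or reading.value is None:
        return last_command  # hold the previous command instead of guessing
    return -reading.value


if __name__ == "__main__":
    print(steer_in_band(ERROR_SENTINEL))
    print(steer_out_of_band(Reading(None, False), last_command=3))
    print(steer_out_of_band(Reading(5, True), last_command=3))
```

Holding the last command is just one possible fallback; the point is that the consumer gets to decide, because the failure is visible to it.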
      • by Anonymous Coward

        There have been some attempts to create standardized satellites: a premade box with batteries, gyroscopes, thrusters, solar panels, etc., with a cavity inside for mission modules (communications, photography, etc.). And they make sense for general uses where you're not trying to do anything too advanced. However, with highly advanced/sensitive instruments like those in space telescopes it's not really practical, since each instrument has a litany of things that can throw off its measurements. The James Webb telescop

    • by Tablizer ( 95088 )

      Bad art

    • I would have thought that most of this would be plug and play by now. Not so much that every component is the same, but that how the components interface would be the same. Much like when I hook up a new, more sensitive mouse or a better printer to my computer, I don't need to reconfigure my browser. There can't be that many unique systems on any given platform. Gyros, rockets, sensors, etc. Maybe today's gyro package is way better than yesterday's but I would think that you just make the interface capable of
      • I would also have thought that there would be simulators for most of this crap.

        A KSP add-on?

      • by Anonymous Coward

        Flight software developer here (I *am* a rocket engineer, of sorts)

        Spacecraft stuff is not made in sufficient volumes to have standardized interfaces beyond the basic electrical interface. Sure, there's a 24V DC power, perhaps a few discretes, some discrete telemetry (voltage, current, temperature), and some kind of data interface (MIL-STD-1553, RS-422 serial, or SpaceWire are most likely).
        So you will be writing some custom software to deal with this almost one-of-a-kind interface.

        Typically, you are inheri

        • I would suspect that this tradeoff would apply to every single project, but not all the projects overall. The above screwup cost over $200 million. Thus the savings from preventing a single screwup out of even 20 projects would more than cover the extra costs.

          This is similar to an argument I often have about unit testing. Many programmers are still opposed to the idea. I actually believe that a large project without unit testing can't actually be completed. The extra effort of unit testing actually allows
        • by tlhIngan ( 30335 )

          Flight software typically doesn't have lots of extra capability: you have to test it over the entire range, so it tends to be "do we have a specific requirement for that? Yes: build it and test it; No, it's nice to have: Don't build it" So your idea of "incorporate lots of flexibility against potential future devices" would be a non-starter: what requirement would you design against for that "potential future device"? How would you justify that particular requirement, as opposed to another? Say your existin

    • Space is hard. For the Japanese, more so. Some nations should just stop bothering with their own space programs. There is nothing they can glean that they couldn't get through a collaborative effort with countries that do space well.
  • 3 days of data? (Score:4, Insightful)

    by JustAnotherOldGuy ( 4145623 ) on Saturday April 30, 2016 @11:50AM (#52018719) Journal

    Only got 3 days of data? Damn, that's gotta hurt.

    Also, the "Design, Hardware, Software Error" bit is funny in a way...I mean, what else was left to screw up? This was like the Trifecta of Fuckups.

    • by Anonymous Coward

      They could have given it bad instructions (i.e. user error).

    • Flying over the South Atlantic Anomaly. It's not as if nobody knew that it is there and that it causes issues that should be tested for before launch.

    • by CODiNE ( 27417 )

      A Whathefecta!

  • ... that you find they were wired backwards.
  • by fahrbot-bot ( 874524 ) on Saturday April 30, 2016 @12:14PM (#52018789)

    It seems that untested software had been uploaded for thrust control just before the breakup.

    See what happens when you don't disable the GWX settings.

  • by Anonymous Coward on Saturday April 30, 2016 @12:27PM (#52018845)

    From the TFA

    Dan McCammon, an astronomer at the University of Wisconsin–Madison, helped to design and build Hitomi’s premier scientific instrument, an X-ray calorimeter that measures the energy of X-ray photons with exquisite precision. He has been working on the technology for more than three decades, flying versions of it on the ASTRO-E mission, which failed on launch in 2000, and the Suzaku spacecraft, in which a helium leak rendered the instrument useless weeks after its 2005 launch.

  • by Anonymous Coward

    Re-appoint your entire senior software team, especially the lead. Examine the engineering background of the rest.

    Hardware fails, that's completely inevitable. Software of the kind we're talking about is meant to limit the impact of independent hardware failures, which it can do because its own failure modes can be given however many fractional 9's of perfect reliability you desire, limited only by available resources.

    From the reports, it seems clear that the probe's software was not designed to do that, a

    • You don't want to knee-jerk it. Who approved the upload of untested software, and why? There could be a valid reason - say a fatal bug discovered in the existing code and no way to change the launch schedule. It could be budget pressure - simply not enough money to test. It could be plain incompetence.

  • If the satellite is being designed and built by a government organisation, in the name of the advancement of human knowledge, should we be encouraging the software to be open source? Have there been examples of such initiatives?

    • by tomhath ( 637240 )
      Some is available [kottke.org]. But keep in mind that "civilian" space programs are usually thinly disguised military projects, so much of what's really happening is not made public.
      • Some is available [kottke.org]. But keep in mind that "civilian" space programs are usually thinly disguised military projects, so much of what's really happening is not made public.

        Thanks for the link. What you say makes sense, though I thought I would ask anyhow, since there is likely a shift between what is considered knowledge limited to military use?

  • by vmaxxxed ( 734128 ) on Saturday April 30, 2016 @12:53PM (#52018967)

    Those are called political and budget pressure by managers who have no clue about engineering.

    Software uploaded without testing? There is no way they could have gotten this far without testing. I am sure there is no engineer in Japan who does not test thoroughly. Actually, Japanese code is famous for being of the best quality.

    This was caused by politics, bureaucracy and plain bad management.

    • by gweihir ( 88907 )

      Indeed. And very likely by a culture of "not contradicting the boss". An engineer that is unwilling to "contradict the boss" is a bad engineer, no matter what other skills he has. Of course, many bosses simply get rid of the "naysayers" and foster a culture of "can do". The results are invariably what we see in this story, although many managers manage to conceal that they were responsible for quite a while and sometimes forever. If the damage is huge, it is very rarely the engineers that have screwed up.

      • by Viol8 ( 599362 )

        >An engineer that is unwilling to "contradict the boss" is a bad engineer,
        >no matter what other skills he has. Of course, many bosses simply get
        >rid of the "naysayers"

        And there's your problem right there. What would you rather be - a "bad" engineer who can still pay the mortgage/rent, or a righteous engineer who's now looking for work and could be on the street in a few months if he doesn't get a new job?

        • by gweihir ( 88907 )

          I most certainly do not want to be the engineer responsible for a spectacular failure. Of course, the software field has far too many "engineers", many of them bad in other ways, which makes the problem worse. But while I work at a level where I not only can speak up but am required to speak up, I can understand the person who decides to keep quiet.

        • What would you rather be

          Without even a second of hesitation, the latter. I live in Canada, so there's no at-will or right-to-work or whatever the hell it's called (don't know the difference or care), so good luck firing me for trying to do the right thing. Especially since we have professional organizations backing us up, the company has a hell of a lot more to lose than I do.

          Even if that wasn't the case, my answer doesn't change. If I just wanted to make money slaving under someone else's will with no creative say in my work, dam

        • ... And there's your problem right there. What would you rather be - a "bad" engineer who can still pay the mortgage/rent, or a righteous engineer who's now looking for work and could be on the street in a few months if he doesn't get a new job?

          It is better to be fired, or quit. Then you will not be one of the ones black-balled by all of the other personnel departments, after the disaster.

          But it is a value judgement that must be made by all of us, based on the potential damage that might happen and the odds.

      • Indeed. And very likely by a culture of "not contradicting the boss". An engineer that is unwilling to "contradict the boss" is a bad engineer, no matter what other skills he has.

        You're supposed to raise the issue after work, over drinks. Yes, I would also prefer to have time for my own personal life, than have to go drinking after work in order to continue working, and do the stuff you should have been able to do at work but couldn't because of societal inertia and corporate culture.

        If I weren't so concerned with what is happening here in the USA regarding labor, I'd be really and truly fascinated by it in Japan. They have a culture of make-work now, and a massive suicide rate. We

    • Doc: "No wonder this circuit failed; it says 'Made in Japan.'" --Back to the Future

      • Doc: "No wonder this circuit failed; it says 'Made in Japan.'" --Back to the Future

        Yeah. I noticed that too. A good laugh. 'Member when "made in China" only meant McMickey toys? I was thinking at the time that they'd follow the same arc as Japan.

  • by fahrbot-bot ( 874524 ) on Saturday April 30, 2016 @01:25PM (#52019093)

    It seems that untested software had been uploaded for thrust control just before the breakup.

    Note to self: Don't ask your girlfriend questions you don't want the answers to - again.

  • by gweihir ( 88907 ) on Saturday April 30, 2016 @01:46PM (#52019199)

    This is just one of the more spectacular examples. I have heard of managers of large software teams that "do not believe in testing", I have seen Internet-reachable critical software that got a security evaluation only after deployment, because it was finished only a few days before deployment, and quite a few more things of similar utter incompetence. My guess is that the people responsible for these completely ridiculous screwups are "managers" that think they know how it all works (while being clueless), and that have eliminated all resistance to their views by firing anybody actually competent.

    This is a dangerous and completely unacceptable regression. Humanity needs to be good at engineering if it is to have a future.

    • by Wiener ( 36657 )

      Humanity needs to be good at engineering if it is to have a future.

      So...get rid of management? Because the two are mutually exclusive.

      • ... So...get rid of management? Because the two are mutually exclusive.

        No, just the "pointy-haired managers", who are not actually managers at all!

        Real managers are necessary and helpful.

  • by drwho ( 4190 ) on Saturday April 30, 2016 @02:53PM (#52019461) Homepage Journal

    I'd like to see a more thorough investigation of this set of incidents. That means no one involved gets to skip out by Seppuku. One of the problems with having a number of backup systems is that people tend to think "well, if it breaks, there's a backup system" - not realizing that each time a backup system is added, complexity is added, and that overall reliability goes down, instead of up. I don't know if over-reliance on backup systems, and failure to manage complexity, was the cause here, but it's the only thing other than "bad luck" or "sabotage" that can explain this disaster from a country which has many talented engineers.

    • Not exactly your point, but in the same vein... Your comment on backup systems reminds me of a common misconception when it comes to designing seals with O-ring gaskets.

      I've heard many times: "Well, it almost seals, so if we just put a backup gasket in there it will be fine." Any O-ring design guideline will tell you that adding a backup only allows you to loosen your machining tolerances a bit; e.g. if the groove had to be X +/-0.005" deep, now it can be X +/-0.010" instead. X still has to be the same, the

  • IBM 9000 (Score:5, Funny)

    by drwho ( 4190 ) on Saturday April 30, 2016 @03:00PM (#52019493) Homepage Journal

    "Well, I don’t think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error."

  • "Ask yourself why an antenna won't deploy on a deep space probe."

    "Or ask how they could launch a $6Billion telescope without testing its mirror."

    'The Arrival'

    https://www.youtube.com/watch?... [youtube.com]

  • So... what *modern* development methodology and platform did they use?

  • I know nothing about the specifics of this mission, but I do know something about spacecraft.

    The summary says the star tracker didn't work in "an area of low magnetic flux" (the South Atlantic Anomaly [wikipedia.org]). The true issue is that the SAA is a high radiation area and the radiation caused an SEU [wikipedia.org] in the star tracker. The Scientific American article was a bit mixed up about dumping the momentum stored in the reaction wheels. The text is a bit jumbled, but I believe the article was referring to magnetic torque
