Design, Hardware, Software Errors Doomed Japanese Hitomi Spacecraft (scientificamerican.com)
Reader Required Snark writes: The Japanese space agency JAXA said its recently launched X-ray observation satellite Hitomi has been destroyed. After a successful launch on February 17, contact with the satellite was lost on March 28. Of the 10-year expected life span, only three days of observations were collected. Preliminary inquiry points to multiple failures in design, hardware and software. After the launch it was discovered that the star tracker stabilization didn't work in a low magnetic flux area over the South Atlantic. When the backup gyroscopic spin stabilization took control, the spin increased instead of stopping. An internal magnetic limit feature in the gyroscope failed, causing the spin to get worse. Finally, a thruster-based control started, but because of a software failure the spin increased further. The solar panels broke off, leaving the satellite without a long-term power supply. It seems that untested software had been uploaded for thrust control just before the breakup. This is a major loss for astronomical research. Two previous attempts by Japan to launch a high-resolution X-ray calorimeter had also failed, and the next planned sensor of this type is not scheduled until 2028 by the ESA. Just building a replacement unit would take 3 to 5 years and cost $50 million, without the cost of a satellite or launch.
Is that all? (Score:5, Funny)
Design, Hardware, Software Error
Oh, is that all?
Reminds me of a recent jdrama (Score:2)
Interesting. I just watched a true-story-based jdrama about the development of rockets in Japan called "Shitamachi Rocket", which blew my mind.
Re: (Score:1)
You don't know much about junkies if you think they want anyone to suck their dicks. They're not into getting high with natural brain chemistry.
Junkies suck dicks, nobody sucks junkie dicks.
Re: (Score:3, Interesting)
it sounds like everyone is starting from scratch every time a project like this is built.
regardless of success or fail, wouldn't it be best for everyone to release the engineering and software so that the next one is an improvement over what went before.
it also might reduce the life cycle of the next project.
just my .01999 USD
Re: (Score:2, Informative)
On the first launch of the Ariane 5 rocket, it used parts of the control software of the Ariane 4, a very reliable rocket with a success rate of more than 97%. The launch ended with the destruction of the rocket 37 seconds into the flight [wikipedia.org] due to an arithmetic overflow. It had not been taken into account that the bigger rocket would cause bigger values in the control software.
Re: (Score:2)
On the first launch of the Ariane 5 rocket, it used parts of the control software of the Ariane 4, a very reliable rocket with a success rate of more than 97%. The launch ended with the destruction of the rocket 37 seconds into the flight [wikipedia.org] due to an arithmetic overflow. It had not been taken into account that the bigger rocket would cause bigger values in the control software.
It was great: they used software written in Ada that detected the overflow and raised an exception, shutting down the faulty unit. The work was then taken over by the backup system which, being identical, did exactly the same thing. Whoops.
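To make the failure mode above concrete, here is a minimal sketch in Python (not the actual Ada flight code). It assumes, as was publicly reported, that the overflow came from converting a 64-bit floating-point value to a 16-bit signed integer. The names `OperandError`, `to_int16_checked`, `run_unit` and `guidance_output`, and the input values, are hypothetical and only for illustration.

```python
INT16_MIN, INT16_MAX = -32768, 32767


class OperandError(Exception):
    """Hypothetical stand-in for the exception raised on a failed conversion."""


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, raising if the value does not fit."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OperandError(f"{value} does not fit in 16 bits")
    return result


def run_unit(horizontal_bias: float):
    """One inertial unit: on an exception it shuts itself down (returns None)."""
    try:
        return to_int16_checked(horizontal_bias)
    except OperandError:
        return None


def guidance_output(horizontal_bias: float):
    """Try the primary unit, then an identical backup running the same code."""
    for unit in ("primary", "backup"):
        output = run_unit(horizontal_bias)
        if output is not None:
            return unit, output
    # Both units rejected the same input for the same reason.
    return None, None
```

With a value inside the old flight envelope, say `guidance_output(1000.0)`, the primary answers. With a value outside it, say `guidance_output(50000.0)`, both units shut down, because an identical backup protects against hardware faults but not against a shared software assumption.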
Re: (Score:1)
There have been some attempts to create standardized satellites: a premade box with batteries, gyroscopes, thrusters, solar panels, etc., with a cavity inside for mission modules (communications, photography, etc.). And they make sense for general uses where you're not trying to do anything too advanced. However, with highly advanced/sensitive instruments like in space telescopes it's not really practical, since each instrument has a litany of things that can throw off its measurements. The James Webb telescop
Re: (Score:1)
Bad art
Re: (Score:2)
I would also have thought that there would be simulators for most of this crap.
A KSP add-on?
tradeoffs in flight software (Score:1)
Flight software developer here (I *am* a rocket engineer, of sorts)
Spacecraft stuff is not made in sufficient volumes to have standardized interfaces beyond the basic electrical interface. Sure, there's a 24V DC power, perhaps a few discretes, some discrete telemetry (voltage, current, temperature), and some kind of data interface (MIL-STD-1553, RS-422 serial, or SpaceWire are most likely).
So you will be writing some custom software to deal with this almost one-of-a-kind interface.
Typically, you are inheri
Re: (Score:2)
This is similar to an argument I often have about unit testing. Many programmers are still opposed to the idea. I actually believe that a large project without unit testing can't actually be completed. The extra effort of unit testing actually allows
3 days of data? (Score:4, Insightful)
Only got 3 days of data? Damn, that's gotta hurt.
Also, the "Design, Hardware, Software Error" bit is funny in a way...I mean, what else was left to screw up? This was like the Trifecta of Fuckups.
Re: (Score:1)
They could have given it bad instructions (i.e. user error).
Re: (Score:1)
Flying over the South Atlantic Anomaly. It's not as if nobody knew it was there, and it causes issues that should have been tested for before launch.
Re: (Score:2)
A Whathefecta!
This would be an excellent time for them to post (Score:2)
on Reddit's TIFU: https://www.reddit.com/r/tifu/ [reddit.com]
Re: (Score:2)
Lol, the T in TIFU is really more of a guideline, in that it's OK to completely ignore it. Same goes for the I.
TIFU is really more like TITOASWIOSEFU: Today I Thought Of A Story Where I or Someone Else Fucked Up. Not quite as catchy.
It's only when the backups kick in... (Score:2)
Re: (Score:2)
If only. All they'd need to do in that case is "reverse the polarity"
Software uploaded before breakup. (Score:4, Funny)
It seems that untested software had been uploaded for thrust control just before the breakup.
See what happens when you don't disable the GWX settings.
I feel sorry for that guy... (Score:5, Informative)
From the TFA
Dan McCammon, an astronomer at the University of Wisconsin–Madison, helped to design and build Hitomi’s premier scientific instrument, an X-ray calorimeter that measures the energy of X-ray photons with exquisite precision. He has been working on the technology for more than three decades, flying versions of it on the ASTRO-E mission, which failed on launch in 2000, and the Suzaku spacecraft, in which a helium leak rendered the instrument useless weeks after its 2005 launch.
Re: (Score:3)
Space is dead. It's a radiation-blasted vacuum. Nobody is going to live there. Ever. Get over it, Space Nutters. We should kill all astrophysicists and burn all scifi books. Like in Europe.
Europe got bored of that and the sport is now found elsewhere in the world. I for one welcome space nutters, since they give us something else to talk about :) I would burn the trolls but, not considering myself a violent person, I will accept making sport of them.
Re: (Score:1)
'cos Arianespace is totally not a thing, ESA was closed down years ago and Darmstadt is only known for its football team.
Suggestion to JAXA (Score:1)
Re-appoint your entire senior software team, especially the lead. Examine the engineering background of the rest.
Hardware fails, that's completely inevitable. Software of the kind we're talking about is meant to limit the impact of independent hardware failures, which it can do because its own failure modes can be given however many fractional 9's of perfect reliability you desire, limited only by available resources.
From the reports, it seems clear that the probe's software was not designed to do that, a
Re: (Score:3)
You don't want to knee-jerk it. Who approved the upload of untested software, and why? There could be a valid reason - say a fatal bug discovered in the existing code and no way to change the launch schedule. It could be budget pressure - simply not enough money to test. It could be plain incompetence.
Re: (Score:2)
Fuck it. Just kill off the entire human race and start over.
Re: (Score:2)
Fuck it. Just kill off the entire human race and start over.
Tried that once. Didn't work.
Open source satellite software? (Score:2)
If the satellite is being designed and built by a government organisation, in the name of the advancement of human knowledge, should we be encouraging the software to be open source? Have there been examples of such initiatives?
Re:Open source satellite software? (Score:4, Interesting)
It was probably running Linux, first mistake.
Nah; it was probably running ITRON [wikipedia.org]. It may well have included a POSIX library, but that wouldn't qualify it as a version of Linux, even if some Linux code is included there.
I haven't actually bothered to dig up the info, but that's what anyone acquainted with how such things are done in Japan would guess for a situation with serious RT requirements. Maybe it'd be interesting to investigate, to get an idea whether the OS and system libraries might have had anything to do with the failures.
Re: (Score:2)
Some is available [kottke.org]. But keep in mind that "civilian" space programs are usually thinly disguised military projects, so much of what's really happening is not made public.
Thanks for the link. What you say makes sense, but I thought I would ask anyhow, since there is likely a shift in what is considered knowledge limited to military use?
Re: (Score:2)
Just looked and the CA is the US government and it is valid until 2018. This is likely valid, just not a certificate authority most browsers have by default?
Those are not software and hardware errors -- (Score:5, Interesting)
Those are called political and budget pressure by managers who have no clue about engineering.
Software uploaded without testing? There is no way they could have gotten this far without testing. I am sure there is no engineer in Japan who does not test thoroughly. Actually, Japanese code is famous for being of the best quality.
This was caused by politics, bureaucracy and plain bad management.
Re: (Score:3)
Indeed. And very likely by a culture of "not contradicting the boss". An engineer that is unwilling to "contradict the boss" is a bad engineer, no matter what other skills he has. Of course, many bosses simply get rid of the "naysayers" and foster a culture of "can do". The results are invariably what we see in this story, although many managers manage to conceal that they were responsible for quite a while and sometimes forever. If the damage is huge, it is very rarely the engineers that have screwed up.
Re: (Score:2)
>An engineer that is unwilling to "contradict the boss" is a bad engineer,
>no matter what other skills he has. Of course, many bosses simply get
>rid of the "naysayers"
And there's your problem right there. What would you rather be - a "bad" engineer who can still pay the mortgage/rent, or a righteous engineer who's now looking for work and could be on the street in a few months if he doesn't get a new job?
Re: (Score:3)
I most certainly do not want to be the engineer responsible for a spectacular failure. Of course, the software field has far too many "engineers", many of them bad in other ways, which makes the problem worse. But while I work at a level where I not only can speak up but am required to, I can understand the person who decides to keep quiet.
Re: (Score:2)
What would you rather be
Without even a second of hesitation, the latter. I live in Canada, so there's no at-will or right-to-work or whatever the hell it's called (don't know the difference or care), so good luck firing me for trying to do the right thing. Especially since we have professional organizations backing us up, the company has a hell of a lot more to lose than I do.
Even if that wasn't the case, my answer doesn't change. If I just wanted to make money slaving under someone else's will with no creative say in my work, dam
Re: (Score:1)
... And there's your problem right there. What would you rather be - a "bad" engineer who can still pay the mortgage/rent, or a righteous engineer who's now looking for work and could be on the street in a few months if he doesn't get a new job?
It is better to be fired, or quit. Then you will not be one of the ones black-balled by all of the other personnel departments after the disaster.
But it is a value judgement that must be made by all of us, based on the potential damage that might happen and the odds.
Re: (Score:2)
Interesting. The two you mention are killers as well. Makes sense to me.
Re: (Score:2)
Indeed. And very likely by a culture of "not contradicting the boss". An engineer that is unwilling to "contradict the boss" is a bad engineer, no matter what other skills he has.
You're supposed to raise the issue after work, over drinks. Yes, I would also prefer to have time for my own personal life, than have to go drinking after work in order to continue working, and do the stuff you should have been able to do at work but couldn't because of societal inertia and corporate culture.
If I weren't so concerned with what is happening here in the USA regarding labor, I'd be really and truly fascinated by it in Japan. They have a culture of make-work now, and a massive suicide rate. We
Re: (Score:2)
Doc: "No wonder this circuit failed; it says 'Made in Japan'" --Back to the Future
Re: (Score:2)
Doc: "No wonder this circuit failed; it says 'Made in Japan'" --Back to the Future
Yeah. I noticed that too. A good laugh. 'Member when "made in China" only meant McMickey toys? I was thinking at the time that they'd follow the same arc as Japan.
Dating an Engineer (Score:5, Funny)
It seems that untested software had been uploaded for thrust control just before the breakup.
Note to self: Don't ask your girlfriend questions you don't want the answers to - again.
Cretinization of engineering (Score:3)
This is just one of the more spectacular examples. I have heard of managers of large software teams that "do not believe in testing", I have seen Internet-reachable critical software that got a security evaluation only after deployment, because it was finished only a few days before deployment, and quite a few more things of similar utter incompetence. My guess is that the people responsible for these completely ridiculous screwups are "managers" that think they know how it all works (while being clueless), and that have eliminated all resistance to their views by firing anybody actually competent.
This is a dangerous and completely unacceptable regression. Humanity needs to be good at engineering if it is to have a future.
Re: (Score:1)
Humanity needs to be good at engineering if it is to have a future.
So...get rid of management? Because the two are mutually exclusive.
Re: (Score:1)
... So...get rid of management? Because the two are mutually exclusive.
No, just the "pointy-haired managers", who are not actually managers at all!
Real managers are necessary and helpful.
Firing the wrong thruster is "an edge case"?? (Score:2)
I'd have thought for a spacecraft control system it would be one of the first pieces of code you'd test! It's equivalent to putting a car into Drive and finding yourself going backwards!
Root cause analysis? (Score:4, Insightful)
I'd like to see a more thorough investigation of this set of incidents. That means no one involved gets to skip out by seppuku. One of the problems with having a number of backup systems is that people tend to think "well, if it breaks, there's a backup system", not realizing that each time a backup system is added, complexity is added too, and overall reliability can go down instead of up. I don't know if over-reliance on backup systems, and failure to manage complexity, was the cause here, but it's the only thing other than "bad luck" or "sabotage" that can explain this disaster from a country which has many talented engineers.
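A rough way to see when a backup hurts rather than helps is a small back-of-the-envelope sketch. It assumes independent failures and lumps everything the backup adds (switch-over logic, extra interfaces, spurious takeovers) into one hypothetical probability `p_new_mode`. The function name and all numbers are invented for illustration and are not taken from the Hitomi investigation.

```python
def redundant_failure(p_primary: float, p_backup: float, p_new_mode: float) -> float:
    """Failure probability of a primary+backup pair, assuming independence.

    The pair fails if both units fail, or if the failure mode introduced
    by the redundancy itself (p_new_mode) occurs.
    """
    both_fail = p_primary * p_backup
    return 1 - (1 - both_fail) * (1 - p_new_mode)


single = 0.01  # hypothetical failure probability of one unit on its own

# Backup whose added complexity is well tested: the pair beats a single unit.
well_tested = redundant_failure(0.01, 0.01, 0.001)

# Backup whose added complexity is poorly tested: the pair is worse than one unit.
poorly_tested = redundant_failure(0.01, 0.01, 0.02)

print(f"single unit:         {single:.4f}")
print(f"well-tested backup:  {well_tested:.4f}")
print(f"poorly tested backup:{poorly_tested:.4f}")
```

Under these assumptions the backup only pays off while the failure mode it introduces stays rarer than the failure it is meant to cover. A shared design flaw (both units failing together) would make things worse still, which this independence-based sketch does not model.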
Re: (Score:2)
Not saying you're wrong, necessarily, but to the GP's point, there is added complexity to the system. While you say that the second key doesn't affect your key working, it does if you need the car and your spouse happens to be using it. Also, there's a second set of keys to keep track of, which could be lost or stolen and subsequently used to steal the car. So redundancy has added a new failure mode and increased the likelihood of an existing one.
Re: (Score:2)
Yeah. I'd say that I've seen outages caused by backups perhaps every 2-4 years. From resource limitations to outright offline for the users.
But most software is not "critical". Only moderate efforts, if even that much, are made for redundancy, so the failures from the backups (as opposed to failures *of* the backups) tend to be taken in stride.
Anyways, I imagine it takes at least an order of magnitude greater effort for critical software.
Re: (Score:2)
Not exactly your point, but in the same vein... Your comment on backup systems reminds me of a common misconception when it comes to designing seals with O-ring gaskets.
I've heard many times: "Well, it almost seals, so if we just put a backup gasket in there it will be fine." Any O-ring design guideline will tell you that adding a backup only allows you to loosen your machining tolerances a bit; e.g. if the groove had to be X +/-0.005" deep, now it can be X +/-0.010" instead. X still has to be the same, the
IBM 9000 (Score:5, Funny)
"Well, I don’t think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error."
obligatory "The Arrival." It's alien sabotage (Score:2)
"Ask yourself why an antenna won't deploy on a deep space probe."
"Or ask how they could launch a $6Billion telescope without testing its mirror."
'The Arrival'
https://www.youtube.com/watch?... [youtube.com]
What to steer clear of? (Score:2)
So... what *modern* development methodology and platform did they use?
Summary Error, Article Error (Score:1)
The summary says the star tracker didn't work in "an area of low magnetic flux" (the South Atlantic Anomaly [wikipedia.org]). The true issue is that the SAA is a high radiation area and the radiation caused an SEU [wikipedia.org] in the star tracker. The Scientific American article was a bit mixed up about dumping the momentum stored in the reaction wheels. The text is a bit jumbled, but I believe the article was referring to magnetic torque