Space Technology

Murphy's Law Rules NASA 274

3x37 writes "James Oberg, former long-time NASA operations employee, now journalist, wrote an MSNBC article about the reality of Murphy's Law at NASA. Interestingly, the incident that sparked Murphy's Law over 50 years ago had a nearly identical cause to that of the Genesis probe failure. The conclusion: Human error is an inevitable input to any complex endeavor. Either you manage and design around it or fail. NASA management still often chooses the latter."
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • Re:Mark my words (Score:1, Insightful)

    by zerdood ( 824300 ) <null@dev.com> on Friday October 22, 2004 @10:19AM (#10597725)
    Why does everyone have this Crichtonesque fear? As long as competent humans program the machines, they will be made unable to harm humans.
  • Re:Mark my words (Score:5, Insightful)

    by wiggys ( 621350 ) on Friday October 22, 2004 @10:20AM (#10597733)
    Except, of course, that we programmed the machines in the first place.

    When a computer program crashes it's usually down to the human(s) who programmed it, and on the rare occasions it's a hardware glitch, it was humans who designed the hardware; so we're still to blame, either directly or indirectly.

    I suppose it's like the argument about whether it's bullets that kill, or the human who pulled the trigger.
  • I'm still trying to figure out why the Apollo formula of contractors with NASA oversight doesn't seem to work anymore.

    Then I remember Apollo 1, that killed 3 astronauts, and Apollo 13, that nearly killed 3 more.

    To invoke Heinlein, space is a harsh mistress.

    To invoke Sun Tzu, success in defense is not based on the likelihood of your enemy attacking. It is based on your position being completely unassailable.

  • Good Point (Score:5, Insightful)

    by RAMMS+EIN ( 578166 ) on Friday October 22, 2004 @10:23AM (#10597766) Homepage Journal
    ``Human error is an inevitable input to any complex endeavor. Either you manage and design around it or fail.''

    This is a very good point, and I wish more people would realize it.

    For software development, the application is: Just because you can write 200 lines of correct code does not mean you can write 2 * 200 lines of correct code. Always have someone verify your code (not yourself, because you read over your errors without noticing them).
  • by Puls4r ( 724907 ) on Friday October 22, 2004 @10:25AM (#10597795)
    > Either you manage and design around it or fail. NASA management still often chooses the latter.

    This is hindsight at its best, and is the classic comment by bureaucrats who have no concept of what cutting-edge design is about. F1 race cars, racing sailboats, nuclear reactors - NO design is failsafe, and NO design is foolproof. Especially a one-off design that isn't mass-produced. Even mass-produced designs have errors, as in the auto industry. It is a simple fact of life that engineers and managers balance cost and safety constantly.

    What you SHOULD be comparing this against is other space agencies that launch a similar number of missions and satellites - i.e. other real-world examples.

    Expecting perfection is not realistic.
  • by Anonymous Coward on Friday October 22, 2004 @10:26AM (#10597811)
    Human error is an inevitable input to any complex endeavor. Either you manage and design around it or fail. NASA management still often chooses the latter.

    There's a contradiction in the statement above, but I can't think what it is exactly. Something along the lines of: humans manage and design the error handling, don't they?

    That said, there's nothing wrong with building in redundancy and failsafes.

    In space probes redundancy comes at the cost of number of unique mission goals and financial cost.

    Sometimes you just have to eat the failure; that's what insurance is for. We in the public shouldn't always expect NASA to have 100% failure-free (non-human) missions and then exact harsh punishment on them, which invariably gets passed down to engineers and not to the management decision makers.

    With the current attitude, the NASA of old would have been shut down in its first couple of years for wasting taxpayer money. Luckily there was competition with the Soviets.
  • by GR1NCH ( 671035 ) on Friday October 22, 2004 @10:29AM (#10597842)
    I think this goes along with the saying: 'If you make something idiot-proof, someone will build a better idiot'. Sure, maybe they could have designed the accelerometers so that they couldn't be installed backwards. But then again, what else might have failed? I guess in the end it all comes down to economics. What does the cost-benefit analysis say? Is it better to keep checking and double-checking, or to just send it out as it is? Now, I can understand cost-benefit analysis is a little difficult when you are talking about a space probe. But chances are you could keep redesigning and rechecking a probe for 50 years and then something you never thought of would come up and all your plans would go to hell.

    Maybe we should just make lots of cheap, crappy probes and expect most to fail, instead of one really good, expensive one with the hope that it will succeed.
  • by woodsrunner ( 746751 ) on Friday October 22, 2004 @10:30AM (#10597850) Journal
    If you compare the advances in science and knowledge due to mistakes rather than deliberate acts, it might come out that everything is a mistake.

    Recently I took a class on AI (insemination, not intelligence) and apparently the two biggest breakthroughs by Dr. Polge, in preserving semen were due to mistakes. First, his lab mislabeled glycerol as fructose and they were able to find a good medium for suspension. Secondly, he blew off finishing freezing semen to go get a few pints and didn't make it back to the lab until the next day thus discovering that it was actually better to not freeze the stuff right away.

    Mistakes are some of the best parts of science and life in general. It's best to try to make more mistakes (i.e. take risks) than it is to try and always be right. (unless you are obsessive compulsive).
  • Re:Mark my words (Score:4, Insightful)

    by NonSequor ( 230139 ) on Friday October 22, 2004 @10:31AM (#10597856) Journal
    If you're expecting this to result from the development of human level AI I wouldn't bet on it. In order to solve problems not predicted by its creators it will have to make some leaps of intuition the way humans do when they solve problems. The ability to propose original solutions also introduces the possibility of error. An AI will also have to rely on inductive reasoning in some situations and there is no reason to believe that a computer can avoid making any false inductions. I suspect that human level AIs will be able to do a lot of things better than us, but they will have at least some of the same flaws we do.
  • by Wizzy Wig ( 618399 ) on Friday October 22, 2004 @10:32AM (#10597863)
    ...having people double check a project from the ground up will almost always find the problems...


    Then you double-check the checkers, and so on... that's the point of the article... humans will err... As Deming said: "You can't inspect quality into a process."

  • Human Factor (Score:4, Insightful)

    by xnot ( 824277 ) on Friday October 22, 2004 @10:35AM (#10597887)
    I think the biggest difficulty in large organizations is the lack of communication tools linking the right engineers together. It seems unfathomable that some of these mistakes were able to propagate through the entire engineering process without anybody catching them.

    Unless you consider the fact that in large organizations, the left hand typically has no clue what the right hand is doing. I work at Lockheed Martin, and I'm regularly involved in situations where one group makes an improvement that none of the other groups know about, changes and decisions are poorly documented (if at all) so nobody knows where the process is going, people make poor decisions for lack of proper procedures from management, teams are not co-located, and there is poor information about which people have the knowledge needed to solve a particular problem. Any number of things confuse the engineering process, to the detriment of the product, and most of them are caused by a lack of communication throughout the organization as a whole.

    This is a serious problem, and it needs to be acknowledged by the people in a position to make a difference.
  • by Control Group ( 105494 ) * on Friday October 22, 2004 @10:37AM (#10597913) Homepage
    No, it is true. It's the "almost always" in your statement that's the key. It's simple statistics, really. Assume that a well-trained, expert engineer has a 5% chance of making a material error. This implies that 5% of the things s/he designs have flaws.

    Now suppose this output is double-checked by another engineer, who also has a 5% chance of error. 95% of the first engineer's errors will be caught, but that still leaves a .25% chance of an error getting through both engineers.

    No matter what the percentages, no matter how many eyes are involved, the only way to guarantee perfection is to have someone with a zero percent chance of error...and the chances of that happening are zero percent. Any other numbers mean that mistakes will occur. Period.

    I remember reading a story somewhere about a commercial jetliner that took off with almost no fuel. There are plenty of people whose job it is to check that every plane has fuel... but each of them has some probability of forgetting. Chain enough "I forgots" together, and you have a plane taking off without gas. At the level of complexity we're dealing with in our attempts to throw darts at objects 10^7 kilometers away, it is guaranteed that mistakes will propagate all the way through the process.
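    The chained-odds argument above can be sketched in a few lines of Python (the 5% figures are the commenter's illustrative assumptions, not measured rates):

```python
def residual_error_rate(p_make: float, p_miss: float, reviewers: int) -> float:
    """Chance a flaw is introduced AND then missed by every reviewer,
    assuming the reviewers' mistakes are fully independent."""
    return p_make * (p_miss ** reviewers)

# One engineer errs 5% of the time; one independent checker misses 5% of errors.
print(residual_error_rate(0.05, 0.05, 1))  # ~0.0025, the 0.25% in the comment
# More reviewers shrink the odds, but never reach exactly zero.
print(residual_error_rate(0.05, 0.05, 3))
```

    The independence assumption is doing a lot of work here, as a reply below points out: correlated reviewers catch far fewer errors than this model predicts.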

  • Nasty Remark (Score:3, Insightful)

    by mathematician ( 14765 ) on Friday October 22, 2004 @10:40AM (#10597940) Homepage
    "Either you manage and design around it or fail. NASA management still often chooses the latter."

    I find this remark very unfair. There is a really nasty, snide attitude to it, like "we are perfect, so why can't you be?"

    Come on, guys. NASA is trying to do some really difficult and groundbreaking stuff here. Cut them some slack.
  • by _Sprocket_ ( 42527 ) on Friday October 22, 2004 @10:42AM (#10597954)


    This is hindsight at its best, and is the classic comment by bureaucrats who have no concept of what cutting-edge design is about. F1 race cars, racing sailboats, nuclear reactors - NO design is failsafe, and NO design is foolproof.


    But this isn't about design. It's about implementation. In each of the examples, the failure occurred because of incorrect assembly of key components.

    Having said that - there IS an issue of design brought up by the article: the design of a system should not allow for a catastrophic configuration. In several examples, failure occurred when sensors (accelerometers) were installed backwards. Those devices should have been designed with some sort of keying system that only allows installation in the intended orientation. Heck - one accelerometer's orientation could only be determined by x-raying the device!
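    The keying idea has a direct software analogue: make the wrong configuration impossible to express, or at least loudly rejected at assembly time rather than in flight. A minimal sketch (the class and names are invented for illustration, not taken from any real flight software):

```python
from enum import Enum

class Orientation(Enum):
    FORWARD = 1
    REVERSED = 2

class KeyedSocket:
    """Software analogue of a keyed connector: a part in the wrong
    orientation is rejected at assembly time instead of failing silently."""
    def install(self, orientation: Orientation) -> str:
        if orientation is not Orientation.FORWARD:
            raise ValueError("part does not fit: wrong orientation")
        return "installed"
```

    The design choice mirrors the mechanical key: the check runs where the mistake is cheap to catch, not where it is catastrophic.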
  • by onion_breath ( 453270 ) on Friday October 22, 2004 @10:43AM (#10597960) Homepage
    I love how journalists and others like to sit back and criticize these engineers' efforts. They are human, and they will do stupid things. Having been trained as a mechanical engineer (although I mostly do software engineering now), I have some idea of how many calculations have to be made to design even one aspect of a project. I couldn't imagine the complexity of such a system, trying to account for every scenario, making sure algorithms and processes work as planned for ONE mission. No second chances. That we have individuals willing to dedicate the mental effort to this cause at all is worthy of praise. These people have pride and passion in what they do, and I'm sure they will continue to do their best.

    For anyone wanting to yack about poor performance... put your money where your mouth is. I just get sick of all the constant nagging.
  • by gammygator ( 820041 ) on Friday October 22, 2004 @10:51AM (#10598034)
    Finding problems is a good thing but I've found that nobody likes to be told their baby is ugly.... and if they're far enough up the corporate food chain... good luck getting 'em to listen.
  • by orac2 ( 88688 ) on Friday October 22, 2004 @10:57AM (#10598101)
    This is hindsight at its best, and is the classic comment by bureaucrats who have no concept of what cutting-edge design is about.

    You only get to play the hindsight card the first time this kind of screw-up happens. If you actually read the article you'll see that Oberg (who isn't a bureaucrat but a 22-year veteran of mission control and one of the world's experts on the Russian space program) is indicting NASA for having a management structure that leads to technical amnesia: the same type of oversight failure keeps happening again and again.

    Oberg is not alone in this. The Columbia Accident Report despairingly noted the similarities between Columbia and Challenger: both accidents were caused by poor management, but what was worse with Columbia was that NASA had failed to really internalise the lessons of Challenger, or heed the warning flags about management and technical problems put up by countless internal and external reports.

    Sure, space is hard. But it's not helped by an organization that has institutionalised technical amnesia and abandoned many of its internal checks and balances (at least this was the case at the time of the Columbia report, maybe things have changed).

    And if you really want to compare against other agencies, NASA's astronaut body count does not compare favorably against the cosmonaut body count...

    Sadly, your post is a classic comment by Slashdotters who have no concept of what effective technical management of risky systems looks like. (Hint: not all cutting-edge designs get managed the same way. There's a difference between building racing sailboats and spaceships. This is detailed in the Columbia accident report. Read it and get a clue.)

  • You'd think so. (Score:3, Insightful)

    by Bill, Shooter of Bul ( 629286 ) on Friday October 22, 2004 @11:00AM (#10598130) Journal
    That's a very popular cliche. The fact is, with NASA's shrinking budgets, they don't have the resources to design around every potential failure. There's the old-school NASA that designed the Cassini probe, which has redundant systems and is properly designed and tested, and there's the new-school NASA that makes the cheap Mars probes. Just looking at the Mars probes you'll see why they have moved to this method. If you can make five fault-intolerant probes for the same cost as one fault-tolerant probe, and the odds are that two of the five will work, then it's a better idea to build the five crappy probes, as you'll probably get twice the science benefit. The problem comes once you start throwing in human lives. It's okay if an unmanned probe crashes, since it's only things, but if you apply the same logic to manned missions... you get the Columbia accident. It's not that NASA intentionally overlooked the problems because they expect people to die; it's just that the methodology from the unmanned flights has crept into their minds. At least that's my non-expert opinion.

    But I guess it sort of applies to your software analogy as well. There have been a few companies who have discovered that it's cheaper to have paying customers find the flaws in their software than to do any kind of formalized testing before release.
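    The five-cheap-probes arithmetic can be made explicit. The success rates below are rough illustrative assumptions matching the comment, not real mission statistics:

```python
# Five fault-intolerant probes at an assumed 40% success rate each,
# versus one fault-tolerant probe (assume 90% success) for the same budget.
cheap_count, cheap_success = 5, 0.4
expected_cheap = cheap_count * cheap_success   # expected working probes: 2.0
expected_robust = 1 * 0.9                      # expected working probes: 0.9

# Under these assumptions the cheap fleet returns over twice the science.
print(expected_cheap, expected_robust)
```

    The expected-value logic only holds when a single loss is tolerable, which is exactly why it breaks down for manned missions.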
  • by Halo- ( 175936 ) on Friday October 22, 2004 @11:01AM (#10598139)
    NASA's current difficulties arise from scattered teams that each check only their own parts, rather than fully qualified teams that go over the entire vehicle.

    I'm not sure I buy that completely. While it certainly would help to have a single SME go over the entire vehicle, I doubt such a person could exist and complete the checks in a reasonable amount of time. The guy who checks the computer code is probably not going to be an expert in metal fatigue, nor electrical engineering. Even if you could find some sort of uber-genius who had expert knowledge of every system, he or she would have to work serially. If they started at component "1" of 654224166 and went down the line in order, the checks they started with would be out of date by the time they finished.

  • Re:You'd think so. (Score:3, Insightful)

    by RAMMS+EIN ( 578166 ) on Friday October 22, 2004 @11:13AM (#10598266) Homepage Journal
    ``There have been a few companies who have discovered that its cheaper to have paying customers find the flaws in their software, rather than do any kind of formalized testing before release.''

    Not only that, but it's actually beneficial to produce and ship buggy software. Bugs have to be fixed, and who can fix them better than the people who wrote the code? So, it makes sense for programmers to leave flaws in their programs. Companies that ship flawed products can make customers pay for upgrades that also fix bugs, or get good karma by providing bugfixes for free. In the process they get publicity, and the world can see they're not sitting still and their products have not been abandoned.
  • Re:Mark my words (Score:3, Insightful)

    by rapcomp ( 770617 ) <rapcomp@?yahoo.com> on Friday October 22, 2004 @11:25AM (#10598374) Homepage
    Where do you plan to find competent humans?
  • by ishmalius ( 153450 ) on Friday October 22, 2004 @11:35AM (#10598482)
    During design and testing, Murphy is your best friend. Before the baby chick leaves the nest, you want everything that can possibly go wrong, to do so. You can address each of the failures encountered, and then move on to new opportunities for error. This is a mysterious process called "learning," which definitely has its good points.

    NASA does test everything. He didn't mention it in the article, but I would be almost certain that the accelerometers were tested, and passed the tests, but that the tests themselves were improper.

  • by gidds ( 56397 ) <slashdot@gidds . m e .uk> on Friday October 22, 2004 @11:36AM (#10598502) Homepage
    Assume that a well-trained, expert engineer has a 5% chance of making a material error. This implies that 5% of the things s/he designs have flaws.

    Now suppose this output is double-checked by another engineer, who also has a 5% chance of error. 95% of the first engineer's errors will be caught...

    That doesn't follow. It's only true if the two errors are completely independent, which is a very big 'if'. In practice, the chances are that some types of error are more likely than others, and that the processes/standards/ways of thinking which are common to both will also affect the types of errors they make. All of which makes it more likely that if one engineer has made a particular mistake, another engineer might make that same mistake as well. So the second engineer will catch less than 95% of the first engineer's errors -- maybe a lot less.

    Of course, things are far better when there's little or no commonality between the two engineers -- different companies, processes, methods, approaches, and cultures will all help to make their work independent, and help to reduce the errors that get missed. Anyone know if NASA does anything like this?

    I believe it's common practice in some mission-critical situations to use three different systems, each built from the ground up by three entirely separate groups of people, with nothing but the specification in common, for exactly this reason.
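    Such triple-redundant setups are typically combined with a majority vote across the independent implementations; here is a minimal sketch of the voting step (an illustration of the general idea, not any particular flight system's logic):

```python
from collections import Counter

def majority_vote(results):
    """Return the value at least two of the three independent systems
    agree on; refuse to proceed when all three disagree."""
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no quorum: fail safe")
    return value

print(majority_vote([42, 42, 17]))  # the single faulty system is outvoted
```

    The vote only helps if the three implementations fail independently, which is why the groups must share nothing but the specification.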

  • by tentimestwenty ( 693290 ) on Friday October 22, 2004 @11:36AM (#10598511)
    There might always be errors, but you can reduce them with many checks. The key is to have the checks done by someone who has an eye for potential problems. There is a particular skill set/personality that can foresee unknown problems better than, say, an engineer who is single-minded and focused. You can get a hundred experts to check the same work, but often it's the one guy who says "why is that wheel upside down?" who reveals a completely unanticipated problem.
  • by Anonymous Coward on Friday October 22, 2004 @11:36AM (#10598513)
    It is a difficult thing to design something to face failures. It requires a mind set vastly different than that of most "builders of things." Those folks tend to think in the positive: my creation does this, and this, and this, and ... This is true whether the thing being built is a program, a car, or a team of people.

    If you want to see this in action, find your favorite developer and ask the following: "What does your program do, and how does it do that?" Prepare for a long response :-)

    Then ask: "How does it break?" You will most likely get a blank look. You may get a list of things the program doesn't do (missing or removed features), or possibly a list of known bugs, but you will almost never get an answer detailing the failure modes of the program itself. That is, they will not be able to tell you what happens when various assumptions are wildly wrong.

    Answering those sorts of questions requires thinking in the negative (not necessarily negative thinking), which is an entirely different mode of thought. It's also much less pleasant. After all, considering the destruction of the beautiful thing you've built is not a psychologically easy task.
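    One concrete way to "think in the negative" is to turn a program's hidden assumptions into explicit, loud failures. A tiny illustrative sketch:

```python
def average(values):
    # Positive statement: returns the arithmetic mean of the values.
    # Negative statement, made explicit: an empty list violates the
    # function's core assumption, so fail loudly rather than let a
    # ZeroDivisionError surface from an unrelated-looking line.
    if not values:
        raise ValueError("assumption violated: need at least one value")
    return sum(values) / len(values)
```

    Enumerating what happens "when various assumptions are wildly wrong", as the comment puts it, is exactly what such guard clauses force the author to do.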

  • by _Sprocket_ ( 42527 ) on Friday October 22, 2004 @11:44AM (#10598636)


    In comparison, NASA developed and flew three X-15 prototypes with similar capabilities for a cost of $300 million in 60's dollars (which incidentally was considered a cheap program).


    You've got good points. But you're being unfair on this one. Even Rutan notes [thespacereview.com] that the X-15's capabilities far outstrip Spaceship One's. Also, the X-15 provided some of the basic building blocks in aeronautics and astronautics on which Spaceship One could be built. Furthermore, Spaceship One enjoyed numerous high-performance off-the-shelf materials that didn't exist in the X-15's time. Comparing the two provides some interesting historical perspective. But the two programs are apples and oranges.
  • Easy to answer... (Score:2, Insightful)

    by zogger ( 617870 ) on Friday October 22, 2004 @11:51AM (#10598722) Homepage Journal
    ...it's because humans have this weird deal with society. We wind up with the greediest, lamest, most megalomaniacal people as governmental/corporate "leaders". 999 out of a thousand are this way; it just happens, and we notice it.

    These people are quite *insane*. They may be brilliant, but still bonkers. They have the most power and money of everyone on the planet. They hire the smartest people they can find and research advanced weaponry. All governments spend a huge amount of time and money and resources on this. They hire the smartest scientists and engineers they can find for this task. Then they hire the people who psychologically and intellectually are the most prone to use the devices that those scientists and engineers create. These people are given more power than "ordinary" citizens; they are tasked with killing people and breaking people's things using these advanced machines. This weaponry, consisting of mechanical machines augmented with electrical and chemical advances, is *exactly* designed to "harm humans", and it DOES harm humans. It happens every day around the globe, by the thousands. Literally thousands of humans a day are killed, and many more horribly mutilated and injured. And the way the system has evolved, it is rigged so that the megalomaniacs always wind up "in charge", and every population has a certain percentage of "ask no questions" order followers.

    So, stuff happens: evil, wicked, nasty, horrible, screaming stuff. This leads to this "fear", which isn't in the least bit an irrational fear for anyone sane to have. It's because it's reality.

    Lately, we can read that they want to automate and robotize this even further, and to take these machines as far as they can push it, with near-unlimited budgets and millions of man-hours of advanced research. It is not a "tin foil hat" phenomenon for folks to notice that. We also have a verifiable track record showing that yes indeed, these megalomaniacs, their tame scientists and engineers, and their order followers screw up; we get what is called "unintended consequences" and "collateral damage", as if the intended consequences and planned-for damages weren't bad enough. So we ordinary humans all around the globe, who really have no beef with Joe over there, all get to have these "benefits"; we notice that we don't want those sorts of benefits, but there's not a thing we can do about it, because this advancing technology system is rigged in favor of those who like, enjoy, and profit from doing harm.

    You see, we DO have a lot of at least technically "competent people programming the machines". The problem is, the machines ARE designed to harm. And the system is set up to be self-perpetuating and self-advancing, and is based from the get-go on forced wealth transference, i.e. "theft", and it goes downhill from there into ever worse things.
  • by Control Group ( 105494 ) * on Friday October 22, 2004 @11:51AM (#10598724) Homepage
    I was being overly simplistic, admittedly, but I think the "model" (to put on airs) I used illustrated my point adequately: as long as there is a percent chance of an error being made at every step of the process, an error will eventually be made.

    Obviously, the trick is to minimize the odds, but you can't eliminate them.

  • by sphealey ( 2855 ) on Friday October 22, 2004 @12:10PM (#10598941)
    I'm still trying to figure out why the Apollo formula of contractors with Nasa oversight doesn't seem to work anymore.
    Two reasons. First, outsourcing requires more and better project managers and technical managers than insourcing. Many organizations learned this to their sorrow in the 1980s; many more are going to learn it around 2006.

    Second, the stable of competent contractors that existed in the 1940-1960 time frame is gone. North American, Grumman, McDonnell, dozens of others that could be named have been absorbed into 2-3 borg-like entities. The result is less competition, less choice, less innovation, few places for maverick employees to go, and in the end worse results from outsourcing.

    sPh

  • by DerekLyons ( 302214 ) <fairwater@gmaLISPil.com minus language> on Friday October 22, 2004 @12:15PM (#10598996) Homepage
    If you are going by sheer number of launches, body count, payload capacity, or cost effectiveness, the Russians have us beat hands down.
    Well, part of one out of four isn't bad. Let's examine these in detail, shall we?
    • Sheer number of launches - This is the only one the Russians 'beat' the US on, mostly because their hardware is unreliable and short-lived, thus requiring frequent replacement. So far as manned flights go, however, they've actually flown fewer. (87 Soyuz flights vs. 113 Shuttle flights alone.)
    • Body count - The Shuttle (alone) has carried to, and returned from, space nearly twice as many people as the entire Russian space program.
    • Payload capacity - Not counting vaporware like Energia, or boosters no longer in production like the Saturn V... we find there is actually little difference between currently available payload weights.
    • Cost effectiveness - Well, comparing apples to oranges invariably leads to odd results. Soyuz is mostly cost-effective because it has extremely low performance. (And because all of the R&D and infrastructure was paid for by somebody else.) The closest American program you can compare Soyuz to... is Gemini.
  • HOW IT HAPPENS (Score:4, Insightful)

    by LaCosaNostradamus ( 630659 ) <LaCosaNostradamus.mail@com> on Friday October 22, 2004 @01:28PM (#10599782) Journal
    1. Manager issues a stupid order.
    2. Subordinates obey order out of fear.
    3. Manager gains confidence that stupidity is a valid method.
    4. Stupidity gains an increasing foothold until a catastrophe occurs.
    OR
    1. Manager cuts another corner or cost.
    2. Nothing immediately bad happens as a result.
    3. Manager gains confidence that cutting is a valid method.
    4. The cuttings increase until a catastrophe occurs.
    Managers are among the most moronic of the "educated" Western class ... because, after all, they don't understand the trends I outlined above.
  • Re:Mark my words (Score:3, Insightful)

    by Tackhead ( 54550 ) on Friday October 22, 2004 @01:37PM (#10599866)
    > Why does everyone have this Crichtonesque fear? As long as competent humans program the machines, they will be made unable to harm humans.

    Funny. When I read the article, I had exactly the same sentiment, but for the opposite reason:

    "As long as humans build/program the machines, the machines will fail/crash before they can kill too many people" :)
