Space Science

Mars Failures: Bad luck or Bad Programs? 389

HobbySpacer writes "One European mission is on its way to Mars and two US landers will soon launch. They face tough odds for success. Of 34 Mars missions since the start of the space age, 20 have failed. This article looks at why Mars is so hard. It reports, for example, that a former manager on the Mars Pathfinder project believes that "Software is the number one problem". He says that since the mid-70s "software hasn't gone anywhere. There isn't a project that gets their software done."" Or maybe it has to do with being an incredible distance away, in an inhumane climate. Either or.

  • You know, 1/10th of something rather than 1/4. Damn engineers can't figure out the conversion between metric and standard!
  • by BlueTooth ( 102363 ) on Monday June 09, 2003 @08:39AM (#6149466) Homepage
    Before complaining at the lack of manned missions to mars any time soon.
  • ...or so the story goes. I'm sure we can make it to Mars with our current technology.

    I think it's hard to get to Mars because it's far away and it's in SPACE! It doesn't take a rocket scientist to figure that out! Well, on second thought....
    • by Niles_Stonne ( 105949 ) on Monday June 09, 2003 @08:47AM (#6149541) Homepage
      I think that is part of the difficulty...

      With 512 BYTES of ram you can literally look at the entire contents. You can be aware of every single bit on the system.

      Now, where we have gigabytes of ram, and even more other storage, it is simply impossible to sort through every bit. Thus errors roll in.

      I'm not sure what to do about it, but I see why there is difficulty.
      • by bigpat ( 158134 ) on Monday June 09, 2003 @09:11AM (#6149765)
        "gigabytes of ram"

        no, for instance the Mars Pathfinder spacecraft had "128 Mbyte mass memory" and used a R6000 computer. While the rover had "0.5 Mbyte RAM mass storage" The R6000 is much less powerful than the original pentium.

        http://mars.jpl.nasa.gov/MPF/mpf/fact_sheet.html#SCCHAR

        NASA computer technology has for the past decade or two been a few or more years behind the state of the art in consumer electronics. Largely because they have to put the electronics through more testing and only use chips that will withstand possible radiation with low power consumption. Plus add on the years of development of the spacecraft itself... means that your desktop probably (Anyone want to do the math?) has more computing power than all the deep space explorers ever launched, combined.

        • means that your desktop probably (Anyone want to do the math?) has more computing power than all the deep space explorers ever launched, combined.

          Yes, but can your computer recover from a triple memory failure? Can you rewire your computer remotely to fall back on a redundant system? Frankly I keep the covers off my case to keep my CPU from overheating.

          State of the art is not always measured in Gigahertz.

          • by Lord_Slepnir ( 585350 ) on Monday June 09, 2003 @10:52AM (#6150982) Journal
            Frankly I keep the covers off my case to keep my CPU from overheating.

            A bit of advice: Leave the covers on, but make sure that you have enough case fans to ensure that the CPU has a constant air current over it. I have the fan on the front of my box blow in and the fan on the back (plus the power supply) blow out. If you leave your case closed, the improved air flow will actually lower the temperature of your CPU and motherboard.

        • no, for instance the Mars Pathfinder spacecraft had "128 Mbyte mass memory" and used a R6000 computer.

          The grandparent post's point still stands. 128MB is one huge mass of program and data to debug. I know I wouldn't stake my reputation on a "bug free" multi-megabyte program--only a fool would.

          Remember, the true complexity of a program increases exponentially with the size of the program.

          This is why I will never trust Windows for anything more than a gaming platform (millions of lines of hastily-writte
    • by mcheu ( 646116 ) on Monday June 09, 2003 @08:54AM (#6149610)

      Thing is, space exploration isn't done with *current* technology. The computing technology used in a lot of aerospace applications is 20-30 years old. There are a number of reasons for this, but the ones I've heard of are:

      1. The projects are long-term, and have been in development for a lot of years. Especially when it comes to government projects. They can't just up and switch to the latest tech whenever it comes around, otherwise it will end up like DNF and never see the light of day.

      2. The engineers don't trust the latest and greatest. The technology isn't considered mature enough. All the bugs have been worked out in the older tech, so it's more robust, the engineers are more familiar with it, and more often than not, manufacturers have shrunk and simplified the designs significantly since introduction.

      It's more likely that you'd find an 8086 processor in the space shuttle than a Pentium 4 unless someone brings a laptop aboard. It wasn't all that long ago that NASA put ads on websites and geek magazines appealing for old 8086 processors for spare parts. I haven't heard anything since, so either they found a supplier, or they're too busy piecing together the Columbia.


      • It's nearly impossible to find space-rated, radiation-hardened components that are anywhere near 'cutting edge'. The smaller the process, the more likely the component will be damaged by radiation - that pretty much eliminates 'cutting edge' stuff and newly shrunk old stuff.

        It's really a shame that manufacturers can't easily produce space-rated components cheaply, and it's also a shame that the space-rated component market is not large enough to support that niche as a viable business.

    • by AndroidCat ( 229562 ) on Monday June 09, 2003 @09:06AM (#6149720) Homepage
      Yeah but... The Apollo 11 LEM computer crashed several times [wikipedia.org] during the landing.
    • Sure, but don't forget that those were manned missions. Perhaps that's what we need to think about with Mars...
    • How about this? We're launching fairly small, very complex probes, that aim to do a lot more than the moon missions in some respects...certainly the craft are responsible for accomplishing a lot more 'unsupervised'.

      With the moon missions, there were manned craft, and so every line of code had to be checked and rechecked--and hundreds of guys were on the ground watching everything that happened, twenty-four seven, until the astronauts were safely back on the ground.

      Now, windows for a Mars launch come mu

    • by Waffle Iron ( 339739 ) on Monday June 09, 2003 @11:04AM (#6151111)
      Even with only 20K or so of code, the apollo guidance computer software development nearly slipped the schedule of the entire moon program. This page [nasa.gov] on this very interesting site [nasa.gov] describes the software development.

      I haven't read the whole site in a while, but IIRC, it describes the typical problems with software: underscoping the problem (in the 60s, most people assumed that the computer hardware development would be the majority of the effort), code bloat (the computer required much more memory than originally planned), buggy production code, schedule slips, problems caused by cruft. When the project started, they just waded right in to coding with few tools and little awareness of the need for proper engineering practice.

      This particular case was made more difficult by the program loading procedure: the program ROM was made one bit at a time by hand threading magnetic cores onto tiny wires, then embedding it in a solid block of epoxy. The write-compile-debug cycle could be weeks. If bugs were discovered late in the schedule, the astronauts just had to work around them. The software developers did have mainframe-based simulators for development, though.

      With the gigabytes of space available for today's software, I'm surprised that any modern space projects get finished at all.

    • by EccentricAnomaly ( 451326 ) on Monday June 09, 2003 @11:09AM (#6151168) Homepage
      Just look at the rate of failure for early moon missions [inconstantmoon.com]

      It's a hard problem to send a probe to the Moon or Mars. Landing and aerocapture at Mars are dicey things.

  • I'm fairly certain it's sabotage on the part of the Martians.
    • No, it was clearly sabotage on the part of SCO. Their programmer deliberately added code to the Pathfinder source, disguising the fact that it was taken from the original English-system Unix codebase. The balance of the code was taken from the European-based Linux.
    • I don't think the Martians need any help - the bureaucrats at NASA are sabotaging it just fine.

      That said, how did this get modded "insightful"? What, exactly, is the insight? Maybe there should be a "+/-1, TinFoilHat" mod.

      • Re:sabatoge (Score:3, Funny)

        by EpsCylonB ( 307640 )
        What, exactly, is the insight? Maybe there should be a "+/-1, TinFoilHat" mod.

        I used to mock the whole tin foil hat idea, until I put one on. Once their signals stopped entering my brain I started to see things differently. If you have never actually tried a tin foil hat then you shouldn't laugh.
  • Its a shame (Score:2, Insightful)

    by Anonymous Coward

    because software is one of the only things that could and should be theoretically perfect

    maths (especially that based on 1 or 0) is either right or wrong; it seems to be only when humans get involved that things go wrong and mistakes happen
    • And it is finite as well, but I don't see anyone with a closed form solution to that either. Even with a very small, searchable code space for possible programs, it is not possible to completely characterize the program's behavior.

      Theoretically, all programs have latent bugs, unless they are too simple to do much.

    • Re:Its a shame (Score:2, Insightful)

      by OldAndSlow ( 528779 )
      Nonsense. Software is not math. Math is "for each program there exists (or could exist) a specification which makes the program correct." Not very useful.
      Software is human beings communicating with each other in ambiguous natural languages and then trying to convert what they think they understand into a hyper-specific computer language that a program (i.e. a compiler) will translate into machine code.
      The hard part is trying to eliminate all the killer misunderstandings. One of the early Geminis came down se
    • Funny. Of all of the things that went wrong mechanically with the shuttle, from engines that had to be tweaked beyond what a Rice-Boy would consider safe, to a protective housing made of glass, to strapping on 2 solid fuel boosters just to get the sucker off the ground, the software on the Space Shuttle worked well, and worked the first time.

      Part of it was the fact they had absolute geniuses working on the problem. Think of it: they designed a system in the late 1970's, tested it on the ground, and had it successfully fly for 20 years without a major "oopsie". Or rather, if a major "Oopsie" happened, they had ways around, over, or through it. They spent YEARS developing the flight software for the Shuttle.

      Software CAN be done right. It just has to be a priority.

    • Re:Its a shame (Score:3, Insightful)

      by pmz ( 462998 )
      Its a shame because software is one of the only things that could and should be theoretically perfect

      And theoretically prohibitively expensive.

      I have yet to meet someone who is genuinely willing to pay for software quality. They simply don't care or understand. Once the software reaches some minimum threshold of "working", the project gets cut off or put on some other tangent.
  • by lingqi ( 577227 ) on Monday June 09, 2003 @08:39AM (#6149477) Journal
    Of 34 Mars missions since the start of the space age, 20 have failed.

    I really hope this explains why there isn't a manned mission. =)

    • by jellomizer ( 103300 ) on Monday June 09, 2003 @09:30AM (#6149992)
      Perhaps it explains why there should be a manned mission. The main problem with exploring the unknown is that there are a lot of unknown variables out there, and computer technology is not always adaptable to all of them. This is why there is software failure and lost contact. Manned missions give some extra control of the mission and the ability to improvise new solutions for unknown problems, like fixing a broken part using another material that is available, or realigning the craft so it will maintain contact. The big problem with Mars is that it takes 20 minutes to send a signal for it to do something different remotely. A well-trained human will be able to make these decisions and carry out the new instructions in far less time (within seconds). If it weren't so expensive to do a manned mission to Mars, I am sure manned missions would have a much higher success rate.
  • by jkrise ( 535370 ) on Monday June 09, 2003 @08:40AM (#6149478) Journal
    That explains why it's so hard? :-)
  • by rosewood ( 99925 ) <.rosewood. .at. .chat.ru.> on Monday June 09, 2003 @08:41AM (#6149482) Homepage Journal
    I am with NASA on this one (almost always a good idea to stick with NASA). From when I remember of fubar'd mars missions, its been screw ups by the programers.

    Just as in the NFL when a receiver drops an easy pass and someone yells that he gets paid to catch passes like that, programers get PAID not to fuck things up.
    • its been screw ups by the programers

      And I can clearly tell you are not one considering you can not even spell the profession correctly. It's programmers, two 'm's.

      And your comparison between a receiver and a programmer is wrong. A receiver is gifted and talented and can catch the ball in many ways: in the gut, in the air, on their stomach (Antonio Freeman vs Minnesota on MNF), all this while fending off the defense. But at the end of the day it is still catching the ball.

      As a Programmer, I can not ju
    • Programmers (Score:5, Insightful)

      by Cujo ( 19106 ) on Monday June 09, 2003 @09:03AM (#6149687) Homepage Journal

      Yes, programmers have erred. To err is human; to allow errors to propagate into mission failures is a failure of systems engineering, and I think that is where the real blame lies. A lot of the problem is that spacecraft systems engineers often have a very amateurish grasp of software, if any at all.

      For example, on Mars Climate Orbiter, a junior programmer failed to properly understand the requirements. However, systems engineering:

      1. Failed to properly identify the thruster force data as a critical interface.
      2. Failed to demand proper, thorough and timely verification ON BOTH SIDES OF THE INTERFACE.
      3. Failed to make sure the requirements were properly understood by the implementers.
      4. Ignored or missed prima-facie evidence that the interface wasn't working (closely related to 1).
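One way to make an interface like that hard to get wrong is to stop passing bare numbers across it. The sketch below is only an illustration of the idea, not anything from the actual flight or ground software: the `Impulse` class, its constructors, and `accumulate_impulse` are hypothetical names. The conversion factor is the standard one (1 lbf = 4.4482216152605 N).

```python
from dataclasses import dataclass

# One pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Thruster impulse, always stored internally in newton-seconds (hypothetical type)."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value):
        return cls(float(value))

    @classmethod
    def from_pound_force_seconds(cls, value):
        # The unit conversion happens exactly once, at the boundary.
        return cls(float(value) * LBF_S_TO_N_S)


def accumulate_impulse(readings):
    """Sum thruster impulses; reject bare numbers whose unit is unknown."""
    total = 0.0
    for reading in readings:
        if not isinstance(reading, Impulse):
            raise TypeError("impulse readings must carry their unit, got %r" % (reading,))
        total += reading.newton_seconds
    return Impulse(total)
```

With this shape, a producer working in pound-force seconds has to say so when it builds the value, and a consumer that receives a naked float fails loudly on the first reading instead of quietly drifting off course.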
    • by Lord_Slepnir ( 585350 ) on Monday June 09, 2003 @09:05AM (#6149714) Journal
      When I can get paid $4 million a year just to show up to work every day for 4 hours, 6 months a year, get paid another $5 million just to say that I use XXX brand compiler (or reclining chair), get paid by a university to attend there just because they need a new star Perl Debugger (the last one graduated last year, and the backup got carpal tunnel), then I'll stop messing things up like that.
  • Martian aim is getting more accurate by the hour, isn't it obvious!

    Please stop denying it, the great Anthropologist and Engineer Erich von Däniken [daniken.com] has been writing about this for decades. Wake up and smell the Martians.

    hehe
  • by emo boy ( 586277 ) <hoffman_brian@ba h . com> on Monday June 09, 2003 @08:43AM (#6149503) Homepage
    The motivation for achieving Mars is much less than the moon. The reason for this is because there was extreme speculation that the Moon was made of green cheese. Mars is already assumed to have red dust on it. For a society that gorges itself on Big Macs and Cheese Fries this is hardly a worthwhile goal. And as a programmer myself I understand the need to work on projects that will benefit the community as a whole, not on one that will invade a dirt planet.

    • Mars is already assumed to have red dust on it.

      Thus the interest by the Chinese! Their 'Moon' program is nothing but a decoy. They plan to change Mars from 'red' to Red.
  • by Xentax ( 201517 ) on Monday June 09, 2003 @08:44AM (#6149511)
    ...is "garbage in, garbage out" right? One of the mottos anyway.

    If you underestimate the resources you need to do software right, of course you'll have problems -- either getting it done on time, or getting the quality to the level it needs to be (or both).

    That problem is hardly unique to the space programs. And of course, it would be a little tricky trying to upload a software patch to a hunk of solar-powered metal a few million miles away.

    I wonder how much NASA et al. really tap the resources they should be tapping -- I mean, there ARE areas of industry where mission-critical or life-critical software has been developed and deployed for some time now. Maybe it's just a question of getting the right kind of experience in-house...

    Xentax

    • Or, as is often the case, "Data In, Garbage Out."

      And what the users want is "Garbage In, Data Out."

    • by marauder404 ( 553310 ) <marauder404 @ y a h o o.com> on Monday June 09, 2003 @10:23AM (#6150633)
      NASA software engineering is actually quite remarkable -- at least for the shuttle program. I read a paper once about how they actually break many of the paradigms of writing code that so many programmers are accustomed to so that the code is absolutely perfect. Deadlines are met well ahead of schedule and nobody works late. They're not allowed to work late, because the pressure or fatigue could cause an error to occur. The chief software engineer personally signs off that the code won't hurt anyone. Every line of code is fully documented. The code is virtually written twice by two separate teams. This article actually details some of it at great length: They Write the Right Stuff [fastcompany.com]. I don't disagree with you that maybe the way they write software needs to be reviewed, but it seems that they already go a long way to ensure that happens.
  • ... on the last two trips to Mars that failed. Communication and incompetence on Earth were the problem. Exactly how do scientists screw up and get the unit system wrong?
  • by theophilus00 ( 469290 ) on Monday June 09, 2003 @08:46AM (#6149528)
    "The limiting factor in Mars sample return is mass," he said. "Direct return [of samples] from Mars right now exceeds the cost envelope and performance envelope of the available launch vehicles and upper stages."

    The first samples returned should have mystical properties ascribed to them and then sold on EBay. This should generate enough revenue to substantially increase the size of the "cost envelope"...

    cheers

    (I got engaged last night) =)
  • by bigattichouse ( 527527 ) on Monday June 09, 2003 @08:46AM (#6149532) Homepage
    Make it simple. The original software used (like in the moonshots) was Very simple control loops... no OS, no overhead.. just a simple program doing a VERY simple job over and over. Read stick, fire retros as appropriate.
    Also, solid state, however big and bulky, isn't susceptible to the radiation that many mega-tiny chips are... by writing (and testing) the software in the simplest manner, and building a VERY specific piece of hardware out of solid state components.. and lots of unit testing... you're more likely to get there.
    For the same reason the 486 was the only space-rated Intel processor for quite a long time (not sure if that's still true).

    I'd rather go on "slower" simpler hardware that does a very specific job... and you can repair with a soldering iron.
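For what it's worth, the "read stick, fire retros" kind of loop described here can be sketched in a few lines. This is only a toy illustration of that style, with made-up names (`control_step`, `run_loop`) and an assumed deadband value; it is not a reconstruction of any real flight code.

```python
def control_step(stick_deflection, deadband=0.05):
    """Map one stick reading in [-1.0, 1.0] to a thruster command.

    Returns +1 (fire one way), -1 (fire the other way), or 0 (coast).
    The deadband keeps sensor noise near centre from firing anything.
    """
    if abs(stick_deflection) <= deadband:
        return 0
    return 1 if stick_deflection > 0 else -1


def run_loop(stick_samples, deadband=0.05):
    """The whole 'program': read the stick, decide, repeat."""
    return [control_step(sample, deadband) for sample in stick_samples]
```

The appeal is that every possible state of a loop like this can be enumerated and tested by hand.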

    • Recall that on the first manned moon landing, the software screwed up and the lander would have been lost if the pilot hadn't taken manual control at the last minute!

    • That has some merit to it, but keep in mind, the rover that landed a few years ago had a LOT of off-the-shelf parts in it. I doubt NASA or its contractors are going to build 1970's-era hardware (less tiny chips) for a 2003 mission. Heck, they don't even do it to replace parts on the shuttle. They buy them from ebay and other warehouses of old parts.
    • Well, simple logic like that has caused problems too. The reason one of the recent mars landers toasted was because it mistook the thump from launching the parachute to be making touchdown. With this knowledge, it decided it was safe to deactivate the landing thruster.

      A more intricate, complex system may have provided the lander with the intellect to figure out that it was going to be grey paste on the red earth if it did that (as opposed to what happens to humans who fall from the sky).
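It does not take much extra logic to guard against the failure mode described above, where a single jolt is read as touchdown. A minimal sketch, assuming a contact sensor plus an independent altitude estimate are both available; the function name, the thresholds, and the sample format are all hypothetical:

```python
def touchdown_confirmed(samples, max_altitude_m=1.0, required_consecutive=3):
    """Decide whether the lander has really touched down.

    samples: iterable of (contact_sensor_tripped, altitude_m) tuples, oldest first.
    Touchdown is accepted only when the contact sensor stays tripped for
    several consecutive samples AND the altitude estimate agrees that the
    ground is close. A one-sample jolt high above the surface is ignored.
    """
    streak = 0
    for contact, altitude_m in samples:
        if contact and altitude_m <= max_altitude_m:
            streak += 1
            if streak >= required_consecutive:
                return True
        else:
            streak = 0  # any disagreement resets the count
    return False
```

A transient spike at 1500 m never builds a streak, so the engine stays on; three agreeing samples near the surface are needed before it is cut.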
    • by Anonymous Coward
      as simple as

      10 REM my Martian exploration program
      20 GOTO MARS
    • I'd rather go on "slower" simpler hardware that does a very specific job... and you can repair with a soldering iron.

      The problem is that all of the Mars shots we've launched so far--and all of the failures referred to--have been unmanned probes. So the question remains: how do you plan to get the guy with the soldering iron up there?

    • Also, solid state, however big and bulky, isn't susceptible to the radiation that many mega-tiny chips are...

      Actually, the current microchips are inherently rad-hard (radiation resistant). This wasn't the case in the past. It's something about the size of the features being small and also shallow, so that not much charge is deposited as a charged particle passes through. 0.25 and 0.18 microns are apparently especially good. However, as feature size continues to go down, things will get worse again.

      You

    • by mykepredko ( 40154 ) on Monday June 09, 2003 @11:21AM (#6151303) Homepage
      The technology used in the Apollo Guidance Computers (GCs) was more a function of what their manufacturer (IBM) was comfortable with than what was available at the time. The GCs used IBM "Solid Logic Technology" (SLT), which was primarily a Resistor-Transistor Logic (RTL) technology in which discrete resistors and transistors were bonded to ceramic carriers which were then soldered to PCBs using traditional pin through hole manufacturing techniques. At the time, this was IBM's primary method of manufacturing computers (they did not start using integrated circuits in their computers until the early 1970s). IBM never gave up on SLT until the late 1980s.

      The GCs' read-only memory consisted of a series of peg-boards into which the code was wire wrapped (by hand). There were 74,000 16-bit instructions that could be programmed in this way. There was 4k of iron-core memory in the computer. There were two GCs used in Apollo. The CSM one was responsible for leaving earth orbit, mid-course correction(s), entering lunar orbit, etc. The LM GC controlled descent and ascent as well as autopilot functions for lunar orbit docking. The computers ran the programs for these maneuvers from ROM, but using astronaut input parameters using the "noun-verb" input methodology.

      The software was actually very sophisticated and did not consist of simple control loops - joystick feedback was actually processed to ensure commands kept the spacecraft within limits. The most important parameter was keeping the antennae pointed at the Earth.

      AFAIK, there are no space-qualified Intel built '486s. There are space-qualified computer systems with '486s in them, which may seem like semantics, but these systems typically employed multiple '486s, with bus operations and data continually compared to look for differences indicating upsets. This is a point that always confuses people because at one point IBM/NASA indicated the AP101 Block IIs had the same amount of power as a '486 - this seems to be misinterpreted as the AP101s have '486s built into them.
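      The "continually compared" part is the interesting bit. Here is a rough sketch of the idea, with the assumption (mine, not a description of any particular flight system) that the outputs of the redundant processors go through a simple majority vote; the `vote` function is a hypothetical name:

```python
from collections import Counter


def vote(outputs):
    """Compare the outputs of redundant processors.

    Returns (majority_value, indices_of_dissenting_units).
    Raises RuntimeError when no value is held by a strict majority,
    because then there is no trustworthy answer to act on.
    """
    if not outputs:
        raise ValueError("need at least one output to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority among redundant outputs")
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters
```

      A single upset in one unit shows up as a dissenter that can be flagged or restarted, while the majority answer is still used.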

      Half a lifetime ago, I helped with some hardware failure analysis for the IBM Orbiter Computer Systems Group (It was an intermittently failing memory board on STS-4) and I have to say that they were the most impressive software group that I have ever been associated with. They learned their skills with the Apollo CSM/LM GCs and Apollo Instrumentation Ring - you just don't make mistakes when the instructions are wire wrapped. The software engineers that worked on the shuttle software didn't have a problem with going with the (relatively) complex AP101s (originally designed for the B-1). Going from wire wrapped ROM to battery backed RAM was seen as a good thing, but it did not mean that the software development process changed in any way.

      I'm trying to remember if there were two or three support binders for each module of software in which the requirements were clearly defined, the science and reference information provided, all calculations/constants defined to support the software binder. Coding is always the last thing that is done and only if the support binders are complete and signed off. This process is very expensive, but the software produced is essentially perfect (I believe that there has been one non-safety of flight software error in shuttle history and several hundred thousand lines of code). Complexity isn't the issue.

      I think the issue is, is there a software development methodology/process that fits in with NASA's "smaller, better, cheaper" and produces the same quality as the Shuttle/Apollo?

      myke
  • by Anonymous Coward on Monday June 09, 2003 @08:46AM (#6149533)
    What we need is a bit of competition between nations. Let's face it, without Kennedy wanting to 'beat the Russians' to the moon, there would have been no Apollo programme. Nowadays we throw unmanned stuff around and expect it to perform flawlessly with (comparatively) little monetary backing and none of the incentives of older space programmes.

    However just throwing money at the problem isn't going to solve it, I'd suggest throwing away the rulebook and starting over for unmanned systems, better craft, less of the multimillion dollar single units and more cheaper devices that can carry out multiple landings at once.

    For once, it might be worth imagining a Beowulf cluster of those things - because with many cheaper devices, the mission would most likely have a modicum of success.
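The arithmetic behind "many cheaper devices" is easy to sketch, under the big assumption that each probe fails independently of the others. The per-probe success probability below is a made-up figure for illustration, and the function name is hypothetical:

```python
def p_at_least_one_success(p_single, n_probes):
    """Probability that at least one of n independent probes succeeds.

    Assumes independent failures. A shared design flaw (say, the same
    software bug on every probe) would break that assumption.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if n_probes < 0:
        raise ValueError("n_probes must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_probes


# Illustrative only: three coin-flip probes together beat one 80% probe.
three_cheap = p_at_least_one_success(0.5, 3)
```

Three probes that each have only a 50% chance give an 87.5% chance that at least one works, which is why a swarm can beat a single expensive unit as long as the failures really are independent.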
  • Methodolgies (Score:2, Insightful)

    by barcodez ( 580516 )
    It's interesting that he blames the problems of software on external pressures such as management hassling of coders but there is no mention of project delivery methodology. I would be interested to know what methods they use. Are they using continuous integration techniques, unit testing, agile methodologies, XP? These things in my experience are crucial to low bug software. Also who are they employing to write their software? Rocket scientists or coders. In my experience domain expertise counts for ver
    • Re:Methodolgies (Score:3, Informative)

      by Jon Peterson ( 1443 )
      Hmmm. I think you'll find the methodologies of the commercial world count for nothing when it comes to space-craft. XP indeed......

      http://www.fastcompany.com/online/06/writestuff.html

      That's what they do, and I'm glad I don't.

      And as for domain expertise not counting for much, that may be true for some domains, but sure as hell is not for mine (medical informatics).

    • It's pretty common knowledge that NASA invests a lot of time and effort in testing. If I remember correctly they have their own language and everything must run without a single glitch on their simulators for hundreds of hours before it is accepted.

      XP is a methodology more suited to commercial environments, particularly web based where the requirements are often in a state of flux. I would not expect to see NASA telling their coders twice a week that mission requirements have changed and they now need X inst
    • Re:Methodolgies (Score:4, Insightful)

      by drooling-dog ( 189103 ) on Monday June 09, 2003 @09:23AM (#6149889)
      In my experience domain expertise counts for very little when it comes to writing rock solid code.

      Or, at least when it comes to writing rock-solid code that reliably does the wrong thing...

  • Mistakes (Score:5, Interesting)

    by Restil ( 31903 ) on Monday June 09, 2003 @08:48AM (#6149555) Homepage
    Of course, the stupid metric conversion problem only accounted for one of the failures, but it's indicative of a larger problem. There's obviously a shortcoming in quality control and verification if such an obvious mistake could be overlooked. What less obvious problems are we missing altogether? Most of the failures occurred during the orbital entry phase, during which time they shut off the transmitter, and therefore don't have up to the second data on the reason for the failure. Sure, they likely wouldn't have much of an opportunity to save the mission, but they would have a good chance at figuring out what the problem actually was so it could be fixed the next time around. Instead, we're left to guess. Cost concerns are always mentioned as the reason, but how much have we "saved" really? An extra million $$ to keep the transmitter on would probably have paid for itself a long time ago.

    -Restil

    • Re:Mistakes (Score:3, Insightful)

      by varjag ( 415848 )
      Most of the failures occurred during the orbital entry phase, during which time they shut off the transmitter, and therefore don't have up to the second data on the reason for the failure.

      That's why some folks at NASA develop [nasa.gov] more sophisticated control software that can take care of failures. The RAX experiment on the DS1 probe successfully demonstrated that this approach is viable.
      However, at the moment the project is suffering a major rewrite in C++, notorious for its 'safety', for reasons having very little to do with engineer
  • First, most of the launches go wrong, so they get improved. Second, the spacecraft hardware goes wrong, so that gets redesigned. Third, the software goes wrong, so more work is needed there.

    It looks as if the testing and debugging starts at the beginning and works through the mission. I suppose this will eventually work, but it seems to be an expensive way to do it.

  • almost /.dotted (Score:2, Informative)

    by lethalwp ( 583503 )
    1st page

    Why is Mars so hard?
    by Jeff Foust
    Monday, June 2, 2003

    This June will see the beginning of the most ambitious exploration of the Red Planet in a quarter-century. If all goes well, three launch vehicles, one Soyuz and two Delta, will lift off this month, placing four spacecraft on trajectories that will bring them to Mars by this December and January. Those spacecraft include the first European Mars orbiter, Mars Express; Beagle 2, the British lander built with a mix of public and private f
  • by fname ( 199759 ) on Monday June 09, 2003 @08:52AM (#6149589) Journal
    Well, there are a lot of reasons things go wrong. Landing a spacecraft on a different planet is inherently difficult, and when you read about how MER-1 and MER-2 will land, it's amazing that they can work at all.

    The flip side is this: after Mars Observer spectacularly failed in 1993 ("Martians"), NASA started to go with faster, cheaper, better. The idea was, instead of a single $1 billion mission every 5 years with a 90% chance of success, why not 2 $200 million missions every two years, with an 80% chance of success. Everyone loves this idea when it works (Pathfinder), but when a cheap spacecraft fails, the public doesn't care if it cost $10 million or $10 billion, all we know is that NASA is wasting money.
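    To put rough numbers on that trade, here is a back-of-the-envelope sketch using only the figures quoted above. The ten-year horizon and the function names are assumptions for illustration, and "success" is treated as all-or-nothing per mission:

```python
def expected_successes(missions, p_success):
    """Expected number of successful missions (linearity of expectation)."""
    return missions * p_success


def cost_per_expected_success(total_cost_musd, missions, p_success):
    """Dollars (in millions) spent per expected successful mission."""
    return total_cost_musd / expected_successes(missions, p_success)


# Assumed horizon: one decade.
# Flagship plan: one $1B mission every 5 years at 90% -> 2 missions, $2,000M.
flagship = dict(total_cost_musd=2 * 1000, missions=2, p_success=0.9)
# Cheap plan: two $200M missions every 2 years at 80% -> 10 missions, $2,000M.
cheap = dict(total_cost_musd=10 * 200, missions=10, p_success=0.8)

flagship_cost = cost_per_expected_success(**flagship)
cheap_cost = cost_per_expected_success(**cheap)
```

    For the same $2 billion per decade, the flagship plan expects about 1.8 successes and the cheap plan about 8, so the cost per expected success drops from roughly $1.1 billion to $250 million. The catch is that the cheap plan also expects about 2 visible failures per decade against the flagship's 0.2.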

    So, the answer is, NASA has hit some bad luck. But the idea of faster, cheaper, better is ultimately a cost-effective one, so if we can solve these software problems (I mean, can't someone independently design a landing simulator?), and NASA can get 80-90%, we'll be getting a lot more science for the dollar. But NASA-haters will always have some missions to point to as a "waste" of money, and try to cut funding as it's mismanaged; other space junkies will insst that anything under 100% is unacceptble, and costs should double to move from 80% to 100%. I don't which attitude is more damaging.

    NASA has a "good" track record since Observer, unfortunately, the highest profile missions have generally failed. If MER-1, and MER-2 are both succesful, and SIRTF flies this summer, then everyone should get off of NASA unmanned program's back for a while.
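
    The trade-off in this comment can be put into rough numbers. What follows is only a sketch: it uses the illustrative figures quoted above ($1 billion at 90% versus $200 million at 80%), the function name is made up for the example, and the model ignores everything except price and success probability.

```python
def cost_per_expected_success(cost_musd: float, p_success: float) -> float:
    """Average spend, in millions of dollars, per mission that succeeds."""
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return cost_musd / p_success


# Figures taken from the comment, not from any NASA budget document.
flagship = cost_per_expected_success(1000, 0.90)
cheap = cost_per_expected_success(200, 0.80)

# Chance that at least one of two independent cheap missions succeeds.
p_at_least_one = 1 - (1 - 0.80) ** 2

print(f"flagship: ${flagship:.0f}M per success")
print(f"cheap:    ${cheap:.0f}M per success")
print(f"two cheap missions, at least one works: {p_at_least_one:.2f}")
```

    Under these assumptions the cheap missions cost far less per success, even though any single one of them is more likely to fail, which is the visibility problem the comment describes.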
  • They keep shooting our probes down. We should really look at it as a success that we got the ones there that we did.

    I mean notice, they never land near the face or the pyramids!

    (apologies to the author Robert Doherty for stealing the idea from his Area 51 series)
  • by Kjella ( 173770 ) on Monday June 09, 2003 @08:58AM (#6149644) Homepage
    Seriously. Space is tough, as the US has experienced with both Challenger and Columbia, and those only had to reach orbit. Going even further away in space is tougher. So much can go wrong, and so little can be done to correct it. Certainly a few blunders like the feet-to-meter bug are huge, but they try. I'm not so sure any private corporation that had been asked to do the same would fare any better. They are pushing limits, where you fail and (hopefully) learn from your mistakes.

    Which is why we should continue to try. Giving up, saying "space travel is just too costly and risky" is a big cop-out. If we could send people to a different celestial body (the moon) in 1969 with the equivalent of a pocket calculator but not now, what does that say of our technology? Or sociology? Sure you could take the narrow-minded approach and say "and what does that bring us? The ability to jump from rock to rock in our solar system?" If so, you might as well ask why people decided to go to the poles (just ice) or whatever. You're still missing the point.

    Kjella
  • In my years at NASA Goddard I saw a dysfunctional management operate in ignorance of reality.

    There was much praise of the employee who "went the extra mile", "put in long hours" and "served the customer" (that applied to contractor employees). There was also very little thought paid to the consequences of those practices.

    What's the first thing to go when you're tired? It's not your body -- it's your mind. That's right -- if you're staying at work until you're feeling tired, you're making mistakes that need to be corrected later. The tireder you are, the more mistakes. The tireder you are, the less you can actually do.

    I witnessed people who wore their exhaustion as a badge of honor. And, when they got into management, insist that others emulate their bad example. The result that I saw was people who should have been kept out of management becoming increasingly dominant. This was accentuated by the "faster, better, cheaper" ideology promulgated by former NASA administrator Goldin. This ideology was used to get rid of more experienced (and thus costly) people who were aware of the consequences of trying to squeeze more work out of fewer people.

    It could take a long time for NASA to recover from this culture. The failure of projects in the past few years, the crash of Columbia could be turning points -- or they could be used by incompetents to justify even more dysfunctional behavior.

  • Perhaps one of the reasons that the software isn't getting done on time is that much of the system is written from the ground up. Perhaps it would be better to design a common, open source spacecraft platform. So many of the basic tasks that spacecraft software must perform are essentially identical. The main differences for critical spacecraft systems would be the hardware. If a general purpose OS and spacecraft toolkit were designed, then the main things that would have to be written from scratch for dif
  • Yeah, Amnesty International's been ridin' those damn Martians for years about their climate. It's oppressive!
  • I'm not surprised. (Score:5, Interesting)

    by dnnrly ( 120163 ) on Monday June 09, 2003 @09:01AM (#6149671)
    I've seen the code for some MAJOR blue chip companies and I really do wonder how these people stay in business with the rubbish that they put out. For example some of the code drops from our clients don't even compile! The reason for all the crap is that it's very easy to cut corners without it being very obvious immediately. Typically, the first thing that gets stopped when things are getting tight (either time or money) is documentation, quickly followed by testing. Next it's individual features, removed from the requirements 1 by 1.

    Since software engineering is still a 'black art' as far as most traditional engineers and project managers are concerned, there isn't the real intuition/understanding of when things are starting to look bad. Without looking at code AND knowing something about it, you won't stand a chance 'intuiting' whether or not things are going well.

    Writing software is an expensive business in both time and money. It's also a very young business without the same 'discipline of implementation' as other areas. Until the process matures and people realise that doing it on the cheap gives you cheap software, things aren't going to change and Mars probes are going to continue to produce craters.
  • Anyone who's been listening to Coast to Coast AM (first hosted by Art Bell, now hosted by George Noory) may have heard of Richard C. Hoagland, a fairly frequent guest of that show.

    Hoagland thinks many of the Mars missions--including the failed European/Russian Mars 96 mission--were deliberately sabotaged by various space agency officials that want to prevent people from finding out that Mars used to not only have life, but intelligent life on that planet. You should read Hoagland's book The Monument of Mar
  • by foxtrot ( 14140 ) on Monday June 09, 2003 @09:05AM (#6149709)
    Space Exploration isn't easy.

    Look at the Space Shuttle. The space shuttle has never had a catastrophic computer failure-- but every line of code on that truck has survived review by a group of programmers. They've examined it, line by line, multiple times, in order to ensure that it's exactly right, because the cost of failure is 7 astronauts and a multimillion dollar orbiter.

    The new Mars programs, however, are part of the streamlined "do it on the cheap" NASA. NASA put the Mars Rover down using mostly off-the-shelf and open-source software and a small amount of home-brew stuff. No matter how good open source software gets, it still hasn't undergone the level of review that the Space Shuttle code has seen. No matter how popular an off-the-shelf package is, it's not cost-effective for the manufacturer to give it that sort of treatment. NASA can't afford to do that level of code review because that costs them the ability to do some other program.

    NASA is simply trying to do more with less in the unmanned launches, and the cost of that is we need to expect some failures. These failures are unfortunately very visible...

    -JDF
  • by SuperDuG ( 134989 ) <[be] [at] [eclec.tk]> on Monday June 09, 2003 @09:08AM (#6149739) Homepage Journal
    Place sensitive computerized equipment on top of massive explosive materials. Ignite materials causing massive controlled explosion forcing upward and mixed with the pull of gravity causing somewhere in the ballpark of 9 G's of force pulling down every part of the sensitive computerized equipment. Then when all is said and done with the explosion, have another explosion in a vacuum of the coldest and most uninhabitable spot in the entire universe.

    Then after 3 months you are then shot into a planet and stopped by a parachute and then some air bags. The entire time literally thrown into the surface.

    And all this with the safety and security, of the lowest bidder.

    I dunno, you tell ME why these missions have a high failure rate. Could it be there are no humans on board, and therefore not as much care is taken to ensure the safe delivery of these machines? Could it be the fact that they are designed not to go to Mars, but to go to Mars as cheaply as possible? Could it be that no one really has a whole lot of information so a lot about Mars is (pun intended) hit or miss?

  • It is no wonder we cannot get probes to Mars if we have yet to perfect our less sophisticated devices here on Earth. I'm using the seriously over hyped Mac OSX and have an ever increasing list of bugs and flaws in it along with the various applications I run. And I understand that my friends using Windows have similar experiences. (I cannot speak for Linux.) Either way, I have concluded that the reason for the unreliability of most software (OS or app) is because engineers generally (not all!) lack the mind
  • This little story might amuse you...

    One of my friends is (more or less) a rocket-scientist (or likes to think he is :)

    He is currently doing a study on what it would take to launch a bunch of Kiwi's into space/orbit
    (don't ask) using existing, off the shelf technology. It's part of his physics degree.

    (IF he finishes his study I might see if we can get it linked on /. - he was noseying through a book about nuclear missile guidance systems the other day :)

    Anyway, asked him about a mission to Mars the other
  • Quoth Hemos: Or maybe it has to do with being an incredible distance, on an inhumane climate. Either or.

    I have to really disagree with this. NASA is used to dealing with alien climates and terrain and astronomical distances. NASA is also used to dealing with problems. They have some of the best problem solvers out there, and when something goes wrong, they tend to pinpoint why. When NASA says A, B, and C are the causes of failure, I believe them. When NASA cannot figure out why something went wrong, I worry.

    What I'm trying to say is, distance and inhuman conditions shouldn't have that much of an effect on how well a probe works. We built Voyagers I and II, didn't we? They worked even better than expected. And they encountered climates and conditions which make Mars look easy.

    NASA has dealt with so many varying circumstances and climates over the years, and been so blunt about their mistakes, I find it hard to believe that they would blame the failures of an entire class of missions on something "easy." And yes, blaming failures on software is an easy way out; how many times have you heard someone say "Oh! It must be the software!" when something doesn't go as expected?

    Now, I know this guy doesn't speak for NASA as a whole, but as a NASA trained administrator, and the head of some very large projects, I'm willing to take his opinions at face value. If he says it looks like software has really been a cause of failure, who am I to laugh at his expertise and belittle his explanations? I might not like his explanation, but I buy it.

  • by Idarubicin ( 579475 ) on Monday June 09, 2003 @09:14AM (#6149799) Journal
    Did anybody else notice today's witty quotation at the bottom of the page? Does this answer the question?

    Never test for an error condition you don't know how to handle. -- Steinbach

  • Software is Hard (Score:5, Insightful)

    by Teckla ( 630646 ) on Monday June 09, 2003 @09:19AM (#6149831)
    Most PHB's haven't figured it out yet: SOFTWARE IS HARD. It's amazingly complicated. It's also notoriously hard to come up with realistic estimates.

    PHB's also haven't figured out that developers aren't interchangeable widgets. If you know C, it doesn't mean you'll be immediately productive in Korn shell scripting, and vice-versa.

    PHB's also haven't figured out that experience is key. There are exceptions, but generally speaking, a young hotshot isn't going to be as productive as an experienced professional. Sure, the young hotshot might get v1.0 done first, but it'll be buggy, unreliable, unscalable, hard to maintain, etc.

    The "problem with software" is almost entirely a management issue, imho.

    -Teckla
    • by CyberGarp ( 242942 ) <Shawn.Garbett@org> on Monday June 09, 2003 @09:45AM (#6150172) Homepage

      PHB's also haven't figured out that developers aren't interchangeable widgets. If you know C, it doesn't mean you'll be immediately productive in Korn shell scripting, and vice-versa.

      I think this statement is true, but only because of the failure of education (or lack thereof). A good software analyst is trained to think about the concepts, not the language. When I was a senior, we had a class where every project was a new language. One of the professors summed it up, "Any monkey can learn a programming language by reading a book. An analyst will know what he's doing, no matter the language." It's all too sad that most employers hire based on language experience, and not successful software engineering practices.

      The "problem with software" is almost entirely a management issue, imho.

      For many reasons. Proper software engineering is understood, but it is not popular. The results of a Cleanroom Engineering project have been well documented. Why isn't it popular? It doesn't have a fun sounding name and it's tedious to do correctly.

  • by MartyC ( 85307 ) on Monday June 09, 2003 @09:25AM (#6149927) Homepage
    According to this page [ucar.edu] only 3 of 26 missions to Venus have been total failures. When you consider that Venus is a much more hostile environment than Mars then you have to conclude that either Mars is just plain unlucky or mission planners are getting something wrong.
    • Unfortunately, that page is incomplete and misleading, as it only mentions the probes that actually got near Venus. For example, the page lists Mariner 2, but not Mariner 1. Mariner 1 went off course due to a software error resulting from a missing hyphen [matcore.com]. Venera 1, though in the list, suffered a communications failure and was a complete failure. Also failing was Sputnik 7, whose 4th stage didn't ignite. Sputnik 23 and 24 never made it from Earth orbit. Sputnik 25's 3rd stage blew up the entire craft.
  • Software (Score:4, Insightful)

    by hackus ( 159037 ) on Monday June 09, 2003 @09:48AM (#6150204) Homepage
    I think the primary problem is that the technology to build and design probes changes too quickly, and affects design.

    I always thought that there should be a way to build a probe's navigation and propulsion systems in a standardized way so that avionics software wouldn't need to change that much.

    Sort of a standardized platform if you will for doing solar system exploration.

    This platform would consist of a number of parts that would not change, and could be reusable in a number of different configurations for building a probe, depending on what its job was.

    Cameras, photometers, spectrometers, and power sources could all be packaged in the same way depending on the probe's job.

    Every probe that NASA launches is always customized and built around cost and included packages.

    I am not so sure that is the best way to go about it as you have to reinvent all the software to manage the probe every time you build one.

    Probes should be cheap, produced in high volume, (thousands) and interchangeable.

    With a standardized approach, failure rates should come down a bit and costs should be reduced.

    -Hack
  • My space failure (Score:3, Interesting)

    by TheSync ( 5291 ) on Monday June 09, 2003 @10:23AM (#6150631) Journal
    Heh, I was a part of a space failure [umd.edu] myself. We were using pretty much off-the-shelf equipment, but it passed NASA spec shake and thermal testing. What probably did it in was radiation...in low earth orbit we figured there wouldn't be much risk of radiation problems.

    If we were to do it again, we probably would have had some kind of radiation-resistant reset system, because building the whole thing in rad-hard would be very expensive (our budget was $1500 plus donated equipment!) But having a few rad-hard devices to reset the box in case of a crash would probably have been affordable.

    About 100 amateur radio operators contacted our payload, and relayed their GPS coordinates to others using amateur packet radio. At the same time, the GPS unit on board the Spartan satellite transmitted its position to listeners on the ground as well. But had it not crashed after about 17 hours, it is possible that several hundred other amateur radio operators would have used it.
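
    The "reset the box in case of a crash" idea above is essentially a watchdog timer. Here is a toy software model of one, offered as a sketch only: the class, its method names and the 5-second timeout are all hypothetical, and nothing here reflects how the actual payload was built.

```python
class Watchdog:
    """Toy model of a hardware watchdog timer (hypothetical design)."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_kick = 0.0
        self.resets = 0

    def kick(self, now: float) -> None:
        """Called by healthy software to prove it is still running."""
        self.last_kick = now

    def poll(self, now: float) -> bool:
        """Return True, and count a reset, if the software has gone silent."""
        if now - self.last_kick > self.timeout_s:
            self.resets += 1
            self.last_kick = now  # the reboot restarts the countdown
            return True
        return False


wd = Watchdog(timeout_s=5.0)
reset_times = []
for t in range(21):
    if t <= 8:  # software runs normally for 8 s, then hangs for good
        wd.kick(float(t))
    if wd.poll(float(t)):
        reset_times.append(t)

print(reset_times)
```

    In this simulation the software never recovers, so the watchdog keeps firing every time the timeout elapses. On real hardware only the small timer circuit would need to be radiation-hardened, which is the cost argument made above.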
  • by Noehre ( 16438 ) on Monday June 09, 2003 @10:24AM (#6150640)
    Venus, like the woman she is, is a real bitch and a half.

    Thick sulfuric acid atmosphere?

    Gigantic storms?

    Temperatures that will melt aluminium?

    Ahh, I need to stop. I'm getting flashbacks of my ex-gf.
  • and the Viking landed. Dad points out that the budget for the Viking was in the neighborhood of 1 billion dollars, and that was when a Mustang Mach 1 cost just over 4 grand. The space program doesn't have the money now to do the missions the right way, which is unfortunate... the developments of NASA when they had tons of money were numerous and wonderful (i.e. Tang!)
  • by stinky wizzleteats ( 552063 ) on Monday June 09, 2003 @10:42AM (#6150849) Homepage Journal

    Before we continue to crucify programmers, we need to remember how hard it is to really get to Mars, from a purely spacefaring perspective.

    From my experiences flying to Mars in Orbiter [orbitersim.com] space flight simulator (FREE!), several problems become apparent:

    Mars is a fantastically difficult target to reach for two main reasons. It has very little gravity, and very little atmosphere.

    If you shoot for something big, like Jupiter, you find that it is hard to miss. Its gravity well is so massive that navigational errors en route are relatively insignificant. Mars doesn't help you very much in this regard. An Earth to Mars flight has to be dead on.

    When you get there, you are likely going to want to use the atmosphere to do at least part of the braking maneuver to get into Mars orbit (as most modern probes do). The problem is that Mars has a very thin atmosphere. Think about the sheet of paper analogies with Earth re-entry. Earth's atmosphere goes MUCH farther into space than does Mars'. You have to get dangerously close to the surface (within 50 miles) to effectively aerobrake using Mars' atmosphere. So with Mars, you are more talking about a near-ephemeral gossamer thin 1 cell thick membrane you have to hit the edge of rather than a nice, thick piece of paper.
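
    One rough way to put numbers on the "gravity well" comparison above is the Laplace sphere of influence, r = a (m / M_sun)^(2/5), the region where a planet's pull dominates the Sun's. This is a sketch that uses rounded textbook masses and orbit sizes; choosing the sphere of influence as the yardstick is an assumption of this example, not something the simulator reports.

```python
M_SUN_KG = 1.989e30  # rounded solar mass


def soi_radius_km(semi_major_axis_km: float, planet_mass_kg: float) -> float:
    """Laplace sphere-of-influence radius: a * (m / M_sun) ** (2/5)."""
    return semi_major_axis_km * (planet_mass_kg / M_SUN_KG) ** 0.4


# Rounded orbital radii (km) and masses (kg).
mars = soi_radius_km(2.279e8, 6.417e23)
jupiter = soi_radius_km(7.785e8, 1.898e27)

print(f"Mars:    {mars:.3g} km")
print(f"Jupiter: {jupiter:.3g} km")
print(f"Jupiter's is about {jupiter / mars:.0f} times wider")
```

    With these inputs Mars comes out at a bit under 0.6 million km and Jupiter at roughly 48 million km, so the target you have to hit really is far smaller.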

  • Failure (Score:3, Insightful)

    by DrinkDr.Pepper ( 620053 ) on Monday June 09, 2003 @10:49AM (#6150955)
    We like to prey on these simple glitches only because it is poetic to do so. Saying the MPL failed because a programmer failed to initialize a variable sounds much more interesting and is much easier for a reporter to remember than saying MPL failed because a programmer failed to initialize a variable, which determined how close to the planet the retro-rockets would turn off, and that this was observed in the testing laboratory, but the test data was not analyzed until after the crash.
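
    The class of bug described above, stale state that is never reset before it is trusted, is easy to show in miniature. This is a purely hypothetical toy and not the MPL flight code: the function name, the altitudes and the "spurious pulse" are invented for illustration.

```python
def engine_cutoff_altitude(clear_flag_when_arming: bool) -> int:
    """Return the altitude (m) at which this toy lander shuts its engines off."""
    touchdown_flag = False
    armed = False
    for altitude_m in range(1500, -1, -10):
        if altitude_m == 1500:
            touchdown_flag = True       # spurious early pulse sets the flag
        if altitude_m == 40:
            armed = True                # touchdown sensing is switched on
            if clear_flag_when_arming:
                touchdown_flag = False  # the initialisation step that matters
        if altitude_m == 0:
            touchdown_flag = True       # genuine touchdown
        if armed and touchdown_flag:
            return altitude_m           # engines shut off here
    return 0


print(engine_cutoff_altitude(clear_flag_when_arming=False))
print(engine_cutoff_altitude(clear_flag_when_arming=True))
```

    Without the one-line reset the toy lander cuts its engines while still 40 m up; with it, cutoff happens on the ground. A single missing assignment is all that separates the two cases, which is why the short version of the story is the one that gets repeated.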
  • by mikerich ( 120257 ) on Monday June 09, 2003 @11:03AM (#6151096)
    Most notably with the Soviet Union's dreadful record of getting spacecraft to Mars. A good number of the craft listed as failures actually never got away from Earth.

    Take their early record, before Mars 1 got to Mars, they had had a series of attempts. Two, known to the West as Mars1960 A and B reached Earth orbit then disintegrated.

    Mars1962 A exploded in orbit at the height of the Cuban Missile Crisis - briefly causing a panic with the Americans thinking a missile attack was underway. Fortunately the computers soon told them that doomsday had been averted.

    Next, was a partial success - Mars 1. Which smashed the record for deep-space communications with Earth across a distance of 106 million kilometres. Unfortunately it failed just before reaching Mars.

    Mars1962 B exploded in Earth orbit and didn't appear in the Soviet record.

    November 1964 saw the launch of Zond 2, a highly advanced probe using ion thrusters to perform stabilisation and orientation tasks. It may have also been the first probe to carry a lander. It died a long and lingering death before sweeping past Mars at only 1400 km altitude. (By this time the US had got their first Mars probe to the planet in working order, Mariner 4 took 22 pictures of the planet from 10 000 km. (Its sister ship, Mariner 3 had failed en-route)).

    Neither side went to Mars in the next launch window, but 1969 was a busy year. Three attempts for the Soviet Union, including at least one lander. Mars 1969A exploded in flight as did Mars 1969B. Mars 1969C was removed from the pad after cracks developed in the relatively new Proton rocket design. (Cracking in the Proton was also a major reason for the failure of the Soviet Union to send a manned mission around the Moon during 1969). The US had a twin success with Mariners 6 and 7 flying past Mars.

    On to 1971 and a pair of launches for the US, Mariner 8 ended up in the Atlantic, Mariner 9 went on to become one of the most successful missions ever and the first probe to orbit Mars. For the Soviets - mixed results again. Their first mission reached Earth orbit, but went no further and was named Kosmos 419. But then both Mars 2 and 3 left Earth orbit. They each comprised a lander and an orbiter. The two craft jettisoned their landers before entering Martian orbit - just as the planet entered an intense dust storm with raging winds and almost total blackout.

    Mars 2's lander was apparently DOA, it remained silent and does not appear to have returned any data. It was however the first craft to hit (not land on) Mars. Mars 3's lander was more successful. It entered the atmosphere, deployed parachutes and landed on rockets. It deployed its antenna and began to transmit the first picture from the Martian surface. Sadly, just 20 seconds later the transmission stopped. The Soviets said that the lander's parachutes had been caught by the storm and pulled it over.

    Mars 2 and Mars 3 orbiters remained on-line and performed experiments on the Martian atmosphere and took photos of the surface. So I would call both missions a partial success and Mars 3 almost a triumph.

    The next window was 1973 and the Soviets planned no less than 4 missions to Mars. Mars 4 and Mars 5 would be orbital missions, studying the planet much like Mariner 9, but also serving as telecoms relays for the Mars 6 and Mars 7 heavy landers.

    Incredibly, bearing in mind the past track record of the Soviets, all four missions reached Mars in working order. Then everything went wrong. Mars 4's main engine failed and the probe did not enter orbit, it relayed images of the planet as it swept past into solar orbit. Mars 5 was next and was the only unqualified success of the year; it was the first craft to return colour images of Mars.

    The two landers then arrived, Mars 7 first, it deployed the lander, but an attitude problem meant that the lander actually missed the planet entirely! Mars 6 was more lucky, the probe entered the Martian atmosphere, took readings all the way down and went dead ab

  • by TFloore ( 27278 ) on Monday June 09, 2003 @11:07AM (#6151150)
    I don't usually comment on typos, mostly because I make so many myself. (Pot, kettle, etc.)

    But in the article:
    "Faliures are simply due to human error, which is avoidable," said Spear.

    That was just too perfect.
  • by confused one ( 671304 ) on Monday June 09, 2003 @11:47AM (#6151536)
    We've all heard of the "faster, better, cheaper" game NASA's been playing lately.

    Here's the problem as I see it: As software and hardware have become more complicated, there's a need to increase testing. Instead, in order to meet NASA's new budgetary requirements, funding in general, and specifically for testing, has gone down. So, it's not possible to completely test all of the hardware AND software, as it should be.

    As an analogy: If we were talking about commercial airliners; these probes would never be certified to fly.

    I'm not putting all the blame on NASA here; although, it is apparent to me that they need to start reporting what it's actually going to cost. Having said that, Congress is equally complicit; they need to come to the realization that it's expensive to do work outside the atmosphere (they apparently don't understand this...)

  • by Phil Karn ( 14620 ) <karn@@@ka9q...net> on Monday June 09, 2003 @12:02PM (#6151738) Homepage
    I think one of the factors contributing to the poor Mars success rate is orbital mechanics. The launch window to Mars opens for only a month or so every two years. This is the longest interval between window openings for launches from Earth to any other planet; windows to the other planets open at roughly yearly intervals or less. Since missing the launch window means waiting another two years, this undoubtedly creates enormous schedule pressures on any team preparing a spacecraft for launch to Mars.
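
    The two-year spacing mentioned above falls out of the synodic period, 1/S = 1/T_inner - 1/T_outer, which gives the time between repeats of the same Earth-planet alignment. A minimal sketch, assuming rounded sidereal periods in days; the function name is made up for this example.

```python
def synodic_period_days(t_inner: float, t_outer: float) -> float:
    """Time between successive identical alignments of two planets."""
    return 1.0 / (1.0 / t_inner - 1.0 / t_outer)


EARTH_DAYS = 365.256  # sidereal year
periods = {"Venus": 224.701, "Mars": 686.980, "Jupiter": 4332.59}

synodic = {}
for name, t in periods.items():
    inner, outer = sorted((EARTH_DAYS, t))
    synodic[name] = synodic_period_days(inner, outer)
    print(f"{name}: {synodic[name]:.0f} days")
```

    With these inputs Mars comes out near 780 days, a little over two years and the longest of the three, while Jupiter's alignment repeats after roughly 400 days.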
