Houston, We Have a Software Problem

Posted by chrisd
from the nasa.sourceforge.net dept.
An anonymous reader writes "The computer system that launches the Space Shuttle is an old, but important, computer system. It is built from mid-'70s technology and features SSI chips like 7400s...which are getting hard to find. It has 64k of memory and no room to repair any software bugs. NASA started the CLCS project in 1996, which uses state-of-the-art computer languages, OO methodologies, and hardware. Everything that you could actually hire people off the street for. However, NASA is in a budget crunch with the Space Station cost overruns. It is looking to trim costs to keep the Space Station going. There are stories about CLCS getting cancelled here and these guys say it's already cancelled."
This discussion has been archived. No new comments can be posted.

  • by Semi-Psychic Nathan (563684) on Sunday September 08, 2002 @07:11PM (#4217671) Journal
    But I thought 64k should be enough for anybody...
  • the future? (Score:2, Insightful)

    by brondsem (553348)
    And what plans do they have to keep this from happening again in a decade?

    Sorry if the article answers this, I can't get to it.
    • From http://www.nasawatch.com/ksc/09.04.02.clcs.html:
      The Checkout and Launch Control System (CLCS) was created differently than LPS. It provides more safety and more operator visibility into launch problems before they occur. It is expandable and capable of being upgraded. It is designed to accomplish not only today's Shuttle launches, but also provides a launch capability for any future vehicle. [emphasis added]
      CLCS would apparently be flexible enough to control anything NASA needs for several decades.
  • by null-und-eins (162254) on Sunday September 08, 2002 @07:12PM (#4217683) Homepage
    Given today's hardware, why can't you just simulate the old system if finding parts for repair becomes a problem? You would just run your old software on the simulated machine.
    • I would think that requires rebuilding the whole thing anyway. So it might not actually improve it.

      Also you have to ensure that the simulator has zero bugs, which means simulating the bugs in the original equipment that their code depends on.

      Writing a perfect software simulation of hardware is IMO a job just as hard as rewriting the original code.

      It's not like they have millions and millions of lines of code; the original ROM must have been less than 64k or so. They just have to rewrite the code in a language that is more maintainable, which machine code is not.
    • by rodgerd (402) on Sunday September 08, 2002 @07:28PM (#4217761) Homepage
      Auditing the emulator and the host OS would be a problem - the code they've currently got has a very low rate of bugs, and has been extensively audited. NASA knows everything from the hardware up, exactly what the failure rate is and so forth.

      Now, imagine you take modern commodity hardware (which changes periodically - look at how often Intel silently release new steppings of their CPUs). You're not going to have a guarantee of consistency there. You're going to have to boot an OS off it - and even the simplest RTOSes are still much, much bigger than the whole platform currently. Then you need an emulator. Then you need the system. And the only problem you've solved with all that work is the unavailability of the old hardware - you still have an old machine language on a tiny platform which can't be easily extended for new functionality.
      • by WasterDave (20047) <davep@ze d k e p .com> on Sunday September 08, 2002 @08:06PM (#4217923)
        This is a very pertinent point that appears to have been lost on the initiators (and now burger flippers) of the replacement-launch-thingy project.

        What they have, right there, is one spectacularly reliable piece of software. I suspect it's significantly more bug free than even the microcode in a modern processor, let alone the companion chips, bios, operating system, and virtual machine for some god awful p-code language (not that I'm naming names here).

        The question that should have been asked is "how can we make a sustainable process for making extremely reliable control computers?". How to go about cutting custom silicon, tiny OSes etc. How to save the happy taxpayer hundreds of millions of dollars by reselling these services to people making nuclear power stations, heart pacemakers etc. instead of going shopping for big Sun boxes.

        Oh well, reality strikes again.

        Dave
      • by Jucius Maximus (229128) <zyrbmf5j4x@sn k m a i l.com> on Sunday September 08, 2002 @09:03PM (#4218088) Homepage Journal
        "Now, imagine you take modern commodity hardware (which changes periodically - look at how often Intel silently release new steppings of their CPUs). You're not going to have a guarantee of consistency there. You're going to have to boot an OS off it - and even the simplest RTOSes are still much, much bigger than the whole platform currently. Then you need an emulator. Then you need the system. And the only problem you've solved with all that work is the unavailability of the old hardware - you still have an old machine language on a tiny platform which can't be easily extended for new functionality."

        Might I suggest using FPGAs [vcc.com] to emulate the old system's hardware so the software doesn't have to be thrown out?

        Assuming that circuit layouts are available for these old chips, it would be a piece of cake to emulate them in VHDL (a hardware description language) because they are simple compared to today's integrated circuits. Once the chip descriptions are written in VHDL, it would be relatively easy to 'port' the hardware over to a new FPGA if the old one dies or whatever. Then it would not be necessary to truly port or re-code any of the currently working code, and it would be much easier to fix bugs and extend it because you don't have the memory and speed limitations of the old system.
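To make the "comparatively simple" point concrete: a 7400 is four independent 2-input NAND gates, so its logic fits in a few lines. The sketch below is a purely behavioral model written in Python rather than VHDL, and it ignores timing, fan-out, and voltage levels, which are exactly the things a real FPGA port would have to requalify. The function names are made up for illustration.

```python
def nand(a: bool, b: bool) -> bool:
    """One 2-input NAND gate: the output is low only when both inputs are high."""
    return not (a and b)


def quad_nand_7400(inputs):
    """Behavioral model of a 7400: four independent 2-input NAND gates.

    `inputs` is a sequence of four (a, b) pairs, one per gate.
    Timing, fan-out and electrical levels are deliberately not modeled.
    """
    if len(inputs) != 4:
        raise ValueError("a 7400 has exactly four gates")
    return [nand(a, b) for a, b in inputs]


def xor_from_nands(a: bool, b: bool) -> bool:
    """XOR wired up from four NAND gates, i.e. from a single 7400 package."""
    n1 = nand(a, b)
    n2 = nand(a, n1)
    n3 = nand(b, n1)
    return nand(n2, n3)
```

Getting the truth table right is the easy part; matching the propagation delays the original board layout silently depends on is where the real work would be.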

      • Surely with slightly more advanced, but far more available, and stable and simple technologies (programmable gate arrays, etc), they could do a next generation (well, .5 generations ahead) version of this.

        The 7400's and such are just interface hardware; that logic, as well as replacement for the 64k ram, etc., could all be put on a single reliable chip (no I386, no heavy OS, no emulator).

        There's got to be an only-slightly-more-complicated and nearly-as-reliable option that comes from not being too ambitious with the latest tech.
      • Or use old, but solidly reliable, thoroughly-tested and completely documented CPUs from Ye Olde Days, like the 6809. Or even the 6502.

        When dealing with shuttles and the like, K.I.S.S. should be the mandate. Ain't nuttin' complicated about a 6809 mobo, yet it can be coupled with an MMU to provide upwards of 1Mb of memory, and there are excellent realtime, multitasking OSes available... with source, IIRC (OS-9, from Microware; and I think there's a QNX for it, too. Plus [ooooh, he stretches his memory...] Flex-09?).
    • There comes a time in every product's lifetime when it's time to start over, and I believe this is the time. In the 70s, we knew significantly less about good coding practices than we do today. GOTO was first becoming considered harmful. Procedural programming was on the rise. Object-oriented as a paradigm was beginning to take root.

      With these considerations in mind, clearly simulating an old computer is a very backwards idea. Bug-for-bug compatibility [tuxedo.org] is not a positive effect.

      Same as bug-compatible, with the additional implication that much tedious effort went into ensuring that each (known) bug was replicated.
      • by rodgerd (402)
        Given LISP and (IIRC) Smalltalk both existed in the 70s, the world wasn't as primitive as you make out.

        Besides, the use of modern programming buzzwords implemented by college kids sounds like the principal problem with this project...
      • by io333 (574963) on Sunday September 08, 2002 @07:54PM (#4217873)
        There comes a time in every product's lifetime when it's time to start over.

        Exactly. And that includes the shuttle. It has never lived up to what it was envisioned to be and it is only going to become more costly and more failure prone in the future as every bit of hardware on that pig is already showing signs of fatigue.

        There are many launch systems that cost far less per pound to throw things into orbit. The reasons we still have those monstrosities flying are political only, not technological or scientific.

        Sure this is flamebait. (Gosh, getting rid of the old karma system is so LIBERATING!) But if we can discuss how some little bits of hardware in the shuttle are past their time, why can't we discuss the big bit?
    • Remember that whatever machine you use to simulate it in needs the special connections and circuitry to interact with other parts of the ship. You can implement the software fine but how do you connect the engine, flappers, and sensors to it?

    • by perfects (598301)
      Given today's hardware, why can't you just simulate the old system

      You can't just buy a system from Dell and put it into the Space Shuttle. You can't use a Pentium, a modern hard drive, Linux, Windows, or Open Source anything.

      As far as the hardware goes, everything mission-critical that goes aboard the Shuttle has to be ruggedized against incredible vibration, tested a thousand different ways to make sure that it can't be affected by exposure to vacuum/heat/cold/radiation/cosmic rays/etc., tested another thousand ways to make sure it doesn't interfere with other critical Shuttle systems... and on and on.

      And a bug in the newly written software could cause not only the death of several astronauts, but potentially the loss of a Shuttle, a launch facility, and the ISS. Would you, under any circumstances, put your life, five other lives, and billions of dollars in the hands of software that you found in an Open Source project?

      On your desk a "Fatal Error" isn't, really. But 60 miles up?
      • by GiMP (10923)
        I believe the discussion is about a computer that is based on the ground.. even if not, I think you fail to realize that NASA has been using Linux in space for a while.. PC104 boards with flash (solid state) memory running Linux.
  • by Ed Avis (5917) <ed@membled.com> on Sunday September 08, 2002 @07:19PM (#4217714) Homepage
    At some point it might be cheaper to give up on computers and just pilot the Shuttle by hand.
  • This is a common problem in big projects. The time it takes to design a system and then actually implement that system is so great that by the time the system is complete, the hardware used to make that system is 'obsolete.' You can't just add more memory and speed, because then you'd have to go through and make sure that everything still works perfectly, and that would take so long as to make the current hardware 'obsolete.' The real problem here is public hype. You don't need 4 GHz and 40GB of memory to program the space shuttle, but if the public finds out that NASA only uses 64k, they will think NASA is behind the times, even though 64K is enough for the system. Of course, the space shuttle is already considered obsolete by some, and new systems are being created, so don't fret much over this.

    Stephen
    • Re:Common Problem (Score:2, Interesting)

      by cheeto (128748)
      Actually, the old AP101 computers may have had 64k of memory (I don't recall). We upgraded those bad boys a long time ago to AP101S which have a whopping 256k. Who could ask for anything more.

      FYI: That extra bump in memory allowed us to store the entire Entry program in upper memory so that in the event of a Trans-Atlantic abort, we wouldn't have to wait 20 seconds for it to load from the mass memory.
    • The time it takes to design a system and then actually implement that system is so great that by the time the system is complete, the hardware used to make that system is 'obsolete.'

      Which is why serious software engineering is done on platforms like the SPARC, where you can guarantee that later CPUs can run earlier code, or on IBM operating systems where everything is a virtual machine anyway.
  • 7400s hard to find? (Score:5, Informative)

    by Istealmymusic (573079) on Sunday September 08, 2002 @07:27PM (#4217755) Homepage Journal
    I don't know about everyone else, but when I was a kid I got a Radio Shack 300-in-1 electronic project kit for my birthday which came with a dozen or so 7400 chips. When I plugged one in backwards I just went down to my local Radio Shack [radioshack.com] and picked up a new 74LS00, which they had plenty of in stock all the time.

    Certainly the 7400 series as a whole is still widespread and used in hobbyist kits, and I'm not that old. Maybe the original 7400 is becoming obsolete, being replaced with the 74LS (low-power Schottky) or CMOS chips? If so, it shouldn't be too difficult to replace the TTL logic with CMOS logic, given a few adjustments to voltage levels, or they could use chips that are compatible with both TTL and CMOS logic [cjb.net].

    Of course, the 5400 series SSIs (small-scale integrated circuits) are preferred over the 7400s for industrial purposes, and as a plus they are completely backwards compatible. Why isn't NASA using those?

    • You can cause a lot of problems by replacing a part in a working system with the manufacturer's new and improved part. Often the new part has faster outputs, which can change your PCB layouts from working to marginal.

      Given the above, I still don't see why they would not reimplement the whole thing in a slightly newer logic family and requalify it.

    • by mikewas (119762) <wascher&gmail,com> on Sunday September 08, 2002 @07:55PM (#4217878) Homepage

      The 54 series parts were like the 74 series, but in a hermetically sealed case, 100% tested over a wider temperature range, and burned in to remove infant failures. For this application they used space qualified components. These are the same as 54 series parts, but with more stringent tests, and the chips are also evaluated for radiation resistance. Any change in the design or production process and the 54 & space qualified chips must be requalified. What can happen is that a chip is produced to be functionally the same, but using smaller geometries, and now is more susceptible to ESD and radiation.

      CMOS chips, because of their high impedances, are notorious for ESD and rad sensitivity so they won't do.

      With the reduction in military, aerospace, and space spending many manufacturers have dropped the 54 series and space qualified components. They haven't made any attempts to add replacements in their product lines.

      When a part is dropped, the manufacturer usually informs the industry of their intent. You're given a date & price for a final order. The theory is that you can buy a lifetime supply of these parts. Industry isn't likely to buy any more than it needs to complete existing contracts plus a few spares: there's no guarantee that you'll get any more contracts to build items requiring these parts, so these purchases will cut into your profits. Government procurement may buy additional components, but lacks the funding to really buy large quantities.

      An opportunity is presented, and it will be taken advantage of. A distributor might buy some additional parts -- since the distributor has several customers buying a particular part from him, his risk of being stuck with an unsellable component is small.

      After the final production run, the chip manufacturers will sell the documentation, tooling, and rights to make a chip. There are small manufacturers who buy these, as well as the out of date machinery to produce these parts. They can then make small production runs, sometimes under a hundred components, for a price. In addition, they might buy untested dice or wafers from the last production run. The untested & unpackaged components are very cheap, so it's more affordable & less risky to buy and store these than the completed components.

      So is it still possible to get the parts needed? Yes -- at a price!

      • When I plugged one in backwards I just went down to my local Radio Shack and picked up a new 74LS00
      Dunno about the Shuttle, but I assume my experience applies. I used to write autopilot & autostabiliser software for helicopters. They used 80286 & 68000 CPUs, which have started to become more difficult to find. Not because there are no 286's or 68K's out there, but because there aren't so many 286's and 68K's available that are certified for flight.
      • "there aren't so many 286's and 68K's available that are certified for flight."

        How much of that is bureaucracy at work, and how much is technical?

        • How much of that is bureaucracy at work, and how much is technical?

          Remind me not to fly on any plane that you worked on the avionics for. There are times when overwhelming paranoia is an asset. When your Playstation crashes you can always start a new game. When your main engines get told to hardover on launch because of a failed chip, you die.

        • it's a good question. not sure, to be honest. i was told that they were supposed to survive more severe conditions - heat, humidity, radiation etc, no idea if that's true though. i seem to remember that their entire history is documented so they can guarantee they've not been mishandled by some intern who hadn't been on the anti-static course. So much of it could be paperwork, but it's still a burden that has a cost, and most definitely rules out the use of second hand CPUs.
    • I once asked the electronics guru at the university satellite lab where I used to work why we were launching a 386 when we could have stuck a more modern processor (like a StrongARM or something) on the main board. He pointed out that older style chips were preferable because the gates and interconnects were all bigger. A bigger gate was less likely to get triggered or flipped (if it were in a register or something) by a stray particle of cosmic radiation. Low tech chips were easier to certify for space use because of this.

      I'm sure we could have gotten faster chips rated for space, but we were on a tight budget, so the 386 was it. :)

  • by Boss, Pointy Haired (537010) on Sunday September 08, 2002 @07:35PM (#4217788)
    What?

    "shuttle_launcher_0_1"

    Excellent. That'll save a few dollars. What's the development status?

    "1 - Planning, sir"

    Ah.
  • by NeuroManson (214835) on Sunday September 08, 2002 @07:37PM (#4217800) Homepage
    (1) Print up 50,000 numbered authenticity certificates...

    (2) Break down the old mainframes until you have roughly 50,000 pieces...

    (3) Sell it on eBay (or other auction sites) as space memorabilia, mention that the computers the parts came from were responsible for guiding the Apollo missions to the moon, etc and so on... The machines are SO obsolete now that the only way they could pose a security risk is by sending them back in time...

    (4) Profit!

    (5) Buy a nice little beowulf cluster, hire 20 Linux geeks and feed each of them $50 in dew and pizza in exchange for setting up the system...

    (6) Use remaining funds to pay the Russian space agency to have a little "airlock accident" for that Nsync guy...
    • by Anonymous DWord (466154) on Sunday September 08, 2002 @08:02PM (#4217904) Homepage
      In re: point number 6, I know you'll be sad to hear that 'N Sync guy's flight is no longer on. There was an article in the NY Times last Wednesday that made me laugh.

      ...
      [Lance] Bass, of the pop group 'N Sync, had been training at the Star City cosmonaut complex outside Moscow; he was told today to pack his gear and leave after "failing to fulfill the conditions of his contract," a spokesman for the space agency told Reuters.
      Adding insult to injury, the space agency said Mr. Bass, 23, would be replaced on the October mission by a cargo container.
  • by WolfWithoutAClause (162946) on Sunday September 08, 2002 @07:39PM (#4217812) Homepage
    Let's face it; the Space Shuttle is obsolete- it's 30 year old technology barely warmed over. It's completely failed all of the main design goals; NASA told congress that they were aiming for costs as low as $500/kg and 5 nines reliability- they're currently at about $20,000/kg and only 2 nines reliability. These are not small issues. Missing the target price by forty times is an enormous gap.

    In fact, the Saturn V was able to launch 4x as much for about the same cost. It could probably have launched most of an ISS in a single launch, and tacked on more sections in 2 or 3 more launches.
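The cost and reliability gaps quoted above are easy to check with a few lines of arithmetic. This is only a sketch: it takes the $500/kg and $20,000/kg figures from the comment as given, and it assumes the usual reading of "N nines" as a success rate of 1 - 10^-N. The function names are invented for the example.

```python
def cost_overrun_factor(target_per_kg: float, actual_per_kg: float) -> float:
    """How many times more expensive the actual launch cost is than the target."""
    return actual_per_kg / target_per_kg


def failure_probability(nines: int) -> float:
    """Assumed convention: 'N nines' reliability means a failure rate of 10**-N."""
    return 10.0 ** -nines


def expected_failures(nines: int, launches: int) -> float:
    """Expected number of failed launches, treating launches as independent."""
    return launches * failure_probability(nines)


if __name__ == "__main__":
    print(cost_overrun_factor(500, 20_000))   # target vs. quoted actual cost
    print(expected_failures(5, 100))          # five nines over 100 launches
    print(expected_failures(2, 100))          # two nines over 100 launches
```

Under those assumptions the price miss is a factor of forty, as the comment says, and the per-launch failure rate is about a thousand times worse than the stated goal.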

    • by Speare (84249) on Sunday September 08, 2002 @09:08PM (#4218105) Homepage Journal

      In fact, the Saturn V was able to launch 4x as much for about the same cost. It could probably have launched most of an ISS in a single launch, and tacked on more sections in 2 or 3 more launches.

      Here's where you lose on your argument. The manned vehicle compartments that were available to the Saturn V made extra-vehicular activities very difficult. The Shuttle has a much better design for short-term laboratory work, for crane work, and for extra-vehicular work. You could dump all the parts for an ISS into orbit with a few Saturn V trips, but you couldn't work with those parts to assemble them. In fact, you'd have to drop the parts a fair distance away and then find some way to bring them closer to the building orbit location. You wouldn't want to chance using NASA's equivalent of a Greyhound Bus to maneuver anywhere near the multi-billion dollar parts (and stationed lives) that may already be on-site.

      • The manned vehicle compartments that were available to the Saturn V made extra-vehicular activities very difficult.

        It seems there are much easier ways around that issue than to design something as costly as the shuttle.

        You wouldn't want to chance using NASA's equivalent of a Greyhound Bus to maneuver anywhere near the multi-billion dollar parts (and stationed lives) that may already be on-site.

        To use your analogy, then you have the Greyhound bus tow a little delivery vehicle, rather than trying to turn the Greyhound bus into something that can both carry lots of cargo and maneuver like a forklift.

  • by nettdata (88196) on Sunday September 08, 2002 @07:48PM (#4217848) Homepage
    It's not like this is rocket science!

    Oh, wait....

  • It has 64k of memory and no room to repair any software bugs.

    LOAD "NASASHUTTLE",8,1
    • Hah! I don't think a 1541 is fast enough to handle that!

      My Apple II, on the other hand, you just insert the disk and flip the power, and the NASASHUTTLE program comes up automatically, in 1/10th the time your C= disk drive loads it!

      Of course, your version has better sound, and sprite graphics... but oh well.
  • by wfmcwalter (124904) on Sunday September 08, 2002 @07:53PM (#4217865) Homepage
    However, NASA is in a budget crunch with the Space Station cost overruns

    Just what is the space station actually for?

    • it's an expensive way to get second-rate microgravity
    • it's a rotten, wobbly astronomy platform
    • no-one is allowed to experiment with low-G sex (given the Russians' new found capitalistic streak, it's a wonder we've not seen any low-G porno yet - or maybe I'm just not in the loop on that)
    • despite what the conspiracy-theory boys say, it'd make a crappy spy satellite and a worse orbital weapons platform
    • there's only so many interesting things we can find out about how spiders make webs in freefall
    • it's not even an efficient way for the US government to prop up the Russian government

    The money spent on this (and the space shuttle) could be spent on real science and could get a thousand off-the-shelf spaceprobes to interesting places.

    I suppose getting rid of Lance Bass would have made it worthwhile, but even that's not going to happen anymore (unless /.ers contribute to a paypal account for this purpose...)


    roses are red
    violets are blue
    the Russians have satellite laser weapons
    so why can't we too?

  • by timeOday (582209) on Sunday September 08, 2002 @07:54PM (#4217867)
    The code in the Shuttle's launch system is old? The entire Space Shuttle is old. I'll bet a lot of slashdotters don't even remember the Columbia's maiden voyage.

    I'm not one to replace things that are working fine, but as I understand it, newer designs could be a whole lot cheaper to operate. So I wonder if pouring more into the Space Shuttle program is the best thing to do.

    I'm not saying "let's throw out the space shuttle" but it bothers me that there's apparently nothing in the works with a decent shot at replacing it any time soon. It seems the field of space exploration is becoming antiquated.

  • by broken (1648) on Sunday September 08, 2002 @07:54PM (#4217870)
    Hire John Carmack to do the job. He's into rocketry so he gets to learn more about the whole thing, you get a kickass system, and he may even do it for free.

    The guy's so good he may do a better job than a bloated team of 400 contractors.
    • And about version 1.3, it'll stop crashing...

      John Carmack might be a kick-arse game programmer and a very smart guy, but he is not an expert compiler designer, complexity theorist, or, as is most relevant here, embedded systems programmer for safety-critical systems (though I'm sure he's rapidly learning about it with his rocketry hobby).

  • by Decimal (154606)
    They need a cheap replacement for a 7400? No problem! I have an old 7800 they can have for free. I'll throw in some 2600 games that it can play - StarMaster & Missile Command, that should get them back into orbit in no time, right?
  • by aebrain (184502) <aebrain@webone.com.au> on Sunday September 08, 2002 @08:32PM (#4217995) Homepage Journal

    From an article in the Sydney Morning Herald [smh.com.au].

    Only 58 centimetres square and weighing 50 kilograms, the tiny FedSat satellite is packed with five scientific experiments and all of the instruments required to communicate with Earth during its anticipated three-year life. At the heart of the satellite is a 10MHz ERC-32 processor - a SPARC-based 32-bit RISC processor developed for high-reliability space applications.
    The ERC-32 sacrifices processing power for durability and reliability. It uses three chips to process a modest 10 million instructions per second and two million floating-point operations per second - less than 1 per cent of a Pentium 4's capabilities.
    The pay-off is reliability: the ERC-32 uses concurrent error-detection to correct more than 95 per cent of errors.
    Power-hungry microprocessors such as the Pentium 4, which runs a standard office PC bought off the shelf today, would be an intolerable burden on the solar-powered satellite. The ERC-32 consumes less than 2.25 watts at 5.5 volts.
    Designed to survive extreme radiation bursts from solar flares, the ERC-32 can tolerate radiation doses up to 50,000 rad. This is 100 times the lethal dose for humans.
    ...A team of Australian programmers developed FedSat's onboard software, building on work done in Britain. It is written in Ada-95, a programming language designed for embedded systems and safety-critical software. All it has to work with is 16MB of RAM, 2MB of flash memory for storing the program, a 128K boot PROM and 320MB of DRAM in place of a hard disk that would never survive the launch process. All essential data is stored in three physically different locations.

    The software is built in a similar way - lots of internal checks, tell-me-thrice memory, soft-failure-bit-flip-correcting daemons etc. In this case, lives aren't at stake, but the people doing the programming are used to situations where they are.
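For readers who haven't met "tell-me-thrice" memory: the idea is to keep three copies of each value and take a majority vote on every read, so a single flipped bit in one copy is outvoted and can be repaired. The following is a minimal Python sketch of that general technique, not FedSat's actual Ada code; the class and function names are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 vote: each result bit takes the value held by at least two copies."""
    return (a & b) | (a & c) | (b & c)


class TripleStore:
    """Holds one integer as three redundant copies (hypothetical illustration)."""

    def __init__(self, value: int = 0):
        self.copies = [value, value, value]

    def write(self, value: int) -> None:
        # Every write refreshes all three copies.
        self.copies = [value, value, value]

    def read(self) -> int:
        # Vote, then "scrub": rewrite all copies with the voted value so a
        # single upset does not linger and combine with a later one.
        voted = majority_vote(*self.copies)
        self.copies = [voted, voted, voted]
        return voted
```

For example, after `s = TripleStore(0b1011)` and a simulated upset `s.copies[1] ^= 0b0100`, `s.read()` still returns `0b1011` and all three copies are restored. The scheme only protects against errors confined to one copy per bit position; two copies corrupted in the same bit would win the vote.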

    • The software is built in a similar way - lots of internal checks, tell-me-thrice memory, soft-failure-bit-flip-correcting daemons etc. In this case, lives aren't at stake, but the people doing the programming are used to situations where they are.

      Not only that, a single space launch of even a fairly small satellite still costs over a billion dollars. If there's a software glitch, it could render the satellite totally inoperable, and I doubt that these engineers want to tell their source of funding that a glitch they're responsible for just wasted the whole launch...

      Which is also why Microsoft doesn't do aerospace embedded systems. :) Whoops, Satellite Redmond I just had a BSOD...

      • We're getting a free ride along with the ADEOS II [nasda.go.jp] megasat (the Japanese get access to some of the data in return), but we're still talking significant money for development. And you're right re funding: it's no exaggeration to say that the future of Australia's space programme is at stake.

        As regards Microsoft doing space/embedded systems, another quote from the original article [smh.com.au]:

        "The system must be ductile - bending, not breaking - when things go wrong. In space no one can press Control/Alt/Delete."
        A neat quote, even if I say so myself.

        A. Brain, Rocket Scientist

    • by g4dget (579145)
      Note that they are most likely using GNU software. Here [atmel-wm.com] is a list of the software development environments for these chips, and here [estec.esa.nl] is the European Space Agency's web page for the tools and emulator.
  • Get a TI-89 and write an assembly program to control the space shuttle. The TI-89 runs off a MC68000 chip, and has (almost) a meg of space. That's about the programming power of the Apollo computers in a pocket-sized object--plenty of power to calculate the orbital trajectory/angle of entry/etc. It even has built-in calculus functions in case the astronauts forget the Fundamental Theorem of Calculus :) .
  • Some interesting information in the article... like the main reasons for cancelling the project are a lack of significant improvements in safety, reliability, or cost savings over the shuttle program's remaining lifetime. I'm no fan of keeping obsolete systems hobbling along beyond their years, but this reasoning doesn't seem outrageous to me. The outrageous thing is that it took 400 contractors to develop something that won't outperform a 30-year-old system that runs in 64k.
  • At the time of the Challenger inquiry, the late physicist Richard Feynman was part of the investigation committee. He found that most of NASA at the time was in full delusional mode about how reliable the Shuttle really was.

    The only exception was the computer systems group, in particular the software side. They had metrics, procedures and rigour. At the time of the enquiry the hardware was already old.

    It's the attitude that counts, not the hardware, not the methodology of the month. OO is not going to solve NASA's problem, it's going to be difficult. Myself I'd just make sure that the hardware would always be available, and not change a thing.
  • by g4dget (579145) on Sunday September 08, 2002 @10:42PM (#4218386)
    Trying to write such a system in C/C++ strikes me as rather stupid. It is extremely hard to write reliable software in C/C++. That may not matter much for desktop applications, but it matters when billions of dollars are in the balance.

    They obviously don't need very high performance, since it runs on 1970s hardware, but they do need high reliability and low development costs.

    That means that they should be using a safe, secure high-level language. Something with a virtual machine might be a good idea so that it will be easy to adapt to new hardware platforms: you verify the virtual machine on the new machine and then have reasonable confidence that your code runs.

    If they want something in widespread use, a home-built Java byte-code interpreter (not a JIT--they are too buggy) might be a reasonable choice--it's well specified and there are lots of people who know how to program it. They should probably avoid JNI like the plague and instead add new bytecodes for I/O and communications and verify them the same way that they do the virtual machine itself. VLISP [nec.com] might be another good choice--or at least a source of ideas for how to implement a verified Java interpreter--DARPA already has paid for its development.

    And they should hire someone who doesn't recommend COTS with C++, lest we see the next shuttle go up in flames again.

    • They obviously don't need very high performance, since it runs on 1970s hardware, but they do need high reliability and low development costs

      This raises an interesting question: how much better would a rocket with fast-response feedback mechanisms be?

      And what are the time-scales involved?

      How much can you raise efficiency and reliability (automated problem detection and solving) with better computing?

      Would a "real-time" (at the time-scales involved) automated simulation and analysis of the machinery involved (using inputs from the hardware) be beneficial at all? How?

    • That means that they should be using a safe, secure high-level language.

      Indeed. So where does this rubbish about Java bytecode come from? You're already going to have to verify the processor and the compiler output. Why introduce a third level (a VM) where things can go wrong?

      And they should hire someone who doesn't recommend COTS with C++, lest we see the next shuttle go up in flames again.

      That was just a gratuitous and offensive swipe. The Challenger shuttle went up due to mechanical problems, not software bugs.

      You seem to be one of those people who doesn't like C++, and therefore lumps it together with C and/or has a dig at it whenever possible. It's up to you whether you like a language or not, but please spare the rest of us the ill-informed language wars, OK?

    • I know the Java JVM is already stack based, but it is far too complex for the generated code to be verified. Stick with a very simple FORTH-based stack with three data stacks, long (64) int, floating point (80/128?). Note, no strings at all, all object/array access via int syscalls.
    • "They obviously don't need very high performance, since it runs on 1970s hardware, but they do need high reliability and low development costs. "

      Don't confuse "not needing a processor that wastes millions of cycles a second" with "low" performance.

      They need to work with very precise units of time, and have an exceptionally high success rate, fault tolerance and correction, and minimal susceptibility to SEUs. That is high performance, as far as industry is concerned.

  • Just my .05 cents (if that)
    I used to work for GSFC (Goddard Space Flight Center). It was wonderful... many years ago.

    Anywho... they had *shitloads of unbelievable equipment... ages old... *name that piece of hardware*. We could wander from building to building, and look/view/see the equipment.

    Lots were there because they were running projects that took many many years to see results, thus they could not upgrade *in-the-field* because it would stop the project.

    Indeed, part of GSFC's job when I was there was to back up Houston on launches. When they upgraded they built a totally new floor above the existing backup, and on a *grand* day they transferred power, with one big switch, from one floor to the next. Why? Because they had to. It had to be well tested and well checked before it could be put into live production, yet the existing systems had to be on-line to back up Houston.

    It was fantastic walking through the various buildings and rooms... I've seen equipment I've no idea what it did. For example, one room had these rather large, circular platforms with clear plastic or glass domes. Inside the domes were flat plates - think silicon... but BIG.. 1 1/2 ft octagons. Stacked with about 2 inches spacing, about 10 of them. I'd say, looking at the room, some very old old old type of RAM.

    That's the wonder of NASA :)


  • "State of the Art" is a good way to run your pocket book into the ground. Jumping on the newest, fanciest programming language doesn't usually make a business successful.

    Here's yet another example: My company's (former) largest competitor invested *millions* into Sun hardware and development in Java. Why? "State of the Art". And guess what! With all of their "state of the art" infrastructure, their system was still slow as molasses.

    What did we do? We spent less than a tenth of what they did to develop with Perl on x86 servers. Our site handles huge traffic loads pretty effectively, and we did it without running ourselves to the bankruptcy court.

    steve

  • The Space Shuttle's flight computers (not the ground checkout system mentioned in the article) have already been upgraded once, and there's another $405 million upgrade in progress, planned for completion in 2006.

    NASA is currently struggling with obtaining a reasonably modern rad-hard CPU. The market is so dinky that nobody wants to bother with it. But they have been able to retrofit flat panel displays, at least.

  • If NASA's budget is hurting so badly, why not swallow a bit of pride and recruit help from fans of the space program who may also happen to be hardware and software engineers?

    Perhaps the crew at, say, ham radio organizations like AMSAT, [amsat.org] or other groups that already combine volunteer engineering effort with an interest in space exploration, would be happy to help out with modernizing the systems. I wonder if anyone's asked them?

    NASA would, of course, keep enough engineering staff around to check the improvements out, but why limit themselves to paid labor if the resource to pay is drying up?

  • Caveat: I am not a rocket scientist, nor an architect of safety-critical systems, just someone who has put in a lot of time over the years on low-level code where reliability and performance (in that order) were essentials.

    It strikes me that this is exactly the sort of project where you don't want to attempt to construct an ambitious, all-singing, all-dancing, state of the art, eighth wonder of the world. This misses the point about what is actually needed. Instead, you go for something as simple and straightforward as you can design which will have the capacity to do the job and continue doing the job for the foreseeable future. It needs to be simple so that you can analyse its behaviour and failure modes with a high degree of confidence. You can push the sexy bells and whistles out to helper boxes, but the core systems must 'just work'. And technology that's far enough behind the bleeding edge for its characteristics to be well understood is definitely a Good Thing in these situations.

    Remember the old engineering rule of thumb: "when in doubt, make it stout, out of things you know about".

