
Stress-Testing Software For Deep Space

kenekaplan writes "NASA has used VxWorks for several deep space missions, including Sojourner, Spirit, Opportunity and the Mars Reconnaissance Orbiter. When the space agency's Jet Propulsion Laboratory (JPL) needs to run stress tests or simulations for upgrades and fixes to the OS, Wind River's Mike Deliman gets the call. In a recent interview, Deliman, a senior member of the technical staff at Wind River, which is owned by Intel, gave a peek at the legacy technology under Curiosity's hood and recalled the emergency call he got when an earlier Mars mission hit a software snag after liftoff."
  • by Anonymous Coward on Wednesday October 10, 2012 @11:02PM (#41615385)

    That's why land-based projects like the SKA, for example, which also take decades to complete, are designed taking Moore's law into account. This leads to a very funny situation in which the project starts and they start building stuff, but the computers that will run the thing are still 10 years away... (and I guess everybody just hopes computers will keep up, or else...)

    Also, you must take into account that the actual instruments are built fairly early, i.e. 5 or more years before launch, since there is a LOT of testing, calibration, more testing, etc. Additionally, when the stakes are a billion-dollar project like these, you tend to leave out fancy new things in favor of old, proven, well-documented tech. Just in case...
    If not, you just mount two instruments, if you have the space and money: a fancy new one and the old usual thing (such is the case for Solar Orbiter, for example).

  • by AaronW ( 33736 ) on Wednesday October 10, 2012 @11:20PM (#41615503) Homepage

    Given my long experience with VxWorks, this doesn't surprise me. VxWorks is not the most robust RTOS. Think of it as a multi-tasking MS-DOS. The version they used has no memory protection between processes, and I have found numerous areas of VxWorks to be badly implemented or downright buggy. Up through version 5.3 the malloc() implementation was absolutely horrid and suffered from severe fragmentation and performance problems. On the platform I was working with, I replaced the VxWorks implementation with Doug Lea's allocator (which glibc's is based on) and our startup time dropped from an hour to 3 minutes. I was also able to easily add instrumentation so we could quickly find memory leaks or heap corruption in the field, something not possible with Wind River's implementation. After reading about the problems with the filesystem, I looked at the Wind River filesystem code. It was rather ugly. They map FAT on top of flash memory (not the best choice), and the corner cases, like a full filesystem, were not well handled.

    Similarly, their TCP/IP stack sucked as well. If you can drop to the T-shell through a security exploit, you totally own the box (cf. Huawei's poor security record).

    VxWorks is fine for simple applications, but for very complex applications it sucks. At least the 5.x series does not clean up after a task if it crashes, because it does not keep track of what resources a task uses. A task is basically just a thread of execution. All memory is one shared global pool. At the time it did have one useful feature that Linux lacked: priority-inheritance mutexes. These are a requirement for proper real-time performance, and I believe they are now included in Linux.

  • pshaw, we use RTEMS (Score:3, Informative)

    by Anonymous Coward on Wednesday October 10, 2012 @11:30PM (#41615559)

    The other big player in space RTOSes: RTEMS.
    Free, open source.

    Has all the same problems as VxWorks: no process memory isolation (because spaceflight hardware usually doesn't have the hardware to support it)...

    One thing that VxWorks has that RTEMS doesn't, and I wish it did, is dynamic loading and linking of applications. You're basically back in the 1960s monolithic-image days, not even with overlay loaders.

  • by Anonymous Coward on Wednesday October 10, 2012 @11:42PM (#41615601)

    And that's where you (and most people) are mistaken.

    An RTOS is not an OS that acts "quickly"; it's an OS which provides a 100% guarantee that a task will be executed within a definite time frame, whether that needs to be 1 microsecond or 1 hour, and which provides guarantees about what happens if the task cannot be completed in that time frame. A job neither Windows nor any flavor of Linux can achieve.

  • by Meditato ( 1613545 ) on Thursday October 11, 2012 @01:59AM (#41616209)

    Look, that guy ("Required Snark") might have been an asshole, but you didn't really acquit yourself well either in your original post. I cofounded and work for a real-time telemetry contractor. We use Android, but the Linux kernel isn't built to handle real-time applications reliably. There are too many things to handle in terms of time-safe task switching, execution, multiprocessing, and internal consistency for it to be a good RTOS. So keeping that in mind, I had to implement a real-time environment in userspace that uses root and some native code in order to collect data, send data, and operate hardware in a safe, timely manner. But this isn't the best solution, because I still have to deal with the fact that it's all just a frustrating abstraction sitting on top of a kernel that isn't at all concerned with what I'm actually trying to do, despite my best efforts to single-handedly make the necessary changes.

    Your "newer processors" bit is also completely off the mark. Radiation-hardened processors lag generations behind owing to the need for extensive redesign and testing. Complicating this picture is the fact that even then, they still have varying levels of reliability and power efficiency. You don't want a processor that has a microcode architecture that makes your targeted code difficult to semantically evaluate and verify. You don't want (or need) a recent processor that hasn't had extensive real-world user testing. You want a processor in the goldilocks zone, one that you've worked with before and has a community behind it.

    Keeping that all in mind, they chose a good processor, and already had an OS largely built for it based on previous missions with earlier versions of the same processor.

  • by lordholm ( 649770 ) on Thursday October 11, 2012 @02:56AM (#41616479) Homepage

    Newer missions collect too much data to transmit everything back to Earth. They typically need to do local processing of, for example, images and other data. There are also AI aspects: for the ExoMars rover (made by Europe), the onboard computer will have a virtual scientist embedded. This virtual scientist looks at the camera pictures, decides if something is worth an extra look, and may order the rover to carry out opportunistic science. I am not sure whether this is the case with Curiosity, but I could easily imagine it is. In fact, newer missions have a substantial need for computational power. But there is no software reason to do these computational tasks on the main computer; the task may as well be sent to a soft-realtime helper computer, which could run Linux or something else. A lost image is typically not the end of the world.

    In many cases the spacecraft and rovers are also not hard realtime, but they are not soft realtime either (i.e. we compute the thruster response for t=0, only to have the thrusters fired at t+0.1 or something in that range; whether they fire within this time does not really matter except during docking, landing and separation). I was trying to push the notion of firm realtime when I was working in the space sector, but the main problem with this notion is that we do not yet know what effects it has in terms of SW design. Anyway...

    The primary reasons for running 10-year-old CPUs are that 1) the specs are chosen early in the project, which is important because the CPU specs guide the development of the SW requirements and the actual implementation of the SW, and 2) as you say, the older CPU will be battle-tested before it is sent into deep space.

  • by Animats ( 122034 ) on Thursday October 11, 2012 @03:02AM (#41616503) Homepage

    I get the idea of being hardened to radiation, but it was my understanding that we have newer processors that fit the bill on this.

    Radiation-hardened processors are hard to get. For one thing, they're export-controlled, so if you make them in the US, you can't sell many. Atmel makes a rad-hard SPARC CPU, and they've sold 3000 of them. Nobody seems to have built a modern x86 design or even an ARM in a rad-hard technology.

    There's a basic conflict between small gate size and radiation hardness. The smaller the transistors, the more likely a stray particle can damage or switch them. So the latest small geometries aren't as suitable. Also, the more radiation-hard processes, like Silicon on Sapphire, aren't used much for high-volume products.

    As a result, rad-hard parts are an expensive niche product. It's not inherently expensive to make them, but the volume is so small that the cost per part is high.
