Patching Software on Another Planet

Posted by Soulskill on Saturday July 06, 2013 @11:47AM from the no-do-overs dept.

An anonymous reader writes "Sixteen years ago, the Mars Pathfinder lander touched down on Mars and began collecting data about the atmosphere and geology of the Red Planet. Its original mission was planned to last somewhere between a week and a month, but it only took a few days for software problems to crop up. The engineers responsible for the system were forced to diagnose the problem and issue a patch for a device that was millions of miles away. From the article: 'The Pathfinder's applications were scheduled by the VxWorks RTOS. Since VxWorks provides pre-emptive priority scheduling of threads, tasks were executed as threads with priorities determined by their relative urgency. The meteorological data gathering task ran as an infrequent, low priority thread, and used the information bus synchronized with mutual exclusion locks (mutexes). Other higher priority threads took precedence when necessary, including a very high priority bus management task, which also accessed the bus with mutexes. Unfortunately in this case, a long-running communications task, having higher priority than the meteorological task, but lower than the bus management task, prevented it from running. Soon, a watchdog timer noticed that the bus management task had not been executed for some time, concluded that something had gone wrong, and ordered a total system reset.'"
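
The scheduling failure described in the summary can be reproduced in a toy model. The sketch below is not Pathfinder's code and does not use any VxWorks API: the task names, tick counts, the `simulate` and `watchdog_trips` helpers, and the 5-tick watchdog deadline are all assumptions made up for illustration. It runs three fixed-priority tasks on one CPU, once with a plain mutex and once with priority inheritance, where the mutex holder temporarily borrows the priority of the task it is blocking.

```python
from collections import namedtuple

# Hypothetical task model: a larger priority number means more urgent.
Task = namedtuple("Task", "name priority release work uses_mutex")

TASKS = [
    Task("meteo", priority=1, release=0, work=3, uses_mutex=True),
    Task("comms", priority=2, release=1, work=10, uses_mutex=False),
    Task("bus", priority=3, release=1, work=1, uses_mutex=True),
]


def simulate(tasks, inherit, horizon=50):
    """Fixed-priority preemptive scheduling, one tick at a time.

    A task that uses the mutex holds it from its first tick of work until
    its work is done. Returns {task name: tick at which it finished}.
    """
    remaining = {t.name: t.work for t in tasks}
    finished = {}
    owner = None  # task currently holding the bus mutex

    for tick in range(horizon):
        ready = [t for t in tasks
                 if t.release <= tick and remaining[t.name] > 0]
        # Tasks that need the mutex while another task holds it are blocked.
        blocked = [t for t in ready
                   if t.uses_mutex and owner is not None and t != owner]
        runnable = [t for t in ready if t not in blocked]
        if not runnable:
            continue

        def effective_priority(t):
            # Simplification: every blocked task counts as a waiter.
            if inherit and t == owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        current = max(runnable, key=effective_priority)
        if current.uses_mutex:
            owner = current
        remaining[current.name] -= 1
        if remaining[current.name] == 0:
            finished[current.name] = tick + 1
            if owner == current:
                owner = None  # release the mutex
    return finished


def watchdog_trips(tasks, finished, deadline=5):
    """True if the bus task finished more than `deadline` ticks after release."""
    bus = next(t for t in tasks if t.name == "bus")
    return finished["bus"] - bus.release > deadline


for mode in (False, True):
    result = simulate(TASKS, inherit=mode)
    print("inherit =", mode, result, "reset:", watchdog_trips(TASKS, result))
```

With the plain mutex, the medium-priority task runs to completion while the low-priority task still holds the lock, so the high-priority task misses its deadline. With inheritance enabled, the lock holder finishes first and the high-priority task runs almost immediately.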

old? (Score:1)

by Anonymous Coward writes:

I didn't even read the full summary. But hasn't the occurrence of this priority inversion issue been reported about ... many years ago?
- Re: (Score:2)
  
  by Luyseyal ( 3154 ) writes:
  
  Yep. And though I'm not seeing it in Google, I'm pretty sure there was an article about it in Slashdot (I remember because it was the first time I had heard of VxWorks). I did find this CmdrTaco post from 1999 asking Slashdotters to submit their Top 10 Hacks of All Time [slashdot.org].
  -l
- Re: (Score:2)
  
  by Cenan ( 1892902 ) writes:
  
  Reading TFA to the rescue:
  L. Sha, R. Rajkumar, and J. P. Lehoczky. Priority Inheritance Protocols: An Approach to Real-Time Synchronization. In IEEE Transactions on Computers, vol. 39, pp. 1175-1185, Sep. 1990.
- Re: (Score:2)
  
  by Man On Pink Corner ( 1089867 ) writes:
  
  Well, the more important stories were hogging the lock.
Sounds like this was noticed earlier ... (Score:5, Interesting)

by xmas2003 ( 739875 ) * writes: on Saturday July 06, 2013 @11:51AM (#44203389) Homepage

From TFA: "Engineers later confessed that system resets had occurred during pre-flight tests. They put these down to a hardware glitch and returned to focusing on the mission-critical landing software"

Very surprised by this ... even if a hardware glitch, wouldn't you want to track that down before launch? Especially since in the harsh space environment (bit flips even with hardened RAM/CPU), you want your hardware to be as reliable as possible.

- Re:Sounds like this was noticed earlier ... (Score:5, Interesting)
  
  by mlts ( 1038732 ) * writes: on Saturday July 06, 2013 @12:13PM (#44203475)
  
  Devil's advocate here:
  If it were my guess, there are so many priorities of glitches, and with a limited budget, if it isn't something that actively shuts down operations, resources are spent on other things.
  The one good thing in this equation is the watchdog circuits. Without these in place, it can mean the hardware goes down and never comes to life again.
  It is extremely hard to get working operating systems and patch management here on Earth [1]... much less having systems that are made to work where there is no way to walk up to the machine, and re-flash a new OS via the JTAG ports.
  [1]: Patch management had issues for every OS I've used. AIX gets issues via lppchk which means force-installing LPPs, RedHat gets RPM glitches possibly forcing a rebuild of the DB, Windows sometimes will just not install, or permit to be installed an update from WU, and so on. Now, with this in mind, trying to patch a machine millions of miles away is very daunting for even the best of the best.
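
For readers who have not met one, a watchdog of the kind mentioned above can be sketched in a few lines. This is an illustrative software model only: the `Watchdog` class, its tick-based clock, and the timeout of 3 ticks are assumptions for the example, not anything taken from Pathfinder's hardware. Healthy code "kicks" the watchdog regularly; if the kicks stop for longer than the timeout, the watchdog forces a reset.

```python
class Watchdog:
    """Calls on_expire if it is not kicked for more than `timeout` ticks."""

    def __init__(self, timeout, on_expire):
        self.timeout = timeout
        self.on_expire = on_expire
        self.last_kick = 0

    def kick(self, now):
        # Called by the monitored task to signal "I am still alive".
        self.last_kick = now

    def check(self, now):
        # Called periodically by an independent timer.
        if now - self.last_kick > self.timeout:
            self.on_expire(now)
            self.last_kick = now  # the system restarts, so the timer starts over
            return True
        return False


resets = []
wd = Watchdog(timeout=3, on_expire=resets.append)

for now in range(12):
    if now < 4:          # the monitored task is healthy for the first 4 ticks
        wd.kick(now)
    wd.check(now)        # after that it hangs and stops kicking

print("resets at ticks:", resets)
```

The important design point is that `check` runs independently of the code being monitored, so a hung task cannot prevent its own reset.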
  
  - Re:Sounds like this was noticed earlier ... (Score:5, Interesting)
    
    by girlintraining ( 1395911 ) writes: on Saturday July 06, 2013 @12:58PM (#44203739)
    
    If it were my guess, there are so many priorities of glitches, and with a limited budget, if it isn't something that actively shuts down operations, resources are spent on other things.
    Devil here: This isn't a budget problem, this is a management problem. Going all the way back to the Challenger disaster, NASA has shown a pattern of disregard for proper engineering practice. Richard Feynman chewed their ass out in Appendix F [nasa.gov] of the Challenger report to congress, and it was so scathing that both Congress and NASA tried to kick him off the board and discard his results... prompting the entire senior engineering staff of all branches of the Shuttle project to sign a petition saying: Either publish this, or face our wrath.
    This isn't a technical problem -- this is management having shitty project management skills. If the budget is insufficient, then the project scope has to be reduced. It's just that simple. This is not the engineers' fault, nor is it the fault of the technology... this is management trying to do too much with too little.
    
    - Re: (Score:2)
      
      by 0123456 ( 636235 ) writes:
      
      Bugs get prioritised, and when you have to launch this year or wait three years for the next launch window, non-critical bugs aren't going to delay a launch. A bug that causes the computer to reset and return to operation is not a critical bug for a system that's rolling over the surface of Mars at a few feet per minute.
      I remember reading that the Apollo Guidance Computer developers would randomly press the reset button while testing their software just to ensure that, if it did reset, that wouldn't cause p
    - Re: (Score:2, Insightful)
      
      by DerekLyons ( 302214 ) writes:
      
      This isn't a technical problem -- this is management having shitty project management skills. If the budget is insufficient, then the project scope has to be reduced. It's just that simple. This is not the engineers' fault, nor is it the fault of the technology... this is management trying to do too much with too little.
      It must be nice to live in your black and white world, but the rest of us live in the real world where engineering, budget, and schedule tradeoffs are a reality.
      - Re: (Score:2)
        
        by gl4ss ( 559668 ) writes:
        
        This isn't a technical problem -- this is management having shitty project management skills. If the budget is insufficient, then the project scope has to be reduced. It's just that simple. This is not the engineers' fault, nor is it the fault of the technology... this is management trying to do too much with too little.
        It must be nice to live in your black and white world, but the rest of us live in the real world where engineering, budget, and schedule tradeoffs are a reality.
        well yeah.. unless you're building something to go to space. I suppose someone knew that the problem wouldn't make patching impossible.
        of course if benefits from the project are totally immeasurable then it doesn't matter that much that the thing might waste the entire budget if it doesn't work.
        
        Re: (Score:2)
        
        by DerekLyons ( 302214 ) writes:
        
        It must be nice to live in your black and white world, but the rest of us live in the real world where engineering, budget, and schedule tradeoffs are a reality.
        well yeah.. unless you're building something to go to space.
        Well, no. Even things that are being built to go into space suffer from the same limitations as anything else.
    - Re: (Score:2)
      
      by arth1 ( 260657 ) writes:
      
      Richard Feynman chewed their ass out in Appendix F of the Challenger report to congress, and it was so scathing that both Congress and NASA tried to kick him off the board and discard his results... prompting the entire senior engineering staff of all branches of the Shuttle project to sign a petition saying: Either publish this, or face our wrath.
      I have a hard time believing that Feynman wrote it, or that it wasn't re-written by someone else before it was published. Read this (emphasis mine):
      "A more reasonable figure for the mature rockets might be 1 in 50. With special care in the selection of parts and in inspection, a figure of below 1 in 100 might be achieved but 1 in 1,000 is probably not attainable with today's technology. (Since there are two rockets on the Shuttle, these rocket failure rates must be doubled to get Shuttle failure rates from
      - Re: (Score:2)
        
        by arth1 ( 260657 ) writes:
        
        I wouldn't be so sure. If x << 1, then (1-x)^2 ~= 1 - 2x. It would be Feynman's style to simplify his message, even if it meant the loss of a bit of precision.
        Not to the point of being factually incorrect, especially in a context where the difference between understanding it correctly or not is statistically significant.
        The "since there are two rockets, this must be doubled" text implies that adding is the correct approach. That would mean that if launching shuttles 25 times with a 1:50 risk per rocket, there's a cumulative 100% risk of failure. That's obviously not the case, and the error is significant (the risk of failure would be around 63.6%, not 100%).
        I find it much
      - Re: (Score:2)
        
        by arth1 ( 260657 ) writes:
        
        In this case, the difference between a 3.96% failure rate and a 4% failure rate is going to be pretty insignificant until you start having thousands of trials, while the difference between 2% and 4% could make a difference in fewer than a hundred. The doubling won't be off by more than 10% until you get to failure rates above 20%. Wasting too much time fighting over the difference between 3.96% and 4% kind of misses the frequent lack of precision in such numbers and is part of the mindset that caused some of the problems with estimating failure rates in the first place.
        Bzzt, wrong. The difference between multiplying the failure rate for one device by the number of devices and multiplying the success probabilities together very quickly becomes significant. Using the "reasonable" failure rate of one in fifty, we get:
        1 launch (2 rockets): 4% vs 3.96%
        10 launches: 40% vs 33.2%
        25 launches: 100% vs 63.6%
        That is a significant difference, for a moderate number of launches.
        Again, I cannot believe that Feynman would have written something that would sound plausible to those who don't know statistic
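
The two ways of combining failure rates argued over in this subthread are easy to compare numerically. The sketch below assumes independent rocket failures and uses the 1-in-50 per-rocket figure quoted above; the function names `additive` and `exact` are invented for the example. The additive shortcut is n*p, while the exact figure is 1 - (1-p)^n; the two agree for small n*p because (1-p)^2 = 1 - 2p + p^2 and the p^2 term is tiny.

```python
def additive(p, n):
    """Shortcut: add the per-rocket failure rates (capped at 100%)."""
    return min(1.0, n * p)


def exact(p, n):
    """Probability that at least one of n independent rockets fails."""
    return 1.0 - (1.0 - p) ** n


P_ROCKET = 1 / 50  # per-rocket failure rate quoted in the thread

for launches in (1, 10, 25):
    rockets = 2 * launches  # two solid rocket boosters per Shuttle launch
    print(f"{launches:2d} launches: "
          f"additive {additive(P_ROCKET, rockets):6.2%}  "
          f"exact {exact(P_ROCKET, rockets):6.2%}")
```

For a single launch the two numbers differ by only 0.04 percentage points, which is the point of the small-x approximation; by 25 launches the additive figure has reached 100% while the exact figure is roughly 63.6%.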
  - Re: (Score:2)
    
    by sjames ( 1099 ) writes:
    
    I have to wonder these days if including a BMC that CAN re-flash through JTAG remotely might have become practical. While it is extra weight and we'd need a hardened BMC, it's not like it has to have much performance as long as it runs at all. Given that it could save a mission it could be a good trade-off.
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
- Re: (Score:1)
  
  by Google Fanboys ( 2974975 ) writes:
  
  There's just so much you can do. Eventually you just have to leave it and hope it works.
- Re: (Score:1)
  
  by Cammi ( 1956130 ) writes:
  
  Sounds like a management decision.
- - Re: (Score:2)
    
    by dragonsomnolent ( 978815 ) writes:
    
    Sudden acceleration syndrome was chalked up to operator error. Even revving the engines to full, the brakes supplied more than enough stopping force, every single time. So even from the get go, brakes were over-powering the engine (brakes work so well, in fact, that they had to invent a technology to make them not work so well; it's called anti-lock brakes).
    - - Re: (Score:2)
        
        by dragonsomnolent ( 978815 ) writes:
        
        ABS was to stop the brakes from locking up in the first place, regardless of road condition (in other words, brakes work so well they had to invent a way to interrupt them). If you hold your brake pedal down and stomp the gas, your car will stay stationary. If you do it long enough (assuming you have an automatic transmission), you'll blow out your torque converter, your transmission or your engine (depending on the weak link); if you do it in a manual transmission, you'll kill the engine. You missed the poi
        
        Re: (Score:1)
        
        by Anonymous Coward writes:
        
        This is the biggest load of bullshit I have ever read on Slashdot.
        Go ahead and provide a citation on how ABS is supposed to protect engines from braking.
- Re: (Score:2)
  
  by FatdogHaiku ( 978357 ) writes:
  
  From TFA: "Engineers later confessed that system resets had occurred during pre-flight tests. They put these down to a hardware glitch and returned to focusing on the mission-critical landing software"
  
  Very surprised by this ... even if a hardware glitch, wouldn't you want to track that down before launch? Especially since in the harsh space environment (bit flips even with hardened RAM/CPU), you want your hardware to be as reliable as possible.
  Perhaps they were thinking about that sweet sweet mileage charge for a service call?
- Re: (Score:2)
  
  by v1 ( 525388 ) writes:
  
  Inter-planetary travel imposes inflexible deadlines. Planetary alignment dictates when your launch windows are, and they are frequently several years apart. Compare it with the space shuttle for example, where you can get a launch window every day or two and have lives at risk. Project planning on an inter-planetary launch spreads out over years. If your part of the project starts getting behind, and it's not something you can fix by simply throwing more resources at it, you have to prioritize so you don
- Re: (Score:2)
  
  by MrBandersnatch ( 544818 ) writes:
  
  I seem to recall reading that the return on investment for the US economy from the 1960s space program was something like 100-1. Today government investment in a space program acts as investment for private companies to develop new technologies - and I would be unsurprised to discover that the return is still above 10-1 from an economic perspective.
  If you really want to attack waste of money spending there are FAR better targets.
- Re: (Score:1)
  
  by Impy the Impiuos Imp ( 442658 ) writes:
  
  I remain convinced there should be a mod option of +1 Troll.
  - Re: (Score:2)
    
    by NoNonAlphaCharsHere ( 2201864 ) writes:
    
    Moderation isn't near enough. That one was worthy of the Nobel Prize for Trolling.
- Re: (Score:2)
  
  by xmundt ( 415364 ) writes:
  
  Our tax dollars are going to these projects because private enterprise is only willing to take up projects that will produce a guaranteed return for their investors. It is notably unwilling to take on risky projects, or projects that do not have that clear return. Only an organization that has no profit motive (i.e., the federal government) is willing to invest the large sums in a project that might blow up during the boost phase of a launch. The fact is that the space program is quite profitable -
  - Re: (Score:2)
    
    by meerling ( 1487879 ) writes:
    
    Not to mention that the developments and data that are made available to the public and private industries by NASA and their space exploration and technology developments are responsible for a not insignificant chunk of the GNP. If it were private industry that were doing that, and they won't due to risk and unquantifiable short term return estimates, they would charge through the nose or hoard all that good stuff for themselves. The net result to the economy and human life would be negligible at best, and c
- Ingeniously cryptic point! (Score:1)
  
  by SinisterRainbow ( 2572075 ) writes:
  
  Is this (another) failed sarcastic statement, or did I really just read that? I'm taking away the point that the world needs fewer true believers, and people need to stop writing sarcasm online.
- - Re: (Score:2)
    
    by kasperd ( 592156 ) writes:
    
    The fastest spacecraft we've ever launched won't reach Alpha Centauri for 40000 years, and that's only because it's not stopping.
    Leaving now isn't the fastest way to get there. You'd get there faster by waiting back on Earth for more efficient propulsion technology to be developed. So when is the right time to leave in order to get there as soon as possible? That is a question which can only be answered in retrospect. One day people can look and say, hey we could have been there already if only we had left
    - Re: (Score:2)
      
      by arth1 ( 260657 ) writes:
      
      What will mankind do, once there are no more habitable star systems left in this galaxy? I guess some crazy attempts at reaching other galaxies.
      One thing is certain - by that time, it would not be mankind, any more than what we are today can be called fishkind.
      But I highly doubt that we'll get there. Evolution does not favour long term strategies unless those picking short term strategies die off.
      - Re: (Score:2)
        
        by kasperd ( 592156 ) writes:
        
        One thing is certain - by that time, it would not be mankind, any more than what we are today can be called fishkind.
        In terms of time passed it could be a shorter period than the time it took to evolve from monkeys into today's humans. Whether we will use the term human about all descendants of humans is a matter of definition. The changes in culture and technology are likely to be greater than the changes in genome. But all of the changes would be subject to evolutionary selection.
        But I highly doubt that we
        
        Re: (Score:2)
        
        by arth1 ( 260657 ) writes:
        
        In terms of time passed it could be a shorter period than the time it took to evolve from monkeys into today's humans.
        Humans did not evolve from monkeys. Monkeys and humans evolved from a common haplorhini ancestor which was neither monkey nor human, around 40 million years ago. Your typical random monkey has undergone as much evolution since then as the typical human has.
        Journeys to other solar systems would take an enormous number of years. So much so that it probably won't happen with live crews. Sending DNA records and reconstructing life at the destination might be the best bet. But even if we used the same bluepr
        
        Re: (Score:2)
        
        by kasperd ( 592156 ) writes:
        
        Monkeys and humans evolved from a common haplorhini ancestor which was neither monkey nor human
        If you could reconstruct pictures of what they looked like, I bet the majority of people would classify it as a monkey if you showed them the picture. And chances are nobody would classify it as a fish.
        Journeys to other solar systems would take an enormous number of years. So much so that it probably won't happen with live crews.
        If you could build propulsion capable of delivering 1G of acceleration continuously for
I've dealt with a related problem (Score:1)

by Anonymous Coward writes:

Fixing code written by someone from a different planet.
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  At my job we refer to them as Indian contractors.
  Usually subcontractors working for IBM.
- Re: (Score:2)
  
  by IWannaBeAnAC ( 653701 ) writes:
  
  The OS included priority inheritance mutexes, but IIRC the developers decided not to use them for reasons of 'efficiency'. Presumably it takes an extra cycle or two to lock a priority-inheriting mutex...
Priority inversion bug (Score:5, Interesting)

by BitZtream ( 692029 ) writes: on Saturday July 06, 2013 @12:13PM (#44203481)

This problem is known as priority inversion. It's a common concern in schedulers when critical functions run in their own threads. It's something that they should have known about and tested against. Or they could have used more traditional IO approaches and let the VxWorks IO system, which already has protection against priority inversion by design, do its job.

- Re: (Score:2)
  
  by k8to ( 9046 ) writes:
  
  The issue was simple.
  In VxWorks, when you create pipes for IPC, you get to choose what kind of semaphore you want, because people want the shortest deadlines possible (at least classically).
  JPL selected simple mutexes, which led to the priority inversion.
  Pipes were, generally speaking, far less well exercised in the codebase at the time, and Wind River engineers explicitly advised the use of message queues which would offer the necessary functionality and would not have had the problem encountered.
Red Planet? (Score:1)

by Cammi ( 1956130 ) writes:

Do people still call this the red planet? lol
- Re: (Score:2)
  
  by Smivs ( 1197859 ) writes:
  
  On soviet Mars.......oh, never mind!
- Re: (Score:1)
  
  by maxwell demon ( 590494 ) writes:
  
  Did its colour change?
- Re: (Score:2)
  
  by gl4ss ( 559668 ) writes:
  
  Do people still call this the red planet? lol
  well, yeah. it looks red, sort of. at least from here.
  remake of total recall totally copped out though..
Boring story (Score:3)

by Hentes ( 2461350 ) writes: on Saturday July 06, 2013 @12:34PM (#44203617)

Seriously? This reads like a morality tale for beginner programmers. "Remember kids, always check the settings of your mutexes!"
Will we also have articles about NASA engineers mistyping == for = ? Everyone makes mistakes, just because it happened in a rover doesn't make it interesting.

- Re: (Score:2)
  
  by wvmarle ( 1070040 ) writes:
  
  The interesting part is not so much that people make mistakes, it is how they are solved.
  - Re: (Score:2)
    
    by Hentes ( 2461350 ) writes:
    
    Not really, most bugs are easy to fix once found. Especially trivial ones like this.
    - Re: (Score:2)
      
      by wvmarle ( 1070040 ) writes:
      
      Of course, it's the process how to find the bug, and later update the remote device, that's interesting. I know fixing bugs is often just a few keystrokes - after spending hours or days searching for the cause.
      - Re: (Score:2)
        
        by Hentes ( 2461350 ) writes:
        
        But there's nothing in the article about how it was found.
- Re:Boring story (Score:5, Interesting)
  
  by Antique Geekmeister ( 740220 ) writes: on Saturday July 06, 2013 @12:48PM (#44203683)
  
  Actually, it's very interesting. It shows that even with the very extensive testing and layers of planning and managerial processes to prevent such errors, they can still creep in. And it shows that very expensive, one-off projects remain vulnerable to subtle design errors, so the tools to do field updates are _critical_.
  Note that designing for spacecraft can be a real artform: they have extremely limited computational resources, due to the inherent risks of bit errors in increasingly small modern silicon exposed to radiation and temperature changes, and you cannot simply shield the electronics: the shielding adds weight and itself becomes radioactive over time. So you often wind up using quite old but far more stable technologies. That means tools that may be considered quite obsolete by the time your design phase is complete and the device is ready for launch. And by the time it arrives _on Mars_, the technology is very obsolete indeed.
  My respect for the programmers and designers of interplanetary spacecraft is enormous: systems like Voyager and the Mars Rover, Spirit, that exceed their lifespans by years fill me with pride as an engineer that we could build so well. And the obligatory XKCD on the subject:
  http://www.xkcd.com/695/ [xkcd.com]
  
  - Re: (Score:2)
    
    by Hentes ( 2461350 ) writes:
    
    Actually, it's very interesting. It shows that even with the very extensive testing and layers of planning and managerial processes to prevent such errors, they can still creep in. And it shows that very expensive, one-off projects remain vulnerable to subtle design errors, so the tools to do field updates are _critical_.
    That's true, but has been known for a while.
    Note that designing for spacecraft can be a real artform: they have extremely limited computational resources, due to the inherent risks of bit errors in increasingly small modern silicon exposed to radiation and temperature changes, and you cannot simply shield the electronics: the shielding adds weight and itself becomes radioactive over time. So you often wind up using quite old but far more stable technologies. That means tools that may be considered quite obsolete by the time your design phase is complete and the device is ready for launch. And by the time it arrives _on Mars_, the technology is very obsolete indeed.
    That is indeed an interesting topic, and I wouldn't have complained if the article talked about that. But it was just a generic description of a common error with almost no details about the actual system. I didn't say that I don't respect the engineers working on the project. Even the best minds make simple errors. It just doesn't make for a good story.
  - Re: (Score:2)
    
    by MatthiasF ( 1853064 ) writes:
    
    All the more reason that public, non-military projects like this should have everything open sourced.
    
    Had the hardware and software platforms both been open sourced and available to the public, they would have had a lot more hands and eyes helping to correct these issues.
    
    The only way we're going to get off this planet is with mutual cooperation, and I think that should start between the public sector and..well..the public.
Mars Code (Score:3, Interesting)

by Anonymous Coward writes: on Saturday July 06, 2013 @12:45PM (#44203663)

At the USENIX "Hot Topics in System Dependability 2012" conference Gerard Holzmann of JPL gave a fantastic talk [usenix.org] about how they developed the software for the Curiosity rover. (spoiler: Having to display a Bieber poster in your cubicle if you break the nightly build is a great motivator.)

Watchdog timer? (Score:2)

by PPH ( 736903 ) writes:

Are they certain it wasn't just the person on the tech support line who suggested rebooting it?
The problem was well known when the story was new (Score:5, Interesting)

by Cryptosmith ( 692059 ) writes: <rick@cryptosmith.com> on Saturday July 06, 2013 @03:35PM (#44204581) Homepage

This is a rambling bit of history. Move on if that's not your thing. I love reading about problems like the Pathfinder problems. Trust me - such things often happen on Earth-bound systems, too.
Back in '79, I was working on a multiprocessing router for the ancient ARPANET. At the time the net had over sixty routers distributed across the continent. Actually we called them "imps" - well, "IMPS" but I'll use the modern term "router." We had a lot of the same problems as Pathfinder without ever leaving the atmosphere.
By then all ARPANET routers were remotely maintained. They all ran continuously and we did all software maintenance in Cambridge, MA. By then the basic software was really reliable. They rarely crashed on their own, and we mostly sent updates to tweak performance or to add new protocol features. Once in a while we'd have to use a "magic modem" message to restart a dead machine and to reload things. The software rarely broke so badly that we'd have to have someone on-site load up a paper tape. So remote maintenance was well established by then.
The multiprocessor didn't run "threads"; it ran "strips." Each was a non-preemptive task designed to execute quickly enough not to monopolize the processor. If you wrote software for a Mac before OS X, you know how this works. A multi-step process might involve a sequence of strips executed one after the other.
Debugging the multiprocessor code was a bit of a challenge because we could lock out multi-step processes in several different ways. While we could put our test router on the network for live testing, this didn't guarantee that we'd get the same traffic the software would get at other sites. For example, we had software to connect computer terminals directly to hosts through the router (the original "terminal access controllers"). This software ran at a lower priority than router-to-router packet handling. It was possible for a busy router to give all the bandwidth to the packets and essentially lock out the host traffic. Such problems might not show up until updated software was loaded into a busy site.
Uploading a patch involved assembly language. We'd generally add new code virus style. First you load the new code into some spare RAM. Once the code is loaded, we patch the working program so that it jumps to the patch the next time it executes. The patch jumps back to an appropriate spot in the program once the new code has executed. We sent the patches in a series of data packets with special addressing to talk to a "packet core" program that loaded them.
The bottom line: it's the sort of challenge that kept a lot of us working as programmers for a long time. And they pop up again every time someone starts another system from scratch.
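
The jump-style patching described above can be shown on a toy machine. Everything in this sketch is invented for illustration: the three-instruction set (`ADD`, `JMP`, `HALT`, plus `NOP` filler), the memory layout, and the "bug" itself have nothing to do with the real router hardware. It only demonstrates the ordering: load the new code into spare memory first, then overwrite a single instruction in the live program so that it detours through the patch and jumps back.

```python
def run(memory, max_steps=100):
    """Execute the toy program from address 0 and return the accumulator."""
    acc, pc = 0, 0
    for _ in range(max_steps):
        op = memory[pc]
        if op[0] == "HALT":
            return acc
        if op[0] == "ADD":
            acc += op[1]
            pc += 1
        elif op[0] == "JMP":
            pc = op[1]
        else:  # NOP
            pc += 1
    raise RuntimeError("step limit exceeded")


# Live program at addresses 0..3; addresses 4..15 are spare RAM.
memory = [("NOP",)] * 16
memory[0] = ("ADD", 1)
memory[1] = ("ADD", 10)   # the bug: this was meant to be ADD 2
memory[2] = ("ADD", 100)
memory[3] = ("HALT",)

before = run(memory)

# Step 1: load the corrected code into spare RAM. The live program never
# reaches address 8 yet, so this cannot disturb it.
memory[8] = ("ADD", 2)
memory[9] = ("JMP", 2)    # return to the instruction after the patched one

# Step 2: a single write redirects the live program into the patch.
memory[1] = ("JMP", 8)

after = run(memory)
print("before patch:", before, "after patch:", after)
```

Doing the redirect as the very last, single-word write matters: at every moment the running program sees either the complete old code or the complete new code, never a half-loaded patch.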

- - Re: The problem was well known when the story was (Score:1)
    
    by Cryptosmith ( 692059 ) writes:
    
    At the time it seemed virtuous to implement state machines. One guy did his phd by building a mechanism that did coroutining - the programmer could write out the whole procedure and stick in the strip breaks after the fact. I suppose someone did something like that for the Mac, tho I stopped writing Mac code before seeing such a thing.
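
Modern languages make the "write the whole procedure, then stick in the strip breaks" idea fairly direct. The sketch below uses Python generators as a stand-in; it is an analogy, not a reconstruction of the original mechanism, and the `handler` and `run_all` names and the round-robin policy are assumptions for the example. Each `yield` marks a strip break where the task voluntarily gives up the processor.

```python
from collections import deque


def handler(name, steps, log):
    """A multi-step procedure written straight through, with strip breaks."""
    for i in range(steps):
        log.append(f"{name}:{i}")  # one short, non-preemptible strip of work
        yield                      # strip break: let other tasks run


def run_all(tasks):
    """Round-robin cooperative scheduler: run one strip per task per turn."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)             # run until the next strip break
        except StopIteration:
            continue               # procedure finished; drop it
        queue.append(task)


log = []
run_all([handler("packet", 2, log), handler("terminal", 3, log)])
print(log)
```

Because nothing is preempted, a strip that runs too long stalls every other task, which is why each one had to be kept short.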
News? (Score:2)

by wonkey_monkey ( 2592601 ) writes:

More like "olds," am I right? Huh? Ahhh.
- Re: (Score:1)
  
  by SwedishCoward ( 1838398 ) writes:
  
  I remember a lecturer telling exactly this story when I took a real-time systems course circa 2001...
  - Re: (Score:2)
    
    by matfud ( 464184 ) writes:
    
    Far far older than that. It is not a new problem but it is a very persistent one. There are many ways to try and avoid the problem. Most do not work in practice. Priority inversion is quite tricky to deal with.
Threads (Score:2)

by Old Wolf ( 56093 ) writes:

This is why you don't use threads for important stuff...
That wasn't Windows (Score:1)

by aglider ( 2435074 ) writes:

or any other Microsoft related OS, thank God.
