Mars NASA Technology

10 Years In, Mars Rover Opportunity Suffers From Flash Memory Degradation

astroengine writes Mars Exploration Rover Opportunity has been exploring the Martian surface for over a decade — that's an amazing ten years longer than the 3-month primary mission it began in January 2004. But with its great successes, inevitable age-related issues have surfaced and mission engineers are being challenged by an increasingly troubling bout of "amnesia" triggered by the rover's flash memory. "The problems started off fairly benign, but now they've become more serious — much like an illness, the symptoms were mild, but now with the progression of time things have become more serious," Mars Exploration Rover Project Manager John Callas, of NASA's Jet Propulsion Laboratory in Pasadena, Calif., told Discovery News.
This discussion has been archived. No new comments can be posted.

  • Memory bristles
    Like Scottish thistles
    Make operation tough
    Plus the interplanetary stuff
    Burma Shave
  • I'm sorry (Score:5, Funny)

    by rossdee ( 243626 ) on Tuesday December 30, 2014 @09:53AM (#48695801)

    But to claim it under warranty, you have to return it to the manufacturer

  • It is time to start building out the martian rover maintenance infrastructure so these guys can be towed in for repairs and upgrades.
    • Unfortunately, Opportunity let its Astro-Afro-Antarctico-Amer-Asian Auto Association membership lapse.

    • Yeah - Spirit and Opportunity went far beyond their 90 Sol limit. And the issues they faced were determining factors in Curiosity being nuclear powered.
      • by matfud ( 464184 )

        I'm not sure where you are going with that. Both rovers are still active even if one is stuck. The solar panels seem to work for them even if there have been issues wrt dust on them. Curiosity is Nuclear powered as it is much much larger and has vastly larger power requirements to even move let alone perform experiments.

  • At least they have identified a fix. But it surely won't be too long before more of the flash memory banks start exhibiting similar behaviour.

    Still, 44x longer lifespan than originally planned == win in anyone's books.

    • by Anonymous Coward on Tuesday December 30, 2014 @10:14AM (#48695957)

      http://mars.nasa.gov/mer/mission/status_opportunityAll.html

      I don't know that one could expect similar behavior from the other banks on a similar schedule. This is fairly old technology in terms of design and software, so I don't think they're doing any sort of automatic wear leveling, for instance. It's probably "manually leveled" if at all. For all we know, bank 7 was used the most and it's worn out. Or, it's taking more total ionizing dose (TID) because of the physical location on the card. Or, it's just a process variation when making the flash chips themselves. They were probably fabricated in 2000, most likely at Micron, since for a 2003 launch, the computer was probably assembled by early 2002, if not earlier.
      Or, the software is not optimized for "space flight use" but, rather, for "consumer camera memory card", which has a different read/write/erase pattern and error tolerance.

      http://spinroot.com/gerard/pdf/25MC.pdf describes an improved file manager under development, but also describes the existing flash architecture.
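For anyone unsure what "automatic wear leveling" would buy here, a minimal sketch follows. It is a toy model and not the rover's flight software: the bank count of 8, the `Bank` and `WearLeveler` names, and the one-erase-per-write assumption are all made up for illustration. It only shows how always picking the least-erased bank spreads wear evenly, compared with hammering a single bank.

```python
from dataclasses import dataclass, field


@dataclass
class Bank:
    """One flash bank with a running erase count (hypothetical model)."""
    bank_id: int
    erase_count: int = 0


@dataclass
class WearLeveler:
    """Picks the least-worn bank for each erase/write cycle."""
    banks: list = field(default_factory=list)

    def pick_bank(self) -> Bank:
        # Least-erased bank wins; ties go to the lowest bank id.
        return min(self.banks, key=lambda b: (b.erase_count, b.bank_id))

    def write(self) -> int:
        bank = self.pick_bank()
        bank.erase_count += 1  # assumption: each write costs one erase cycle
        return bank.bank_id


def simulate(num_banks: int, writes: int, leveled: bool) -> list:
    """Return per-bank erase counts after `writes` write cycles."""
    leveler = WearLeveler([Bank(i) for i in range(num_banks)])
    for _ in range(writes):
        if leveled:
            leveler.write()
        else:
            # No leveling: every write lands on the last bank.
            leveler.banks[-1].erase_count += 1
    return [b.erase_count for b in leveler.banks]
```

Calling `simulate(8, 80, leveled=True)` spreads the erases across all banks, while `leveled=False` piles every one of them onto the last bank, which is the "bank 7 was used the most" scenario guessed at above.
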

      • Or, the software is not optimized for "space flight use" but, rather, for "consumer camera memory card", which has a different read/write/erase pattern and error tolerance.

        The flash memory controller was created in-house. Back in 2004, Spirit had well-documented memory issues that were traced to file system logic that didn't properly clear deleted files during a reset. Eventually, storage systems were overrun, which forced NASA to basically reformat the storage system and start afresh after reprogramming the controller firmware.

    • I agree. It's had a fantastic run, but it'll be a real loss when it finally stops working.
  • by Anonymous Coward

    Would it have cost to ship it with a RAID array of flash drives?

    • To say nothing of a beowulf cluster.
    • by hattig ( 47930 )

      Well, it launched in 2003 ... and space tech is usually about a decade behind again...

      Luckily it is only one bank of flash that's bad, so they're going to work around it by disabling that one - probably means a reduction in overall capacity, but maybe it's enough to solve this issue (and/or it was overprovisioned in the first place).

      We're probably talking about kilobytes of flash here, rather than megabytes.

  • If only they had over-engineered it last, this never would have happened!
  • If it was long-known that long-duration, low-intensity heat would revive failed flash [slashdot.org], why did these rovers leave without the ability to do so?

    And why am I not able now to buy flash memory that will heat itself to 800 degrees and heal itself?

    And why isn't flash memory sold in ceramic housings that can stand me baking them in an oven for a few days to fix failed flash manually?

    I'd like to buy hardware that works, or that can be repaired. That's not flash.

    • Maybe because the thing's powered by solar panels that could barely heat a cup of coffee, much less get a flash card to 800 degrees?

    • by Guspaz ( 556486 ) on Tuesday December 30, 2014 @10:40AM (#48696137)

      Explain how the results of research done two years ago could have been built into a probe launched ten years ago using technology from twenty years ago?

      • by emil ( 695 )

        This was known, and should have been exploited:

        Although subjecting the cells to high heat could return memory, the process was problematic; the entire memory chip would need heating for hours at around 250 C.

        The rover is equipped with heaters. There is some possibility that simply placing the flash closer could have extended the life of the memory.

        • There may be some possibility. That would, of course, have *definitely* added to the complexity and time taken to construct the rover. Which was done on the cheap, to meet a limited duration mission goal that it has vastly exceeded...without the extra complexity whose omission you find egregious.
        • by bledri ( 1283728 ) on Tuesday December 30, 2014 @01:48PM (#48697935)

          This was known, and should have been exploited:

          Although subjecting the cells to high heat could return memory, the process was problematic; the entire memory chip would need heating for hours at around 250 C.

          The rover is equipped with heaters. There is some possibility that simply placing the flash closer could have extended the life of the memory.

          The rover's primary planned mission was 3 months and the extended mission plan was two years. It lasted 10 years and you're upset they didn't design a way to bake the flash (offline) for four hours at 250C? Self-heating flash did not exist; should they heat all the electronics? Invent a mechanism to remove the flash and put it in a little oven? Are you shutting down the rover's computer for this? How much complexity would that have added? How long would it take to develop?

          There is such a thing as "good enough," and engineers that don't know that never ship usable product.

      • by Tablizer ( 95088 )

        If the gov't has the power to insert birth announcements into Hawaiian newspapers decades old, then it can send new research to old NASA.

    • If it was long-known that long-duration, low-intensity heat would revive failed flash, why did these rovers leave without the ability to do so?

      The article you link to is dated 2012 - the MER rovers launched in 2003. You do the math.

      And why am I not able now to buy flash memory that will heat itself to 800 degrees and heal itself?

      Put an 800 degree flame inside the electronic equipment you use the flash memory in - stand back, way back, and borrow a friend's phone, tablet, or PC to report the results

      • The article you link to is dated 2012 - the MER rovers launched in 2003. You do the math.

        No, the article says that you either need low-intensity, long duration heat (which has apparently long been known), or high-intensity, short-duration:

        Although subjecting the cells to high heat could return memory, the process was problematic; the entire memory chip would need heating for hours at around 250 C.

        We are still buying flash that we can't fix because of the packaging. We're still shipping this unfixable flash

        • The article you link to is dated 2012 - the MER rovers launched in 2003. You do the math.

          No, the article says that you either need low-intensity, long duration heat (which has apparently long been known), or high-intensity, short-duration:

          Hello, McFly... did you even bother to read what you're replying to?

          And either way, dissipating the heat is going to be a serious problem - 250C is still high enough to potentially cause considerable damage to the surrounding components.

          We are still buying flas

    • Re: (Score:3, Informative)

      by JoshuaZ ( 1134087 )
      These rovers were designed to last 90 days. The most broad plans extended to about a year if they were lucky. So no plans were made for everything that could go wrong 5 to 10 years down the road.
      • by godber ( 13887 )

        These rovers were designed to last 90 days. The most broad plans extended to about a year if they were lucky. So no plans were made for everything that could go wrong 5 to 10 years down the road.

        This is the main answer to many of the questions beginning with "Why didn't they..."

        - Austin

    • by gman003 ( 1693318 ) on Tuesday December 30, 2014 @11:02AM (#48696319)

      The expected mission life of the rover was 90 days. It is currently on day 3885.

      They expected to run out of power several years ago. Thus, they did not design other parts of the system to last as long as it has. Given the designed lifetime, it would have been absurd to add the extra weight of a heating system, if such a thing could even be powered at all.

      For a car analogy, that would be like reinforcing your transmission because after 10,000,000 miles it starts to get a bit off-balance.

    • Because NASA is staffed with morons too stupid to consult with you, obvs.
  • if the issue turned out to be mould.
  • Or at least the failing flash isn't the reason the problem is serious. Software bugs involving how the failed flash is handled are the problems, causing infinite loops and automatic reboots.
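To make the "infinite loops and automatic reboots" failure mode concrete, here is a rough sketch of one way software could break out of such a loop: count consecutive failed flash mounts and, past a threshold, come up without flash at all. This is an illustration under stated assumptions, not how the rover's software works. `boot`, `FlashMountError`, `MAX_FLASH_ATTEMPTS`, and the mode names are all hypothetical.

```python
MAX_FLASH_ATTEMPTS = 3  # hypothetical threshold, chosen for illustration


class FlashMountError(Exception):
    """Raised when the flash file system cannot be mounted."""


def boot(mount_flash, failed_boots: int) -> tuple:
    """Return (mode, failed_boots) for one boot attempt.

    mount_flash: callable that raises FlashMountError on failure.
    failed_boots: consecutive failed flash mounts so far; in a real
        system this counter would have to survive a reset.
    """
    if failed_boots >= MAX_FLASH_ATTEMPTS:
        # Stop retrying: run from RAM only, without touching flash.
        return "ram_only", failed_boots
    try:
        mount_flash()
    except FlashMountError:
        # Count the failure; the caller resets and tries again.
        return "reset", failed_boots + 1
    return "normal", 0
```

Without the counter check at the top, a flash bank that always fails to mount would produce exactly the endless reset cycle described above.
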

  • by mschaffer ( 97223 ) on Tuesday December 30, 2014 @11:06AM (#48696363)

    So, does that mean that NASA needs to go back to the plated wire memory and tape systems like the Honeywell systems that ran the Viking and Voyager systems for decades on Mars and in space?

    • No. Opportunity was designed for a 90 day mission. It's on 10 years now. So failing flash memory isn't going to be a problem if NASA's next Mars rover has a mission length of one year. If NASA is planning on a 10 year Mars Rover, though, they'll want to take this flash degradation into account. Somehow, I don't see a planned 10 year mission happening. A one year mission that lasts ten years? Possibly. But not a mission that is planned to last for 10 years.

    • I suspect on a $/MB basis, the remaining functional flash memory cells are much cheaper than plated wire memory. And tape suffers from moving parts and degradation over time.

      I think the real solution to this will be developing methods to identify failing cells and have the controller write around them. Kinda like HDDs mark bad sectors. Heck, you could probably buy dozens of banks of flash memory for the same cost as plated wire memory, and switch to a new bank as soon as the old one developed too many
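The "write around failing banks" idea can be sketched as a toy model: track errors per bank, retire a bank once it crosses a threshold, and move to the next healthy one. The `BankedFlash` class, the error threshold, and the bank counts are assumptions for illustration only, not anything NASA has described.

```python
class BankedFlash:
    """Toy model: use the active bank until it accumulates too many errors."""

    def __init__(self, num_banks: int, max_errors: int):
        self.max_errors = max_errors
        self.errors = [0] * num_banks
        self.retired = set()
        self.active = 0

    def report_error(self) -> None:
        """Record an error on the active bank; retire it if over the limit."""
        self.errors[self.active] += 1
        if self.errors[self.active] > self.max_errors:
            self.retired.add(self.active)
            self._switch()

    def _switch(self) -> None:
        # Move to the lowest-numbered bank that has not been retired.
        for bank in range(len(self.errors)):
            if bank not in self.retired:
                self.active = bank
                return
        raise RuntimeError("no healthy banks left")

    def usable_banks(self) -> int:
        """Capacity left, measured in banks."""
        return len(self.errors) - len(self.retired)
```

Each retired bank costs capacity, so this only helps if the spare banks are cheap enough to carry, which is the trade-off being argued in this thread.
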
  • Let me guess: they used OCZ flash memory?

  • Dave, my mind is going. I can feel it. I can feel it. My mind is going. There is no question about it. I can feel it. I can feel it. I can feel it. I'm a... fraid.
  • 12 years ago there were no smartphones, tablets, or flash laptops, due to the expense of flash. Even iPods had micro-disks. Just a few cameras and MP3 players with very limited memory. Those devices, or at least their chips, were upgraded long ago.
  • Happens to us all, Opportunity.

  • Now I'll just fire up my Steampunk Mars Exploratron and off we go!

  • If good science would still be available after a decade (Opportunity) or many decades (Voyager), at least light components like flash and electronics in general should be designed with a good degree of redundancy. Or else if the probe has a limited mission and has accomplished it, there is nothing wrong with abandoning it and focusing money and talent on new missions. Would engineers working on attempts to fix Opportunity be more useful working on the newer Curiosity mission? My gut feeling is that making existin

    • The point is that mission planning should have clear focus one way or the other.

      The mission was designed to last 90 days. Through the wonder of excellent engineering and fortuitous circumstances during the mission, it has lasted a decade. There is no reason to abandon the mission now while they're still managing to get good science out of the vehicle and its instruments. When such time comes that the cost is greater than the justification to extend the mission, it shall be retired as so many other missions have in the past.

  • This is an interesting event. Failure of the flash memory can only really be overcome by either replacing it or having a secondary flash that's on standby, syncing up periodically so that it has much less wear on it, so you can extend the mission by switching over to the backup/secondary flash memory. However, this would add precious ounces to the payload, thereby requiring more fuel, etc.

  • But then again, who does?
