NASA Mars

NASA Update Will Deal With Opportunity Flash Memory "Amnesia"

BarbaraHudson writes Computerworld has some details on NASA's latest fix to allow the Opportunity Mars Rover to store data while in overnight "sleep mode." Opportunity has been suffering from a glitch that's causing what NASA scientists describe as memory and data loss — or robotic "amnesia" — caused by flash memory deterioration since early December. Currently any information gathered is stored temporarily in RAM and must be sent to Earth so it's not lost when Opportunity powers down.
This discussion has been archived. No new comments can be posted.

  • by Russ1642 ( 1087959 ) on Thursday January 08, 2015 @03:34PM (#48768183)

    This is why I'll never use an SSD in my computer. They're just unreliable.

    • Comment removed (Score:4, Informative)

      by account_deleted ( 4530225 ) on Thursday January 08, 2015 @03:38PM (#48768235)
      Comment removed based on user account deletion
    • by gstoddart ( 321705 ) on Thursday January 08, 2015 @03:38PM (#48768239) Homepage

      While that may or may not be true, it's kind of random.

      These things were intended to last 90 days or so. It's been 11 years.

      At this point, if they're still able to apply updates to make fixes to the damned thing it means this has outlived its originally planned lifespan by a massive amount of time.

      I hardly think that's a fair critique of SSD in general. I'd say pretty much every part of that rover has performed well beyond anything it was ever expected to.

    • by Redbehrend ( 3654433 ) on Thursday January 08, 2015 @03:49PM (#48768379)
      I run SSDs on all my computers and have never had one fail yet. When one does, it will take less than 10 minutes to get a new one up and running. (I've already tested my restore plan with my backups.) In the meantime I enjoy the awesome performance and get back my 10 minutes and money 1000 times over from the time it saves, lol.
      Most of the ones that fail are the cheap, clearance, no-name ones; the good Samsung and Intel ones are developing a great track record.

      But on topic, the rovers are all custom and it's been 10+ years. I believe they already know about this flaw now and have found ways around it with newer flash memory.
      • I've had an SSD in my desktop for years now. It's fine. I think that poster might have a sense of humour.

    • by Anonymous Coward

      I will just leave this here...

      http://techreport.com/review/26058/the-ssd-endurance-experiment-data-retention-after-600tb

      Remember that flash memory is basically from 1999. I bought a 1TB drive last year. I have in that year put about 5TB on it. They are rated for 100TB.

      I do tons of dev work and virtual machines on it.

      SSDs have really come into their own. I had a laptop from 2009 with a 16GB SSD that ran continuously, 24/7, for 5 years. It would still be running but I wanted a bett

    • by bluefoxlucid ( 723572 ) on Thursday January 08, 2015 @04:09PM (#48768585) Homepage Journal

      Actually, the amount of NAND in Opportunity is 256MB, which isn't a lot. RAM is only 128MB; NAND is routinely given a full wipe and rewrite. Given it was only meant to last for 90 days, it's probably a low-grade MLC with under 5000 erase cycles reliability.

      By contrast, a desktop SSD of 32-256GB will see under a gigabyte of writes per day. The documents, cache, e-mails, and updates written per day total very little (hundreds of megabytes, at most) for most workloads, up to and including intensive workloads such as 3D modeling and multi-layer 2D image editing, which produce massive working sets but only make minor changes to them. Large workloads include video editing, which does write multi-gigabyte files out on each large render operation (which may be frequent in some workflows).

      System drive SSDs see small workloads even when used as a swap device: swap offloads a lot of stale memory from RAM, which is either hardly ever used, never written to, or simply stale. The brk() area can have fragmented holes too small to take new allocations, and so may page out entirely unused RAM to disk; much RAM (such as GUI elements, textures, models, and audio assets) contains load-once data that's written to disk when swapped, and then only read later, such that it's left on swap and evicted from RAM without writing out again when memory gets scarce. That means 2GB of swap might be 2GB of writes since boot time, plus maybe a hundred megabytes or less per day of continuous system run.

      Finally, system drives also wear level internally. Writing to the same 1MB area over and over will spread the writes out: the controller internally accounts for those blocks being erased and rewritten elsewhere, often by additive writing (writing without erasing) to avoid wearing the drive. For example, you can keep a used map and an erased map, in which anything erased or not used is free. You can then write to the used map and to the erased map until you run out of free blocks and need to erase a newly used block; at that point you mark all used-erased blocks free and mark no blocks as erased, costing one write-erase cycle even though you've written thousands of times. You'll get a full write-erase cycle for every full-capacity write: even if you write to the same spot repeatedly, you only use one write-erase cycle writing 32GB to a 32GB drive, or 256GB to a 256GB drive.
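
      As a rough illustration of the wear-leveling point above, here is a toy Python model. It is only a sketch under simplifying assumptions, not how any real controller works: the names `WearLevelledFlash` and `simulate` are hypothetical, every block is erased as soon as its data goes stale, and the "additive writing" trick described above is not modeled. It only shows that hammering one logical block can still spread erases evenly across the physical blocks.

```python
class WearLevelledFlash:
    """Toy model: each logical write lands on the least-worn free physical block."""

    def __init__(self, physical_blocks):
        self.erase_counts = [0] * physical_blocks  # erases seen by each physical block
        self.mapping = {}                          # logical block -> physical block
        self.free = set(range(physical_blocks))    # blocks holding no live data

    def write(self, logical_block):
        # Choose the free block with the fewest erases (ties go to the lowest index).
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        old = self.mapping.get(logical_block)
        self.mapping[logical_block] = target
        if old is not None:
            # The stale copy is erased and returned to the free pool.
            self.erase_counts[old] += 1
            self.free.add(old)


def simulate(physical_blocks=8, writes=8000):
    """Write the same logical block repeatedly and return per-block erase counts."""
    flash = WearLevelledFlash(physical_blocks)
    for _ in range(writes):
        flash.write(0)  # always the same logical address
    return flash.erase_counts


if __name__ == "__main__":
    print(simulate())
```

      In this model the erase counts of any two physical blocks never differ by more than one, even though every write targets the same logical address.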

      Accounting for all this, the drives are quite long-lasting. For a 10,000 cycle drive, you'd have to write 320,000GB to a 32GB SSD or 2,560,000GB to a 256GB SSD to wear it out. For the 32GB drive, that's 876 years at 1GB per day, or over 8 years if you're writing 100GB per day. For the 256GB drive, it's about 7,000 years at 1GB per day, or 70 years if you're writing 100GB per day. Modern MLC NAND can survive 100,000 write-erase cycles before failure, so these lifetimes may be 10 times higher; high-end SLC drives can survive 10,000,000 write-erase cycles, and can back high-traffic SANs for decades.

      They actually burn out less frequently than hard drives.
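
      The lifetime figures in this comment can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch that assumes ideal wear leveling (every cell is erased equally often) and ignores write amplification; the function name `years_to_wear_out` is made up for illustration.

```python
def years_to_wear_out(capacity_gb, erase_cycles, writes_gb_per_day):
    """Years until every cell has used up its rated erase cycles.

    Assumes perfect wear leveling and no write amplification, so the
    total writable volume is simply capacity * rated cycles.
    """
    total_writes_gb = capacity_gb * erase_cycles
    days = total_writes_gb / writes_gb_per_day
    return days / 365


if __name__ == "__main__":
    # The comment's scenario: 10,000-cycle drives at 1GB/day and 100GB/day.
    for capacity_gb in (32, 256):
        for rate in (1, 100):
            years = years_to_wear_out(capacity_gb, 10_000, rate)
            print(f"{capacity_gb}GB drive at {rate}GB/day: {years:.1f} years")
```

      With these assumptions the 32GB drive comes out at roughly 877 years at 1GB per day and just under 9 years at 100GB per day, in line with the numbers quoted above.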

      • by Kjella ( 173770 ) on Thursday January 08, 2015 @05:06PM (#48769333) Homepage

        Actually, the amount of NAND in Opportunity is 256MB, which isn't a lot. RAM is only 128MB; NAND is routinely given a full wipe and rewrite. Given it was only meant to last for 90 days, it's probably a low-grade MLC with under 5000 erase cycles reliability.

        I very much doubt that. If you're going to spend that amount of money on sending it to Mars, you don't skimp on off-the-shelf technology that costs a few hundred bucks. You may have to ditch developing that custom system that'd increase the mission budget by a million dollars, but that's different. And if you want to make it as radiation-hardened as possible, I doubt they'd go with anything but SLC for maximum signal strength. I doubt it has anything at all to do with write cycles, which they presumably have full control over; it's probably from operating in the harsh environment for 11 years. No wonder it's getting a bit flaky.

        • I very much doubt that. If you're going to spend that amount of money on sending it to Mars, you don't skimp on off-the-shelf technology that costs a few hundred bucks.

          Putting a high-grade SLC 10,000,000 write cycle NAND bank into a rover intended to run for 90 days and do 1-2 writes per day would be extreme gold plating.

          Quality is the degree to which a deliverable satisfies requirements. NASA required something that would reliably last for 90 days under their workload; the highest-quality device would be the least-expensive device which satisfies this (along with the constraints of power usage, weight, and so on). MLC with 200 write-erase cycles would barely satisfy

      • Also, no one has flash from that long ago or is still using it. Current chips are a thousand times larger.
      • http://techreport.com/review/24841/introducing-the-ssd-endurance-experiment/5

        These guys ran continuous high-IO tests on commercial SSDs for over a year - the results are impressive. Most drives could write hundreds of terabytes before significant issues, with some reasonable COTS drives successfully writing/reading petabytes.

        I'd certainly trust SSD longevity over spinning platters, these days. Sure, $/GB means archival storage of large data sets goes to hard drives or tape, but absent constant, bus-li

    • What, because a space craft that was designed to work for 90 days actually lasted 10 years? Given that, shouldn't you expect SSDs to last way way longer than they claim based on this evidence?

    • I said the same thing until I saw firsthand the performance benefits.

      Since then, I've taken a hybrid approach. My laptop ultrabay has a normal magnetic drive, and the primary drive is a 256GB SSD. My system is installed on the SSD, and all frequently written directories are mounted on the ultrabay platter drive.

      It's not as fast as it could be running entirely from SSD, but I get a big speed boost with the security of magnetic media for my files.

      I've had one failure on the SSD, which the manufacturer resolved

  • Keeping old relics and forgotten dreams alive is what they do best.

  • by Anonymous Coward

    FTFY
    Computerworld has no details on NASA's latest fix..

  • Wow talk about lag, sending the data all the way back to earth. Ouch. Maybe they should just make a big orbital server to put around Mars so the surface probes can upload to it.
    • by Anonymous Coward

      Do you honestly think the rovers are actually capable of communicating with Earth directly, or are you just joking around? Got news for you if it's not a joke...

  • FRAM vs NAND (Score:5, Informative)

    by volvox_voxel ( 2752469 ) on Thursday January 08, 2015 @05:23PM (#48769509)

    I've never been a big fan of flash memory, given that it has a finite number of write cycles before a memory bit fails (varying between 1 and 100 million write cycles). The probability may be low that an individual bit will need to flip so many times in its lifetime, but it's still an issue. A lot of care must be taken by the firmware engineer to handle this. There are a lot of job postings for firmware engineers who understand flash.

    I'm a huge fan of FRAM. It has a lifecycle limit that is quoted as being 10 trillion write cycles (some mention it being infinite). The memory density is lower, but it is a lot more reliable. Its biggest issue is that the density is lower. For a spacecraft, I'd much rather have a board of these 2Mbit FRAMs than a large flash chip. They use these things in smart meters, etc. In embedded systems, you have to be really careful not to write to the flash too often, at the risk of damaging it. Most fast SD cards have their own dedicated microcontroller (ARM9, etc.) to do what they can to extend the life of the flash.

    A datasheet of an FRAM device: http://www.fujitsu.com/downloa... [fujitsu.com]

    One question I have is how FRAM compares to NAND flash in a harsh radiation environment, and how the radiation environment on Mars differs from Earth's. How many vendors offer rad-hard processes for FRAM, and how do they perform?

    Here is one link I could find on FRAM, but the report from 2011 is not clear:

    http://cdn.intechopen.com/pdfs... [intechopen.com]

    • I've never been a big fan of flash memory, given that it has a finite number of write cycles before a memory bit fails (varying between 1 and 100 million write cycles).

      I'm a huge fan of FRAM. It has a lifecycle limit that is quoted as being 10 trillion write cycles (some mention it being infinite). The memory density is lower, but it is a lot more reliable. Its biggest issue is that the density is lower.

      No, the biggest issue is likely cost. A quick Google search says a 4 megabit (0.5 MB) FRAM module c

      • Forgot the tl;dr: Basically this is a problem which will disappear on its own. Opportunity has only 256 MB of flash memory (which was quite a lot when it was launched in 2003 - I paid $100 for a 512 MB flash card in 2004). So each individual NAND cell is being used a lot. As you put in larger amounts of flash memory, each cell gets used less frequently, and the problem goes away on its own.
      • Using www.findchips.com (a great site to check for parts and availability across multiple distributors), in small quantities, the 2Mbit part is ~$5. But still, your argument is valid. For spaceborne applications where reliability is everything, I'd still like to know about its rad-hard status. These parts come in 8-pin packages, and could also likely scale if they wanted to. Who's to say that in the future we wouldn't see parts that are orders of magnitude larger?

        I personally am excited to see the memristo

      • No, the biggest issue is likely cost. A quick Google search says a 4 megabit (0.5 MB) FRAM module costs about $50. That's $100,000 per GB. You can load your spacecraft with a couple dozen extra reserve NAND memory banks at 1/1000th the price.

        But they only put 256MB of storage onboard, so it's only about $25,000. It costs what, $7k/lb to orbit, best case? So yeah, it's cheaper to add flash, but the break-even point given reliability probably comes sooner than you think.

        Shouldn't MRAM be better at this sort of thing than flash, too? It's cheaper than FRAM, anyway
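
        The cost arithmetic in this exchange is easy to check. The sketch below uses only the figures quoted in the thread ($50 for a 4 megabit FRAM module, 256MB of onboard storage) and assumes decimal units (1GB = 1000MB); the function names are made up for illustration, and launch mass and the price of NAND are left out because the thread gives no numbers for them.

```python
def cost_per_gb(module_price_usd, module_megabits):
    """Price per decimal gigabyte when buying storage in fixed-size modules."""
    module_mb = module_megabits / 8          # 8 megabits per megabyte
    price_per_mb = module_price_usd / module_mb
    return price_per_mb * 1000               # assumption: 1GB = 1000MB


def storage_cost(total_mb, module_price_usd, module_megabits):
    """Cost of reaching total_mb of storage using such modules."""
    module_mb = module_megabits / 8
    return total_mb * module_price_usd / module_mb


if __name__ == "__main__":
    print(cost_per_gb(50, 4))        # the quoted $50, 4 megabit FRAM module
    print(storage_cost(256, 50, 4))  # Opportunity-sized 256MB store
```

        That gives $100,000 per GB and $25,600 for 256MB, which matches the rounded figures quoted above.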

        • But the whole point of the project was to use cheaper, more off-the-shelf parts and components. Given that it's now ten years past its 'guaranteed' lifespan, I don't think they did anything wrong.

          Did FRAM even exist ten years ago?

    • by necro81 ( 917438 )

      I'm a huge fan of FRAM

      That may be, but it wasn't exactly an option (in the sense of it being readily available and thoroughly tested for spaceflight) when the rovers were being designed 15 years ago. Is it even an option today?

  • They shouldn't have bought that discount flash from China after all I guess.
