Science

Storage Dilemma Looms for NASA 75

Posted by CmdrTaco
from the this-is-gonna-get-crazy dept.
John Keeton writes "Guys, This story talks about how NASA is moving its data from tapes as old as seven tracks to newer media, but when they get done, they have to start moving it again to new media, and how they are falling behind, and may have to lose terabytes' worth of data.. Really interesting.." It says it will take them 4 years to move all the data to tapes that have a 6 year life expectancy. Hmmm.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    The project I'm working on right now has a requirement that all data be stored for the life of the spacecraft +3 years. The spacecraft in question is expected to last 20 years. We're expecting data dumps of about 10GB each twice an orbit (an orbit is about 90 minutes).

    This is just one project, and we haven't even launched the spacecraft yet.
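A rough sketch of what that requirement adds up to, using only the figures quoted above (10 GB per dump, two dumps per 90-minute orbit, a 20-year life). Decimal units, a 365-day year, and the function name are assumptions for illustration; the extra three years of retention extend how long the data is kept, not how much is collected.

```python
# Back-of-the-envelope archive size for the mission described above.
# Figures from the comment: ~10 GB per dump, 2 dumps per ~90-minute orbit,
# 20-year spacecraft life.
# Assumptions (not from the comment): decimal units, 365-day year.

GB_PER_DUMP = 10
DUMPS_PER_ORBIT = 2
ORBIT_MINUTES = 90
MISSION_YEARS = 20


def archive_size_tb(years=MISSION_YEARS):
    """Total data collected over `years`, in terabytes."""
    orbits_per_day = 24 * 60 / ORBIT_MINUTES
    gb_per_day = orbits_per_day * DUMPS_PER_ORBIT * GB_PER_DUMP
    return gb_per_day * 365 * years / 1000


if __name__ == "__main__":
    print(f"per year:     {archive_size_tb(1):.1f} TB")
    print(f"over mission: {archive_size_tb():.0f} TB")
```

Under these assumptions the spacecraft produces about 116.8 TB a year, or roughly 2.3 PB over its life.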
  • by Anonymous Coward
    My understanding is that a lot of this data
    was just collected by automatic telemetry and
    archived. It has never been analyzed, there
    are no plans to analyze it in the future, it
    can take major research just to figure out
    the file and data structures, etc...
    The proper conceptual model for visualizing the
    structure of this archive is the buildup of
    ocean sediments, and the proper data mining
    technique is based on the analysis of drill
    cores... AHA... here we have reached the
    boundary layer between the 7090/Fortran II
    sediments and some traces of an early 360...
  • by Anonymous Coward
    Our organization has vast amounts of important data, some of which dates back several decades. This data must remain accessible in the future as well. Some time ago they got rid of their once-multi-million, now-obsolete mainframe. After that they wasted a lot of money finding a method to read legacy mainframe tapes on Unix. The costs involved, such as reverse-engineering the interfaces of the tape device, the file format, etc., and simply the effort of having all of those clumsy tapes read, taught them a lesson: as data amounts continue to grow, ANY backup medium will grow obsolete in a few years. In a few years you won't be able to buy a DVD drive anymore because it will have grown obsolete, just like C=64 cassette tapes today. Repeated massive transfer operations from obsolete media are simply out of the question.

    Their solution was to keep ALL of the data ONLINE. Backups are naturally taken regularly, with whatever equipment is in use at a given time. As all of the old data migrates to the new whizbang online storage along with everything else, there is no need to worry about aging archive libraries and preserving the technology to read the obsolete media. Not to mention the newfound ease of access to the more rarely needed old data.

    Sure, you need to invest in large RAID arrays or what have you. Sure, you need to invest in a high-volume backup system. But those systems are replaced "often", since they are a part of the production system. In a few years, the extra volume that required the extra investment today will seem non-existent.

    Now this may sound wild at first, but think about it: Data grows. Disks grow larger and cheaper, and are replaced. Backup media grow obsolete. The conclusion is a no-brainer, actually.

    --
    Teemu Yli-Elsila
  • by Anonymous Coward
    It just costs a bit of money.

    Get a hierarchical storage system. They make it fairly easy to migrate data across different media types. When a new bigger, faster, better storage medium comes along, just add it to the hierarchy. Of course, BIG automated media libraries help a lot; they save on the arms.
  • by Anonymous Coward
    NASA is good at pushing things, so this might be good for all of us.


    I think their problem calls for a layered solution. First, they should copy their oldest tapes onto modern archival-grade tape. Optical solutions just won't cut it yet, and if they feel the data is worth keeping (and it probably is) then they need to get it into a format that is easier to work with.
    I would duplicate the new tapes too: take an old tape and copy it onto 2 new tapes.


    Secondly, they need to go optical. Pressed CDs have an average life of 100 years. CD-Rs are good for a couple of decades, and if you take good care of them they are expected to last longer. NASA needs to push on blue, green, or violet laser stacked CDs and DVDs. That stuff is in the labs now, but if NASA and some government agencies started sounding the alarm for it, maybe production could be expedited. A blue laser DVD should hold between 50 and 100GB of data; that is still short of some of the really big tapes out there, but I would think it would work. Get STC to build an automatic tape-to-DVD jukebox machine and the problem would start to go away.


  • mke2fs /dev/null
  • Posted by korto:

    as long as they don't have the government on their back (the communists are coming!!!) to pump their energies up they don't do zilch!

  • As an employee of StorageTek I like reading this article. It gives me hope for the future. :)

    #include

  • by bluGill (862)

    I can place 50 gigabytes on a single tape, uncompressed. I can read or write that entire tape in the same time that a single-speed CD-ROM can read 640 megs. A 40x CD-ROM would need more media changes (a robot would do the changes, of course), and even not counting the media changes it would take 1.5 times as long!

    Optical storage has promise, don't get me wrong, but when you're trying to spool data as fast as NASA does for some applications, it isn't suitable.

    #include <stddisclaimer.h> I'm not speaking for anyone here; all numbers have been rounded and estimated.

  • by bluGill (862)

    I agree: use the right tool for the job. I also agree that optical storage can work, but optical drives go obsolete too. My prediction is that in 5 years manufacturers are going to notice that CD-ROM is never used, and DVD players will cost $.50 less because they no longer put the ability to read CD-ROMs in DVD drives. Whoops, gotta move that optical data off CD-ROM now, while you can still find a reader.

    You're still missing the point though: they want to use that data. I just stated the speed they can get from tape. They can't tell me which tapes they will need to read next year, but some of those tapes will be needed for some project. A supercomputer is a device for turning a CPU-bound problem into an I/O-bound problem. While many supercomputers run Unix and can multitask, the users still want the answer fast, and waiting for data to come off an optical cartridge isn't a good use of time. In today's world human time is more expensive than computer time, so it is worth the cost to make sure human time isn't wasted.

    Don't forget that we're talking about several hundred terabytes of data at NASA. Even in the optical storage system everyone is envisioning (which may eventually be made, but it isn't effective today) it will take up significant space, and unless the media never changes (like CD-ROM->DVD media didn't change, right?) they still need to migrate. I'm not a prophet, and I'm not about to predict that formats won't change.

  • I know for a fact that most current NASA storage is from StorageTek, and they are famous for robots, so it is likely that the current stuff is robotic. I'm also well aware that 2 years ago it was someone else, and if they aren't careful it could be someone else again as they keep upgrading capacities.

    So it is a safe bet that the new tapes are in a robot. I'm comfortable saying, though unsure, that the old tapes were at least partially manual.

    Actually I know the new systems are robotic, because NASA keeps their data in the same building where they handle the deadly chemicals for the Shuttle booster rockets. The data center people really hate to be in a room that shares ventilation with a room where they mix two deadly gases to make something even more deadly. I don't know why they don't move it.

  • Access time is a legitimate concern if it becomes a bottleneck. In computer science terms: a 50 GB tape takes O(r+n) time. A bunch of CD-ROMs take r^75+n time. Also note that the constant before n is bigger with a CD-ROM. Simply put, a dense tape is faster than optical, and has about the same lifetime. (CD-R is not good for 100 years; as others have noted, it is guaranteed for 20 years, and tape can be that good.)

    I'm not against an optical storage system. I'd seriously consider investing money in research for such systems. But magnetic media still has life, and is still in general a better solution than optical. Yes, I expect this to change in the future, but NASA is dealing with today; they will probably migrate to better media again in 5 years. That's SOP for many businesses as they try to reclaim the space consumed by the older storage.

  • go watch the first Star Trek movie.

    If you don't want to, the premise of the movie was that some aliens picked up the Voyager probe, read the programming that said that it needed to return the information it collected to Earth, and sent it back towards Earth with a bunch of new machinery inside a gas cloud several solar diameters big. It destroyed everything in its path trying to get back to Earth, and although it was sending out some data, no one on Earth remembered how to activate Voyager's transmit sequence.
  • Have a look at this one:

    [norsam.com]
    http://www.norsam.com/hdrom.htm

    They are a DOE spin-off working on archival technologies. The idea is to use particle beams to do the writing instead of lasers: you can focus the beam much more tightly, hence make much smaller dots. They have two technologies -- digital holding 165GB/disk, with 20MB/s storage rate, and analog, holding 90,000 pages scanned at 300dpi. Both use _very_ durable silicon-wafer substrates.


    At that density, a 6-platter changer holds a terabyte, and a dozen 500-platter jukeboxes hold a petabyte. If you want really fast access, stripe across multiple platters -- if you stripe 8-way, you get a transfer speed of about 10 gigabytes per minute (8 x 20MB/s), which does better than NASA's old tapes (someone said 23 months, iirc).


    fwiw
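A quick sketch of the capacity and striping arithmetic in the comment above, using the quoted 165 GB per disk and 20 MB/s. Treating the 20 MB/s "storage rate" as the sustained per-platter transfer rate, the decimal units, and the function names are all assumptions for illustration.

```python
# Capacity and striped throughput for the HD-ROM figures quoted above.
# From the comment: 165 GB per disk, 20 MB/s storage rate.
# Assumptions: 20 MB/s applies per platter, decimal units throughout.

DISK_GB = 165
RATE_MB_PER_S = 20


def capacity_tb(platters):
    """Total capacity of `platters` disks, in terabytes."""
    return platters * DISK_GB / 1000


def striped_gb_per_minute(stripes):
    """Aggregate transfer rate when striping across `stripes` platters."""
    return stripes * RATE_MB_PER_S * 60 / 1000


if __name__ == "__main__":
    print(f"6-platter changer:       {capacity_tb(6):.2f} TB")
    print(f"12 x 500-platter boxes:  {capacity_tb(12 * 500):.0f} TB")
    print(f"8-way stripe:            {striped_gb_per_minute(8):.1f} GB/min")
```

With these inputs the changer comes to 0.99 TB, the dozen jukeboxes to 990 TB, and an 8-way stripe to 160 MB/s, which is 9.6 GB per minute.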

  • Sure, but how about a DC600A tape drive? For that matter, now that you've got the C64 tape, what will you read it with? Sure, if the data is desperately important, you could do some sort of hack involving a sound card and a tape player, but that's for data measured in K.

    5.25 IBM formatted still isn't too hard, but how about 8 inch from a PDP-11?

    The point is, no matter how long lived a storage method looks now, in 10 to 20 years, it'll be a big pain.

  • Although painful to think about (given the volume of data), perhaps constantly migrating to the latest and greatest isn't the best answer. For example, I have some old WORM media, and some old punch cards. Guess which one I can still read (if I were really desperate to preserve COBOL code).

    I also have some ancient 78 rpm records that I can still play, and some 10 year old audio CDs that I can't. It seems that there's been a wee little bit of spec drift in CD players so that not all new players work well with some old CDs. I say that because I have an older CD player that has no problems with the same old CDs. Weird but true.

  • could you imagine trying to locate a specific file, somewhere in a library of 100,000 CD's?

    they MUST find a solution to the capacity problem first. Optical don't cut it.
  • Really, it's all Steve Jobs' fault.

    All those Macintoshes at NASA, now they have to copy their data from floppies to USB-connected ZIP drives. Just because Steve Jobs says floppies are obsolete!

    Thousands of terabytes? Think how many ZIP disks that is!
    Giga-click-of-death!

  • ...and they're difficult to grep....

    dylan_-


    --


  • Wow!

    dylan_-


    --

  • If they know that tapes have such a short life span (for their purposes, at least), why do they want to transfer their old data from tapes to *shudder* more tapes? Geez... BTW, there are storage devices intended specifically for long-term storage, and they happen to be based on optical storage. WORM and COLD are but a few of many. If DVD weren't such a mess of a "standard", they could use 4-layer double-sided disks to store the data. In short, optical storage is the best candidate for such a task (but many of you already knew that :).
  • The new tapes will have 6 years of life left AFTER the project is over, meaning the life expectancy is 10 years, not 6.

    Not that it makes it any better. Someone tell NASA about DVD and other digital storage methods.

    asinus sum et eo superbio

  • At least save the transmit code for Voyager 6.
    ! ! ! ! ! ! ! ! ! ! !


  • Hi.
    This is exactly the subject of a really fantastic article at Wired's magazine archives. Thought I'd contribute the URL [wired.com].
    Enjoy,
    -p
    --
  • Modern commercial tape technology, specifically DLT, has gotten very fast and reliable. While I realize most of you haven't dealt with anything larger than a peecee and therefore find real technology hard to deal with, it is out there. A Quantum DLT7000 drive, for under $5k, can write 35GB native (70GB compressed; onboard hardware compression) at 5MB/sec media speed. Also, according to Quantum's specs, a DLT cart has a storage life of "More than 30 year with less than 5% loss in demagnetization (at 20C and 40% non-condensing humidity)". DLT is very fast, reliable, reasonably priced (given what it does), and has been around for a while. If they're using DAT (or other helical scan technology) for all this data, they need to get their head(s) checked.
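To put the DLT7000 figures above in perspective, here is a small sketch of how long one cartridge takes to fill and how many cartridges a large archive would need. The 35 GB native capacity and 5 MB/s media speed come from the comment; the 100 TB archive size is borrowed from elsewhere in this discussion as an illustrative input, and decimal units are an assumption.

```python
import math

# DLT7000 figures quoted above: 35 GB native per cartridge, 5 MB/s media speed.
# Assumptions: decimal units, no compression, drive streams at full speed.

NATIVE_GB = 35
MEDIA_MB_PER_S = 5


def hours_to_fill_cartridge():
    """Time to write one full cartridge at the native media speed."""
    return NATIVE_GB * 1000 / MEDIA_MB_PER_S / 3600


def cartridges_needed(total_tb):
    """Whole cartridges required to hold `total_tb` of uncompressed data."""
    return math.ceil(total_tb * 1000 / NATIVE_GB)


if __name__ == "__main__":
    print(f"one cartridge: {hours_to_fill_cartridge():.2f} hours")
    print(f"100 TB needs:  {cartridges_needed(100)} cartridges")
```

So a single drive fills a cartridge in a little under two hours, and a 100 TB archive would need a few thousand cartridges before any compression.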
  • Is this the major cause? The amount of data doesn't seem particularly large. The company that I work for (a large British bank) has 36 STK Powderhorn silos each of which holds 12TB.

    Alternatively are they limited by the speed of the
    old tape drives, rather than CPU throughput?
  • by acb (2797)
    Don't CD-Rs have a life expectancy somewhere in the same ballpark as NASA's tapes? I heard that they start to suffer bitrot as the dye fades/decays.

    Come to think of it, is there any high-capacity digital medium that could be reasonably expected to securely hold its data for centuries (as printed text on paper lasts)?
  • Parcel out the less-critical/unknown tapes to various interested parties or institutions, with the understanding that a serious effort be made to upgrade the storage.

    If all else fails, draw up a full-color glossy ad and sell 'em off to the SF nuts in the back of STARLOG, complete with velvet-lined display case and commemorative plaque. Oh, and when the display case is opened, you get a cheap, tinny-sounding rendition of the first few bars of "2001".

    Kewl!

  • yes, but for every parchment/papyrus made when the dead sea scrolls were made, there is probably 10000 that have rotted away long ago.
  • how much data does one of these rolls of tape hold?
  • by datazone (5048)
    Speed should not be a problem in accessing data from an optical disc. True, you may not be able to spin the platter beyond certain speeds. But isn't there some technology called "zen" that uses multiple lasers or heads to read multiple areas of the disc and recombines them, giving you a larger flow of data? If NASA or anyone else pushes this technology to its upper limit, you should be able to reach very large data bandwidths. True, you will need some CPU power to put the data back together, and maybe a large secondary buffer to hold data while it is reorganized, but that should be nothing for a supercomputer to do.
    It may not be perfect, but it could tide them over for the next 10 years at least, until we start using bio-chemical storage devices.
  • For spoken voice, what's the difference?
  • by rpete (6612)
    > Would like to know what project your talking about (maybe AXAF?)!

    What's AXAF? :-) See the satellite formerly known as AXAF [harvard.edu].

  • by Bilbo (7015)
    Why don't they go to optical disk libraries? (I mean the big 14inch disks.)

    I wonder about some of these time estimates though. Are they talking about the total time to copy all the tapes one at a time? Seems like they could just add more tape drives. The one bottleneck might be the fixed number of readers, since the formats are so old they can't buy new drives to read them...

    (gack! The media loves to latch on to "disaster" stories. :-/)

  • Hi, I work in a Cognitive Neuro-Imaging lab at Princeton University. We use an fMRI scanner to image people while they're doing working memory tasks. Our work generates a substantial amount of data that needs to be archived. Our current need is to archive around 500 GB, but this figure will likely increase to 3 to 5 TB in the next two years. Our current plan includes investigating a Pioneer double-sided DVD-R jukebox that should store 9.4 GB per disk and be available around this summer. We considered tape solutions but found them significantly more expensive (when factoring in the cost of the robotics) for similar capacity. A DVD-based solution also offers the promise of non-proprietary universal access. We would be very interested in any advice/suggestions that the Slashdot community has to offer about "reasonably priced" high-capacity data archive solutions. Thank you.

    Micah
    malpern@princeton.edu
  • by Lando (9348)
    This is absurd. I deal with terabytes of data every day and the cost for what they want to do is not as expensive or as troublesome as they seem to think.

    A system could be set up to give them real-time access to all their information and provide storage for less than 10 million and that's total cost up to 2005 where they were stating storage costs of 50 million a year.

    Something is obviously wrong here. Am I missing something?

  • 23 TB roughly 2.3 million dollars raid 5 drives? Too expensive?

  • Is there any good optical storage medium they can take advantage of? CD's and MO have an incredible storage life.

    Maybe they would have to use Laser Disc-sized DVD technology? Any other thoughts?

  • Cost is a problem right now.

    In case everyone forgets, we're not spending hordes of money on NASA and related departments anymore. In fact, they generally have either static or slightly shrinking budgets. So, naturally, they've gone to strictly commercial stuff whenever possible. No custom-built stuff here.

    The biggest problem isn't the throughput of modern tech (I do suspect that DVD-RAM/DVD-RW will be the format of choice), but the rate at which they can read data off the old systems. As other people have pointed out, a huge chunk of the data is on VCR-style tapes (or 9-track reels) - the readers are old, hard to find, and I suspect can't do more than a couple hundred kB per minute.

    OK, math quiz: 100TBytes / 1MB per minute =~ 100,000,000 minutes =~ 190.25 YEARS. Say you have maybe 100 such readers at your site. It still takes almost 23 months of completely continuous reading to read it all off. No wonder they have a problem...

    Oh, and the stab at all the old farts at NASA was unjustified. Most of the people I know at NASA (@ Ames, Goddard, etc.) are real engineers. Many are getting long in the tooth, but I can safely say most of them are extremely competent, and I'm completely sure this isn't their fault. Probably just the typical upper-level funding problems (ie - complain to the top dogs @ NASA, and, more likely, to the dolts in congress who don't have the vision to properly fund them).
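The math quiz in the comment above can be checked with a few lines. The 100 TB total, the 1 MB per minute read rate, and the 100 readers are the commenter's figures; decimal units, a 365-day year, and the function name are assumptions for illustration.

```python
# Time to read an archive off slow legacy drives, per the "math quiz" above.
# From the comment: 100 TB total, 1 MB per minute per reader, 100 readers.
# Assumptions: decimal units, 365-day year, readers never stop or break.

MINUTES_PER_YEAR = 60 * 24 * 365


def years_to_read(total_tb, mb_per_minute, readers=1):
    """Years of continuous reading needed to pull `total_tb` off old media."""
    total_mb = total_tb * 1_000_000
    minutes = total_mb / (mb_per_minute * readers)
    return minutes / MINUTES_PER_YEAR


if __name__ == "__main__":
    one = years_to_read(100, 1)
    hundred = years_to_read(100, 1, readers=100)
    print(f"one reader:  {one:.2f} years")
    print(f"100 readers: {hundred * 12:.1f} months")
```

This reproduces the roughly 190 years for a single reader and just under 23 months for 100 readers working around the clock.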

  • I read a similar story several years ago. NASA collects an enormous amount of data from the various probes that are wandering about the solar system. At that time, I'm not sure that CD-ROM was proven for data yet and they were placing everything onto magtape.

    Now that CD-ROM is pretty well established, I can't see why it wouldn't be suitable for copying those old tapes onto. OK, OK, DVD will hold more but even CD-ROM will hold tons more than an old 9-track tape. A simple calculation (feel free to correct me if I messed up here) shows

    (2400 * 12 * 6250) / 8 = up to about 21 MB

    I'm guessing that a 9-track tape takes up about the same amount of shelf space as about 6-7 CD-ROMs. Let's see that's 21 MB vs. 3600-4200 MB. Looks to me like they gain back some floor/shelf space as well as longer life for the data.

    The concern about access time can't be that legitimate. Robotic tape handlers aren't any faster than CD-ROM handlers/jukeboxes.

    I hope NASA acts on this before those old tapes become totally unreadable. Loss of this data, IMHO, would be a catastrophe.
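Since the comment above invites a check of the calculation, here is a sketch that reproduces it. The 2400 ft reel, 6250 bpi density, and 6 to 7 CD-ROMs per reel of shelf space are the commenter's figures; the 600 MB per CD-ROM is inferred from the 3600-4200 MB range. One labeled assumption of mine: as far as I understand 9-track tape, the density is quoted per track and each frame across the tape carries one byte, so the division by 8 may understate the raw capacity. Both variants are shown, and neither accounts for inter-record gaps.

```python
# Raw capacity of a 9-track reel versus the CD-ROMs that fit in its shelf space.
# From the comment: 2400 ft reel, 6250 bpi, 6-7 CD-ROMs per reel of shelf space.
# Assumptions: 600 MB per CD-ROM (inferred), no inter-record gaps.

TAPE_FEET = 2400
DENSITY_BPI = 6250
CDROM_MB = 600


def tape_bytes_as_posted():
    """The commenter's formula: treat 6250 bpi as bits and divide by 8."""
    return TAPE_FEET * 12 * DENSITY_BPI / 8


def tape_bytes_one_byte_per_frame():
    """Alternative (assumed): each of the 6250 frames per inch holds a byte."""
    return TAPE_FEET * 12 * DENSITY_BPI


def cdrom_shelf_mb(discs):
    """Capacity of `discs` CD-ROMs occupying one reel's shelf space."""
    return discs * CDROM_MB


if __name__ == "__main__":
    print(f"as posted:      {tape_bytes_as_posted() / 2**20:.1f} MiB")
    print(f"byte per frame: {tape_bytes_one_byte_per_frame() / 1e6:.0f} MB")
    print(f"CD-ROMs:        {cdrom_shelf_mb(6)}-{cdrom_shelf_mb(7)} MB")
```

Either way the conclusion holds: the same shelf space in CD-ROMs holds at least twenty times what the reel does.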

  • by Axe (11122)
    I would love a low job right now...
  • Why do they need all that data? Are they ever going to look at it? What if they hadn't collected as much in the first place? Would they be much worse off than they are now? Can't they just back up the most important stuff?

    How about compressing the data? Not just lzh or something, but things like peaks/troughs and other statistically significant items? Once the raw data has been around for a while, say a year, they can reduce it to what's significant. If later they change their mind, realize they need the original raw data, too bad! They'll just have to revise their algorithms for the future. No big loss.
  • oh yeah, some were on copper sheets.
  • I work at the National Center for Supercomputing Applications at the University of Illinois at Urbana/Champaign, and we use all the latest mass storage technologies (DLT drives, TLM Robots, Tape Silos, etc), and we've also had to do a migration from older media. And it takes time. First of all, our migrations weren't from such old media, which meant that our tapes held more than NASA's. So, we had fewer tapes to deal with, and they transferred faster to DLT. NASA has so many (relatively) low-capacity tapes that read slowly, it would take a huge amount of time to do anything. It doesn't matter how fast the medium you're copying to can write; in this case the bottleneck is reading the old media. Not to mention the fact that tape drives are relatively unreliable. That is, they tend to break every few months when you use them 24 hours/day, 7 days/week. And we are talking about huge amounts of data... I know at NCSA I once had a user request the deletion of a 100GB file that was tarred and gz'ed. Optical drives would be great, but they don't hold enough compared to tapes.
  • I read an article (posted here?) about how data had a short life span. The comparison it made was between "modern" media (tapes, CDs, etc.) and documents which have lasted basically forever: the constitution (on parchment) and the dead sea scrolls (is that what they're called? I forget) which were carved in stone.

    So why not just carve everything in stone? Yabadabadoo! :)
