Neglect Causes Massive Loss of 'Irreplaceable' Research Data

Posted by Soulskill on Friday December 20, 2013 @07:12PM from the store-those-magnets-over-there-by-the-old-hard-drives dept.

Nerval's Lobster writes "Research scientists could learn an important thing or two from computer scientists, according to a new study (abstract) showing that data underpinning even groundbreaking research tends to disappear over time. Researchers also disappear, though more slowly and only in terms of the email addresses and the other public contact methods that other scientists would normally use to contact them. Almost all the data supporting studies published during the past two years is still available, as are at least some of the researchers, according to a study published Dec. 19 in the journal Current Biology. The odds that supporting data is still available for studies published between 2 years and 22 years ago drop 17 percent every year after the first two. The odds of finding a working email address for the first, last or corresponding author of a paper also dropped 7 percent per year, according to the study, which examined the state of data from 516 studies between 2 years and 22 years old. Having data available from an original study is critical for other scientists wanting to confirm, replicate or build on previous research – goals that are core parts of the evolutionary, usually self-correcting dynamic of the scientific method on which nearly all modern research is based. No matter how invested in their own work, scientists appear to be 'poor stewards' of their own work, the study concluded."
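To get a feel for what a 17 percent yearly drop in odds means, here is a rough Python sketch. It assumes the drop compounds multiplicatively each year after the first two, and the baseline odds of 9 to 1 (90 percent availability at the two-year mark) are a made-up illustration, not a figure from the study. The function names are hypothetical too.

```python
def odds_after(age_years, baseline_odds, yearly_drop=0.17, grace_years=2):
    """Odds that a paper's data is still available at a given age.

    Assumes the odds stay at baseline_odds for the first grace_years,
    then shrink by yearly_drop (as a fraction) every year after that.
    """
    years_of_decay = max(0, age_years - grace_years)
    return baseline_odds * (1 - yearly_drop) ** years_of_decay


def odds_to_probability(odds):
    """Convert odds (p / (1 - p)) back into a probability p."""
    return odds / (1 + odds)


if __name__ == "__main__":
    baseline = 9.0  # hypothetical: 9-to-1 odds at two years, not from the study
    for age in (2, 5, 10, 22):
        odds = odds_after(age, baseline)
        print(f"age {age:2d} years: odds {odds:.3f}, "
              f"probability {odds_to_probability(odds):.3f}")
```

Note that the drop applies to odds, not to probability, so the probability falls slowly while the data is still very likely to exist and faster once the odds get close to even.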

This discussion has been archived. No new comments can be posted.

This was understood in Engineering projects too (Score:1)

by Anonymous Coward writes:

Just ask somebody to figure out how to build a Battleship, or even the guns off one, heck, you'd have trouble finding people who know the process of firing them.
Or if you prefer, Greek Fire.
- Re:This was understood in Engineering projects too (Score:4, Insightful)
  
  by xmundt ( 415364 ) writes: on Friday December 20, 2013 @08:36PM (#45750657)
  
  Or as a slight step up....there is NO chance that America could build a Saturn V rocket these days. It was a great workhorse, but so complicated that the loss of a few percent of the drawings, and the number of engineers that worked on it that have retired or died means that reproducing it is impossible now.
  In any case, as for the loss of data...that IS a problem. Back in the Olden Days, before someone decided that the computer, with its amazingly fluid and ever-changing methods of storage, was the answer to saving data, much of it was printed on paper and tucked away in libraries. Is that still a workable solution? I do not know, but I do know that when one is trying to store information for a long time, it HAS to be in the simplest and most durable medium and format available.
  
  - Re: (Score:3)
    
    by TubeSteak ( 669689 ) writes:
    
    Or as a slight step up....there is NO chance that America could build a Saturn V rocket these days.
    We have at least a couple complete Saturn V rockets lying around if we wanted to reverse engineer 'em.
    I've personally seen the ones in Alabama and Washington D.C.
    http://en.wikipedia.org/wiki/Saturn_V#Saturn_V_displays [wikipedia.org]
    The hardest part of rebuilding old hardware is the metallurgy.
    As long as we can get that right (or use a better quality substitute)
    reverse engineering from existing parts isn't anything we couldn't farm out to China.
    - Re: (Score:2)
      
      by countach ( 534280 ) writes:
      
      Are they really complete rockets? Is the documentation available to verify that they are 100% complete?
      - Re:This was understood in Engineering projects too (Score:4, Funny)
        
        by beckett ( 27524 ) writes: on Saturday December 21, 2013 @04:52AM (#45752445) Homepage Journal
        
        not 100% complete; they got this small bag of leftover screws.
        
    - Re: (Score:2)
      
      by volvox_voxel ( 2752469 ) writes:
      
      To what extent can metallurgical techniques, esp. non-destructive ones, be used? E.g. x-ray crystallography, microscopy to study crystal grains, electron microscopy, spectroscopy, etc.?
    - Re: (Score:2)
      
      by HiThere ( 15173 ) writes:
      
      I really doubt that they are complete. The internal electronics are probably broken, plastics deteriorate with age, etc. But the real loss is the skill sets needed to build it. There were a LOT of failures before we got a working version.
      FWIW, I believe that NASA has officially said that they couldn't build another Saturn.
Scientific Data Disappears At Alarming Rate too! (Score:5, Informative)

by Anonymous Coward writes: on Friday December 20, 2013 @07:19PM (#45750211)

Sounds familiar! [slashdot.org]

- Re:Scientific Data Disappears At Alarming Rate too (Score:4, Funny)
  
  by badboy_tw2002 ( 524611 ) writes: on Friday December 20, 2013 @07:37PM (#45750307)
  
  Don't worry, Slashdot stories won't suffer the same fate as each one is duplicated later on!
  
- Re: (Score:3)
  
  by msobkow ( 48369 ) writes:
  
  Gee, three hours to a dupe.
  That has to be some kind of new record.
  - Re: (Score:2)
    
    by msobkow ( 48369 ) writes:
    
    Oh. Wait. 15 hours. Maybe it's not a record after all. :P
    Forgot about the 24 hour clock. :)
- Re: (Score:2)
  
  by riverat1 ( 1048260 ) writes:
  
  Maybe they do it so you can use your mod points on one of the posts and make comments on the dupe.
Drinking from the firehose. (Score:4, Insightful)

by Anonymous Coward writes: on Friday December 20, 2013 @07:23PM (#45750239)

My wife is a wildlife biologist. Her office collects raw field data all year, compiles data, runs stats, writes reports, reads reports, creates a pretty large volume of "product" every year.
I ask her who exactly reads all the required papers and reports they produce. The federal Fish and Wildlife Service demands product. State demands product. Various agencies with funding ties that would confuse anyone all demand product. The real ass-kicker? Almost none of it is actually READ by those who asked for it. The papers that are read, are rarely read by more than one person.
In the end, thousands and thousands of offices like hers, producing real scientific data, it is just too much.
The number of people consuming the product is DWARFED by those producing it. The number of people tasked to archive, organize, store, catalog, and index this torrent of information are even FEWER than those who consume it.
These are "real life" scientists out there every day. Now throw in academia, including "research academia".
The bottom line? A true first-world problem. We produce WAY more research than we are prepared to do ANYTHING with.

- Re: (Score:2)
  
  by blue trane ( 110704 ) writes:
  
  Put it on the web. Who knows who may find it useful? The value of the research might not reveal itself for some time, but if google or someone has archived it, it might sit there waiting to unveil secrets, like the Pillars of Ashoka.
  - Re: (Score:2)
    
    by Obfuscant ( 592200 ) writes:
    
    Put it on the web.
    
    Who pays for that? Disks and servers and networks cost money. Academics rarely have that just sitting unused.
      - Re: (Score:2)
        
        by krlynch ( 158571 ) writes:
        
        It ain't the hardware ... it's the people. Who maintains the hardware? Programs it all? Who maintains the networks? Who is charged with tracking what's working and what isn't? Who backs it up, and updates formats, and catalogs it, and indexes it, and tracks the changing methodologies used in the collection? Who translates the old code, and operating systems, and storage formats, and hardware, and whatever else?
        Hardware's cheap ... people (especially those of the knowledgeable and reliable variety) are
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        http://archive.org/web/ [archive.org]
        
        Re: (Score:2)
        
        by sjames ( 1099 ) writes:
        
        Since you're just serving up static data, throw it on a Debian server, put the security updates in a cron job and it should run trouble free. Every three years, someone can oversee the dist-upgrade. It doesn't have to have singing and dancing animations with a background just exactly the right shade of mauve or anything.
        If it gets more popular than expected, torrent it.
    - Re: (Score:2)
      
      by blue trane ( 110704 ) writes:
      
      Let the Fed expand its balance sheet to buy govt bonds that allow for academics to publish data, and keep the loans rolling over forever while returning the interest to the Treasury. Making the research and data available is in the General Welfare. Like libraries...
      - Re: (Score:2)
        
        by Obfuscant ( 592200 ) writes:
        
        An interesting interpretation of the Constitution, where the general welfare clause is part of the preamble and not a proscriptive statement. And an interesting interpretation of how research grants are awarded, and even the general usefulness of vast quantities of research data.
        Academics already write "publish" into the grants they get, or they ought to do so. "Publish" is not the same as "put all the data up in an organized manner for everyone to come use", however. And even being able to put it all up f
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        General Welfare is mentioned twice, in the Preamble and in Article 1, Section 8.
        It isn't printing money, since no physical greenbacks need be involved, just figures in a virtual ledger book. Banks of course use this trick to expand their balance sheets, by issuing loans or otherwise creating assets. UBS for example booked future expected profits right away on AAA mortgage-backed securities, and paid bonuses on those profits. So these type of accounting practices go on all the time in the private sector.
        Perh
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        It isn't printing money, since no physical greenbacks need be involved, just figures in a virtual ledger book.
        There's so much fail in this sentence. "Printing money" is a saying not a literal description of the act. It means that you create currency without creating value. Inflation takes care of that hubris.
        
        And how can anyone think that "figures in a virtual ledger book" is an adequate solution for anything productive or vital?
        Perhaps by the time someone comes across your data, they will be smart enough (or have an AI that's smart enough) to figure it out. Or they could become architectural relics, providing valuable information to future societies. I think you discount your own research unfairly.
        Like a room with a thousand Madonna portraits. Someone will be interested.
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        "figures in a ledger book" is what the financial sector busies itself with. I agree, there's no value added. We'd be better off bypassing the financial sector and simply providing liquidity, from the government or a central bank, when it's needed. The financial sector is mired in all sorts of perverse incentives and moral hazards that cause lots of friction and push prices away from their efficient levels. That's why asset prices bubble and crash, because dealers push them away from their efficient levels.
        A
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        "figures in a ledger book" is what the financial sector busies itself with. I agree, there's no value added.
        Sometimes you're right. And sometimes those figures represent things of value. A loan to you would be "figures in a ledger book" to the bank. But it'd be a home, a business, or an education to you.
        We'd be better off bypassing the financial sector and simply providing liquidity, from the government or a central bank, when it's needed.
        That's what creates these huge bubbles in recent time. Easy money from the Fed gets dumped into dubious investments by the finance sector.
        Inflation is mostly psychological.
        Then you don't know what inflation is. For example, if the US government were to secretly "print money", that is, buy things with currency that they don't have the backing for,
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        The central banks didn't provide the liquidity for the most recent bubble, or for the tech bubble. The Fed was increasing interest rates (which killed dot-com). The credit expansion took place in the private sector, not from the Fed. Your model is deeply flawed, based on an ideology that history doesn't support.
        Financial "innovations" preceding the most recent crash created what private banks thought of as "risk-free" assets. The banks booked future profits from these riskless, AAA-rated, mortgage-backed
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        The central banks didn't provide the liquidity for the most recent bubble, or for the tech bubble.
        You are wrong here. The US Federal Reserve had low interest rates going into both bubbles and Fed officials did link money policies to the asset bubbles (for example, Greenspan's "irrational exuberance" speech in 1996).
        If inflation is tied to the money supply, why didn't we see hyperinflation when the Fed expanded its balance sheet by a factor of at least 2 in a week? Why didn't we see hyperinflation when the private sector was expanding its balance sheet by much larger than a factor of two in the run-up to the crash?
        Because a mere factor of two isn't hyperinflation. If they were doubling money supply every week for many weeks, then that would result in hyperinflation. And the Fed's "balance sheet" isn't a full measure of inflation since one also has to consider velocity of money which slows greatly durin
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        Let's look at some data, shall we?
        http://research.stlouisfed.org/fred2/graph/?g=qip [stlouisfed.org] shows that the Fed was in disciplinary mode, raising interest rates, before both the dot-com and the real-estate crash. Greenspan's "irrational exuberance" attitude was what killed dot-com, because, I think, he's an old fool who didn't understand the potential of technology to make obsolete his feudal economic models.
        Regarding velocity of money: http://research.stlouisfed.org/fred2/graph/?g=qiq [stlouisfed.org]
        If velocity of money leads to i
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        http://research.stlouisfed.org/fred2/graph/?g=qip shows that the Fed was in disciplinary mode, raising interest rates, before both the dot-com and the real-estate crash.
        Interest rates were rather low just the same. Also, your observation is additional support for my argument since the rates were raised just before the asset bubbles crashed. That timing is an important correlation for claiming cause and effect.
        
        When I look earlier, I see a 2.75% rate in early 90s (lowest since the 60s) and sub 2% rates after the 9/11 attacks (lowest since after the 1957-1958 recession).
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        I think you're ignoring far more obvious causes for inflation. In the 1970s, it was oil supply shocks. OPEC raised prices not because of economics of supply and demand, but for purely political, or psychological, reasons.
        The interest rate profile for the 1960s is similar to that for the 2000s. But inflation consequences were quite different, because there are much more important psychological factors involved.
        Rates were being raised years before the "bubble" burst. That's a very strange cause theory you hav
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        Also the obvious point in the interest-rate graph is that 8 of 9 recessions immediately followed a rise in interest rates. Discipline caused the recessions, not too much money.
        In the dot-com crash, investors started pulling back because they couldn't keep their loans rolling over at the low interest rates. In the real-estate crash, mortgage rates went up because money was becoming tighter. What if interest rates had not gone up? Let's run a simulation to see if people would have been better off.
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        OPEC raised prices not because of economics of supply and demand, but for purely political, or psychological, reasons.
        As you noted yourself, 70s recessions were triggered by oil shocks, not by OPEC psychology.
        The interest rate profile for the 1960s is similar to that for the 2000s. But inflation consequences were quite different, because there are much more important psychological factors involved.
        No, they weren't. For example, the 60s interest rates didn't stick around the lowest interest rate for any length of time while the lowest points of the 2000s interest rates were maintained for more than a year.
        Then a hiccup occurred when UBS announced it was writing off over $10 billion in MBSes, and groupthink took over and the traders started an emotion-based sell-off. Interest rates and the money supply had little to do with it. Psychology and emotional overreaction were the main causes.
        Why did UBS write off anything in the first place? They had a margin call (well, the equivalent for banks which are required to maintain a level of reserves). Higher interest rates provided a lot of pressure fo
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        OPEC psychology created the oil shocks. There was no production capacity problem. There was a psychological issue.
        Regarding UBS, here's a quote from the Economics of Money and Banking class, Lecture 20 Notes:
        UBS was doing something it called a Negative Basis Trade in which it paid AIG 11 bp for 100% credit protection on a supersenior CDO tranche, and financed its holding of that tranche in the wholesale money market. In its report to shareholders, to explain why it lost so much of their money, it states tha
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        OPEC psychology created the oil shocks. There was no production capacity problem. There was a psychological issue.
        You don't get it. The existence of an effective cartel demonstrates in the first place that there were production capacity problems - namely that production capacity was highly concentrated in the hands of the cartel. And the oil shocks were profitable (in addition to increasing the political power of the OPEC members) - providing a straightforward market advantage for that choice.
        The risk turned out to be liquidity risk, when money market funding dried up and they could not sell their AAA tranche.
        Here we go. Liquidity risk that originated from the easy Fed money no longer being in the market.
        So it wasn't a margin call.
        Then why did they need to "raise
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        For another model, I view recessions as large corrections of market perception. Recent recessions have been asset bubble driven, but there are other kinds of recessions such as the oil crises of the 70s (where suddenly the developed world realized that OPEC could manipulate oil supply and prices a huge amount and that resulted in all sorts of costly economic adjustments from changes in individual behavior up to national investments in alternative energy approaches).
        
        In this light, when the central bank se
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        Regarding OPEC: there was no supply and demand problem with oil. In economic terms, the price should not have risen because there was no production capacity problem. The reason prices rose was purely a matter of politics, of psychology, of policy. Not physical necessity. The proof is that prices later dropped to $10/barrel. So there was no production capacity problem. There was only a psychological problem.
        Regarding UBS's liquidity risk: according to Prof. Mehrling's story, UBS was getting funding from mon
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        "In this light, when the central bank sets interest rates, it is actually paying the markets to see interest rates as being in a certain range. This primes the pump for putting money in any available high leverage investments since suddenly there's no low risk investments with good interest payments out there. And once money starts flowing into such a bubble, it develops an attractive short term trend which brings in more money."
        Your story doesn't take into account that the Fed doesn't set interest rates (e
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        Your story doesn't take into account that the Fed doesn't set interest rates (except the Discount Rate which is set a fixed amount above the natural private rate). It can try to target rates, but the rates are ultimately negotiated by the private institutions themselves.
        The Fed has very effective tools for targeting rates. A control system doesn't need to be perfect to be an effective control system.
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        The private banking system evolved of its own accord towards a centralized system where clearinghouses played a role similar to the Fed today. There was a need for a central bank that could provide elasticity in times of crises, and it was convenient for all the banks to settle payments once a day at a clearinghouse, instead of many times with each bank someone had written a check on or cashed a check at. A centralized system made sense.
        The problem with the centralized system was that it didn't provide enou
  - Re: (Score:2)
    
    by Oligonicella ( 659917 ) writes:
    
    There's nothing in the world preventing you from donating time and hardware to help them do so.
    
    That's the point. There's really too much raw data that's not really needed, just produced.
    - Re: (Score:2)
      
      by blue trane ( 110704 ) writes:
      
      Who knows what's really needed? The market is too short-sighted to be a reliable judge. Mendel's research was not needed, until after his death. The research that went into the internet was thought to be unneeded by AT&T. The library of Alexandria was thought to be unneeded and burned. Kafka wanted all his unpublished manuscripts burned after his death.
      If you or I can't afford to help researchers publish their data on the internet, the government can and should.
      - Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        If you or I can't afford to help researchers publish their data on the internet, the government can and should.
        This is the outcome of government intervention in the scientific process - the generation of scientific activity which can't have long term value merely because it won't be saved. Maybe if we apply more of the poison, we'll save the victim.
        
        Re: (Score:1)
        
        by Mr. Slippery ( 47854 ) writes:
        
        government intervention in the scientific process...
        Involvement is not the same as intervention.
        Scientific research is a public good.
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        Scientific research is a public good.
        Except when it's not.
        
        But by forcing so much research to be a public good, you also create the usual tragedy of the commons situations of overconsumption of the good, such as researchers who research all sorts of things to consume the available public funds, but have no incentive to actually save their work.
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        Why not save their work whether they have the incentive, or not? That way it can be checked by anyone who wants to. It's in the public interest to check research, like Rogoff and Reinhart [theatlantic.com]'s:
        Thomas Herndon, Michael Ash, and Robert Pollin of the University of Massachusetts, Amherst, have found serious problems with Reinhart and Rogoff's austerity-justifying work.
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        Why not save their work whether they have the incentive, or not?
        Because they don't get funded to do that.
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        Create the money to fund them. It's in the public interest, the General Welfare. We, and our grandchildren, will be better off if we can check research by having access to the data it used.
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        You still have the problem that nobody reads most of the research and that even if you did pay people to read research, you'd still be nowhere near use of that research. All this blather about "General Welfare" ignores that it's not really in the public interest to pay brilliant people to spin their wheels.
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        You don't know what research will be read. Maybe it will become valuable after you're dead. Its value to you in the present may be nothing, but to another it might be great. For example, I like to listen to old jazz tunes on youtube that may have one or two other views. But there's value in them, because value is not a popularity contest. In the same way, research that is not valuable to you, or not popular at this time, can have immense value to the future. Example: piles of trash that are invaluable to a
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        You don't know what research will be read.
        No, but I have a pretty good idea.
        Maybe it will become valuable after you're dead.
        But it probably won't. As a general rule of thumb, if something doesn't prove itself in the first few decades to have value, then it probably never will.
        For example, I like to listen to old jazz tunes on youtube that may have one or two other views. But there's value in them, because value is not a popularity contest.
        Sure, because you listen to them, those youtube videos have some value.
        Example: piles of trash that are invaluable to archaeologists in reconstructing ancient Troy, say.
        It doesn't have to have universal value to everyone to have value. This is completely irrelevant to my point. Recall we're speaking of research data that will probably never be examined by anyone other than the author and perhaps a few reviewers. You the
        
        Re: (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        I think your approach is like uniformly reporting "negative" on cancer tests, because the incidence of cancer is so low. You can have a very high successful prediction rate (99%, say) by simply saying "no" on every test. But that doesn't help the patients who have cancer. You can boast "I have a great prediction success rate!" but you're not helping anyone.
        In the same way, saying categorically that no research is valuable because a lot of it isn't valuable is silly. It's precisely the cases that are "thrown
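The cancer-test analogy above is the classic accuracy paradox, and a few lines of Python make the arithmetic concrete. This is only a sketch: the patient counts are invented for illustration and the function name is hypothetical.

```python
def always_negative_metrics(n_patients, n_with_cancer):
    """Score a 'test' that reports negative for every single patient.

    Returns (accuracy, recall), where recall is the share of actual
    cancer cases the test catches.
    """
    true_negatives = n_patients - n_with_cancer  # healthy, correctly told "no"
    true_positives = 0                           # the test never says "yes"
    accuracy = (true_negatives + true_positives) / n_patients
    recall = true_positives / n_with_cancer if n_with_cancer else 0.0
    return accuracy, recall


if __name__ == "__main__":
    # Invented numbers: 1000 patients, 10 of whom actually have cancer.
    accuracy, recall = always_negative_metrics(1000, 10)
    print(f"accuracy: {accuracy:.2%}, cases caught: {recall:.2%}")
```

With those made-up counts the always-"no" test is right for 990 of 1000 patients, yet it catches none of the 10 real cases, which is the point the comment is making about dismissing all low-readership research.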
        
        Re: (Score:2)
        
        by khallow ( 566160 ) writes:
        
        I think your approach is like uniformly reporting "negative" on cancer tests, because the incidence of cancer is so low.
        That is not the case. My approach would be picking up most, if not all cases of useful research in question. Recall that scientific research which results in useful progress over the long term invariably has some usefulness and value even in the short term. This is a universal feature, not a quirk of market-oriented research.
        In the same way, saying categorically that no research is valuable because a lot of it isn't valuable is silly.
        Then don't say it. I don't say it either.
        
        I do think that this sort of claim indicates that you don't understand my argument. I'm not arguing that publicly funded research can't have
  - Re: (Score:1)
    
    by as.kdjrfh sxcjvs ( 2872465 ) writes:
    
    There are groups working on this -- the University of California is trying to do it in a consistent way, with its wealth of historical data -- but it's harder than you'd think. It's not very useful if you don't get the metadata reasonable, and that's skilled work and not something we reward. Institutional support (libraries, machine shops, etc) gets pinched because it's constant overhead and hard to point to single high-status payoffs. It takes one year to kill a library (Canada's superb fisheries and lake
- Re: (Score:2)
  
  by nbauman ( 624611 ) writes:
  
  I dunno. There may just be a half a dozen people who are interested in your wife's penguin or whatever, but to them it's really interesting. They might have to make a decision about penguin habitat or whatever.
  And then there's the scientific paper lottery. A few papers turn out to be really important, everybody cites them, and they change the world -- but you can't know in advance which one is going to be important. There were people doing studies of the hearing of fish, and suddenly, when porpoises start g
This article looks familiar (Score:2)

by harvestsun ( 2948641 ) writes:

Maybe because it was posted less than 24 hours ago [slashdot.org]?
  - Re: (Score:2)
    
    by sumdumass ( 711423 ) writes:
    
    I think you are forgetting about the FireHose. Chances are one if not both of the dupes was promoted via the fire hose where enough positive votes promoted it as designed.
- Re: This article looks familiar (Score:1)
  
  by Anonymous Coward writes:
  
  Scientific data would not be lost if it was posted on Slashdot... you could just retrieve the next day's dupe.
- Re: (Score:2)
  
  by gandhi_2 ( 1108023 ) writes:
  
  Because without constant refreshing, this article would disappear!
  - Re: (Score:1)
    
    by Mathinker ( 909784 ) writes:
    
    Gawd, how I wish Slashdot would go back to using SRAM...
- Re: (Score:2)
  
  by bob_super ( 3391281 ) writes:
  
  Don't blame them, the editors really care, given their apparent short-term memory loss and/or schizophrenia.
  (yes I know about varying medical definitions of schizo)
- Re: (Score:2)
  
  by tqk ( 413719 ) writes:
  
  Make it publicly available instead of DRM controlled publications or services.
  I suspect those publications and services are among the few things pushing this in the other direction. Multiple reviewers, each with their own copy of the data and, as they'd be in the same field of research as the author(s), more likely to be personally familiar with the authors' current work and location.
  Is that irony? I never have managed to figure that one out.
  - Re: (Score:2)
    
    by blue trane ( 110704 ) writes:
    
    Do the reviewers have the actual data? Or just the papers?
    - Re: (Score:2)
      
      by riverat1 ( 1048260 ) writes:
      
      Just the papers. The purpose of peer review is sort of like a spelling and grammar check. Reviewers make sure the paper doesn't have any silly scientific mistakes and that the information is presented clearly enough for other scientists to be able to follow the work. Whether the paper ultimately passes muster comes after it is published when the general community in the field can read it and make their comments.
Obvious solution. (Score:1)

by Anonymous Coward writes:

They should post their data to Slashdot, which will duplicate that shit so many times it will never vanish.
Slashdot can solve that (Score:1)

by DMiax ( 915735 ) writes:

That's why Slashdot is keen on posting all new studies at least twice, thus increasing the chances they are still available for future generations!
From personal experience (Score:1)

by Anonymous Coward writes:

I've found dead links to data in peer reviewed papers published just a week or less prior to reading them, sometimes these links were never valid to begin with.
- Re: (Score:2)
  
  by Obfuscant ( 592200 ) writes:
  
  I've found dead links to data in peer reviewed papers published just a week or less prior to reading them, sometimes these links were never valid to begin with.
  Maybe the peer-review process should be shorter, or you should keep up with current journals and not depend on ten year old articles?
  Seriously though, maintenance of data requires money. I have 22 years' worth of data here. Much of it is raw video on VHS tapes. Much of it is on old floppies. Much of it is on TK70 tapes. Much of it is on early versions of magneto-optical disks. I don't have anything that reads any of those formats anymore.
  Who pays to keep copying old data onto new media as new media are deve
Entropy most common scientific subject to lose data (Score:2)

by JoeyRox ( 2711699 ) writes:

Couldn't resist.
Options (Score:5, Interesting)

by jklovanc ( 1603149 ) writes: on Friday December 20, 2013 @07:42PM (#45750341)

Maybe there should be an option to "ignore" an article or "report as duplicate". The second option would require someone to react to it so it may not work.

At least (Score:4, Interesting)

by Nemyst ( 1383049 ) writes: on Friday December 20, 2013 @07:56PM (#45750447) Homepage

Slashdot is doing its part by posting the same data multiple times. Perhaps one copy will survive the test of time!

Neglect causes dupes. (Score:1)

by Anonymous Coward writes:

Dupity dupe dupe!
primary data archival (Score:1)

by Anonymous Coward writes:

Working in the field, I can pretty much state that far from enough care is taken with data archival and/or transfer to newer storage media when older ones approach obsolescence.
There's:
A: not enough staff to take care of it properly or keep a proper archival environment for the various media
B: not enough money & time to modernize the records/transfer to new media
C: sometimes not enough money to even properly maintain obsolete, long-unsupported and obscure data recording equipment
(I've seen 'rubber' pi
- Re: (Score:2)
  
  by cusco ( 717999 ) writes:
  
  Sometimes there is also deliberate and/or malicious destruction to take into account, like the Bush Administration ordering the destruction of the Mariner and Pioneer data.
Digital Data (Score:1)

by koan ( 80826 ) writes:

Think of all the family photos that will get deleted or destroyed by hardware failure, and to think I have family photos (on film) from over 100 years ago.
Library of Congress? (Score:2)

by riverat1 ( 1048260 ) writes:

Maybe designating the Library of Congress as a repository for scientific data would work. They're pretty good at archiving stuff.
There are no unique identifiers for authors (Score:2)

by damn_registrars ( 1103043 ) writes:

Part of the problem with corresponding with authors of papers more than 2 years old is that there is no good way to uniquely identify an author. If you know that you are interested in a "John Smith" who wrote a Nature paper in 1989, good luck figuring out which "John Smith" is the same one today (if he is still alive). Another good example is how many papers are by "Z Huang": currently over 6,000 in PubMed [nih.gov].

Considering how we expect researchers to change institutions multiple times in their car
Scientific Data condensed as papers (Score:2)

by volvox_voxel ( 2752469 ) writes:

One thing that I lament about scientific publications is that the results are boiled down to a few pages. You rarely see raw data, and generally only the statistical analysis. I would like to see web links in journals that include more of the raw data, the programs that generated that data, etc. We live in a day and age when gigabytes are cheap. It would be a lot easier to duplicate someone's work for peer review if the inherent data & analysis programs were more accessible. Although, there are a fair
A Thing or Two, Within a Factor of Fifty (Score:2)

by darenw ( 74015 ) writes:

"Research scientists could learn an important thing or two from computer scientists,..."
What is the error bar on "a thing or two"?
As someone with a foot in each camp, I believe it's more like fifty or a hundred. The methods of scientists regarding computing are often built of slow evolutionary changes upon old familiar methods, while incorporating selected cutting edge hardware or algorithms. It is partly the nature of some science projects to carry out observations over many years, ideally with the sa
Maybe the dog really did eat John Lott's homework (Score:2)

by nbauman ( 624611 ) writes:

https://en.wikipedia.org/wiki/John_Lott#Disputed_survey [wikipedia.org]
Disputed survey
In the course of a dispute with Otis Dudley Duncan in 1999–2000,[55][56] Lott claimed to have undertaken a national survey of 2,424 respondents in 1997, the results of which were the source for claims he had made beginning in 1997.[57] However, in 2000 Lott was unable to produce the data, or any records showing that the survey had been undertaken. He said the 1997 hard drive crash that had affected several projects with co-authors h
Journals (Score:2)

by belg4mit ( 152620 ) writes:

Perhaps this is an opportunity for journals to update their business models?
Warehouse and convert data, as well as curate contact lists for papers.
Those who do not remember science... (Score:2)

by __aaltlg1547 ( 2541114 ) writes:

... are condemned to repeat it.
Purdue University Research Repository (Score:1)

by Mark Leighton Fisher ( 208223 ) writes:

I'm late to the party here, but I thought it was worth mentioning that the Purdue University Research Repository (https://purr.purdue.edu) is designed as a Trusted Digital Repository for research data. The default lifetime is 10 years, but the Purdue Libraries will add noteworthy datasets to its permanent digital collection after their default lifetime expires. (And yes, I am a programmer on the project.)
- Re: (Score:2)
  
  by allcoolnameswheretak ( 1102727 ) writes:
  
  What data? I just need to walk outside. It's end of December in Germany and we have 6C outside. Tomorrow 12C are forecast. I doubt I will see any snow at all this year. When I was a kid, we used to build snowmen and do battle with snowballs at this time.
  - Re: (Score:1)
    
    by Anonymous Coward writes:
    
    What data? I just need to walk outside. It's end of December in Germany and we have 6C outside. Tomorrow 12C are forecast. I doubt I will see any snow at all this year. When I was a kid, we used to build snowmen and do battle with snowballs at this time.
    That's very interesting! We had record low temperatures here a couple weeks ago. Colder than I've ever experienced in my life and I've been living here for 30 years. Exciting times! But, unfortunately two data points is not enough to make any kind of conclusion about changes in the global climate. I think you may be confusing meteorology and climatology. We need lots of data to examine climate change, which has been collected for that very reason. It'd be a shame to lose it.
    - Re: (Score:2)
      
      by Obfuscant ( 592200 ) writes:
      
      That's very interesting! We had record low temperatures here a couple weeks ago. Colder than I've ever experienced in my life and I've been living here for 30 years. Exciting times! But, unfortunately two data points is not enough to make any kind of conclusion about changes in the global climate.
      Record cold here, too. Now we have three points, and it's two to one in favor of global cooling. Woot woot!
      We need lots of data to examine climate change, which has been collected for that very reason. It'd be a shame to lose it.
      Don't fear. If we lose any real data, the atmospheric modelers will happily create new old data.
      - Re: (Score:2)
        
        by riverat1 ( 1048260 ) writes:
        
        It's sad the misunderstanding of climate science that your post demonstrates. Modelers don't create data (at least not the data you're thinking about), they compare their model output to real world data to understand how well they model the real world.
        
        Re: (Score:2)
        
        by Obfuscant ( 592200 ) writes:
        
        Whoosh.
        And look up the word "hindcast" if you think modelers don't create "old" data.
        
        Re: (Score:3)
        
        by riverat1 ( 1048260 ) writes:
        
        I'm perfectly aware of what hindcasting is. The results of a hindcast are never presented as real world data.
        
        Re: (Score:2)
        
        by Obfuscant ( 592200 ) writes:
        
        The results of a hindcast are never presented as real world data.
        A. That you know of.
        B. There was a conditional clause involved that included a complete loss of valuable real data. If the data was valuable and there is a model that can recreate it, it can be done.
        C. If you know about hindcasting, then you know that modelers, as a regular course of business, create "new old data" which they then compare to the real old data. Saying "modelers don't create data" is wrong; they don't routinely or honestly create what they will call real data.
        D. Even that last statement
        
        Re: (Score:2)
        
        by riverat1 ( 1048260 ) writes:
        
        Oops, I guess I fell victim to Poe's law.
        But when you get down to it pretty much everything in science is a model of the real world in one way or another.
  - Re: (Score:2)
    
    by jedrek ( 79264 ) writes:
    
    It's 6C in Warsaw right now... and last year we'd had snow for two months by this time.
    A handful of data points does not a trend make.
- Re: (Score:2)
  
  by composer777 ( 175489 ) * writes:
  
  Right, all those almanac records have just up and disappeared. It's a "conspiracy".
- Re: (Score:2)
  
  by harvey the nerd ( 582806 ) writes:
  
  Lonnie Thompson's missing ice core data [climateaudit.org], unarchived for 20+ yrs, comes to mind, among many. Catastrophic anthropogenic global warming, it's a religion. The smart money is thinking more about the probable cold years after 2018.
