MRI Software Bugs Could Upend Years Of Research (theregister.co.uk)

Posted by msmash on Tuesday July 05, 2016 @11:20AM from the bad-data dept.

An anonymous reader shares a report on The Register: A whole pile of "this is how your brain looks like" MRI-based science has been invalidated because someone finally got around to checking the data. The problem is simple: to get from a high-resolution magnetic resonance imaging scan of the brain to a scientific conclusion, the brain is divided into tiny "voxels". Software, rather than humans, then scans the voxels looking for clusters. When you see a claim that "scientists know when you're about to move an arm: these images prove it", they're interpreting what they're told by the statistical software. Now, boffins from Sweden and the UK have cast doubt on the quality of the science, because of problems with the statistical software: it produces way too many false positives. In this paper at PNAS, they write: "the most common software packages for fMRI analysis (SPM, FSL, AFNI) can result in false-positive rates of up to 70%. These results question the validity of some 40,000 fMRI studies and may have a large impact on the interpretation of neuroimaging results."


That's a Crappy Summary (Score:5, Informative)

by damn_registrars ( 1103043 ) writes: <damn.registrars@gmail.com> on Tuesday July 05, 2016 @11:22AM (#52448283) Homepage Journal

The research is on fMRI - the F stands for Functional. As it mentions later in the summary this is used to try to associate regions of the brain with specific functions. This is not the same as the structure of the brain itself. What we see in terms of actual brain structures - folds, regions, etc, is still very much valid. We're just not so sure about the functional assignments that we've held on to for a while now.

- - Re: (Score:1, Informative)
    
    by Anonymous Coward writes:
    
    And pray tell, where did you read that? Because the linked paper states in no uncertain terms:
    http://www.pnas.org/content/early/2016/06/27/1602413113.full
    Functional MRI (fMRI) is 25 years old, yet surprisingly its most common statistical methods have not been validated using real data.
  - Re: (Score:2)
    
    by Calydor ( 739835 ) writes:
    
    Fe is iron.
    F is fluorine.
    Clearly this only works if you've used brain bleach.
- Re: (Score:2)
  
  by K. S. Kyosuke ( 729550 ) writes:
  
  Regardless, reproductible research [reproducibleresearch.net] is desirable.
  - Re: (Score:2)
    
    by damn_registrars ( 1103043 ) writes:
    
    Regardless, reproductible research is desirable.
    I agree with you 100% on that (assuming of course that you meant reproducible and not reproductible). Honestly though there are few fields of modern science that are not having at least some reproducibility issues. Some suffer it more than others, of course, but it is a rather pervasive problem. As much as neurology is a well established medical speciality, there is a lot we still don't know about how the brain works. fMRI and other tools were supposed to help but without a solid foundation we're still
    - Re: (Score:2)
      
      by K. S. Kyosuke ( 729550 ) writes:
      
      assuming of course that you meant reproducible and not reproductible
      Sorry, that was a typo. :)
      - Re: (Score:3)
        
        by damn_registrars ( 1103043 ) writes:
        
        assuming of course that you meant reproducible and not reproductible
        Sorry, that was a typo. :)
        I figured as much, but thought I'd check in case you are involved in (or want to recruit others to partake in) some sort of cutting-edge HVAC research.
- Re:That's a Crappy Summary (Score:4, Funny)
  
  by JoeMerchant ( 803320 ) writes: on Tuesday July 05, 2016 @12:47PM (#52449125)
  
  Whenever I've read an fMRI "research" paper, it seems like the f should be standing for "full of ____", because the sample sizes are laughably small, the data are fuzzy and interpreted with a lot of handwaving, and the correlation between oxygen uptake and the fMRI signal itself is very weak. Finally, somebody has gotten around to calling BS on the whole field.
  
  - Re: (Score:2)
    
    by ColdWetDog ( 752185 ) writes:
    
    Dunno, it seems like it could open a whole new field. Research work on fresh salmon.
    Mmmm. Salmon.
- Re: (Score:3)
  
  by XXongo ( 3986865 ) writes:
  
  Nope. Thinking about moving your arm is in one part of the brain. Actually moving your arm is in a different part of the brain.
  It's like in a robot, if you're looking at electrical signals in the servo motor controller, it doesn't matter whether there are signals in the processor core.
  - - Planning vs effecting (Score:3)
      
      by DrYak ( 748999 ) writes:
      
      The claim is quite dubious in that it seems to suggest that scientists know someone is going to move their arm before the person does, simply from reading MRI images.
      Not MRI images.
      But fMRI images (where f = "functional") [wikipedia.org]
      In a nutshell, those images are based on the fact that hemoglobin loaded with oxygen interacts with and distorts the magnetic field differently than hemoglobin which has discarded its oxygen.
      By measuring these signal differences, it's possible to infer where there's more oxygen consumption, and from there try to guess which parts of the brain are working more (and thus consuming more oxygen).
      Spatial resolution of such images is "not so great" (blurrier than
  - - Re: (Score:2)
      
      by mrchaotica ( 681592 ) * writes:
      
      WTF does "upvoted" mean? This ain't Reddit; around here we moderate!
- Re: (Score:2)
  
  by jellomizer ( 103300 ) writes:
  
  The part that is more worrisome is that the software ecosystem for this is so small that it seems to affect many MRI vendors. I mean, if you are going to do a scientific study, you should make sure your results are calculated with different software.
  Open source or not probably isn't the big issue; the issue is that so many researchers were using the same software.
  - Vendor vs. researcher (Score:2)
    
    by DrYak ( 748999 ) writes:
    
    Is that the software ecosystem for this is so small that it seems to affect across many MRI vendors.
    Nope. It's that the vendor only takes care to write the bit of software that actually controls the MRI machine.
    The vendor takes care of the low-level, behind-the-scenes work needed to get to the point where you obtain an image - usually in a standard format like DICOM.
    (think about the firmware inside a digital point and shoot camera, which is in charge of controlling the CCD, the flash and the zoom/focus, and whose purpose is to write a JPEG file on the storage media at the end).
    Whatever you do with the DICOM out
- Re: (Score:3)
  
  by daenris ( 892027 ) writes:
  
  All of the software packages tested in the article (AFNI, FSL, SPM) are open source, including the package the authors built to do massively parallel non-parametric permutation tests (BROCCOLI).
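For readers unfamiliar with the term, a non-parametric permutation test can be sketched in a few lines. This is only a generic two-sample illustration in Python with invented numbers, not BROCCOLI's actual implementation; the function name `permutation_p_value` and the sample values are assumptions made for the example.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance so floating-point summation order cannot hide a tie.
        if diff >= observed - 1e-12:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical measurements, invented purely for illustration.
task = [2.1, 2.5, 1.9, 2.8, 2.4, 2.6]
rest = [1.2, 1.0, 1.5, 0.9, 1.3, 1.1]
print(permutation_p_value(task, rest))
```

The appeal of this approach is that the null distribution comes from the data themselves rather than from a parametric assumption, at the cost of far more computation.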
And an fMRI study of a dead salmon (Score:1, Informative)

by Anonymous Coward writes:

This issue is not exactly new. The link below is a study of the active regions of the brain of a dead salmon....
http://prefrontal.org/files/posters/Bennett-Salmon-2009.pdf
- fMRI vs Climate change deniers (Score:2)
  
  by DrYak ( 748999 ) writes:
  
  Climate science has been tested and proven over and over again using numerous different methods, which all broadly lead to more or less the same ballpark of conclusion.
  The only actually *REAL* controversy that exists among scientists is about the minute details of interpretation (like the exact expected decimals at the end of the predicted number), not about the broad existence of climate change.
  From this perspective it's quite normal to have strong scepticism against pseudo-scientists trying to stir controve
  - Re: (Score:1)
    
    by mi ( 197448 ) writes:
    
    The only actually *REAL* controversy that exists among scientists is about the minute details of interpretation (like the exact expected decimals at the end of the predicted number), not about the broad existence of climate change.
    Ah, thanks for clarifying. So it is now OK, in your opinion, to imprison the remaining deniers and to erase (or otherwise keep inaccessible) the raw data that once led our betters to these universally-accepted conclusions?
    Or do you still agree, criminal prosecution of dissen [americanthinker.com]
  - Re: (Score:2)
    
    by Crashmarik ( 635988 ) writes:
    
    Except they aren't
    https://arxiv.org/pdf/1605.043... [arxiv.org]
    Reproducible and replicable CFD: it’s harder than you think
This kind of thing is way too common in science (Score:5, Interesting)

by Anonymous Coward writes: on Tuesday July 05, 2016 @11:30AM (#52448361)

A friend of mine has worked in the social sciences (cue /. laughter, but bear with me) and they were forced by the university to use a closed source statistical package for all their data processing. So anyway, she got some really dubious results and she preferred to do her own maths, so she did, and lo! completely different results. That was the start of a research project which concluded that the closed source package contained a rounding error that basically filtered all minorities out of the data set, which is kind of sad if you're doing research on minorities.
People trust their software too much, are too lazy to do their own maths, don't really want to have got anything to do with data processing even though that's their job, and universities force bad software on their employees. This is an institutional problem that goes way beyond MRI research.

- - Re: (Score:2)
    
    by avandesande ( 143899 ) writes:
    
    Publish or die....
    - - Re: (Score:2)
        
        by Big Hairy Ian ( 1155547 ) writes:
        
        We don't pay you to think :D
- Re: (Score:3)
  
  by jittles ( 1613415 ) writes:
  
  A friend of mine has worked in the social sciences (cue /. laughter, but bear with me) and they were forced by the university to use a closed source statistical package for all their data processing. So anyway, she got some really dubious results and she preferred to do her own maths, so she did, and lo! completely different results. That was the start of a research project which concluded that the closed source package contained a rounding error that basically filtered all minorities out of the data set, which is kind of sad if you're doing research on minorities. People trust their software too much, are too lazy to do their own maths, don't really want to have got anything to do with data processing even though that's their job, and universities force bad software on their employees. This is an institutional problem that goes way beyond MRI research.
  I had a university level Statistics "professor" once tell me that I didn't need to know how my calculator created a box plot, etc etc because I could just use someone else's statistics library instead of writing my own. While in general I agree that there is no point in reinventing the wheel, I felt like I ought to learn how such things work.
  - Two basic rules of statistics (Score:3, Insightful)
    
    by Okian Warrior ( 537106 ) writes:
    
    I had a university level Statistics "professor" once tell me that I didn't need to know how my calculator created a box plot, etc etc because I could just use someone else's statistics library instead of writing my own. While in general I agree that there is no point in reinventing the wheel, I felt like I ought to learn how such things work.
    I do a *ton* of statistical work in my day job, and if I were to write a book or teach a class, I would recommend two things:
    1) Always look at the data
    2) Always write your own functions
    The reason for this has to do with the basic nature of statistics. If you make a mistake in normal software, the error is usually patently visible or benign. Often times the software works fine and does its job and the results are correct, even if it has bugs.
    In statistics however, if you make a mistake the results get closer
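The "write your own functions" advice pairs naturally with cross-checking: a hand-written routine is compared against an independent implementation, and any disagreement is a signal to look at the data again. A minimal sketch in Python, assuming the standard library's `statistics.variance` as the independent reference; `my_sample_variance` and the numbers are hypothetical.

```python
import math
import statistics

def my_sample_variance(values):
    """Unbiased sample variance (n - 1 denominator), two-pass algorithm."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [4.0, 7.0, 13.0, 16.0]  # made-up numbers, for illustration only
mine = my_sample_variance(data)
reference = statistics.variance(data)
# The two independent implementations should agree to floating-point precision.
assert math.isclose(mine, reference, rel_tol=1e-12)
print(mine, reference)
```

Agreement between two implementations does not prove either is right, but a mismatch is cheap evidence that something needs a closer look.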
    - - Re: (Score:2)
        
        by stoatwblr ( 2650359 ) writes:
        
        "Never rely on a mathematical function in a software package that you can't cross check using other tools and techniques. It doesn't matter whether it is 'widely tested and reliable', which often just means other people assume it works. Software can and does have all kinds of untested corner cases and obscure bugs."
        The same applies to just about anything in software.
        I've had greybeards tell me they will not use XYZ newer package because ABC has been around forever and "is widely tested and reliable" - w
  - Re: (Score:2)
    
    by TapeCutter ( 624760 ) writes:
    
    I had a university level Statistics "professor"
    What other levels can a professor have?
- Re: (Score:2)
  
  by phantomfive ( 622387 ) writes:
  
  Worth mentioning that not long ago, someone got fMRI results from dead salmon [wired.com].
- Re: (Score:3)
  
  by DarthVain ( 724186 ) writes:
  
  Problem with closed source and science.
  Similarly, there was a court case in Florida where people were suspicious of Breathalyzer results. Police use one produced by a company with closed source code. The court ordered them to open it up for inspection. They tried the "Trade Secrets" argument and refused. The court disagreed and started fining them every day until they released the code. Once they did, it was found to be horrible and inaccurate, invalidating thousands of court cases... As it turned out they knew it w
Probably will happen in other science fields, too (Score:2, Insightful)

by Anonymous Coward writes:

It's a matter of time before this happens with global warming, too. It's well known that the temperature record is adjusted, supposedly to remove biases. However, if you look at the unadjusted data, it fits the solar cycle perfectly, with temperatures declining over the past few decades, coinciding with solar dimming. The adjustment looks like a hockey stick, though, which can explain the entirety of the supposed warming. The National Climatic Data Center once had these figures on their website, though they
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  And gravity. Everyone keeps using our type of matter in their experiments, where the inertial mass and the gravitational mass of everything is nearly identical if you use open source software to make the statistical comparisons! But these two masses are only the "same" if you use mathematics which can be reviewed for accuracy. If you use the correct proprietary software (you have to preserve the trade secrets), you'll see the two masses are different (because of ghosts). That's why only properly equipped sc
- Re:Probably will happen in other science fields, t (Score:4, Insightful)
  
  by TapeCutter ( 624760 ) writes: on Tuesday July 05, 2016 @07:25PM (#52452859) Journal
  
  It's a matter of time before this happens with global warming, too.
  Well-financed "skeptics" have been busting a gut for over 20 years trying to prove your conspiracy theory; they have done nothing but bring the word "skeptic" into disrepute.
  
Issue is likely overstated (Score:5, Informative)

by daenris ( 892027 ) writes: on Tuesday July 05, 2016 @11:48AM (#52448493)

The paper has been available as a preprint for a while now, and my lab has discussed it internally and I've also paid attention to outside coverage. The key issue that the paper reports is that false positive rates are too high for most existing software WHEN using a specific type of test under a specific set of conditions. They show that voxelwise familywise error (FWE) correction actually seems to work reasonably or even conservatively. Cluster level FWE correction (looking for groups of voxels that are active) fails when using a very liberal cluster-defining threshold, but works reasonably well when using a more stringent cluster-defining threshold. It also says nothing about the performance of another very common correction method that is frequently used in fMRI studies (false discovery rate or FDR).

I'm not really sure how extensive the group of findings that these issues actually affect is, but it's certainly not 40,000 as is claimed in the paper's significance section. Many of the earlier papers (and even more recent ones) likely used uncorrected statistical tests, so are suspect for entirely different reasons from this issue. Of the ones that use correction, the findings in this paper only call into question the results for those that are using FWE cluster correction with a cluster-defining threshold that is too liberal (likely > 0.001; the paper's findings suggest that at 0.001 the familywise error rate is in the ballpark of the desired 5%). Those using a cluster-defining threshold of p=0.001 or lower are likely fine, and those using a different correction method like FDR are an unknown, as to my knowledge there isn't currently any similar paper on that correction method.

You can also check out this technical report by some other big names in imaging that basically says that this result is known and expected for overly liberal cluster defining thresholds:
http://www.fil.ion.ucl.ac.uk/s... [ucl.ac.uk]
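To make the familywise-error idea concrete, here is a rough simulation in Python. It assumes independent voxels with pure-noise z-scores, which real fMRI data are not (they are spatially correlated), and it uses plain Bonferroni correction as a stand-in for voxelwise FWE control. It does not reproduce the cluster-level inference the paper tests, and all names and sizes are illustrative.

```python
import random
from statistics import NormalDist

def familywise_error_rate(n_voxels, z_threshold, n_experiments=1000, seed=1):
    """Fraction of all-noise experiments with at least one voxel over threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Every voxel is pure noise, so any voxel over threshold is a false positive.
        if any(rng.gauss(0.0, 1.0) > z_threshold for _ in range(n_voxels)):
            hits += 1
    return hits / n_experiments

n_voxels = 500
alpha = 0.05
# One-sided thresholds: per-voxel alpha versus Bonferroni-corrected alpha.
z_uncorrected = NormalDist().inv_cdf(1 - alpha)
z_bonferroni = NormalDist().inv_cdf(1 - alpha / n_voxels)

print("uncorrected:", familywise_error_rate(n_voxels, z_uncorrected))
print("Bonferroni: ", familywise_error_rate(n_voxels, z_bonferroni))
```

Under these assumptions the uncorrected rate should come out close to 1 and the Bonferroni rate near the nominal 5%, which is the gap between "some voxel lit up" and a properly corrected result.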

- Re: (Score:2)
  
  by BenBoy ( 615230 ) writes:
  
  Please keep your "facts" out of my outraged 'science is soooooo stupid' thread ...
  - Re: (Score:2)
    
    by ColdWetDog ( 752185 ) writes:
    
    Yeah, salmon is expensive. I was hoping to get the grant to pay for it.
- - Re: (Score:2)
    
    by Hognoxious ( 631665 ) writes:
    
    Was it researchers, journalists or slashdot editors?
- Re: (Score:2)
  
  by nycsubway ( 79012 ) writes:
  
  We've also been looking this over. It doesn't exactly invalidate previous studies that used a high clustering threshold of p<0.05, it just indicates that they are not as robust as once thought. The paper itself could change what reviewers accept though. Maybe some reviewers will say that based on this paper, only analyses using a FLAME1 or permutations method should be accepted. Much like registering EPIs directly to the standard template is frowned upon. It depends on the reviewer and the justification for yo
- Re: (Score:3)
  
  by TapeCutter ( 624760 ) writes:
  
  Thank you, comments like yours are the reason I still come here.
- Re: (Score:2)
  
  by ceoyoyo ( 59147 ) writes:
  
  There have been a few other papers criticising FWE clustering lately. It's always struck me as kind of an iffy concept. Even the simpler non-clustering techniques, although they seem to do more or less what they advertise, really should be regarded as exploratory and checked by proper hypothesis driven replication studies.
what about climate research? (Score:2)

by known_coward_69 ( 4151743 ) writes:

i've been wanting to learn R and thought about doing some maths on the raw data and compare it with the released results. mostly looking at trends at specific weather stations compared to official numbers
The Last Part is Important (Score:5, Insightful)

by medv4380 ( 1604309 ) writes: on Tuesday July 05, 2016 @12:10PM (#52448709)

The researchers used published fMRI results, and along the way they swipe the fMRI community for their “lamentable archiving and data-sharing practices” that prevent most of the discipline's body of work being re-analysed.
So the raw data isn't being saved so that someone else can independently verify the results. No checking the computer's math, no checking the researchers' settings on the machine. Just blanket trust for the people and the machine, and purging of any way of poking holes in someone's findings. Even if this wasn't caused by a software bug, the lack of archiving the raw dataset so that it can be rerun when software improvements are made is just infuriating.

- Comment removed (Score:4, Informative)
  
  by account_deleted ( 4530225 ) writes: on Tuesday July 05, 2016 @12:45PM (#52449091)
  
  Comment removed based on user account deletion
  
  - Re: (Score:1)
    
    by The Grim Reefer ( 1162755 ) writes:
    
    You can reconstruct enough identifiable features from raw data plus you have to record quite a number of other features (age, weight etc. for radiation calculations)
    There's no ionizing radiation in an MRI. The age is not needed for the scan either. The weight is needed to calculate the SAR (specific absorption rate). In simple terms, it's so you don't cook the patient since RF pulses are being used to disrupt the magnetic field. These heat up the patient.
    . If you strip all that out (skull stripping, DICOM anonymize), it's no longer raw data AND it becomes very hard to distinguish things like image orientation.
    Only data that make it possible to identify the patient. The vast majority of the DICOM header does not. The patient name, MRN, etc. must be removed. The image orientation, flip angle, TR, FOV, slice thick
    - Re: (Score:2)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
      - Re: (Score:2)
        
        by medv4380 ( 1604309 ) writes:
        
        Then perhaps they should consider adopting a standard that can adhere to HIPAA privacy rules and provide a way to re-verify the analysis. Otherwise the research half of the fMRI scans are utilizing HIPAA as a shield to protect their conclusions. I work in Study Research that has to adhere to HIPAA rules, and there is quite a bit that can be included in a dataset sanitized of identity information. Otherwise no one would have their study retracted due to fraud because they could hide the dataset from scrutin
      - Re: (Score:2)
        
        by ceoyoyo ( 59147 ) writes:
        
        Not sure what you're doing, but you're doing it wrong. The DICOM standard includes very specific tags for identifying orientation unambiguously. In hundreds of thousands of images over a decade and a half I've never seen a DICOM file from an image acquisition system that didn't properly implement them.
        http://dicom.nema.org/medical/... [nema.org]
        Also, if you can't figure out all the directions except L/R with the skull stripped, you should probably take an anatomy class. Or look at a scan.
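As a sketch of how orientation can be recovered from the header alone: the DICOM Image Orientation (Patient) attribute, tag (0020,0037), stores six direction cosines for the image rows and columns in a patient coordinate system where +x points to the patient's left, +y to the posterior, and +z toward the head. The Python below takes those six numbers as a plain list and classifies the slice plane from their cross product. Reading the file itself is left out, and the function name `slice_plane` is made up for illustration.

```python
def slice_plane(image_orientation_patient):
    """Classify a slice as axial, coronal, or sagittal from six direction cosines."""
    rx, ry, rz, cx, cy, cz = image_orientation_patient
    # The slice normal is the cross product of the row and column directions.
    normal = (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)
    # The dominant component of the normal tells us which patient axis
    # the slice is (mostly) perpendicular to.
    dominant = max(range(3), key=lambda i: abs(normal[i]))
    return ("sagittal", "coronal", "axial")[dominant]

print(slice_plane([1, 0, 0, 0, 1, 0]))
print(slice_plane([1, 0, 0, 0, 0, -1]))
print(slice_plane([0, 1, 0, 0, 0, -1]))
```

Because these values describe geometry rather than identity, they can stay in an anonymized header without revealing who the patient is.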
- Re: (Score:2)
  
  by JoeMerchant ( 803320 ) writes:
  
  The whole field is full of "too expensive to do good science, but let's publish anyway." Magnet time runs $500/hr, too expensive to get an adequate number of subjects, or trials with a given subject, fMRI data is a time series of complex volumes - up until recently it was "too expensive" to store 1-2GB of data per subject-trial, but, but, it's just so cool, we wanted to share (and get our name on a publication.)
- Re: (Score:2)
  
  by nycsubway ( 79012 ) writes:
  
  There are two parts to this:
  1) The raw data may or may not be saved. But it costs money to save the data. Once the research study is finished, the money is gone too, so there may be no way to pay for storage to save the data. Some researchers may hold on to it, some delete it. Until very very recently, there was no universal funded repository for neuroimaging data either. Now the NIH mandates, and pays for, the long term archiving of all NIMH funded imaging studies, including genetics.
  2) The other problem i
Video game effect on the brain (Score:1)

by nvm ( 3984313 ) writes:

Meh! I already knew those studies (video games make you a psychopath/serial killer etc.) were crap, with an agenda.
To our benevolent overlords (Score:2)

by epine ( 68316 ) writes:

When a story embeds the same link three times in a row (once in the mast, then twice in the article text) pretty please with sugar on top display the redundant links with "[register.com]" following the link, just like it does in my configured article view.
Or, clever idea, you could display "[repeat link]" in each case where a link is repeated.
If you're feeling extra ambitious—but you don't wish to interrupt your feverish efforts to deliver proper Unicode support one minute more than absolutely necessa
Great! (Score:2)

by ADRA ( 37398 ) writes:

I love it when people run studies to actually verify / build-upon previous results. What I'm really seeing from this article is that there's a lot more "plug numbers into tool" research going on than I first expected. I would've hoped that the tools themselves would output confidence coefficients so that at least the researchers would have a clue as to how much magic they'd come up with...
fMRI false positives have been demonstrated before (Score:1)

by maas15 ( 1357089 ) writes:

One famous example of error-related problems with fMRI is the infamous brain scan of the dead salmon. I'm not sure if I can post a link but it's: http://www.wired.com/images_bl... [wired.com]
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  People these days want to believe against all sanity. There are no miracles and no mega-geniuses. Science done right is very slow and almost never revolutionary. Technology is the same.
Well known since at least 2009 (Score:2)

by Orgasmatron ( 8103 ) writes:

Scanning Dead Salmon in fMRI Machine Highlights Risk of Red Herrings [wired.com]
Neuroscientist Craig Bennett purchased a whole Atlantic salmon, took it to a lab at Dartmouth, and put it into an fMRI machine used to study the brain. The beautiful fish was to be the lab's test object as they worked out some new methods.
So, as the fish sat in the scanner, they showed it "a series of photographs depicting human individuals in social situations." To maintain the rigor of the protocol (and perhaps because it was hilarious)
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Glad to see that there is at least one actual scientist in that field. The others seem to be mainly morons with big mouths.
Now just wait (Score:1)

by axewolf ( 4512747 ) writes:

It will probably only take 20 years for them to come to the same conclusion about the detection of gravity waves
I've had my brain MRIed once... (Score:2)

by Applehu Akbar ( 2968043 ) writes:

But sure enough, they found nothing.
Sounds like confirmation-bias (Score:3)

by gweihir ( 88907 ) writes: on Wednesday July 06, 2016 @01:26AM (#52454361)

I.e. people seeing what they expect to see, not what is there. With the huge egos (but not nearly as large skills) in people doing Neuro-"Science" these days, I am entirely unsurprised. The grand claims about what they know and how things work have been a dead giveaway for years. Things are not that simple in practice.
