Many Surveys, About One In Five, May Contain Fraudulent Data (sciencemag.org) 115
sciencehabit writes: How often do people conducting surveys simply fabricate some or all of the data? Several high-profile cases of fraud over the past few years have shone a spotlight on that question, but the full scope of the problem has remained unknown. [Tuesday], at a meeting in Washington, D.C., a pair of well-known researchers presented a statistical test for detecting fabricated data in survey answers. When they applied it to more than 1000 public data sets from international surveys, a worrying picture emerged: About one in five of the surveys failed, indicating a high likelihood of fabricated data.
97% Consensus (Score:1)
Well, we know one does for sure.
Re:97% Consensus (Score:5, Funny)
Re: (Score:2)
Re: (Score:2)
The lie in this case is the 1 in 5 figure. In reality it is much higher.
Much higher than you think if you take into account surveys done by people who need to see a certain result - look at the survey which claims that 1 in 4 women are sexually molested or raped on campus.
One of the questions, IIRC, was "While drunk, did you ever have sex with someone you wouldn't normally consent to have sex with?", and counted that as a rape. Even if all other parts of the survey work were accurate, it's easy to get the result you want by asking the right question.
Re: (Score:1)
Then there are the domestic violence surveys that count using logic to counter a woman's argument as domestic violence as well as a man withholding sex from a woman. Funny how neither of those counted when it was a woman using the same tactics against a man. There were many more tactics used to inflate the figures, but I do not have time to go through that list, it wo
Gee, you don't suppose respondents lie? (Score:4, Interesting)
When I take most surveys I answer calculated to confound the test as much as possible, on the assumption that anyone could be doing so; and thus, to accelerate the process is to bring it to the attention of the researchers faster. The problem is that it's an inherently flawed method.
Re: (Score:1)
The problem is that it's an inherently flawed method.
The problem is that people believe what they see and hear
Re: (Score:1, Interesting)
Respondents lie and researchers lie. I've been unable to find it now, but a month ago I came across an essay from an established sociologist who was bemoaning the current number of researchers in his field who were conducting shit studies in the furtherance of promoting their political and cultural ideology. He examined a few prominent studies that were debunked when it came to light that the researchers had either manipulated, or dropped data points that conflicted with their study's conclusions, or outrig
Re: (Score:3)
Re: (Score:3)
When I take most surveys I answer calculated to confound the test as much as possible
If a few people did this randomly, then it wouldn't skew the results much. But it is not random. Liberals are more willing to participate in surveys, and more willing to answer honestly. Conservatives tend to be more cynical and calculating. Other factors skewing the results are that Democrats are more likely to be home, more likely to answer the phone, and more likely to participate in social media polls. Republicans are more likely to let their phone calls roll over to voice mail, but also more likely
Re: (Score:2)
Re: (Score:3)
Can you link to a paper, etc., that calculates the adjustments for liberal vs. conservative survey-taking patterns?
Sorry, I have seen these issues mentioned several places, including Nate Silver's blog, but I have never seen them actually quantified.
I think the most famous skewed poll was in 1948. Phone surveys predicted a decisive win by Dewey, but instead Truman was re-elected. The reason was that back in 1948, households with lower incomes (and more likely to vote Democratic) often didn't have a phone.
Re: (Score:2)
Re: (Score:2)
Google does the whole survey thing with its Google Opinion Rewards app, with Google Play Store credit as an incentive. They address this problem by asking you a bogus question like "Have you been to any of the following locations recently?" and then listing locations that do not exist. If you answer in the affirmative, they know you're lying and cut you off from surveys in the future.
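A minimal sketch of the trap-question idea described above, assuming a multiple-choice question in which some options are known to be fake. The function name, the option labels, and the place names are all invented for illustration; this is not Google's actual implementation.

```python
def failed_trap(selected, fake_options):
    """Return True if the respondent picked any option known to be fake."""
    return any(choice in fake_options for choice in selected)


# Hypothetical question: "Have you been to any of the following locations recently?"
# The place names below are made up for this example.
fake_options = {"Lake Zorbulon Mall", "Port Quindle Cafe"}

honest = ["None of the above"]
careless = ["Port Quindle Cafe"]

honest_failed = failed_trap(honest, fake_options)
careless_failed = failed_trap(careless, fake_options)
```

A respondent who trips the check could then be excluded from the results or from future surveys, as the comment describes.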
Re: (Score:2, Insightful)
Your response will most likely be washed out by a sea of honest responses.
Most participants respond to the best of their ability---although it cannot be assumed they are always correct. Respondent error, even about themselves, is more common than outright deception.
Researchers are aware that participants lie due to self-deception, social desirability, deliberate sabotage, and other reasons. Surveys often incorporate measures to detect deception.
If you answer in a nonsensical fashion, the worst you'll do is
Re: (Score:2)
The problem is that it's an inherently flawed method.
Yet Nate Silver accurately predicted the last two presidential election outcomes. Put another way, just because you're a skeptic doesn't mean you're not the one making shit up and chasing ghosts that aren't there.
Re: (Score:2)
As a gross estimate, the odds are 1/4 that he would have predicted both elections successfully. I'd give it a bit longer before deciding that he's got all the answers.
Re: (Score:2)
There were 50 states and he nailed them all.
Re: (Score:2)
Re: Gee, you don't suppose respondents lie? (Score:1)
Self referential? (Score:5, Insightful)
23.7 % of statistical analyses make up their statistics.
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Re:only 1 in 5 fraudulent? (Score:5, Informative)
The rest are merely intentionally misleading. You can get just about any answer you want, without making up data, by carefully selecting your questions and your survey population.
Re: (Score:2)
There's nothing wrong with statistics as a field. There's plenty wrong with how some people apply it.
A couple of sayings come to mind: "Figures don't lie, but liars figure", and "He used the statistics as a drunk uses a lamppost: for support, not illumination."
Re: (Score:2)
You can lie about anything. Claiming that it is a statistic doesn't make it anything other than another lie.
It's also true that statistics are easy to misuse, but the problem isn't (usually) lies, it's usually people not understanding what they are doing...or at least so I believe.
Statistics was invented to solve practical problems, specifically to win gambling games without cheating. It works. (And the original games it was developed for have for some reason become unpopular, or had their rules changed.)
One in 5 surveys incompetently fraudulent (Score:5, Insightful)
It's not that only 1 in 5 surveys may contain fraudulent data, it is that the fraud is only incompetent enough to be caught by this method in 1 in 5 surveys.
Re: (Score:2)
Re: (Score:2)
They could have both false positives and false negatives.
They eliminated studies which had a prima facie case for having highly similar responses. However, it is possible for them to miss a study which generates fairly consistent response patterns for non-obvious reasons.
I can buy the 17% number. A sizable majority are legitimate scientific endeavors, but there are enough bad actors that you need to actively seek them out.
I seem to recall about 9% of Americans have a felony conviction. If you assume that's
I'm not actually surprised ... (Score:5, Insightful)
Like it or not, a lot of public opinion polls are paid for by people who want to support a specific point.
Public opinion polls these days are as much PR and marketing as anything else.
Honestly, Pew makes money doing this stuff; honest player or not, they have a vested interest in keeping up the belief that their stuff is honest, unbiased, and accurate.
But I'm entirely willing to believe opinion polls are carefully crafted, or sneakily tweaked, to arrive at the conclusions they've been commissioned to arrive at.
Re: (Score:1)
Right, like the Gartner Magic Quadrant shit that most C-suite people drool over. I've talked to several software vendors who have told me that Gartner approaches them on the side and allows them to "buy up" their rankings for the right price.
Re:I'm not actually surprised ... (Score:4, Insightful)
Or bond ratings agencies.
I suspect most people, except the people who cite those things, have long since assumed they're full of shit and the conclusions are paid for.
Why would you assume it's honest and objective information? Someone has to make money off it.
And if you don't like that, start your own foundation or think tank, and have them publish stuff to your liking.
Sorry, it's all PR and marketing. It sure as hell ain't facts or accurate predictions.
Actually... No. Exactly opposite of that. (Score:2)
Public opinion polls these days are as much PR and marketing as anything else.
Honestly, Pew makes money doing this stuff; honest player or not, they have a vested interest in keeping up the belief that their stuff is honest, unbiased, and accurate.
But I'm entirely willing to believe opinion polls are carefully crafted, or sneakily tweaked, to arrive at the conclusions they've been commissioned to arrive at.
This is NOT AT ALL about public opinion polls which "people who want to support a specific point" would pay to skew in a certain direction NOR is it about polls that are designed (or "sneakily tweaked") to create certain results.
It's not even about confirmation bias by the pollsters or researchers leaching into the data.
This is about finding cases where a pollster would just sit down and fill out survey after survey by themselves instead of going door to door.
I.e. Charging for a field survey, forging the
Something's Fishy... (Score:2)
...a pair of well-known researchers presented a statistical test for detecting fabricated data in survey answers...
That sounds a little suspect...
May not be fraud - simply incompetence. (Score:5, Informative)
If you ask: "Do you believe that mothers should be able to legally murder their babies within 2 months of the creation of life?" you get a very different answer than if you ask "Do you believe that women should have the legal right to abortion when the fetus can be demonstrated to show no brain activity more significant than that of a snail."
This might be intentional, or simple unconscious bias.
Re: (Score:1)
Never attribute to incompetence that which can easily be explained by malice.
Consider how loaded and/or leading these questions often are. How unwanted results are known to be thrown out. How population samples are carefully selected to maximize the odds of a certain result.
To assume that fraudulent data would be some mere accidental whoopsie -though perhaps claimed as such when those defrauding us are caught- would be not only blind, but total sensory deprivation.
Re: (Score:2)
Who uses the word "baby" that way? That's an important question here. Also, what do "we" do to distinguish a "baby" still inside a woman from one that's been born?
Including This Study? (Score:2)
So one in five studies might contain false data. Essentially what they are saying is that there's a 20% chance that they are lying about there being a 20% chance of them lying.
Re: (Score:2)
Be careful...read what they actually said. (Score:3)
What they ACTUALLY said was that in surveys conducted in the Western World - only 5% failed their test - but in developing countries - the number was 26% of faked surveys.
Then, they also say that the KIND of survey matters. Their approach is to say that if 85% of answers are identical between two or more respondents then the result is likely to be faked...but they recognize that (for example) in a health survey, all of the healthy people will answer identically to questions about how healthy they are. So that kind of survey is excluded.
So if the research is to be taken at face value, then in the Western world, one in twenty of *some* classes of survey are probably faked. But they looked at 1000 surveys to arrive at that number - we don't know what fraction of those came from the developing world. If all you're interested in is Western World surveys - then maybe the sample size is very small. Given that there are some classes of survey that are known to be excluded - is it possible that they included a few of "the wrong kind" in their sample?
All surveys have an error bar of a few percent - this is a survey about surveys.
I think the conclusion here is that you should ignore surveys carried out by dubious agencies in the developing world. I don't think you should conclude that surveys done by reputable agencies in the western world are unreliable.
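The 85% similarity check described above can be sketched roughly as follows. This is an illustration only, not the researchers' actual code: the function names, the toy answer data, and the choice of ">=" at the threshold are all assumptions. It compares every pair of respondents and flags anyone whose answers match some other respondent's on at least 85% of questions.

```python
from itertools import combinations


def percent_match(a, b):
    """Fraction of questions on which two respondents gave the same answer."""
    if len(a) != len(b):
        raise ValueError("respondents must answer the same questions")
    same = sum(x == y for x, y in zip(a, b))
    return same / len(a)


def max_match_per_respondent(responses):
    """For each respondent, the highest match fraction with any other respondent."""
    best = [0.0] * len(responses)
    for i, j in combinations(range(len(responses)), 2):
        m = percent_match(responses[i], responses[j])
        best[i] = max(best[i], m)
        best[j] = max(best[j], m)
    return best


def flag_suspicious(responses, threshold=0.85):
    """Indices of respondents who match someone else at or above the threshold."""
    best = max_match_per_respondent(responses)
    return [i for i, m in enumerate(best) if m >= threshold]


# Toy data: each inner list is one respondent's coded answers to 10 questions.
responses = [
    [1, 2, 3, 4, 1, 2, 3, 4, 1, 2],
    [1, 2, 3, 4, 1, 2, 3, 4, 1, 3],  # matches respondent 0 on 9 of 10 answers
    [4, 3, 2, 1, 4, 3, 2, 1, 4, 3],
]
flagged = flag_suspicious(responses)
```

With few questions or few answer options, high matches can occur by chance, which is why a single fixed cutoff is open to dispute.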
Re: (Score:3)
Re: (Score:2)
Leading questions are also an issue (Score:1)
A few years ago a survey said 2/3 of the Danes support nuclear power. Two days later another survey said 1/3 support it. A journalist then figured at least one of them is wrong and started digging. It turned out that they didn't answer the same question. 2/3 said yes when asked if nuclear power can be used to reduce worldwide CO2 emissions. 1/3 said yes to Denmark gaining nuclear power for environmental reasons. Despite not answering the same question, the press releases and following headlines made it lo
Re: (Score:3)
It had absolutely no hint of the city being fictional.
If they're willing to bomb a city and kill people without even knowing the specific reason for that bombing, their ignorance is truly dangerous to the world.
If they cannot immediately recall the "where" and "why" to justify homicide, they deserve to be embarrassed.
The only conceivable justification for their position is the lack of an option for indicating "unsure" or "no opinion". And I'd be shocked if a modern survey didn't include that.
How ironic would it be... (Score:1)
...if the data from THIS survey was deliberately falsified to see if anybody actually checked the sources?
Re: (Score:2)
Re: (Score:1)
You're right, I was mistaken in calling it a "survey". But my point was, did anybody verify their data? :D
Re: (Score:2)
Yes, Pew did it. Then they pulled it offline... (Score:2)
From TFA:
During her turn on the stage, Kennedy mounted an attack on the test's methodology. For example, she points out, it does not account for the number of questions on a survey, the number of respondents, nor other factors that can skew the results. She also takes exception to the 85% similarity threshold. "I would choose a different threshold depending on the population and the survey," she says. By putting a number on the extent of data fabrication across all surveys, "they took it too far," Kennedy says. Pew's rebuttal is now online.
Some at the meeting saw merit in both sides of the fight. Rather than overestimating data fabrication, the method of Kuriakose and Robbins "very likely underestimates the true extent of the problem," says Michael Spagat, an economist at Royal Holloway, University of London, who has investigated high-profile cases of possible data fabrication in war zones. Yet Kennedy's response impressed him, too. "I think the Pew paper is interesting and made some good points," he says. "Specifically, there isn't a hard and fast cutoff beyond which you know there is fabrication." Overall, however, Spagat remains very concerned about data fabrication in surveys. "Robbins and Kuriakose have uncovered a massive problem and the Pew paper doesn't change that."
Nothing was settled by the end, says the meeting's co-organizer Steven Koczela, president of the MassINC Polling Group in Boston and a previous survey research leader for the U.S. State Department. The case laid out by Kuriakose and Robbins "seems unassailable to me," he says, "but [Pew] are giving it their level best."
This seems the hard way easier method: (Score:1)
How many? (Score:3)
Many Surveys, About Eight In Three, May Contain Fraudulent Data
All surveys are not the same (Score:3)
"Do you prefer chocolate or vanilla?" is different from "Do you support Falun Gong?" Opinion surveys always have to account for the confounding factor that each respondent may be more likely to provide the socially acceptable answer than their true opinion. The stronger the social stigma associated with the question, the more likely this will be a problem.
This new test is a useful addition to the data analysis process, but doesn't "prove" anything. The challenge is how to refine the technique. If you want to eliminate "false positives" you would need some way to identify "true positives". And if we had a way to do that, we wouldn't need to do surveys.
Bottom line: Surveys don't prove anything. At best they point to interesting ideas for future study.
Re: (Score:1)
Right, there was a recent survey of Americans where something like 50% said they went to church every Sunday.
The real number was something closer to 20%, but the people didn't want to admit that.
There are actually statistical models and proven theory behind filtering out noise like that. On top of that, it also provides an indication of uncertainty and provides an interval where the right answer likely is. The problem is that statistical analysis and the math involved is fairly complex. Getting people with a Ph.D. would be good and master of science would do too, but looking at how long it takes to handle the data and the amount of skilled people, you get a max for how many surveys can be made. Sadl
Re: (Score:2)
Similarly, a survey would probably tell you that 0% of the population had ever been arrested.
"We surveyed a thousand prison inmates. 99.9% reported they were actually innocent. The exception was some guy named 'Andy'."
Re: (Score:2)
If they publish their method with sufficient details for others to duplicate it, the cheaters will be able to use it as well.
If they vet their fabricated data to ensure it passes muster, we will have a real problem.
I fear this could end up like the arms race between malware coders and antivirus vendors. Because I doubt the good guys would have much of a chance here either.
Research shows (Score:1)
Over half of all analysis of surveys is flawed.
Everyone does it. (Score:2)
I once had a girlfriend who was doing her PhD. She habitually 'nudged' her data co-ordinates closer to the line of best fit. Other data she merely fabricated.
Re: (Score:2)
That was just plain wrong. You have to get your doctorate before you are qualified to doctor data.
And the rest... (Score:1)
Survey says (Score:1)
Half the marketing departments I've seen (Score:5, Interesting)
Roughly half the marketing departments at companies I've worked for have used half-baked surveys to gather statistics so the company name and the statistic get repeated in the industry over and over again.
This often happens like this: "At (industry conference) this year, let's pass out a survey asking whether or not someone has ever heard of a coworker getting hacked by (whatever threat our product purports to mitigate)." Survey goes out to already half-paranoid people walking by, and the entire marketing and sales department fills one out that says 'yes I have'. A week later a press release goes out that says "(company) surveyed (# of people) IT managers and other attendees at (conference) and found that (high percentage) had direct knowledge of a coworker getting hacked by (threat)." Very often this stuff gets picked up by the press, bloggers and even other competitors, and the essentially made-up stat gets repeated and repeated until some people even think it's true.
Examples:
- http://www.tripwire.com/compan... [tripwire.com]
- http://www.prnewswire.com/news... [prnewswire.com]
- https://www.voltage.com/breach... [voltage.com]
I do that (Score:1)
When surveys ask for personal details that I don't want to supply, I often put crap data in if they don't offer a way to bypass it, such as a "Do not wish to state" option among the choices. For example, age, occupation, and income level.
Survey Says! (Score:2)
How does this affect the outcome of every episode of FAMILY FEUD??? Maybe the Jones family won after all?
Who takes a 100 questions survey? (Score:2)
Who in the world takes a 100 question survey? Every survey I've been asked to take has been less than 10 questions.
Don't know how many fraudsters there are out there (Score:2)
But I can say that on every porn site that asks, I can truthfully say that I was born on January the 1st, 1927.
I have personally falsified survey data (Score:2)
Re: (Score:2)
Re: (Score:2)
Based on the questions I see on surveys, and comparing to the quality of writing displayed here, I totally believe that you used to own a polling company.
Well that explains... (Score:1)