
 




The Importance — and Limits — of Very Large Data Sets

New submitter kodiaktau writes "A recently presented paper discusses how large data sets can improve learning algorithms, but points out that researchers still need to account for bias and incompleteness before drawing conclusions. The paper also goes into the need for responsible business practices to manage these data sets. 'There's been the emergence of a philosophy that big data is all you need. We would suggest that, actually, numbers don't speak for themselves.' The full paper is available through SSRN. Of particular importance is their assertion that even huge data sets can and will be affected by filters or the analyst who is interpreting it. '[Study co-author Kate Crawford] notes that many big data sets — particularly social data — come from companies that have no obligation to support scientific inquiry. Getting access to the data might mean paying for it, or keeping the company happy by not performing certain types of studies.'"


  • by garcia ( 6573 ) on Friday October 07, 2011 @10:22AM (#37638462)

    From the blurb:

    "Getting access to the data might mean paying for it, or keeping the company happy by not performing certain types of studies."

    Even if you're using data from public institutions, you may still have to pay for it (to cover the staff time needed to procure the data--especially if you're asking for something they don't normally provide, which is quite often). And while there won't be any limitations on what you can do with the data once you have it, the provider often doesn't know its own data and databases well, so it may simply hand you incomplete or inaccurate data anyway.

    So yeah, welcome to the world of using data. Move along, nothing to see here.

