New submitter kodiaktau writes "A recently presented paper discusses how large data sets can improve learning algorithms, but points out that researchers still need to account for bias and incompleteness before drawing conclusions. The paper also addresses the need for responsible business practices in managing these data sets. 'There's been the emergence of a philosophy that big data is all you need. We would suggest that, actually, numbers don't speak for themselves.' The full paper is available through SSRN. Of particular importance is the authors' assertion that even huge data sets can and will be affected by filtering and by the analyst who interprets them. '[Study co-author Kate Crawford] notes that many big data sets, particularly social data, come from companies that have no obligation to support scientific inquiry. Getting access to the data might mean paying for it, or keeping the company happy by not performing certain types of studies.'"