Science

Why the Cloud Cannot Obscure the Scientific Method

Posted by CmdrTaco
from the because-of-science-dude dept.
aproposofwhat noted Ars Technica's rebuttal to yesterday's story about "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete." The response is titled "Why the cloud cannot obscure the Scientific Method," and is a good follow-up to the discussion.
This discussion has been archived. No new comments can be posted.

  • by Bandman (86149) <bandmanNO@SPAMgmail.com> on Thursday June 26, 2008 @08:45AM (#23947485) Homepage


    Because a datasource isn't a process?

  • missing link (Score:4, Insightful)

    by lhorn (528432) <lho@ffi . n o> on Thursday June 26, 2008 @08:51AM (#23947555)
    http://arstechnica.com/news.ars/post/20080625-why-the-cloud-cannot-obscure-the-scientific-method.html [arstechnica.com]
    I like the fact that the web and search/aggregate engines may combine vast amounts of data in ways we now
    cannot imagine - it expands the field for new scientific research enormously. Replace science? No.
    • by kalirion (728907) on Thursday June 26, 2008 @10:02AM (#23948471)

      What, you mean I can't just google for "unified field theory" and get the right answer? Why does the universe have to be so hard?????

    • I believe science is a direct descendent of the capacity for deduction as granted by our model-making brains. In our internal symbol sense, we often use the subjunctive tense, if when why hypothetically depicting, don't call it science, fine, it's still predicting what will happen from what has happened. -- that's what i'm rappin'
  • Crack cocaine makes you stupid.

    Oh, you were talking about the "information cloud" the crackheads at Wired always talk about. Never mind.

    • by ceoyoyo (59147)

      I think I figured out why they use "the cloud." Obviously all the good patents for "... on the Internet" have been taken, so they're just making possible a new round of frivolous patents with the phrase "... in the cloud."

  • by Anonymous Coward on Thursday June 26, 2008 @08:58AM (#23947621)

    Latest addition to bullshit bingo cards:

    CLOUD

  • by Hoplite3 (671379) on Thursday June 26, 2008 @09:01AM (#23947653)

    I'd say that the models are the science. They're how you explain your data. They provide evidence that the experiments make sense, and they guide you by making predictions you can test.

    Moreover, SIMPLIFIED MODELS are good science. Understanding which details can be omitted without impacting the predictive ability of your model shows you know which effects are important and which aren't.

    • I agree, but... (Score:4, Insightful)

      by wfolta (603698) on Thursday June 26, 2008 @09:37AM (#23948129)

      What you say is true, Hoplite3. The big issue I see is how people define "model". My guess is that quite a few unfortunately define it as "I got 3 asterisks in the significance test", whether the "model" (say, linear regression) makes sense or not.

      I forget where I read it, but I've been studying linear regression, and there was a fascinating example where, if they had used linear regression techniques on the early "drop the cannonball and time its fall" data, they would have come up with a nice, highly significant linear regression for gravity.

      Then there is the whole issue of explanation versus prediction. Something can be predictive while providing no explanation, and perhaps that's where the petabyte idea is going: who cares about explanation if prediction is accurate enough? (Not my philosophy, BTW.)
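
A minimal sketch of the cannonball point, assuming idealized free-fall data (the numbers below are made up for illustration, not historical measurements, and `fit_line` is a hypothetical helper). A straight line fitted to distance against time scores a high R-squared, yet its residuals bend systematically, which is the hint that the "model" is wrong:

```python
# Hypothetical free-fall data generated from d = 0.5 * g * t^2.
# Illustrative values only; not historical measurements.

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, 1.0 - ss_res / ss_tot

G = 9.81                                        # m/s^2
times = [0.5 + 0.1 * i for i in range(11)]      # 0.5 s to 1.5 s
dists = [0.5 * G * t * t for t in times]

# Straight line through distance vs. time: fits "well" by R-squared alone.
a, b, r2 = fit_line(times, dists)
residuals = [d - (a + b * t) for t, d in zip(times, dists)]

# Distance vs. t^2: the slope recovers g/2, i.e. the physical law.
a2, b2, r2_sq = fit_line([t * t for t in times], dists)

print(f"linear in t:   R^2 = {r2:.3f}")
print(f"residuals:     first={residuals[0]:.2f} mid={residuals[5]:.2f} last={residuals[-1]:.2f}")
print(f"linear in t^2: slope = {b2:.3f} (g/2 = {0.5 * G:.3f})")
```

The residuals are positive at both ends and negative in the middle, a pattern a significance test alone would not flag.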

      • Re:I agree, but... (Score:5, Interesting)

        by Hoplite3 (671379) on Thursday June 26, 2008 @09:58AM (#23948415)

        Yes, I think that prediction without explanation is fascinating, but I don't know if it's what I like about science :) Have you ever heard Leonard Smith speak? I saw him at SAMSI, but his MSRI talk is online and is roughly the same. He's a statistician who works in exactly this.

        Some fancy-pants technique he has is better at predicting the future behavior of chaotic systems (like van der Pol circuits or the weather) than physical models. But he also points out that these predictions don't tell you what type of data to collect to make better predictions, and that they don't generalize. One nice "model" he has can predict the weather at Heathrow better than physical weather models (from the same inputs: wind speed, temperature, pressure, etc), but it's useless for predicting the weather in Kinshasa until the model is re-trained.

        I think these types of data analysis tools will be very important in the future, but they won't replace the explanatory power of models. Just like how scientific computing is useful, but never replaced actual experiments.

      • Re:I agree, but... (Score:5, Insightful)

        by aurispector (530273) on Thursday June 26, 2008 @10:01AM (#23948449)

        Thank you. Sure, there's a ton of data out there, but how was it collected? What statistical methods were used to analyze the data? How did you select the data set you're analyzing? Nothing I understand about science really applies to data mining a so-called "cloud". Prediction without explanation is just observation. Observation in and of itself is not science. You might have data, but is it the right data?

        I see all this petabyte stuff as interesting and even as a valuable adjunct to real science, but a basic requirement of science is reproducibility and you can't reproduce the data collection.

        • I think the consensus is that the original article is a bit presumptuous and flawed. He says that science will be replaced, which implies that there is a hardened definition for how science is to be performed currently, which there isn't. There is no ONE definition of science or the scientific method.

          From a junior high school site about the scientific method:

          "Six steps of the S. M.
          State the problem: Why is that doing that? Or Why is this not working?
          Gather information: Research problem and get
      • by mckorr (1274964)
        Until they shot said cannonball out of a cannon and noticed that it doesn't follow a straight line. For that you need a quadratic regression.

        Linear regression is good for making predictions given strong correlation between items in a data set, but the linear equation you get is a probability, not the solution to the actual data. To show this, plug in the values for any given data point and see if the equation produces the exact results.

        Granted, at the quantum level we are dealing in probabilities, but

      • by ceoyoyo (59147)

        That's kind of a bad example. Galileo basically did just that: rolling marbles down inclined planes and looking for a simple relationship that fit the data. Correcting for the inclination of the plane, he found one.

        I don't remember how far Galileo got in explaining what the various terms in the relationship were, but Newton certainly finished the job. Only when that experimental relationship was explained did we get the theory of gravity and kinematics.

      • The discussion needs to bring in some other terminology.

        • Mapping: Google creates the largest and most detailed mappings of some subjects that we have ever seen. Further, it provides a number of map manipulation tools that are incredibly fast and easy to work with.
        • Territory: The map is not the territory; what Google delivers is always suggestive of the way the world actually is, but should never be mistaken as reality. The data Google draws on is abstracted from reality, and there may be several metadata
        • I agree with most of your post; however, since Google runs on a computer it MUST use an algorithm to search. Once the algorithm returns the hit list, it is up to the human to use heuristics to determine if the results are useful or not.
  • another obvious history.

    I am sorry Google, but your ad business model will be terminated by random page requests. It is already happening, no 'pseudo' articles will help.

  • Leonardo da Vinci is reputed to be the last person who "knew everything" that there was to know during their lifetime. Even that wasn't true. But the scientific method has been the key to both creating and coping with a "data deluge".

    Science suffers when there's too little data: scientists then must generate more by observation, or do something else that isn't science (and doesn't work nearly as well). Too much data is only a problem if you're willing to settle for imprecise/inaccurate results. I'm sure ther

  • by tist (1086039) on Thursday June 26, 2008 @09:14AM (#23947809)
    A large source of data that has a correlation does not somehow imply causation. Even if it works under some conditions (or even all conditions). The science happens when the causation is determined and then applied.
    • by damburger (981828)

      Yup. Mathematicians gushing about clouds and implying they have made science obsolete need to have that branded on their butts then be sent back to the mathematics department. They've already done quite enough giving us string theory (look! it's internally consistent! it sounds cool! ergo it's real!)

      • Re: (Score:3, Informative)

        Hey, don't try to pin all that stuff on mathematicians: the original cloud-gushing author, Chris Anderson, says, "background is in science, starting with studying physics and doing research at Los Alamos. [thelongtail.com]"

      • Re: (Score:2, Interesting)

        by mckorr (1274964)
        I'm a mathematician, and I have never heard a colleague make the claim that science is obsolete.

        Mathematics is the language of science, and there has never been an advancement in either one without an accompanying advance in the other.

        A mathematician might "gush" about clouds of data, and work on the mathematics of it, but if he insisted it made science obsolete he'd be tossed out on his ear.

        Oh, and string theory? That was the physicists. The mathematicians were pissed off that someone found a use f

    • by maxume (22995)

      Of course correlation implies causation. When things are correlated, it is often a good place to look for causation. That's exactly what "imply" means.

      Correlation doesn't *prove* causation.

      There is a difference.

      • by damburger (981828) on Thursday June 26, 2008 @09:38AM (#23948141)
        Wrong - imply has a very specific meaning to mathematicians and scientists. 'A implies B' means that if A is true, B MUST be true also.
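
The strict logical sense described here is material implication. A small sketch, assuming that reading of "implies" (`implies` is a hypothetical helper name), prints its truth table:

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

# Enumerate all four combinations of truth values for A and B.
rows = [(a, b, implies(a, b)) for a, b in product([True, False], repeat=2)]
for a, b, result in rows:
    print(f"A={a!s:5} B={b!s:5} A->B={result}")
```

The only row that comes out false is A true with B false, which is why a single correlated-but-not-causal pair is enough to show that correlation does not imply causation in this sense.
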
        • Re: (Score:2, Interesting)

          by maxume (22995)

          Fine. I'll try to restate my point using more specific language.

          The fact that correlation does not imply causation isn't nearly as troublesome as the volume of "Remember folks correlation!=causation" would have us believe; lacking other evidence, it is a reasonable assumption to start with.

          • Re: (Score:3, Interesting)

            by damburger (981828)
            But nobody said that here, so your whole point is a strawman. I think it's safe to assume that nobody on /. thinks correlation!=causation because that would make all science impossible.
          • The correlation != causation tag is usually applied because either:

            1. There are obvious confounding factors the article fails to mention
            2. There's a good chance the direction of the arrow of causation is incorrect. e.g. just because firemen tend to be where you see big fires, doesn't mean they cause them. Or perhaps less obviously, aluminum doesn't cause Alzheimer's, it builds up in the brain as a consequence of Alzheimer's. Statistical inferences are only as good as the data available to you, and you need theo
            • by maxume (22995)

              The article makes the mistake of assuming that new methods that can be used when you have bigger piles of information will make the old methods less powerful. As you say, it is often the case that they can be used together, resulting in faster/better/cheaper results.

        • ... and correct if you ask some logicians and linguists. For us, imply [stanford.edu] means "something meant although not said but (through different mechanisms) conveyed" and entail [wikipedia.org] means "if A is true, B MUST be true also".
      • Re: (Score:3, Insightful)

        In science, the phrase usually used is "correlation does not imply a specific causation." It does, of course, imply some correlation and most of modern science is noticing correlations and testing for causation.

    • Actually there is a statistical concept "causation" as well.

      So yes, correlation does not imply causation. The reverse is true, though: causation implies correlation. There is only one mathematical relation between "things that correlate" and "causes" that supports this outcome: intersection. All causes correlate.

      So you only need another mathematical property of causation, take the intersection of the concepts and there you'll have a much more precise source for causation.

      You could also simply take the t

      • by zacronos (937891)
        If correlation occurs with a temporal shift, it is trivially simple to separate cause and effect.

        I have to disagree with that -- it's kinda correct, but I think it oversimplifies and misses some situations. (Note that I'm talking about the general case, not your solar output example in particular.)

        As one example, imagine someone without an understanding of the physics of weather discovered that, at least 10 minutes prior to the arrival of any major thunderstorm, all birds in a particular forest stopp
        • Yes but those birds and the thunderstorm do have a very important connection :

          these events SHARE CAUSES. This is true for your second example as well. They would never satisfy the second part of the causation demand : A correlates with B (with a timeshift) but B never decorrelates with A (with or without a timeshift).

          In other words: it is a specific type of deviation in correlation that implies causation in statistical data.
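
The shared-cause case in the birds-and-thunderstorm example can be sketched with synthetic data. Everything here is an assumption made for illustration: a hidden `pressure` series drives `birds` one step later and `storm` three steps later, and neither observed series influences the other. The birds still "lead" the storm by two steps with a strong correlation:

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def lagged_corr(leader, follower, lag):
    """Correlation between leader[t] and follower[t + lag]."""
    return pearson(leader[:len(leader) - lag], follower[lag:])

rng = random.Random(0)
n = 2000
pressure = [rng.gauss(0, 1) for _ in range(n)]          # hidden shared cause

# Each observed series is the shared cause, delayed, plus independent noise.
birds = [(pressure[t - 1] if t >= 1 else 0.0) + 0.3 * rng.gauss(0, 1) for t in range(n)]
storm = [(pressure[t - 3] if t >= 3 else 0.0) + 0.3 * rng.gauss(0, 1) for t in range(n)]

lead2 = lagged_corr(birds, storm, 2)      # birds now vs. storm two steps later
same_time = lagged_corr(birds, storm, 0)  # no shift

print(f"birds lead storm by 2 steps: r = {lead2:.2f}")
print(f"no time shift:               r = {same_time:.2f}")
```

A time-shifted correlation therefore narrows down which variable could be the cause, but it cannot by itself rule out a third factor acting on both with different delays.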

          • by zacronos (937891)
            Yes but those birds and the thunderstorm do have a very important connection : these events SHARE CAUSES. This is true for your second example as well.

            Yes, exactly, that's what I was getting at.

            They would never satisfy the second part of the causation demand : A correlates with B (with a timeshift) but B never decorrelates with A (with or without a timeshift).

            You said "If correlation occurs with a temporal shift, it is trivially simple to separate cause and effect." If you were implying additional
    • by eli pabst (948845) on Thursday June 26, 2008 @10:13AM (#23948661)
      You're exactly right. In fact if anything, science has started moving *away* from the kind of purely computational and statistical correlations that you get through data mining. Granted they are extremely important for generating hypotheses, but journals are much less likely to accept a paper without some kind of experimental validation.

      The large scale genetic association studies are a great example. There was a day that you could publish a paper solely describing a correlation between a variant in gene X and its association with disease Y. However, because of the way we do statistics in science, sooner or later you'll find a statistically significant correlation simply due to chance alone. In fact the epidemiologist John Ioannidis wrote an article [plosjournals.org] about this (that I believe appeared on Slashdot as well). Now you're often required to show some kind of experimental validation that there is a biological basis that verifies the statistical correlation. The scientific method is not going away anytime soon.
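
The "significant by chance alone" effect is easy to reproduce. The sketch below is an illustration under stated assumptions, not a real association analysis: carrier status and disease status are independent coin flips, each hypothetical variant is checked with a two-proportion z-test at the 5% level, and the function names are invented for this example.

```python
import random

ALPHA = 0.05
Z_CRITICAL = 1.96   # two-sided cutoff for ALPHA = 0.05 under a normal approximation

def familywise_error_rate(num_tests, alpha=ALPHA):
    """Chance of at least one false positive across independent null tests."""
    return 1.0 - (1.0 - alpha) ** num_tests

def count_chance_hits(num_variants, num_people, rng):
    """Count variants that look 'associated' with disease when no effect exists."""
    disease = [rng.random() < 0.5 for _ in range(num_people)]
    total_sick = sum(disease)
    hits = 0
    for _ in range(num_variants):
        carrier = [rng.random() < 0.5 for _ in range(num_people)]
        n1 = sum(carrier)                 # carriers
        n0 = num_people - n1              # non-carriers
        sick1 = sum(1 for c, d in zip(carrier, disease) if c and d)
        sick0 = total_sick - sick1
        pooled = total_sick / num_people
        se = (pooled * (1 - pooled) * (1 / n1 + 1 / n0)) ** 0.5
        z = (sick1 / n1 - sick0 / n0) / se
        if abs(z) > Z_CRITICAL:
            hits += 1
    return hits

rng = random.Random(42)
hits = count_chance_hits(num_variants=1000, num_people=400, rng=rng)
print(f"'significant' variants out of 1000 with no real effect: {hits}")
print(f"P(at least one false positive in 100 tests) = {familywise_error_rate(100):.3f}")
```

Roughly one in twenty null variants clears the threshold, so a scan over thousands of variants will tend to produce publishable-looking correlations unless a correction or an independent experiment is required.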
    • by ceoyoyo (59147)

      Yes, it does imply causation. Just not necessarily the obvious one. The correlation != causation meme is technically accurate, but the writer of the previous article, as do so many people here, managed to screw it up completely by assuming that a correlation between two associated factors that is not a causal relationship between those factors is coincidence. It isn't. For a sufficiently strong correlation it implies a causal relationship between those two factors and a third factor.
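
A sketch of the "third factor" case, using synthetic data as an assumption: `z` drives both `x` and `y`, which never influence each other. The raw correlation between `x` and `y` is strong, and it largely vanishes once the part explained by `z` is removed from each (a simple partial correlation; the helper names are hypothetical):

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def remove_linear_effect(values, factor):
    """Residuals of values after regressing them on factor (least squares)."""
    n = len(values)
    mv, mf = sum(values) / n, sum(factor) / n
    slope = (sum((f - mf) * (v - mv) for f, v in zip(factor, values))
             / sum((f - mf) ** 2 for f in factor))
    return [(v - mv) - slope * (f - mf) for f, v in zip(factor, values)]

rng = random.Random(1)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]             # the third factor
x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]        # caused by z only
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]        # caused by z only

raw = pearson(x, y)
partial = pearson(remove_linear_effect(x, z), remove_linear_effect(y, z))

print(f"raw correlation of x and y:        {raw:.2f}")
print(f"after controlling for the factor:  {partial:.2f}")
```

The catch, as the thread notes, is that you can only control for a factor you have thought to measure.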

  • by gopla (597381) on Thursday June 26, 2008 @09:16AM (#23947817)

    All models are wrong, but some are useful.

    We still need scientific methods to develop useful models and understand and refine the existing models. When Newton defined his mechanics that was the state of the art in his era, and now we have progressed to quantum mechanics which might be refined tomorrow.

    But mere observation of some phenomena is not sufficient to postulate the behaviour in a changed condition. A scientific model and its rigorous application is required for this. Correlations drawn from the cloud cannot substitute it.

    gopla

    • by 99BottlesOfBeerInMyF (813746) on Thursday June 26, 2008 @11:10AM (#23949581)

      All models are wrong, but some are useful.

      All models are wrong, to some degree. A better way to put it is all models are imprecise, but some are precise enough to be useful. 'Wrong' is a very flexible word and can easily lead to a misunderstanding in this context.

      • by PCM2 (4486)

        all models are imprecise
        I'm not sure it's even helpful to state it this way. It starts to sound like what you're saying is that a model is not a precise representation of real life, but rather is a simplified representation designed to make it easier to extract pertinent data. Mind you, I'm not trying to put words into your mouth or anything...
      • by Shadowlore (10860)

        "All models are wrong, to some degree." == All models are wrong. Either they are wrong or they are not wrong.

        Precision is not the implication, correctness is. A model is a model because it is incorrect in some way - it is an approximation. "Only a little wrong" does not make it not wrong.

        Buffalo buffalo buffalo.

        The reason the distinction of all models being wrong is important is to limit people believing the model is the real world. Far, far too many "scientists" these days do all of their work in

  • The point of the last story was horribly miscommunicated. There were two main points. The first is that data is expanding in such scope that hierarchal organization systems don't work and that the second is we're approaching a time where the method or analysis of data to show causation will come from correlation, because you can determine all the variances due to the fact that all the variables have been accounted for. Look at the human genome project or folding at home.

    I don't think this is completely tr

    • by phobos13013 (813040) on Thursday June 26, 2008 @09:39AM (#23948161)
      You seem to be missing a fundamental flaw in the argument. No matter how many parameters you account for a) you can never account for ALL parameters of this system we call life (if for no other reason, there may well be some we don't know about yet!), and b) most importantly, even if you DO have all the parameters and the results show a correlation, there is no logical jump one can make that says it is the cause of the observed behavior.

      Truly what yesterday's article was saying is that causation or correlation is meaningless if you have a mimic of the real world in the form of a collection of data. You don't need a model that is accurate or valid or anything. You just need to run the data in the exact replica of reality. This is the simulacrum. The first problem is that data does not just run itself. At the least it needs an algorithm to be processed to a result. That's the model; without it, it's just useless data, as was already mentioned in yesterday's comments. But second, the problem with even ATTEMPTING such an idea is that you lead yourself into a situation where you "predict" the future and then operate to become that future, thus destroying the creative nature of humanity and becoming the self-fulfilling prophecy of machine code!

      Keep in mind I speak mostly of social sciences that try to pattern human behavior. For hard sciences, etc., all you have done is created a simulation of reality, but it tells you nothing about the reality. It merely mimics it. There is no insight into creating a map the size of the United States, at best it is a work of art.
  • by Angostura (703910) on Thursday June 26, 2008 @09:20AM (#23947863)

    In general I'm right behind the rebuttal. However John Timmer chooses a very bad real-life example as his rebuttal champion.

    He asks: ...would Anderson be willing to help test a drug that was based on a poorly understood correlation pulled out of a datamine? These days, we like our drugs to have known targets and mechanisms of action and, to get there, we need standard science.

    These days we may like our drugs to have these attributes, but very often they don't. There are still quite a few medicines around that clearly work and are prescribed on that basis, but for which there is only the haziest evidence as to how exactly they work.

    The good thing about the scientific method, however, is that it gives us a framework to investigate these drugs' actions - even if the explanation is still currently beyond us.

    • by wfolta (603698)

      You're right about the medicine example. It's odd that medicine has an incredibly rigorous statistical process before approval, yet many medicines are basically black boxes.

      Look at statins (cholesterol medication), which are one of the most widely-prescribed medicines in the world -- and which I take. There's a legitimate question as to whether their main effect is to reduce cholesterol levels, or whether it's actually a specific kind of anti-inflammatory which happens to reduce cholesterol levels.

      Or how ab

      • He makes statements about treatments, causes, and outcomes as if they were God given truths proven to the world beyond all doubt. In truth medicine seems to this mathematician as a field governed sooley by statistical correlation with next to no concern over (a) what is the actual cause is, (b) testing the hypothesized cause in any meaningful way. I've read study after study that goes through a wonderful presented statistical analysis to conclude that such and such drug works well at treating such and suc
        • Re: (Score:3, Interesting)

          by ColdWetDog (752185) *

          In truth medicine seems to this mathematician as a field governed sooley [sic] by statistical correlation with next to no concern over (a) what is the actual cause is, (b) testing the hypothesized cause in any meaningful way. I've read study after study that goes through a wonderful presented statistical analysis to conclude that such and such drug works well at treating such and such symptom; they then close with a couple of paragraphs as to why (they think) the drug is working often not using an qualifier

        • by Angostura (703910)

          He makes statements about treatments, causes, and outcomes as if they were God given truths proven to the world beyond all doubt.

          He has to, if he doesn't he'll bugger up the efficacy of the placebo effect, which is a pretty important element in prescribing.

          I'm only half joking. /Disclaimer: My wife is a hospital consultant and she's really good and interested in root cause.

    • by DrJay (102053)

      Well, what I was trying to say is that no drug company pursues anything without knowing the molecules it targets, the role they play in the cell, etc. It's doubtful that the FDA would approve the testing of a drug if all the company came up with is "we dump it on cells, and it does X, but we have no idea why."

      You're absolutely correct that this sort of knowledge isn't often that deep - we know what serotonin reuptake inhibitors do on the biochemical level, but what that means for the brain is pretty hazy.

  • by phobos13013 (813040) on Thursday June 26, 2008 @09:20AM (#23947865)
    Truly, the whole reason someone like Mr. Anderson could claim the end of science because of data is that he is a writer, a thinker, and in large part a businessman. Businessmen do not think about Science and how to use it to come up with a method that produces a conclusion. He uses information to come up with ways to illicit a reaction in people. So to him data is more important than science because he uses it for his purposes. That is marketing, and the "science" of marketing has almost always been that way.

    Mr. Anderson was not prescient in any way, he was just speaking his perspective. The only thing is we must be careful to even consider his proposition as a valid reality worth pursuing. Not for true scientists, but from a social perspective, or it will truly be the end of science. There are some in power as it is already attempting to make this happen.

    That said, I almost consider responding to yesterday's article as falling for the argument. But, since it hit the /. this article is as cogent a rebuttal as one can make.
    • to come up with ways to illicit a reaction in people

      elicit == v. evoke; illicit == adj. illegal

      BTW, it seemed obvious to me that he equated data discovery with scientific discovery, which is a big mistake. Adding to the sum of human knowledge is not the same as adding to the sum of human understanding, and using datamining and other automated tools for correlation determination does not in any way increase understanding.

      Data discovery is about increasing knowledge. Scientific discovery is about increasi

    • by ceoyoyo (59147)

      Even from a social perspective I don't think his argument holds water. It's akin to the origin of superstition: when I make a sacrifice to the rain gods, in my experience it tends to rain. Therefore, I should believe in the rain gods.

      His central example, Google, doesn't actually support his argument. Google uses an implicit model (which they carefully protect) to rank the likely relevance of search results. Then they give you a giant pile, in order of ranking, and let you sort through it. So not only d

  • by damburger (981828) on Thursday June 26, 2008 @09:24AM (#23947919)

    And can back up this rebuttal with a practical example. I am a physicist, I know sod all about blood samples, or proteins, or cancer. I get a pile of mass spec data (about a billion data points or so on some days) and through binning, background subtraction, and a string of other statistical witchcraft I produce a set of peaks labeled according to intensity and significance.

    This does not make me a cancer researcher. This data has to go back to the cancer guys and they have to pick out the Biomarkers and thus develop new diagnostic tests, based on principles that I don't understand. I am master of the information but entirely blind as far as the science is concerned. Same goes for google.
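
For readers unfamiliar with the processing described above, here is a toy sketch of binning, background subtraction, and peak flagging. It is not the poster's actual pipeline: the spectrum is synthetic, the function names are hypothetical, and the background is assumed to be flat enough that a median works as an estimate.

```python
import random
from statistics import median

def bin_spectrum(mz_values, intensities, bin_width):
    """Sum intensities into fixed-width m/z bins; returns {bin_index: total}."""
    binned = {}
    for mz, intensity in zip(mz_values, intensities):
        index = int(mz // bin_width)
        binned[index] = binned.get(index, 0.0) + intensity
    return binned

def find_peaks(binned, threshold_factor=8.0):
    """Subtract the median background and keep bins far above the noise.

    Noise is estimated by the median absolute deviation (MAD); a bin is
    reported when its background-subtracted signal exceeds
    threshold_factor * MAD. Returns (bin_index, signal, score) tuples.
    """
    values = list(binned.values())
    background = median(values)
    mad = median([abs(v - background) for v in values]) or 1.0
    peaks = []
    for index, value in sorted(binned.items()):
        signal = value - background
        score = signal / mad
        if score > threshold_factor:
            peaks.append((index, signal, score))
    return peaks

# Synthetic spectrum: flat noisy baseline from m/z 100.0 to 199.9,
# plus one injected peak around m/z 150.2 to 150.6.
rng = random.Random(7)
mz_values = [(1000 + i) / 10 for i in range(1000)]
intensities = [10.0 + rng.gauss(0, 1) for _ in mz_values]
for i in range(502, 507):
    intensities[i] += 200.0

peaks = find_peaks(bin_spectrum(mz_values, intensities, bin_width=1.0))
for index, signal, score in peaks:
    print(f"bin {index}: signal {signal:.0f}, {score:.0f}x the noise estimate")
```

The output is a list of statistically notable bins and nothing more; deciding whether a peak at a given m/z is a biomarker is the part that needs the cancer researchers.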

    • I would agree that the scientific method is not dead, but I like this rebuttal. The scientific method as I understand it is
      1) Observe
      2) Form a hypothesis or create a model to explain some phenomenon
      3) Experiment and gather empirical data to support or refute the hypothesis/model

      We still do all that but the emphasis does seem to be shifting away from traditional models that are sweeping generalizations (e.g., "An atom has a nucleus of protons and neutrons surrounded by moving electrons") to more nuanced
    • by ceoyoyo (59147)

      I'm a computer scientist who was morphed over the last six years into a biomedical researcher. As a computer scientist I can do all kinds of things to an image, including a bunch of statistical magic to tease out any patterns in the database. As a biomedical researcher I know that many of those associations are going to be due to the way the image was collected, or otherwise irrelevant features of the patient. Some may even be introduced by my processing and statistical methods.

    • Often you are exactly the type of help that we so badly need in bioinformatics and proteomics. You know how to deal with the data in a non-biased manner.

      As a biochemist myself, I know that it is far too easy to approach a data set knowing what a given m/z corresponds to, and then choose the data grooming strategy that most favors that peak. And being as we don't really have truly "standard" algorithms for approaching proteomics mass spec data, we need people who know the fundamentals of the techniques w
  • Duh! (Score:5, Insightful)

    by es330td (964170) on Thursday June 26, 2008 @09:26AM (#23947941)
    When I read the original article my thought was that someone was just trying to write something to get noticed. The Scientific method, IMHO, is all about a person or group of persons using a logical process to determine the validity of an idea. Observing massive amounts of data can reveal relationships that may not have been noticed in other ways, but at the end of the day the process of "I think X, I wonder if it is true", the heart of the scientific method, can no sooner become obsolete than we can stop being human. The questions of What, Why and How are so fundamental to humans as humans that nothing short of total omniscience will ever replace the logical process represented by the scientific method.
    • by 12357bd (686909)

      The questions of What, Why and How are so fundamental to humans as humans that nothing short of total omniscience will ever replace the logical process represented by the scientific method.

      There's a lot of 'faith' in this statement.

      1) Human > logical being
      2) Logic > Science

      So: I am sorry, but to expect that science answers any 'What', 'Why' or 'How' is just to expect too much. Science has its limits; probably some philosophy and empathy will also be needed.

  • traditionally, science forms its hypothesis, and performs an experimentum crucis to test the hypothesis; rinse & repeat. it seems to me that 'the cloud' refers to a hitherto statistically huge number of samples of data points from which to extract our knowledge of the world -- a sort of broad collection of facts derived from constantly and systematically varying the experimental conditions -- an exploratory experimentation. goethe outlines a method of Exploratory Experimentation in the essay The experim [rsarchive.org]

  • by starfire-1 (159960) on Thursday June 26, 2008 @09:46AM (#23948261)

    I have always viewed this debate in the context of scientist vs. engineer. That is one who views data as "good and true" vs. "good enough". That's not a slam on engineers (I am one), but a reflection of the balance between the two. A scientist that never applies theory sits in an empty room. An engineer who builds things without science sits in a cluttered room surrounded by useless objects.

    I do find interesting though that the advent of "google data" may indicate a flip in order of the two disciplines. Historically (IMHO) science has led engineering. A theoretical breakthrough, provable by the scientific method, may take years to give birth to a practical application. Now, with enormous piles of data and the knowledge that "good enough" is often good enough, we may be creating useful objects that will take science many years to explain and model.

    The biggest issue and omission in both of these pieces is that this "cloud" of data does not represent "truth" (as the scientist may seek), but rather a summation or averaging of the "perception of truth" as seen by the individual authors. The cloud, therefore, is only as useful as humans' ability to divine truth without the scientific method.

    My two cents. :)

    • Re: (Score:3, Insightful)

      by maxume (22995)

      I have a theory that some of the best engineers are scientists, and some of the best scientists are engineers.

      Scientists often need to build crazy stuff to figure things out, and engineers often need to figure things out to build crazy stuff. Because they are each result oriented, they don't get hung up on the things that someone in the field would.

    • by ceoyoyo (59147)

      I think it's the other way. Engineering got a head start on science. When we pile up rocks just so, they tend to stay where we put them, even if you walk across them. Voila, a bridge. Science came along later and explained why those particular arrangements are stable. That explanation lets the engineer investigate other bridge designs that he might not have seen before.

      There are perhaps a few areas in which the availability of massive amounts of data may let the engineer go back to his "I've seen it, t

    • by mlwmohawk (801821)

      I seriously disagree with this opinion as you discount engineering as sort of inferior to the science.

      Engineering absolutely requires a scientist. If you're an engineer and don't understand the theories and science you use professionally, you are a poor engineer. Typically speaking, a scientist furthers the scientific theories and an engineer applies them. Sometimes there is overlap where engineers do further the theories and scientists do apply them. Nowhere would I say that engineering is a profession of

    • The biggest issue and omission in both of these pieces is that this "cloud" of data does not represent "truth" (as the scientist may seek), but rather a summation or averaging of the "perception of truth" as seen by the individual authors.

      I didn't realize we were discussing wikipedia. :)
  • by mlwmohawk (801821) on Thursday June 26, 2008 @09:54AM (#23948355)

    I have a problem with the Google generation. Sure, they can parrot facts and find things in an instant, as can any slashdotter I'm sure, but knowing something is not the same thing as understanding something.

    A coworker asked me yesterday "how do you call a C++ class member function from C [or java]?" The question is an example of pure ignorance.

    If they "understood" computer science, as a profession, this would be a trivial question, like how do I or can I declare a C function in C++. The second question is what google can help you with while having to ask the first question means you are screwed and need to ask someone who understands what you do not. Not understanding what you do for a living is a problem.

    How programs get linked, how environments function, virtual machines vs pure binaries, etc. These are important parts of computer science, just as much as algorithms and structures. You have to have a WORKING knowledge of things, i.e. an understanding.
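
    To make the linking point concrete, here is a minimal sketch of one common way to reach a C++ member function from C: wrap it in free functions with C linkage and hand C an opaque pointer. It is an illustration under stated assumptions, not the poster's own answer. Every name in it (Counter, CounterHandle, counter_new, counter_add, counter_get, counter_free) is hypothetical and invented for the example.

```cpp
// Hypothetical example: Counter and the counter_* functions are
// illustrative names, not part of any real library.

// An ordinary C++ class whose member functions we want to reach from C.
class Counter {
public:
    void add(int n) { total_ += n; }
    int get() const { return total_; }
private:
    int total_ = 0;
};

// C code sees only an opaque pointer plus free functions. extern "C"
// disables C++ name mangling for these functions, so the symbols the
// linker sees are the plain names a C compiler would reference.
extern "C" {
    typedef struct CounterHandle CounterHandle;  // never defined: opaque to C

    CounterHandle* counter_new(void) {
        return reinterpret_cast<CounterHandle*>(new Counter());
    }

    void counter_add(CounterHandle* h, int n) {
        // Cast the opaque handle back to the real type and call the member.
        reinterpret_cast<Counter*>(h)->add(n);
    }

    int counter_get(const CounterHandle* h) {
        return reinterpret_cast<const Counter*>(h)->get();
    }

    void counter_free(CounterHandle* h) {
        delete reinterpret_cast<Counter*>(h);
    }
}
```

    A C caller would declare the same four prototypes in a header and link against the object file built from this source. In a real project the wrappers should also keep C++ exceptions from escaping into C, which this sketch leaves out.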

    Google's ease of discovery eliminates a lot of the understanding learned from research. Now we can get the information we want, easily, without actually understanding it. IMHO this is a very dangerous thing.

    • Wow, one of the best postings I have read for months.

      Although I wouldn't call it "very dangerous", you are so right about the difference between, what you call, knowing and understanding. Raw data and number crunching is only one step towards understanding. Interpretation of the data and in the end really grasping the problem and hopefully a solution are something different.

      Theories may have gone wild in some sciences in the sense that theorizing is overvalued compared to data munching, but theories and

    • by ceoyoyo (59147)

      Mr. Miyagi?

    • by kabocox (199019)

      Google's ease of discovery eliminates a lot of the understanding learned from research. Now we can get the information we want, easily, without actually understanding it. IMHO this is a very dangerous thing.

      Yes, because people can learn instantly whatever answers and not actually get the accepted viewpoint stamped into them at the same time. That's extremely dangerous. There is no telling what people will come up with if they don't have their government's, employer's, school's, church's, or parent's viewpo

    • Google's ease of discovery eliminates a lot of the understanding learned from research. Now we can get the information we want, easily, without actually understanding it. IMHO this is a very dangerous thing.

      It's still GIGO - "Garbage in, Garbage out" - except now there is a LOT more garbage.

      I recently read an entire book (Super Crunchers) whose substance was that regression analysis was the greatest data analysis tool since sliced bread. Nonsense.

      Finding associations is relatively easy. Making sense of them

  • Petabyte technology suggests new avenues of scientific investigation, but doesn't end science or older alternative ways of doing things. The clever thing is to be first to discover the new possibilities.
  • This whole thing makes no sense. It's all ambiguous concepts. What? Lots of data means we don't need to use theories? Lots of data != Omniscience. In fact, lots of data is not even yet information. You still need to find how it applies. It's the people at Wired making a religion out of new technology that causes them to say crazy things like this.
  • by GodWasAnAlien (206300) on Thursday June 26, 2008 @10:44AM (#23949107)

    Science and openness go together.
    Without openness, we are all reinventing private wheels, whose plans we destroy when there is no profit.
    If you work in software, consider for a moment how scientific your work is, considering the work of other companies doing similar work.

    This Clouds thing is the "billion monkeys/humans typing on keyboards" model.
    Yes, it really can work (with humans).
    But, as with science, the chaos development model only works with openness.

    Of course, organized science along with a little chaotic development would work even better.

    There are forces in our society that do not like any open model. The Microsofts, the MPAA, the RIAA. These types of organizations thrive on closed models. More copyright controls, more DRM, longer copyright and patent terms.
    These forces would prefer to own, control and close science and clouds of data. They are unaware of the inevitable impact of such actions.

    In a free capitalist society, we are naturally driven by contrary forces.
    A desire to hide discoveries, to maximize profits, even at the expense of innovation.
    A desire to share discoveries, to contribute to society and for credit.

    While it is possible to profit when ideas are shared,
    it is more difficult to contribute to society by hiding information indefinitely.

  • There are coefficients we use in models that we don't fully understand in the physical world. We obtain those coefficients through empirical data. To rely solely on those models for design ignores the fact that those coefficients may change for any reason in the real world, because we don't fully understand what factors influence them.

    In my experience this only applies to certain sciences. Most of my experience with such systems is in the area of fluid mechanics, and thermochemistry. Models can save y
  • The Wired post was a bit over-reaching, sure... but that's Wired for you.

    The bigger point is that science is about testability, not story-telling. There may soon come a day when our analysis can prove that something is true without our being able to explain why it is true.

    We are already there in many respects, but will be much further along when the current crop of Bayesian diagnostics hits the market. Combine those with the flood of information that personal genomics companies hope to make available and

  • ...then WTF??

    Some time ago some researchers came out with a book which was supposed to be called "the end of intuition". The name of the book actually became "Supercrunchers", because people would click more on that ad than on the "end of intuition" one. I wondered why the final name shouldn't be "hot college lesbians".

    The Eliza effect is so huge that any nice trick machines do seems to give us the immediate feeling that "It's alive!", and it has deep meaning.

    Nonsense.

    As a researcher of psychologic [capyblanca.com]

    • I wondered why the final name shouldn't be "hot college lesbians".
      Have you ever worked in marketing? You might want to think about giving it a shot if you haven't already.

      I have a feeling you could have a brilliant career in that field.

  • by Anonymous Coward

    Another point missed here is that background noise can obscure real results. Much of the data cloud is utter garbage. Picking out the useful information is often a complicated and difficult process; in some cases it's easier to just go and do the measurement yourself. I've heard the "a few days in the library can save you weeks at the bench" about as often as the reverse. I think they're both true.

    -sk

  • Ever wonder how early humans discovered medicinal qualities of plants? They didn't use models and the scientific method... they used vast amounts of trial and error results. Then they used prediction based on what they had learned to narrow down what kind of plants to try out next. They didn't understand the underlying mechanisms and test out new findings based on that type of model... they used cheap and dirty statistics and record keeping.

    This is just an extension of what humans have been doing to discover ne

  • The article states that "we know quantum mechanics is wrong on some level". Oh really? That's news to me. Any serious proposed theories of everything have been quantum in nature. It's amusingly hypocritical that the Ars Technica article refers to the Wired author as unscientific, yet makes such a claim itself.

    The only thing "wrong" with quantum theory is that it doesn't fit human intuitions. But this is only because people ignore the psychology of perception and are not careful about interpretations; it'
    • I think the biggest problem in QM is the idea that the "collapse of the state vector" actually describes anything real. It's one of those questions like "when does life start" or "what's really a planet" that doesn't really have anything to do with science. It's just a metaphor that makes certain kinds of reasoning about QM easier, and provides guidance as to where you can simplify your model to make the calculations practical.

  • While he does a good job showing that science itself isn't going away, he actually lends credence to the position that cloud computing implies a lot of useful information will be generated outside of science. Moreover, he also might be supporting the position that science isn't necessarily going to catch up and explain this data any time soon. So, the "strong" position, that Google makes science irrelevant, is naturally false. But the "weak" position, that Google represents a new kind of inquiry that is go
  • Links need thought (Score:3, Interesting)

    by FlyingBishop (1293238) on Thursday June 26, 2008 @02:50PM (#23953719)
    I had a nice example of the complete inadequacy of Google's thought-agnostic approach to links while browsing around looking for information on samba and fuse under Linux. Google's ad bars, completely misinterpreting the context, offered links to fuse boxes, as in wiring, and Samba lessons, as in dancing. But then, maybe I'm not giving Google enough credit. It might have actually recognized the pointlessness of trying to market software to a Linux user, and taken the obvious step of throwing in some complete non sequiturs in the hopes of catching something of value.
  • "Because it came from WIRED," should have been enough reason to discard this bullshit from day one. Why not ask some REAL scientists in a REAL peer reviewed scientific journal about what the "cloud" is doing instead of letting a bunch of insular technophiles indulge in masturbatory fantasies about how their "culture jamming" is "shifting paradigms" all while convincing themselves the same shit wasn't going on in the 60's, 70's, 80's and fucking 90's, and is indeed the sort of thing that led to WIRED's kind
  • by frogzilla (1229188) on Thursday June 26, 2008 @04:10PM (#23955733)
    Wasn't this all demonstrated 100 years ago by Francis Galton [wikipedia.org] and an ox? What's new is that there are more data points and better techniques to identify interesting correlations. Probably this is what we do internally anyway. All of our sensory input is correlated and the interesting bits are filtered out by specific algorithms trained by evolution. What is fascinating to many are the times when these algorithms are spectacularly wrong.

