Turning Data Science Into a Spectator 'Sport' (19 comments)

vu1986 writes "Kaggle has a 'predictive-modeling competition platform that makes public the competitors in invite-only private competitions. Think of it like watching a major tournament in golf or tennis, where you can watch the best in the world shoot it out to see whose algorithms are king. Kaggle's tagline is "We're making data science a sport." Maybe now it can make data science a spectator sport.'"
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward

    We have plenty of armchair "quarterbacks" right here when it comes to science. They can blab on and on about what's right and wrong but damn it if they're ever asked to put their shoulder into the effort. Most of them seem to know less than a high school chemistry student. But, you know, these guys think they're on the ball.

  • by Sparticus789 ( 2625955 ) on Wednesday September 12, 2012 @03:47PM (#41316045) Journal

    Announcer: Holy Cow, that recursive data parsing algorithm discovered a secret code hidden within the Book of Revelations in 18.5897923 seconds! "All your base are be...." Wha- What the hell is this crap!?!?

  • by tlambert ( 566799 ) on Wednesday September 12, 2012 @03:51PM (#41316097)

    They could make it go faster than televised bass fishing.

    Seriously, no one not wearing white polyester pants up to below their chest and golf shoes, or someone wearing hip waders and holding a fly reel, would have the patience to watch this.

    Even if you could trick someone into watching it, you're never going to get beyond the "accumulate points" stage, unless there's an end goal, and you can see progress toward that goal well enough that the representation would allow you to predict a winner or a close race.

    If it goes anywhere, it'll be because Jeff Bezos or Larry Ellison favors a team and drops a bunch of machines into that team's cluster. Actually, if it's Larry Ellison, expect him to drop just enough computers into the underdog to be able to claim a tax write-off and fix the Vegas odds to the point he can switch the support at the last minute and cash in.

    • by garcia ( 6573 )

      As someone who works in the data analysis field, I can assure you the people doing predictive modeling are not usually pocket protected geeks with tape between their lenses.

      But I will admit I got a good chuckle from your post; +5 Funny for sure.

  • by Anonymous Coward

    If they get hot Russian chicks in short skirts, I am so there!

  • by Anonymous Coward

    ...people who are fanatically devoted to one viewpoint, ignoring all evidence to the contractrary, and demonizing their opponent. Yeah, science needs to be more like sports.

    I hope the OP gets cancer and dies...

    • ...people who are fanatically devoted to one viewpoint, ignoring all evidence to the contractrary[sic], and demonizing their opponent. Yeah, science needs to be more like sports...

      Hey there. I see you participate in grant reviews too!

  • So instead of being employed, we're all expected to work, for free, in the hopes that we win a contest? I sure as hell hope this violates all kinds of labor laws.

    The labor market has become the Hunger Games. We all lose.

  • by ZahrGnosis ( 66741 ) on Wednesday September 12, 2012 @04:19PM (#41316457) Homepage

    I've been working on the Heritage Health Prize [heritagehealthprize.com] that Kaggle is running for over a year now. It's a fantastic way to learn data science and tackle real-world problems with real data and a co-op-etitive spirit. The forums and winning solutions are great for learning the art, and if you've never used R [r-project.org], it's a great opportunity to learn it and talk to people who have a ton of experience in the area.

  • by Okian Warrior ( 537106 ) on Wednesday September 12, 2012 @04:33PM (#41316609) Homepage Journal

    I've entered a couple of Kaggle competitions, but I'm kinda put off by the opaque results.

    After the first one ended (predict HIV progression [kaggle.com]), the released full dataset indicated that the data had been sorted before it was separated into train and test sets. IOW, after being sorted by length, all the short sequences were put into the training set, and the longer ones into the test set. This mistake may have invalidated the competition, and I strongly suspect it would have invalidated any paper written about the results.

    More recently, the organizers of one competition [kaggle.com] stated flatly in the forums that they would release the entire data set once the competition had ended, but then didn't. I inquired about this, and a Kaggle data scientist replied saying "we almost never release the test data".

    I'm not sure that Kaggle [kaggle.com] is all that scientific. If the full dataset can't be examined after the competitions close, there's no way to verify the results.
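
    A minimal sketch of the train/test split problem described above. The sequence lengths, the 70/30 split, and the helper names `sequential_split` and `shuffled_split` are assumptions made for illustration; none of this is the competition's actual data or code. It only shows why cutting a length-sorted list in order leaves the test set with lengths the training set never saw, and how shuffling first avoids that.

```python
import random


def sequential_split(items, train_fraction):
    """Cut the list in its current order: first part train, rest test."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


def shuffled_split(items, train_fraction, seed=0):
    """Shuffle a copy with a seeded RNG, then cut it the same way."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return sequential_split(shuffled, train_fraction)


# Hypothetical data: 100 sequence lengths, already sorted by length.
lengths = sorted(range(1, 101))

sorted_train, sorted_test = sequential_split(lengths, 0.7)
mixed_train, mixed_test = shuffled_split(lengths, 0.7)

# Sorted-then-split: every training length is shorter than every test length.
print(max(sorted_train), min(sorted_test))  # 70 71

# Shuffled-then-split: the two sets overlap in their range of lengths.
print(min(mixed_test) < max(mixed_train))
```

    With the sorted split, a model is only ever scored on sequences longer than anything it was trained on, which is the kind of bias that could undermine conclusions drawn from the leaderboard.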

  • Or checking which veggies in the fridge go bad first is a sport, if data analysis is a sport.

    This smells like old SPAM (both kinds).

  • by 19061969 ( 939279 ) on Wednesday September 12, 2012 @06:06PM (#41317671)

    I hope these spectators like endurance sports. My natural language processing models take between 2 and 7 days to create. While I set the model creation going and have a few beers, watch TV, etc., they can sit and watch a terminal with an incomprehensible progress report going on.

    "Wow! He's completed 87% of the tokenisation! He'll be shooting to score any week now!"

    Never mind, as long as they pay.

  • I am opting for HD data while spectating. And maybe some Good N Plenty.
