Mathematician Predicts Yankees To Dominate

Posted by CowboyNeal on Thu Apr 05, 2007 07:35 PM
from the safe-bets dept.
anthemaniac writes "Computerized projections in sports are nothing new, but Bruce Bukiet of the New Jersey Institute of Technology has developed a model that seems to work pretty well. He projects how many games a Major League Baseball team will win by factoring in how each hitter ought to do against each pitcher in every game. His crystal ball says the Yankees will win 110 games this year, a pretty safe bet, many might agree. But he also projects all the divisional winners. He claims to be right more than wrong in five of the past six years."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • 110 wins? (Score:5, Insightful)

    by nebaz (453974) * on Thursday April 05 2007, @07:39PM (#18629517)
    It's a safe bet that the Yankees will do well; they always seem to spend almost twice as much as most other teams on talent, not to mention luring good players away from other teams to crush the competition. Having said that, they have always spent that kind of money, and they have not done exceptionally well of late. 110 wins is a lot, and not many teams have accomplished that. Safe bet? Hardly.
    • The Pirates - 2nd lowest payroll - will suck again. 14 losing seasons in a row. I give it a 99.9% certainty they make it 15. I'm not even an MIT grad!

    • Injuries. Did he take these into account? A lot of good teams have had lousy seasons due to players being hurt for long periods of time. MAYBE if every member of every team was able to play a full schedule of 162 games...

      Performances. If every player played consistently every day, but some guys go on hot streaks and get moved up in the batting order. Some guys go cold and get bumped down, or even worse, sent to the minors. MAYBE if the 25-man rosters stayed constant for the entire season.

      Luck. Three
    • Re: (Score:3, Funny)

      by Rogerborg (306625)
      Not safe at all, until you factor in whomever the Mob has their money riding on.
      • Re: (Score:2, Interesting)

        by sebi (152185)
        I agree that RS vs RA is a good way to predict the success of a team. It's not always so helpful looking back. The Indians scored 870 runs last season and only allowed 782. How did they do? Not so well: a 78-84 record, good enough to finish fourth in their division. How can one explain that disparity? Blowouts. Those 22-0 games that happen every once in a while. I like Runs Scored vs Runs Allowed models. Just not the ones that get updated during the season.
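
The Indians example above can be checked with a few lines of Python. This is a minimal sketch that assumes the commonly used Pythagorean expectation formula with an exponent of 2; the comment does not say which runs-scored/runs-allowed model it has in mind, and the function name is only illustrative.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate winning percentage from runs scored and runs allowed.

    Assumption: the classic Pythagorean form RS^x / (RS^x + RA^x) with x = 2.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Figures quoted in the comment: 870 runs scored, 782 allowed, 78-84 record.
pct = pythagorean_win_pct(870, 782)
expected_wins = pct * 162
actual_wins = 78

print(f"Expected winning percentage: {pct:.3f}")
print(f"Expected wins over 162 games: {expected_wins:.1f}")
print(f"Shortfall versus expectation: {expected_wins - actual_wins:.1f} wins")
```

Under that assumption the run totals point to roughly 89 or 90 wins, so the actual 78-84 record falls more than ten wins short, which is the disparity the comment attributes to blowouts.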
  • He claims to be right more than wrong in five of the past six years.

    Whoopty fsck. So's RailGunner [slashdot.org]. Runs are fun to watch, but pitching is what wins. And the Yanks have? Anyone? Anyone at all? Yep. They got nothin' at pitcher.

  • by The Living Fractal (162153) <execyte@nosPAm.execyte.com> on Thursday April 05 2007, @07:41PM (#18629537) Homepage
    Has he put up beaucoup bucks in Vegas on his numbers? If not, why not. If so, how much did he win, and where can I get his numbers this year?

    TLF
  • by krbvroc1 (725200) on Thursday April 05 2007, @07:52PM (#18629645)
    Isn't there some rule or law about 'fitting a curve' to past data? Yet the sports predictions, and many of the 'stock market systems', are all about finding some seemingly obvious pattern in past data. While you might come up with a 'back tested' model that matches really well, it doesn't mean squat for the future.
    • by BridgeBum (11413) on Thursday April 05 2007, @08:09PM (#18629789)
      His models have evolved over the years, but he tries to simulate actual games using both individual statistics (players' batting averages, etc.) and team trends (how well a player does against a specific pitcher). He uses a large Markov chain to predict state transitions (runner on first, no outs: how often does it go to two outs? That sort of thing). Very interesting project; it was a lot of fun to work on. (I was an undergrad working with Bruce 15 years ago, when he was first starting this project. He's kept it going for years.)
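
The base-out Markov chain idea described above can be illustrated with a toy version. This is only a sketch and not Bukiet's model: the event probabilities are hypothetical, every batter is treated as identical, and runners are assumed to advance exactly as many bases as the batter. The names `transition` and `expected_runs` are made up for this example.

```python
from itertools import product

# Hypothetical per-plate-appearance probabilities, chosen only for illustration.
EVENT_PROBS = {"out": 0.68, "walk": 0.08, "single": 0.15,
               "double": 0.05, "triple": 0.01, "homer": 0.03}
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}
EMPTY = (False, False, False)


def transition(outs, bases, event):
    """Return (new_outs, new_bases, runs_scored) after one plate appearance."""
    first, second, third = bases
    if event == "out":
        return outs + 1, bases, 0
    if event == "walk":
        # Only forced runners advance; a run scores only with the bases loaded.
        runs = int(first and second and third)
        return outs, (True, first or second, third or (first and second)), runs
    # Simplifying assumption: every runner advances as many bases as the batter.
    advance = HIT_BASES[event]
    positions = [0] + [i + 1 for i, occupied in enumerate(bases) if occupied]
    moved = [p + advance for p in positions]
    runs = sum(1 for p in moved if p >= 4)
    return outs, tuple(b in moved for b in (1, 2, 3)), runs


def expected_runs(probs=EVENT_PROBS, sweeps=500):
    """Expected runs to the end of the half-inning from each of the 24 states."""
    states = [(o, b) for o in range(3) for b in product((False, True), repeat=3)]
    value = {s: 0.0 for s in states}
    for _ in range(sweeps):  # repeated sweeps converge because outs accumulate
        updated = {}
        for outs, bases in states:
            total = 0.0
            for event, p in probs.items():
                new_outs, new_bases, runs = transition(outs, bases, event)
                future = 0.0 if new_outs == 3 else value[(new_outs, new_bases)]
                total += p * (runs + future)
            updated[(outs, bases)] = total
        value = updated
    return value


values = expected_runs()
print(f"Expected runs, none on and none out: {values[(0, EMPTY)]:.2f}")
```

A model along the lines the comment describes would presumably replace the single probability table with one per hitter-pitcher matchup and step through the batting order; the sketch keeps one table so the state machinery stays visible.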
      • by Burdell (228580) on Thursday April 05 2007, @09:23PM (#18630297)
        It is still trying to predict future results based on past performance. No matter what you predict, last year's Chipper Jones will never again face last year's Roger Clemens. Even if Clemens un-retires (again), he is not the same person, and neither is Chipper Jones. You also can't predict injuries, trades, managers' decisions, umpires' calls, weather, etc., all of which have an impact on the outcome of an individual game.
        • Re: (Score:2, Insightful)

          by Anonymous Coward
          You're right. We should stop trying to predict anything because we won't ever be 100% correct.
  • The best way to test any model is to start with the end points. How low does it score the New York Mets?
  • Huh? (Score:5, Insightful)

    by Kuukai (865890) on Thursday April 05 2007, @08:00PM (#18629701) Journal

    While Bukiet is the first to admit he's not a baseball expert, in five out of the past six years, he says that his model has produced more correct than incorrect predictions.
    What? Does this even mean anything? If, say, he was right 51% of the time in five of the years and wrong 90% of the time in the other year, wouldn't that make his number of successes lower than the number you'd expect from just guessing "win" or "lose"? I guess he's either really modest ("I don't like to brag, so I'll just say the accuracy is higher than 42%."), or a really, really bad statistician.
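
The hypothetical in the comment above works out as follows. This sketch assumes the same number of predictions in each of the six years, which the comment does not state; the 51% and 10% figures are the commenter's own made-up numbers, not Bukiet's results.

```python
def pooled_accuracy(yearly_accuracy):
    """Overall accuracy, assuming an equal number of predictions each year."""
    return sum(yearly_accuracy) / len(yearly_accuracy)


# Five years at 51% correct, one year at 10% correct (wrong 90% of the time).
yearly = [0.51] * 5 + [0.10]
overall = pooled_accuracy(yearly)
good_years = sum(1 for accuracy in yearly if accuracy > 0.5)

print(f"Years with more right than wrong: {good_years} of {len(yearly)}")
print(f"Pooled accuracy: {overall:.1%}")
```

So "right more than wrong in five of six years" is compatible with a pooled accuracy of about 44%, which is below the 50% a coin flip would manage on win/lose calls.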
    • Re: (Score:3, Informative)

      ...or a really, really bad statistician.

      Or a really good statistician. Remember, when you ask a statistician to crunch some numbers for you he'll reply back with "and what would you like the numbers to say?". They'll make it fit any curve you throw at them.
  • by ScrewMaster (602015) on Thursday April 05 2007, @08:00PM (#18629703)
    "Hello Mr. Bukiet"

    "It's pronounced bouquet!"
  • amazing (Score:3, Insightful)

    by flynt (248848) on Thursday April 05 2007, @08:03PM (#18629733)
    Wait, you mean you can use past data to try to predict future events under certain assumptions, and sometimes it works? Someone should generalize this into some sort of academic discipline!
  • It was called Strat-O-Matic Baseball, and many a night in the hills of Worcester I had to fall asleep to the constant clinkity-clink-clink-clinkle of a pair of dice in a stolen cafeteria coffee cup.

    • 1-5 HOMERUN

      :)

      PS - My all-time favorite Strat-O-Matic cards belonged to Bobby Witt. Especially his 1987 card. 143 IP, 160 K, 140 BB. Every inning an exciting one. :D

    • Wow, someone else that knows what Strat-O-Matic is.

      By the way, backgammon boards and cups really keep the noise down quite a bit.

      Aero
  • by Jon_S (15368) on Thursday April 05 2007, @08:07PM (#18629773)
    signed,

    Red Sox fan
    • My prediction is that the Yankees will spend more money than any other team. And still not win a World Series.

    • by doormat (63648)
      Signed,

      Yankees fan

      PS Have fun blowing up more innocuous devices because you think they're bombs
        • The stupid thing was that they had to evacuate parts of Boston over some blinking lights attached to batteries.

          No parts of Boston were evacuated; they shut down part of the subway and a bridge or two. Nonetheless, it was pretty stupid and I had a good laugh over it. Luckily I had to get into work really early that day, so I completely missed the Orange Line closing.

        • Re: (Score:3, Insightful)

          by zero1101 (444838)

          I got news for you both. The Yankees AND the Red Sox suck. Put 'em both in the AL Central, and they're fighting for third place tops.

          On what planet? Granted the Red Sox did poorly against the AL Central in 2006 (15-19), but the Yankees were 23-12 against the Central.

          For the last 3 years, the Yankees are 61-37 against the AL Central as a whole, and the Sox are 56-45. For those years, the standings of the top 4 teams from the East and Central are as follows:
          2006:
          NYY 97-65
          MIN 96-66
          DET 95-67
          CWS 90-72
          2005:
          CWS 99-63
          NYY 95-67
          BOS 95-67
          CLE 93-69
          2004:
          NYY 101-61
          BOS 98-64
          MIN 92-70
          CWS 83-79

          Only last year would even one of those two teams not have en

  • The article says he has made more correct than incorrect predictions in his several years of doing this.

    Something tells me that when he predicts that the Yankees will win 110 games, for example, he is counting his prediction as fulfilled if the Yankees win AT LEAST 110 games.

    Because it would be pretty remarkable if he had correctly predicted the EXACT number of games teams will win more often than not over the past several years.

    And since no margin of error is provided, there's really no basis for saying
    • My model predicts that they will win at least one game. That makes me right for all six out of the last six years, so I guess I've got him beat.
  • by ericpi (780324) on Thursday April 05 2007, @08:17PM (#18629843)

    He claims to be right more than wrong in five of the past six years.

    That's nothing: I've developed a new mathematical algorithm that correctly predicts the outcome of the past six years with 100% accuracy.

  • The Yankees have weak-ass pitching this year. No chance they win 110 games. More likely 90.
  • by Golgafrinchan (777313) on Thursday April 05 2007, @08:46PM (#18630023)
    First, a link to the professor's baseball page. [njit.edu]

    In 2006, he predicted 102 Yankee wins. They won 97. Not too bad.

    In 2005, he predicted 113 Yankee wins. They won 95. Way off.

    In 2004, he predicted 117 Yankee wins. They won 101. Way off.

    In 2003, he predicted 110 Yankee wins. They won 101. Not great.

    In other words, take this forecast with a big boulder of salt.
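
The four seasons listed above can be summarized in a couple of lines. This is a minimal sketch using only the predicted and actual Yankee win totals quoted in the comment; the variable names are illustrative.

```python
# Predicted and actual Yankee wins, as quoted in the comment above.
predicted = {2003: 110, 2004: 117, 2005: 113, 2006: 102}
actual = {2003: 101, 2004: 101, 2005: 95, 2006: 97}

# Positive error means the forecast was too high.
errors = {year: predicted[year] - actual[year] for year in predicted}
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)
mean_signed_error = sum(errors.values()) / len(errors)

for year in sorted(errors):
    print(f"{year}: predicted {predicted[year]}, actual {actual[year]}, "
          f"error {errors[year]:+d}")
print(f"Mean absolute error: {mean_abs_error:.1f} wins")
print(f"Mean signed error:   {mean_signed_error:+.1f} wins")
```

All four misses are on the high side, so the mean absolute error and the mean signed error are both 12 wins; over these seasons the forecast ran consistently high for the Yankees rather than scattering around the actual totals.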

  • Big Whup... (Score:2, Informative)

    by Anonymous Coward
    Bill James came up with simple quantifiable statistics that could very accurately predict the success rate for a baseball team back in the '70s. The Oakland A's had a lot of success using those methods to put teams on the field that would win between 95 and 100 games per year while spending as little as possible. It worked remarkably well, and a book (Moneyball, by Michael Lewis) was written about it.

    In short, this is old and well covered news, unless this guy has come up with a simulation that is significan
  • I want to know how he calculated Daisuke Matsuzaka's numbers since he's never played ball in the states. Theoretically he should dominate the AL given his performance in Japan but those numbers don't mean much when considering the power hitters in the AL, much less MLB. Here's hoping Bukiet is wrong though. I'd love to see the Yankees tank and not make the play-offs but I'm a Red Sox fan and I always hope that happens.
  • Climate Models? (Score:5, Insightful)

    by Matteo522 (996602) on Thursday April 05 2007, @09:30PM (#18630341)

    So let me get this straight..

    Climatologists use past data, computer models, and mathematical projections to support global warming and predict future results, and everyone calls it strong science based on facts. If the models are off, it's just a part of the scientific process, but the overall claim is still valid.

    But if a statistician uses past data, computer models, and mathematical projections to predict baseball results, it's dismissed as some crack job's phony science. If the models are off, it's proof that he has no idea what he's doing and how these kinds of models don't work.

    Am I missing something here?

    • Re: (Score:3, Insightful)

      by zippthorne (748122)
      Yes. In the public experience, most fancy sports predictions have a history of being inaccurate. This is unlike the experience with climate models, which historically have also given us some predictions.
    • Re: (Score:3, Insightful)

      by Ibag (101144)
      What you are missing is that not all models are created equal, and not all things are as easy to model. It's all about variance. Consider the weather, for example. We can accurately predict what it will be for a day or two, and we have a decent guess for about a week, but beyond that, there is too much complexity and variability for us to say much (not to mention that weather appears to be a dynamical system, i.e., an example of chaos theory, which means that prediction is theoretically impossible). How
    • by Britz (170620)
      Yes, that was a good one.

      But the guys that modded you Insightful instead of Funny really made my day. I am still snickering writing this post.
  • Nobody could predict this one: http://www.planetworldcup.com/CUPS/1950/wc50index.html [planetworldcup.com] and the "Macacos" still cry about this......
  • by kenb215 (984963) <kenbarney@g[ ]l.com ['mai' in gap]> on Thursday April 05 2007, @09:46PM (#18630443)

    Wow, I never expected somebody that I knew to get on Slashdot. Bruce Bukiet is my Calculus II professor at NJIT.

    He mentioned this before a few times, including today after that article made it to the most popular spot on Yahoo! [yahoo.com] News. This is more of a hobby for him than an official project.

    From what he has said in the past about the model, it tends to overestimate the Yankees, among other reasons, because they often buy good players at the end of their prime. Thus the players won't play as well as they had in the past. He hasn't used it to make any bets. For the model, coming within a game or two of the actual results is considered a good prediction.

    As some people above said, the model isn't intended to be extremely accurate, and is frequently off by a significant amount. The interviews he does are more to get people interested in math, and to see how it has real use, rather than to try and show off. He used to go into more details in the past, but doesn't now because they tend to confuse the interviewer, and don't make it into the final article.

    Some pages of his own about the project are:
    http://m.njit.edu/~bukiet/baseball/baseball.html [njit.edu]
    http://www.egrandslam.com/ [egrandslam.com]
  • AL East: New York Yankees
    AL Central: Cleveland Indians
    AL West: Los Angeles Angels
    AL wildcard: either the Boston Red Sox, the Toronto Blue Jays or the Minnesota Twins


    OK, so he managed to choose division winners and then say that the wild card would come from one of THREE other teams. I don't think there's much math or stats going on here. Shouldn't he be able to pick ONE team and say they're going to win the wild card? This sounds more like a baseball fan's prediction than a mathematical prediction.
  • ... in the book Moneyball by Michael Lewis. He follows Billy Beane through a season with the Oakland A's, where they won their division even though they were outspent by nearly every other team. This prompted former Fed Chair Paul Volcker to comment that Beane had found a market inefficiency. He had used such an inefficiency, but it wasn't Beane who had found it.

    To do this right, however, you have to do legwork, because according to the model described in Moneyball, On Base Percentage is really what you'r
    • by BridgeBum (11413) on Thursday April 05 2007, @08:05PM (#18629761)
      Bruce is actually a die-hard Mets fan. I helped work on this project with him back in my undergrad days 15 years ago or so. I doubt any of my code is still being used though. :-)
    • Well actually if you predicted the Yankees to win the series every year from 1903 to 2006 you'd only have a .257 success rate. On the other hand that's a plurality, and more than double the wins of the next best thing, the Cardinals.
    • Now Jimmy the "Moose" Morgan, him I'll believe. He don't guess the probabilities, he makes them. A lead pipe trumps your modern math any day of the week.

      But setting the odds on sports matches isn't really about the probability of one team winning or losing. It's about balancing the way that people will bet. The odds are structured to minimize the risk and maximize the return of the bookmaker, based on bettor behavior.

      "Moose" Morgan doesn't need to know or care whether the Yankees are likely to beat Ori

    • by hyfe (641811)

      Predicting the past is easier than predicting the future.

      No, it's seriously not. They are exactly the same. There's no difference between training your model on the first 3 of the last 5 years and validating on the last 2, and training on the last 3 years and validating on the next two to come. The model doesn't know the clock, and datasets are datasets.

      There is a world of difference between accuracy rates on your training/calibration set and your model's performance on the validation set. One of