Alternate Baseball Universes
Jamie found a NYTimes op-ed by a grad student and a professor from Cornell, outlining some research they did into alternate baseball universes. The goal was to find out just how unlikely Joe DiMaggio's 56-game hitting streak, set in the 1941 season, really was. No one since has even come close to that record. The math guys ran 10,000 simulations of the entire history of baseball from 1885 on. For each simulation they put each player up to the plate for each at-bat in each game in each year, just like it happened, and they rolled the dice on him, based on his actual hitting stats for that season. (Their algorithm sounds far simpler than whatever the Strat-O-Matic guys use.) The result: Joltin' Joe's record is not merely likely, it's basically a sure thing. Every alternate universe produced a streak of 39 games or better; one reached 109 games. And Joe DiMaggio was not the likeliest player in the history of the game to accomplish the record, not by a long shot.
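The dice-rolling described in the summary can be sketched in a few lines of Python. This is a minimal sketch only, not the researchers' actual code: it assumes one player, one season, a fixed number of at-bats per game, and independent at-bats. The function names and all parameter values are made up for illustration.

```python
import random


def longest_streak(hit_prob, at_bats_per_game, games, rng):
    """Return the longest run of consecutive games with at least one hit.

    Assumption: every at-bat is an independent "dice roll" that succeeds
    with probability hit_prob (the player's batting average).
    """
    longest = current = 0
    for _ in range(games):
        # A game extends the streak if any at-bat in it produces a hit.
        got_hit = any(rng.random() < hit_prob for _ in range(at_bats_per_game))
        if got_hit:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def simulate(hit_prob, at_bats_per_game, games, runs, seed=0):
    """Replay the same season `runs` times; return each run's longest streak."""
    rng = random.Random(seed)
    return [longest_streak(hit_prob, at_bats_per_game, games, rng)
            for _ in range(runs)]


if __name__ == "__main__":
    # Illustrative values only: a .357 hitter, 4 at-bats a game, 139 games.
    streaks = simulate(hit_prob=0.357, at_bats_per_game=4, games=139, runs=10_000)
    print("longest streak seen in any run:", max(streaks))
    print("runs with a 56+ game streak:", sum(s >= 56 for s in streaks))
```

A full replay of baseball history would repeat this for every player in every season and keep the single longest streak per universe, which is why long streaks show up far more often than a one-player calculation would suggest.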
If it's so likely, then why hasn't it happened? (Score:5, Interesting)
I know the statisticians among you are going to bash me with a cluestick for such a naive question, but I'll ask anyway - if this event is so likely to occur, then why hasn't it happened again?
Re:If it's so likely, then why hasn't it happened? (Score:2, Interesting)
Re:Nerves (Score:3, Interesting)
Changing game of baseball (Score:5, Interesting)
I think it would be more impressive to take a subset of the data, and compare from 1930 up until the present. Of course, there have been other major changes too: glove sizes, the introduction of the slider, steroid use.
Re:Bogus (Score:4, Interesting)
What they are actually saying is that reality appears to follow a probability bell curve.
You could also say that, in 1,230,000 years of baseball games, we could be almost certain of a hitting streak longer than 56 games.
Re:So basically... (Score:3, Interesting)
Re:If it's so likely, then why hasn't it happened? (Score:3, Interesting)
Statistical analysis isn't inappropriate for studying baseball; it is just inappropriate to use it in this manner.
What you are suggesting is a good example of the gambler's fallacy. And it breaks down in this case for the reasons that I mentioned: the underlying conditions in which those batting averages were collected have changed in such a way that they no longer accurately reflect the present conditions.
The GP was asking if the occurrence is that common, why hasn't it happened since, and the answer I gave was that there was a fundamental change in the way that the game is played which changed who has the advantage. It's similar to why nobody has had a
Re:Nerves (Score:4, Interesting)
Re:too simplistic (Score:3, Interesting)
On the other hand, one doesn't get the benefit of running into the belly-itchers. My feeling is that, on average, the superstars, the ones with career averages above .340, generally feasted on the mediocre to minor pitchers.
What this study doesn't take into account is how long it takes to live through a streak. DiMaggio needed two months. Besides the strain of day-to-day playing (and if it's a pennant race, you know the hot hitter is going to be in the lineup), there's also the way the weather and the light change during the season. There used to be more day games and double-headers back in the 30s-40s-50s when batting averages were highest. Travel was by train and by bus and took longer. There seems to be a week every season when a cold or flu is making the rounds of the club. Then there are situational issues: 7th inning and behind, man on second base, the hitter is 0-3 and 30 games into the streak. I say the pitcher semi-intentionally walks the batter and amid a chorus of boos the streak goes poof. Here's another consideration: the opposing players and pitchers know the hitter has a streak once it gets past 20 games, so the pitching gets a bit more careful and the batter has to extend the streak via pitchers' mistakes, and that makes it less likely.
If what I say is true, it should follow that the incidence of hitting streaks longer than 15 games in an MLB season should be lower than the probability suggested by the league batting averages (which are depressed in the NL by pitchers and the others in the bottom 4 of the lineup).
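For anyone who wants to check that prediction, the baseline "probability suggested by the batting averages" can be worked out directly. This is a rough sketch under the same dice-roll assumption the study uses (independent at-bats, a fixed number of at-bats per game); the function names and sample numbers are hypothetical, not from the study.

```python
def game_hit_prob(batting_avg, at_bats):
    """Chance of at least one hit in a game: 1 - (1 - avg) ** at_bats."""
    return 1.0 - (1.0 - batting_avg) ** at_bats


def streak_prob(batting_avg, at_bats, length):
    """Chance that one particular run of `length` games all contain a hit."""
    return game_hit_prob(batting_avg, at_bats) ** length


if __name__ == "__main__":
    # Illustrative inputs: a .300 hitter with 4 at-bats per game.
    print("per-game hit chance:", game_hit_prob(0.300, 4))
    print("chance a given 16-game stretch is all hits:", streak_prob(0.300, 4, 16))
```

If careful pitching, fatigue, and intentional walks matter the way the comment argues, the observed frequency of streaks past 15 games should come in below what this kind of calculation predicts.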
Re:If it's so likely, then why hasn't it happened? (Score:5, Interesting)
Otherwise, buddy, you're way off base.
NL year-by-year stats. [baseball-reference.com]
Look at those ERAs pre-1920. Before 1920, the ERA in the NL never significantly exceeded 3.00. After 1920, it never dropped below 3.3 or so, with the exception of a 2.99 in 1968, after which MLB made changes to the rules, among them lowering the acceptable height of the pitcher's mound.
The time prior to 1920 was marked by pitchers such as Cy Young, Mordecai Brown, Walter Johnson, Ed Walsh, Christy Mathewson. You've probably heard of most of them.
Here are the single-season MLB ERA leaders. [baseball-reference.com] Outside of Bob Gibson in the aforementioned 1968, you have to go all the way to Greg Maddux in 1994 at #48 all time to find a season after 1920 on the list. Barely 10 of the 100 lowest single-season ERAs in MLB history occurred after 1920. And that's only because Pedro Martinez in 2000 and Ron Guidry in 1978 tied with 9 others for #100 on the list. So only 8 of the best single-season ERAs happened after 1920.
You need to research "dead ball era", and the response by baseball to "Black Sox". (Hint: just like the response to the 1994 strike, it involves the ball...)
The fact that you got a +5 out of such a demonstrably incorrect post is a major indictment of the baseball knowledge of the Slashdot faithful.
Other tidbits about DiMaggio's streak (Score:3, Interesting)
During the streak Joe DiMaggio had a batting average of
During Joe DiMaggio's streak, Ted Williams actually had a higher batting average. Williams batted
Joe DiMaggio had a 61 game hitting streak while playing for the San Francisco Seals in the Pacific Coast League in 1933.
Re:If it's so likely, then why hasn't it happened? (Score:4, Interesting)
You might be able to model some long term behavior that way, but never the short term stuff, because the model is too simplified (man versus dice).