## Using Graph Theory To Predict NCAA Tournament Outcomes

New submitter SocratesJedi writes

*"Like many technically-minded people, I don't have a lot of time to keep up with sports. Nevertheless, trying to predict the outcome of the NCAA men's basketball tournament is a fun activity to share with friends, family and colleagues. This year, I abandoned my usual strategy of quasi-randomly choosing teams and instead modeled the win-loss history of all Division I teams as a weighted network. The network included information from 5242 games played during the 2011-2012 season. From this, teams can be ranked using tools from graph theory and those rankings can be used to predict tournament outcomes. Without any* a priori *information, this method accurately identified all the #1 seeds in the top 5 best teams. It also predicts that at least one underdog, Belmont (#14 seed), will reach the Elite Eight. Although the ultimate test will be how well it predicts tournament outcomes, initial benchmarks suggest 70-80% accuracy would not be unreasonable."*
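The summary does not say which graph-theory tool was used, so the following is only a minimal sketch of one plausible approach: PageRank by power iteration on a directed graph where each loss adds weight to an edge from the loser to the winner. The function name `rank_teams`, the damping factor, and the sample results are all made up for illustration.

```python
from collections import defaultdict

def rank_teams(games, damping=0.85, iterations=100):
    """Rank teams by PageRank on a loser -> winner graph.

    games: iterable of (winner, loser) pairs, one per game played.
    Assumption: edge weight = number of wins; the submitter's actual
    weighting scheme is not described in the summary.
    """
    # weight[loser][winner] = number of times winner beat loser
    weight = defaultdict(lambda: defaultdict(float))
    teams = set()
    for winner, loser in games:
        weight[loser][winner] += 1.0
        teams.update((winner, loser))

    n = len(teams)
    score = {t: 1.0 / n for t in teams}
    for _ in range(iterations):
        # Undefeated teams have no outgoing edges; spread their score evenly.
        dangling = sum(score[t] for t in teams if not weight[t])
        new = {t: (1.0 - damping) / n + damping * dangling / n for t in teams}
        # Each team passes its score to the teams that beat it,
        # in proportion to how often each of them won.
        for loser, wins in weight.items():
            total = sum(wins.values())
            for winner, w in wins.items():
                new[winner] += damping * score[loser] * w / total
        score = new
    return sorted(teams, key=lambda t: score[t], reverse=True)

# Made-up results, purely illustrative (not real 2011-2012 data)
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("B", "D"), ("A", "D")]
ranking = rank_teams(games)
print(ranking)
```

A bracket could then be filled in by advancing whichever team sits higher in `ranking` in each matchup.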
## past history (Score:5, Insightful)

Wouldn't running the algorithm against past years' records and testing against past tournament results be the best possible test to tune the algorithm?
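A rough sketch of what such a backtest could look like, assuming you already have a ranking built from a past regular season and the list of that year's tournament games as (winner, loser) pairs. The name `prediction_accuracy` and the sample data are hypothetical.

```python
def prediction_accuracy(ranking, tournament_games):
    """Fraction of games won by the team that sits higher in the ranking.

    ranking: list of team names, best first.
    tournament_games: list of (winner, loser) pairs from a past tournament.
    """
    position = {team: i for i, team in enumerate(ranking)}
    # A prediction counts as correct when the actual winner was ranked higher.
    correct = sum(position[w] < position[l] for w, l in tournament_games)
    return correct / len(tournament_games)

# Made-up example: the ranking gets two of three games right
past_ranking = ["A", "B", "C", "D"]
past_games = [("A", "D"), ("C", "B"), ("A", "C")]
accuracy = prediction_accuracy(past_ranking, past_games)
print(accuracy)
```

Running this over several past seasons would give a number to compare against the 70-80% figure in the summary.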

## Predicting the top is easy (Score:5, Insightful)

Everyone knows who the big names are who are likely to make it to the final four. It's predicting how things will go at the middle and bottom, where teams are much more likely to be evenly matched, that's really hard.

## Re:Just take last year's results (Score:5, Insightful)

That may work for pro sports, but not for college sports. In fact, because teams usually lose their nucleus after winning it all (players declare for the draft), it is rare for a team to make it to the final game two or more years in a row.

## Re:past history (Score:5, Insightful)

The problem stems from the fact that we traditionally predict a team will win if it is the stronger or better team, and we use our graph theory to produce relative team ratings. If each game of the tournament were played over and over again, with the winner of the majority going to the next round, then our methods would work even better. As it stands, though, we are trying to predict a single sample from a probability distribution, which will necessarily have error. Informally, the real tournament has upsets (when a weaker team beats a stronger one). Our algorithms can't predict these; the best they can do is gain a better understanding than humans of which team is better.
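To put a number on the single-sample point, here is a small sketch that computes how often the better team wins a majority of n games. The 0.7 per-game win probability is just an illustrative assumption, and so is treating the games as independent.

```python
from math import comb

def series_win_probability(p, n_games):
    """Chance that a team with per-game win probability p wins the
    majority of n_games (n_games odd), assuming independent games."""
    need = n_games // 2 + 1
    # Binomial tail: sum the probability of every outcome with >= need wins.
    return sum(
        comb(n_games, k) * p**k * (1 - p) ** (n_games - k)
        for k in range(need, n_games + 1)
    )

single_game = series_win_probability(0.7, 1)
best_of_seven = series_win_probability(0.7, 7)
print(single_game, best_of_seven)
```

With a single game the better team wins exactly as often as its per-game probability says; the best-of-seven figure is noticeably higher, which is why a perfect rating still mis-predicts a one-and-done tournament.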

Add to that the fact that the tournament is structured hierarchically - a mis-prediction in the first round prevents you from even attempting to predict later games (and by NCAA bracket scoring, that counts the same as mis-predicting those later games). So early upsets can potentially have large negative effects on brackets.
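A toy illustration of that knock-on effect, using a four-team bracket and a flat one point per correct pick. Real pools often weight later rounds more heavily, so the scoring rule, the name `score_bracket`, and the teams here are all assumptions.

```python
def score_bracket(picks, results, points_per_round=None):
    """Score a bracket.

    picks, results: one set per round, holding the teams picked to win
    (or that actually won) a game in that round.
    points_per_round: optional weights; defaults to 1 point per game.
    """
    if points_per_round is None:
        points_per_round = [1] * len(results)
    # A pick scores only if that team really won a game in that round.
    return sum(
        pts * len(picked & actual)
        for picked, actual, pts in zip(picks, results, points_per_round)
    )

# Semifinals: A vs D and B vs C. D upsets A, then B wins the final.
results = [{"D", "B"}, {"B"}]

picked_a_to_win = [{"A", "B"}, {"A"}]  # A's early loss also kills the final pick
picked_b_to_win = [{"A", "B"}, {"B"}]  # same first-round miss, but the final pick survives

print(score_bracket(picked_a_to_win, results))  # 1
print(score_bracket(picked_b_to_win, results))  # 2
```

Both brackets made the same first-round mistake, but the one that had A going all the way loses the final as well.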

## Doesn't matter if it works (Score:2, Insightful)

Can you write a Windows installer for it and sell it to gamblers?

## Not enough time? (Score:4, Insightful)

You don't have time to follow sports, but you have time to model "information from 5242 games played during the 2011-2012 season".

You could be honest and just say you don't really care, but get involved in the playoffs because everyone else is talking about it.

I'm guessing your level 80 warlock probably doesn't 'have time' either. :)