Interest Still High In the Netflix Algorithm Competition

Posted by Soulskill on Saturday November 22, 2008 @06:10AM from the if-sandler-then-quit dept.

circletimessquare brings us an update to the status of the million-dollar Netflix competition to develop a better algorithm for movie recommendations. We've discussed aspects of the competition since it started two years ago, but the New York Times has a lengthy overview of where it stands now. "The Netflix competition is still going strong, with a vibrant, competitive roster of some 30,000 programmers around the globe hard at work trying to win the prize. The Times provides a look at some of the more obsessive searchers, such as Len Bertoni, a semi-retired computer scientist near Pittsburgh who logs 20 hours a week on the problem, oftentimes with the help of his children. There's also Martin Chabbert in Montreal: 'After the kids are asleep and I've packed the lunches for school, I come down at 9 in the evening and work until 11 or 12.' The article gets into the history of the search algorithm Netflix currently uses, and explores the hot commodity called 'singular value decomposition' that serves as the basis for most of the algorithms in competition."

almost impossible to really win (Score:5, Informative)

by mlwmohawk ( 801821 ) writes: on Saturday November 22, 2008 @08:52AM (#25856997)

The problem with the Netflix prize, and I myself am working on it :-) is that it is pretty darn near impossible to do better than what they have.
It is based on user ratings and how close you can come to actual user ratings. For instance, their record set is frozen at a point in time, and your job is to create a system that will accurately predict what another person will rate a movie in the future.
It doesn't take much psychology to understand that these are very subjective values. If you watch a movie on a "good" date, you'll rate it higher than if you watch the same movie with a "bad" date. Then there's the level of drunkenness under which you watch the movie. The day you had at work. How much money you lost in the stock market, etc.
In aggregate, you can come close, but the percentage of variability in the data suggests that Netflix chose their numbers well enough to never have to pay the prize.
Also, the "data" is nothing more than movie titles and obfuscated user ratings. Any sort of contextual or meta data about the movies you have to go find yourself.
It is a fun project on which to work, but I'm dubious of the end prize. I'll keep working on it because it's fun, but I have my doubts as to the winnability of the contest based on the criteria for success.

Re:Wow! Think about how many free man-hours Netfli (Score:5, Informative)

by Spy Hunter ( 317220 ) writes: on Saturday November 22, 2008 @08:56AM (#25857009) Journal

Actually Netflix closes nothing off. In fact, in order to receive the prize, the winner must publish their algorithm to the public. The winner could easily open-source the entire thing, or OTOH they're also free to patent it out the wazoo and start pimping it out. The only condition Netflix imposes is that Netflix gets a non-exclusive license to use the algorithm in exchange for the prize money, which is eminently reasonable.

Re:Wow! Think about how many free man-hours Netfli (Score:2, Informative)

by morgan_greywolf ( 835522 ) writes: on Saturday November 22, 2008 @09:02AM (#25857023) Homepage Journal

Saying that, if you enjoy playing with this, go ahead! Just be honest with yourself about it. If you still want to do it, wallow in it. But it's an extremely pernicious thing to do to link this with working on something that is done to benefit everyone. It simply is not the same thing.
Exactly. Working on FOSS is a magnanimous thing to do. You are giving freely to the entire world -- anyone who needs done what your particular code does. It's volunteerism.
When you participate in the Netflix competition, you might not be getting paid, but the work you're doing benefits only Netflix and you -- if you win the $1 million prize, that is. There are side benefits even if you don't win the million dollar prize -- you increase your own abilities in the areas of programming, mathematics, critical thinking, etc.
But that's the only similarity to working on FOSS, and it's where the similarities end. At the end of the day, doing the Netflix competition is spec work at best. If you exclude the $1m -- which is very cheap, BTW -- the sole beneficiary is Netflix.
But if you write something that benefits others in the world who share your problem and distribute that freely to the world, the beneficiary is the entire world, or at least some portion of it.

Re:It's fundamentally flawed (Score:2, Informative)

by boyter ( 964910 ) writes: on Saturday November 22, 2008 @09:07AM (#25857043) Homepage

It doesn't make a difference. If you are using the same account for scoring then you are using the same account for the recommendations. So if the algorithm suggests something your wife will like but you don't, it is still successful, because for the account in general it gave a good match. Besides, you can actually look into the data more deeply and find accounts like this (not too difficult) and vary your scoring weights to improve accuracy for other people.

Re:almost impossible to really win (Score:3, Informative)

by Garse Janacek ( 554329 ) writes: on Saturday November 22, 2008 @04:26PM (#25859725)

I have not looked into it but can you be certain that the top teams are not using additional metadata on the movies?
Pretty sure. IAITTT = I Am In The Top Ten ;)
The winning progress prize entry from 2007 had to publish the full details of their algorithm, and they don't use anything. I don't use anything. PragmaticTheory [blogspot.com] even wrote a blog post about how they don't use anything. Others have said the same thing. It's impossible to say that no one will ever come up with a useful way to use metadata, but so far the "metadata" produced by the algorithms themselves is far more accurate than that generated by human observers on the same data.
It may wind up being something not intuitive (like release month/year, production company, gap score of economic state during release year vs current, or something like that)
Well, that goes beyond counterintuitive to demonstrably unhelpful -- it seems a priori unlikely that someone's rating would depend on the production company, for example, but even if it did, that would be much more easily detected by the actual movie average (i.e. if a particular production company gets good ratings, then we will know that just because the movie has a lot of good ratings, and the company becomes superfluous). On the other hand, if you're suggesting that specific people have varying opinions of particular companies, well, that again seems odd, but again it's irrelevant -- if such a correlation exists, SVD will find it, and so some of the dimensions of the user-movie vectors will correlate with production company.
Similar with the other properties you mention: since SVD is already finding *all* of the (linear) correlations in the data, it's not very helpful to try to come up with a huge list of farfetched ones yourself hoping one of them will work out...
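A toy sketch of that last point, not anyone's actual contest code. The ratings matrix below is made up for illustration, the "studio" labels are hypothetical, and plain power iteration stands in for a full SVD routine. The idea is that if users really do split on some hidden property of the movies, the leading singular vector picks up that split from the ratings alone, without ever being told the property.

```python
import math

# Hypothetical mean-centered ratings: rows are users, columns are movies.
# Movies 0 and 1 are imagined to come from "studio A", movies 2 and 3 from
# "studio B". The algorithm below never sees those labels.
RATINGS = [
    [1.0, 0.8, -1.0, -0.9],   # user who prefers studio A
    [0.6, 0.5, -0.4, -0.5],   # milder studio A preference
    [-1.0, -0.9, 0.8, 1.0],   # user who prefers studio B
    [-0.5, -0.4, 0.6, 0.5],   # milder studio B preference
]


def top_movie_factor(matrix, iterations=100):
    """Leading right singular vector of matrix, via power iteration on R^T R."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / (j + 1) for j in range(n_cols)]  # fixed, non-degenerate start
    for _ in range(iterations):
        # u = R v  (one score per user)
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        # w = R^T u  (one score per movie)
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


movie_factor = top_movie_factor(RATINGS)
print([round(x, 3) for x in movie_factor])
```

In this toy data the two studio A movies end up with one sign in the factor and the two studio B movies with the other, so the "production company" dimension is recovered from the ratings themselves. Real contest data is sparse and far noisier, so treat this only as an illustration of the argument.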

Re: the problem with outliers (Score:2, Informative)

by Anonymous Coward writes: on Saturday November 22, 2008 @06:25PM (#25860365)

The outliers are a major problem, but you can't just ignore them and move on. Collectively they add up to most of the error.
The training data set includes 116,362 user ratings of Napoleon Dynamite; the distribution is:
- 1: 13,365 = 11.5%
- 2: 15,790 = 13.6%
- 3: 27,216 = 23.4%
- 4: 31,115 = 26.7%
- 5: 28,876 = 24.8%
The weighted average of these ratings is 3.4, and the math works out that when you only guess one value, the RMSE minimizes at the average. So in this case, a guess of 3.4 on all of those ratings gives you a 1.3025 RMSE for the data shown above. Most movies have an RMSE below 1.1.
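For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the average and the single-guess RMSE from the counts listed above. The function and variable names are mine, not part of the contest data.

```python
import math

# Rating counts for Napoleon Dynamite as quoted in the post.
COUNTS = {1: 13365, 2: 15790, 3: 27216, 4: 31115, 5: 28876}


def mean_rating(counts):
    """Weighted average of the ratings."""
    total = sum(counts.values())
    return sum(rating * n for rating, n in counts.items()) / total


def rmse_of_constant_guess(counts, guess):
    """RMSE when the same value is guessed for every rating."""
    total = sum(counts.values())
    sse = sum(n * (rating - guess) ** 2 for rating, n in counts.items())
    return math.sqrt(sse / total)


average = mean_rating(COUNTS)
print(round(average, 2), round(rmse_of_constant_guess(COUNTS, average), 4))

# Any other single guess does worse than the average.
for other in (3.0, 3.2, 3.6, 4.0):
    assert rmse_of_constant_guess(COUNTS, other) > rmse_of_constant_guess(COUNTS, average)
```

The reason the average wins is that the squared error for a constant guess g equals the variance plus (mean - g)^2, so anything other than the mean only adds to it.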
Now suppose we try to refine our guess by using a coinflip method. In this model, we can look at the split 12/345 and assign ratings of 25%@1.54 and 75%@4.02. But what happens when we apply these without having any knowledge of which category each person falls into? We end up doing worse! The problem is that even though you're only guessing 1.54 a quarter of the time, 3/4 of that 1/4 you're guessing 1.54 for someone who actually rated it a 3, 4, or 5. The error for 5 is especially bad, since 5 - 1.54 = 3.46, and then you have to square that! Overall, across the distribution, a guess of 1.54 ends up having an RMSE of 2.27, and the guess of 4.02 has an RMSE of 1.44. Applied together at 25% and 75% respectively you'll get sqrt(25% * 2.27^2 + 75% * 1.44^2) = 1.69 RMSE. Alternatively we could use the 123/45 split: 48.5%@2.25 and 51.5%@4.48, but that turns out worse still since you'll end up with sqrt(48.5% * 1.74^2 + 51.5% * 1.69^2) = 1.71 RMSE.
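The coinflip numbers can be reproduced with a short sketch. It assumes, as the post does, that the low or high guess is handed out at random in proportion to the group sizes, with no knowledge of who belongs to which group. Helper names are hypothetical.

```python
import math

# Rating counts for Napoleon Dynamite as quoted in the post.
COUNTS = {1: 13365, 2: 15790, 3: 27216, 4: 31115, 5: 28876}
TOTAL = sum(COUNTS.values())


def group_mean(ratings):
    """Average rating within one group, e.g. (1, 2) or (3, 4, 5)."""
    n = sum(COUNTS[r] for r in ratings)
    return sum(r * COUNTS[r] for r in ratings) / n


def mse_of_guess(guess):
    """Mean squared error of one guess applied to the whole distribution."""
    return sum(n * (r - guess) ** 2 for r, n in COUNTS.items()) / TOTAL


def blind_split_rmse(low_ratings, high_ratings):
    """RMSE when the two group means are assigned at random, blind to the true group."""
    p_low = sum(COUNTS[r] for r in low_ratings) / TOTAL
    low_guess = group_mean(low_ratings)
    high_guess = group_mean(high_ratings)
    mixed_mse = p_low * mse_of_guess(low_guess) + (1 - p_low) * mse_of_guess(high_guess)
    return math.sqrt(mixed_mse)


print(round(blind_split_rmse((1, 2), (3, 4, 5)), 2))
print(round(blind_split_rmse((1, 2, 3), (4, 5)), 2))
```

Both splits come out well above the 1.3025 you get from guessing the average every time, which is the whole point: splitting only helps if you can tell which side of the split a given user is on.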
The qualifying set asks for 10,551 guessed ratings of Napoleon Dynamite out of 2,817,131 guessed ratings total. So if you can't figure out anything else about the ratings and have to go with the mean vote, your error will include 10,551 * (1.3025)^2 = ~17,900 SSE (sum of squared error) from Napoleon Dynamite alone. The coin flip methods mentioned above would give over 30,000 SSE.
To put this in greater perspective: To win $1e6, you need to get below (0.8563)^2 * 2,817,131 = 2,065,660 SSE. The current leader has (0.8616)^2 * 2,817,131 = 2,091,310 SSE, and the 10th place team has (0.8677)^2 * 2,817,131 = 2,121,027 SSE. Thus the leader is only 25,650 SSE away from the prize, and the 10th place team is only 29,717 behind that at 55,367 SSE away.
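The conversion from leaderboard RMSE to total SSE is just SSE = RMSE^2 * N. A small sketch using the three RMSE figures and the qualifying-set size quoted above (the names are mine):

```python
# Number of ratings to predict in the qualifying set, as quoted in the post.
QUALIFYING_RATINGS = 2_817_131


def sse_from_rmse(rmse, n=QUALIFYING_RATINGS):
    """Total sum of squared error implied by an RMSE over n predictions."""
    return rmse ** 2 * n


prize_target = sse_from_rmse(0.8563)   # RMSE needed to win the prize
leader = sse_from_rmse(0.8616)         # current leader
tenth_place = sse_from_rmse(0.8677)    # 10th place team

print(round(prize_target), round(leader), round(tenth_place))
print(round(leader - prize_target), round(tenth_place - prize_target))
```

Working in SSE rather than RMSE is convenient here because SSE adds up movie by movie, so you can see how much of the remaining gap a single title accounts for.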
So if the leaders were all using 3.4 as their guess for Napoleon Dynamite, and then they suddenly figured out a way to reduce the RMSE of their guesses for that one movie to 0.86, they'd be able to knock off 10,000 points of SSE -- just for the one movie. That's why they're so interested in "solving" the problem with outliers. However, odds are that they're already guessing in the 0.95 to 1.05 RMSE range for Napoleon, based on connections they've deduced about how each individual rated other movies.
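A quick sketch of the per-movie numbers, under the same hypothetical scenario as the post: everyone starts at the 1.3025 RMSE of the constant guess and somehow gets Napoleon Dynamite down to 0.86. The gap figure is the leader's distance from the prize quoted earlier.

```python
# Figures quoted in the post.
NAPOLEON_QUALIFYING_RATINGS = 10_551
LEADER_GAP_SSE = 25_650  # leader's distance from the prize threshold


def movie_sse(rmse, n=NAPOLEON_QUALIFYING_RATINGS):
    """SSE contributed by one movie's predictions at a given RMSE."""
    return rmse ** 2 * n


sse_constant_guess = movie_sse(1.3025)  # guessing 3.4 for everyone
sse_improved = movie_sse(0.86)          # hypothetical improved predictor
savings = sse_constant_guess - sse_improved

print(round(sse_constant_guess), round(sse_improved), round(savings))
print(round(100 * savings / LEADER_GAP_SSE), "% of the leader's remaining gap")
```

Under those assumptions one movie would cover a large share of the leader's remaining gap, though as noted the top teams are probably already well below 1.3025 on it, so the real savings available are smaller.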
