Interest Still High In the Netflix Algorithm Competition
circletimessquare brings us an update to the status of the million-dollar Netflix competition to develop a better algorithm for movie recommendations. We've discussed aspects of the competition since it started two years ago, but the New York Times has a lengthy overview of where it stands now.
"The Netflix competition is still going strong, with a vibrant, competitive roster of some 30,000 programmers around the globe hard at work trying to win the prize. The Times provides a look at some of the more obsessive searchers, such as Len Bertoni, a semi-retired computer scientist near Pittsburgh who logs 20 hours a week on the problem, oftentimes with the help of his children. There's also Martin Chabbert in Montreal: 'After the kids are asleep and I've packed the lunches for school, I come down at 9 in the evening and work until 11 or 12.' The article gets into the history of the search algorithm Netflix currently uses, and explores the hot commodity called 'singular value decomposition' that serves as the basis for most of the algorithms in competition."
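For readers wondering what the 'singular value decomposition' idea looks like in practice, here is a minimal sketch. It assumes the common contest-style variant in which each user and each movie gets a short vector of latent factors, fitted by stochastic gradient descent on the known ratings only, rather than a textbook SVD of a full matrix. The toy ratings, the function names (`train`, `predict`, `rmse`) and the hyperparameters are all made up for illustration; none of this is Netflix's or any competitor's actual code.

```python
import random

# Toy (user, movie, stars) triples, invented for illustration; not Netflix data.
RATINGS = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 2),
]

def train(ratings, n_users, n_movies, k=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit k latent factors per user and per movie by SGD on observed ratings."""
    rng = random.Random(seed)
    mean = sum(r for _, _, r in ratings) / len(ratings)
    # Small random starting values; all hyperparameters here are arbitrary.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_movies)]
    for _ in range(epochs):
        for u, m, r in ratings:
            err = r - (mean + sum(a * b for a, b in zip(P[u], Q[m])))
            for f in range(k):
                pu, qm = P[u][f], Q[m][f]
                # Gradient step with a small penalty on large factor values.
                P[u][f] += lr * (err * qm - reg * pu)
                Q[m][f] += lr * (err * pu - reg * qm)
    return mean, P, Q

def predict(model, u, m):
    """Predicted stars: global mean plus the user-movie factor dot product."""
    mean, P, Q = model
    raw = mean + sum(a * b for a, b in zip(P[u], Q[m]))
    return min(5.0, max(1.0, raw))  # clip to the 1-5 star scale

def rmse(model, ratings):
    """Root-mean-square error of the model on a list of rating triples."""
    total = sum((r - predict(model, u, m)) ** 2 for u, m, r in ratings)
    return (total / len(ratings)) ** 0.5
```

Calling `predict(train(RATINGS, 4, 4), 0, 3)` gives a guess for a movie that user 0 never rated. On real data the error would be measured on held-out ratings, not on the training set.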
Netflix (Score:5, Interesting)
It's actually not that hard to build an algorithm which works well. Following a demonstration at TechEd I built my own implementation in Python in about 2 hours or so (using a vector space algorithm), with reasonable results. The problem is that it is very difficult to win the prize.
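The comment doesn't say which vector space algorithm was used, so the following is only a guess at the general shape: treat each user's ratings as a sparse vector, compare users by cosine similarity, and predict a rating as the similarity-weighted average of what other users gave the movie. The data and the names `cosine` and `predict` are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: user -> {movie: stars}. Not Netflix data.
RATINGS = {
    "ann": {"Lethal Weapon": 5, "Miss Congeniality": 2},
    "bob": {"Lethal Weapon": 4, "Miss Congeniality": 1, "Alien": 5},
    "cat": {"Lethal Weapon": 1, "Miss Congeniality": 5, "Alien": 1},
}

def cosine(a, b):
    """Cosine of the angle between two sparse rating vectors."""
    dot = sum(a[m] * b[m] for m in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(ratings, user, movie):
    """Similarity-weighted average of other users' ratings for the movie."""
    pairs = [
        (cosine(ratings[user], other), other[movie])
        for name, other in ratings.items()
        if name != user and movie in other
    ]
    total = sum(sim for sim, _ in pairs)
    if not total:
        return None  # nobody comparable has rated this movie
    return sum(sim * stars for sim, stars in pairs) / total

guess = predict(RATINGS, "ann", "Alien")
```

Here `guess` lands a little above the plain average of 3 because ann's ratings look more like bob's than cat's. The pull is weak since raw star vectors are all positive, so every pair of users looks at least somewhat similar.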
The best thing about it is that you get a lot of data to play with. If you are interested in parallel algorithms and large data sets, give it a go. It's surprisingly interesting and sucks you in. In fact, I might go play with it now.
Algorithm or Human inaccuracy? (Score:5, Interesting)
When Bertoni runs his algorithms on regular hits like Lethal Weapon or Miss Congeniality and tries to predict how any given Netflix user will rate them, he's usually within eight-tenths of a star
Makes me wonder how accurate my own ratings would be. The difference between clicking 3 or 4 stars is often very minor and arbitrary. At the end of a movie I might rate it totally differently than I would 20 minutes later. Sounds like they're doing pretty good so far.
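To put a number on that kind of self-disagreement, one option is the root-mean-square error, the same sort of measure the contest uses to score predictions. The sketch below uses made-up ratings: one viewer rates five movies twice and changes their mind by a single star on two of them. The function name `rmse` and the data are hypothetical.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of star ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return sqrt(squared / len(actual))

# Hypothetical example: the same viewer rates five movies on two occasions.
first_pass = [3, 4, 5, 2, 4]
second_pass = [4, 4, 5, 3, 4]

# Two one-star changes out of five ratings: sqrt(2 / 5), about 0.63 stars.
noise = rmse(first_pass, second_pass)
```

With these invented numbers the viewer disagrees with themselves by about 0.63 stars, which is in the same range as the eight-tenths of a star quoted above.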
There's a sort of unsettling, alien quality to their computers' results ... But many categorizations are now so obscure that they cannot see the reasoning behind them. Possibly the algorithms are finding connections so deep and subconscious that customers themselves wouldn't even recognize them.
Realizing the program you wrote out-performs you and you can't explain why is a rather odd feeling.
Re:Damn you, slashdot. (Score:1, Interesting)
I opened up the RSS in tabs, saw "Incest Still High In t..." and couldn't resist immediately clicking.
And a small thing that bugs me: why does the "In" start with a capital letter, but "the" doesn't?
i usually do all my code comments like that. it's the same system that book titles use, where the first word and the important words start with a capital letter and the minor words don't. in German every noun is capitalized, and that's an interesting system too.
Re:almost impossible to really win (Score:5, Interesting)
If I recall correctly, the last person to win a milestone used an additional data source for rating (which is fine by their rules).
It's probably going to take an additional data source to improve ratings.
Hey if you do it at least you get a mil ;) It sounds like a worthy hobby in my book.
Re:Wow! Think about how many free man-hours Netfli (Score:2, Interesting)
That's remarkably reasonable. If I was LOVEFiLM or Amazon I'd be cackling with glee. I'm not though, so I'll just be depressed that one could hope to patent an algorithm. Not hardware that carries out an algorithm, but just an algorithm.
Although if I were a netflix shareholder I'd be pissed off that the company were giving away my funded research for free, when they could probably get it closed off and reap the rewards. Mind you, the amount of publicity that they have received - I know about Netflix now and I don't watch DVDs or live in the USA! - is probably more than worth it...
Multi discipline rating (Score:5, Interesting)
Right now Netflix tries to infer what it was in the movie you liked by looking at other movies. Why not just ask users what they liked about the movie?
For instance, I'm very concerned about the production quality of a movie. A movie may have the best plot ever and great actors, but if it was shot on a home VHS camera, I would give it 1 star because the production quality was so bad. Someone who cares mostly about plot, on the other hand, may have rated it 5 stars. Netflix will never know whether I rated it 1 star because I don't like the genre, the acting, or the cinematography. It just sees that I rated the whole movie a 1, so any movies with similar elements lose importance in my personal ratings. If I could tell Netflix "don't show me movies shot on a VHS camera" (e.g., production: 1 star), then I could also tell it that I love the genre, love the plot, and hate the production.
A good example is BloodRayne: this movie absolutely sucks (insert government-sponsored movies jab here), but I like the genre. If I give it one star, as it deserves, Netflix will think I really don't like the... whatever; it's most likely going to be wrong about it because it's pure conjecture.
I'm not a big movie nerd so I wouldn't be the best person to come up with the rating categories, but I'll give it a shot since this will never occur:
1. Production Quality
2. Plot
3. Directing
4. Acting
5. Genre
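As a rough sketch of what a rating split along those five categories might look like as data (assuming a 1-5 star scale per category; the class name `AspectRating` and the helper `disliked_aspects` are made up for illustration and are not anything Netflix offers):

```python
from dataclasses import dataclass, asdict

@dataclass
class AspectRating:
    """One viewer's 1-5 star rating of a movie, split by aspect."""
    production: int
    plot: int
    directing: int
    acting: int
    genre: int

def disliked_aspects(rating, threshold=2):
    """Names of the aspects rated at or below the threshold."""
    return [name for name, stars in asdict(rating).items() if stars <= threshold]

# Hypothetical rating for a badly shot movie in a genre the viewer loves.
rating = AspectRating(production=1, plot=4, directing=2, acting=3, genre=5)
low = disliked_aspects(rating)
```

For this made-up rating `low` contains only `production` and `directing`, so a recommender reading it would have no reason to hold the low marks against the genre.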
Of course this will never happen because netflix will not change their system to conform to my random idea on slashdot. And by this sentence I've just about exhausted all my interest in the subject.
One last comment: Why are all the online netflix movies so craptastic? Really, if it wasn't made 15 years ago, and it's in the "watch instantly" section, then it must really suck. They had a movie on there called "merc force"