AT&T Labs - Research

Team BellKor/KorBell - Netflix Entry Details

Updated: March 21, 2008

Our new model achieves an RMSE of 0.8787 without blending and without using any data beyond what Netflix provided. We are quite happy with this accuracy. Moreover, the model appears to have made a nice contribution to the overall blend. The chart below shows the final RMSE as a function of the number of blended predictors. As expected, the first few predictors contribute decisively to accuracy, while the rest contribute only marginally. Notice that blending two predictors is enough to achieve an 8.14% improvement over Netflix Cinematch. Blending five predictors reaches the RMSE = 0.8712 level that earned us the 2007 Progress Prize. It used to take 107 predictors, so we are pleased with the progress. We hope to provide more details on the new model in upcoming publications.
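For readers who want to relate the RMSE figures above to percentage improvements, here is a minimal sketch of the arithmetic. The baseline constant `CINEMATCH_RMSE = 0.9514` is an assumption (the commonly quoted Cinematch quiz-set score); this page does not state it. The function names are illustrative only.

```python
import math

# Assumption: Cinematch's quiz-set RMSE. This value is not given on this page.
CINEMATCH_RMSE = 0.9514


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def improvement_pct(model_rmse, baseline_rmse=CINEMATCH_RMSE):
    """Relative RMSE reduction over the baseline, in percent."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


def rmse_for_improvement(pct, baseline_rmse=CINEMATCH_RMSE):
    """RMSE that corresponds to a given percent improvement over the baseline."""
    return baseline_rmse * (1.0 - pct / 100.0)


if __name__ == "__main__":
    for score in (0.8787, 0.8712):
        print(f"RMSE {score:.4f}: {improvement_pct(score):.2f}% over baseline")
    print(f"8.14% improvement: RMSE {rmse_for_improvement(8.14):.4f}")
```

Under that assumed baseline, an RMSE of 0.8712 works out to roughly an 8.43% improvement, and an 8.14% improvement corresponds to an RMSE of about 0.8740.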


Previous Entry: October 2007

We tried many different flavors and combinations of models, as described here, where our winning entry is presented as a combination of 107 different methods. However, we would like to stress that such a large number of models is not necessary to do well. The plot below shows RMSE as a function of the number of methods used. One can achieve our winning score (RMSE = 0.8712) with fewer than 50 methods. Using the best 3 methods yields RMSE < 0.8800, which would land in the top 10. Even our single best method alone puts us on the leaderboard with an RMSE of 0.8890. The lesson here is that having lots of models is useful for the incremental gains needed to win competitions, but in practice excellent systems can be built with just a few well-selected models.
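To illustrate how a curve of RMSE versus number of methods can be produced, the sketch below greedily adds one predictor at a time to a linear least-squares blend and records the blended RMSE after each step. Everything here is an assumption made for illustration: the page does not say how the methods were ordered or blended, the data is synthetic (true ratings plus Gaussian noise), and the weights are fitted and scored on the same ratings, whereas a real setup would use held-out data.

```python
import math
import random


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def blend_weights(columns, target, ridge=1e-8):
    """Least-squares blend weights via the normal equations (tiny ridge for stability)."""
    k = len(columns)
    gram = [[sum(a * b for a, b in zip(columns[i], columns[j])) + (ridge if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(a * t for a, t in zip(columns[i], target)) for i in range(k)]
    return solve(gram, rhs)


def blended_rmse(columns, target):
    """RMSE of the best linear blend of the given predictor columns (in-sample)."""
    weights = blend_weights(columns, target)
    blended = [sum(w * col[n] for w, col in zip(weights, columns))
               for n in range(len(target))]
    return rmse(blended, target)


def forward_selection_curve(predictors, target):
    """Greedily add the predictor that lowers blended RMSE most; return order and curve."""
    remaining = list(range(len(predictors)))
    chosen, curve = [], []
    while remaining:
        best = min(remaining,
                   key=lambda i: blended_rmse([predictors[j] for j in chosen + [i]], target))
        chosen.append(best)
        remaining.remove(best)
        curve.append(blended_rmse([predictors[j] for j in chosen], target))
    return chosen, curve


def demo(seed=0, n_ratings=400):
    """Synthetic example: four noisy predictors of made-up 1-5 star ratings."""
    rng = random.Random(seed)
    target = [rng.choice([1, 2, 3, 4, 5]) for _ in range(n_ratings)]
    noise_levels = [0.9, 1.0, 1.1, 1.2]  # hypothetical per-method error levels
    predictors = [[t + rng.gauss(0.0, s) for t in target] for s in noise_levels]
    return forward_selection_curve(predictors, target)


if __name__ == "__main__":
    order, curve = demo()
    for count, score in enumerate(curve, start=1):
        print(f"{count} method(s): blended RMSE = {score:.4f}")
```

On synthetic data like this, the curve typically drops quickly at first and then flattens, which is the qualitative shape described above; the specific numbers it prints have no relation to the actual Netflix Prize results.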



Here is a graph showing the progression of leaders over the first year of the competition, combined with BellKor's progress throughout the year. (Note: to reduce visual clutter, we removed any entries that held the lead for less than 24 hours.)