As scores of news outlets and blogs reported on Monday, NetFlix, the video rental service, announced the winner of its $1 Mn contest to improve its movie recommendation engine by 10%. Simply put, the contest sought an algorithm to address one of the core challenges for a video recommendation system, namely, "Will you like this movie you haven’t seen before?"
This blog post is not going to rehash the news reported so well by The NY Times, The Washington Post and 400+ others. Rather, I'm going to look at this contest from the perspective of a marketer and psychologist who sees it as a phenomenal marketing experiment. Don't let the somewhat whimsical titles of news articles fool you (e.g., “Flash! Movie Tips from your Robot Overlords” or CNN’s “Box office boffo for brainiacs”): NetFlix's three-year-long contest seems to have provided one of the most compelling reasons for e-commerce merchants to take the technique known as “crowdsourcing” seriously, very seriously.
The Core Problem: Find Me Stuff I Like
To answer why NetFlix paid $1 Mn for a seemingly small 10% improvement in their CineMatch taste and preferences engine, it’s helpful to look at the typical NetFlix customer experience. My own experience with NetFlix resonates well with NPR’s interview of Clive Thompson, a writer for Wired and the NY Times who has been researching the NetFlix experiment since its beginnings. Thompson describes:
….when I joined Netflix I had like 20 movies I wanted to see, so I saw those. And once I'm done, I can't really think of any other movies I want to see.
[Editorial note: So here the user sits in the classical psychological state known as the “overchoice paradox” – when faced with too many choices, nothing happens. Not good.]
So if they want to keep on charging me 17 bucks a month, they have to be active in helping me find new stuff or I'll go four or five months without renting any movies and I'll be, like, why am I spending 17 bucks a month, right? Their business model is incumbent upon keeping you renting movies.
He is spot on with this last one-sentence business analysis. A recommendation engine can be key to what users do next, leveraging another well-known psychological principle: the Principle of Least Effort, or more simply, people will naturally choose the path of least resistance or "effort". In the NetFlix context, it's easier for video subscribers to leave unsatisfied than to take on the onerous chore of exploring a catalog of 100,000+ movie titles. But a recommendation or taste engine changes that.
For NetFlix, the problem was that their internal engineering team was stumped on how to get a 10% improvement in their recommendation engine, CineMatch. (After all, this is an immense and complex data set.) In a marvelous act of both insight and bravery, the company decided to run an open contest, offering the best mathematical minds on the planet a chance to work hands-on with their immense database of 100 million+ movie-rating data points, all for the chance to win a $1 Mn prize. (The privacy of actual user identities in the data set was of course protected.) Interestingly, per The Washington Post, when the contest launched in 2006, the first entrants took just three weeks to improve on what Netflix's internal team had been able to do on its own.
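For readers wondering what a "10% improvement" means in practice: the contest scored entries by root-mean-square error (RMSE) between predicted and actual star ratings, and the target was an RMSE 10% lower than CineMatch's. The Python sketch below shows that arithmetic. The ratings are invented for illustration and the function names are my own, not anything from NetFlix.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equal-length, non-empty rating lists")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def percent_improvement(baseline_rmse, candidate_rmse):
    """How much lower the candidate's error is, as a percentage of the baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse

# Made-up star ratings (1 to 5) for illustration only; this is not contest data.
actual_ratings = [4, 3, 5, 2, 4, 1]
baseline_guess = [3.0, 4.0, 4.0, 3.0, 3.0, 2.0]   # stand-in for an existing engine
candidate_guess = [3.5, 3.5, 4.5, 2.5, 3.5, 1.5]  # stand-in for a contest entry

baseline_error = rmse(baseline_guess, actual_ratings)
candidate_error = rmse(candidate_guess, actual_ratings)
improvement = percent_improvement(baseline_error, candidate_error)

print(f"baseline RMSE {baseline_error:.3f}, candidate RMSE {candidate_error:.3f}, "
      f"improvement {improvement:.1f}%")
```

The toy numbers give a far bigger jump than anything seen in the contest; the point is only that the percentage is measured on prediction error, not on how many recommendations a customer ends up liking.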
The Contest: A Dramatic ScreenPlay Itself
While I promised not to retread yesterday’s news tires, it bears saying that the manner in which NetFlix promoted the contest and the response to it is the stuff of an exciting screenplay itself. Here’s a summary of some of the more exciting stats:
- 51,000 contestants entered
- Those contestants ultimately merged into some 40,000 teams
- Participants included researchers from over 186 countries
- Only in the final nail-biting 24 minutes did the team BellKor’s Pragmatic Chaos edge out the team The Ensemble with the winning submission. (Both teams did reach the 10% improvement.)
BellKor’s Pragmatic Chaos, the first and winning team, consisted of a group of statisticians, machine-learning experts and computer engineers hailing from the US, Austria, Canada and Israel. The Ensemble's 30 members come from Australia, Canada, China, Greece, Hungary, Israel, Poland, The Netherlands, and the United States.
What's the Significance?
1. Recommendation engines are a high-stakes game across many e-markets. Beyond NetFlix's use of user ratings, recommendation engines are well known to users of Amazon (as well as many other e-tailers) for product sales, Pandora for music suggestions, dating sites and social networking sites, such as Digg and Glue. We've all seen their traces online in tell-tale phrases such as "People you may like..." and "People who bought this also liked X, Y, Z". Indeed, these engines are regarded by Forrester Research as critical mainstays of next-gen ecommerce sites, translating the "overchoice paradox" into a cross-sell opportunity. According to Thompson, two-thirds of all movies rented are picked because people had them recommended by the computer.
2. Engine Improvements Address Recurring Revenue. NetFlix and others with subscriber models are keenly aware that the more accurate their engine is, the more likely they will have better customer retention and, most importantly, more recurring revenue.
3. Crowdsourcing is Now A Viable Marketing Solution and Perhaps A New Imperative. On a marketing plane, NetFlix's contest legitimizes crowdsourcing, establishing it as a viable means of propelling (relatively) fast engineering progress. This point is well made in statements published in the NY Times by Chris Volinsky, one of the leaders of the Pragmatic Chaos team and a scientist at AT&T Research. Describing the mix of different statistical and machine-learning techniques used in their solution, he said:
[It] only works well if you combine models that approach the problem differently. That’s why collaboration has been so effective, because different people approach problems differently.
His statement captures the essence of why crowdsourcing works: Diverse approaches coming from diverse cultures, not a corporate monoculture, throw new light on a problem.
4. Machine Learning May be as Cool as Cloud Computing. On a more technical plane, NetFlix's high-profile contest, long watched by the online technical community, has shone the spotlight on too-little-discussed techniques. Per Gavin Potter, one of the few social scientists in the competition,
[The contest] has widened the awareness of machine learning techniques and recommender systems within the broader business community. I have had many, many requests from businesses asking how to implement recommender systems as a result of the competition and I guess other competitors have too. The wider non machine learning community is definitely looking for new applications (see my previous posts for some examples) and this can only be good for the field as a whole.
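Volinsky's remark about combining models that "approach the problem differently" can be seen in miniature. The Python sketch below is a toy, not the winning teams' method: it assumes two hypothetical models whose mistakes point in different directions and blends them with a plain average. All numbers and names are invented for illustration.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def blend(predictions_a, predictions_b, weight_a=0.5):
    """Weighted average of two models' predictions (simple averaging by default)."""
    return [weight_a * a + (1 - weight_a) * b
            for a, b in zip(predictions_a, predictions_b)]

# Invented star ratings and two hypothetical models that err in different directions.
actual = [4, 3, 4, 2, 3, 2]
model_a = [4.8, 2.4, 4.2, 1.2, 3.6, 1.8]
model_b = [3.4, 3.8, 3.2, 2.2, 2.8, 2.6]

error_a = rmse(model_a, actual)
error_b = rmse(model_b, actual)
blended_error = rmse(blend(model_a, model_b), actual)

print(f"model A {error_a:.3f}, model B {error_b:.3f}, blend {blended_error:.3f}")
```

Because the two models' errors partly cancel, the blend scores better than either one alone. Two models that made the same mistakes would gain nothing from averaging, which is the marketing lesson as well: diversity is what makes the crowd useful.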
Take-Aways (Your Prizes)
Just as NetFlix provided data to consumer behavior researchers, their contest also provides powerful information to all companies which use recommendation engines to encourage online product sales.
1. It combined two powerful techniques: prize economics and crowdsourcing. By openly sharing their research database with data-thirsty researchers, the company virtually expanded their engineering department by 51,000 engineers and scientists. You don’t have to look at engineering salaries or consulting fees to know that a $1 Mn marketing prize spend was a great deal for their R&D budget. Could your company too use such an outlandish technique?
2. With this highly promoted contest, NetFlix effectively brought to light the identities of thousands of talented engineers and scientists, who have further benefited by participating in the contest. That talent is now identified for your own company’s potential benefit. (Judging from Gavin Potter’s blog, smart companies already started their outreach efforts to these experts some months ago.)
3. Here’s one of the more valuable lessons of the experiment: The winning solution was based on more than math skills. Whether you read the highly illuminating comment history on the Pragmatic Chaos blog or the recent press release from The Ensemble, it’s absolutely clear that teamwork was a key part of the journey to the 10% improvement. The members of the top teams in particular hold highly strategic technical knowledge, project management and timing skills. These promise to put their future clients in a highly advantageous position for producing world-class recommendation engines.
4. Finally, if you are wondering whether these movie database research results really apply to your company’s particular market category, I invite you to read through some of the enlightening market-oriented questions raised by Media Unbound, a software firm known for their recommendation services and one which tracked the contest in its final 30 days. After reading these, it becomes more obvious that the whole field of recommendation engines has benefited from the NetFlix event.
Okay, a 10% Improvement – But Will It Blend?
Among their questions, Media Unbound raises a truly critical one in terms of the full marketing impact of the NetFlix experiment: Will the 10% improvement through the algorithm result in a noticeable improvement in the Netflix subscriber experience? Comments from NetFlix CEO Reed Hastings seem to suggest so - “It will allow us to double the accuracy of suggestions to customers”.
But is this really a given? Even while the mathematicians and engineers are leaving the playing field, it seems the psychophysicists and marketing scientists need to come onto it. Will the 10% be visibly felt and experienced as a substantial improvement by the typical NetFlix customer?
Consider this. In the branch of psychology called psychophysics, they speak of a Just-Noticeable-Difference in perception, or a JND. Here’s an example that may bring that concept into familiar light. If you are sitting in a very dark room, a small increase in light will be highly noticeable. However, if you are sitting on a sunny beach, that same increase won’t be noticed. In the dark, the light change is greater than one JND, while in a bright environment, the same change is less than one JND. Translated for NetFlix, the question is whether a NetFlix video subscriber, selecting among hundreds of thousands of movie titles using the new algorithm, is going to detect the difference readily enough for it to affect their satisfaction and retention.
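A textbook way to put numbers on the JND is Weber's law, which says the smallest detectable change grows in proportion to the background level. The Python sketch below is only an illustration of the dark-room versus beach example: it treats the change as a fixed amount of added light, and the Weber fraction and the light levels are assumed values in arbitrary units, not measurements.

```python
def jnd(background_intensity, weber_fraction=0.08):
    """Smallest detectable change at a given background level under Weber's law.

    The 0.08 fraction is an illustrative assumption, not a measured value.
    """
    return weber_fraction * background_intensity

def is_noticeable(background_intensity, change, weber_fraction=0.08):
    """True if the change is at least one JND at this background level."""
    return change >= jnd(background_intensity, weber_fraction)

# Arbitrary illustrative light levels and a fixed amount of added light.
dark_room = 10.0
sunny_beach = 100_000.0
added_light = 5.0

print("dark room:", is_noticeable(dark_room, added_light))
print("sunny beach:", is_noticeable(sunny_beach, added_light))
```

What plays the role of "background level" for a movie recommendation is exactly the open question: nobody has yet measured a subscriber's JND for recommendation quality.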
So bring on the next set of scientists! Let the games continue! (And they do. With the prize award, NetFlix also announced a new $1 Mn prize contest, focusing now on the tougher problem of subscribers with sparser movie ratings.)
I’d love to know your marketing thoughts on the crowdsourcing implications of the NetFlix experiment. Is this, along with crowdsourcing use by StarBucks and others, the sign of a new way to run engineering departments? Do you see this as the path for faster product innovation?
In my next post, I’ll talk a bit about the best-practice marketing tactics NetFlix used to promote their crowdsourced contest.