This is a technical deep dive into the collaborative filtering algorithm and how to use it in practice. From Amazon recommending products you may be interested in based on your recent purchases to Netflix recommending shows and movies you may want to watch, recommender systems have become popular across many applications of data science. Like many other problems in data science, there are several ways to approach recommendations. Two of the most popular are collaborative filtering and content-based recommendations. Content-based recommendations: if a company has detailed metadata about each of its items, it can recommend items with similar metadata tags.
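To make the content-based idea concrete, here is a minimal sketch that recommends items whose metadata tags overlap with those of a target item. The catalog, the tag names, and the choice of Jaccard similarity are illustrative assumptions rather than anything prescribed by the text; a production system would likely use richer features and weighting.

```python
def jaccard(a, b):
    """Jaccard similarity between two tag sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_items(target, catalog, top_n=2):
    """Return the top_n items whose tags best overlap with the target's tags."""
    target_tags = catalog[target]
    scored = [
        (other, jaccard(target_tags, tags))
        for other, tags in catalog.items()
        if other != target
    ]
    # Highest similarity first; ties keep catalog order (sort is stable).
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical catalog: item name -> set of metadata tags.
catalog = {
    "space_documentary": {"documentary", "space", "science"},
    "mars_mission_drama": {"drama", "space", "science"},
    "cooking_show": {"food", "reality"},
    "physics_lectures": {"documentary", "science", "education"},
}

top = similar_items("space_documentary", catalog)
print(top)
```

Note that this approach needs no ratings from other users at all; it depends only on how well the items are tagged.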
Steve Jobs once said, "A lot of times, people don't know what they want until you show it to them." This makes sense, especially in this era of constant choice overload. Consumers today have access to a plethora of products at the click of a mouse. So many choices can be confusing and can do more harm than good. For instance, a company may offer millions of products on its website, but how does a consumer find a new and appealing product among them?
Nowadays, recommender systems are used to personalize your experience on the web, telling you what to buy, where to eat, or even who you should be friends with. People's tastes vary, but they generally follow patterns. People tend to like things that are similar to other things they like, and they tend to have tastes similar to those of people they are close to. Recommender systems try to capture these patterns to help predict what else you might like. E-commerce, social media, video, and online news platforms have been actively deploying their own recommender systems to help customers choose products more efficiently, which benefits both the customer and the platform.
In this article, I use the Kaggle Netflix prize data to demonstrate how to use a model-based collaborative filtering method to build a recommender system in Python. Recommender systems are widely used for product recommendations such as music, movies, books, news, research articles, and restaurants. The collaborative filtering method predicts (filters) a user's interest in a product by collecting preference information from many other users (collaborating). The assumption behind collaborative filtering is that if a person P1 has the same opinion as another person P2 on one issue, P1 is more likely to share P2's opinion on a different issue than the opinion of a randomly chosen person. The content-based filtering method, by contrast, uses product features or attributes to recommend other products similar to what the user likes, based on that user's own previous actions or explicit feedback such as product ratings.
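The P1/P2 assumption can be illustrated with a small sketch. Note that this is a memory-based (user-user) variant chosen for brevity, not the model-based method the article goes on to use, and the ratings, user names, and helper functions are all hypothetical. It predicts a missing rating as a similarity-weighted average of other users' ratings, with cosine similarity computed over co-rated items.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two users over the items both have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user has rated the item.
    """
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None


# Hypothetical ratings on a 1-5 scale: user -> {item: rating}.
ratings = {
    "P1": {"A": 5, "B": 4},
    "P2": {"A": 5, "B": 4, "C": 5},  # agrees with P1 on A and B
    "P3": {"A": 1, "B": 5, "C": 1},  # disagrees with P1 on A
}

predicted = predict(ratings, "P1", "C")
print(predicted)
```

Because P1's past ratings match P2's more closely than P3's, the prediction for item C is pulled toward P2's rating. Cosine similarity on raw ratings discriminates only weakly between users, so mean-centering each user's ratings first is a common refinement.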
On the World Wide Web (WWW), the rapid growth and spread of data has made a tremendous amount of information available, yet users are often unable to find relevant information in a short span of time. Consequently, recommendation systems were developed to help users find information with ease, based on their browsing activity. In other words, recommender systems are tools for interacting with large amounts of information: they provide a personalized view that prioritizes items likely to be of interest to the user. They have evolved over the years alongside artificial intelligence techniques, including machine learning and data mining, among others. Furthermore, recommendation systems now personalize e-commerce and other online applications such as Amazon.com, Netflix, and Booking.com. As a result, many researchers have been inspired to extend recommendation systems to new challenges and problem areas that are yet to be truly solved. Chief among them is making recommendations to a new user, known as the cold-start user problem, where the system has little information about the new user to work with. Therefore, the purpose of this paper is to address the cold-start problem by examining a few efficient methods and their challenges, as well as to identify and give an overview of the current state of recommendation systems as a whole.