Item-Based Collaborative Filtering Recommendation Algorithms
Sarwar, B., Karypis, G., Konstan, J., & Riedl, J.
WWW '01
http://www.ra.ethz.ch/cdstore/www10/papers/pdf/p519.pdf
ABSTRACT
Recommender systems apply knowledge discovery techniques to the problem of making personalized recommendations for information, products or services during a live interaction. These systems, especially the k-nearest neighbor collaborative filtering based ones, are achieving widespread success on the Web. The tremendous growth in the amount of available information and the number of visitors to Web sites in recent years poses some key challenges for recommender systems. These are: producing high quality recommendations, performing many recommendations per second for millions of users and items and achieving high coverage in the face of data sparsity. In traditional collaborative filtering systems the amount of work increases with the number of participants in the system. New recommender system technologies are needed that can quickly produce high quality recommendations, even for very large-scale problems. To address these issues we have explored item-based collaborative filtering techniques. Item-based techniques first analyze the user-item matrix to identify relationships between different items, and then use these relationships to indirectly compute recommendations for users.
In this paper we analyze different item-based recommendation generation algorithms. We look into different techniques for computing item-item similarities (e.g., item-item correlation vs. cosine similarities between item vectors) and different techniques for obtaining recommendations from them (e.g., weighted sum vs. regression model). Finally, we experimentally evaluate our results and compare them to the basic k-nearest neighbor approach. Our experiments suggest that item-based algorithms provide dramatically better performance than user-based algorithms, while at the same time providing better quality than the best available user-based algorithms.
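As a rough illustration of two of the techniques named in the abstract (cosine similarity between item vectors, and a weighted-sum prediction), here is a minimal sketch. The toy ratings, the function names, and the choice to treat missing ratings as zeros in the cosine are assumptions made for this example, not details taken from the paper.

```python
from math import sqrt

# Toy data (hypothetical): user -> {item: rating}.
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob":   {"A": 4, "B": 5, "C": 2},
    "carol": {"A": 1, "B": 2, "C": 5},
    "dave":  {"A": 5, "C": 1},          # dave has not rated B
}

def item_vectors(ratings):
    """Invert user rows into item columns: item -> {user: rating}."""
    items = {}
    for user, row in ratings.items():
        for item, r in row.items():
            items.setdefault(item, {})[user] = r
    return items

def cosine(u, v):
    """Cosine similarity of two sparse vectors; missing entries count as 0."""
    dot = sum(r * v[k] for k, r in u.items() if k in v)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(ratings, user, target):
    """Weighted sum: average the user's own ratings on other items,
    weighted by each item's similarity to the target item."""
    items = item_vectors(ratings)
    num = den = 0.0
    for other, r in ratings[user].items():
        if other == target:
            continue
        s = cosine(items[target], items[other])
        num += s * r
        den += abs(s)
    return num / den if den else None

print(predict(ratings, "dave", "B"))
```

In this toy data item B is more similar to A than to C, so dave's high rating of A pulls the prediction for B above the midpoint of his two ratings.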
1. INTRODUCTION
The amount of information in the world is increasing far more quickly than our ability to process it. All of us have known the feeling of being overwhelmed by the number of new books, journal articles, and conference proceedings coming out each year. Technology has dramatically reduced the barriers to publishing and distributing information. Now it is time to create the technologies that can help us sift through all the available information to find that which is most valuable to us.
One of the most promising such technologies is collaborative filtering [19, 27, 14, 16]. Collaborative filtering works by building a database of preferences for items by users. A new user, Neo, is matched against the database to discover neighbors, which are other users who have historically had similar taste to Neo. Items that the neighbors like are then recommended to Neo, as he will probably also like them. Collaborative filtering has been very successful in both research and practice, and in both information filtering applications and E-commerce applications. However, there remain important research questions in overcoming two fundamental challenges for collaborative filtering recommender systems.
The first challenge is to improve the scalability of the collaborative filtering algorithms. These algorithms are able to search tens of thousands of potential neighbors in real-time, but the demands of modern systems are to search tens of millions of potential neighbors. Further, existing algorithms have performance problems with individual users for whom the site has large amounts of information. For instance, if a site is using browsing patterns as indications of content preference, it may have thousands of data points for its most frequent visitors. These "long user rows" slow down the number of neighbors that can be searched per second, further reducing scalability.
The second challenge is to improve the quality of the recommendations for the users. Users need recommendations they can trust to help them find items they will like. Users will "vote with their feet" by refusing to use recommender systems that are not consistently accurate for them.
In some ways these two challenges are in conflict, since the less time an algorithm spends searching for neighbors, the more scalable it will be, and the worse its quality. For this reason, it is important to treat the two challenges simultaneously so the solutions discovered are both useful and practical.
In this paper, we address these issues of recommender systems by applying a different approach: the item-based algorithm. The bottleneck in conventional collaborative filtering algorithms is the search for neighbors among a large user population of potential neighbors [12]. Item-based algorithms avoid this bottleneck by exploring the relationships between items first, rather than the relationships between users. Recommendations for users are computed by finding items that are similar to other items the user has liked. Because the relationships between items are relatively static, item-based algorithms may be able to provide the same quality as the user-based algorithms with less online computation.
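The split between offline and online work described in the introduction can be sketched roughly as follows. This is an illustrative outline only: `build_item_model`, `recommend`, the neighbor count `k`, and the toy data are hypothetical names and values chosen for the example, and cosine similarity (with missing ratings treated as zeros) stands in for whichever similarity measure is actually used.

```python
from math import sqrt

def build_item_model(ratings, k=2):
    """Offline step: for every item, store its k most similar items."""
    items = {}
    for user, row in ratings.items():
        for item, r in row.items():
            items.setdefault(item, {})[user] = r
    norms = {i: sqrt(sum(r * r for r in vec.values())) for i, vec in items.items()}

    model = {}
    for i, vi in items.items():
        sims = []
        for j, vj in items.items():
            if i == j:
                continue
            dot = sum(r * vj[u] for u, r in vi.items() if u in vj)
            if dot > 0:  # keep only items that share at least one rater
                sims.append((dot / (norms[i] * norms[j]), j))
        sims.sort(reverse=True)
        model[i] = sims[:k]
    return model

def recommend(model, user_row, n=3):
    """Online step: score unseen items using only the precomputed table."""
    scores = {}
    for liked, r in user_row.items():
        for sim, j in model.get(liked, []):
            if j not in user_row:
                scores[j] = scores.get(j, 0.0) + sim * r
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "u1": {"A": 5, "B": 5},
    "u2": {"A": 4, "B": 4, "C": 1},
    "u3": {"C": 5, "D": 5},
    "u4": {"C": 4, "D": 4},
}
model = build_item_model(ratings, k=2)
print(recommend(model, {"A": 5}))
```

The online step never scans the user population; it only touches the stored neighbor lists of the items the user has already rated, which is where the savings in online computation would come from.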
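Item-based collaborative filtering gives better performance (less online computation) than user-based collaborative filtering, with comparable or better recommendation quality.
The item space is relatively static compared to the user space.
So item-item similarities can be precomputed offline for item-based CF.
Adding one rating changes a user vector much more than it changes an item vector, typically because a user has far fewer ratings than an item does.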
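A small arithmetic sketch of the last point, under the loudly stated assumption that a typical item has accumulated many more ratings than a typical user has given. The vector lengths (5 and 500), the constant rating value, and the function name are hypothetical and chosen only for illustration.

```python
from math import sqrt

def drift_after_one_rating(existing, new_rating):
    """Return 1 - cosine(old vector, new vector) after appending one rating.

    The old vector has a 0 in the new coordinate, so the dot product
    equals the squared norm of the old vector.
    """
    dot = sum(r * r for r in existing)
    old_norm = sqrt(dot)
    new_norm = sqrt(dot + new_rating ** 2)
    return 1.0 - dot / (old_norm * new_norm)

user_vec = [4] * 5      # hypothetical user with 5 ratings
item_vec = [4] * 500    # hypothetical item with 500 ratings

print(drift_after_one_rating(user_vec, 4))
print(drift_after_one_rating(item_vec, 4))
```

With equal rating values the cosine between the old and new vectors is sqrt(n / (n + 1)), so the short user vector moves noticeably while the long item vector barely moves. This is one way to read the claim that precomputed item similarities stay valid longer.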