One-Class Collaborative Filtering
Pan, R., Zhou, Y., Cao, B., Liu, N. N., Lukose, R., Scholz, M., & Yang, Q.
IEEE ICDM '08
http://www.rongpan.net/publications/pan-oneclasscf.pdf
Abstract
Many applications of collaborative filtering (CF), such as news item recommendation and bookmark recommendation, are most naturally thought of as oneclass collaborative filtering (OCCF) problems. In these problems, the training data usually consist simply of binary data reflecting a user’s action or inaction, such as page visitation in the case of news item recommendation or webpage bookmarking in the bookmarking scenario. Usually this kind of data are extremely sparse (a small fraction are positive examples), therefore ambiguity arises in the interpretation of the non-positive examples. Negative examples and unlabeled positive examples are mixed together and we are typically unable to distinguish them. For example, we cannot really attribute a user not bookmarking a page to a lack of interest or lack of awareness of the page. Previous research addressing this one-class problem only considered it as a classification task. In this paper, we consider the oneclass problem under the CF setting. We propose two frameworks to tackle OCCF. One is based on weighted low rank approximation; the other is based on negative example sampling. The experimental results show that our approaches significantly outperform the baselines.
1 Introduction
Personalized services are becoming increasingly indispensable on the Web, ranging from providing search
results to product recommendation. Examples of
such systems include recommending products at Amazon.com1
, DVDs at Netflix2
, News by Google 3
etc.
The central technique used in these systems is collaborative filtering (CF) which aims at predicting the preference of items for a particular user based on the items
previously rated by all users. The rating expressed in
different scores (such as a 1-5 scale in Netflix) can be explicitly given by users in many of these systems. However, in many more situations, it also can be implicitly
expressed by users’ behaviors such as click or not-click
and bookmark or not-bookmark. These forms of implicit ratings are more common and easier to obtain.
Although the advantages are clear, a drawback of
implicit rating, especially in situations of data sparsity, is that it is hard to identify representative negative
examples. All of the negative examples and missing
positive examples are mixed together and cannot be
distinguished. We refer to collaborative filtering with
only positive examples given as One-Class Collaborative Filtering (OCCF). OCCF occurs in different scenarios with two examples as follows.
• Social Bookmarks: Social bookmarks are very
popular in Web 2.0 services such as del.icio.us. In
such a system, each user bookmarks a set of webpages which can be regarded as positive examples
of the user’s interests. But two possible explanations can be made for the behavior that a user did
not bookmark a webpage. The first one is, the
page is of the users’ interest but she did not see
the page before; the second one is the user had
seen this page but it is not of her interest. We
cannot assume all the pages not in his bookmarks
are negative examples. Similar examples include
social annotation, etc.
1http://www.amazon.com
2http://www.netflix.com
3http://news.google.com
1• Clickthrough History: Clickthrough data are
widely used for personalized search and search result improvement. Usually a triple < u, q, p >
indicates a user u submitted a query q and clicked
a page p. It is common that pages that have not
been clicked on are not collected. Similar to the
bookmark example, we cannot judge whether the
page is not clicked because of the irrelevance of its
content or redundancy, for example.
There are several intuitive strategies to attack this
problem. One approach is to label negative examples
to convert the data into a classical CF problem. But
this is very expensive or even intractable because the
users generating the preference data will not bear the
burden. In fact, users rarely supply the ratings needed
by traditional learning algorithms, specifically not negative examples [23]. Moreover, based on some user
studies [14], if a customer is asked to provide many
positive and negative examples before the system performs well, she would get a bad impression of it, and
may decline to use the system. Another common solution is to treat all the missing data as negative examples. Empirically, this solution works well (see Section
4.6). The drawback is that it biases the recommendation results because some of the missing data might
be positive. On the other hand, if we treat missing as
unknown, that is, ignore all the missing examples and
utilize the positive examples only and then feed it into
CF algorithms that only model non-missing data (as
in [24]), a trivial solution arising from this approach
is that all the predictions on missing values are positive examples. All missing as negative (AMAN) and
all missing as unknown (AMAU) are therefore two extreme strategies in OCCF.
In this paper, we consider how to balance the extent of treating missing values as negative examples.
We propose two possible approaches to OCCF. These
methods allow us to tune the tradeoff in the interpretation of so-called negative examples and actually result in better performing CF algorithms overall. The
first approach is based on weighted low rank approximation [24]. The second is based on negative example sampling. They both utilize the information contained in unknown data and correct the bias of treating them as negative examples. While the weightingbased approach solves the problem deterministically,
the sampling-based method approximates the exact solution with much lower computational costs for large
scale sparse datasets.
Our contributions are summarized as follows. First
we propose two possible frameworks for the one-class
collaborative filtering problem and provide and characterize their implementations; second, we empirically
study various weighting and sampling approaches using
several real world datasets. Our proposed solutions
significantly outperform the two extremes (AMAN and
AMAU) in OCCF problems, with at least 8% improvement over the best baseline approaches in our experiments. In addition, we show empirically that these two
proposed solution frameworks (weighting and sampling
based) for OCCF have almost identical performance.
The rest of the paper is organized as follows. In
the next section, we review previous works related to
the OCCF problems. In Section 3, we propose two approaches for OCCF problems. In Section 4, we empirically compare our methods to some baselines on two
real world data sets. Finally, we conclude the paper
and give some future works.
References
- none