Grocery Shopping Recommendations Based on Basket-Sensitive Random Walk
Li, M., Dias, B. M., Jarman, I., El-Deredy, W., & Lisboa, P. J. G.
ACM SIGKDD '09
http://dblab.cs.nccu.edu.tw/presentation/990406/p1215-li.pdf
ABSTRACT
We describe a recommender system in the domain of grocery shopping. While recommender systems have been widely studied, most of this work concerns leisure products (e.g. movies, books and music) with non-repeated purchases. In grocery shopping, however, consumers make multiple purchases of the same or very similar products more frequently than they buy entirely new items. The proposed recommendation scheme offers several advantages in addressing the grocery shopping problem, namely: 1) a product similarity measure that suits a domain where no rating information is available; 2) a basket-sensitive random walk model to approximate product similarities by exploiting incomplete neighborhood information; 3) online adaptation of the recommendation based on the current basket; and 4) a new performance measure focusing on products that customers have not purchased before or purchase infrequently. Empirical results from benchmarking on three real-world data sets demonstrate a performance improvement of the proposed method over existing collaborative filtering models.
1. INTRODUCTION
Grocery shopping is most often considered real drudgery, especially by busy families. To reduce this negative feeling, many brick-and-mortar grocery stores have set up online shopping websites and employed recommender systems to support consumers while they shop. For example, these recommender systems sometimes display a list of forgotten items or a list of new but relevant products.
At the heart of the recommender system is a personalisation algorithm. These algorithms model consumer shopping behaviour and are used to automatically identify items that are new to the individual consumer, but are likely to be of interest to them. This is a valuable function for many E-Commerce websites, such as book sales on amazon.com, DVD rental on netflix.com and online grocery shopping on leshop.ch. Generally, a rank-ordered list of products not currently in the basket is generated from the items present in the individual basket, and it is expected that the shopper is more likely to accept the recommendations ranked at the top of the list.
Collaborative Filtering (CF) has been demonstrated to be an effective framework for generating recommendations [2, 15, 11, 13, 19]. It identifies the potential preference of a consumer for a new product based solely on information collected from other consumers with similar items in their baskets. Thus, compared to the content-based filtering framework [1], there is no need to apply more complicated (and sometimes less reliable) content analysis techniques. In general, the development of a CF-based recommender system involves three components: 1) a method to represent user-user or product-product similarities, 2) a method to combine the similarities in order to generate a list of recommendations (i.e. items not yet purchased by the consumers, but of potential interest to them), and 3) an evaluation strategy to regulate the model via retrospective data for optimal performance. The relative performance of the recommendations depends on the sparsity of the data, which is a major issue challenging the usefulness of CF-based techniques. By sparsity, we refer to the prevalence of null entries in transactional or feedback data, which renders them insufficient for deriving reliable product affinities.
While recommender systems have become a popular research topic in recent years, most published techniques were originally developed for one-off leisure products (e.g. movies, books and music) with non-repeated purchases. Furthermore, many existing recommendation techniques are based on user ratings, which require the degree of preference to be explicitly represented on a discrete numerical scale ranging from the lowest (most disliked) to the highest (most favored) value. In grocery shopping, however, people tend to make repeated purchases (e.g. dairy and vegetables during weekly shopping) and buy novel items less frequently. In addition, product preference is conveyed implicitly in the transaction data instead of being expressed explicitly as ratings.
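To make the notion of sparsity in implicit transaction data concrete, the following minimal sketch measures the fraction of null entries in a binary basket-by-item matrix and counts direct co-purchases between item pairs. The function names `sparsity` and `co_occurrence` and the toy baskets are assumptions made for illustration only; they are not taken from the paper and do not represent its similarity measure.

```python
from collections import Counter
from itertools import combinations


def sparsity(baskets, n_items):
    """Fraction of null entries in the binary basket-by-item matrix."""
    filled = sum(len(set(b)) for b in baskets)
    return 1.0 - filled / (len(baskets) * n_items)


def co_occurrence(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for b in baskets:
        # Sort so that each pair has one canonical key, e.g. ("bread", "milk").
        for pair in combinations(sorted(set(b)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: four baskets over five distinct items.
baskets = [["milk", "bread"], ["milk", "eggs"], ["bread", "butter"], ["tea"]]
items = sorted({i for b in baskets for i in b})

print(sparsity(baskets, len(items)))   # fraction of empty cells
print(len(co_occurrence(baskets)))     # item pairs with direct evidence
```

In this toy example only 3 of the 10 possible item pairs are ever bought together, so a similarity based on direct co-purchase alone is undefined or zero for most pairs. This is the kind of gap that motivates using indirect (neighborhood) information.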
This paper is motivated by the application of recommender systems to the grocery shopping domain. We propose a new algorithm under the collaborative filtering framework, which is better suited to the characteristics of grocery shopping data.
There are four novel aspects in the proposed algorithm. Firstly, we propose a new product similarity measure based on implicit user preference, which uses a bipartite network to represent the shopping data. Secondly, we propose overcoming the data sparsity problem by using a basket-sensitive random walk model that can derive product similarities by exploiting partial or incomplete preference information from their neighborhood (i.e. the other products bought together with the two under consideration). Thirdly, by making the random walk model sensitive to the current basket contents, we define a framework for making personalised recommendations. Finally, we propose a new performance measure for the recommendations, which weights each correct prediction in inverse proportion to the overall prevalence of the item. We refer to this new performance measure as the weighted hit rate, which extends the principal performance measure that we have employed thus far (i.e. the popularity-based binary hit rate described in [17]).
The outline of this paper is as follows. Section 2 reviews the collaborative filtering and random walk algorithms in the context of recommendations. Sections 3 and 4 describe our proposed method and evaluation measures respectively. The experimental results are described in Section 5. Finally, we present our conclusions and future work in Section 6.
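As a rough illustration of the weighted hit rate idea introduced above, the sketch below contrasts a binary hit rate with one in which each correct prediction is weighted by the inverse of the item's prevalence. The exact definition is given in Section 4 of the paper; the normalisation used here (dividing by the total weight of all hidden items), the function name `hit_rates`, and the toy data are assumptions for illustration only.

```python
def hit_rates(trials, prevalence):
    """Return (binary_hit_rate, weighted_hit_rate).

    trials: list of (hidden_item, recommended_items) pairs, e.g. from a
        leave-one-out test in which one item is removed from each basket.
    prevalence: maps item -> fraction of baskets containing it (assumed > 0).
    """
    hits = 0
    weighted_hits = 0.0
    total_weight = 0.0
    for hidden, recs in trials:
        # Assumed weighting: rare items count more than common ones.
        w = 1.0 / prevalence[hidden]
        total_weight += w
        if hidden in recs:
            hits += 1
            weighted_hits += w
    return hits / len(trials), weighted_hits / total_weight


# Hypothetical data: a common item is recovered, a rare one is missed.
prevalence = {"milk": 0.5, "saffron": 0.05}
trials = [("milk", ["milk", "bread"]), ("saffron", ["eggs", "tea"])]
binary, weighted = hit_rates(trials, prevalence)
print(binary, weighted)
```

Under these assumptions the binary hit rate is 0.5, while the weighted score is much lower because the only hit was on a highly prevalent item. This reflects the stated aim of rewarding correct predictions of infrequently purchased products.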