Cluster-based scalable collaborative filtering

US 8,738,467 B2
Filed: 03/16/2006
Issued: 05/27/2014
Est. Priority Date: 03/16/2006
Status: Active Grant

Abstract

Methods for determining a predictive rating are disclosed. In an embodiment, an active user is compared to a set of clusters. One or more of the clusters are determined to be most similar to the active user. From the one or more clusters, K users are determined to be most similar to the active user. Prior ratings for an item by the K users may be used to predict a rating for the item for the active user.

20 Claims

1. A computer-implemented method of providing a predictive recommendation for a first item to an active user based at least in part on ratings of the first item, the method comprising:
- selecting, by a computing device, from a set of N clusters comprising users, at least one cluster that is similar to the active user based at least in part on ratings of multiple items made by the users in the N clusters;
- from the at least one cluster, determining, by the computing device, based at least in part on ratings of the multiple items that have been rated by users in the at least one cluster, similarity values for users in the at least one cluster;
- identifying, by the computing device, based at least in part on the similarity values, K users that are similar to the active user, wherein each of the K users have provided a rating for the first item; and
- providing, by the computing device, the predictive recommendation for the first item to the active user based at least in part on determining;
  - an average rating of the active user for the multiple items;
  - an average rating of each of the K users for items that include at least a subset of the multiple items;
  - for each of the K users, a difference between the rating for the first item and the average rating for the items that include at least a subset of the multiple items; and
  - an addition that includes at least the average rating of the active user and the difference to form the predictive recommendation for the first item.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1, the providing further comprising:
    - determining a summation of the difference multiplied by an associated similarity value for each of the K users;
    - determining a division of the summation divided by a summation of the associated similarity value for each of the K users; and
    - determining the addition by further using an addition of the division and the average rating of the active user to form the predictive recommendation for the first item.
  - 3. The method of claim 1, further comprising:
    - sorting the users into the set of N clusters in advance of the active user requesting information.
  - 4. The method of claim 1, wherein the predictive recommendation is provided in response to an input provided by the active user.
  - 5. The method of claim 4, wherein the input is a selection of the first item.
  - 6. The method of claim 1, wherein the at least one cluster comprises at least 30% of a total number of the users in the N clusters.
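
The arithmetic recited in claims 1 and 2 (the active user's average rating plus a similarity-weighted average of each neighbor's deviation from that neighbor's own average) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `predict_rating`, the tuple layout, and the fallback to the active user's average when the similarity sum is zero are all assumptions made for this example.

```python
from typing import Sequence, Tuple

# Each neighbor is (similarity, rating_for_first_item, neighbor_average_rating).
# This tuple layout is hypothetical and chosen only for illustration.
Neighbor = Tuple[float, float, float]


def predict_rating(active_mean: float, neighbors: Sequence[Neighbor]) -> float:
    """Predict a rating as the active user's average plus a weighted deviation."""
    # Summation of (difference * similarity) over the K users (claim 2).
    weighted_diff = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    # Summation of the similarity values over the K users (claim 2).
    sim_total = sum(sim for sim, _, _ in neighbors)
    if sim_total == 0:
        # Assumption: with no usable neighbors, fall back to the active user's average.
        return active_mean
    # Addition of the division and the active user's average rating.
    return active_mean + weighted_diff / sim_total


if __name__ == "__main__":
    neighbors = [(1.0, 5.0, 4.0), (0.5, 2.0, 3.0)]
    print(predict_rating(3.0, neighbors))
```

Working with deviations rather than raw ratings means that a neighbor who habitually rates high or low shifts the prediction only by how far the first item departs from that neighbor's usual rating.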

7. A computer readable storage memory comprising computer executable instructions that, when executed by a processor, cause the processor to perform acts comprising:
- sorting a database of users into N clusters based at least in part on ratings for items made by the users;
- in response to receiving an input by an active user, determining a subset of the N clusters that are similar to the active user based at least in part on an average deviation of ratings for items made by the active user and users in the subset of the N clusters;
- determining a similarity value for each of the users in the subset of the N clusters based at least in part on the ratings of the items;
- selecting, from those users in the subset of the N clusters that have provided a rating for a first item that is not included in the items, K users in the subset of the N clusters that are closest to the active user based at least in part on the similarity value determined for each of the K users; and
- providing a predictive rating for the first item to the active user based at least in part on determining;
  - an average rating of the active user for the items;
  - for each associated user of the K users, a difference between a rating made by the associated user for the first item and an average rating of the associated user for multiple items that include at least a subset of the items; and
  - a summation of the difference for each associated user of the K users added to the average rating of the active user for the items to form the predictive rating for the first item.
- Dependent claims (8, 9, 10, 11, 12, 13, 14, 15):
  - 8. The computer readable storage memory of claim 7, wherein the sorting comprises:
    - using a K-means algorithm to sort the users into N clusters.
  - 9. The computer readable storage memory of claim 7, wherein the determining the subset of the N clusters comprises:
    - selecting a number of clusters so as to include at least 30 percent of the total users in the subset of N clusters.
  - 10. The computer readable storage memory of claim 7, wherein the determining the subset of the N clusters comprises:
    - determining a set of ratings associated with the active user;
    - determining the similarity of each of the N clusters to the active user; and
    - selecting a subset of the N clusters that are the most similar to the active user.
  - 11. The computer readable storage memory of claim 10, wherein the determining the similarity of each of the N clusters to the active user comprises:
    - comparing the set of ratings associated with the active user to a centroid of each cluster.
  - 12. The computer readable storage memory of claim 7, wherein the predictive rating is a recommendation to purchase the first item.
  - 13. The computer readable storage memory of claim 12, wherein the input is a request for information regarding a second item related to the first item.
  - 14. The computer readable storage memory of claim 7, wherein the determining similarity values used to select K users comprises:
    - determining a set of ratings associated with the active user; and
    - determining the similarity between the set of ratings associated with the active user and the ratings associated with each of the users in the subset of N clusters.
  - 15. The computer readable storage memory of claim 7, wherein the providing the predictive rating is further based at least in part on determining:
    - a multiplication of the difference for each associated user of the K users by the corresponding similarity value for the associated user of the K users; and
    - a summation of the similarity value for each user of the K users.
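
The cluster-selection step in claims 9 through 11 (compare the active user's ratings to each cluster centroid, then take the most similar clusters until at least 30 percent of all users are covered) could look roughly like the sketch below. The claims do not define the similarity measure, so the negative mean absolute difference used here is an assumption, as are the names `centroid_similarity` and `select_clusters` and the dictionary-based data layout. Clusters are assumed to have been built beforehand, for example with K-means as in claim 8.

```python
from typing import Dict, List

Ratings = Dict[str, float]  # item id -> rating (hypothetical layout)


def centroid_similarity(active: Ratings, centroid: Ratings) -> float:
    """Assumed measure: negative mean absolute difference over shared items."""
    shared = [item for item in active if item in centroid]
    if not shared:
        return float("-inf")  # nothing to compare, so rank this cluster last
    return -sum(abs(active[i] - centroid[i]) for i in shared) / len(shared)


def select_clusters(
    active: Ratings,
    centroids: Dict[str, Ratings],
    sizes: Dict[str, int],
    min_fraction: float = 0.30,
) -> List[str]:
    """Pick the most similar clusters until they cover min_fraction of all users."""
    total_users = sum(sizes.values())
    ranked = sorted(
        centroids,
        key=lambda cid: centroid_similarity(active, centroids[cid]),
        reverse=True,
    )
    chosen: List[str] = []
    covered = 0
    for cid in ranked:
        chosen.append(cid)
        covered += sizes[cid]
        if covered >= min_fraction * total_users:
            break
    return chosen


if __name__ == "__main__":
    active = {"a": 5.0, "b": 1.0}
    centroids = {
        "c1": {"a": 4.5, "b": 1.5},
        "c2": {"a": 1.0, "b": 5.0},
        "c3": {"a": 4.0, "b": 2.0},
    }
    sizes = {"c1": 20, "c2": 50, "c3": 30}
    print(select_clusters(active, centroids, sizes))
```

Comparing against one centroid per cluster instead of every user is what keeps this step cheap; only users inside the chosen clusters need individual similarity values afterwards.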

16. A computer-implemented method of providing a predictive rating to an active user, the method comprising:
- receiving, by a computing device, a request from the active user, the request associated with a first item;
- selecting, by the computing device, from a set of clusters of users, at least one cluster that is most similar to the active user based at least in part on an average deviation of ratings for items made by the active user and users in the at least one cluster;
- determining, by the computing device, a similarity value for each user in the at least one cluster based at least in part on the ratings for the items, the determining similarity values comprising;
  - calculating a difference, for each corresponding user of the at least one cluster, between a rating for each of the items that have been rated by the corresponding user, and an average rating of ratings made by the corresponding user for multiple items that include at least a subset of the items and one or more other items that differ from each of the items; and
  - calculating a difference, for the active user, between a rating for each of the items rated by the active user and an average rating for the items rated by the active user;
- determining, by the computing device, from those users in the at least one cluster that have provided a rating for the first item, K users that are most similar to the active user based at least in part on the similarity value determined for each of the K users;
- determining, by the computing device, a predictive rating for the first item based at least in part on ratings of the first item made by the K users, wherein each K user that is more similar to the active user relative to the remaining K users will have a greater influence on the predictive rating; and
- providing, by the computing device, the predictive rating to the active user.
- Dependent claims (17, 18, 19, 20):
  - 17. The method of claim 16, wherein the request received from the active user comprises a request for search results for a class of product.
  - 18. The method of claim 16, wherein the selecting the at least one cluster comprises:
    - determining ratings associated with the active user for each of the items;
    - determining a similarity between the ratings associated with the active user and a centroid of each cluster in the set of clusters; and
    - selecting the at least one cluster that has the centroid that is most similar to the ratings associated with the active user.
  - 19. The method of claim 18, wherein the centroid for each cluster includes an average rating value for each of the items.
  - 20. The method of claim 16, wherein the request is a purchase of a second item.
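
Claim 16 builds each similarity value from per-item differences between a user's rating and that user's average rating, then keeps the K most similar users who have rated the first item. One common way to combine such mean-centered differences is the Pearson correlation. The claim does not name that formula, so its use here is an assumption, as are the helper names `pearson_similarity` and `top_k_neighbors` and the dictionary layout.

```python
import math
from typing import Dict, List, Tuple

Ratings = Dict[str, float]  # item id -> rating (hypothetical layout)


def mean_rating(ratings: Ratings) -> float:
    """Average over every item the user has rated."""
    return sum(ratings.values()) / len(ratings)


def pearson_similarity(active: Ratings, other: Ratings) -> float:
    """Correlate mean-centered ratings over the items both users rated."""
    shared = [item for item in active if item in other]
    if not shared:
        return 0.0
    active_mean, other_mean = mean_rating(active), mean_rating(other)
    # Differences between each rating and the rater's own average (claim 16).
    num = sum((active[i] - active_mean) * (other[i] - other_mean) for i in shared)
    den = math.sqrt(sum((active[i] - active_mean) ** 2 for i in shared)) * math.sqrt(
        sum((other[i] - other_mean) ** 2 for i in shared)
    )
    return num / den if den else 0.0


def top_k_neighbors(
    active: Ratings, candidates: Dict[str, Ratings], first_item: str, k: int
) -> List[Tuple[str, float]]:
    """Keep only users who rated first_item, then return the k most similar."""
    scored = [
        (user_id, pearson_similarity(active, ratings))
        for user_id, ratings in candidates.items()
        if first_item in ratings
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    active = {"a": 5.0, "b": 3.0, "c": 1.0}
    candidates = {
        "u1": {"a": 4.0, "b": 3.0, "c": 2.0, "x": 3.0},
        "u2": {"a": 1.0, "b": 3.0, "c": 5.0, "x": 3.0},
        "u3": {"a": 5.0, "b": 3.0, "c": 1.0},  # has not rated "x"
    }
    print(top_k_neighbors(active, candidates, "x", k=1))
```

Each candidate's average is taken over all of that candidate's ratings, including items the active user has not rated, which mirrors the claim's reference to "one or more other items that differ from each of the items".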

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Lin, Chenxi; Xue, Gui-Rong; Zeng, Hua-Jun; Chen, Zheng; Zhang, Benyu; Wang, Jian
Primary Examiner(s): Garg, Yogesh C
Application Number: US11/377,480
Publication Number: US 20070239554A1
Time in Patent Office: 2,994 Days
Field of Search: 705/10, 705/14, 705/26, 705/26.1, 705/14.1, 705/14.53, 705/14.14, 705/14.37, 705/14.39, 705/14.66, 705/319, 705/347, 705/26.7
US Class Current: 705/26.7
CPC Class Codes:
G06F 16/9535   Search customisation based ...
G06Q 30/0212   Chance discounts or incentives
G06Q 30/0601   Electronic shopping [e-shop...
G06Q 30/0631   Item recommendations
