
Scalable user clustering based on set similarity

  • US 7,739,314 B2
  • Filed: 08/15/2005
  • Issued: 06/15/2010
  • Est. Priority Date: 08/15/2005
  • Status: Active Grant
First Claim

1. A computer program product, encoded on a machine-readable storage device, comprising instructions that when executed by a processor cause a data processing apparatus to:

  • obtain a respective interest set for each of multiple users, each interest set being a set of elements, each element representing a respective item in which the respective user has expressed interest through interaction with a data processing system;

  • for each of the multiple users, apply an i-th hash function to each element of the interest set of the user to obtain a respective function value corresponding to the respective element, for each integer i between 1 and k, the k hash functions being distinct each from the others, and determine, from the function values obtained from the k hash functions, k hash values of the respective interest set, wherein the i-th hash value of the respective interest set is a minimum value among the function values obtained by applying the i-th hash function to the elements of interest set of the user, and where k is an integer greater than or equal to 1; and

  • assign each of the multiple users to each of k clusters, the i-th cluster being represented by the i-th hash value of the respective interest set of the respective user, wherein the assignment of each of the multiple users to k clusters is done without regard to the assignment of any of the other users to k clusters.
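
The three steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `hash_i`, `min_hashes`, and `assign_clusters` are hypothetical, and salting SHA-256 with the index i is only one assumed way to obtain k distinct hash functions. Items are assumed to be represented as strings.

```python
import hashlib


def hash_i(i: int, element: str) -> int:
    """Hypothetical i-th hash function: SHA-256 salted with the index i."""
    digest = hashlib.sha256(f"{i}:{element}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def min_hashes(interest_set: set[str], k: int) -> list[int]:
    """Return k hash values of the set; the i-th is the minimum of
    hash_i over all elements, for i = 1..k."""
    if not interest_set:
        raise ValueError("interest set must not be empty")
    return [min(hash_i(i, e) for e in interest_set) for i in range(1, k + 1)]


def assign_clusters(interest_sets: dict[str, set[str]], k: int) -> dict[tuple[int, int], set[str]]:
    """Assign every user to k clusters keyed by (i, i-th hash value).

    Each user's assignment depends only on that user's own interest set,
    so users could be processed independently or in parallel.
    """
    clusters: dict[tuple[int, int], set[str]] = {}
    for user, items in interest_sets.items():
        for i, value in enumerate(min_hashes(items, k), start=1):
            clusters.setdefault((i, value), set()).add(user)
    return clusters


if __name__ == "__main__":
    # Made-up example data: two users with identical interests, one without overlap.
    users = {
        "alice": {"item1", "item2", "item3"},
        "bob": {"item1", "item2", "item3"},
        "carol": {"item8", "item9"},
    }
    for key, members in assign_clusters(users, k=3).items():
        print(key, sorted(members))
```

In this sketch, users with identical interest sets land in the same k clusters, and users with overlapping sets share a cluster for index i whenever the element that minimizes the i-th hash function is common to both sets.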
