Method and apparatus for partitioning a plurality of items into groups of similar items in a recommender of such items
First Claim
1. A method for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, said method comprising the steps of:
- partitioning said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
identifying at least one mean item for each of said k clusters; and
assigning each of said plurality of items to one of said clusters based on a distance metric.
3 Assignments
0 Petitions
Accused Products
Abstract
A method and apparatus are disclosed for recommending items of interest to a user, such as television program recommendations, before a viewing history or purchase history of the user is available. A third party viewing or purchase history is processed to generate stereotype profiles that reflect the typical patterns of items selected by representative viewers. A user can select the most relevant stereotype(s) from the generated stereotype profiles and thereby initialize his or her profile with the items that are closest to his or her own interests. A clustering routine partitions the third party viewing or purchase history (the data set) into clusters using a k-means clustering algorithm, such that points (e.g., television programs) in one cluster are closer to the mean of that cluster than any other cluster. The value of k is incremented until (i) further incrementing of k does not yield any improvement in the classification accuracy, (ii) a predefined performance threshold is reached, or (iii) an empty cluster is detected.
22 Citations
30 Claims
-
1. A method for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, said method comprising the steps of:
-
partitioning said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
identifying at least one mean item for each of said k clusters; and
assigning each of said plurality of items to one of said clusters based on a distance metric. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
computing a variance for each of said items; and
selecting said at least one item that minimizes said variance as the mean symbolic value.
-
-
11. The method of claim 1, wherein said step of identifying at least one mean item for each of said k clusters, J, further comprises the steps of:
-
computing a variance of each of said clusters, J, for each of said possible symbolic values, xμ
, for each of said symbolic attributes; and
selecting for each of said symbolic attributes at least one symbolic value, xμ
, that minimizes said variance as the mean symbolic value.
-
-
12. The method of claim 1, wherein said mean is comprised of a plurality of items and wherein said distance metric for a given item in said third party selection history is based on a distance between said given item and each item comprising said mean.
-
13. A method for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, said method comprising the steps of:
-
partitioning said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
identifying at least one mean item for each of said k clusters;
assigning each of said plurality of items to one of said clusters based on a distance metric; and
incrementing said value of k until a predefined condition is satisfied. - View Dependent Claims (14, 15, 16, 17)
-
-
18. A system for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, said system comprising:
-
a memory for storing computer readable code; and
a processor operatively coupled to said memory, said processor configured to;
partition said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
identify at least one mean item for each of said k clusters; and
assign each of said plurality of items to one of said clusters based on a distance metric. - View Dependent Claims (19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29)
computing a variance for each of said items; and
selecting said at least one item that minimizes said variance as the mean symbolic value.
-
-
28. The system of claim 18, wherein said processor identifies said at least one mean item for each of said k clusters by:
-
computing a variance of each of said clusters, J, for each of said possible symbolic values, xμ
, for each of said symbolic attributes; and
selecting for each of said symbolic attributes at least one symbolic value, xμ
, that minimizes said variance as the mean symbolic value.
-
-
29. The system of claim 18, wherein said mean is comprised of a plurality of items and wherein said distance metric for a given item in said third party selection history is based on a distance between said given item and each item comprising said mean.
-
30. An article of manufacture for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, comprising:
-
a computer readable medium having computer readable code means embodied thereon, said computer readable program code means comprising;
a step to partition said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
a step to identify at least one mean item for each of said k clusters; and
a step to assign each of said plurality of items to one of said clusters based on a distance metric.
-
Specification