Method and apparatus for partitioning a plurality of items into groups of similar items in a recommender of such items

US 6,801,917 B2
Filed: 11/13/2001
Issued: 10/05/2004
Est. Priority Date: 11/13/2001
Status: Expired due to Term

Abstract

A method and apparatus are disclosed for recommending items of interest to a user, such as television program recommendations, before a viewing history or purchase history of the user is available. A third party viewing or purchase history is processed to generate stereotype profiles that reflect the typical patterns of items selected by representative viewers. A user can select the most relevant stereotype(s) from the generated stereotype profiles and thereby initialize his or her profile with the items that are closest to his or her own interests. A clustering routine partitions the third party viewing or purchase history (the data set) into clusters using a k-means clustering algorithm, such that points (e.g., television programs) in one cluster are closer to the mean of that cluster than any other cluster. The value of k is incremented until (i) further incrementing of k does not yield any improvement in the classification accuracy, (ii) a predefined performance threshold is reached, or (iii) an empty cluster is detected.
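The clustering routine summarized in the abstract can be illustrated for a fixed k with a minimal sketch. It is not the patented implementation: the `Item` tuple layout, the `overlap_distance` metric (a simple count of mismatched attributes), the choice of the mean as the cluster member with the smallest summed squared distance, and all function names are assumptions made for illustration only.

```python
import random
from typing import Callable, Sequence

# Assumed layout: one symbolic value per attribute, e.g. (genre, channel).
Item = tuple[str, ...]


def overlap_distance(a: Item, b: Item) -> float:
    """Count the attributes on which two items disagree (assumed stand-in metric)."""
    return float(sum(x != y for x, y in zip(a, b)))


def mean_item(cluster: Sequence[Item], dist: Callable[[Item, Item], float]) -> Item:
    """Pick the member whose summed squared distance to the cluster is smallest."""
    return min(cluster, key=lambda c: sum(dist(c, x) ** 2 for x in cluster))


def partition(items, k, dist=overlap_distance, max_iter=20, seed=0):
    """Alternate between assigning items to the nearest mean and updating means."""
    rng = random.Random(seed)
    means = rng.sample(list(items), k)  # initial means drawn from the data set
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for item in items:
            nearest = min(range(k), key=lambda j: dist(item, means[j]))
            clusters[nearest].append(item)
        # An empty cluster keeps its previous mean so the loop can continue.
        new_means = [mean_item(c, dist) if c else means[j]
                     for j, c in enumerate(clusters)]
        if new_means == means:  # assignments can no longer change
            break
        means = new_means
    return means, clusters
```

Calling `partition(history, 3)` on a list of item tuples returns three mean items and the three groups of items assigned to them; any other distance function with the same two-argument shape can be passed as `dist`.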

30 Claims

1. A method for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, said method comprising the steps of:
- partitioning said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
  
  identifying at least one mean item for each of said k clusters; and
  
  assigning each of said plurality of items to one of said clusters based on a distance metric.
2. The method of claim 1, further comprising the step of incrementing said value of k until a further increment of k does not improve a classification accuracy.
3. The method of claim 1, further comprising the step of incrementing said value of k until a predefined performance threshold is reached.
4. The method of claim 1, further comprising the step of incrementing said value of k until an empty cluster is detected.
5. The method of claim 1, further comprising the step of assigning a label to each of said clusters.
6. The method of claim 5, further comprising the step of receiving a user selection of at least one cluster based on said assigned labels.
7. The method of claim 1, wherein said partitioning step further comprises the step of employing a k-means clustering routine.
8. The method of claim 1, wherein said distance metric is based on a distance between corresponding symbolic feature values of two items based on an overall similarity of classification of all instances for each possible value of said symbolic feature values.
9. The method of claim 8, wherein said distance between symbolic features is computed using a Value Difference Metric (VDM) technique.
10. The method of claim 1, wherein said step of identifying at least one mean item for each of said k clusters further comprises the steps of:
11. The method of claim 1, wherein said step of identifying at least one mean item for each of said k clusters, J, further comprises the steps of:
- computing a variance of each of said clusters, J, for each of said possible symbolic values, x_μ, for each of said symbolic attributes; and
  
  selecting for each of said symbolic attributes at least one symbolic value, x_μ, that minimizes said variance as the mean symbolic value.
12. The method of claim 1, wherein said mean is comprised of a plurality of items and wherein said distance metric for a given item in said third party selection history is based on a distance between said given item and each item comprising said mean.
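Claims 8, 9 and 11 can be read together as a distance over symbolic values plus a rule for picking a mean symbolic value. The sketch below assumes the simplified form of the Value Difference Metric, delta(v1, v2) = sum over classes of |P(c|v1) - P(c|v2)|^r, and assumes the cluster variance for a candidate x_μ is the sum of squared VDM distances from the cluster's values to x_μ. The claims do not spell out either formula, and the function names and the class labels in the usage note are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def vdm_table(values: Sequence[Hashable], classes: Sequence[Hashable]) -> dict:
    """Estimate P(class | value) for one symbolic attribute from paired observations."""
    totals = Counter(values)
    joint = Counter(zip(values, classes))
    class_order = sorted(set(classes), key=str)  # fixed order so vectors line up
    return {v: [joint[(v, c)] / totals[v] for c in class_order] for v in totals}


def vdm(table: dict, v1: Hashable, v2: Hashable, r: int = 1) -> float:
    """Assumed simplified VDM: sum over classes of |P(c|v1) - P(c|v2)| ** r."""
    return sum(abs(p - q) ** r for p, q in zip(table[v1], table[v2]))


def mean_value(cluster_values: Sequence[Hashable], table: dict) -> Hashable:
    """Return the candidate x_mu with the smallest assumed variance over the cluster."""
    def variance(mu: Hashable) -> float:
        return sum(vdm(table, x, mu) ** 2 for x in cluster_values)

    # Every possible value of the attribute is a candidate, not only those
    # present in the cluster; ties resolve to the first value in sorted order.
    return min(sorted(table, key=str), key=variance)
```

For example, with genre values observed alongside hypothetical `"watched"` / `"not_watched"` labels, `vdm_table` is built once per attribute, `vdm` compares two genres by how similarly they split across the labels, and `mean_value` is applied to the genre values of one cluster. An item-level distance could then be formed by summing `vdm` over all attributes, which is also an assumption rather than something the claims state.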

13. A method for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, said method comprising the steps of:
- partitioning said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
  
  identifying at least one mean item for each of said k clusters;
  
  assigning each of said plurality of items to one of said clusters based on a distance metric; and
  
  incrementing said value of k until a predefined condition is satisfied.
14. The method of claim 13, wherein said predefined condition is a further increment of k does not improve a classification accuracy.
15. The method of claim 13, wherein said predefined condition is a predefined performance threshold is reached.
16. The method of claim 13, wherein said predefined condition is detection of an empty cluster.
17. The method of claim 13, wherein said mean is comprised of a plurality of items and wherein said distance metric for a given item in said third party selection history is based on a distance between said given item and each item comprising said mean.
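The three stopping conditions of claims 14 to 16 can be sketched as a single loop over k. This is an illustration under stated assumptions, not the claimed method itself: `choose_k`, the `cluster` and `accuracy` callables, and the `k_start`, `k_max` and `threshold` parameters are all hypothetical names and defaults.

```python
from typing import Callable, Sequence


def choose_k(
    cluster: Callable[[int], Sequence[Sequence]],        # k -> list of k clusters
    accuracy: Callable[[Sequence[Sequence]], float],     # clusters -> score
    k_start: int = 2,
    k_max: int = 50,
    threshold: float = 0.95,
) -> int:
    """Increment k until one of the three predefined conditions is satisfied."""
    best_k = k_start
    best_acc = accuracy(cluster(k_start))
    for k in range(k_start + 1, k_max + 1):
        if best_acc >= threshold:
            break  # predefined performance threshold reached (claim 15)
        clusters = cluster(k)
        if any(len(c) == 0 for c in clusters):
            break  # empty cluster detected (claim 16)
        acc = accuracy(clusters)
        if acc <= best_acc:
            break  # a further increment of k gave no improvement (claim 14)
        best_k, best_acc = k, acc
    return best_k
```

Here `cluster` would wrap whatever partitioning routine is in use and `accuracy` whatever classification-accuracy measure is chosen; the function returns the last value of k that was accepted before a condition fired. The `k_max` cap is an added safeguard that the claims do not mention.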

18. A system for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, said system comprising:
- a memory for storing computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  partition said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
  
  identify at least one mean item for each of said k clusters; and
  
  assign each of said plurality of items to one of said clusters based on a distance metric.
19. The system of claim 18, wherein said processor is further configured to increment said value of k until a further increment of k does not improve a classification accuracy.
20. The system of claim 18, wherein said processor is further configured to increment said value of k until a predefined performance threshold is reached.
21. The system of claim 18, wherein said processor is further configured to increment said value of k until an empty cluster is detected.
22. The system of claim 18, wherein said processor is further configured to assign a label to each of said clusters.
23. The system of claim 22, wherein said processor is further configured to receive a user selection of at least one cluster based on said assigned labels.
24. The system of claim 18, wherein said processor performs said partitioning using a k-means clustering routine.
25. The system of claim 18, wherein said distance metric is based on a distance between corresponding symbolic feature values of two items based on an overall similarity of classification of all instances for each possible value of said symbolic feature values.
26. The system of claim 25, wherein said distance between symbolic features is computed using a Value Difference Metric (VDM) technique.
27. The system of claim 18, wherein said processor identifies said at least one mean item for each of said k clusters by:
28. The system of claim 18, wherein said processor identifies said at least one mean item for each of said k clusters by:
- computing a variance of each of said clusters, J, for each of said possible symbolic values, x_μ, for each of said symbolic attributes; and
  
  selecting for each of said symbolic attributes at least one symbolic value, x_μ, that minimizes said variance as the mean symbolic value.
29. The system of claim 18, wherein said mean is comprised of a plurality of items and wherein said distance metric for a given item in said third party selection history is based on a distance between said given item and each item comprising said mean.

30. An article of manufacture for partitioning a plurality of items into groups of similar items, said plurality of items corresponding to a selection history by at least one third party, comprising:
- a computer readable medium having computer readable code means embodied thereon, said computer readable program code means comprising:
  
  a step to partition said third party selection history into k clusters of said plurality of items, said plurality of items including at least one of programs, content and products;
  
  a step to identify at least one mean item for each of said k clusters; and
  
  a step to assign each of said plurality of items to one of said clusters based on a distance metric.

Current Assignee: Sisvel S.p.A. (Fineur International S.A.)
Original Assignee: Koninklijke Philips Electronics N.V. (Koninklijke Philips N.V.)
Inventors: Kurapati, Kaushal; Gutta, Srinivas
Primary Examiner(s): Amsbury, Wayne
Assistant Examiner(s): Nguyen, Cindy
Application Number: US10/014,216
Publication Number: US 20030097353A1
Time in Patent Office: 1,057 Days
Field of Search: 707/1, 707/102
US Class Current: 1/1

CPC Class Codes

G06F 18/23   Clustering techniques
H04N 21/252   Processing of multiple end-...
H04N 21/4532   involving end-user characte...
H04N 21/454   Content or additional data ...
H04N 21/466   Learning process for intell...
H04N 21/4662   characterized by learning a...
H04N 21/4668   for recommending content, e...
H04N 7/163   by receiver means only
Y10S 707/99931   Database or file accessing
Y10S 707/99943   Generating database or data...
