System and method for clustering data objects in a collection
11 Assignments
0 Petitions
Abstract
A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
444 Citations
18 Claims
1. A method for selecting a set of initial cluster centers in wavefront clustering a collection of objects, each object being represented by a set of multi-modal feature vectors, comprising the steps of:
selecting a first number of first objects from the collection;
computing a vector centroid of the first objects using the set of multi-modal feature vectors associated with each object;
selecting a second number of second objects from the collection;
identifying a second number of initial cluster centers between the centroid and the second objects; and
wavefront clustering the collection of objects using the second number of initial cluster centers.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
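The center-selection steps of claim 1 can be illustrated with a minimal Python sketch. Several details here are assumptions, not taken from the claim text: each object is modeled as a dictionary mapping a modality name to a feature vector, both selections are made by uniform random sampling (claims 10 and 16 recite random selection, claim 1 does not), and "between the centroid and the second objects" is read as linear interpolation controlled by a hypothetical parameter `alpha`. The names `FeatureSet`, `centroid`, and `initial_centers` are likewise illustrative only.

```python
import random
from typing import Dict, List, Optional

# Assumption: an object is a mapping from modality name to feature vector.
FeatureSet = Dict[str, List[float]]


def centroid(objects: List[FeatureSet]) -> FeatureSet:
    """Return the per-modality mean of the objects' feature vectors."""
    result: FeatureSet = {}
    for modality in objects[0]:
        dim = len(objects[0][modality])
        result[modality] = [
            sum(obj[modality][i] for obj in objects) / len(objects)
            for i in range(dim)
        ]
    return result


def initial_centers(
    collection: List[FeatureSet],
    first_number: int,
    second_number: int,
    alpha: float = 0.5,  # hypothetical: 0 gives the centroid, 1 gives the object
    rng: Optional[random.Random] = None,
) -> List[FeatureSet]:
    """Pick `second_number` centers lying between a sample centroid and sampled objects."""
    rng = rng or random.Random()
    # Select a first number of objects and compute their centroid.
    first_objects = rng.sample(collection, first_number)
    center = centroid(first_objects)
    # Select a second number of objects, one per desired cluster center.
    second_objects = rng.sample(collection, second_number)
    # Place each initial center on the segment from the centroid to its object.
    return [
        {
            m: [(1 - alpha) * c + alpha * x for c, x in zip(center[m], obj[m])]
            for m in center
        }
        for obj in second_objects
    ]
```

The returned list would then seed whatever clustering step follows. The position along each segment is left as a parameter because the claim says only that the centers lie between the centroid and the selected objects.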
10. A computer readable medium storing instructions for wavefront clustering a collection of objects, each object being represented by a set of multi-modal feature vectors, comprising the instructions for:
randomly selecting a first number of first objects from the collection;
computing a vector centroid of the first objects using the set of multi-modal feature vectors associated with each object;
randomly selecting a second number of second objects from the collection;
identifying a second number of initial cluster centers between the centroid and the second objects; and
performing iterated k-means wavefront clustering around the initial cluster centers to cluster the objects.
Dependent claims: 11, 12, 13, 14, 15
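The final step of claim 10 iterates k-means starting from the chosen initial centers. The sketch below shows only a standard k-means (Lloyd) loop seeded with externally supplied centers. It makes several assumptions: each object's multi-modal vectors have been concatenated into one flat vector, distance is squared Euclidean, and iteration stops when assignments no longer change or after a hypothetical `iterations` limit. It does not attempt to reproduce any wavefront-specific behavior described in the specification.

```python
from typing import List, Tuple

# Assumption: one flat vector per object (modalities concatenated).
Vector = List[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: List[Vector],
    centers: List[Vector],
    iterations: int = 10,  # hypothetical upper bound on passes
) -> Tuple[List[Vector], List[int]]:
    """Run k-means from the given initial centers; return (centers, assignment)."""
    centers = [list(c) for c in centers]  # copy so the caller's list is untouched
    assignment: List[int] = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest current center.
        new_assignment = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # leave a center in place if it attracted no points
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment
```

As a usage example, calling `kmeans([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]], [[1.0, 1.0], [9.0, 9.0]])` groups the first two points under one center and the last two under the other.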
16. A signal for transmitting computer instructions for selecting a set of initial cluster centers in wavefront clustering a collection of objects, each object being represented by a set of multi-modal feature vectors, the instructions comprising:
randomly selecting a first number of first objects from the collection, the first number being less than a number of objects in the collection;
computing a vector centroid of the first objects using the set of multi-modal feature vectors associated with each object;
selecting a second number of second objects from the collection, the second number equaling a desired number of initial cluster centers;
identifying a second number of initial cluster centers between the centroid and the second objects; and
wavefront clustering the collection of objects using the second number of initial cluster centers.
Dependent claims: 17, 18
Specification