SYSTEM AND METHOD FOR CLUSTERING DATA OBJECTS IN A COLLECTION
Abstract
A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
10 Claims
1. A method for selecting a set of initial cluster centers in clustering a collection of objects in a multi-dimensional vector space, comprising the steps of:
selecting a first number of first objects from the collection;
computing a vector centroid of the first objects;
selecting a second number of second objects from the collection; and
identifying a second number of initial cluster centers between the centroid and the second objects.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)
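The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The claim does not say how the first and second objects are selected or where "between" the centroid and each second object a center is placed. The uniform random sampling, the `weight` interpolation parameter (0.5 gives the midpoint), and the names `centroid` and `initial_centers` are all assumptions made for illustration.

```python
import random
from typing import List, Sequence

Vector = List[float]


def centroid(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty set of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def initial_centers(
    objects: Sequence[Sequence[float]],
    first_number: int,
    second_number: int,
    weight: float = 0.5,  # assumption: 0 = at the centroid, 1 = at the object
    seed: int = 0,
) -> List[Vector]:
    """Return `second_number` initial cluster centers.

    Each center lies on the line segment between the centroid of a sample
    of `first_number` objects and one of `second_number` sampled objects.
    """
    rng = random.Random(seed)

    # Step 1: select a first number of first objects (assumed: random sample).
    first_objects = rng.sample(list(objects), first_number)

    # Step 2: compute the vector centroid of the first objects.
    c = centroid(first_objects)

    # Step 3: select a second number of second objects (assumed: random sample).
    second_objects = rng.sample(list(objects), second_number)

    # Step 4: place one initial center between the centroid and each
    # second object by linear interpolation.
    return [
        [(1.0 - weight) * ci + weight * xi for ci, xi in zip(c, x)]
        for x in second_objects
    ]
```

Under these assumptions, `initial_centers(objects, first_number=100, second_number=8)` would yield eight starting centers, each pulled halfway from a sampled object toward the sample centroid.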
10. A method for clustering a collection of objects in a multi-dimensional vector space, comprising the steps of:
selecting a first number of first objects from the collection;
computing a vector centroid of the first objects;
selecting a second number of second objects from the collection;
identifying a second number of initial cluster centers between the centroid and the second objects; and
performing iterated k-means clustering around the initial cluster centers to cluster the objects.
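The final step of claim 10 can be sketched as a standard iterated k-means loop (Lloyd's algorithm) started from initial centers supplied by the caller. This is an illustrative sketch, not the claimed implementation. The squared Euclidean distance, the stopping rule (assignments unchanged, or `max_iterations` reached), the handling of empty clusters, and the names `squared_distance` and `kmeans` are assumptions; the claim text does not specify a distance measure.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    objects: Sequence[Sequence[float]],
    initial_centers: Sequence[Sequence[float]],
    max_iterations: int = 100,
) -> Tuple[List[Vector], List[int]]:
    """Iterate assignment and update steps from the given initial centers.

    Returns the final centers and, for each object, the index of its cluster.
    """
    centers = [list(c) for c in initial_centers]
    assignments = [-1] * len(objects)

    for _ in range(max_iterations):
        # Assignment step: attach each object to its nearest center.
        new_assignments = [
            min(range(len(centers)), key=lambda j: squared_distance(x, centers[j]))
            for x in objects
        ]
        if new_assignments == assignments:
            break  # converged: no object changed cluster
        assignments = new_assignments

        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [x for x, a in zip(objects, assignments) if a == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]

    return centers, assignments
```

The initial centers passed in would be the ones identified in the preceding step of the claim; the loop itself does not depend on how they were chosen.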
Specification