Method of constant interaction-time clustering applied to document browsing
First Claim
1. A method of processing a corpus of electronically stored documents, comprising the steps of:
- expanding a focus set comprised of at least one initial metadocument, representative of a plurality of documents, into a plurality of subsequent metadocuments, a number of said subsequent metadocuments being approximately equal to a predetermined maximum number, said subsequent metadocuments being descendants of said at least one initial metadocument in a tree, said expanding step comprising: choosing a metadocument in the focus set that represents the most individual documents, and expanding the chosen metadocument into its descendant metadocuments; and
- clustering the subsequent metadocuments into a predetermined number of new metadocuments, the predetermined number of new metadocuments being less than the predetermined maximum number; and
- selecting the predetermined maximum number so that said expanding and clustering steps can be completed within a time constraint.
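The expanding and selecting steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Metadocument` class, the function names, and the linear cost model in `choose_max_count` (a fixed cost per metadocument) are assumptions made only for illustration. The sketch repeatedly replaces the metadocument representing the most documents with its descendants until the set holds roughly the maximum number.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Metadocument:
    """Hypothetical tree node standing for a group of documents."""
    name: str
    size: int  # number of individual documents represented
    children: list = field(default_factory=list)


def choose_max_count(time_budget_s, cost_per_metadoc_s):
    """Assumed linear cost model: how many metadocuments fit in the budget."""
    return max(1, int(time_budget_s / cost_per_metadoc_s))


def expand_focus_set(focus, max_count):
    """Expand the largest metadocument until about max_count are held."""
    # Max-heap on size; the counter breaks ties so nodes are never compared.
    heap = [(-m.size, i, m) for i, m in enumerate(focus)]
    heapq.heapify(heap)
    counter = len(heap)
    leaves = []  # metadocuments that have no descendants left to expand
    while heap and len(heap) + len(leaves) < max_count:
        _, _, biggest = heapq.heappop(heap)
        if not biggest.children:
            leaves.append(biggest)
            continue
        for child in biggest.children:
            heapq.heappush(heap, (-child.size, counter, child))
            counter += 1
    return leaves + [m for _, _, m in heap]


# Small hypothetical tree: root -> A (A1, A2) and B (B1, B2).
a = Metadocument("A", 6, [Metadocument("A1", 3), Metadocument("A2", 3)])
b = Metadocument("B", 4, [Metadocument("B1", 2), Metadocument("B2", 2)])
root = Metadocument("root", 10, [a, b])

max_count = choose_max_count(time_budget_s=3.0, cost_per_metadoc_s=1.0)
expanded = expand_focus_set([root], max_count)
```

Because one expansion can add several descendants at once, the result may slightly exceed `max_count`, which matches the "approximately equal" wording of the claim.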
Abstract
Arbitrarily large document collections are processed by expanding a focus set having at least one initial metadocument into a plurality of subsequent metadocuments. The number of subsequent metadocuments is approximately equal to a predetermined maximum number. The subsequent metadocuments are then clustered into a predetermined number of new metadocuments, which are summarized and presented to a user. The focus set is redefined to include only user-selected new metadocuments.
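One round of the browsing loop described in the abstract might look like the following Python sketch. Everything here is hypothetical: the `Meta` class, `group_by_size`, `summarize`, and `browse_round` are illustrative names, and the grouping function is only a size-balancing placeholder standing in for the clustering step, not a similarity-based clustering algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meta:
    """Hypothetical metadocument: a label, a document count, its members."""
    label: str
    size: int
    members: tuple = ()


def group_by_size(items, k):
    """Placeholder for clustering: balance total size across k groups."""
    groups = [[] for _ in range(k)]
    totals = [0] * k
    for item in sorted(items, key=lambda m: m.size, reverse=True):
        i = totals.index(min(totals))  # currently smallest group
        groups[i].append(item)
        totals[i] += item.size
    return [g for g in groups if g]


def summarize(group):
    """Build one new metadocument that stands for a whole group."""
    return Meta(
        label="+".join(m.label for m in group),
        size=sum(m.size for m in group),
        members=tuple(group),
    )


def browse_round(expanded, k, pick):
    """Cluster, summarize, and keep only the user-selected metadocuments."""
    summaries = [summarize(g) for g in group_by_size(expanded, k)]
    chosen = pick(summaries)  # indices the user selected
    return [summaries[i] for i in chosen]


expanded = [Meta("a", 5), Meta("b", 3), Meta("c", 2), Meta("d", 1)]
# The lambda stands in for a user choosing the second summary.
focus = browse_round(expanded, k=2, pick=lambda summaries: [1])
```

The returned list plays the role of the redefined focus set, which would then be expanded again in the next round.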
13 Claims
Specification