Agent-based method for distributed clustering of textual information
Abstract
A computer method and system for storing, retrieving and displaying information has a multiplexing agent (20) that calculates a new document vector (25) for a new document (21) to be added to the system and transmits the new document vector (25) to master cluster agents (22) and cluster agents (23) for evaluation. These agents (22, 23) perform the evaluation and return values upstream to the multiplexing agent (20) based on the similarity of the new document to the documents stored under their control. The multiplexing agent (20) then sends the document (21) and the document vector (25) to the master cluster agent (22), which either forwards them to a cluster agent (23) or creates a new cluster agent (23) to manage the document (21). The system also searches for stored documents according to a search query having at least one term, identifies the documents found in the search, and displays those documents in a clustering display (80) that indicates the similarity of the documents to each other.
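The storage flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the class names mirror the agents (20, 22, 23), but the term-count vectorizer, the cosine similarity measure, the summed composite vector, and the `threshold` that triggers creation of a new cluster agent are all assumptions introduced here, and everything runs in one process rather than on separate computers.

```python
import math
from collections import Counter


def document_vector(text):
    """Hypothetical vectorizer: raw term counts over lowercase whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two sparse term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ClusterAgent:
    """Third tier (23): manages documents and a composite vector for them."""

    def __init__(self):
        self.documents = []
        self.composite = Counter()

    def evaluate(self, vector):
        return cosine(vector, self.composite)

    def add(self, document, vector):
        self.documents.append(document)
        # Assumption: the composite vector is the sum of member vectors.
        self.composite.update(vector)


class MasterClusterAgent:
    """Second tier (22): fans the vector out and reports the best match."""

    def __init__(self, threshold=0.3):
        self.clusters = []
        self.threshold = threshold  # assumed cutoff for creating a new cluster

    def evaluate(self, vector):
        return max((c.evaluate(vector) for c in self.clusters), default=0.0)

    def store(self, document, vector):
        best = max(self.clusters, key=lambda c: c.evaluate(vector), default=None)
        if best is None or best.evaluate(vector) < self.threshold:
            best = ClusterAgent()  # no sufficiently similar cluster exists
            self.clusters.append(best)
        best.add(document, vector)
        return best


class MultiplexingAgent:
    """First tier (20): computes the vector and routes the document."""

    def __init__(self, masters):
        self.masters = masters

    def add_document(self, document):
        vector = document_vector(document)
        target = max(self.masters, key=lambda m: m.evaluate(vector))
        return target, target.store(document, vector)
```

Calling `MultiplexingAgent([...]).add_document(text)` returns the master cluster agent and cluster agent that ended up managing the document. In a distributed deployment the `evaluate` and `store` calls would be messages between agents on different machines; the direct method calls here stand in for that messaging.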
25 Claims
1. A computer method for storing information in a computer system having at least first and second computers for retrieval and display based on similarity of information, the method comprising:
a first-tier program module operating on a first computer for determining a new document vector to characterize a new document for comparison of a similarity of the new document to other documents stored in the computer system;
the first-tier program module transmitting the new document to a second-tier program module operating on a second computer in the computer system;
wherein the second-tier program module transmits the document vector to a plurality of third-tier program modules operating on the second computer in the computer system;
the third-tier program modules each storing a composite vector representing the similarity of a respective plurality of documents stored in the second computer under control of the respective third-tier program module; and
the third-tier program modules each receiving the document vector for the new document and comparing the document vector to a respective composite vector to determine similarity of the new document to the plurality of documents stored in the second computer under the control of the third-tier program module; and
the third-tier program modules each returning to the second-tier module a similarity value resulting from comparison of the new document vector to a respective composite vector, the second-tier module returning a best match similarity value to the first-tier module representing a greatest measure of similarity of the new document to a respective plurality of documents stored under control of a respective third-tier program module to determine routing of the document to a selected second-tier program module from among a plurality of second-tier program modules.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 22, 23
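Claim 1 does not name a particular similarity measure. As one possible reading, assumed here purely for illustration, the similarity value returned by the j-th third-tier module could be the cosine between the new document vector d and that module's composite vector c_j, with the second-tier module reporting the largest of the n values as its best match. The symbols d, c_j, s_j, s* and n are notation introduced for this sketch and do not appear in the claim.

```latex
s_j = \frac{\mathbf{d} \cdot \mathbf{c}_j}{\lVert \mathbf{d} \rVert \, \lVert \mathbf{c}_j \rVert},
\qquad
s^{*} = \max_{1 \le j \le n} s_j
```

Under this reading the first-tier module would compare the s* values returned by the different second-tier modules and route the document to the one with the largest value.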
12. A computer system for storing, retrieving and displaying information, the computer system being operable on at least one computer having a software operating system, the computer system comprising:
a first-tier, multiplexing program module running on a first computer for receiving a new document originating from an information source and for calculating a new document vector for the new document, and for transmitting said new document vector to at least one second-tier program module; and
a second-tier program module running on a second computer;
wherein the second-tier program module transmits the document vector to a plurality of third-tier program modules operating on the second computer in the computer system;
the third-tier program modules each storing a composite vector representing the similarity of a respective plurality of documents stored in the second computer under the control of the respective third-tier program module; and
the third-tier program modules each receiving the document vector for the new document and comparing the document vector to a respective composite vector to determine similarity of the new document to the plurality of documents stored in the second computer under control of the third-tier program module;
the third-tier program module each returning a similarity value to the second-tier module resulting from comparison of the new document vector to a respective composite vector, the second-tier module returning a best match similarity value to the first-tier, multiplexing program module representing a greatest measure of similarity of the new document to a respective plurality of documents stored under control of a respective third-tier program module to determine routing of the document to a selected second-tier program module from among a plurality of second-tier program modules.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 24, 25
Specification