System and method for document collection, grouping and summarization
First Claim
1. A system for generating a summary of a plurality of documents comprising:
a computer processor;
a computer readable document collection containing a plurality of related documents stored in electronic form therein;
a plurality of forms of multiple document summarization engines including a single event engine, a biography engine and a multi-event engine operating on the computer processor; and
a router, the router determining a temporal relationship of at least a subset of the documents in the collection and selecting one of the plurality of forms of multiple document summarization engines for generating a summary of the subset of documents based on the temporal relationship, wherein the router selects the biography engine if the documents in the collection are not generated within a predetermined time period and the number of capitalized words and the number of personal pronouns each exceed a predetermined threshold value.
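The biography condition recited in the claim can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the pronoun list, the function names, the choice to count sentence-initial words as capitalized, and the reading of "not generated within a predetermined time period" as "the span between the earliest and latest document exceeds a window". It is one plausible reading of the claim language, not the patented implementation.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pronoun list; the claim does not enumerate one.
PERSONAL_PRONOUNS = {"he", "she", "him", "her", "his", "hers",
                     "they", "them", "their"}

def count_cues(text):
    """Return (capitalized_word_count, personal_pronoun_count) for a text.

    Assumption: every word starting with an uppercase letter counts as
    capitalized, including sentence-initial words.
    """
    words = re.findall(r"[A-Za-z']+", text)
    capitalized = sum(1 for w in words if w[0].isupper())
    pronouns = sum(1 for w in words if w.lower() in PERSONAL_PRONOUNS)
    return capitalized, pronouns

def selects_biography(docs, window, cap_threshold, pronoun_threshold):
    """docs is a non-empty list of (timestamp, text) pairs.

    Returns True when the documents are NOT all generated within `window`
    and both cue counts, summed over the documents, exceed their thresholds.
    """
    timestamps = [ts for ts, _ in docs]
    within_window = max(timestamps) - min(timestamps) <= window
    cap_total = pronoun_total = 0
    for _, text in docs:
        caps, prons = count_cues(text)
        cap_total += caps
        pronoun_total += prons
    return (not within_window
            and cap_total > cap_threshold
            and pronoun_total > pronoun_threshold)
```

Both thresholds and the window are left as parameters because the claim only calls them "predetermined".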
Abstract
A system for generating a summary of a plurality of documents and presenting the summary information to a user is provided which includes a computer readable document collection containing a plurality of related documents stored in electronic form. Documents can be pre-processed to group them into document clusters. The document clusters can also be assigned to predetermined document categories for presentation to a user. A number of multiple document summarization engines are provided which generate summaries for specific classes of multiple document clusters. A summarizer router is employed to determine a relationship of the documents in a cluster and to select one of the document summarization engines for use in generating a summary of the cluster. A single event engine is provided to generate summaries of documents which are closely related temporally and to a specific event. A dissimilarity engine for multiple document summary generation is provided which generates summaries of document clusters having documents with varying degrees of relatedness. A user interface is provided to display categories, cluster titles, summaries, and related images.
32 Citations
11 Claims
1. A system for generating a summary of a plurality of documents comprising:
a computer processor;
a computer readable document collection containing a plurality of related documents stored in electronic form therein;
a plurality of forms of multiple document summarization engines including a single event engine, a biography engine and a multi-event engine operating on the computer processor; and
a router, the router determining a temporal relationship of at least a subset of the documents in the collection and selecting one of the plurality of forms of multiple document summarization engines for generating a summary of the subset of documents based on the temporal relationship, wherein the router selects the biography engine if the documents in the collection are not generated within a predetermined time period and the number of capitalized words and the number of personal pronouns each exceed a predetermined threshold value.
(Dependent claims: 2, 3, 4)
5. A system for document collection, grouping and summarization comprising:
a computer processor;
a document collection engine, the document collection engine being configured to be operatively coupled to a computer network to access a plurality of content providers;
a computer readable document collection database, the document collection database being operatively coupled to the document collection engine;
a cluster processing engine operatively coupled to the document collection database, the cluster processing engine grouping at least a portion of the documents in the document collection database into clusters having a plurality of related documents;
a plurality of forms of multiple document summarization engines including a single event engine, a biography engine and a multi-event engine operating on the computer processor;
a summarization router, the summarization router being interposed between the cluster processing engine and the plurality of forms of multiple document summarization engines, the summarization router determining a temporal relationship among the documents in the clusters and selecting one of the plurality of forms of multiple document summarization engines for generating a summary of the cluster based on the temporal relationship, wherein the summarization router selects the single event engine if a predetermined number of documents in the subset of the collection are generated within a predetermined time period and the router selects the biography engine if the documents in the collection are not generated within a predetermined time period and the number of capitalized words and the number of personal pronouns each exceed a predetermined threshold value; and
a graphical user interface, the graphical user interface being operatively coupled to the cluster processing engine and the plurality of summarization engines and providing a display including cluster summaries and cluster information.
(Dependent claims: 6, 7, 8, 9, 10, 11)
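The three-way selection made by the summarization router in claim 5 can be sketched as a small decision function. This is an illustration under stated assumptions, not the claimed implementation: the function name `route`, its parameters, and the returned labels are hypothetical; the single event test is read as "some time window of the given length contains at least `min_docs` documents"; and falling back to the multi-event engine when neither condition holds is an assumption, since the claim does not spell out that branch. The cue counts are taken as precomputed inputs.

```python
from datetime import datetime, timedelta

def route(timestamps, cap_count, pronoun_count, *,
          min_docs, window, cap_threshold, pronoun_threshold):
    """Return 'single_event', 'biography', or 'multi_event' for one cluster."""
    ordered = sorted(timestamps)

    # Sliding window: largest number of documents whose timestamps
    # fall within any span of length `window`.
    best = 0
    start = 0
    for end in range(len(ordered)):
        while ordered[end] - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)

    # Enough documents generated close together in time: single event.
    if best >= min_docs:
        return "single_event"

    # Not temporally concentrated, and both cue counts exceed their
    # thresholds: biography.
    if cap_count > cap_threshold and pronoun_count > pronoun_threshold:
        return "biography"

    # Assumed fallback; the claim does not state this branch explicitly.
    return "multi_event"
```

A cluster of news stories published within one day would route to the single event engine under this sketch, while a cluster spread over months with many names and pronouns would route to the biography engine.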
Specification