System and method for document collection, grouping and summarization
Abstract
A system for generating a summary of a plurality of documents and presenting the summary information to a user is provided, which includes a computer readable document collection containing a plurality of related documents stored in electronic form. Documents can be pre-processed to group them into document clusters. The document clusters can also be assigned to predetermined document categories for presentation to a user. A number of multiple document summarization engines are provided which generate summaries for specific classes of multiple document clusters. A summarizer router is employed to determine a relationship of the documents in a cluster and to select one of the document summarization engines for use in generating a summary of the cluster. A single event engine is provided to generate summaries of documents which are closely related temporally and relate to a specific event. A dissimilarity engine for multiple document summary generation is provided which generates summaries of document clusters having documents with varying degrees of relatedness. A user interface is provided to display categories, cluster titles, summaries, and related images.
41 Claims
1. A system for generating a summary of a plurality of documents comprising:
a computer readable document collection containing a plurality of related documents stored in electronic form therein;
a plurality of document summarization engines; and
a router, the router determining a relationship of at least a subset of the documents in the collection and selecting one of the plurality of document summarization engines for generating a summary of the subset of documents based on the relationship. - View Dependent Claims (2, 3, 4, 5, 6)
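The routing step recited in claim 1 can be illustrated with a minimal sketch. The claim does not say how the relationship is determined, so the example assumes, purely for illustration, that the router looks at the spread of publication dates in the subset. The names `Document`, `determine_relationship`, `route`, the two placeholder engines, and the 3-day threshold are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List


@dataclass
class Document:
    text: str
    published: date


Engine = Callable[[List[Document]], str]


def single_event_engine(docs: List[Document]) -> str:
    # Placeholder: a real engine would build the summary text here.
    return f"single-event summary of {len(docs)} documents"


def dissimilarity_engine(docs: List[Document]) -> str:
    # Placeholder for an engine that handles loosely related documents.
    return f"dissimilarity summary of {len(docs)} documents"


# Hypothetical mapping from a relationship label to a summarization engine.
ENGINES: Dict[str, Engine] = {
    "single_event": single_event_engine,
    "dissimilar": dissimilarity_engine,
}


def determine_relationship(docs: List[Document], max_span_days: int = 3) -> str:
    # Assumed heuristic: documents published within a few days of each
    # other are treated as covering a single event.
    dates = [d.published for d in docs]
    span_days = (max(dates) - min(dates)).days
    return "single_event" if span_days <= max_span_days else "dissimilar"


def route(docs: List[Document]) -> Engine:
    # Select one engine for the subset based on the determined relationship.
    return ENGINES[determine_relationship(docs)]
```

Calling `route(docs)(docs)` then produces the summary with whichever engine was selected. A dictionary keyed by relationship label keeps the router independent of how many engines exist.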
-
-
7. A computer-based method of presenting content, such as news articles, to a user, comprising:
gathering a plurality of articles from a plurality of content providers;
determining clusters of at least a portion of the articles;
selecting one of a plurality of multiple document summarization engines for each cluster of articles;
generating a summary for each cluster of articles; and
displaying at least one summary to a user. - View Dependent Claims (8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
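The "determining clusters" step of claim 7 could be sketched as follows. The claim does not name a clustering technique, so this example swaps in a simple greedy, single-pass grouping by Jaccard word overlap as a stand-in. The function names and the 0.3 threshold are hypothetical.

```python
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    # Crude tokenization: lowercase, split on whitespace.
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    # Jaccard similarity: size of intersection over size of union.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_articles(articles: List[str], threshold: float = 0.3) -> List[List[str]]:
    # Each cluster is (word set of its first article, member articles).
    clusters: List[Tuple[Set[str], List[str]]] = []
    for article in articles:
        words = tokens(article)
        for seed_words, members in clusters:
            if jaccard(words, seed_words) >= threshold:
                members.append(article)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append((words, [article]))
    return [members for _, members in clusters]
```

Each resulting cluster would then be handed to the engine-selection and summary-generation steps recited in the claim.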
-
-
24. A system for document collection, grouping and summarization comprising:
a document collection engine, the document collection engine being configured to be operatively coupled to a computer network to access a plurality of content providers;
a computer readable document collection database, the document collection database being operatively coupled to the document collection engine;
a cluster processing engine operatively coupled to the document collection database, the cluster processing engine grouping at least a portion of the documents in the document collection database into clusters having a plurality of related documents;
a plurality of document summarization engines;
a summarization router, the summarization router being interposed between the cluster processing engine and the plurality of document summarization engines, the summarization router determining a relationship among the documents in the clusters and selecting one of the plurality of document summarization engines for generating a summary of the cluster based on the relationship; and
a graphical user interface, the graphical user interface being operatively coupled to the cluster processing engine and the plurality of summarization engines and providing a display including cluster summaries and cluster information. - View Dependent Claims (25, 26, 27, 28, 29, 30, 31, 32, 33)
-
-
34. A method of generating a summary of multiple documents in a document cluster, comprising:
for each document in the cluster, determining a score for each sentence based on features of the sentences;
selecting a subset of sentences based on a weighted sentence score;
merging the selected sentences in accordance with a predetermined order for the selected sentences; and
removing selected sentences which are duplicative. - View Dependent Claims (35, 36, 37, 38, 39, 40)
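The four steps of claim 34 (score, select, merge in order, remove duplicates) can be sketched as below. The claim does not specify the sentence features, the weights, or the merge order, so the example makes three assumptions: position in the document and sentence length are the features, the weights are 0.7 and 0.3, and the predetermined order is document order followed by sentence position. All function names are hypothetical.

```python
import re
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence: str, position: int,
                   weights: Tuple[float, float] = (0.7, 0.3)) -> float:
    # Assumed features: earlier sentences and longer sentences score higher.
    position_feature = 1.0 / (1 + position)
    length_feature = min(len(sentence.split()), 20) / 20
    return weights[0] * position_feature + weights[1] * length_feature


def summarize_cluster(documents: List[str], max_sentences: int = 3) -> List[str]:
    # Step 1: score every sentence of every document in the cluster.
    scored = []
    for doc_idx, doc in enumerate(documents):
        for pos, sent in enumerate(split_sentences(doc)):
            scored.append((score_sentence(sent, pos), doc_idx, pos, sent))

    # Step 2: select the highest-scoring subset.
    top = sorted(scored, key=lambda t: t[0], reverse=True)[:max_sentences]

    # Step 3: merge in the assumed order (document index, then position).
    ordered = sorted(top, key=lambda t: (t[1], t[2]))

    # Step 4: drop sentences that duplicate one already kept,
    # comparing case-insensitively and ignoring punctuation.
    seen = set()
    summary = []
    for _, _, _, sent in ordered:
        key = re.sub(r"\W+", " ", sent.lower()).strip()
        if key not in seen:
            seen.add(key)
            summary.append(sent)
    return summary
```

Duplicate detection here is exact matching after normalization. Catching paraphrased duplicates would need a similarity measure, which this sketch does not attempt.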
-
-
41. Computer readable media encoded with instructions for a computer processor to perform the following steps:
gathering a plurality of articles from a plurality of content providers;
determining clusters of at least a portion of the articles;
selecting one of a plurality of multiple document summarization engines for each cluster of articles;
generating a summary for each cluster of articles; and
displaying at least one summary to a user.
-
Specification