Computer-implemented system and method for identifying relevant documents for display
Abstract
A computer-implemented system and method for identifying relevant documents for display are provided. Themes for a set of documents are generated. The documents are clustered based on the themes. A matrix including an inner product of document frequency occurrences and cluster concept weightings for each theme is generated for the documents. From the matrix, documents most relevant to a particular theme are identified, and the relevant documents are displayed.
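The matrix step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented implementation: the concept names, the frequency values, the theme weightings, and the helper names `theme_scores` and `most_relevant` are all hypothetical. Each matrix cell is taken to be the inner product of one document's concept frequencies with one theme's concept weightings.

```python
# Hypothetical concepts; the column order is shared by both tables below.
concepts = ["contract", "invoice", "merger"]

# Rows: documents. Columns: occurrences of each concept in that document.
doc_freqs = [
    [3, 0, 1],  # document 0
    [0, 4, 0],  # document 1
    [1, 1, 5],  # document 2
]

# Rows: themes. Columns: assumed weighting of each concept for that theme.
theme_weights = [
    [1.0, 0.5, 0.0],  # theme 0
    [0.0, 0.2, 1.0],  # theme 1
]


def theme_scores(doc_freqs, theme_weights):
    """Return a documents-by-themes matrix of inner products."""
    return [
        [sum(f * w for f, w in zip(doc_row, theme_row))
         for theme_row in theme_weights]
        for doc_row in doc_freqs
    ]


def most_relevant(scores, theme_index, top_n=2):
    """Return indices of the top_n documents with the highest score for a theme."""
    ranked = sorted(
        range(len(scores)),
        key=lambda d: scores[d][theme_index],
        reverse=True,
    )
    return ranked[:top_n]


scores = theme_scores(doc_freqs, theme_weights)
print(most_relevant(scores, theme_index=1))  # document indices, best first
```

With this sample data, document 2 ranks first for theme 1 because its heavy use of "merger" lines up with the concept that theme weights most. How the weightings themselves are derived is left open here.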
14 Claims
1. A computer-implemented system for identifying relevant documents for display, comprising:
themes for a set of documents;
an extraction module to extract noun phrases from the documents as concepts;
a theme generator to group two or more of the concepts as one such theme;
a frequency table that identifies each of the concepts and a frequency of occurrence of each concept within each of the documents in the set;
a graph generator to generate a graph of the concepts, comprising:
an x-axis of the graph defining the concepts;
a y-axis of the graph defining a number of the documents that reference each concept; and
a mapping module to map the concepts on the graph in order of descending number of referring documents;
a cluster module to cluster the documents based on the themes;
a matrix for the documents comprising an inner product of document frequency occurrences and cluster concept weightings for each theme;
an identification module to identify from the matrix, documents most relevant to a particular theme; and
a display to present the relevant documents.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-implemented method for identifying relevant documents for display, comprising:
generating themes for a set of documents, comprising:
extracting noun phrases from the documents as concepts; and
grouping two or more of the concepts as one such theme;
generating a frequency table that identifies each of the concepts and a frequency of occurrence of each concept within each of the documents in the set;
generating a graph of the concepts, comprising:
defining an x-axis of the graph as the concepts;
defining a y-axis of the graph as a number of the documents that reference each of the concepts; and
mapping the concepts on the graph in order of descending number of referring documents;
clustering the documents based on the themes;
generating a matrix for the documents comprising an inner product of document frequency occurrences and cluster concept weightings for each theme;
identifying from the matrix, documents most relevant to a particular theme; and
displaying the relevant documents.
Dependent claims: 9, 10, 11, 12, 13, 14
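The frequency-table and graph steps of the method can be illustrated with a minimal sketch. It assumes the noun phrases have already been extracted per document, since the extraction technique is not specified in the claim. The sample documents, the function names, and the alphabetical tie-break between concepts with equal document counts are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical input: concepts (noun phrases) already extracted per document.
doc_concepts = {
    "d1": ["data breach", "data breach", "audit log"],
    "d2": ["audit log", "access policy"],
    "d3": ["audit log", "data breach"],
}


def build_frequency_table(doc_concepts):
    """Map each document id to a Counter of concept -> occurrences in that document."""
    return {doc_id: Counter(found) for doc_id, found in doc_concepts.items()}


def document_counts(freq_table):
    """Count, for each concept, how many documents reference it at least once."""
    counts = Counter()
    for concept_counts in freq_table.values():
        counts.update(concept_counts.keys())
    return counts


def graph_points(freq_table):
    """Return (concept, document count) pairs in descending order of document count.

    The concept is the x-axis value and the document count is the y-axis value.
    Ties are broken alphabetically, which is an assumption of this sketch.
    """
    counts = document_counts(freq_table)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


freq_table = build_frequency_table(doc_concepts)
points = graph_points(freq_table)
for concept, n_docs in points:
    print(f"{concept}: {n_docs}")
```

The frequency table keeps per-document occurrence counts, while the graph only needs the number of documents that mention each concept, so the two are computed separately.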
Specification