Systems and methods for key phrase characterization of documents
First Claim
1. An electronic device comprising:
one or more computer-readable storage media configured to store instructions; and
one or more processors configured to execute the instructions to cause the electronic device to:
receive a user input indicative of an entity and a search query;
identify a statistical model associated with the entity, wherein the statistical model is determined based on a first plurality of documents associated with the entity, the statistical model indicative at least of frequencies of one or more words within the first plurality of documents;
identify, responsive to the user input, a second plurality of documents, at least partially different than the first plurality of documents, corresponding to the search query and the indicated entity;
identify, for each of the second plurality of documents, one or more segments;
apply the identified statistical model to each of the identified segments to determine, for each of the second plurality of documents, a statistical significance of segments identified in the documents, the statistical significance indicative of frequencies of the one or more words in the segment compared to the frequencies of the one or more words indicated in the statistical model;
and provide for display at least a representative segment having a highest statistical significance and a link to the document containing the representative segment.
Abstract
Systems and methods are disclosed for key phrase characterization of documents. In accordance with one implementation, a method is provided for key phrase characterization of documents. The method includes obtaining a first plurality of documents based at least on a user input, obtaining a statistical model based at least on the user input, and obtaining, from content of the first plurality of documents, a plurality of segments. The method also includes determining statistical significance of the plurality of segments based at least on the statistical model and the content, and providing for display a representative segment from the plurality of segments, the representative segment being determined based at least on the statistical significance.
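The steps recited above can be illustrated with a minimal sketch. It is not the patented implementation, and several choices in it are assumptions made only for illustration: sentences stand in for segments, the statistical model is a table of add-one-smoothed word frequencies, and statistical significance is computed as a relative-entropy-style sum that compares each word's frequency in the segment with its frequency in the model. All function names and the mapping of document links to text are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_model(entity_documents):
    """Count word occurrences across the documents associated with an entity."""
    counts = Counter()
    for text in entity_documents:
        counts.update(tokenize(text))
    return {"counts": counts, "total": sum(counts.values())}


def model_frequency(model, word):
    """Smoothed relative frequency of a word under the model.

    Add-one smoothing is an assumption of this sketch; it keeps the
    frequency of unseen words above zero.
    """
    vocabulary = len(model["counts"]) + 1  # one extra slot for unseen words
    return (model["counts"][word] + 1) / (model["total"] + vocabulary)


def split_segments(text):
    """Split a document into segments; sentences are assumed here."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def significance(segment, model):
    """Score how far the segment's word frequencies depart from the model's."""
    words = tokenize(segment)
    if not words:
        return 0.0
    score = 0.0
    for word, count in Counter(words).items():
        segment_freq = count / len(words)
        score += segment_freq * math.log(segment_freq / model_frequency(model, word))
    return score


def representative_segment(result_documents, model):
    """Return (segment, link, score) for the highest-scoring segment.

    result_documents maps a document link to its text (hypothetical layout).
    Returns None when there are no segments to score.
    """
    best = None
    for link, text in result_documents.items():
        for segment in split_segments(text):
            score = significance(segment, model)
            if best is None or score > best[2]:
                best = (segment, link, score)
    return best
```

In this sketch, a caller would build the model once from the documents associated with the entity, then pass the documents matching the search query to `representative_segment`, which returns the top-scoring segment together with the link of the document that contains it. Because the score is normalized by segment length only through relative frequencies, very short segments made of rare words tend to rank highly; a production system would likely need a different weighting.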
17 Claims
8. A method comprising:
by a computing system comprising a hardware computer processor and non-transitory storage medium storing software instructions, receive a user input indicative of an entity and a search query;
identify a statistical model associated with the entity, wherein the statistical model is determined based on a first plurality of documents associated with the entity, the statistical model indicative at least of frequencies of one or more words within the first plurality of documents;
identify a second plurality of documents, at least partially different than the first plurality of documents, corresponding to the search query and the indicated entity;
identify, for each of the second plurality of documents, one or more segments;
apply the identified statistical model to each of the identified segments to determine, for each of the second plurality of documents, a statistical significance of segments identified in the documents, the statistical significance indicative of frequencies of the one or more words in the segment compared to the frequencies of the one or more words indicated in the statistical model; and
provide for display at least a representative segment having a highest statistical significance and a link to the document containing the representative segment.
13. A non-transitory computer-readable medium storing a set of instructions that are executable by one or more electronic devices, each having one or more processors, to cause the one or more electronic devices to perform a method, the method comprising:
receiving a user input indicative of an entity and a search query;
identifying a statistical model associated with the entity, wherein the statistical model is determined based on a first plurality of documents associated with the entity, the statistical model indicative at least of frequencies of one or more words within the first plurality of documents;
identifying a second plurality of documents, at least partially different than the first plurality of documents, corresponding to the search query and the indicated entity;
identifying, for each of the second plurality of documents, one or more segments;
applying the identified statistical model to each of the identified segments to determine, for each of the second plurality of documents, a statistical significance of segments identified in the documents, the statistical significance indicative of frequencies of the one or more words in the segment compared to the frequencies of the one or more words indicated in the statistical model; and
providing for display at least a representative segment having a highest statistical significance and a link to the document containing the representative segment.
Specification