Corroborating facts in electronic documents
First Claim
1. A computer-implemented method for identifying facts described by electronic documents, comprising:
defining a query, the query posing a question having an answer formed of terms from the electronic documents;
creating one or more hypothetical facts in response to the query and the electronic documents, each hypothetical fact representing a possible answer to the query, wherein creating one or more hypothetical facts in response to the query comprises:
parsing the query to filter out noise words and produce filtered terms;
searching a repository of facts comprising attributes and values to identify attributes corresponding to the filtered terms;
searching the electronic documents to identify terms that frequently appear near the filtered terms; and
forming one or more hypothetical facts responsive to the attributes corresponding to the filtered terms and the terms that frequently appear near the filtered terms in the electronic documents;
corroborating the one or more hypothetical facts using the electronic documents to identify a likely correct fact; and
presenting the identified likely correct fact as the answer to the query.
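The hypothesis-creation steps recited in this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the noise-word list, the three-token proximity window, the `min_count` threshold, the representation of the repository as (attribute, value) pairs, and all function names are assumptions made for illustration only.

```python
from collections import Counter

# Assumed noise-word list; the claim does not enumerate one.
NOISE_WORDS = {"what", "who", "when", "where", "is", "was",
               "the", "a", "an", "of", "in"}


def filter_terms(query):
    """Parse the query and drop noise words, leaving the filtered terms."""
    tokens = [w.strip(".,;:!?").lower() for w in query.split()]
    return [t for t in tokens if t and t not in NOISE_WORDS]


def matching_attributes(filtered_terms, repository):
    """Return attributes in the fact repository that mention a filtered term.

    `repository` is assumed to be a list of (attribute, value) pairs.
    """
    attrs = []
    for attr, _value in repository:
        words = attr.lower().split()
        if any(t in words for t in filtered_terms) and attr not in attrs:
            attrs.append(attr)
    return attrs


def nearby_terms(filtered_terms, documents, window=3, min_count=2):
    """Find terms that appear within `window` tokens of a filtered term
    at least `min_count` times across the documents."""
    wanted = set(filtered_terms)
    counts = Counter()
    for doc in documents:
        tokens = [w.strip(".,;:!?").lower() for w in doc.split()]
        for i, tok in enumerate(tokens):
            if tok not in wanted:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for other in tokens[lo:hi]:
                if other and other not in wanted and other not in NOISE_WORDS:
                    counts[other] += 1
    return [t for t, c in counts.most_common() if c >= min_count]


def form_hypothetical_facts(query, repository, documents):
    """Pair each matching attribute with each frequently co-occurring term."""
    terms = filter_terms(query)
    attrs = matching_attributes(terms, repository)
    candidates = nearby_terms(terms, documents)
    return [(attr, cand) for attr in attrs for cand in candidates]


if __name__ == "__main__":
    repo = [("capital", "Berlin"), ("population", "83 million")]
    docs = ["Paris is the capital of France.",
            "The capital of France, Paris, is large."]
    print(form_hypothetical_facts("What is the capital of France?", repo, docs))
```

Each returned pair is only a candidate answer. The sketch stops before the corroborating and presenting steps, which the claim recites separately.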
Abstract
A query is defined that has an answer formed of terms from electronic documents. A repository having facts is examined to identify attributes corresponding to terms in the query. The electronic documents are examined to find other terms that commonly appear near the query terms. Hypothetical facts representing possible answers to the query are created based on the information identified in the fact repository and the commonly-appearing terms. These hypothetical facts are corroborated using the electronic documents to determine how many documents support each fact. Additionally, contextual clues in the documents are examined to determine whether the hypothetical facts can be expanded to include additional terms. A hypothetical fact that is supported by at least a certain number of documents, and is not contained within another fact with at least the same level of support, is presented as likely correct.
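The selection rule in the last sentence of the abstract (enough supporting documents, and not contained within another fact with at least the same support) can be sketched as follows. This is an illustrative reading only: case-insensitive substring matching stands in for whatever document matching the specification actually uses, and `min_support` and the function names are hypothetical. The expansion of facts from contextual clues is not modeled.

```python
def support_count(fact, documents):
    """Count the documents that contain the fact text (case-insensitive).

    Plain substring matching is an assumption made for brevity.
    """
    phrase = fact.lower()
    return sum(1 for doc in documents if phrase in doc.lower())


def likely_correct(candidates, documents, min_support=2):
    """Keep facts with enough support that are not contained within
    another candidate having at least the same level of support."""
    support = {c: support_count(c, documents) for c in candidates}
    kept = []
    for fact, count in support.items():
        if count < min_support:
            continue  # too few corroborating documents
        contained = any(
            other != fact
            and fact.lower() in other.lower()
            and support[other] >= count
            for other in support
        )
        if not contained:
            kept.append(fact)
    return kept


if __name__ == "__main__":
    docs = ["Bill Clinton was the president.",
            "President Bill Clinton gave a speech."]
    # "Clinton" and "Bill Clinton" are each supported by both documents,
    # so the shorter fact is dropped in favor of the one containing it.
    print(likely_correct(["Clinton", "Bill Clinton"], docs))
```

When a shorter fact has strictly more support than the longer fact that contains it, both pass this filter, because the containing fact does not reach the same level of support.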
27 Claims
1. A computer-implemented method for identifying facts described by electronic documents, comprising:
defining a query, the query posing a question having an answer formed of terms from the electronic documents;
creating one or more hypothetical facts in response to the query and the electronic documents, each hypothetical fact representing a possible answer to the query, wherein creating one or more hypothetical facts in response to the query comprises:
parsing the query to filter out noise words and produce filtered terms;
searching a repository of facts comprising attributes and values to identify attributes corresponding to the filtered terms;
searching the electronic documents to identify terms that frequently appear near the filtered terms; and
forming one or more hypothetical facts responsive to the attributes corresponding to the filtered terms and the terms that frequently appear near the filtered terms in the electronic documents;
corroborating the one or more hypothetical facts using the electronic documents to identify a likely correct fact; and
presenting the identified likely correct fact as the answer to the query.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system for identifying facts described by electronic documents, comprising:
one or more processors;
memory; and
one or more programs stored in the memory, the one or more programs comprising instructions to:
define a query, the query posing a question having an answer formed of terms from the electronic documents;
create one or more hypothetical facts in response to the query and the electronic documents, each hypothetical fact representing a possible answer to the query, wherein creating one or more hypothetical facts in response to the query comprises:
parsing the query to filter out noise words and produce filtered terms;
searching a repository of facts comprising attributes and values to identify attributes corresponding to the filtered terms;
searching the electronic documents to identify terms that frequently appear near the filtered terms; and
forming one or more hypothetical facts responsive to the attributes corresponding to the filtered terms and the terms that frequently appear near the filtered terms in the electronic documents;
corroborate the one or more hypothetical facts using the electronic documents to identify a likely correct fact; and
present the identified likely correct fact as the answer to the query.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computer, the one or more programs comprising instructions for:
defining a query, the query posing a question having an answer formed of terms from the electronic documents;
creating one or more hypothetical facts in response to the query and the electronic documents, each hypothetical fact representing a possible answer to the query, wherein creating one or more hypothetical facts in response to the query comprises:
parsing the query to filter out noise words and produce filtered terms;
searching a repository of facts comprising attributes and values to identify attributes corresponding to the filtered terms;
searching the electronic documents to identify terms that frequently appear near the filtered terms; and
forming one or more hypothetical facts responsive to the attributes corresponding to the filtered terms and the terms that frequently appear near the filtered terms in the electronic documents;
corroborating the one or more hypothetical facts using the electronic documents to identify a likely correct fact; and
presenting the identified likely correct fact as the answer to the query.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification