Unsupervised extraction of facts
Abstract
A system and method for extracting facts from documents. A fact is extracted from a first document. The attribute and value of the fact extracted from the first document are used as a seed attribute-value pair. A second document containing the seed attribute-value pair is analyzed to determine a contextual pattern used in the second document. The contextual pattern is used to extract other attribute-value pairs from the second document. The extracted attributes and values are stored as facts.
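The flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the "Attribute: Value" predefined pattern, the fixed-width prefix/suffix heuristic for learning a contextual pattern, and all names (Fact, extract_seed, find_contextual_pattern, extract_facts) are assumptions made for the example.

```python
import re
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    attribute: str
    value: str


# Assumed predefined pattern: a line of the form "Attribute: Value".
PREDEFINED = re.compile(r"^(?P<attr>[^:\n]+):[ \t]*(?P<value>.+)$", re.MULTILINE)


def extract_seed(document: str) -> Optional[Fact]:
    """Extract the first fact from a document using the predefined pattern."""
    m = PREDEFINED.search(document)
    if m is None:
        return None
    return Fact(m.group("attr").strip(), m.group("value").strip())


def find_contextual_pattern(document: str, seed: Fact, width: int = 4) -> Optional[re.Pattern]:
    """Learn the text surrounding the seed pair and generalise it to a regex.

    Assumed heuristic: keep `width` characters before the attribute, everything
    between attribute and value, and `width` characters after the value.
    """
    a_start = document.find(seed.attribute)
    if a_start < 0:
        return None
    a_end = a_start + len(seed.attribute)
    v_start = document.find(seed.value, a_end)
    if v_start < 0:
        return None
    v_end = v_start + len(seed.value)
    prefix = document[max(0, a_start - width):a_start]
    infix = document[a_end:v_start]
    suffix = document[v_end:v_end + width]
    if not (prefix and infix and suffix):
        return None  # not enough context to form a usable pattern
    return re.compile(
        re.escape(prefix) + r"(?P<attr>.+?)" + re.escape(infix)
        + r"(?P<value>.+?)" + re.escape(suffix)
    )


def extract_facts(first_doc: str, second_doc: str) -> List[Fact]:
    """Seed from first_doc, learn a pattern in second_doc, store all facts."""
    repository: List[Fact] = []  # stands in for the fact repository
    seed = extract_seed(first_doc)
    if seed is None:
        return repository
    repository.append(seed)
    pattern = find_contextual_pattern(second_doc, seed)
    if pattern is not None:
        for m in pattern.finditer(second_doc):
            fact = Fact(m.group("attr"), m.group("value"))
            if fact not in repository:
                repository.append(fact)
    return repository


FIRST_DOC = "Albert Einstein\nBorn: 1879\n"
SECOND_DOC = (
    "<tr><td>Born</td><td>1879</td></tr>\n"
    "<tr><td>Died</td><td>1955</td></tr>\n"
    "<tr><td>Field</td><td>Physics</td></tr>"
)
facts = extract_facts(FIRST_DOC, SECOND_DOC)
```

Here the seed pair (Born, 1879) is found in a table row of the second document, and the markup around it becomes the contextual pattern that picks up the remaining rows. A production system would need a more robust way to generalise context than fixed-width string slices.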
23 Claims
1. A computer-implemented method for extracting facts, the method comprising:
at a computer system including one or more processors and memory storing one or more programs, the one or more processors executing the one or more programs to perform the operations of:
identifying a predefined pattern in a first document;
extracting a first fact having an attribute and a value from the first document based on the predefined pattern;
retrieving a second document that contains the attribute and the value of the first fact;
identifying in the second document a contextual pattern associated with the attribute and value of the first fact;
extracting a second fact from the second document using the contextual pattern; and
storing the first fact and the second fact in a fact repository of the computer system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
13. A system for extracting facts, comprising:
one or more processors; and
memory storing one or more programs to be executed by the one or more processors; the one or more programs comprising instructions for:
identifying a predefined pattern in a first document, extracting a first fact having an attribute and a value from the first document based on the predefined pattern, and storing the first fact in a fact repository; and
receiving said first fact, identifying a contextual pattern of said first fact in a second document, extracting a second fact from the second document using the contextual pattern, and storing the second fact in the fact repository.
Dependent claims: 14, 15.
16. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computer, the one or more programs comprising instructions for:
program code for identifying a predefined pattern in a first document;
extracting a first fact having an attribute and a value from the first document based on the predefined pattern;
retrieving a second document that contains the attribute and the value of the first fact;
identifying in the second document a contextual pattern associated with the attribute and value of the first fact;
extracting a second fact from the second document using the contextual pattern; and
storing the first fact and the second fact in a fact repository.
Dependent claims: 17, 18, 19, 20, 21.
22. A computer-implemented method for extracting facts, the method comprising:
at a computer system including one or more processors and memory storing one or more programs, the one or more processors executing the one or more programs to perform the operations of:
identifying a predefined pattern in a first document;
extracting a first fact having an attribute and a value from the first document based on the predefined pattern;
retrieving a second document;
if the second document corroborates the first fact:
retrieving a third document that contains the attribute and the value of the first fact;
identifying in the third document a contextual pattern associated with the attribute and value of the first fact;
extracting a second fact from the third document using the contextual pattern; and
storing the first fact and the second fact in a fact repository of the computer system.
Dependent claims: 23.
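Claim 22 differs from claim 1 by gating pattern learning on corroboration. A minimal sketch of that gate is shown below. The claim text here does not define what "corroborates" means, so the sketch assumes the simplest possible test (the second document contains both the attribute and the value as plain substrings). The names Fact, mentions, and select_learning_document are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional


class Fact(NamedTuple):
    attribute: str
    value: str


def mentions(document: str, fact: Fact) -> bool:
    """Assumed test: the document contains both the attribute and the value."""
    return fact.attribute in document and fact.value in document


def select_learning_document(
    fact: Fact, second_doc: str, candidates: Iterable[str]
) -> Optional[str]:
    """Return a third document to learn a contextual pattern from.

    A document is returned only if second_doc corroborates the fact;
    otherwise the fact is not used as a seed and None is returned.
    """
    if not mentions(second_doc, fact):
        return None  # uncorroborated: do not retrieve a third document
    for doc in candidates:
        if mentions(doc, fact):
            return doc  # first candidate containing the attribute-value pair
    return None


seed = Fact("Born", "1879")
third = select_learning_document(
    seed,
    "Einstein was Born in 1879 in Ulm.",
    ["unrelated page", "<td>Born</td><td>1879</td>"],
)
```

The effect of the gate is that a fact extracted from a single unreliable page never seeds further extraction unless an independent document agrees with it.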
Specification