Unsupervised extraction of facts
Abstract
A system and method for extracting facts from documents. A fact is extracted from a first document. The attribute and value of the fact extracted from the first document are used as a seed attribute-value pair. A second document containing the seed attribute-value pair is analyzed to determine a contextual pattern used in the second document. The contextual pattern is used to extract other attribute-value pairs from the second document. The extracted attributes and values are stored as facts.
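The flow described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Fact`, `ContextualPattern`, `retrieve_document_containing`, `find_contextual_pattern`, and `extract_with_pattern` are hypothetical, documents are assumed to be HTML-like strings, and a contextual pattern is assumed to be the markup immediately before the attribute, between the attribute and the value, and immediately after the value.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Fact(NamedTuple):
    attribute: str
    value: str


class ContextualPattern(NamedTuple):
    prefix: str  # markup immediately before the attribute
    infix: str   # text between the attribute and the value
    suffix: str  # markup immediately after the value


def retrieve_document_containing(documents: Iterable[str], seed: Fact) -> Optional[str]:
    """Return the first candidate that mentions both the seed attribute and value."""
    for document in documents:
        if seed.attribute in document and seed.value in document:
            return document
    return None


def find_contextual_pattern(document: str, seed: Fact) -> Optional[ContextualPattern]:
    """Locate the seed pair and record the markup surrounding it (assumed heuristic)."""
    tag = r"(<[^<>]+>)"
    match = re.search(
        tag + re.escape(seed.attribute) + r"(.*?)" + re.escape(seed.value) + tag,
        document,
    )
    if match is None:
        return None
    return ContextualPattern(*match.groups())


def extract_with_pattern(document: str, pattern: ContextualPattern) -> List[Fact]:
    """Apply the learned pattern to pull every attribute-value pair it matches."""
    regex = (
        re.escape(pattern.prefix) + r"([^<>]+?)"
        + re.escape(pattern.infix) + r"([^<>]+?)"
        + re.escape(pattern.suffix)
    )
    return [Fact(a.strip(), v.strip()) for a, v in re.findall(regex, document)]


# Usage with made-up sample data: the seed pair comes from a "first document".
documents = [
    "<p>No table here.</p>",
    "<tr><td>Born</td><td>1879</td></tr>\n<tr><td>Died</td><td>1955</td></tr>",
]
seed = Fact("Born", "1879")
second_document = retrieve_document_containing(documents, seed)
pattern = find_contextual_pattern(second_document, seed)
new_facts = [f for f in extract_with_pattern(second_document, pattern) if f != seed]
print(new_facts)
```

In this sketch the seed pair only serves to discover how the second document lays out its facts; every other pair that follows the same layout is then extracted without any hand-written template for that document.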
24 Claims
1. A method for extracting facts, the method comprising:
extracting a first fact having an attribute and a value from a first document;
retrieving a second document that contains the attribute and the value of the first fact;
identifying in the second document a contextual pattern associated with the attribute and value of the first fact; and
extracting a second fact from the second document using the contextual pattern.
14. A system for extracting facts comprising:
an importer, configured to extract a first fact from a first document; and
a janitor, configured to receive said first fact, identify a contextual pattern of said first fact in a second document, and extract a second fact from the second document using the contextual pattern.
17. A computer program product, the computer program product comprising a computer-readable medium, for extracting facts, the computer-readable medium comprising:
program code for extracting a first fact having an attribute and a value from a first document;
program code for retrieving a second document that contains the attribute and the value of the first fact;
program code for identifying in the second document a contextual pattern associated with the attribute and value of the first fact; and
program code for extracting a second fact from the second document using the contextual pattern.
23. A method for extracting facts, the method comprising:
extracting a first fact having an attribute and a value from a first document;
retrieving a second document;
if the second document corroborates the first fact:
retrieving a third document that contains the attribute and the value of the first fact;
identifying in the third document a contextual pattern associated with the attribute and value of the first fact; and
extracting a second fact from the third document using the contextual pattern.
24. The method of claim 23, further comprising:
retrieving a fourth document;
if the fourth document corroborates the second fact, storing the second fact in a fact repository.
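The corroboration steps in the last two claims act as gates: pattern-based extraction proceeds only if another document supports the seed fact, and a newly extracted fact is stored only if yet another document supports it. The sketch below illustrates that control flow in Python. It is an assumption-laden example, not the claimed implementation: `corroborates` uses a simple case-insensitive substring test, and `retrieve_third` and `extract_with_pattern` are hypothetical callables standing in for document retrieval and contextual-pattern extraction.

```python
from typing import Callable, List, NamedTuple, Optional


class Fact(NamedTuple):
    attribute: str
    value: str


def corroborates(document: str, fact: Fact) -> bool:
    """Assumed test: the document mentions both the attribute and the value."""
    text = document.lower()
    return fact.attribute.lower() in text and fact.value.lower() in text


def extract_after_corroboration(
    first_fact: Fact,
    second_document: str,
    retrieve_third: Callable[[Fact], str],
    extract_with_pattern: Callable[[str, Fact], Optional[Fact]],
) -> Optional[Fact]:
    """Retrieve a third document and extract from it only if the seed is corroborated."""
    if not corroborates(second_document, first_fact):
        return None
    third_document = retrieve_third(first_fact)
    return extract_with_pattern(third_document, first_fact)


def store_if_corroborated(
    second_fact: Fact, fourth_document: str, repository: List[Fact]
) -> bool:
    """Append the new fact to the repository only if a fourth document supports it."""
    if corroborates(fourth_document, second_fact):
        repository.append(second_fact)
        return True
    return False


# Usage with made-up data and stub callables (hypothetical stand-ins).
seed = Fact("Born", "1879")
repository: List[Fact] = []
second_fact = extract_after_corroboration(
    seed,
    "Einstein was born in 1879.",
    retrieve_third=lambda fact: "<td>Born</td><td>1879</td><td>Died</td><td>1955</td>",
    extract_with_pattern=lambda document, fact: Fact("Died", "1955"),
)
if second_fact is not None:
    store_if_corroborated(second_fact, "He died in 1955.", repository)
```

Passing retrieval and extraction in as callables keeps the gating logic separate from how documents are fetched or how patterns are learned, so the same checks could wrap any extractor.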
Specification