Unsupervised extraction of facts
Abstract
A system and method for extracting facts from documents. A fact is extracted from a first document. The attribute and value of the fact extracted from the first document are used as a seed attribute-value pair. A second document containing the seed attribute-value pair is analyzed to determine a contextual pattern used in the second document. The contextual pattern is used to extract other attribute-value pairs from the second document. The extracted attributes and values are stored as facts.
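The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the patented implementation: the names `Fact`, `find_pattern`, and `extract_facts` are hypothetical, and the "contextual pattern" is assumed to be nothing more than the literal text that surrounds the seed attribute and value on a single line of the second document.

```python
import re
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    attribute: str
    value: str


def find_pattern(document: str, seed: Fact) -> Optional[re.Pattern]:
    """Infer a line-level pattern from how the seed pair appears (assumed model)."""
    for line in document.splitlines():
        attr_start = line.find(seed.attribute)
        if attr_start < 0:
            continue
        attr_end = attr_start + len(seed.attribute)
        value_start = line.find(seed.value, attr_end)
        if value_start < 0:
            continue
        prefix = line[:attr_start]
        infix = line[attr_end:value_start]
        suffix = line[value_start + len(seed.value):]
        if not infix:
            continue  # no delimiter between attribute and value; unusable
        # Keep the surrounding text literal; capture attribute and value slots.
        return re.compile(
            re.escape(prefix) + "(.+?)" + re.escape(infix) + "(.+?)" + re.escape(suffix)
        )
    return None


def extract_facts(document: str, seed: Fact) -> List[Fact]:
    """Apply the inferred pattern to every line and collect new pairs."""
    pattern = find_pattern(document, seed)
    if pattern is None:
        return []
    facts = []
    for line in document.splitlines():
        match = pattern.fullmatch(line)
        if match is None:
            continue
        fact = Fact(match.group(1).strip(), match.group(2).strip())
        # Keep only pairs whose attribute and value both differ from the seed.
        if fact.attribute != seed.attribute and fact.value != seed.value:
            facts.append(fact)
    return facts


# Seed fact taken from a first document; the page below plays the second document.
seed = Fact("Name", "Albert Einstein")
page = "Name: Albert Einstein\nBorn: 1879\nField: Physics\nUnrelated sentence."
repository = [seed] + extract_facts(page, seed)  # stand-in for the fact repository
```

Here the seed pair appears as `Name: Albert Einstein`, so the inferred pattern is "attribute, colon, space, value", and the other lines with that shape yield further attribute-value pairs. A real system would need a richer notion of context (for example, table or markup structure) and some way to filter spurious matches.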
20 Claims
1. A computer-implemented method for extracting facts, the method comprising:
at a computer system including one or more processors and memory storing one or more programs, the one or more processors executing the one or more programs to perform the operations of:
identifying a first fact having an attribute and a value obtained from a first document;
retrieving a second document that contains the attribute and the value of the first fact;
identifying in the second document a contextual pattern associated with the attribute and value of the first fact;
extracting a second fact from the second document using the contextual pattern, the second fact having an attribute that is different than the attribute of the first fact and having a value that is different than the value of the first fact; and
storing the first fact and the second fact in a fact repository of the computer system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
13. A system for extracting facts comprising:
one or more processors; and
memory storing one or more programs to be executed by the one or more processors;
the one or more programs comprising instructions for:
identifying a first fact having an attribute and a value obtained from a first document;
retrieving a second document that contains the attribute and the value of the first fact;
identifying in the second document a contextual pattern associated with the attribute and value of the first fact;
extracting a second fact from the second document using the contextual pattern, the second fact having an attribute that is different than the attribute of the first fact and having a value that is different than the value of the first fact; and
storing the first fact and the second fact in a fact repository.
Dependent claims: 14, 15, 16.
17. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computer, the one or more programs comprising instructions for:
identifying a first fact having an attribute and a value obtained from a first document;
retrieving a second document that contains the attribute and the value of the first fact;
identifying in the second document a contextual pattern associated with the attribute and value of the first fact;
extracting a second fact from the second document using the contextual pattern, the second fact having an attribute that is different than the attribute of the first fact and having a value that is different than the value of the first fact; and
storing the first fact and the second fact in a fact repository.
Dependent claims: 18, 19, 20.
Specification