Information extraction from a database
First Claim
1. A method performed by one or more devices, the method comprising:
identifying, by the one or more devices, a tuple, from text of a plurality of documents, using a first data pattern;
identifying, by the one or more devices, second data patterns, the first data pattern being different than each of the second data patterns;
determining, by the one or more devices, a quantity of the second data patterns that match the tuple; and
storing, by the one or more devices, the tuple in a data storage when the quantity of the second data patterns that match the tuple satisfies a particular threshold.
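The four steps of this claim can be illustrated with a minimal sketch. It assumes that a "data pattern" is a regular expression with named groups, that a tuple is a (title, author) pair, and that "satisfies a particular threshold" means "greater than or equal to". None of these choices comes from the claim itself, and the sample documents, patterns, and function names are hypothetical.

```python
import re

# Hypothetical sample corpus; none of these strings come from the patent.
documents = [
    "Foundation by Isaac Asimov is a classic.",
    "<li><b>Foundation</b> by Isaac Asimov</li>",
    "Isaac Asimov wrote Foundation in 1951.",
    "Dracula by Bram Stoker is a classic.",
]

# Assumed representation: a data pattern is a regex with "title" and "author" groups.
FIRST_PATTERN = re.compile(
    r"(?P<title>[A-Z]\w+) by (?P<author>[A-Z]\w+ [A-Z]\w+) is a classic"
)
SECOND_PATTERNS = [
    re.compile(r"<b>(?P<title>[A-Z]\w+)</b> by (?P<author>[A-Z]\w+ [A-Z]\w+)"),
    re.compile(r"(?P<author>[A-Z]\w+ [A-Z]\w+) wrote (?P<title>[A-Z]\w+)"),
]


def tuples_matched_by(pattern, docs):
    """Return every (title, author) tuple the pattern finds in the documents."""
    return {
        (m.group("title"), m.group("author"))
        for doc in docs
        for m in pattern.finditer(doc)
    }


def matching_pattern_count(candidate, patterns, docs):
    """Count the patterns that find the candidate tuple somewhere in the documents."""
    return sum(1 for p in patterns if candidate in tuples_matched_by(p, docs))


def extract(first_pattern, second_patterns, docs, threshold):
    """Store a tuple only when enough second patterns also match it."""
    storage = set()
    for candidate in tuples_matched_by(first_pattern, docs):
        quantity = matching_pattern_count(candidate, second_patterns, docs)
        if quantity >= threshold:  # assumed reading of "satisfies a particular threshold"
            storage.add(candidate)
    return storage


stored = extract(FIRST_PATTERN, SECOND_PATTERNS, documents, threshold=2)
print(stored)  # {('Foundation', 'Isaac Asimov')}
```

In this sketch the first pattern finds both tuples, but only the Foundation tuple is confirmed by two other patterns, so the Dracula tuple is not stored. The threshold thus acts as a filter against tuples that a single pattern produced by accident.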
Abstract
Techniques for extracting information from a database are provided. A database such as the Web is searched for occurrences of tuples of information. The occurrences of the tuples of information that were found in the database are analyzed to identify a pattern in which the tuples of information were stored. Additional tuples of information can then be extracted from the database utilizing the pattern. This process can be repeated with the additional tuples of information, if desired.
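The loop described in the abstract (find occurrences of known tuples, derive a pattern from them, extract more tuples, repeat) might be sketched as follows. This is a simplified illustration rather than the patented method: it assumes a pattern is just the text found between the two fields of a tuple, and that each field is a run of capitalized words. The corpus, the seed tuple, and all function names are hypothetical.

```python
import re

# Assumed shape of a field: one or more capitalized words.
FIELD = r"([A-Z][a-z]+(?: [A-Z][a-z]+)*)"


def find_middles(known_tuples, docs):
    """Search the documents for known tuples and collect the text between their fields."""
    middles = set()
    for first, second in known_tuples:
        occurrence = re.compile(re.escape(first) + r"(.{1,20}?)" + re.escape(second))
        for doc in docs:
            for m in occurrence.finditer(doc):
                middles.add(m.group(1))
    return middles


def extract_with_middles(middles, docs):
    """Turn each middle string into a pattern and extract every tuple it matches."""
    found = set()
    for middle in middles:
        pattern = re.compile(FIELD + re.escape(middle) + FIELD)
        for doc in docs:
            found.update(pattern.findall(doc))
    return found


def bootstrap(seed_tuples, docs, max_rounds=3):
    """Repeat the search/derive/extract cycle until no new tuples appear."""
    known = set(seed_tuples)
    for _ in range(max_rounds):
        new = extract_with_middles(find_middles(known, docs), docs) - known
        if not new:
            break
        known |= new
    return known


# Hypothetical corpus and seed tuple, for illustration only.
documents = [
    "Foundation, written by Isaac Asimov, appeared in 1951.",
    "Dracula, written by Bram Stoker, appeared in 1897.",
    "Emma, written by Jane Austen, appeared in 1815.",
]
result = bootstrap({("Foundation", "Isaac Asimov")}, documents)
print(sorted(result))
```

Starting from a single seed tuple, the sketch learns the middle string ", written by " and uses it to pick up the other two tuples. The second round finds nothing new, so the loop stops.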
21 Citations
20 Claims
1. A method performed by one or more devices, the method comprising:
identifying, by the one or more devices, a tuple, from text of a plurality of documents, using a first data pattern;
identifying, by the one or more devices, second data patterns, the first data pattern being different than each of the second data patterns;
determining, by the one or more devices, a quantity of the second data patterns that match the tuple; and
storing, by the one or more devices, the tuple in a data storage when the quantity of the second data patterns that match the tuple satisfies a particular threshold.
Dependent claims: 4, 5, 6, 7, 16, 17
2. A system comprising:
one or more processors; and
one or more memories including a plurality of instructions that, when executed by the one or more processors, cause the one or more processors to:
identify a tuple, from text of a plurality of documents, using a first data pattern;
identify second data patterns, the first data pattern being different than each of the second data patterns;
determine a quantity of the second data patterns that match the tuple; and
store the tuple in a data storage when the quantity of the second data patterns that match the tuple satisfies a particular threshold.
Dependent claims: 8, 9, 10, 11, 18, 19
3. A non-transitory computer-readable storage medium comprising:
one or more instructions which, when executed by at least one processor, cause the at least one processor to:
identify a tuple, from text of a plurality of documents, using a first data pattern;
identify second data patterns, the first data pattern being different than each of the second data patterns;
determine a quantity of the second data patterns that match the tuple; and
store the tuple in a data storage when the quantity of the second data patterns that match the tuple satisfies a particular threshold.
Dependent claims: 12, 13, 14, 15, 20
Specification