Information extraction from a database
Abstract
Techniques for extracting information from a database are provided. A database such as the Web is searched for occurrences of tuples of information. The occurrences of the tuples of information that were found in the database are analyzed to identify a pattern in which the tuples of information were stored. Additional tuples of information can then be extracted from the database utilizing the pattern. This process can be repeated with the additional tuples of information, if desired.
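The loop described in the abstract can be illustrated with a minimal sketch. It is not the implementation disclosed in the patent. It assumes two-field tuples, documents held as in-memory strings, and patterns formed from the literal text shared before, between, and after the fields. The names `find_occurrences`, `build_patterns`, `extract`, `CONTEXT`, `MAX_GAP`, and the sample `DOCS` are all hypothetical.

```python
import os
import re
from collections import defaultdict

CONTEXT = 10   # assumption: characters of surrounding text kept per occurrence
MAX_GAP = 40   # assumption: maximum characters allowed between the two fields


def find_occurrences(docs, tup):
    """Yield (prefix, middle, suffix) wherever both fields of `tup` appear in order."""
    first, second = tup
    rx = re.compile(re.escape(first) + r"(.{1,%d}?)" % MAX_GAP + re.escape(second))
    for doc in docs:
        for m in rx.finditer(doc):
            yield (doc[max(0, m.start() - CONTEXT):m.start()],
                   m.group(1),
                   doc[m.end():m.end() + CONTEXT])


def common_suffix(strings):
    """Longest string that every input ends with."""
    return os.path.commonprefix([s[::-1] for s in strings])[::-1]


def build_patterns(occurrences):
    """Group occurrences by the text between the fields; build one regex per group."""
    by_middle = defaultdict(list)
    for prefix, middle, suffix in occurrences:
        by_middle[middle].append((prefix, suffix))
    patterns = []
    for middle, contexts in by_middle.items():
        prefix = common_suffix([p for p, _ in contexts])
        suffix = os.path.commonprefix([s for _, s in contexts])
        if prefix and suffix:  # skip patterns with no anchoring text on a side
            # The suffix is a lookahead so that adjacent matches can share it.
            patterns.append(re.compile(
                re.escape(prefix) + r"(.+?)" + re.escape(middle)
                + r"(.+?)(?=" + re.escape(suffix) + r")"))
    return patterns


def extract(docs, seeds, max_rounds=3):
    """Repeat find -> analyze -> search until no new tuples turn up."""
    known = set(seeds)
    for _ in range(max_rounds):
        occurrences = [o for t in known for o in find_occurrences(docs, t)]
        found = {m.groups()
                 for p in build_patterns(occurrences)
                 for d in docs
                 for m in p.finditer(d)}
        if found <= known:
            break
        known |= found
    return known


# Hypothetical sample "database": two HTML-like pages and one plain-text list.
DOCS = [
    "<ul><li><b>Foundation</b> by Isaac Asimov</li></ul>",
    "Books:\n<li><b>Foundation</b> by Isaac Asimov</li>\n"
    "<li><b>Dune</b> by Frank Herbert</li>\n"
    "<li><b>Emma</b> by Jane Austen</li>\n",
    "* Dune, written by Frank Herbert\n* Emma, written by Jane Austen\n"
    "* Neuromancer, written by William Gibson\n* More titles to come\n",
]

if __name__ == "__main__":
    for title, author in sorted(extract(DOCS, [("Foundation", "Isaac Asimov")])):
        print(title, "/", author)
```

With the single seed tuple, the first round learns the HTML-style pattern and picks up two more tuples. The second round uses those tuples to learn the plain-text pattern and finds a fourth. In this sketch a pattern is only as general as its supporting occurrences: a pattern built from a single occurrence keeps the full surrounding context and tends to be too specific to match anything else.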
19 Claims
1. A computer-implemented method comprising:
    receiving, from a user, a first example of target information, where the first example includes a first tuple that corresponds to the target information in documents stored in a database, the first tuple including a plurality of fields;
    finding ones of the documents in the database that contain the first tuple;
    analyzing the ones of the documents in the database to recognize a pattern, in the ones of the documents, that includes the first tuple and at least one of text that precedes the plurality of fields of the first tuple, text that occurs between at least two of the plurality of fields of the first tuple, or text that follows the plurality of fields of the first tuple; and
    automatically searching the database for at least a second tuple that matches the pattern, where the at least a second tuple is a second example of the target information and differs from the first tuple and the pattern.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9

10. A computer-readable storage device including instructions for execution by a processor, the instructions comprising:
    instructions to receive, from a user, a first example of target information, where the target information includes a tuple that corresponds to the target information in documents stored in a database, the tuple including a plurality of fields;
    instructions to find ones of the documents in the database that contain the tuple;
    instructions to analyze the ones of the documents to recognize a pattern, in the ones of the documents, that includes the tuple and at least one of text that precedes the plurality of fields of the tuple, text that occurs between two fields of the tuple, or text that follows the plurality of fields of the tuple; and
    instructions to automatically search the database for one or more other tuples that match the pattern, where the one or more other tuples are other examples of the target information and differ from each other, the tuple, and the pattern.
Dependent claims: 11, 12, 13

14. A computing device comprising:
    a memory to store instructions; and
    a processor to execute the instructions to:
        receive, from a user, a set of examples of a first type of information, where the set of the examples includes one or more tuples that correspond to the first type of information in documents stored in a database, each of the one or more tuples including a plurality of fields;
        find the documents in the database that contain the one or more tuples;
        analyze the documents to identify a plurality of patterns of at least one of text that precedes the plurality of fields of the one or more tuples, text that occurs between two fields of the one or more tuples, or text that follows the plurality of fields of the one or more tuples; and
        automatically search the database for at least one other tuple that matches one of the patterns, where the at least one other tuple is another example of the first type of information and the at least one other tuple differs from the set of examples and the patterns.
Dependent claims: 15, 16

17. A computer-implemented method comprising:
    receiving, from a user, a first example of target information, where the first example includes a first tuple that corresponds to the target information in documents stored in a database, the first tuple including a plurality of fields;
    finding ones of the documents in the database that contain the first tuple;
    extracting occurrences of the first tuple from the ones of the documents, each occurrence including the first tuple and at least one of text preceding the plurality of fields of the first tuple, text occurring between two of the plurality of fields of the first tuple, or text following the plurality of fields of the first tuple;
    analyzing the extracted occurrences to recognize a pattern, in the ones of the documents, of the first tuple and the at least one of the text preceding the plurality of fields of the first tuple, the text occurring between two fields of the first tuple, or the text following the plurality of fields of the first tuple; and
    automatically searching the database for at least a second tuple that matches the pattern, where the at least a second tuple is a second example of the target information and differs from the first tuple and the pattern.
Dependent claims: 18, 19

Specification