Method and system for indexing and searching databases
Abstract
A search system generates an index for databases by generatively sampling the databases and uses that index to identify and formulate queries for searching the databases. The generated index is referred to as a domain-attribute index and contains a domain-level index and site-level indexes. A site-level index for a database maps site attributes to distinct attribute values within the database. The domain-level index for a domain maps attribute values to database and site attribute pairs that contain those attribute values. To generate a site-level index for a database within a certain domain, the search system starts out with an initial set of the sample data for that domain. The search system generates sampling queries based on the sample data and submits the sampling queries to a database. The search system updates the site-level index based on the sampling results and uses the results to generate more sampling queries.
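The index layout and the feedback loop described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the `query_db` callable, the record shape (a dict from site attribute to value), the `max_queries` cutoff, and the toy `BOOKS` data are all hypothetical.

```python
from collections import defaultdict, deque

def sample_site(query_db, seed_pairs, max_queries=100):
    """Build a site-level index: site attribute -> set of distinct values.

    query_db(attribute, value) is a hypothetical callable returning a list of
    records, each a dict mapping site attribute -> value.
    """
    site_index = defaultdict(set)
    pending = deque(seed_pairs)   # initial sample data for the domain
    seen = set(seed_pairs)
    issued = 0
    while pending and issued < max_queries:
        attribute, value = pending.popleft()
        issued += 1
        for record in query_db(attribute, value):
            for attr, val in record.items():
                site_index[attr].add(val)
                # Feed newly seen values back in as further sampling queries.
                if (attr, val) not in seen:
                    seen.add((attr, val))
                    pending.append((attr, val))
    return dict(site_index)

def build_domain_index(site_indexes):
    """Invert site-level indexes: value -> set of (database, site attribute)."""
    domain_index = defaultdict(set)
    for database, site_index in site_indexes.items():
        for attr, values in site_index.items():
            for val in values:
                domain_index[val].add((database, attr))
    return dict(domain_index)

# Toy in-memory "database" standing in for a web query form (assumed data).
BOOKS = [
    {"author": "Knuth", "publisher": "Addison-Wesley"},
    {"author": "Lamport", "publisher": "Addison-Wesley"},
]

def query_books(attribute, value):
    return [r for r in BOOKS if r.get(attribute) == value]

site_index = sample_site(query_books, [("author", "Knuth")])
domain_index = build_domain_index({"books-site": site_index})
```

Starting from the single seed pair, the sampler reaches the second author only through the shared publisher value, which is the point of feeding results back into the queue.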
27 Claims
1. A method in a computer system for searching web databases, the method comprising:
providing of mapping of attribute and attribute value pairs to web databases that contain the attribute and attribute value pairs, the mapping being generated by sampling the web databases;
receiving an unstructured query having terms;
identifying, from the unstructured query, terms that are attribute terms that correspond to an attribute of a web database;
identifying, from the unstructured query, terms that are attribute value terms that correspond to an attribute value of a web database;
for each pair of identified attribute terms and identified attribute value terms, formulating a query with an attribute corresponding to the identified attribute term set to the identified attribute value term; and
when the mapping indicates that a web database has an attribute and an attribute value pair that corresponds to the pair of the identified attribute term and the identified attribute value term used to formulate the query, submitting the formulated query to the web database.
Dependent claims: 2, 3, 4, 5
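The steps of claim 1 might be sketched as below. The sketch assumes single-word terms split on whitespace, a hypothetical `attribute_terms` vocabulary that maps query words to attribute names, and a `mapping` keyed by (attribute, value) that was built earlier by sampling. None of these names or simplifications come from the patent.

```python
def route_query(text, attribute_terms, mapping):
    """Return (database, attribute, value) triples to submit.

    attribute_terms: query term -> attribute name (hypothetical vocabulary).
    mapping: (attribute, value) -> set of databases, built earlier by sampling.
    """
    terms = text.lower().split()
    known_values = {value for (_, value) in mapping}
    # Terms that name an attribute, and terms that are known attribute values.
    attrs = [attribute_terms[t] for t in terms if t in attribute_terms]
    values = [t for t in terms if t in known_values]
    submissions = []
    for attr in attrs:
        for value in values:
            # Formulate attribute=value, then submit only where the mapping
            # says the database holds that pair.
            for database in sorted(mapping.get((attr, value), ())):
                submissions.append((database, attr, value))
    return submissions

mapping = {
    ("author", "knuth"): {"books-a", "books-b"},
    ("title", "algorithms"): {"books-a"},
}
attribute_terms = {"author": "author", "by": "author", "title": "title"}
plan = route_query("author knuth title algorithms", attribute_terms, mapping)
```

Pairs such as (author, algorithms) are formulated but never submitted, because the mapping lists no database containing them.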
6. A method in a computer system for searching web databases, the method comprising:
receiving pairs of attributes and attribute values;
formulating a query for each received pair;
identifying web databases to submit the formulated queries using a domain-attribute index, the domain-attribute index providing, for a plurality of attribute values, each contained within at least one attribute of one entry of one web database, a mapping from the attribute value to a web database and attribute of that web database that contains that attribute value in that attribute of an entry of that web database, the mapping being generated by sampling the web databases by submitting queries to the web databases with various combinations of attribute and attribute value pairs; and
submitting the formulated queries to the identified web databases.
Dependent claims: 7
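Unlike a pair-keyed mapping, the domain-attribute index in claim 6 is keyed by attribute value. A lookup in that direction could look like the sketch below. It assumes attribute names are already normalized across databases, which the claim does not state, and the function name and sample data are hypothetical.

```python
def identify_databases(pairs, domain_index):
    """Map each (attribute, value) pair to the databases that should receive it.

    domain_index: value -> set of (database, attribute) found by sampling.
    Assumes attribute names are already normalized across databases.
    """
    targets = {}
    for attribute, value in pairs:
        hits = domain_index.get(value, set())
        # Keep only databases that hold the value under the requested attribute.
        targets[(attribute, value)] = sorted(
            database for database, site_attr in hits if site_attr == attribute
        )
    return targets

domain_index = {
    "seattle": {("jobs-a", "city"), ("jobs-b", "city"), ("jobs-b", "company")},
}
targets = identify_databases([("city", "seattle"), ("city", "paris")], domain_index)
```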
8. A method in a computer system for sampling databases within a domain, the method comprising:
providing sets of attribute values for attributes of the domain (See e.g. col. 1, lines 58–59, ‘means for storing a value based index of selected attributes’); and
for each of a plurality of databases to sample, for each of a plurality of attribute and attribute value pairs selected from the provided sets, submitting a query to the database that queries the selected attribute of the pair with the attribute value of the pair, wherein the submitted query generates a result; and
generating a mapping between the attributes of the database and attribute values when the result indicates that the database contains the attribute and the attribute value pair of the submitted query.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17
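A single-pass version of the sampling in claim 8 could be sketched as follows. Each database is represented by a hypothetical `submit(attribute, value)` callable that stands in for a web query form, and a non-empty result is taken to mean the database contains the pair. Both choices are assumptions made for illustration.

```python
def sample_databases(databases, value_sets):
    """Probe each database with attribute=value queries and record the hits.

    databases: name -> callable(attribute, value) returning a list of results
               (a stand-in for submitting a query to a web form).
    value_sets: attribute -> iterable of candidate values for the domain.
    """
    mapping = {}
    for name, submit in databases.items():
        for attribute, values in value_sets.items():
            for value in values:
                if submit(attribute, value):  # non-empty result: pair is present
                    mapping.setdefault(name, {}).setdefault(attribute, set()).add(value)
    return mapping

# Toy data standing in for two car-listing sites (assumed, not from the patent).
cars_a = [{"make": "ford", "color": "red"}]
cars_b = [{"make": "honda", "color": "red"}]

def make_submit(rows):
    return lambda attribute, value: [r for r in rows if r.get(attribute) == value]

mapping = sample_databases(
    {"cars-a": make_submit(cars_a), "cars-b": make_submit(cars_b)},
    {"make": ["ford", "honda"], "color": ["red", "blue"]},
)
```

The resulting `mapping` records, per database, which attribute values were confirmed for which attributes. Values that never return a result, such as "blue" here, are left out.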
18. A computer system for generating a mapping between attribute values and databases having attributes with the attribute values, the databases being in a domain, comprising:
a component that provides attributes and attribute values for the domain;
a component that, for each of a plurality of pairs of provided attributes and provided attribute values, submits to each database a query for querying the attribute of the pair for the attribute value of the pair and that receives a result of each submitted query; and
a component that generates from the received results a mapping that indicates which databases contain which attribute values for which attributes.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26, 27
Specification