Supporting unified querying over autonomous unstructured and structured databases
Abstract
Methods, systems and computer products perform a cost estimate to determine an efficient approach to answer a query according to one of several unified query plans. One unified query plan involves querying an unstructured database, referencing a unified index, and probing a structured database based on matches discovered in the unified index. The results of the unstructured database query are used to look up entries in a unified index associated with the structured database. Then the structured database is probed by querying only the subset of the structured database gleaned from the unstructured database query.
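
The unstructured-first plan described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the in-memory lists standing in for the two repositories, the index layout (a dimensional value mapped to the ids of rows holding it), whitespace tokenization, and the function names `build_unified_index` and `unstructured_first`.

```python
def build_unified_index(rows):
    """Map each dimensional value (lowercased) to the ids of rows holding it.

    Hypothetical index layout, chosen only to keep the sketch small.
    """
    index = {}
    for row in rows:
        for column, value in row.items():
            if column != "id":
                index.setdefault(str(value).lower(), set()).add(row["id"])
    return index


def unstructured_first(docs, rows, index, text_keyword, struct_keyword):
    """Query the text first, then probe only the matching subset of rows."""
    # Step 1: query the unstructured repository with its keyword.
    hits = [d for d in docs if text_keyword.lower() in d.lower().split()]

    # Step 2: look up terms of the matching documents in the unified index.
    candidate_ids = set()
    for doc in hits:
        for term in doc.lower().split():
            candidate_ids |= index.get(term, set())

    # Step 3: probe only the candidate rows with the structured keyword.
    by_id = {row["id"]: row for row in rows}
    return [
        by_id[i]
        for i in sorted(candidate_ids)
        if struct_keyword.lower() in (str(v).lower() for v in by_id[i].values())
    ]


rows = [
    {"id": 1, "product": "laptop", "region": "east"},
    {"id": 2, "product": "laptop", "region": "west"},
    {"id": 3, "product": "phone", "region": "east"},
]
docs = [
    "customer complaint about battery on laptop",
    "phone screen praise",
]
index = build_unified_index(rows)
result = unstructured_first(docs, rows, index, "battery", "east")
```

In this toy data, row 3 also has the value "east" but is never probed, because no document matching "battery" mentions a term that the index links to it. That is the sense in which the index matches limit the second query.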
12 Claims
1. A computer-implemented method for dynamically querying structured and unstructured data repositories, the method comprising:
- maintaining, by a computer, a unified index associated with a structured data repository and an unstructured data repository;
- receiving, by the computer, a unified query of keywords including a first keyword for querying the structured data repository and a second keyword for querying the unstructured data repository;
- first querying, by the computer, one of the structured data repository and the unstructured data repository;
- if first querying the structured data repository, then, using the computer, matching the unified index to first querying results with the first keyword, and using unified index matches to limit second querying of the unstructured data repository with the second keyword; and
- if first querying the unstructured data repository, then, using the computer, matching the unified index to first querying results with the second keyword, and using unified index matches to limit second querying of the structured data repository with the first keyword,
wherein the first querying of the one of the structured data repository and the unstructured data repository is decided at query time using occurrence statistics for the keywords, thereby selecting an efficient approach to answer a query.
Dependent claims: 2, 3, 4, 5
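
Claim 1 states that the repository to query first is decided at query time using occurrence statistics for the keywords, but it does not state the decision rule. The sketch below assumes one plausible rule, namely querying first the repository in which its keyword occurs less often, so that fewer first-query results have to be matched against the unified index. The `stats` dictionary layout and the function name `choose_first_repository` are hypothetical.

```python
def choose_first_repository(stats, first_keyword, second_keyword):
    """Return "structured" or "unstructured" by comparing occurrence counts.

    Assumed rule (not from the claim): start with the side whose keyword
    is rarer, since its results restrict the second query more tightly.
    """
    structured_hits = stats["structured"].get(first_keyword, 0)
    unstructured_hits = stats["unstructured"].get(second_keyword, 0)
    if structured_hits <= unstructured_hits:
        return "structured"
    return "unstructured"


# Hypothetical occurrence statistics kept alongside the unified index.
stats = {
    "structured": {"east": 5000},
    "unstructured": {"battery": 12},
}
first = choose_first_repository(stats, "east", "battery")
```

Here "battery" occurs in far fewer places than "east", so under the assumed rule the unstructured repository is queried first.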
6. A non-transitory computer program storage device readable by a computer, tangibly embodying a program of instructions executable by the computer to perform a method for dynamically querying structured and unstructured data repositories, the method comprising:
- maintaining a unified index associated with a structured data repository and an unstructured data repository;
- receiving a unified query of keywords including a first keyword for querying the structured data repository and a second keyword for querying the unstructured data repository;
- first querying one of the structured data repository and the unstructured data repository with a keyword;
- if first querying the structured data repository, then matching the unified index to first querying results with the first keyword, and using unified index matches to limit second querying of the unstructured data repository with the second keyword; and
- if first querying the unstructured data repository, then matching the unified index to first querying results with the second keyword, and using unified index matches to limit second querying of the structured data repository with the first keyword,
wherein the first querying of the one of the structured data repository and the unstructured data repository is decided at query time using occurrence statistics for the keywords.
Dependent claims: 7, 8, 9
10. A non-transitory computer program storage device readable by a computer, tangibly embodying a program of instructions executable by the computer to perform a method for unified querying of an unstructured data repository and a structured data repository, the method comprising:
- maintaining a unified index associated with the structured data and the unstructured data repositories, wherein:
  - the unified index stores occurrence statistics for keywords associated with the structured data and the unstructured data repositories;
  - the structured data repository comprises table names and dimensional values of tables that include the keywords; and
  - the unstructured data repository comprises text that includes the keywords;
- receiving a unified query of keywords including a first keyword for querying the structured data repository and a second keyword for querying the unstructured data repository;
- performing a cost analysis to dynamically select an efficient plan to answer a unified query, wherein the cost analysis is based on the occurrence statistics for the keywords, wherein the plan comprises one of:
  - a first plan comprising a first querying of the structured data repository, then matching the unified index to first querying results with the first keyword, and using unified index matches to limit second querying of the unstructured data repository with the second keyword;
  - a second plan comprising a first querying of the unstructured data repository, then matching the unified index to first querying results with the second keyword, and using unified index matches to limit second querying of the structured data repository with the first keyword; and
  - a third plan comprising a first querying of the structured data repository and a second querying of the unstructured data repository.
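
The cost analysis over the three plans of claim 10 can be sketched as follows. The claim says only that the analysis is based on occurrence statistics, so the cost model is an assumption: a fixed cost per repository query, a per-hit cost for matching first-query results against the unified index and probing the other repository, and a cheaper per-hit cost for merging two independent result sets. The names `OccurrenceStats`, `plan_costs` and `select_plan`, and all constants, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OccurrenceStats:
    structured_hits: int    # occurrences of the first keyword in the structured repository
    unstructured_hits: int  # occurrences of the second keyword in the unstructured repository


def plan_costs(stats, query_cost=100.0, probe_cost=1.0, merge_cost=0.2):
    """Estimate a cost per plan. All constants are illustrative assumptions."""
    return {
        # First plan: query structured data, then index-limited text probes.
        "structured_first": query_cost + stats.structured_hits * probe_cost,
        # Second plan: query text, then index-limited structured probes.
        "unstructured_first": query_cost + stats.unstructured_hits * probe_cost,
        # Third plan: query both repositories and merge the results.
        "independent": 2 * query_cost
        + (stats.structured_hits + stats.unstructured_hits) * merge_cost,
    }


def select_plan(stats):
    """Pick the plan with the lowest estimated cost."""
    costs = plan_costs(stats)
    return min(costs, key=costs.get)


plan = select_plan(OccurrenceStats(structured_hits=10, unstructured_hits=5000))
```

Under these assumed constants, a rare keyword on one side favors starting with that side, while two frequent keywords favor the third plan, because per-hit probing becomes more expensive than two full queries plus a merge.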
11. A computer system configured for dynamically querying structured and unstructured data repositories, the computer system comprising:
- a processor configured to:
  - maintain a unified index associated with a structured data repository and an unstructured data repository;
  - receive a unified query of keywords including a first keyword for querying the structured data repository and a second keyword for querying the unstructured data repository;
  - first query one of the structured data repository and the unstructured data repository with a keyword;
  - if first querying the structured data repository, then matching the unified index to first querying results with the first keyword, and using unified index matches to limit second querying of the unstructured data repository with the second keyword; and
  - if first querying the unstructured data repository, then matching the unified index to first querying results with the second keyword, and using unified index matches to limit second querying of the structured data repository with the first keyword,
  wherein the first query of the one of the structured data repository and the unstructured data repository is decided at query time using occurrence statistics for the keywords; and
- a memory configured to store the occurrence statistics for the keywords, table names and dimensional values of tables that include the first keyword, and unstructured text that includes the second keyword.
12. A computer-implemented method for unified querying of an unstructured data repository and a structured data repository, the method comprising:
- maintaining, by a computer, a unified index associated with the structured data and the unstructured data repositories, wherein:
  - the unified index stores occurrence statistics for keywords associated with the structured data and the unstructured data repositories;
  - the structured data repository comprises table names and dimensional values of tables that include the keywords; and
  - the unstructured data repository comprises text that includes the keywords;
- receiving, by the computer, a unified query of keywords including a first keyword for querying the structured data repository and a second keyword for querying the unstructured data repository;
- performing, by the computer, a cost analysis to dynamically select an efficient plan to answer a unified query, wherein the cost analysis is based on the occurrence statistics for the keywords, wherein the plan comprises one of:
  - a first plan comprising a first querying of the structured data repository, then matching the unified index to first querying results with the first keyword, and using unified index matches to limit second querying of the unstructured data repository with the second keyword;
  - a second plan comprising a first querying of the unstructured data repository, then matching the unified index to first querying results with the second keyword, and using unified index matches to limit second querying of the structured data repository with the first keyword; and
  - a third plan comprising a first querying of the structured data repository and a second querying of the unstructured data repository.
Specification