Database operation using metadata of data sources
First Claim
1. A system comprising:
computer-readable media having thereon a plurality of modules; and
at least one processing unit operably coupled with the computer-readable media, configured to access data from one or more data sources, and adapted to execute one or more modules of the plurality of modules comprising:
at least one module of a construction engine configured to:
determine a dataflow corresponding to a job specification associated with one or more data sources;
determine a data manipulation to operate on the job specification based on the dataflow;
determine a data-processing instruction to perform functions of the data manipulation based on the data manipulation and at least some metadata of the one or more data sources; and
determine a query by assembling the data-processing instruction based at least in part on the job specification;
at least one module of an execution engine configured to:
execute the query by accessing the data from at least one of the one or more data sources to provide query results; and
at least one module of an ingestion engine configured to:
retrieve a data record;
determine metadata corresponding to the data record by extracting attribute values or relationships from the data, the metadata including a data-source identifier indicating at least one of the one or more data sources; and
store the metadata in a metadata repository in association with the data-source identifier.
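As an illustration only, the construction and execution steps recited in this claim might be sketched in Python as follows. Nothing in the sketch comes from the patent itself: the layout of the job specification, the `METADATA` mapping, the function names `build_query` and `execute_query`, and the choice of SQLite are all assumptions made to keep the example small.

```python
import sqlite3

# Hypothetical metadata repository: data-source identifier -> physical table
# and a mapping from logical field names to physical column names.
METADATA = {
    "sales_db": {
        "table": "sales",
        "columns": {"region": "region_code", "amount": "amount_usd"},
    },
}


def build_query(job_spec, metadata):
    """Construction-engine sketch: job spec -> dataflow -> instructions -> query."""
    source = metadata[job_spec["source"]]
    cols = source["columns"]
    # Dataflow: the ordered manipulations implied by the job specification.
    dataflow = [step for step in ("filter", "select") if step in job_spec]
    select_clause, where_clause, params = "*", "", []
    for step in dataflow:
        if step == "select":
            # Data-processing instruction: logical names resolved via metadata.
            select_clause = ", ".join(cols[name] for name in job_spec["select"])
        elif step == "filter":
            field, value = job_spec["filter"]
            where_clause = f" WHERE {cols[field]} = ?"
            params.append(value)
    # Query: the instructions assembled into a single statement.
    return f"SELECT {select_clause} FROM {source['table']}{where_clause}", params


def execute_query(conn, query, params):
    """Execution-engine sketch: run the query against the data source."""
    return conn.execute(query, params).fetchall()
```

In this sketch the metadata is what lets a job specification written against logical field names (`region`, `amount`) be turned into a query over the physical columns of the data source.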
Abstract
In some examples, a computing device determines a data manipulation from a job specification. The device determines a corresponding data-processing instruction using data-source metadata, and determines and executes a corresponding query. In some examples, a device receives search keys. The device searches data-source metadata using the search keys. The device weights a first data source based on producer-consumer relationships between data sources, and ranks the first data source using the weight. In some examples, a device determines structural and content information of a data record. The device determines a data-source identifier from the structural information and stores the content information with the data-source identifier in a database. In some examples, via a user interface, a device receives a job specification and annotation data. The device stores the spec and the annotation data in a metadata repository.
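The search example in the abstract (matching search keys against data-source metadata, then weighting a source by its producer-consumer relationships) could look roughly like the Python sketch below. The abstract does not say how the weight is computed, so the rule used here, one extra unit of weight per downstream consumer, is purely an assumption, as are the function name `rank_sources` and the shapes of its arguments.

```python
def rank_sources(search_keys, metadata, consumers):
    """Return matching data-source ids, most heavily weighted first.

    metadata:  hypothetical mapping of data-source id -> list of metadata tags
    consumers: hypothetical mapping of data-source id -> ids of the sources
               that consume its output
    """
    keys = {key.lower() for key in search_keys}
    scored = []
    for source_id, tags in metadata.items():
        matches = len(keys & {tag.lower() for tag in tags})
        if matches == 0:
            continue  # the source's metadata matches none of the search keys
        # Assumed weighting: each downstream consumer adds one to the weight.
        weight = 1 + len(consumers.get(source_id, ()))
        scored.append((matches * weight, source_id))
    # Highest score first; ties are broken alphabetically by source id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [source_id for _, source_id in scored]
```

Under this assumed rule, a source that feeds several other sources ranks above an equally well-matched source that nothing consumes.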
15 Claims
6. A computer-implemented method, comprising:
determining a dataflow corresponding to a job specification associated with one or more data sources;
determining a data manipulation to operate on the job specification based on the dataflow;
determining a data-processing instruction to perform functions of the data manipulation based on the data manipulation and at least some metadata of the one or more data sources; and
determining a query by assembling the data-processing instruction based at least in part on the job specification;
executing the query by accessing data from at least one of the one or more data sources to provide query results; and
retrieving a data record;
determining metadata corresponding to the data record by extracting attribute values or relationships from the data, the metadata including a data-source identifier indicating at least one of the one or more data sources; and
storing the metadata in a metadata repository in association with the data-source identifier.
11. A non-transitory computer-readable medium comprising instructions executable by a processor, comprising:
instructions to determine a dataflow corresponding to a job specification associated with one or more data sources;
instructions to determine a data manipulation to operate on the job specification based on the dataflow;
instructions to determine a data-processing instruction to perform functions of the data manipulation based on the data manipulation and at least some metadata of the one or more data sources; and
instructions to determine a query by assembling the data-processing instruction based at least in part on the job specification;
instructions to execute the query by accessing data from at least one of the one or more data sources to provide query results; and
instructions to retrieve a data record;
instructions to determine metadata corresponding to the data record by extracting attribute values or relationships from the data, the metadata including a data-source identifier indicating at least one of the one or more data sources; and
instructions to store the metadata in a metadata repository in association with the data-source identifier.
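The ingestion steps that close each independent claim (retrieve a data record, extract metadata that includes a data-source identifier, store it in association with that identifier) can be pictured with the short Python sketch below. It is an illustration under stated assumptions, not the patented implementation: the JSON record format, the `extract_metadata` function, and the in-memory `MetadataRepository` class are all hypothetical.

```python
import json


def extract_metadata(raw_record, source_id):
    """Extract attribute values from a record and tag them with its source.

    Assumes the record is a flat JSON object; source_id is supplied by the
    caller rather than inferred from the record.
    """
    record = json.loads(raw_record)
    return {
        "data_source_id": source_id,
        "attributes": dict(record),  # attribute name -> extracted value
    }


class MetadataRepository:
    """Hypothetical in-memory repository keyed by data-source identifier."""

    def __init__(self):
        self._by_source = {}

    def store(self, metadata):
        # Store the metadata in association with its data-source identifier.
        source_id = metadata["data_source_id"]
        self._by_source.setdefault(source_id, []).append(metadata)

    def lookup(self, source_id):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._by_source.get(source_id, []))
```

Keying the repository by data-source identifier is what would let a construction step later look up the metadata for the sources named in a job specification.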
Specification