Systems and methods for massive structured data management over cloud aware distributed file system
Abstract
Methods and arrangements for accommodating a query, directing the query to datasets, creating partitions and partitioning the datasets, and returning a response to the query, the response being structured in accordance with the created partitions.
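The flow summarized in the abstract can be illustrated with a small sketch. This is not the patented implementation: the sample rows, the in-memory dictionary standing in for the database atop the distributed file system, the toy query grammar, and the `PARTITION (...)` rewrite syntax are all assumptions made for illustration, as are the function names.

```python
import re
from collections import defaultdict

# Hypothetical rows standing in for data read from files in a distributed file system.
ROWS = [
    {"region": "eu", "user": "ana", "amount": 10},
    {"region": "us", "user": "bob", "amount": 25},
    {"region": "eu", "user": "cy", "amount": 7},
    {"region": "ap", "user": "dee", "amount": 3},
]

def create_partitions(rows, key):
    """Group rows into partitions named '<key>=<value>'."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[f"{key}={row[key]}"].append(row)
    return dict(partitions)

def build_index(partitions):
    """Map each partition-key value to its partition id.

    A plain dict is used here as a stand-in for the database index (assumption).
    """
    return {pid.split("=", 1)[1]: pid for pid in partitions}

def parse_query(query):
    """Parse a toy query of the form: SELECT * FROM t WHERE col = 'value'."""
    m = re.fullmatch(r"SELECT \* FROM (\w+) WHERE (\w+) = '([^']*)'", query.strip())
    if m is None:
        raise ValueError(f"unsupported query: {query!r}")
    table, column, value = m.groups()
    return {"table": table, "column": column, "value": value}

def rewrite_query(parsed, partition_ids):
    """Attach partition information to the query (illustrative syntax only)."""
    parts = ", ".join(partition_ids)
    return (f"SELECT * FROM {parsed['table']} PARTITION ({parts}) "
            f"WHERE {parsed['column']} = '{parsed['value']}'")

def answer(query, partitions, index, key):
    """Look up indexed values, identify partitions, rewrite, and respond."""
    parsed = parse_query(query)
    if parsed["column"] == key:
        # Predicate is on the partition key: the index names the one partition to read.
        pid = index.get(parsed["value"])
        ids = [pid] if pid is not None else []
    else:
        # No index on this column: every partition has to be scanned.
        ids = sorted(partitions)
    rewritten = rewrite_query(parsed, ids)
    # The response is grouped by partition id, mirroring the created partitions.
    response = {}
    for pid in ids:
        matches = [r for r in partitions[pid]
                   if str(r[parsed["column"]]) == parsed["value"]]
        if matches:
            response[pid] = matches
    return rewritten, response
```

Calling `answer("SELECT * FROM sales WHERE region = 'eu'", partitions, index, "region")` after building `partitions` and `index` from `ROWS` touches only the `region=eu` partition, whereas a predicate on a non-indexed column falls back to scanning all partitions.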
16 Claims
1. A method comprising:
accommodating a query;
directing the query to datasets which include data from files in a distributed file system;
creating partitions of the datasets, wherein said creating comprises creating smart replicas, and wherein the smart replicas comprise reordered content of the datasets;
indexing the partitions via mapping partition keys to a database atop the distributed file system;
parsing the query;
looking up indexed values based on the parsed query;
identifying partitions based on said looking up of indexed values;
rewriting the query, with partition information, into the database atop the distributed file system; and
returning a response to the query, wherein the response is structured based on the created partitions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
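Claim 1 states only that smart replicas "comprise reordered content of the datasets". One possible reading, sketched below under that assumption, is that each replica holds the same rows sorted on a different column, so that a predicate on that column can be answered by binary search instead of a full scan. The sample rows and the names `make_smart_replicas` and `lookup` are hypothetical and are not taken from the patent.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rows standing in for a dataset stored in a distributed file system.
ROWS = [
    {"region": "eu", "user": "ana", "amount": 10},
    {"region": "us", "user": "bob", "amount": 25},
    {"region": "eu", "user": "cy", "amount": 7},
    {"region": "ap", "user": "dee", "amount": 3},
]

def make_smart_replicas(rows, sort_keys):
    """Build one replica per key: the same rows, reordered by that key.

    Each replica is stored as (sorted key values, sorted rows) so that
    lookups can binary-search the key list.
    """
    replicas = {}
    for key in sort_keys:
        ordered = sorted(rows, key=lambda r: r[key])
        replicas[key] = ([r[key] for r in ordered], ordered)
    return replicas

def lookup(replicas, column, value):
    """Answer an equality predicate, preferring the replica ordered on `column`."""
    if column in replicas:
        keys, ordered = replicas[column]
        lo, hi = bisect_left(keys, value), bisect_right(keys, value)
        return ordered[lo:hi]
    # No replica is ordered on this column: scan any one replica in full.
    # Assumes at least one replica exists.
    _, ordered = next(iter(replicas.values()))
    return [r for r in ordered if r[column] == value]
```

With replicas built on `region` and `user`, a lookup on either column reads only a contiguous slice of the matching replica, while a lookup on `amount` falls back to a linear scan.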
9. An apparatus comprising:
one or more processors; and
a computer readable storage medium having computer readable program code embodied therewith and executable by the one or more processors, the computer readable program code being configured to:
accommodate a query;
direct the query to datasets which include data from files in a distributed file system;
create partitions of the datasets, wherein to create partitions comprises creating smart replicas, and wherein the smart replicas comprise reordered content of the datasets;
index the partitions via mapping partition keys to a database atop the distributed file system;
parse the query;
look up indexed values based on the parsed query;
identify partitions based on the looking up of indexed values;
rewrite the query, with partition information, into the database atop distributed file system; and
return a response to the query, wherein the response is structured based on the created partitions.
10. A computer program product embedded in a computer readable storage medium embodied with computer readable program code which, when executed, causes a computing device to perform operations, the computer readable program code being configured to:
accommodate a query;
direct the query to datasets which include data from files in a distributed file system;
create partitions of the datasets, wherein to create partitions comprises creating smart replicas, and wherein the smart replicas comprise reordered content of the datasets;
index the partitions via mapping partition keys to a database atop the distributed file system;
parse the query;
look up indexed values based on the parsed query;
identify partitions based on the looking up of indexed values;
rewrite the query, with partition information, into the database atop distributed file system; and
return a response to the query, wherein the response is structured based on the created partitions.
Dependent claims: 11, 12, 13, 14, 15, 16
Specification