Multi-range and runtime pruning
Abstract
A system, apparatus, and method for managing data storage and data access, including querying data and filtering value ranges with Bloom filters that use only a constant amount of computer memory and are built from a first consumption of a relation.
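The constant-memory property mentioned in the abstract can be illustrated with a minimal Bloom filter sketch. This is not the patented implementation: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the use of SHA-256 for hashing are all assumptions made for illustration. The point is only that the bit array is sized once, at construction, and does not grow as keys are added.

```python
import hashlib


class BloomFilter:
    """Illustrative fixed-size Bloom filter (hypothetical, not from the patent)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # Memory is allocated once here and never grows afterwards.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


bloom = BloomFilter(num_bits=1024, num_hashes=3)
build_keys = [10, 20, 30]
for k in build_keys:
    bloom.add(k)

print(len(bloom.bits))                                  # 128 bytes, fixed at construction
print(all(bloom.might_contain(k) for k in build_keys))  # True: no false negatives
```

A Bloom filter can report false positives but never false negatives, so a filter of this kind may let some non-matching rows through but does not drop rows that match.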
26 Claims
1. A method comprising:
- storing a plurality of metadata items in a metadata store that is separate from a storage platform that stores a plurality of files, each file of the plurality of files comprising a block of database data;
- determining a join operation to be performed on two or more files of the plurality of files, the join operation comprising a predicate indicating a first value;
- scanning the plurality of metadata items without accessing the database data to identify the two or more files that are necessary to complete the join operation, wherein:
  - each metadata item corresponds to and comprises information about a file of the plurality of files; and
  - each metadata item indicates a minimum/maximum value range for its corresponding file;
- identifying, from the plurality of metadata items, a first group of files whose minimum/maximum value ranges do not encompass the first value;
- pruning the first group of files;
- consuming, by a build operator of the join operation, database data in at least one file of the two or more files that are necessary to complete the join operation;
- generating a vector summarizing the database data consumed by the build operator;
- broadcasting the vector to at least one probe operator of the join operation; and
- filtering rows from the one or more files that are necessary to complete the join operation based on the vector.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9

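The scanning, identifying, and pruning steps of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: `FileMetadata`, `prune_files`, the file names, and the integer value ranges are all hypothetical. The sketch only reads metadata and never opens the files themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical metadata item: one per file, holding its min/max range."""
    file_name: str
    min_value: int
    max_value: int


def prune_files(metadata_items, predicate_value):
    """Split files into those kept and those pruned, using metadata only."""
    kept, pruned = [], []
    for item in metadata_items:
        # A file can contain the value only if its range encompasses it.
        if item.min_value <= predicate_value <= item.max_value:
            kept.append(item.file_name)
        else:
            pruned.append(item.file_name)
    return kept, pruned


# Assumed example metadata store; ranges may overlap between files.
metadata_store = [
    FileMetadata("file_a", 1, 100),
    FileMetadata("file_b", 101, 200),
    FileMetadata("file_c", 150, 300),
]

kept, pruned = prune_files(metadata_store, predicate_value=175)
print(kept)    # ['file_b', 'file_c']
print(pruned)  # ['file_a']
```

Because the ranges of `file_b` and `file_c` both encompass 175, neither can be pruned from metadata alone; only `file_a` is excluded before any database data is read.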
10. A system comprising:
- a memory to store one or more processing instructions;
- a processing device, operatively coupled to the memory, to:
  - store a plurality of metadata items in a metadata store that is separate from a storage platform that stores a plurality of files, each file of the plurality of files comprising a block of database data;
  - determine a join operation to be performed on two or more files of the plurality of files, the join operation comprising a predicate indicating a first value;
  - scan the plurality of metadata items without accessing the database data to identify the two or more files that are necessary to complete the join operation, wherein:
    - each metadata item corresponds to and comprises information about a file of the plurality of files; and
    - each metadata item corresponding metadata for a file indicates a minimum/maximum value range for its corresponding file;
  - prune the first group of files;
  - consume, by a build operator of the join operation, database data in at least one file of the one or more files that are necessary to complete the join operation;
  - generate a vector summarizing the database data consumed by the build operator;
  - broadcast the vector to at least one probe operator of the join operation; and
  - filter rows from the one or more files that are necessary to complete the join operation based on the vector.
Dependent claims: 11, 12, 13, 14, 15, 16, 17

18. Non-transitory computer readable storage media storing instructions that, when executed by one or more processors, cause the one or more processors to:
- store a plurality of metadata items in a metadata store that is separate from a storage platform that stores a plurality of files, each file of the plurality of files comprising a block of database data;
- determine a join operation to be performed on two or more files of the plurality of files, the join operation comprising a predicate indicating a first value;
- scan the plurality of metadata items without accessing the database data to identify the two or more files that are necessary to complete the join operation, wherein:
  - each metadata item corresponds to and comprises information about a file of the plurality of files; and
  - each metadata item indicates a minimum/maximum value range for its corresponding file;
- identify, from the plurality of metadata items, a first group of files whose minimum/maximum value ranges do not encompass the first value;
- prune the first group of files;
- consume, by a build operator of the join operation, database data in at least one file of the two or more files that are necessary to complete the join operation;
- generate a vector summarizing the database data consumed by the build operator;
- broadcast the vector to at least one probe operator of the join operation; and
- filter rows from the one or more files that are necessary to complete the join operation based on the vector.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26

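The consume, generate, broadcast, and filter steps recited in the independent claims can be sketched roughly as below. The claims do not specify the form of the vector, so this sketch assumes a fixed-length bit vector indexed by a single hash of the join key. The function names `build_vector` and `filter_probe_rows`, the column name `id`, and the sample rows are hypothetical, and "broadcasting" is modeled simply by handing the same vector to each probe partition.

```python
def build_vector(build_rows, key, num_bits=64):
    """Build side: consume rows and summarize their join keys as a bit vector."""
    vector = [False] * num_bits
    for row in build_rows:
        # Integer keys are assumed here; small ints hash to themselves in CPython.
        vector[hash(row[key]) % num_bits] = True
    return vector


def filter_probe_rows(probe_rows, key, vector):
    """Probe side: drop rows whose join key cannot match any build-side row."""
    num_bits = len(vector)
    return [row for row in probe_rows if vector[hash(row[key]) % num_bits]]


# Assumed build-side data consumed by the build operator.
build_rows = [{"id": 3, "name": "a"}, {"id": 7, "name": "b"}]
vector = build_vector(build_rows, "id")

# Two probe operators, each receiving the same (broadcast) vector.
probe_partitions = [
    [{"id": 3}, {"id": 4}, {"id": 7}],
    [{"id": 8}, {"id": 67}, {"id": 10}],
]
survivors = [filter_probe_rows(part, "id", vector) for part in probe_partitions]

print(survivors[0])  # [{'id': 3}, {'id': 7}]
print(survivors[1])  # [{'id': 67}]  (false positive: 67 % 64 == 3)
```

The row with `id` 67 survives even though no build row has that key, because it collides with key 3 in the 64-bit vector. Such a filter only reduces the rows reaching the join; the join itself still has to compare actual key values.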
Specification