Data pruning based on metadata
First Claim
1. A method for managing query operations, the method comprising:
receiving a query directed to a set of files comprising data, wherein the query comprises a plurality of predicates;
determining whether data in each file matches at least one predicate of the plurality of predicates based on file metadata without accessing the set of files;
removing files that do not match at least one predicate from the set of files to create a reduced set of files;
identifying, based on the file metadata, one or more predicates of the query that do not fully match any file in the set of files without accessing the set of files;
removing the one or more predicates that do not fully match any file in the set of files from the query to create a modified query;
executing the modified query against the reduced set of files to create a final set of files; and
returning the final set of files in response to the query.
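The two metadata-only steps of the claim (dropping files, then dropping predicates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the file metadata is a per-column (min, max) range, that each predicate is a closed range on one column, and that the query is a disjunction of its predicates. It also reads "does not fully match any file" as "the metadata shows that no file can contain a matching row", which is only one possible reading of the claim language. The names `FileMeta`, `RangePredicate`, and `prune` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical per-file metadata: column name -> (min, max) of stored values."""
    name: str
    ranges: dict


@dataclass(frozen=True)
class RangePredicate:
    """Hypothetical predicate: low <= column <= high."""
    column: str
    low: float
    high: float

    def may_match(self, meta: FileMeta) -> bool:
        # Without metadata for the column, the file cannot be ruled out.
        if self.column not in meta.ranges:
            return True
        lo, hi = meta.ranges[self.column]
        # Two closed intervals overlap iff each starts before the other ends.
        return lo <= self.high and self.low <= hi


def prune(files, predicates):
    """Return (reduced_files, modified_predicates) using metadata only."""
    # Keep a file if at least one predicate may match it.
    reduced = [f for f in files if any(p.may_match(f) for p in predicates)]
    # Keep a predicate if at least one file may match it.
    modified = [p for p in predicates if any(p.may_match(f) for f in files)]
    return reduced, modified


files = [
    FileMeta("file_a", {"x": (0, 10)}),
    FileMeta("file_b", {"x": (20, 30)}),
    FileMeta("file_c", {"x": (100, 200)}),
]
predicates = [
    RangePredicate("x", 5, 8),
    RangePredicate("x", 50, 60),
    RangePredicate("x", 25, 40),
]

reduced, modified = prune(files, predicates)
print([f.name for f in reduced])               # ['file_a', 'file_b']
print([(p.low, p.high) for p in modified])     # [(5, 8), (25, 40)]
```

The sketch stops before execution: the modified query would still have to be run against the reduced set of files, and that is the first point at which file contents are read.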
Abstract
A system, apparatus, and method for processing queries, wherein a query includes a request to access or delete data in a set of data. Metadata associated with the set of data is accessed, the metadata defining data characteristics of the set of data, and the sets of data that need or need not be accessed or deleted are identified from the metadata without accessing the actual data. Also disclosed are methods for optimizing the processing of some operations based on the metadata collected on the data.
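For the delete case mentioned in the abstract, one way to picture "need or need not be accessed" is a three-way classification of files from their metadata. The sketch below is an assumption-laden illustration, not the method of the specification: it assumes each file records a (min, max) range per column, that the delete condition is a closed range on one column, and that the column holds no nulls. The function name `classify_for_delete` and the labels `drop`, `skip`, and `scan` are hypothetical.

```python
def classify_for_delete(file_ranges, column, low, high):
    """Classify files for DELETE WHERE low <= column <= high using metadata only.

    file_ranges maps a file name to {column: (min, max)} (hypothetical layout).
    """
    plan = {"drop": [], "skip": [], "scan": []}
    for name, ranges in file_ranges.items():
        lo, hi = ranges[column]
        if hi < low or lo > high:
            # No row can satisfy the condition: the file is left untouched.
            plan["skip"].append(name)
        elif low <= lo and hi <= high:
            # Every row satisfies the condition: the file can go as a whole.
            plan["drop"].append(name)
        else:
            # Partial overlap: only these files need their data read.
            plan["scan"].append(name)
    return plan


file_ranges = {
    "f1": {"ts": (0, 9)},
    "f2": {"ts": (10, 19)},
    "f3": {"ts": (15, 40)},
}
plan = classify_for_delete(file_ranges, "ts", 10, 20)
print(plan)  # {'drop': ['f2'], 'skip': ['f1'], 'scan': ['f3']}
```

Under these assumptions only `f3` has to be opened; `f1` and `f2` are resolved from metadata alone.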
57 Claims
19. A processor that is programmable to execute instructions stored in non-transitory computer readable storage media, the instructions comprising:
receiving a query directed to a set of files comprising data, wherein the query comprises a plurality of predicates;
accessing metadata associated with the set of files, determining whether data in each file of the set of files matches at least one predicate of the plurality of predicates based on the metadata without accessing the set of files, and removing files that do not match at least one predicate from the set of files to create a reduced set of files;
identifying, based on the metadata, one or more predicates of the query that do not fully match any file in the set of files without accessing the set of files;
removing the one or more predicates that do not fully match any file in the set of files from the query to create a modified query;
executing the modified query against the reduced set of files to create a final set of files; and
returning the final set of files in response to the query.
38. A system comprising:
a resource manager in communication with a plurality of shared storage devices collectively storing database data and an execution platform comprising a plurality of execution nodes, the resource manager comprising a processor that is programmable to execute instructions stored in non-transitory computer readable storage media and configured to:
receive a query directed to a set of files wherein the query comprises a plurality of predicates for defining data;
access metadata associated with the set of files, determine whether data in each file matches at least one predicate of the plurality of predicates based on the metadata without accessing the set of files, and remove files that do not match at least one predicate from the set of files to create a reduced set of files;
identify, based on the metadata, one or more predicates of the query that do not fully match any file in the set of files without accessing the set of files;
remove the one or more predicates that do not fully match any file in the set of files from the query to create a modified query; and
the execution platform configured to:
execute the modified query against the reduced set of files to create a final set of files; and
return the final set of files to the resource manager in response to the query.
Specification