SYSTEMS AND METHODS FOR DETECTING MISSING DATA IN QUERY RESULTS
First Claim
1. A computer system comprising:
at least one processor; and
a memory storing instructions configured to instruct the at least one processor to perform:
receiving a data set for storage;
storing a data subset of the data set at at least one leaf node of a plurality of leaf nodes;
storing data accounting information regarding storage of the data subset at the at least one leaf node;
receiving an initial query configured to be performed on the data set;
submitting a first query on the data set to one or more of the plurality of leaf nodes, wherein the first query is based on the initial query;
performing a second query on the data accounting information, wherein the second query is based on the initial query; and
comparing a first result, received in response to the first query, to a second result, received in response to the second query, to determine whether the first result is missing data.
Abstract
Techniques provided herein allow for estimating data missing in query results provided in response to queries performed on data managed by a data management system. In the event that one or more leaf nodes are unable or unavailable to process a query, a final query result provided in response to the original query may be missing data that exists on those leaf nodes. A data accounting service monitors what managed data is being stored on the leaf nodes and on what leaf node. The data accounting service can estimate how much data is missing from a final query result when one or more of the leaf nodes are unable or unavailable to process a query.
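The estimate described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the class name `DataAccountingService`, its methods, the leaf names, and the use of row counts as the accounting unit are all hypothetical choices made for this example.

```python
from collections import defaultdict


class DataAccountingService:
    """Hypothetical ledger of how many rows each leaf node holds."""

    def __init__(self):
        self.rows_by_leaf = defaultdict(int)

    def record_store(self, leaf_id, row_count):
        # Called whenever a data subset is written to a leaf node.
        self.rows_by_leaf[leaf_id] += row_count

    def estimate_missing(self, unavailable_leaves):
        """Return (missing_rows, missing_fraction) for the given leaves."""
        total = sum(self.rows_by_leaf.values())
        missing = sum(
            self.rows_by_leaf.get(leaf, 0) for leaf in unavailable_leaves
        )
        fraction = missing / total if total else 0.0
        return missing, fraction


# Assumed example data: three leaves holding 1000 rows in total.
accounting = DataAccountingService()
accounting.record_store("leaf-a", 600)
accounting.record_store("leaf-b", 300)
accounting.record_store("leaf-c", 100)

# Suppose leaf-b cannot process the query.
missing_rows, missing_fraction = accounting.estimate_missing({"leaf-b"})
print(missing_rows, missing_fraction)
```

Because the ledger is kept separately from the leaf nodes, the estimate remains available even when the leaves holding the data are not.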
20 Claims
1. A computer system comprising:
at least one processor; and
a memory storing instructions configured to instruct the at least one processor to perform:
receiving a data set for storage;
storing a data subset of the data set at at least one leaf node of a plurality of leaf nodes;
storing data accounting information regarding storage of the data subset at the at least one leaf node;
receiving an initial query configured to be performed on the data set;
submitting a first query on the data set to one or more of the plurality of leaf nodes, wherein the first query is based on the initial query;
performing a second query on the data accounting information, wherein the second query is based on the initial query; and
comparing a first result, received in response to the first query, to a second result, received in response to the second query, to determine whether the first result is missing data.
Dependent Claims: 2-18
19. A computer-storage medium storing computer-executable instructions that, when executed, cause a computer system to perform a computer-implemented method comprising:
receiving a data set for storage;
storing a data subset of the data set at at least one leaf node of a plurality of leaf nodes;
storing data accounting information regarding storage of the data subset at the at least one leaf node;
receiving an initial query configured to be performed on the data set;
submitting a first query on the data set to one or more of the plurality of leaf nodes, wherein the first query is based on the initial query;
performing a second query on the data accounting information, wherein the second query is based on the initial query; and
comparing a first result, received in response to the first query, to a second result, received in response to the second query, to determine whether the first result is missing data.
20. A computer implemented method comprising:
receiving, by a computer system, a data set for storage;
storing, by the computer system, a data subset of the data set at at least one leaf node of a plurality of leaf nodes;
storing, by the computer system, data accounting information regarding storage of the data subset at the at least one leaf node;
receiving, by the computer system, an initial query configured to be performed on the data set;
submitting, by the computer system, a first query on the data set to one or more of the plurality of leaf nodes, wherein the first query is based on the initial query;
performing, by the computer system, a second query on the data accounting information, wherein the second query is based on the initial query; and
comparing, by the computer system, a first result, received in response to the first query, to a second result, received in response to the second query, to determine whether the first result is missing data.
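The sequence of steps recited in the independent claims can be sketched as follows. This is one possible reading under stated assumptions, not a definitive implementation: `LeafNode`, `store`, `first_query`, `second_query`, the round-robin placement, and the per-hour row counts used as accounting information are all hypothetical, and the initial query is assumed to be a simple count of rows for a given hour.

```python
from dataclasses import dataclass, field


@dataclass
class LeafNode:
    name: str
    available: bool = True
    rows: list = field(default_factory=list)


def store(data_set, leaves, ledger):
    """Spread rows across leaves round-robin and record each placement."""
    for i, row in enumerate(data_set):
        leaf = leaves[i % len(leaves)]
        leaf.rows.append(row)
        # Accounting information: row count per (leaf, hour) pair.
        key = (leaf.name, row["hour"])
        ledger[key] = ledger.get(key, 0) + 1


def first_query(leaves, hour):
    """Count matching rows on the leaves that are able to answer."""
    return sum(
        1
        for leaf in leaves if leaf.available
        for row in leaf.rows if row["hour"] == hour
    )


def second_query(ledger, hour):
    """Count rows the ledger says should exist for the same filter."""
    return sum(n for (_, h), n in ledger.items() if h == hour)


leaves = [LeafNode("leaf-a"), LeafNode("leaf-b"), LeafNode("leaf-c")]
ledger = {}
data_set = [{"hour": i % 2, "value": i} for i in range(12)]
store(data_set, leaves, ledger)

leaves[1].available = False  # leaf-b cannot process the query

found = first_query(leaves, hour=0)      # first result, from the leaves
expected = second_query(ledger, hour=0)  # second result, from accounting
is_missing_data = found < expected
print(found, expected, is_missing_data)
```

Both queries are derived from the same filter, so any shortfall in the first result relative to the second indicates rows that exist in storage but were not reached.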
Specification