Techniques for data retrieval in a distributed computing environment
First Claim
1. At least one non-transitory machine-readable storage medium comprising instructions that, when executed by a processor, cause the processor to:
- receive a data request at a computing node of a distributed computing system to store a set of data, the computing node including a subset of data of the set of data;
sort the subset of data into a defined order via one or more data sorting sessions;
iteratively search among the sorted subset of data for a targeted beginning location of a data range that is responsive to the data request, each iteration to;
receive a proposed value of one or more proposed values from one of a plurality of computing nodes of the distributed computing system, the proposed value used to search for the beginning location of the data range,utilize a binary search to locate a local insert location for the proposed value, the local insert location local to the subset of data of the computing node, the binary search to;
determine a first sum value of locally stored values indicating a number of locally stored values that are less than the proposed value stored on the computing node,determine a second sum value of globally stored values less than the proposed value, wherein the globally stored values are values stored on the distributed computing system and the second sum value to include a summation of a number of values stored on other computing nodes of the distributed computing system and the first sum value,determine whether the second sum value is less than, equal to, or greater than the targeted beginning location,based on determining the second sum value is less than the targeted beginning location, the local insert location is set as a current lower bound of a bracketed range, andbased on determining the second sum value is greater than the targeted beginning location, the local insert location is set as an upper bound of the bracketed range, andbased on determining that the second sum value equals the targeted beginning location, set the local insert location to be the targeted beginning location, and cease searching among the sorted subset of data,share, by the computing node, a location of a greatest local value known to be less than the proposed value with the other computing nodes within the distributed system,cease searching among the sorted subset of data when the bracketed range equals zero, anddetermine, during at least one of the iterations, one of the one or more proposed values and share the one of the one or more proposed values with other computing nodes within the distributed computing system; and
forward, from the computing node, data responsive to the search to another computing node of the distributed computing system to be merged with one or more other data from one or more other computing nodes.
1 Assignment
0 Petitions
Accused Products
Abstract
Enhanced techniques for data retrieval in a distributed computing environment are described. A computing node of a distributed computing environment may receive a data request. The computing node may include one or more subsets of data. The computing node may be configured to search among the one or more subset of data for a beginning of a data range that is responsive to the data request. The computing node may be further configured to forward a data range responsive to the search to another computing node of the distributed computing system to be merged with one or more additional data ranges. Other embodiments are described and claimed.
17 Citations
45 Claims
-
1. At least one non-transitory machine-readable storage medium comprising instructions that, when executed by a processor, cause the processor to:
-
receive a data request at a computing node of a distributed computing system to store a set of data, the computing node including a subset of data of the set of data; sort the subset of data into a defined order via one or more data sorting sessions; iteratively search among the sorted subset of data for a targeted beginning location of a data range that is responsive to the data request, each iteration to; receive a proposed value of one or more proposed values from one of a plurality of computing nodes of the distributed computing system, the proposed value used to search for the beginning location of the data range, utilize a binary search to locate a local insert location for the proposed value, the local insert location local to the subset of data of the computing node, the binary search to; determine a first sum value of locally stored values indicating a number of locally stored values that are less than the proposed value stored on the computing node, determine a second sum value of globally stored values less than the proposed value, wherein the globally stored values are values stored on the distributed computing system and the second sum value to include a summation of a number of values stored on other computing nodes of the distributed computing system and the first sum value, determine whether the second sum value is less than, equal to, or greater than the targeted beginning location, based on determining the second sum value is less than the targeted beginning location, the local insert location is set as a current lower bound of a bracketed range, and based on determining the second sum value is greater than the targeted beginning location, the local insert location is set as an upper bound of the bracketed range, and based on determining that the second sum value equals the targeted beginning location, set the local insert location to be the targeted beginning location, and cease searching among the sorted subset of data, share, by the computing node, a location of a greatest local value known to be less than the proposed value with the other computing nodes within the distributed system, cease searching among the sorted subset of data when the bracketed range equals zero, and determine, during at least one of the iterations, one of the one or more proposed values and share the one of the one or more proposed values with other computing nodes within the distributed computing system; and forward, from the computing node, data responsive to the search to another computing node of the distributed computing system to be merged with one or more other data from one or more other computing nodes. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A computer-implemented method, comprising:
-
receive a data request at a computing node of a distributed computing system to store a set of data, the computing node including a subset of data of the set of data; sorting the subset of data into a defined order via one or more data sorting sessions; iteratively searching among the sorted subset of data for a targeted beginning location of a data range that is responsive to the data request, each iteration to include; receiving a proposed value of one or more proposed values from one of a plurality of computing nodes of the distributed computing system, the proposed value used to search for the beginning location of the data range, utilizing a binary search to locate a local insert location for the proposed value, the local insert location local to the subset of data of the computing node, the binary search including; determining a first sum value of locally stored values indicating a number of locally stored values that are less than the proposed value stored on the computing node, determining a second sum value of globally stored values less than the proposed value, wherein the globally stored values are values stored on the distributed computing system and the second sum value to include a summation of a number of values stored on other computing nodes of the distributed computing system and the first sum value, determining whether the second sum value is less than, equal to, or greater than the targeted beginning location, based on determining the second sum value is less than the targeted beginning location, the local insert location is set as a current lower bound of a bracketed range, and based on determining the second sum value is greater than the targeted beginning location, the local insert location is set as an upper bound of the bracketed range, and based on determining that the second sum value equals the targeted beginning location, set the local insert location to be the targeted beginning location, and cease searching among the sorted subset of data, sharing, by the computing node, a location of a greatest local value known to be less than the proposed value with the other computing nodes within the distributed system and cease searching among the sorted subset of data when the bracketed range equals zero, determining, during at least one of the iterations, one of the one or more proposed values and share the one of the one or more proposed values with other computing nodes within the distributed computing system; and forwarding, from the computing node, data responsive to the search to another computing node of the distributed computing system to be merged with one or more other data from one or more other computing nodes. - View Dependent Claims (9, 10, 11, 12, 13, 14)
-
-
15. An apparatus, comprising:
-
a processor; a memory unit communicatively coupled to the processor; and logic, at least partially implemented by the processor, the logic to; receive a data request at a computing node of a distributed computing system to store a set of data, the computing node including a subset of data of the set of data, sort the subset of data into a defined order via one or more data sorting sessions, iteratively search among the sorted subset of data for a targeted beginning location of a data range that is responsive to the data request, each iteration to; receive a proposed value of one or more proposed values from one of a plurality of computing nodes of the distributed computing system, the proposed value used to search for the beginning location of the data range, utilize a binary search to locate a local insert location for the proposed value, the local insert location local to the subset of data of the computing node, the binary search to; determine a first sum value of locally stored values indicating a number of locally stored values that are less than the proposed value stored on the computing node, determine a second sum value of globally stored values less than the proposed value, wherein the globally stored values are values stored on the distributed computing system and the second sum value to include a summation of a number of values stored on other computing nodes of the distributed computing system and the first sum value, determine whether the second sum value is less than, equal to, or greater than the targeted beginning location, based on determining the second sum value is less than the targeted beginning location, the local insert location is set as a current lower bound of a bracketed range, and based on determining the second sum value is greater than the targeted beginning location, the local insert location is set as an upper bound of the bracketed range, and based on determining that the second sum value equals the targeted beginning location, set the local insert location to be the targeted beginning location, and cease searching among the sorted subset of data, share, by the computing node, a location of a greatest local value known to be less than the proposed value with the other computing nodes within the distributed system and cease searching among the sorted subset of data when the bracketed range equals zero, and determine, during at least one of the iterations, one of the one or more proposed values and share the one of the one or more proposed values with other computing nodes within the distributed computing system; and forward, from the computing node, data responsive to the search to another computing node of the distributed computing system to be merged with one or more other data from one or more other computing nodes. - View Dependent Claims (16, 17, 18, 19, 20, 21)
-
-
22. At least one non-transitory machine-readable storage medium comprising instructions that, when executed by a computing device, cause the computing device to:
-
receive a data request at a distributed computing system including a plurality of computing nodes, each computing node including a subset of data of a set of data stored by the distributed computing system; cause each computing node to sort each subset of data into a defined order via one or more data sorting sessions; cause each of the plurality of computing nodes to iteratively search among the sorted subset of data for a targeted beginning location of a data range that is responsive to the data request, each iteration to; receive a proposed value of one or more proposed values from one of the plurality of computing nodes of the distributed computing system, the proposed value used to search for the beginning location of the data range, utilize a binary search to locate a local insert location for the proposed value, the local insert location local to the subset of data of a computing node, the binary search to; determine a first sum value of locally stored values indicating a number of locally stored values that are less than the proposed value stored on the computing node, determine a second sum value of globally stored values less than the proposed value, wherein the globally stored values are values stored on the distributed computing system and the second sum value to include a summation of a number of values stored on other computing nodes of the distributed computing system and the first sum value, determine whether the second sum value is less than, equal to, or greater than the targeted beginning location, based on determining the second sum value is less than the targeted beginning location, the local insert location is set as a current lower bound of a bracketed range, and based on determining the second sum value is greater than the targeted beginning location, the local insert location is set as an upper bound of the bracketed range, and based on determining that the second sum value equals the targeted beginning location, set the local insert location to be the targeted beginning location, and cease searching among the sorted subset of data, share a location of a greatest local value known to be less than the proposed value with the plurality of computing nodes within the distributed system and cease searching among the sorted subset of data when the bracketed range equals zero, determine, during at least one of the iterations, one of the one or more proposed values and share the one of the one or more proposed values with the plurality of computing nodes within the distributed computing system; and cause one or more computing nodes to forward the subsets of data between computing nodes within the distributed computing system, the forwarding beginning from a computing node that has the targeted beginning location of the data range found from the search; merge the forwarded subsets of data into a final data range responsive to the data request; and send the final data range to a requesting device. - View Dependent Claims (23, 24, 25, 26, 27, 28, 29)
-
-
30. A computer-implemented method, comprising:
-
receiving a data request at a distributed computing system including a plurality of computing nodes, each computing node including a subset of data of a set of data stored by the distributed computing system; causing each computing node to sort each subset of data into a defined order via one or more data sorting sessions; cause each of the plurality of computing nodes to iteratively search among the sorted subset of data for a targeted beginning location of a data range that is responsive to the data request, each iteration to include; receiving a proposed value of one or more proposed values from one of the plurality of computing nodes of the distributed computing system, the proposed value used to search for the beginning location of the data range, utilizing a binary search to locate a local insert location for the proposed value, the local insert location local to the subset of data of a computing node, the binary search including; determining a first sum value of locally stored values indicating a number of locally stored values that are less than the proposed value stored on the computing node, determining a second sum value of globally stored values less than the proposed value, wherein the globally stored values are values stored on the distributed computing system and the second sum value to include a summation of a number of values stored on other computing nodes of the distributed computing system and the first sum value, determining whether the second sum value is less than, equal to, or greater than the targeted beginning location, based on determining the second sum value is less than the targeted beginning location, the local insert location is set as a current lower bound of a bracketed range, and based on determining the second sum value is greater than the targeted beginning location, the local insert location is set as an upper bound of the bracketed range, and based on determining that the second sum value equals the targeted beginning location, set the local insert location to be the targeted beginning location, and cease searching among the sorted subset of data, sharing a location of a greatest local value known to be less than the proposed value with the plurality of computing nodes within the distributed system and cease searching among the sorted subset of data when the bracketed range equals zero, determining, during at least one of the iterations, one of the one or more proposed values and share the one of the one or more proposed values with the plurality of computing nodes within the distributed computing system; and cause one or more computing nodes to forward the subsets of data between computing nodes within the distributed computing system, the forwarding beginning from a computing node that has the targeted beginning location of the data range found from the search; merging the forwarded subsets of data into a final data range responsive to the data request; and sending the final data range to a requesting device. - View Dependent Claims (31, 32, 33, 34, 35, 36, 37)
-
-
38. An apparatus, comprising:
-
a processor; a memory unit communicatively coupled to the processor; and logic, at least partially implemented by the processor, the logic to; receive a data request at a distributed computing system including a plurality of computing nodes, each computing node including a subset of data of a set of data stored by the distributed computing system, cause each computing node to sort each subset of data into a defined order via one or more data sorting sessions, cause each of the plurality of computing nodes to iteratively search among the sorted subset of data for a targeted beginning location of a data range that is responsive to the data request, each iteration to; receive a proposed value of one or more proposed values from one of the plurality of computing nodes of the distributed computing system, the proposed value used to search for the beginning location of the data range, utilize a binary search to locate a local insert location for the proposed value, the local insert location local to the subset of data of a computing node, the binary search to; determine a first sum value of locally stored values indicating a number of locally stored values that are less than the proposed value stored on the computing node, determine a second sum value of globally stored values less than the proposed value, wherein the globally stored values are values stored on the distributed computing system and the second sum value to include a summation of a number of values stored on other computing nodes of the distributed computing system and the first sum value, determine whether the second sum value is less than, equal to, or greater than the targeted beginning location, based on determining the second sum value is less than the targeted beginning location, the local insert location is set as a current lower bound of a bracketed range, and based on determining the second sum value is greater than the targeted beginning location, the local insert location is set as an upper bound of the bracketed range, and based on determining that the second sum value equals the targeted beginning location, set the local insert location to be the targeted beginning location and cease searching among the sorted subset of data, share a location of a greatest local value known to be less than the proposed value with the plurality of computing nodes within the distributed system and cease searching among the sorted subset of data when the bracketed range equals zero, determine, during at least one of the iterations, one of the one or more proposed values and share the one of the one or more proposed values with the plurality of computing nodes within the distributed computing system, and cause one or more computing nodes to forward the subsets of data between computing nodes within the distributed computing system, the forwarding beginning from a computing node that has the targeted beginning location of the data range found from the search, and merge the forwarded subsets of data into a final data range responsive to the data request. - View Dependent Claims (39, 40, 41, 42, 43, 44, 45)
-
Specification