Distributed processing framework file system fast on-demand storage listing
First Claim
1. A computer-implemented method, comprising:
- receiving a request from a requestor to provide a listing of keys for data objects in a directory of a distributed processing framework file system, wherein the data objects are stored with a storage service of a computing resource service provider;
obtaining a set of keys from a data store of the computing resource service provider, wherein each key of the set of keys from the data store corresponds to a data object in the data objects stored with the storage service;
determining a first starting key and a second starting key based at least in part on a scheme for partitioning the set of keys;
generating a first sub-listing of keys by executing a first thread, wherein the first sub-listing of keys includes one or more keys in an order from the first starting key to the second starting key;
generating a second sub-listing of keys by executing a second thread, wherein the second sub-listing of keys includes one or more keys in the order from the second starting key to an end of the data objects, and wherein the second thread executes in parallel with the first thread;
merging the first sub-listing of keys with the second sub-listing of keys to produce the listing of keys; and
providing the listing of keys to the requestor.
1 Assignment
0 Petitions
Accused Products
Abstract
A system and method for improving the speed of generating a list of previously-uncounted items stored with a computing resource service provider. The system and method involve obtaining a set of keys from a data store, wherein each key of the set of keys corresponds to an item in a group of items, wherein a quantity of items in the group is uncounted. The system and method further includes generating a first sub-listing of keys based at least in part on a first key range of the set of keys by executing a first thread, generating a second sub-listing of keys based at least in part on a second key range of the set of keys by executing a second thread, combining the first sub-listing of keys with the second sub-listing of keys to produce a list of keys, and providing the list of keys.
-
Citations
21 Claims
-
1. A computer-implemented method, comprising:
-
receiving a request from a requestor to provide a listing of keys for data objects in a directory of a distributed processing framework file system, wherein the data objects are stored with a storage service of a computing resource service provider; obtaining a set of keys from a data store of the computing resource service provider, wherein each key of the set of keys from the data store corresponds to a data object in the data objects stored with the storage service; determining a first starting key and a second starting key based at least in part on a scheme for partitioning the set of keys; generating a first sub-listing of keys by executing a first thread, wherein the first sub-listing of keys includes one or more keys in an order from the first starting key to the second starting key; generating a second sub-listing of keys by executing a second thread, wherein the second sub-listing of keys includes one or more keys in the order from the second starting key to an end of the data objects, and wherein the second thread executes in parallel with the first thread; merging the first sub-listing of keys with the second sub-listing of keys to produce the listing of keys; and providing the listing of keys to the requestor. - View Dependent Claims (2, 3, 4)
-
-
5. A system, comprising:
-
one or more processors; and memory including instructions that, if executed by the one or more processors, cause the system to; obtain a set of keys from a data store, wherein each key of the set of keys corresponds to a respective item in a group of items, wherein a quantity of items in the group is uncounted; generate a first sub-listing of keys based at least in part on a first key range of the set of keys by executing a first thread; generate a second sub-listing of keys based at least in part on a second key range of the set of keys by executing a second thread; combine the first sub-listing of keys with the second sub-listing of keys to produce a list of keys; and provide the list of keys. - View Dependent Claims (6, 7, 8, 9, 10, 11, 12, 13, 14)
-
-
15. A non-transitory computer-readable storage medium having stored thereon executable instructions that, if executed by one or more processors of a computer system, cause the computer system to at least:
-
obtain a set of keys from a data store; initialize one or more threads, wherein each thread is configured to generate a respective sub-listing of keys ordered from a respective start key to a respective end key, the respective sub-listing of keys determined based at least in part on a scheme for partitioning the set of keys, and wherein a first key in the respective sub-listing of keys corresponds to a respective data object of a file system; generate a second sub-listing of keys based at least in part on a second key range of the set of keys by executing a second thread; receive the respective sub-listing of keys from a first thread of the one or more threads; combine the respective sub-listing of keys with the second sub-listing of keys to produce a master list, and provide the master list. - View Dependent Claims (16, 17, 18, 19, 20, 21)
-
Specification