Archival data flow management
Abstract
Methods and systems are provided herein to allow efficient management of data flowing in and out of an archival data storage system. In an embodiment, storage entities keep very little state information in memory to provide higher throughput. Further, storage entities may send data in large chunks to facilitate high throughput. Techniques such as batching and coalescing may be used by various storage entities to provide efficiency.
24 Claims
1. A computer system for managing data storage, comprising:
a plurality of data storage devices;
a plurality of storage nodes, each of the plurality of data storage nodes being operably connected to one or more of the plurality of data storage devices;
a storage node manager operably connected to the plurality of data storage nodes;
one or more processors;
memory, including executable instructions that, when executed by the one or more processors, cause the one or more processors to collectively at least:
receive, by the storage node manager, a request to store a data object;
obtain, by the storage node manager, the data object to be stored; and
allocate, by the storage node manager, storage space to store a plurality of redundantly encoded data components that are generated using at least one erasure code and based at least in part on the data object to be stored;
for each encoded data component of the plurality of redundantly encoded data components:
generate a data collection that includes at least the encoded data component and redundantly encoded data components associated with another data object, such that the data collection is redundantly encoded; and
provide, by the storage node manager, the encoded data component to a storage node;
after providing the encoded data component, make available storage space allocated to the plurality of encoded data components regardless of whether a response is received from the storage node;
receive, by a storage node, one of the plurality of encoded data components; and
store, by the storage node, the data collection on a storage device connected to the storage node.
Dependent claims: 2, 3, 4, 5, 6.
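The "data collection" element of claim 1 can be illustrated with a short sketch. This is one possible reading, not the patented implementation: it assumes each object has already been erasure coded into the same number of components, and that collection i simply gathers component i of every object. The function name `build_collections` and the object identifiers are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def build_collections(
    encoded_objects: dict[str, list[bytes]],
) -> dict[int, list[tuple[str, bytes]]]:
    """Group component i of every object into data collection i.

    Assumption: every object was encoded into the same number of
    components, so collection i can be sent to storage node i.
    """
    collections: dict[int, list[tuple[str, bytes]]] = defaultdict(list)
    for object_id, components in encoded_objects.items():
        for index, component in enumerate(components):
            # Each collection mixes components from several data objects.
            collections[index].append((object_id, component))
    return dict(collections)


if __name__ == "__main__":
    demo = build_collections(
        {"obj-a": [b"a0", b"a1", b"a2"], "obj-b": [b"b0", b"b1", b"b2"]}
    )
    for index, members in sorted(demo.items()):
        print(index, members)
```

Under this reading, losing one whole collection costs each object exactly one component, which the erasure code is expected to tolerate.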
7. A computer-implemented method, comprising:
under the control of one or more computer systems configured with executable instructions,
receiving a job to store a data object; and
executing the job by at least:
encoding the data object using at least one erasure code so as to obtain a plurality of redundantly encoded data components;
allocating storage space to store the plurality of redundantly encoded data components;
generating a data collection that includes at least the plurality of redundantly encoded data components and redundantly encoded data components associated with another data object, such that the data collection is redundantly encoded;
for each of the plurality of encoded data components, providing the redundantly encoded data component as part of the data collection in a request to store the redundantly encoded data component; and
for at least some of the plurality of redundantly encoded data components, deallocating the allocated storage space for the redundantly encoded data component regardless of whether a response to the request to store the redundantly encoded data component is received.
Dependent claims: 8, 9, 10, 11, 12.
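The encode, allocate, provide, and deallocate steps of claim 7 can be sketched as follows. This is a minimal illustration under stated assumptions: single-parity XOR stands in for the unspecified erasure code, `StagingPool` is a hypothetical staging area, and `send` is a hypothetical callback that forwards one component to a storage node. The data collection step is omitted for brevity.

```python
from __future__ import annotations

from functools import reduce
from typing import Callable, Optional


def xor_encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal shards plus one XOR parity shard.

    Single-parity XOR is a simple stand-in erasure code (an assumption);
    it tolerates the loss of any one shard.
    """
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*shards))
    return shards + [parity]


def xor_recover(shards: list[Optional[bytes]]) -> list[Optional[bytes]]:
    """Rebuild at most one missing shard (marked None) by XOR-ing the rest."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost shard")
    if missing:
        present = [s for s in shards if s is not None]
        shards[missing[0]] = bytes(
            reduce(lambda a, b: a ^ b, col) for col in zip(*present)
        )
    return shards


class StagingPool:
    """Hypothetical bounded staging area for components awaiting transfer."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0

    def allocate(self, size: int) -> None:
        if self.used + size > self.capacity:
            raise MemoryError("staging area full")
        self.used += size

    def release(self, size: int) -> None:
        self.used -= size


def run_store_job(
    data: bytes, k: int, pool: StagingPool, send: Callable[[int, bytes], None]
) -> int:
    """Encode, allocate, send each shard, and free space without awaiting acks."""
    shards = xor_encode(data, k)
    pool.allocate(sum(len(s) for s in shards))
    for index, shard in enumerate(shards):
        try:
            send(index, shard)  # any acknowledgement is ignored
        except ConnectionError:
            pass  # no response received; space is released anyway
        pool.release(len(shard))
    return len(shards)
```

Releasing staging space without waiting for acknowledgements keeps little state in memory, which is the throughput argument the abstract makes; the trade-off in this sketch is that a lost component must be covered by the code's redundancy.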
13. A computer-implemented method, comprising:
under the control of one or more computer systems configured with executable instructions,
receiving a plurality of data storage related requests, at least some of the plurality of data storage related requests specifying at least a data collection that includes a plurality of redundantly encoded data components derived by application of at least one erasure code;
determining a coalesced plurality of data storage related requests based at least in part on the received plurality of data storage related requests, the coalesced plurality of data storage related requests including at least a coalesced request formed based on two or more of the plurality of data storage related requests; and
causing the coalesced plurality of data storage related requests to be fulfilled by one or more storage devices according to a batch processing schedule, by at least generating and storing the data collection to include at least the plurality of redundantly encoded data components and redundantly encoded data components associated with another data object, such that the data collection is redundantly encoded.
Dependent claims: 14, 15, 16, 17, 18, 19.
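The coalescing and batch scheduling recited in claim 13 might look like the sketch below. It assumes, purely for illustration, that requests are coalesced when they target the same storage device and data collection, and that a batch schedule visits each device once in sorted order. `StoreRequest`, `coalesce`, and `batch_schedule` are hypothetical names, not taken from the patent.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreRequest:
    device_id: str        # hypothetical target storage device
    collection_id: str    # data collection the components belong to
    components: tuple[bytes, ...]


def coalesce(requests: list[StoreRequest]) -> list[StoreRequest]:
    """Merge requests addressed to the same (device, collection) pair."""
    merged: dict[tuple[str, str], list[bytes]] = defaultdict(list)
    for req in requests:
        merged[(req.device_id, req.collection_id)].extend(req.components)
    return [
        StoreRequest(device, collection, tuple(parts))
        for (device, collection), parts in merged.items()
    ]


def batch_schedule(requests: list[StoreRequest]) -> list[list[StoreRequest]]:
    """Group coalesced requests so each device is visited once per batch."""
    by_device: dict[str, list[StoreRequest]] = defaultdict(list)
    for req in requests:
        by_device[req.device_id].append(req)
    return [by_device[device] for device in sorted(by_device)]


if __name__ == "__main__":
    incoming = [
        StoreRequest("dev-b", "c1", (b"x",)),
        StoreRequest("dev-a", "c1", (b"y",)),
        StoreRequest("dev-b", "c1", (b"z",)),
    ]
    for batch in batch_schedule(coalesce(incoming)):
        print(batch)
```

Coalescing reduces the number of operations each device performs, and grouping by device lets a device handle all of its pending work in one pass.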
20. One or more non-transitory computer-readable storage media having collectively stored thereon executable instructions that, when executed by one or more processors of a computer system, cause the computer system to at least:
receive a request to store a data object;
obtain a plurality of redundantly encoded data components based at least in part on the data object and derived using at least one erasure code;
allocate storage space to store the plurality of redundantly encoded data components;
generate a data collection that includes at least the encoded data component and redundantly encoded data components associated with another data object, such that the data collection is redundantly encoded;
for each of the plurality of encoded data components, provide the encoded data component to a storage node as part of the data collection; and
for at least some of the plurality of encoded data components, make available the allocated storage space for the encoded data component regardless of whether a response from the storage node is received.
Dependent claims: 21, 22, 23, 24.
Specification