Systems and techniques for data recovery in a keymapless data storage system
First Claim
1. A computer-implemented method, comprising:
storing, by one or more computing systems, a plurality of components corresponding to a data object in different locations of a data storage system, the plurality of components being generated by applying a redundancy encoding to the data object;
generating, by the one or more computing systems and based at least in part on a configuration of the data storage system, a manifest for the data object that includes at least:
locations, in the data storage system, of the plurality of components; and
information that identifies at least one construction of the data object from a subset of the plurality of components, the at least one construction based at least in part on the configuration;
storing the manifest in a different data storage system;
detecting, by the one or more computing systems, inaccessibility of the manifest from the different data storage system; and
as a result of detecting the inaccessibility of the manifest:
determining, by the one or more computing systems, without access to the manifest and based at least in part on a search parameter obtained from a data object identifier corresponding to the data object, the locations of the plurality of components, wherein the data object identifier includes information indicative of a location of the generated manifest;
determining, by the one or more computing systems, without access to the manifest, and based at least in part on the determined locations and based at least in part on the configuration, a construction of the data object from the plurality of components; and
regenerating, by the one or more computing systems, the manifest based at least in part on the determined construction.
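
The recovery path recited in the claim above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not prescribed by the claim: the in-memory `STORE` dictionary standing in for the data storage system, the identifier layout `<manifest_location>:<search_key>`, the per-component `tag` and `index` metadata, the `needed` argument standing in for the storage configuration, and all helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    tag: str        # hypothetical metadata matched by the search parameter
    index: int      # hypothetical position of the component in the encoding
    payload: bytes


@dataclass
class Manifest:
    locations: dict[int, str]   # component index -> storage location
    construction: list[int]     # component indices sufficient to rebuild the object


# Hypothetical data storage system: location -> stored component.
STORE: dict[str, Component] = {
    "vol-a/0017": Component("obj42", 0, b"hel"),
    "vol-c/0203": Component("obj42", 1, b"lo!"),
    "vol-b/0099": Component("obj42", 2, b"\x04\x0a\x4d"),  # XOR of the two above
    "vol-b/0100": Component("other", 0, b"zzz"),
}


def search_parameter(object_id: str) -> str:
    """Extract the search key from an identifier of the assumed form
    '<manifest_location>:<search_key>'."""
    _manifest_location, _, key = object_id.partition(":")
    return key


def regenerate_manifest(object_id: str, needed: int) -> Manifest:
    """Rebuild a manifest by scanning the store, without reading the old manifest.

    `needed` stands in for the configuration: how many components suffice
    to construct the object.
    """
    key = search_parameter(object_id)
    locations = {c.index: loc for loc, c in STORE.items() if c.tag == key}
    if len(locations) < needed:
        raise LookupError("too few components found to construct the object")
    construction = sorted(locations)[:needed]
    return Manifest(locations, construction)


manifest = regenerate_manifest("manifests/42:obj42", needed=2)
```

In this sketch the identifier carries both the (now unusable) manifest location and the key used to search the store, which mirrors the claim's requirement that the identifier be indicative of the manifest location while also yielding a search parameter.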
Abstract
Components of a data object are distributed throughout a data storage system. Manifests store the locations of the components of data objects in the data storage system to allow for subsequent reconstruction of the data objects. The manifests may be stored in another data storage system when cost projections indicate that doing so is economical. If a manifest for a data object becomes lost or otherwise inaccessible, clues are used to regenerate the manifest, thereby preserving the ability to access the components of the data object and reconstruct it.
26 Claims
10. A computer-implemented method, comprising:
generating a manifest for a data object, the manifest including at least:
locations, in a data storage system, of a plurality of components corresponding to the data object, the plurality of components generated by applying a redundancy encoding to the data object; and
construction information for the data object identifying a subset of the plurality of components; and
regenerating the manifest by at least:
determining, based at least in part on a search parameter obtained based at least in part on an identifier of the data object, the locations of the plurality of components, wherein the identifier of the data object includes information indicative of a location of the generated manifest;
determining, without access to the manifest, a construction of the data object using a subset of the plurality of components; and
processing at least the determined construction to create the manifest.
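
To make "redundancy encoding" and "construction from a subset of the components" concrete, here is a minimal sketch that uses a simple XOR parity scheme. The scheme itself is an assumption chosen for brevity; the claims do not name a particular encoding, and the function names `encode` and `construct` are hypothetical.

```python
def _xor(p: bytes, q: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(p, q))


def encode(data: bytes) -> list[bytes]:
    """Split data into two halves and add an XOR parity component.

    Components are indexed 0 (first half), 1 (second half), 2 (parity).
    """
    if len(data) % 2:
        # Pad to even length; a real scheme would also record the original length.
        data += b"\x00"
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return [a, b, _xor(a, b)]


def construct(available: dict[int, bytes]) -> bytes:
    """Rebuild the (padded) data from any two of the three components."""
    if len(available) < 2:
        raise ValueError("need at least two components")
    a = available[0] if 0 in available else _xor(available[1], available[2])
    b = available[1] if 1 in available else _xor(available[0], available[2])
    return a + b


parts = encode(b"manifest")
rebuilt = construct({0: parts[0], 2: parts[2]})  # component 1 is missing
```

Under this scheme a manifest could list all three locations and record any two-component subset, such as indices 0 and 2, as a valid construction.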
16. A system, comprising:
one or more processors; and
memory, including instructions that, if executed by the one or more processors, cause the system to:
identify, based at least in part on characteristics of a data object, a subset of a set of data objects persistently stored among a plurality of data storage devices, the subset comprising components of the data object;
determine, based at least in part on comparing a hash generated based on a potential ordering of the subset to a hash derived from an identifier of the data object, a construction of the data object using the identified subset; and
generate, based at least in part on the construction of the data object and the configuration, a manifest for the data object, the manifest comprising a specification of the subset informing construction of the data object from the subset.
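
The hash comparison in this claim can be sketched as a search over candidate orderings: each potential ordering of the subset is hashed and compared with the hash derived from the identifier. The sketch below is illustrative only. It assumes SHA-256 over the concatenated payloads, uses made-up locations, and computes the expected hash directly instead of deriving it from an identifier; none of these choices comes from the claim.

```python
import hashlib
from itertools import permutations


def digest(parts) -> str:
    """Hash the concatenation of the given byte strings (assumed: SHA-256)."""
    return hashlib.sha256(b"".join(parts)).hexdigest()


def find_construction(subset: dict[str, bytes], expected_hash: str):
    """Return the ordering of locations whose concatenated payloads hash to
    expected_hash, or None if no ordering matches."""
    for order in permutations(subset):
        if digest(subset[loc] for loc in order) == expected_hash:
            return list(order)
    return None


# Hypothetical components found by a search, keyed by storage location.
subset = {"vol-b/7": b"world", "vol-a/3": b"hello "}
# Stands in for the hash that would be derived from the data object identifier.
expected = hashlib.sha256(b"hello world").hexdigest()
order = find_construction(subset, expected)
```

Trying every permutation costs factorial time in the number of components, so a brute-force search like this is only plausible for small subsets or when other clues narrow the candidate orderings first.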
21. One or more non-transitory computer-readable storage media having stored thereon instructions that, if executed by one or more processors of a computer system, cause the computer system to generate a manifest for a data object by at least:
locating, in a data storage system, a plurality of components of the data object that are combinable to construct the data object, the locating based at least in part on information indicative of a composition of the data object obtained from an identifier of the data object;
determining, without access to the manifest, and based at least in part on comparing a hash generated based on a potential ordering of a subset of the plurality of components to a hash derived from an identifier of the data object, a construction of the data object using the subset of the plurality of components; and
generating, based at least in part on the determined construction, the manifest to include the construction and locations of the subset of the plurality of components.
Specification