Synchronization of storage using log files and snapshots
First Claim
1. A system for processing data, comprising:
- a deduplicating system that stores a copy of at least a portion of data stored in a data storage system at a first snapshot time at least in part by generating a first snapshot of the data, wherein the deduplicating system includes a fingerprint index that comprises a list of fingerprints associated with every unique segment stored in the deduplicating system, wherein the data storage system includes a stored log file that stores one or more data changes and times associated with each data change, and wherein the deduplicating system generates the first snapshot by:
breaking at least a portion of the data at the data storage system into a plurality of segments;
computing a fingerprint for each of at least a subset of the plurality of segments;
storing the fingerprints in the first snapshot, wherein a snapshot includes a list of fingerprints of data comprising at least the portion of the data stored in the data storage system, and wherein a fingerprint that corresponds to identical segments is repeated in a snapshot;
identifying, based at least in part on the fingerprint index, fingerprints in the snapshot that are not in the fingerprint index; and
storing only segments that correspond to the identified fingerprints such that each stored segment is able to be used to reconstruct the data stored in the data storage system;
an interface for receiving an indication to revert data stored in the data storage system to a state at a snapshot time; and
a processor configured to:
determine a first subset of the data stored in the data storage system that has changed since a prior snapshot using the stored log file; and
determine a second subset of the data stored in the data storage system that has changed between the prior snapshot and the snapshot time using a first list of fingerprints associated with the prior snapshot and a second list of fingerprints associated with the snapshot time.
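The snapshot-generation steps recited in this claim (segmenting, fingerprinting, checking the fingerprint index, and storing only new segments) can be illustrated with a minimal sketch. It is not the patented implementation: the fixed segment size, the use of SHA-256 as the fingerprint function, and the names `DedupStore`, `take_snapshot`, and `reconstruct` are all assumptions made for illustration.

```python
import hashlib

SEGMENT_SIZE = 4  # assumption: fixed-size segments; the claim does not fix a size


def segment(data: bytes, size: int = SEGMENT_SIZE) -> list[bytes]:
    """Break the data into a plurality of segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(seg: bytes) -> str:
    """Compute a fingerprint for a segment (SHA-256 is an assumption)."""
    return hashlib.sha256(seg).hexdigest()


class DedupStore:
    """Hypothetical deduplicating system holding a fingerprint index."""

    def __init__(self) -> None:
        # Fingerprint index: one entry per unique stored segment.
        self.index: dict[str, bytes] = {}

    def take_snapshot(self, data: bytes) -> list[str]:
        segments = segment(data)
        # The snapshot is a list of fingerprints; identical segments
        # produce the same fingerprint, which is repeated in the list.
        snapshot = [fingerprint(s) for s in segments]
        # Identify fingerprints in the snapshot that are not in the index.
        new_fps = {fp for fp in snapshot if fp not in self.index}
        # Store only the segments that correspond to those fingerprints.
        for fp, seg in zip(snapshot, segments):
            if fp in new_fps:
                self.index[fp] = seg
        return snapshot

    def reconstruct(self, snapshot: list[str]) -> bytes:
        """Rebuild the data from the stored segments named by a snapshot."""
        return b"".join(self.index[fp] for fp in snapshot)
```

For example, taking a snapshot of `b"aaaabbbbaaaa"` yields three fingerprints, the first and third identical, while only two segments are actually stored.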
Abstract
A system for processing data comprises a deduplicating system, an interface, and a processor. The deduplicating system stores a copy of data stored in a data storage system by storing a set of segments that is able to reconstruct the data stored in the data storage system. The data storage system has a stored log file. The stored log file stores a data change and an associated time for the data change. The interface receives an indication to revert data stored in the data storage system to a state at a snapshot time. The processor is configured to determine a first subset of the data stored in the data storage system that has changed since a prior snapshot using the stored log file and to determine a second subset of the data stored in the data storage system that has changed between the prior snapshot and the snapshot time using a first list of fingerprints associated with the prior snapshot and a second list of fingerprints associated with the snapshot time.
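The two change-detection steps in the abstract can be sketched as follows. This is a sketch under stated assumptions, not the method as specified: the `LogEntry` record, the idea that each log entry names a changed segment position, the position-by-position comparison of the two fingerprint lists, and the union of the two subsets in `segments_to_revert` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass
class LogEntry:
    """Hypothetical log record: a data change and its associated time."""
    time: float
    position: int  # assumption: index of the segment that changed


def changed_since(log: list[LogEntry], prior_snapshot_time: float) -> set[int]:
    """First subset: positions changed since the prior snapshot, per the log."""
    return {e.position for e in log if e.time > prior_snapshot_time}


def changed_between(prior_fps: list[str], target_fps: list[str]) -> set[int]:
    """Second subset: positions whose fingerprints differ between the list
    for the prior snapshot and the list for the requested snapshot time."""
    return {
        i
        for i, (a, b) in enumerate(zip_longest(prior_fps, target_fps))
        if a != b
    }


def segments_to_revert(log, prior_snapshot_time, prior_fps, target_fps) -> set[int]:
    # Assumption: reverting touches the union of the two subsets.
    return changed_since(log, prior_snapshot_time) | changed_between(
        prior_fps, target_fps
    )
```

Under these assumptions, only the positions returned by `segments_to_revert` would need to be rewritten from the deduplicating system, rather than the full data set.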
19 Claims
1. A system for processing data, comprising:
a deduplicating system that stores a copy of at least a portion of data stored in a data storage system at a first snapshot time at least in part by generating a first snapshot of the data, wherein the deduplicating system includes a fingerprint index that comprises a list of fingerprints associated with every unique segment stored in the deduplicating system, wherein the data storage system includes a stored log file that stores one or more data changes and times associated with each data change, and wherein the deduplicating system generates the first snapshot by:
breaking at least a portion of the data at the data storage system into a plurality of segments;
computing a fingerprint for each of at least a subset of the plurality of segments;
storing the fingerprints in the first snapshot, wherein a snapshot includes a list of fingerprints of data comprising at least the portion of the data stored in the data storage system, and wherein a fingerprint that corresponds to identical segments is repeated in a snapshot;
identifying, based at least in part on the fingerprint index, fingerprints in the snapshot that are not in the fingerprint index; and
storing only segments that correspond to the identified fingerprints such that each stored segment is able to be used to reconstruct the data stored in the data storage system;
an interface for receiving an indication to revert data stored in the data storage system to a state at a snapshot time; and
a processor configured to:
determine a first subset of the data stored in the data storage system that has changed since a prior snapshot using the stored log file; and
determine a second subset of the data stored in the data storage system that has changed between the prior snapshot and the snapshot time using a first list of fingerprints associated with the prior snapshot and a second list of fingerprints associated with the snapshot time.
10. A method for processing data, comprising:
receiving an indication to revert data stored in a data storage system to a state at a snapshot time, wherein the data storage system for storing data has a stored log file, wherein the stored log file stores a data change and an associated time for the data change, wherein a deduplicating system stores a copy of data stored in the data storage system at a first snapshot time at least in part by generating a first snapshot of the data, wherein the deduplicating system includes a fingerprint index that comprises a list of fingerprints associated with every unique segment stored in the deduplicating system, and wherein the deduplicating system generates the first snapshot by:
breaking at least a portion of the data at the data storage system into a plurality of segments;
computing a fingerprint for each of at least a subset of the plurality of segments;
storing the fingerprints in the first snapshot, wherein a snapshot includes a list of fingerprints of data comprising at least the portion of the data stored in the data storage system, and wherein a fingerprint that corresponds to identical segments is repeated in a snapshot;
identifying, based at least in part on the fingerprint index, fingerprints in the snapshot that are not in the fingerprint index; and
storing only segments that correspond to the identified fingerprints such that each stored segment is able to be used to reconstruct the data stored in the data storage system;
determining, using a processor, a first subset of the data stored in the data storage system that has changed since a prior snapshot using the stored log file; and
determining a second subset of the data stored in the data storage system that has changed between the prior snapshot and the snapshot time using a first list of fingerprints associated with the prior snapshot and a second list of fingerprints associated with the snapshot time.
19. A computer program product for processing data, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for:
receiving an indication to revert data stored in a data storage system to a state at a snapshot time, wherein the data storage system for storing data has a stored log file, wherein the stored log file stores a data change and an associated time for the data change, wherein a deduplicating system stores a copy of data stored in the data storage system at a first snapshot time at least in part by generating a first snapshot of the data, wherein the deduplicating system includes a fingerprint index that comprises a list of fingerprints associated with every unique segment stored in the deduplicating system, and wherein the deduplicating system generates the first snapshot by:
breaking at least a portion of the data at the data storage system into a plurality of segments;
computing a fingerprint for each of at least a subset of the plurality of segments;
storing the fingerprints in the first snapshot, wherein a snapshot includes a list of fingerprints of data comprising at least the portion of the data stored in the data storage system, and wherein a fingerprint that corresponds to identical segments is repeated in a snapshot;
identifying, based at least in part on the fingerprint index, fingerprints in the snapshot that are not in the fingerprint index; and
storing only segments that correspond to the identified fingerprints such that each stored segment is able to be used to reconstruct the data stored in the data storage system;
determining a first subset of the data stored in the data storage system that has changed since a prior snapshot using the stored log file; and
determining a second subset of the data stored in the data storage system that has changed between the prior snapshot and the snapshot time using a first list of fingerprints associated with the prior snapshot and a second list of fingerprints associated with the snapshot time.
Specification