Extracting data changes and storing data history to allow for instantaneous access to and reconstruction of any point-in-time data
First Claim
1. A method for capturing and storing a data history of a file to enable instantaneous access to and reconstruction of any version of the file, comprising:
for a first version of the file at a first location;
storing a full copy of the file at a second location;
generating a first byte range index at the second location;
wherein the first byte range index references the entire contents of the full copy of the file;
storing the first byte range index at the second location; and
labeling the contents of the file referenced by the first byte range index as the first version of the file;
for a second version of the file at the first location;
comparing the second version of the file to the first version of the file to generate one or more delta strings associated with the second version of the file;
storing the one or more delta strings associated with the second version of the file at the second location;
generating a second byte range index at the second location that refers to bytes in the full copy of the file and to bytes in the one or more delta strings associated with the second version of the file;
wherein the second byte range index references the entire contents of the second version of the file;
storing the second byte range index at the second location;
wherein storing the second byte range index does not overwrite the first byte range index;
labeling the contents referenced by the second byte range index as the second version of the file; and
using the second byte range index to enable instantaneous access to and reconstruction of the second version of the file without having to apply to the full copy of the file the one or more delta strings associated with the second version of the file.
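The steps recited in claim 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Extent`, `VersionStore`, `add_version`, and `read` are hypothetical, the "second location" is modeled as in-memory dictionaries, and the comparison step uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever differencing method an actual system would use.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Extent:
    """One byte range: `length` bytes at `offset` inside blob `source` (hypothetical layout)."""
    source: str
    offset: int
    length: int


class VersionStore:
    """Toy model of the 'second location': one full copy, delta strings, one index per version."""

    def __init__(self, first_version: bytes):
        # Full copy of the first version, plus an index that references all of it.
        self.blobs = {"base": first_version}
        self.indexes = {1: [Extent("base", 0, len(first_version))]}

    def _slice(self, ver: int, start: int, length: int) -> list:
        """Return the stored extents that back bytes [start, start + length) of version `ver`."""
        out, pos, end = [], 0, start + length
        for e in self.indexes[ver]:
            lo, hi = max(start, pos), min(end, pos + e.length)
            if lo < hi:
                out.append(Extent(e.source, e.offset + (lo - pos), hi - lo))
            pos += e.length
        return out

    def add_version(self, new: bytes) -> int:
        """Compare `new` with the latest version, store only changed bytes, add a new index."""
        prev_no = max(self.indexes)
        prev = self.read(prev_no)
        ver = prev_no + 1
        key = f"delta{ver}"
        delta, index = bytearray(), []
        matcher = SequenceMatcher(None, prev, new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged bytes: point at data that is already stored.
                index.extend(self._slice(prev_no, i1, i2 - i1))
            elif j2 > j1:
                # Inserted or replaced bytes: append to this version's delta string.
                index.append(Extent(key, len(delta), j2 - j1))
                delta += new[j1:j2]
        self.blobs[key] = bytes(delta)
        self.indexes[ver] = index  # earlier indexes are never overwritten
        return ver

    def read(self, ver: int) -> bytes:
        """Rebuild a version by concatenating indexed ranges; no delta is applied to the base."""
        return b"".join(
            self.blobs[e.source][e.offset:e.offset + e.length]
            for e in self.indexes[ver]
        )
```

In this sketch, reading any version is a lookup of its own index followed by range reads, so the cost does not grow with the number of deltas between the full copy and the requested version. Each new index refers directly to stored bytes rather than to an earlier index, which is why the earlier indexes can stay untouched.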
Abstract
A “forward” delta data management technique uses a “sparse” index associated with a delta file both to achieve delta management efficiency and to eliminate read latency when accessing history data. The invention may be implemented advantageously in a data management system that provides real-time data services to data sources associated with a set of application host servers. To facilitate a given data service, a host driver embedded in an application server connects an application and its data to a cluster. The host driver captures real-time data transactions, preferably in the form of an event journal that is provided to the data management system. In particular, the driver functions to translate traditional file/database/block I/O into a continuous, application-aware, output data stream. In an illustrative embodiment, a given application-aware data stream is processed through a multi-stage data reduction process to produce a compact data representation from which an “any point-in-time” reconstruction of the original data can be made.
14 Claims
Specification