System and method for storing redundant information
Abstract
A method and system for reducing storage requirements and speeding up storage operations by reducing the storage of redundant data includes receiving a request that identifies one or more data objects to which to apply a storage operation. For each data object, the storage system determines if the data object contains data that matches another data object to which the storage operation was previously applied. If the data objects do not match, then the storage system performs the storage operation in a usual manner. However, if the data objects do match, then the storage system may avoid performing the storage operation.
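The match-then-skip decision described in the abstract can be illustrated with a minimal sketch. It assumes SHA-256 as the matching test and an in-memory dictionary as the store; the class name `SingleInstanceStore` and its methods are hypothetical and do not come from the patent.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory store that keeps one copy of each distinct payload."""

    def __init__(self):
        self.blobs = {}       # digest -> bytes, written once per distinct payload
        self.references = {}  # object name -> digest of the stored payload

    def store(self, name, data):
        """Record `name`; return True only if new payload bytes were written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            # No match with earlier data: perform the storage operation as usual.
            self.blobs[digest] = data
        # Match or not, remember which payload this object refers to.
        self.references[name] = digest
        return is_new

    def load(self, name):
        """Return the payload for `name` by following its reference."""
        return self.blobs[self.references[name]]
```

Storing the same bytes under a second name only adds a reference, so the payload is written once however many objects share it.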
20 Claims
1. A method performed by a computer system of storing a single-instance copy on a sequential storage medium, wherein the single instance copy is created from copies of original data objects, the method comprising:
receiving or accessing multiple data objects from a computer network;
wherein some of the multiple data objects are substantially identical according to a hashing algorithm;
storing, on a random-access storage medium, a single-instance copy of the multiple data objects;
wherein the single-instance copy contains a copy of only one of the substantially identical data objects; and
wherein the random-access storage medium includes at least one reference to the copy of the only one of the substantially identical data objects;
storing the single-instance copy of the multiple data objects on a sequential storage medium by;
transferring the copy of the only one of the substantially identical data objects from the random-access storage medium to the sequential storage medium; and
transferring the at least one reference to the copy of the substantially identical data objects from the random-access storage medium to the sequential storage medium after the copy of the only one of the substantially identical data objects is stored on the sequential storage medium.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
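The ordering in claim 1, where the single data copy reaches the sequential medium before any reference to it, can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the sequential medium is modeled as an append-only Python list, SHA-256 stands in for the hashing algorithm, and the function names and the `"DATA"`/`"REF"` record tags are hypothetical.

```python
import hashlib


def stage_single_instance(objects):
    """Build the random-access copy: one blob per digest plus (name, digest) references."""
    blobs, refs = {}, []
    for name, data in objects:
        digest = hashlib.sha256(data).hexdigest()
        blobs.setdefault(digest, data)  # keep only the first of identical objects
        refs.append((name, digest))
    return blobs, refs


def transfer_to_sequential(blobs, refs):
    """Append records so every DATA record precedes the REF records that cite it."""
    tape = []
    for digest, data in blobs.items():
        tape.append(("DATA", digest, data))
    for name, digest in refs:
        tape.append(("REF", name, digest))
    return tape


def restore(tape):
    """Rebuild all objects in a single forward pass over the records."""
    seen, restored = {}, {}
    for kind, first, second in tape:
        if kind == "DATA":
            seen[first] = second
        else:
            # The cited data record was already read earlier in this pass.
            restored[first] = seen[second]
    return restored
```

Because references never precede their data, a reader that can only move forward through the medium can resolve each reference without seeking backward.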
9. A method performed by a computer system of storing a de-duplicated copy of data objects on a sequential storage medium, comprising:
receiving one or more data objects in a hierarchy, wherein some of the data objects are identified as identical based on hashing;
storing, on a random-access storage medium, a de-duplicated copy of the one or more data objects, wherein the de-duplicated copy contains information describing:
a first instance of each of the one or more data objects, and
one or more references to the one or more first instances as stored on the random-access storage medium; and
transferring the de-duplicated copy of the one or more data objects from the random-access storage medium to a sequential storage medium for storage on the sequential storage medium.
Dependent claims: 10, 11
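As an illustration of the de-duplicated copy in claim 9, the sketch below walks a hierarchy and separates first instances from references to them. It assumes the hierarchy is a nested dictionary whose leaves are `bytes`, and uses SHA-256 for the hashing step; `deduplicate_hierarchy` and the path format are hypothetical names chosen for this example.

```python
import hashlib


def deduplicate_hierarchy(tree):
    """Split a nested dict of bytes into first instances and references to them."""
    first_instances = {}  # digest -> (path of first instance, data)
    references = {}       # path of a later duplicate -> path of its first instance

    def walk(node, prefix):
        for name in sorted(node):  # fixed order so "first" is deterministic
            path = f"{prefix}/{name}"
            child = node[name]
            if isinstance(child, dict):
                walk(child, path)
                continue
            digest = hashlib.sha256(child).hexdigest()
            if digest in first_instances:
                references[path] = first_instances[digest][0]
            else:
                first_instances[digest] = (path, child)

    walk(tree, "")
    return first_instances, references
```

The two returned mappings together describe the whole hierarchy, so they can be transferred to another medium as one unit.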
12. A non-transitory computer-readable medium containing instructions for controlling a computer system to execute a method of storing a copy of data objects on a sequential storage medium, the method comprising:
receiving or accessing multiple data objects from a computer network;
wherein some of the multiple data objects are substantially identical according to a hashing algorithm;
storing, on a random-access storage medium, a single-instance copy of the multiple data objects;
wherein the single-instance copy contains a copy of only one of the substantially identical data objects; and
wherein the random-access storage medium includes at least one reference to the copy of the only one of the substantially identical data objects;
storing the single-instance copy of the multiple data objects on a sequential storage medium by;
transferring the copy of the only one of the substantially identical data objects from the random-access storage medium to the sequential storage medium; and
transferring the at least one reference to the copy of the substantially identical data objects from the random-access storage medium to the sequential storage medium after the copy of the only one of the substantially identical data objects is stored on the sequential storage medium.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20
Specification