System and method for storing redundant information
Abstract
A method and system for reducing storage requirements and speeding up storage operations by reducing the storage of redundant data includes receiving a request that identifies one or more data objects to which to apply a storage operation. For each data object, the storage system determines if the data object contains data that matches another data object to which the storage operation was previously applied. If the data objects do not match, then the storage system performs the storage operation in a usual manner. However, if the data objects do match, then the storage system may avoid performing the storage operation.
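The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a match is detected; comparing SHA-256 digests of the content is an assumption made here, and the class and method names (`SingleInstanceStore`, `store`, `load`) are hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy store that keeps one instance per distinct content digest.

    Assumption: two data objects "match" when their SHA-256 digests are equal.
    """

    def __init__(self):
        self.instances = {}   # digest -> bytes (the single stored instance)
        self.references = {}  # object name -> digest of its instance

    def store(self, name, data):
        """Record the object; return True only if its data had to be written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.instances
        if is_new:
            # No earlier object matches: perform the storage operation as usual.
            self.instances[digest] = data
        # Matching or not, remember which instance this object refers to.
        self.references[name] = digest
        return is_new

    def load(self, name):
        """Return the data for an object by following its reference."""
        return self.instances[self.references[name]]
```

In this sketch a second object with identical content costs only one dictionary entry, which is the storage saving the abstract describes.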
15 Claims
1. A computer system for restoring data from a sequential storage medium, wherein the data has been deduplicated, the system comprising:
at least one processor;
at least one data storage device coupled to the at least one processor;
a receiving unit configured to receive a request to restore at least first and second different data objects from a deduplicated copy of the first and second data objects, wherein the deduplicated copy of the first and second data objects is stored on a sequential storage medium, wherein the first and second data objects, prior to deduplication and storage on the sequential storage medium, had multiple, identical instances, and wherein the deduplicated copy contains one instance, stored on the sequential storage medium, of the first and second data objects, and information describing one or more references to the one instance of the first and second data objects;
an identifying unit, in response to the received request, configured to:
when a requested first or second data object is stored as an instance, identify a location on the sequential storage medium of the instance of the respective first or second data objects, and
when a requested first or second data object is stored as a reference to an instance, identify a location on the sequential storage medium of the instance of the respective first or second data objects, and
configured to sort the identified locations; and
a restoring unit configured to restore the requested data objects to a random-access storage medium from the sorted locations on the sequential storage medium.
Dependent claims: 2, 3, 4, 5, 6
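The identify, sort, and restore sequence recited in claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the catalog layout, the dictionary standing in for the sequential medium, and the function names (`resolve_offset`, `plan_restore`, `restore`) are all hypothetical.

```python
def resolve_offset(name, catalog):
    """Return the offset on the sequential medium of the instance backing `name`.

    Assumed catalog layout: name -> ("instance", offset)
                         or name -> ("reference", name_of_instance_object).
    """
    kind, value = catalog[name]
    if kind == "instance":
        return value
    # A reference: look up the object that is stored as the instance.
    target_kind, target_offset = catalog[value]
    if target_kind != "instance":
        raise ValueError(f"{name!r} does not point at a stored instance")
    return target_offset


def plan_restore(requested, catalog):
    """Group requested objects by instance offset and sort by offset."""
    by_offset = {}
    for name in requested:
        by_offset.setdefault(resolve_offset(name, catalog), []).append(name)
    return sorted(by_offset.items())


def restore(requested, catalog, tape, destination):
    """Read each needed instance once, in ascending offset order.

    `tape` is a dict offset -> bytes standing in for the sequential medium;
    `destination` is a dict standing in for random-access storage.
    Returns the offsets in the order they were read.
    """
    read_order = []
    for offset, names in plan_restore(requested, catalog):
        data = tape[offset]          # single forward read per instance
        read_order.append(offset)
        for name in names:
            destination[name] = data
    return read_order
```

Sorting before reading means the medium is traversed in one forward pass, and an instance shared by several requested objects is read only once.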
7. A computer system for restoring data from a single-instance copy on a sequential storage medium, comprising:
a receiving unit configured to receive a request to restore one or more data objects from a single-instance copy of the data objects on a sequential storage medium, wherein some of the data objects are or were identical, and wherein the single-instance copy contains information describing a first instance of each of the one or more data objects, and one or more references to the first instances as stored on the sequential storage medium;
an identifying unit configured to, for each of the one or more data objects:
identify the storage location of the instance when the data object is stored as an instance in the single-instance copy, and
identify the storage location of the first instance when the data object is stored as a reference to a first instance in the single-instance copy; and
a restoring unit configured to restore the one or more data objects on a random-access storage medium in an order of the identified storage locations on the sequential storage medium.
8. A non-transitory computer-readable medium containing instructions for controlling a computer system to execute a method of copying a deduplicated copy of data from a sequential storage medium to a random-access storage medium for data restoration, the method comprising:
receiving a request to restore data objects from a deduplicated copy of data objects stored on a sequential storage medium, wherein the deduplicated copy contains information describing a first instance of each of the data objects, and one or more references to the first instances as stored on the sequential storage medium;
in response to the request, recreating at least a portion of the deduplicated copy on the random-access storage medium, wherein the references in the recreated copy refer to first instances as stored on the random-access storage medium;
receiving a request to restore one of the data objects to a destination location;
determining whether the data object to be restored is stored as an instance or a reference in the deduplicated copy on the random-access storage medium; and
when the data object is stored as an instance, storing the instance in the destination location, but when the data object is stored as a reference to a first instance, storing, if necessary, the first instance in the destination location.
Dependent claims: 9, 10, 11
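The final determining and storing steps of claim 8 can be sketched as below, operating on a copy that has already been recreated on random-access storage. The claim does not define when storing the first instance is "necessary"; this sketch assumes it means the destination does not already hold identical data. The layout of `staged_copy` and the function name `restore_object` are hypothetical.

```python
def restore_object(name, staged_copy, destination):
    """Restore one object from the recreated deduplicated copy.

    Assumed layout of `staged_copy` (the copy on random-access storage):
        name -> ("instance", data)
     or name -> ("reference", name_of_first_instance)
    `destination` is a dict standing in for the destination location.
    Returns True if data was written, False if the write was skipped.
    """
    kind, value = staged_copy[name]
    if kind == "instance":
        # Stored as an instance: copy it to the destination directly.
        destination[name] = value
        return True

    # Stored as a reference: fetch the first instance it refers to.
    _, first_data = staged_copy[value]
    if destination.get(name) == first_data:
        # Assumed reading of "if necessary": identical data is already there.
        return False
    destination[name] = first_data
    return True
```

Because the references in the recreated copy point at instances on random-access storage, following one is a direct lookup and needs no further access to the sequential medium.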
12. A method for copying a deduplicated copy of data from a sequential storage medium to a random-access storage medium for data restoration, the method comprising:
receiving a request to restore data objects from a deduplicated copy of data objects stored on a sequential storage medium, wherein the deduplicated copy contains information describing a first instance of each of the data objects, and one or more references to the first instances as stored on the sequential storage medium;
in response to the request, recreating at least a portion of the deduplicated copy on the random-access storage medium, wherein the references in the recreated copy refer to first instances as stored on the random-access storage medium;
receiving a request to restore one of the data objects to a destination location;
determining whether the data object to be restored is stored as an instance or a reference in the deduplicated copy on the random-access storage medium; and
when the data object is stored as an instance, storing the instance in the destination location, but when the data object is stored as a reference to a first instance, storing, if necessary, the first instance in the destination location.
Dependent claims: 13, 14, 15
Specification