Application-aware and remote single instance data management
Abstract
A method and system for reducing storage requirements and speeding up storage operations by reducing the storage of redundant data includes receiving a request that identifies one or more files or data objects to which to apply a storage operation. For each file or data object, the storage system determines if the file or data object contains data that matches another file or data object to which the storage operation was previously applied, based on awareness of the application that created the data object. If the data objects do not match, then the storage system performs the storage operation in a usual manner. However, if the data objects do match, then the storage system may avoid performing the storage operation with respect to the particular file or data object.
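The decision described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `StorageSystem` class, the SHA-256 digest used to detect matching data, and the per-application extractors (including the toy "mail" format) are hypothetical and are not taken from the patent text.

```python
import hashlib
from typing import Callable, Dict


def whole_object(raw: bytes) -> bytes:
    """Generic case: compare the object exactly as received."""
    return raw


def mail_attachment(raw: bytes) -> bytes:
    """Assumed toy mail format: headers, a blank line, then the attachment."""
    _headers, _sep, body = raw.partition(b"\n\n")
    return body


# Hypothetical application awareness: each application type names the
# bytes that should be compared when looking for matching data.
EXTRACTORS: Dict[str, Callable[[bytes], bytes]] = {
    "generic": whole_object,
    "mail": mail_attachment,
}


class StorageSystem:
    def __init__(self) -> None:
        self.stored: Dict[str, bytes] = {}  # digest -> stored data
        self.operations_performed = 0

    def apply_storage_operation(self, raw: bytes, app: str = "generic") -> bool:
        """Return True if the operation ran, False if it was avoided."""
        payload = EXTRACTORS[app](raw)
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self.stored:
            # Matching data was already handled, so skip the operation.
            return False
        self.stored[digest] = payload
        self.operations_performed += 1
        return True
```

In this sketch, two mail objects with different headers but the same attachment bytes produce one stored instance, because the "mail" extractor compares only the attachment.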
19 Claims
1. A system for copying files or data objects from a computer system at a first location to a second location, including single-instancing the files or data objects with a plurality of differing associated metadata, the system comprising:
- a processor; and
- multiple hardware components, including:
  - a storage operation manager component coupled to the processor and configured to receive a request to copy a file or data object from a computer system at a first location to a second location, wherein the first location and the second location are geographically remote from each other;
  - a file cache component at the first location configured to:
    - receive the file or data object to be copied from the computer system, and
    - store the file or data object before it is copied to the second location;
  - a single instance database component at the first location configured to:
    - extract metadata associated with the file or data object,
    - query the second location to determine whether the file or data object is already stored at the second location, wherein the query includes the extracted metadata, and
    - receive a response from the second location that indicates whether the file or data object is already stored at the second location, wherein the response is based on determining at the second location whether the extracted metadata matches metadata from any files or data objects stored at the second location; and
  - wherein the single instance database component at the first location is further configured to:
    - when the file or data object is not already stored at the second location, copy the file or data object from the file cache component at the first location to the second location, and
    - when the file or data object is already stored at the second location and the extracted metadata does not match metadata stored at the second location,
      (a) single-instance the file or data object at the second location by declining to copy the file or data object thereto from the file cache component at the first location, and
      (b) copy the extracted metadata to the second location and associate the extracted metadata with the already-stored file or data object at the second location, thereby storing for a single stored instance of the file or data object at the second location at least a first metadata version and a second metadata version that is different from the first metadata version.
Dependent claims: 2, 3, 4, 5, 6
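The interaction between the components recited in claim 1 might look roughly like the following sketch. It is illustrative only: the class names, the `name`/`owner` metadata fields, and the use of a SHA-256 digest to let the second location recognize an already-stored object are assumptions, since the claim does not say how the match is computed. The "skipped" outcome, where both the object and its metadata already exist, is also an assumption, as the claim does not recite that case.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List

Metadata = Dict[str, str]


def digest_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass
class DataObject:
    name: str
    owner: str
    data: bytes


def extract_metadata(obj: DataObject) -> Metadata:
    # Hypothetical metadata fields chosen for the example.
    return {"name": obj.name, "owner": obj.owner}


@dataclass
class SecondLocation:
    """Stands in for the remote site: one stored copy per digest."""
    objects: Dict[str, bytes] = field(default_factory=dict)
    metadata: Dict[str, List[Metadata]] = field(default_factory=dict)

    def query(self, digest: str, meta: Metadata) -> Dict[str, bool]:
        stored = digest in self.objects
        meta_match = stored and meta in self.metadata[digest]
        return {"stored": stored, "metadata_match": meta_match}

    def put_object(self, digest: str, data: bytes, meta: Metadata) -> None:
        self.objects[digest] = data
        self.metadata[digest] = [meta]

    def add_metadata(self, digest: str, meta: Metadata) -> None:
        self.metadata[digest].append(meta)


@dataclass
class FirstLocation:
    remote: SecondLocation
    file_cache: Dict[str, bytes] = field(default_factory=dict)

    def copy(self, obj: DataObject) -> str:
        self.file_cache[obj.name] = obj.data  # stage before any copy
        meta = extract_metadata(obj)
        digest = digest_of(obj.data)
        reply = self.remote.query(digest, meta)
        if not reply["stored"]:
            self.remote.put_object(digest, self.file_cache[obj.name], meta)
            outcome = "copied"
        elif not reply["metadata_match"]:
            # Single-instance: send only the new metadata version.
            self.remote.add_metadata(digest, meta)
            outcome = "metadata-only"
        else:
            outcome = "skipped"  # assumption: nothing new to send
        del self.file_cache[obj.name]
        return outcome
```

Copying the same bytes under two different names or owners leaves one stored object at the second location with two metadata versions attached to it, and only the first copy sends the object data across the link.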
7. A non-transitory computer-readable storage medium encoded with instructions for controlling a computer system to transfer files from a computer system at a source location to a target location, by a method comprising:
- receiving a request to transfer a file from a computer system at a source location to a target location, wherein the target location includes a single instance database, and wherein the source location and the target location are geographically remote from each other;
- sending a request to the single instance database to determine whether the file matches any file already stored by the single instance database and wherein metadata extracted from the file to be transferred matches any metadata associated with any file already stored by the single instance database;
- receiving a determination from the single instance database as to whether the file matches any file already stored by the single instance database;
- when the file does not match any file already stored at the target location, storing the file from the computer system at the source location to the single instance database at the target location;
- receiving a determination from the single instance database as to whether the extracted metadata matches any metadata associated with any file stored by the single instance database; and
- when the file is already stored by the single instance database and the extracted metadata does not match metadata associated with the already-stored file:
  (a) declining to store the file from the computer system to the single instance database at the target location, and
  (b) storing the extracted metadata from the computer system to the single instance database at the target location and associating the extracted metadata with the already-stored file, thereby storing for a single stored instance of the file at the single instance database at least a first metadata version and a second metadata version that is different from the first metadata version.
Dependent claims: 8, 9, 10, 11, 12, 13
14. A method for transferring files from a computer system at a source location to a target location, the method comprising:
- receiving a request to transfer a file from a computer system at a source location to a target location, wherein the target location includes a single instance database, and wherein the source location and the target location are geographically remote from each other;
- sending a request to the single instance database to determine whether the file matches any file already stored by the single instance database and wherein metadata extracted from the file to be transferred matches any metadata associated with any file already stored by the single instance database;
- receiving a determination from the single instance database as to whether the file matches any file already stored by the single instance database;
- when the file does not match any file already stored at the target location, storing the file from the computer system at the source destination to the single instance database at the target location;
- receiving a determination from the single instance database as to whether the extracted metadata matches any metadata associated with any file stored by the single instance database; and
- when the file is already stored by the single instance database and the extracted metadata does not match metadata associated with the already-stored file:
  (a) declining to store the file from the computer system at the source location to the single instance database at the target location, and
  (b) storing the extracted metadata from the computer system to the target location and associating the extracted metadata with the already-stored file, thereby storing for a single stored instance of the file at the single instance database at least a first metadata version and a second metadata version that is different from the first metadata version.
Dependent claims: 15, 16, 17, 18, 19
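The method of claims 7 and 14 turns on two determinations returned by the single instance database: whether the file matches a stored file, and whether the extracted metadata matches stored metadata. One way to picture the resulting actions is a small decision function. The `Action` names and `plan_transfer` are hypothetical, and the final branch (file and metadata both already present) is an assumption, because the claims do not recite that case.

```python
from enum import Enum
from typing import List


class Action(Enum):
    STORE_FILE = "store the file at the target location"
    DECLINE_FILE = "decline to store the file"
    STORE_METADATA = "store the extracted metadata and associate it with the stored file"


def plan_transfer(file_matches: bool, metadata_matches: bool) -> List[Action]:
    """Map the two determinations to the actions taken at the source."""
    if not file_matches:
        # No stored instance yet: the file itself is transferred.
        return [Action.STORE_FILE]
    if not metadata_matches:
        # Steps (a) and (b): keep one instance, add a new metadata version.
        return [Action.DECLINE_FILE, Action.STORE_METADATA]
    # Assumption: nothing new to record when both already match.
    return [Action.DECLINE_FILE]
```

For example, `plan_transfer(True, False)` yields the decline-and-store-metadata pair that corresponds to steps (a) and (b).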
Specification