Systems and methods for creating copies of data, such as archive copies
Abstract
A system and method of creating archive copies of data sets is described. In some examples, the system creates an archive copy from an original data set. In some examples, the system creates an archive copy when creating a recovery copy for a data set. In some examples, the system creates a copy without redundant data, and then encrypts the data set.
20 Claims
1. At least one data storage system that creates an archive copy of data originating from a file system of one or more client computers, comprising:
- a hierarchical storage system having:
a storage manager server computer, and
at least one data store subsystem,
wherein the storage manager server computer includes at least one storage policy for directing storage operations,
wherein the one or more client computers are coupled for communication among the storage manager server computer and the data store subsystem,
wherein the data store subsystem is geographically separated from the one or more client computers, and
wherein the hierarchical storage system comprises;
an archive copy creation subsystem, wherein the archive copy creation subsystem creates an archive copy of at least a subset of data from the file system based on the storage policy of the storage manager server computer,
wherein the archive copy encapsulates a header file having cryptographically and substantially unique identifiers of each of multiple data files within the subset of data,
wherein the archive copy creation subsystem inserts the header file into a beginning of the archive copy prior to transfer of the archive copy to the data store subsystem,
wherein the archive copy creation subsystem includes;
a data selection component, wherein the data selection component selects the subset of data to be copied based at least in part on the storage policy,
wherein the selected subset of data is to be stored for a longer period of time than other data;
an indexing component, wherein the indexing component creates an index of content within the selected subset of data; and
a data adjustment component, communicatively coupled to the data selection component,
wherein the data adjustment component:
performs data compression of the selected subset of data;
performs data deduplication for the selected subset of data, or
performs both data compression and data deduplication for the selected subset of data.
Dependent claims: 2, 3, 4, 5, 6, 18, 19, 20.
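The interplay between the storage policy, the data selection component, and the data adjustment component in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The names StoragePolicy, FileRecord, select_subset, and adjust, the retention-days field, and the choice of SHA-256 and zlib are all assumptions made for illustration; the claim names no particular hash or compression algorithm.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    # Hypothetical policy fields; the claim only says a policy directs storage operations.
    min_retention_days: int
    compress: bool
    deduplicate: bool


@dataclass(frozen=True)
class FileRecord:
    path: str
    data: bytes
    retention_days: int


def select_subset(files, policy):
    """Data selection: keep files that must be stored longer than other data."""
    return [f for f in files if f.retention_days >= policy.min_retention_days]


def adjust(subset, policy):
    """Data adjustment: compression, deduplication, or both, as the policy directs.

    Returns {path: (identifier, stored_bytes)}; stored_bytes is None for a
    duplicate whose content is already held under the same identifier.
    """
    seen = set()
    result = {}
    for f in subset:
        # Content-based identifier (SHA-256 is an assumed choice).
        identifier = hashlib.sha256(f.data).hexdigest()
        if policy.deduplicate and identifier in seen:
            result[f.path] = (identifier, None)
            continue
        seen.add(identifier)
        stored = zlib.compress(f.data) if policy.compress else f.data
        result[f.path] = (identifier, stored)
    return result
```

With both flags set, two files with identical content share one identifier and only the first carries stored bytes; with only one flag set, the same function performs compression alone or deduplication alone.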
7. A method of archiving an original set of data created by a file system, the method comprising:
at a first time, identifying one or more redundant data objects within the original set of data and creating a copy of the original set of data that does not include the identified one or more redundant data objects,
wherein the identifying is based on cryptographically and substantially unique identifiers of data objects in the original set of data,
wherein the cryptographically and substantially unique identifiers are based on content of the data objects in the original set, and
wherein identifying includes searching an index of the cryptographically and substantially unique identifiers;
appending at least one header file to the beginning of the copy of the original set of data to create an archive copy, before a second time that is after the first time,
wherein the at least one header file includes the cryptographically and substantially unique identifiers of the data objects of the copy of the original set of data;
at the second time after the first time, encrypting or compressing the copy of the original set of data; and
at a third time after the first time and the second time, transferring the archive copy to a secondary storage device,
wherein the archive copy is stored in the secondary storage device in a geographical location separated from the file system,
wherein the archive copy is to be stored for a longer period of time than the original set of data,
wherein the archive copy is in a format that differs from a format of the original set of data, and
wherein data within the archive copy cannot be used by applications that created the original set of data without first decrypting, decompressing or converting the data within the archive copy.
Dependent claims: 8, 9, 10, 11, 12, 13.
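The ordering of steps in claim 7 (deduplicate, place the header at the beginning, then compress the copy) can be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: the function names, the JSON header with a 4-byte length prefix, SHA-256 as the identifier, and zlib as the compressor are all hypothetical choices, and the transfer to secondary storage is left out.

```python
import hashlib
import json
import struct
import zlib


def deduplicate(objects, index):
    """First time: drop objects whose content identifier is already in the index."""
    unique = []
    for name, data in objects:
        identifier = hashlib.sha256(data).hexdigest()  # content-based identifier
        if identifier in index:
            continue  # redundant object, not included in the copy
        index.add(identifier)
        unique.append((name, identifier, data))
    return unique


def make_archive(objects):
    """Build header + compressed copy; the header sits at the beginning."""
    index = set()
    unique = deduplicate(objects, index)
    header = json.dumps(
        [{"name": n, "id": i, "size": len(d)} for n, i, d in unique]
    ).encode("utf-8")
    body = b"".join(d for _, _, d in unique)
    compressed = zlib.compress(body)  # second time: compress the copy
    # Assumed layout: 4-byte big-endian header length, header, compressed body.
    return struct.pack(">I", len(header)) + header + compressed


def restore(archive):
    """Read the header, decompress the body, and split it back into objects."""
    (header_len,) = struct.unpack(">I", archive[:4])
    header = json.loads(archive[4:4 + header_len])
    body = zlib.decompress(archive[4 + header_len:])
    restored, position = {}, 0
    for entry in header:
        restored[entry["name"]] = body[position:position + entry["size"]]
        position += entry["size"]
    return restored
```

Because the body is compressed, the archive is in a different format from the original data and has to be decompressed before an application can use it, which mirrors the final limitation of the claim.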
14. At least one non-transitory computer-readable medium storing instructions that, when executed by at least one data processing device, cause the creation of a copy of a production volume of data for archiving the production volume, comprising:
receiving or accessing two or more secondary copies of the production volume of the data,
wherein the two or more secondary copies include multiple instances of one or more data objects within the volume of data;
creating or accessing an index of cryptographically and substantially unique identifiers of each data object within the two or more secondary copies of the production volume;
identifying multiple instances of the one or more data objects using the index of cryptographically and substantially unique identifiers;
storing the data from the two or more secondary copies into an archive copy,
wherein the stored data includes only one instance for at least one data object having multiple instances,
wherein the stored data includes at least part of the index of cryptographically and substantially unique identifiers in a header file that is at a beginning of the archive copy;
encrypting or compressing the data stored within the archive copy,
wherein the archive copy is for storing data for a longer period of time than the two or more secondary copies; and
after encrypting or compressing, transferring the archive copy to a secondary storage device.
Dependent claims: 15, 16, 17.
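Claim 14 starts from several secondary copies rather than from the original data. A rough sketch of single-instancing across copies is given below. It assumes each secondary copy is a simple mapping of object name to bytes, uses SHA-256 and zlib as stand-ins for the unspecified identifier and compression schemes, and uses a newline-terminated JSON header; the names build_index and archive_copies are hypothetical.

```python
import hashlib
import json
import zlib


def build_index(copies):
    """Map each content identifier to every (copy number, object name) holding it."""
    index = {}
    for number, copy in enumerate(copies):
        for name, data in copy.items():
            identifier = hashlib.sha256(data).hexdigest()
            index.setdefault(identifier, []).append((number, name))
    return index


def archive_copies(copies):
    """Store one instance per identifier, with the index in a leading header."""
    index = build_index(copies)
    stored = {}
    for copy in copies:
        for data in copy.values():
            # setdefault keeps only the first instance of each identifier.
            stored.setdefault(hashlib.sha256(data).hexdigest(), data)
    # Compact JSON contains no newline, so the first newline ends the header.
    header = json.dumps(index, sort_keys=True).encode("utf-8") + b"\n"
    body = zlib.compress(b"".join(stored[key] for key in sorted(stored)))
    return header + body, index
```

An identifier that maps to more than one entry in the index marks a data object with multiple instances; the archive body still holds that object only once.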
Specification