SYSTEMS AND METHODS FOR CREATING COPIES OF DATA, SUCH AS ARCHIVE COPIES
Abstract
A system and method of creating archive copies of data sets is described. In some examples, the system creates an archive copy from an original data set. In some examples, the system creates an archive copy when creating a recovery copy for a data set. In some examples, the system creates a copy without redundant data, and then encrypts the data set.
28 Claims
1. A data storage system that creates and stores an archive copy of data originating from a file system, comprising:
a hierarchical storage system having a storage manager server computer, at least one data store subsystem, and one or more client computers coupled among the storage manager and the data store subsystem,
wherein the storage manager includes at least one storage policy for directing storage operations,
wherein each client computer includes a data agent that copies data originating from the file system,
wherein the one or more client computers communicate the copied data to be included in storage operations performed by the at least one data store subsystem based on the storage policy, and
wherein the hierarchical storage system comprises:
a copy creation subsystem, wherein the copy creation subsystem, based on the storage policy of the storage manager, creates a copy of at least a subset of data from the file system, the copy creation subsystem including:
a data selection component, wherein the data selection component selects the subset of data to be copied;
an indexing component, wherein the indexing component indexes content from the subset of data; and
a data redundancy component, communicatively coupled to the data selection component, wherein the data redundancy component identifies at least some redundancies within the subset of data and creates a copy of the subset of data without the identified redundancies; and
wherein the data store subsystem is communicatively coupled to the copy creation subsystem, and wherein the data store subsystem writes the created copy of the subset of data to data storage media.
Dependent claims: 2, 3, 4, 5.
6. A method of archiving an original set of data created by a file system, the method comprising:
at a first time, identifying one or more redundant data objects within the original set of data and creating a copy of the original set of data that does not include the identified one or more redundant data objects, wherein the original set of data is a production copy of the set of data that is for primary use by users of the set of data;
at a second time after the first time, encrypting the created copy of the original set of data; and
at a third time after the first time and the second time, storing the encrypted copy of the original set of data as an archive copy, wherein the archive copy is stored in a data storage medium separate from the file system, wherein the archive copy is in a format that differs from a format of the production copy of the set of data, and wherein data within the archive copy cannot be used by applications that created the set of data without first converting the data within the archive copy into another format that is different than the format of the archive copy.
Dependent claims: 7, 8.
9. A system for creating an archive copy of a data set created by a file system, comprising:
a signature component, wherein the signature component generates an alphanumeric identification signature for all data objects within the data set and stores the alphanumeric identification signatures in a signature database, wherein the alphanumeric identification signature is at least partially based on a first characteristic of the data object;
an encryption component, wherein the encryption component encrypts the data objects of the data set; and
a copy component, wherein the copy component:
creates an archive copy of the encrypted data objects, wherein the archive copy represents a logical view of storage of the data objects;
stores the archive copy via data chunks to a physical storage component; and
stores information related to locations of the encrypted data objects on the physical storage component in a location database separate from the signature database.
Dependent claims: 10, 11, 12, 13, 14.
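For illustration only, the three components recited in claim 9 could be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the signature is assumed to be a SHA-256 hex digest of the object's content, the encryption routine is supplied by the caller, and the names `archive_data_set`, `toy_cipher`, and `CHUNK_SIZE` are hypothetical.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

CHUNK_SIZE = 4  # hypothetical chunk size, kept tiny so the example is easy to trace


def signature(data: bytes) -> str:
    """Alphanumeric signature based on object content (assumption: SHA-256 hex)."""
    return hashlib.sha256(data).hexdigest()


def toy_cipher(data: bytes) -> bytes:
    """Placeholder only: XOR with a fixed byte is NOT real encryption."""
    return bytes(b ^ 0x5A for b in data)


def archive_data_set(
    data_set: Dict[str, bytes],
    encrypt: Callable[[bytes], bytes],
) -> Tuple[Dict[str, str], Dict[str, List[int]], List[bytes]]:
    signature_db: Dict[str, str] = {}        # object name -> signature
    location_db: Dict[str, List[int]] = {}   # object name -> chunk positions
    physical_store: List[bytes] = []         # stand-in for the physical storage component

    for name, data in data_set.items():
        # Signature component: record a signature for every data object.
        signature_db[name] = signature(data)
        # Encryption component: encrypt the object before it is copied.
        ciphertext = encrypt(data)
        # Copy component: write the object as chunks and remember where they went.
        chunk_ids: List[int] = []
        for start in range(0, len(ciphertext), CHUNK_SIZE):
            physical_store.append(ciphertext[start:start + CHUNK_SIZE])
            chunk_ids.append(len(physical_store) - 1)
        location_db[name] = chunk_ids

    return signature_db, location_db, physical_store
```

The signature database and the location database are kept as two separate dictionaries to mirror the claim's requirement that chunk locations be stored apart from the signatures.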
15. A computer-readable medium whose contents cause a data storage system to perform a method of building an archive of data objects within a set of data created by a file system, wherein at least some of the data objects of the set of data are stored within a primary copy of the set of data and wherein at least some of the data objects of the set of data are stored as copies within one or more secondary copies of the set of data, the method comprising:
identifying a data object to be stored in an archive of data objects;
creating an identifier for the data object, wherein creating the identifier includes calculating an alphanumeric representation for the data object;
comparing the identifier with other identifiers for data objects already stored in the archive of data objects;
when the comparison determines that the identifier for the data object is different than the other identifiers, encrypting a copy of the data object and transferring the encrypted copy of the data object to the archive of data objects; or
when the comparison determines that the identifier for the data object is similar to one or more of the other identifiers, transferring information that represents the data object to the archive of data objects; and
storing the transferred data object, or the information that represents the data object, to storage media that do not contain the primary copy of the set of data or the secondary copies of the set of the data.
Dependent claims: 16, 17, 18, 19, 20, 21.
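The compare-then-store decision recited in claim 15 can be illustrated with a short sketch. It rests on several assumptions that are not in the claim: the identifier is a SHA-256 hex digest, "similar" is simplified to exact equality of identifiers, the "information that represents the data object" is a name-to-identifier reference, and `archive_object` and `toy_cipher` are hypothetical names.

```python
import hashlib
from typing import Callable, Dict


def toy_cipher(data: bytes) -> bytes:
    """Placeholder only: XOR with a fixed byte is NOT real encryption."""
    return bytes(b ^ 0x5A for b in data)


def archive_object(
    name: str,
    data: bytes,
    archive: Dict[str, bytes],
    references: Dict[str, str],
    encrypt: Callable[[bytes], bytes],
) -> bool:
    """Add one object to the archive; return True if its content was newly stored."""
    # Create an alphanumeric identifier for the object (assumption: SHA-256 hex).
    identifier = hashlib.sha256(data).hexdigest()
    # Compare against identifiers of objects already in the archive.
    is_new = identifier not in archive
    if is_new:
        # Different from every stored identifier: encrypt and transfer the object.
        archive[identifier] = encrypt(data)
    # In both cases, record a reference that represents the object by name.
    references[name] = identifier
    return is_new
```

Recording a reference even for newly stored objects is a design choice of this sketch, so that every object name resolves through the same lookup.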
22. A method for creating a copy of a production volume of data used for archiving of the production volume, the method comprising:
creating two or more secondary copies of the production volume of the data, wherein the two or more secondary copies include multiple instances of one or more data objects within the volume of data;
removing at least some of the multiple instances of the one or more data objects;
storing the data from the two or more secondary copies into a tertiary copy, wherein the stored data includes only one instance for at least one data object having multiple instances; and
encrypting the data stored within the tertiary copy.
Dependent claims: 23.
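As a rough illustration of claim 22, the sketch below merges several secondary copies into a single-instanced tertiary copy and then encrypts it. The representation of a copy as a dictionary of named byte strings, the use of SHA-256 digests to detect repeated instances, and the names `build_tertiary_copy` and `toy_cipher` are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict, Iterable


def toy_cipher(data: bytes) -> bytes:
    """Placeholder only: XOR with a fixed byte is NOT real encryption."""
    return bytes(b ^ 0x5A for b in data)


def build_tertiary_copy(
    secondary_copies: Iterable[Dict[str, bytes]],
    encrypt: Callable[[bytes], bytes],
) -> Dict[str, bytes]:
    """Return a digest -> encrypted-object map holding one instance per object."""
    single_instances: Dict[str, bytes] = {}
    for copy in secondary_copies:
        for data in copy.values():
            digest = hashlib.sha256(data).hexdigest()
            # Keep only the first instance seen; later duplicates are dropped.
            single_instances.setdefault(digest, data)
    # Encrypt the data only after the duplicates have been removed.
    return {digest: encrypt(data) for digest, data in single_instances.items()}
```

Encrypting after single instancing matters in this sketch because identical objects encrypted under different keys or nonces would no longer compare as equal.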
24. A method for building an index associated with a secondary copy of a data set, wherein the index relates to single instancing the data set, comprising:
receiving an indication that a previous index related to the single instanced data set is unrecoverable;
identifying one or more data files stored within a secondary copy of a data set created by a file system that were associated with the previous index;
extracting information related to the data file within a header of the data file, wherein the extracted information includes information that identifies the data file with respect to other data files stored within the secondary copy of the data set; and
adding the extracted information to the index associated with the secondary copy.
Dependent claims: 25, 26, 27, 28.
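The index rebuild recited in claim 24 might look like the following sketch. The file layout is an assumption made for the example: each stored data file is taken to begin with a one-line JSON header carrying an `object_id` field, followed by the payload. The names `make_data_file` and `rebuild_index` are hypothetical.

```python
import json
from typing import Dict, List


def make_data_file(object_id: str, payload: bytes) -> bytes:
    """Build a data file: one JSON header line, then the raw payload (assumed layout)."""
    header = json.dumps({"object_id": object_id, "size": len(payload)}).encode("utf-8")
    return header + b"\n" + payload


def rebuild_index(data_files: List[bytes]) -> Dict[str, int]:
    """Recreate an object_id -> file position index from the file headers alone."""
    index: Dict[str, int] = {}
    for position, blob in enumerate(data_files):
        # Split at the first newline only; the payload may itself contain newlines.
        header_line, _, _payload = blob.partition(b"\n")
        header = json.loads(header_line)
        # Add the identifying information extracted from the header to the new index.
        index[header["object_id"]] = position
    return index
```

Because each file carries its own identifying header, the index can be reconstructed by scanning the secondary copy without any help from the lost index.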
Specification