ARCHIVING DATA OBJECTS USING SECONDARY COPIES
First Claim
1. A system to create archive copies of one or more secondary copies of production data stored by a production computing system, wherein the production data contains multiple data files, the system comprising:
at least one processor; and
at least one data storage device storing a data structure, wherein the data structure tracks data identifying the data files for which the system has created secondary copies, and logical locations of the secondary copies of the data files;
wherein the secondary copies are stored on a different data storage device than the production data; and
wherein the data storage device stores instructions to be executed by the at least one processor, the instructions comprising steps to archive data files for the production data by applying rules to determine that certain data files are to be archived;
verifying that previously-created secondary copies of the certain data files to be archived exist in the different data storage device; and
replacing the certain data files in the production data with stubs, pointers or logical addresses;
wherein the first data structure provides an association between the stubs, pointers or logical addresses and the logical locations of the secondary copies of the certain data files, whereby the system archives the certain data files without creating an additional secondary copy of the certain data files.
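As an illustration only, the archiving steps recited in claim 1 might be sketched in Python as follows. Every name here (SecondaryCopyIndex, Stub, archive_files, resolve) and the use of in-memory dictionaries to stand in for the production data and the different data storage device are assumptions made for the example, not elements of the claim.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass(frozen=True)
class Stub:
    """Placeholder left in the production data (hypothetical form)."""
    file_id: str


@dataclass
class SecondaryCopyIndex:
    """Tracks which data files have secondary copies and where they live."""
    locations: Dict[str, str] = field(default_factory=dict)  # file id -> logical location


Rule = Callable[[str, bytes], bool]


def archive_files(
    production: Dict[str, Union[bytes, Stub]],
    index: SecondaryCopyIndex,
    secondary_store: Dict[str, bytes],
    rules: List[Rule],
) -> List[str]:
    """Replace eligible files with stubs; never writes to secondary_store."""
    archived = []
    for file_id, content in list(production.items()):
        if isinstance(content, Stub):
            continue  # already archived
        # Step 1: apply rules to decide whether this file should be archived.
        if not any(rule(file_id, content) for rule in rules):
            continue
        # Step 2: verify that a previously created secondary copy exists.
        location = index.locations.get(file_id)
        if location is None or location not in secondary_store:
            continue  # no verified copy, so the file stays in place
        # Step 3: replace the file with a stub; the index links stub to copy.
        production[file_id] = Stub(file_id)
        archived.append(file_id)
    return archived


def resolve(stub: Stub, index: SecondaryCopyIndex, secondary_store: Dict[str, bytes]) -> bytes:
    """Follow a stub through the index to the secondary copy."""
    return secondary_store[index.locations[stub.file_id]]
```

In this sketch the existing secondary copy serves as the archive copy: archive_files only reads the secondary store, so no additional copy is created, and a file without a verified copy is left untouched.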
Abstract
A system for archiving data objects using secondary copies is disclosed. The system creates one or more secondary copies of primary copy data that contains multiple data objects. The system maintains a first data structure that tracks the data objects for which the system has created secondary copies and the locations of the secondary copies. To archive data objects in the primary copy data, the system identifies data objects to be archived, verifies that previously-created secondary copies of the identified data objects exist, and replaces the identified data objects with stubs. The system maintains a second data structure that both tracks the stubs and refers to the first data structure, thereby creating an association between the stubs and the locations of the secondary copies.
19 Claims
1. A system to create archive copies of one or more secondary copies of production data stored by a production computing system, wherein the production data contains multiple data files, the system comprising:
at least one processor; and
at least one data storage device storing a data structure, wherein the data structure tracks data identifying the data files for which the system has created secondary copies, and logical locations of the secondary copies of the data files;
wherein the secondary copies are stored on a different data storage device than the production data; and
wherein the data storage device stores instructions to be executed by the at least one processor, the instructions comprising steps to archive data files for the production data by applying rules to determine that certain data files are to be archived;
verifying that previously-created secondary copies of the certain data files to be archived exist in the different data storage device; and
replacing the certain data files in the production data with stubs, pointers or logical addresses;
wherein the first data structure provides an association between the stubs, pointers or logical addresses and the logical locations of the secondary copies of the certain data files, whereby the system archives the certain data files without creating an additional secondary copy of the certain data files.
Dependent claims: 2, 3
4. A non-transitory computer-readable storage medium whose contents cause a data storage system to perform a method for archiving multiple data objects included in primary copy data, the method comprising:
obtaining, from a client computing device, both full and incremental backup copies of the client's primary copy data;
using the received backup copies to create a secondary copy of multiple data objects included in the primary copy data;
for each of the multiple data objects for which a secondary copy was created, adding an entry for the data object to a data structure, wherein the entry includes an identifier associated with the data object;
after creating the secondary copy, identifying one or more of the multiple data objects that satisfy one or more predetermined archival criteria; and
for each of the identified one or more data objects:
looking up the identified data object in the data structure using the identifier associated with the identified data object;
receiving a token for the identified data object; and
replacing the identified data object in the primary copy data with a stub referencing the secondary copy of the identified data object, wherein the stub includes the token.
Dependent claims: 5, 6, 7, 8
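The method of claim 4 could be sketched along the following lines. This is a minimal illustration under stated assumptions: the names Catalog, Stub, build_secondary_copy and archive_objects are hypothetical, tokens are random hex strings chosen for the example, and incremental backups are modeled simply as dictionaries of changed objects (deletions are ignored).

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Union


@dataclass(frozen=True)
class Stub:
    """Stub left in the primary copy data; it carries only the token."""
    token: str


def build_secondary_copy(
    full: Dict[str, bytes], incrementals: Iterable[Dict[str, bytes]]
) -> Dict[str, bytes]:
    """Overlay incremental backups on the full backup, oldest first."""
    secondary = dict(full)
    for incremental in incrementals:
        secondary.update(incremental)
    return secondary


class Catalog:
    """Data structure with one entry per object that has a secondary copy."""

    def __init__(self) -> None:
        self._token_by_id: Dict[str, str] = {}
        self._id_by_token: Dict[str, str] = {}

    def add_entry(self, object_id: str) -> None:
        token = secrets.token_hex(8)  # assumed token format
        self._token_by_id[object_id] = token
        self._id_by_token[token] = object_id

    def lookup(self, object_id: str) -> str:
        """Look up an object by identifier and return its token."""
        return self._token_by_id[object_id]

    def object_for(self, token: str) -> str:
        """Map a token from a stub back to the object identifier."""
        return self._id_by_token[token]


def archive_objects(
    primary: Dict[str, Union[bytes, Stub]],
    catalog: Catalog,
    criteria: Callable[[str, bytes], bool],
) -> List[str]:
    """Replace every object meeting the archival criteria with a token stub."""
    archived = []
    for object_id, data in list(primary.items()):
        if isinstance(data, Stub) or not criteria(object_id, data):
            continue
        token = catalog.lookup(object_id)
        primary[object_id] = Stub(token)
        archived.append(object_id)
    return archived
```

To restore an object in this sketch, a caller would read the token from the stub, call object_for to recover the identifier, and fetch that identifier from the secondary copy.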
9. A computer-implemented method for managing backup and archiving of information in an information management system, wherein the information management system includes first and second data storage systems, the computer-implemented method comprising:
scanning a first data storage system, wherein the first storage system includes primary copies of data objects, and wherein the scanning includes gathering metadata from the primary copies;
on a second, different data storage system, creating corresponding secondary copies of the primary copies of the data objects, creating a database of the gathered metadata; and
when the primary copies of the data objects meet one or more archiving conditions, then creating stubs in the primary copy for the data objects, wherein the stubs replace the data objects in the primary copies, and wherein the stubs reference a logical location of the secondary copy in the second data storage system.
Dependent claims: 10, 11, 12, 13, 14, 15
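A rough sketch of the scan, copy, and stub sequence of claim 9 is given below. It assumes an in-memory SQLite database for the gathered metadata, a single archiving condition based on idle time, and a made-up "secondary://" location scheme; the names DataObject, Stub, scan_and_copy and archive_stale are likewise hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class DataObject:
    data: bytes
    last_access_day: int  # hypothetical metadata: day number of last access


@dataclass(frozen=True)
class Stub:
    logical_location: str  # where the secondary copy lives


Primary = Dict[str, Union[DataObject, Stub]]


def scan_and_copy(primary: Primary, secondary: Dict[str, bytes], db: sqlite3.Connection) -> None:
    """Scan primary copies, create secondary copies, and record metadata."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS metadata ("
        "name TEXT PRIMARY KEY, size INTEGER, last_access_day INTEGER, location TEXT)"
    )
    for name, obj in primary.items():
        if isinstance(obj, Stub):
            continue  # already archived
        location = f"secondary://{name}"  # assumed addressing scheme
        secondary[location] = obj.data
        db.execute(
            "INSERT OR REPLACE INTO metadata VALUES (?, ?, ?, ?)",
            (name, len(obj.data), obj.last_access_day, location),
        )
    db.commit()


def archive_stale(primary: Primary, db: sqlite3.Connection, today: int, max_idle_days: int) -> List[str]:
    """Stub out objects whose metadata meets the idle-time condition."""
    rows = db.execute(
        "SELECT name, location FROM metadata "
        "WHERE ? - last_access_day > ? ORDER BY name",
        (today, max_idle_days),
    ).fetchall()
    stubbed = []
    for name, location in rows:
        if isinstance(primary.get(name), DataObject):
            primary[name] = Stub(location)
            stubbed.append(name)
    return stubbed
```

Because the archiving decision is made from the metadata database, this sketch does not need to rescan the first storage system when it evaluates the condition.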
16. A system for archiving data objects using secondary copies, the system comprising:
at least one processor;
at least one memory coupled to the processor;
a first software component stored on the memory that creates one or more secondary copies of primary copy data that contains multiple data objects;
a first data structure stored on the memory that tracks the data objects for which secondary copies have been created and locations of the secondary copies;
a second software component stored on the memory that identifies data objects to be archived, verifies that previously-created secondary copies of the identified data objects exist, and replaces the identified data objects with stubs; and
a second data structure stored on the memory that both tracks the stubs and refers to the first data structure, thereby creating an association between the stubs and the locations of the secondary copies.
Dependent claims: 17, 18, 19
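One possible way to picture the two data structures of claim 16 is shown below. The class names CopyTable and StubTable, the integer entry ids, and the example location strings are all assumptions for illustration; the claim does not specify how either structure is laid out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CopyEntry:
    object_name: str
    location: str


class CopyTable:
    """First data structure: objects with secondary copies and their locations."""

    def __init__(self) -> None:
        self._entries: Dict[int, CopyEntry] = {}
        self._id_by_name: Dict[str, int] = {}

    def record(self, object_name: str, location: str) -> int:
        """Add or update the entry for an object and return its entry id."""
        entry_id = self._id_by_name.get(object_name)
        if entry_id is None:
            entry_id = len(self._entries) + 1
            self._id_by_name[object_name] = entry_id
        self._entries[entry_id] = CopyEntry(object_name, location)
        return entry_id

    def find(self, object_name: str) -> Optional[int]:
        return self._id_by_name.get(object_name)

    def location(self, entry_id: int) -> str:
        return self._entries[entry_id].location


class StubTable:
    """Second data structure: tracks stubs and refers to CopyTable entries."""

    def __init__(self, copies: CopyTable) -> None:
        self._copies = copies
        self._entry_for_stub: Dict[str, int] = {}

    def add_stub(self, stub_id: str, object_name: str) -> None:
        """Register a stub only if a secondary copy is already recorded."""
        entry_id = self._copies.find(object_name)
        if entry_id is None:
            raise LookupError(f"no secondary copy recorded for {object_name!r}")
        self._entry_for_stub[stub_id] = entry_id

    def resolve(self, stub_id: str) -> str:
        """Follow the stub to the first structure to get the copy's location."""
        return self._copies.location(self._entry_for_stub[stub_id])
```

In this sketch the stub table stores a reference to an entry in the copy table, not the location itself, so relocating a secondary copy only requires updating the first structure and existing stubs continue to resolve.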
Specification