Stubbing systems and methods in a data replication environment

US 8,504,515 B2
Filed: 03/30/2010
Issued: 08/06/2013
Est. Priority Date: 03/30/2010
Status: Active Grant

Abstract

Stubbing systems and methods are provided for intelligent data management in a replication environment, such as by reducing the space occupied by replication data on a destination system. In certain examples, stub files or like objects replace migrated, de-duplicated or otherwise copied data that has been moved from the destination system to secondary storage. Access is further provided to the replication data in a manner that is transparent to the user and/or without substantially impacting the base replication process. In order to distinguish stub files representing migrated replication data from replicated stub files, priority tags or like identifiers can be used. Thus, when accessing a stub file on the destination system, such as to modify replication data or perform a restore process, the tagged stub files can be used to recall archived data prior to performing the requested operation so that an accurate copy of the source data is generated.
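The tagging idea in the abstract can be illustrated with a minimal sketch. It assumes a stub is a small record holding a location and an optional tag, and that secondary storage is a plain dictionary. The names `Stub`, `MIGRATION_TAG`, and `read_for_restore` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class Stub:
    location: str               # where the stubbed data actually lives
    tag: Optional[str] = None   # set only on stubs created on the destination


# Hypothetical tag value marking stubs created by destination-side archiving.
MIGRATION_TAG = "DEST-ARCHIVE-01"


def read_for_restore(entry: Union[bytes, Stub],
                     secondary: Dict[str, bytes]) -> Union[bytes, Stub]:
    """Return what a restore should see for one destination entry."""
    if isinstance(entry, Stub) and entry.tag == MIGRATION_TAG:
        # Tagged stub: recall the archived data from secondary storage.
        return secondary[entry.location]
    # Plain data, or a stub replicated from the source, passes through as-is.
    return entry


secondary = {"sec://obj/1": b"payload"}
replicated_stub = Stub(location="src-archive://a")
tagged_stub = Stub(location="sec://obj/1", tag=MIGRATION_TAG)
```

In this sketch only the tagged stub triggers a recall; the untagged stub is treated as replicated data and restored unchanged, which mirrors the distinction the abstract draws between the two kinds of stub.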

15 Claims

1. A method for performing data management operations in a computer network, the method comprising:
- monitoring operations associated with a source computing device, the operations operative to write data to a source storage device;
  
  copying the data to a destination storage device based at least in part on the operations, the data comprising at least one first stub file, said copying to the destination storage device comprising processing at least one log file having a plurality of log entries indicative of the operations to replay the operations on the destination storage device;
  
  with one or more computer processors, scanning the data of the destination storage device to identify a common data object repeated between multiple portions of the data on the destination storage device;
  
  creating a copy of the common data object on a secondary storage device;
  
  determining a last access time of each of the multiple data portions of the destination storage device having the common data object;
  
  for each of the multiple data portions having a last access time at or before the time of the creation of the copy of the common data object, replacing the common data object of the particular data portion with a second stub file, wherein the second stub file comprises a tag value not possessed by, and used to distinguish the second stub file from, any of the at least one first stub file, and wherein the second stub file comprises information indicative of a location of the copy of the common data object on the secondary storage device; and
  
  wherein the second stub file is used to recall the copy of the common data object from the secondary storage device during a restore operation.
- - 2. The method of claim 1, wherein said creating a copy of the common data object comprises copying only a single copy of the common data object on the second storage device.
  - 3. The method of claim 1, wherein the at least one first stub file represents data archived from the source storage device.
  - 4. The method of claim 1, wherein determining a last access time comprises determining a last time each of the multiple data portions having the common data object was modified.
  - 5. The method of claim 1, wherein the common data object comprises a 64 kilobyte data block.
  - 6. The method of claim 1, wherein the tag value comprises a unique alphanumeric value.
  - 7. The method of claim 1, wherein each of the multiple portions of the data comprises a different data file.
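
The log-replay step recited in claim 1 can be sketched as follows. This is only an illustration under stated assumptions: a log entry is modeled as a hypothetical `(operation, path, payload)` tuple, and the destination storage device is modeled as a dictionary. Neither the entry layout nor the function name `replay` comes from the patent.

```python
from typing import Dict, List, Tuple

# Hypothetical log entry layout: (operation, path, payload).
LogEntry = Tuple[str, str, bytes]


def replay(log: List[LogEntry], destination: Dict[str, bytes]) -> None:
    """Apply logged source operations, in order, to the destination."""
    for op, path, payload in log:
        if op == "write":
            destination[path] = payload
        elif op == "delete":
            destination.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")


log = [
    ("write", "/a.txt", b"v1"),
    ("write", "/b.txt", b"x"),
    ("write", "/a.txt", b"v2"),
    ("delete", "/b.txt", b""),
]
destination: Dict[str, bytes] = {}
replay(log, destination)
```

Replaying the entries in their logged order is what lets the destination converge on the same state as the source after each batch of entries.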

8. A system for performing data management operations, comprising:
- a destination storage device storing data that comprises at least one stub file, the data copied to the destination storage device from a source storage device and corresponding to data write operations associated with a source computing device, wherein the data is copied to the destination storage device at least partly by processing at least one log file having a plurality of log entries indicative of the operations to replay the operations on the destination storage device;
  
  an archiving module executing in one or more computer processors and configured to:
  
  scan data of the destination storage device to identify a common data object repeated between multiple portions of the data on the destination storage device;
  
  create a copy of the common data object on a secondary storage device;
  
  determine a last access time of each of the multiple data portions of the destination storage device having the common data object;
  
  for ones of the multiple data portions having a last access time at or before the time of the creation of the copy of the common data object, replace the common data object of the particular data portion with a second stub file, wherein the second stub file comprises a tag value not possessed by, and used to distinguish the second stub file from, any of the at least one first stub file, and wherein the second stub file comprises information indicative of a location of the copy of the common data object on the secondary storage device, wherein the second stub file is used to recall the copy of the common data object from the secondary storage device during a restore operation.
- - 9. The system of claim 8, wherein said creating a copy comprises creating a copy of only a single copy of the common data object on the second storage device.
  - 10. The system of claim 8, wherein the at least one first stub file represents data archived from the source storage device.
  - 11. The system of claim 8, wherein the archiving module determines the last access time at least partly by determining a last time each of the multiple data portions having the common data object was modified.
  - 12. The system of claim 8, wherein the common data object comprises a 64 kilobyte data block.
  - 13. The system of claim 8, wherein the tag value comprises a unique alphanumeric value.
  - 14. The system of claim 8, wherein each of the multiple portions of the data comprises a different data file.

15. A non-transitory computer readable medium configured to store software code that is readable by a computing system, wherein the software code is executable on the computing system in order to cause the computing system to perform steps comprising:
- monitoring operations associated with a source computing device, the operations operative to write data to a source storage device;
  
  copying the data to a destination storage device based at least in part on the operations, the data comprising at least one first stub file, said copying to the destination storage device comprising processing at least one log file having a plurality of log entries indicative of the operations to replay the operations on the destination storage device;
  
  with one or more computer processors, scanning the data of the destination storage device to identify a common data object repeated between multiple portions of the data on the destination storage device;
  
  creating a copy of the common data object on a secondary storage device;
  
  determining a last access time of each of the multiple data portions of the destination storage device having the common data object;
  
  for each of the multiple data portions having a last access time at or before the time of creation of the copy of the common data object, replacing the common data object of the particular data portion with a second stub file, wherein the second stub file comprises a tag value not possessed by, and used to distinguish the second stub file from, any of the at least one first stub file, and wherein the second stub file comprises information indicative of a location of the copy of the common data object on the secondary storage device; and
  
  wherein the second stub file is used to recall the copy of the common data object from the secondary storage device during a restore operation.
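
The scanning and stubbing steps recited in the independent claims can be sketched roughly as below. The sketch makes several simplifying assumptions that are not in the patent: each data portion holds exactly one data object, common objects are detected by comparing SHA-256 digests, secondary storage is a dictionary, and the copy time is passed in as `now`. The names `Portion`, `Stub`, `TAG`, and `dedupe_and_stub` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Stub:
    location: str   # location of the copy on secondary storage
    tag: str        # distinguishes this stub from replicated stubs


@dataclass
class Portion:
    content: Union[bytes, Stub]
    last_access: float   # timestamp of the last access


TAG = "DEST-DEDUP-01"   # hypothetical tag value


def dedupe_and_stub(portions: Dict[str, Portion],
                    secondary: Dict[str, bytes],
                    now: float) -> List[str]:
    """Stub out objects repeated across portions; return the stubbed names."""
    # Scan: group portions by the digest of their content.
    groups: Dict[str, List[str]] = {}
    for name, portion in portions.items():
        if isinstance(portion.content, bytes):
            digest = hashlib.sha256(portion.content).hexdigest()
            groups.setdefault(digest, []).append(name)

    stubbed: List[str] = []
    for digest, names in groups.items():
        if len(names) < 2:
            continue  # not repeated, so not a common data object
        # Create a single copy of the common object on secondary storage.
        location = f"secondary://{digest}"
        secondary[location] = portions[names[0]].content
        copy_time = now
        for name in names:
            # Replace only portions last accessed at or before the copy time.
            if portions[name].last_access <= copy_time:
                portions[name].content = Stub(location, TAG)
                stubbed.append(name)
    return stubbed


portions = {
    "a": Portion(b"same", last_access=10.0),
    "b": Portion(b"same", last_access=50.0),
    "c": Portion(b"other", last_access=5.0),
}
secondary: Dict[str, bytes] = {}
stubbed = dedupe_and_stub(portions, secondary, now=20.0)
```

In this example portion `b` keeps its data because it was accessed after the copy was made, which is the reason the claims compare the last access time against the copy's creation time before replacing anything.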

Current Assignee: CommVault Systems Incorporated
Original Assignee: CommVault Systems Incorporated
Inventors: Prahlad, Anand; Agrawal, Vijay H.
Primary Examiner(s): Ruiz, Angelica
Application Number: US12/750,067
Publication Number: US 20110246416A1
Time in Patent Office: 1,225 Days
Field of Search: None
US Class Current: 707/609
CPC Class Codes:
G06F 16/1734   Details of monitoring file ...
G06F 16/1748   De-duplication implemented ...
G06F 16/182   Distributed file systems
G06F 16/22   Indexing; Data structures t...
G06F 16/81   Indexing, e.g. XML tags; Da...
G06F 16/951   Indexing; Web crawling tech...
G06F 2201/80   Database-specific techniques
