DE-DUPLICATION SYSTEMS AND METHODS FOR APPLICATION-SPECIFIC DATA

US 20160306708A1
Filed: 06/30/2016
Published: 10/20/2016
Est. Priority Date: 06/24/2008
Status: Abandoned Application

Abstract

Content-aware systems and methods for improving de-duplication, or single instancing, in storage operations. In certain examples, backup agents on client devices parse application-specific data to identify data objects that are candidates for de-duplication. The backup agents can then insert markers or other indicators in the data that identify the location(s) of the particular data objects. Such markers can, in turn, assist a de-duplication manager to perform object-based de-duplication and increase the likelihood that like blocks within the data are identified and single instanced. In other examples, the agents can further determine if a data object of one file type can or should be single-instanced with a data object of a different file type. Such processing of data on the client side can provide for more efficient storage and back-end processing.
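
The marker idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Marker`, `build_stream`, `object_digests`, and `fixed_block_digests` are hypothetical, SHA-256 is an assumed fingerprint, and the markers are kept in a side list rather than embedded in the data stream, which is a simplification.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    """Hypothetical indicator: where one data object sits in the stream."""
    offset: int  # byte position at which the data object starts
    length: int  # size of the data object in bytes


def build_stream(header: bytes, objects: list[bytes]) -> tuple[bytes, list[Marker]]:
    """Stand-in for a client-side agent: serialize the objects after a
    header and record a marker for each object's location."""
    stream = bytearray(header)
    markers = []
    for obj in objects:
        markers.append(Marker(offset=len(stream), length=len(obj)))
        stream += obj
    return bytes(stream), markers


def object_digests(stream: bytes, markers: list[Marker]) -> list[str]:
    """Object-based fingerprinting: hash exactly the marked regions."""
    return [
        hashlib.sha256(stream[m.offset:m.offset + m.length]).hexdigest()
        for m in markers
    ]


def fixed_block_digests(stream: bytes, block_size: int) -> list[str]:
    """Content-unaware baseline: hash fixed-size blocks from offset 0."""
    return [
        hashlib.sha256(stream[i:i + block_size]).hexdigest()
        for i in range(0, len(stream), block_size)
    ]
```

If the same attachment is serialized behind two headers of different lengths, `object_digests` yields the same fingerprint for both streams, while `fixed_block_digests` generally does not, because the header shifts every block boundary. That alignment effect is one plausible reading of why markers "increase the likelihood that like blocks within the data are identified".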

20 Claims

1. A system for creating a backup copy of data, the system comprising:
- computer readable memory comprising at least a first de-duplication database associated with data generated by at least first and second clients;
  
  a de-duplication module executing on one or more computer processors comprising computer hardware, the de-duplication module receives the data and performs de-duplication as part of a backup of the data, the de-duplication module further configured to:
  
  determine if a duplicate copy of a first portion of the data from the first client exists in the first de-duplication database; and
  
  if a duplicate copy does not exist in the first de-duplication database, storing first metadata that identifies the first client in association with the duplicate copy;
  
  determine if a duplicate copy of a second portion of the data from the second client exists in the first de-duplication database;
  
  if a duplicate copy of the second portion of the data exists in the first de-duplication database, removing the duplicate data in the second portion of the data;
  
  determining whether second metadata in the second portion of the data identifies whether the second client is unique; and
  
  if the second metadata is unique, storing the second metadata in association with the duplicate copy in the first de-duplication database, store the first and second metadata associated with the duplicate copy wherein the first metadata that identifies the first client and the second metadata that identifies the second client are stored in association with the duplicate copy.
  - 2. The system of claim 1 wherein the first and second metadata identify differing operating systems.
  - 3. The system of claim 1 wherein the first and second metadata identify differing permissions.
  - 4. The system of claim 1 wherein the data comprises first application-specific data associated with a first application and second application-specific data associated with a second application.
  - 5. The system of claim 4, further comprising a third module executing on one or more computer processors configured to:
    - parse the second application-specific data that is different in format than the first application-specific data, the second application-specific data comprising a second plurality of data objects;
      
      identify portions within the second plurality of data objects to be considered for de-duplication; and
      
      insert at least one de-duplication indicator in the second application-specific data that identifies at least one location of the identified portions in the second plurality of data objects to be considered for de-duplication.
  - 6. The system of claim 4 wherein the de-duplication module is further configured to determine whether a duplicate copy of the second portion of the data exists in a second de-duplication database.
  - 7. The system of claim 4 wherein the inserted at least one de-duplication indicator in the second application-specific data further identifies that the second de-duplication database is to be used in de-duplicating the application-specific data.
  - 8. The system of claim 4 wherein:
    - the first de-duplication database is configured to store unique blocks of the first portion of the data associated with the first application-specific data; and
      
      the second de-duplication database is configured to store unique blocks of the second portion of the data associated with the second application-specific data, wherein the first de-duplication database is separate and different from the second de-duplication database.
  - 9. The system of claim 4 wherein the inserted at least one de-duplication indicator indicates at least one of the first de-duplication database and the second de-duplication database.
  - 10. The system of claim 4 wherein the first application-specific data is associated with an electronic mail server application.
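
The core behavior recited in claim 1 (one stored copy, with metadata from each client kept in association with it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed system: `DedupDatabase`, `Entry`, `backup`, and `stored_bytes` are hypothetical names, SHA-256 is an assumed way of detecting duplicates, and metadata is modeled as a plain dictionary.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One single-instanced copy plus the metadata of every client that sent it."""
    data: bytes
    metadata: list[dict] = field(default_factory=list)


class DedupDatabase:
    """Hypothetical in-memory stand-in for the 'first de-duplication database'."""

    def __init__(self) -> None:
        self._entries: dict[str, Entry] = {}

    def backup(self, portion: bytes, metadata: dict) -> str:
        """Store a portion once; attach metadata only if it is not already recorded.

        Returns the digest, which acts as the reference to the stored copy.
        """
        digest = hashlib.sha256(portion).hexdigest()
        entry = self._entries.get(digest)
        if entry is None:
            # No duplicate exists yet: keep the data itself.
            entry = Entry(data=portion)
            self._entries[digest] = entry
        # A duplicate's payload is dropped; only unseen metadata is added.
        if metadata not in entry.metadata:
            entry.metadata.append(metadata)
        return digest

    def metadata_for(self, digest: str) -> list[dict]:
        return list(self._entries[digest].metadata)

    def stored_bytes(self) -> int:
        return sum(len(e.data) for e in self._entries.values())
```

Backing up the same bytes from two clients leaves `stored_bytes()` unchanged after the second call, while `metadata_for()` returns both clients' metadata. That metadata could carry, for example, the differing operating systems or permissions mentioned in claims 2 and 3.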

11. A method for creating a backup copy of data, the method comprising:
- storing a first de-duplication database associated with data generated by at least first and second clients;
  
  performing de-duplication of the data as part of a backup of the data;
  
  determining if a duplicate copy of a first portion of the data from the first client exists in the first de-duplication database;
  
  if a duplicate copy of the first portion of the data does not exist in the first de-duplication database, storing first metadata that identifies the first client in association with the duplicate copy;
  
  determining if a duplicate copy of a second portion of the data from the second client exists in the first de-duplication database;
  
  if a duplicate copy associated with the second portion of the data exists in the first de-duplication database, removing the duplicate data in the second portion of the data;
  
  determining whether second metadata in the second portion of the data identifies whether the second client is unique; and
  
  if the second metadata is unique, storing the second metadata in association with the duplicate copy in the first de-duplication database, wherein the backup copy stores the first and second metadata associated with the duplicate copy wherein the first metadata that identifies the first client and the second metadata that identifies the second client are stored in association with the duplicate copy.
  - 12. The method of claim 11 wherein the first and second metadata identify differing operating systems.
  - 13. The method of claim 11 wherein the first and second metadata identify differing permissions.
  - 14. The method of claim 11 wherein the data comprises first application-specific data associated with a first application and second application-specific data associated with a second application.
  - 15. The method of claim 14 further comprising:
    - parsing the second application-specific data that is in a different format than the first application-specific data, the second application-specific data comprising a second plurality of second data objects;
      
      identifying portions within the second plurality of second data objects to be considered for de-duplication; and
      
      inserting at least one de-duplication indicator in the second application-specific data that identifies at least one location of the identified portions in the second plurality of data objects to be considered for de-duplication.
  - 16. The method of claim 14 further comprising determining whether the duplicate copy of the second portion of the data exists in a second de-duplication database.
  - 17. The method of claim 14 wherein the inserted de-duplication indicators in the second application-specific data further identify that the second de-duplication database is to be used in de-duplicating the application-specific data.
  - 18. The method of claim 14 wherein:
    - the first de-duplication database is configured to store unique blocks of the first portion of the data associated with the first application-specific data; and
      
      the second de-duplication database is configured to store unique blocks of the second portion of the data associated with the second application-specific data, wherein the first de-duplication database is separate and different from the second de-duplication database.
  - 19. The method of claim 14 wherein the inserted at least one de-duplication indicators indicate at least one of the first de-duplication database and the second de-duplication database.
  - 20. The method of claim 14 wherein the first application-specific data is associated with an electronic mail server application.
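
Claims 16 through 19 describe separate de-duplication databases for different application-specific data, selected by an indicator. A rough sketch of that routing, assuming the indicator is simply a string key and each database is an in-memory block store (`BlockStore` and `deduplicate` are hypothetical names, and SHA-256 is an assumed fingerprint):

```python
import hashlib
from typing import Iterable


class BlockStore:
    """Hypothetical de-duplication database holding unique blocks only."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, block: bytes) -> bool:
        """Store the block if unseen; return True when it was a duplicate."""
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.blocks:
            return True
        self.blocks[digest] = block
        return False


def deduplicate(
    tagged_blocks: Iterable[tuple[str, bytes]],
    stores: dict[str, BlockStore],
) -> int:
    """Route each block to the store named by its indicator.

    Returns the number of blocks that were dropped as duplicates.
    """
    duplicates = 0
    for indicator, block in tagged_blocks:
        if stores[indicator].put(block):
            duplicates += 1
    return duplicates
```

Because the stores are separate, identical bytes tagged for two different applications are kept once in each store; only a repeat within the same store counts as a duplicate. Whether cross-application duplicates should instead be merged is a design choice this sketch does not address.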

Current Assignee: CommVault Systems Incorporated
Original Assignee: CommVault Systems Incorporated
Inventors: PRAHLAD, Anand; VIJAYAN, Manoj Kumar; KOTTOMTHARAYIL, Rajiv; GOKHALE, Parag
Application Number: US15/198,269
Publication Number: US 20160306708A1
Field of Search
US Class Current: 1/1
CPC Class Codes:
G06F 11/1448   Management of the data invo...
G06F 11/1453   using de-duplication of the...
G06F 11/1464   for networked environments
G06F 11/1469   Backup restoration techniques
G06F 16/162   Delete operations erasing i...
G06F 16/1748   De-duplication implemented ...
G06F 16/9574   of access to content, e.g. ...
G06F 21/604   Tools and structures for ma...
