DATA OBJECT STORE AND SERVER FOR A CLOUD STORAGE ENVIRONMENT, INCLUDING DATA DEDUPLICATION AND DATA MANAGEMENT ACROSS MULTIPLE CLOUD STORAGE SITES

US 20130024424A1
Filed: 09/14/2012
Published: 01/24/2013
Est. Priority Date: 06/30/2009
Status: Active Grant



Abstract

Data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, are performed within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.


17 Claims

1. A method for scheduling storage operations on a cloud storage site, comprising:
- determining a current capacity of the cloud storage site by accessing information relating to at least one of: a capacity policy, a scheduled job, a quoted job, one or more queued requests, and a quotation policy that includes a set of preferences and criteria associated with generating a quote in response to auction client requests;
  
  receiving multiple new requests for cloud storage from one or more auction clients;
  
  identifying one or more winning requests that will receive responsive quotes by evaluating pending requests by applying preferences and criteria specified in the accessed quotation policy, wherein pending requests comprise the received new requests and the one or more queued requests;
  
  generating one or more responsive quotes for winning requests by applying preferences and criteria specified in the accessed quotation policy, wherein the responsive quotes include one or more pricing values;
  
  sending the one or more responsive quotes to one or more auction clients; and
  
  receiving from one or more auction clients an indication of acceptance of one or more responsive quotes.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method of claim 1, wherein the capacity policy specifies system resources available for auction during specified periods, scheduled maintenance windows, and current storage capacity available on servers.
  - 3. The method of claim 1, further comprising determining system resources required for storage operations already scheduled or quoted.
  - 4. The method of claim 1, wherein the quotation policy specifies at least three of:
    - a revenue function;
      
      a pricing function;
      
      a pricing rate table;
      
      information associated with marketing promotions;
      
      a list of preferred auction clients;
      
      a list of disfavored auction clients;
      
      classes of storage;
      
      retention policies;
      
      upload time periods;
      
      data characteristics;
      
      compression or encryption requirements; and
      
      estimated or historic cost of storage, including a cost of power.
  - 5. The method of claim 1, wherein the quotation policy specifies a revenue function that describes a method for numerically evaluating a projected revenue generated by the received requests.
  - 6. The method of claim 1, wherein the quotation policy specifies a pricing function that describes a method for generating various pricing values for a responsive quote.
  - 7. The method of claim 1, wherein identifying one or more winning requests further comprises identifying received requests that either do not satisfy minimum requirements specified by the quotation policy or cannot be accommodated due to a lack of system resources.
  - 8. The method of claim 1, further comprising at least one of the following:
    - sending a responsive quote having at least one term that is different from a term in a received request;
      
      sending an explicit rejection of a received request; and
      
      queuing a received request for later evaluation.
  - 9. The method of claim 1, wherein identifying one or more winning requests further comprises identifying a set of requests that results in a maximum combined value of a revenue function.
  - 10. The method of claim 1, wherein identifying one or more winning requests further comprises identifying a set of requests that results in a combined value of a revenue function that is sufficient to satisfy the quotation policy.
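
The selection and quoting steps of claims 1, 5, 6, 7 and 9 can be read as a small capacity-constrained selection problem. The Python sketch below is illustrative only. The `Request` and `QuotationPolicy` fields, the discount-based pricing function, the sum-of-prices revenue function, and the brute-force search are all assumptions made for this example and are not taken from the specification.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Request:
    client: str
    gigabytes: int       # storage requested by the auction client
    bid_per_gb: float    # price the client offers per gigabyte


@dataclass(frozen=True)
class QuotationPolicy:
    # All fields are hypothetical stand-ins for the policy items in claim 4.
    min_price_per_gb: float        # minimum requirement (claim 7)
    preferred_clients: frozenset   # list of preferred auction clients
    preferred_discount: float      # fraction taken off for preferred clients


def price(req, policy):
    """Assumed pricing function (claim 6): the bid, discounted for preferred clients."""
    rate = req.bid_per_gb
    if req.client in policy.preferred_clients:
        rate *= 1.0 - policy.preferred_discount
    return round(rate * req.gigabytes, 2)


def revenue(requests, policy):
    """Assumed revenue function (claim 5): the sum of the quoted prices."""
    return sum(price(r, policy) for r in requests)


def pick_winners(pending, capacity_gb, policy):
    """Return the subset of pending requests that fits the current capacity
    and maximizes the revenue function (claim 9). Brute force, so only
    suitable for a handful of requests."""
    eligible = [r for r in pending if r.bid_per_gb >= policy.min_price_per_gb]
    best, best_value = (), 0.0
    for k in range(1, len(eligible) + 1):
        for subset in combinations(eligible, k):
            if sum(r.gigabytes for r in subset) > capacity_gb:
                continue  # cannot be accommodated with the available resources
            value = revenue(subset, policy)
            if value > best_value:
                best, best_value = subset, value
    return list(best)


def quotes_for(winners, policy):
    """Build one responsive quote (a single pricing value) per winning request."""
    return {r.client: price(r, policy) for r in winners}
```

Pending requests are the queued requests plus the newly received ones, so a caller would pass `queued + new` to `pick_winners` and then send the result of `quotes_for` back to the auction clients. A production scheduler would replace the exhaustive search with a knapsack-style or greedy method.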

11. A method for storing a secondary copy of an original data set on a cloud storage site using a cloud gateway, wherein the cloud gateway is coupled between one or more client computers and one or more cloud storage sites via a network, the method comprising:
- identifying data blocks within a cache of the cloud gateway that satisfy certain criteria, wherein the original data set comprises data blocks, wherein the certain criteria are from a storage policy, and wherein the certain criteria include time-based criteria;
  
  performing block-level deduplication of the identified data blocks to create a deduplicated set of data, wherein the block-level deduplication includes:
  
  determining a size for a container file to utilize when deduplicating the identified data blocks; and
  
  deduplicating at least some of the identified data blocks to create one or more container files containing deduplicated data, wherein at least one of the container files has the determined size; and
  
  storing the deduplicated set of data on the cloud storage site by:
  
  buffering data for later transmission to the cloud storage site;
  
  repeating the following steps while the data buffer is not full:
  
  receiving a file system request to write a group of data to the cloud storage site; and
  
  adding the group of data to the buffer;
  
  converting a file system request to one or more application program interface calls associated with the cloud storage site; and
  
  transmitting contents of the buffer to the cloud storage site using the one or more application program interface calls associated with the cloud storage site.
- Dependent claims (12, 13):
  - 12. The method of claim 11, further comprising identifying the cloud storage site on which to store the secondary copy of the original data set by:
    - identifying two or more candidate cloud storage sites;
      
      accessing a storage policy having a set of preferences and storage criteria, wherein the set of preferences and storage criteria includes at least two of the following:
      
      one or more preferred cloud storage sites;
      
      one or more preferred classes or quality of cloud storage sites;
      
      requirements regarding deduplication of the original data set;
      
      requirements regarding encryption of the original data set;
      
      requirements regarding compression of the original data set;
      
      quality of a network connection available to the cloud storage site;
      
      one or more data retention periods;
      
      data characteristics of at least some data in the original data set;
      
      estimated or historic usage associated with operating one or more system components;
      
      frequency with which the original data set was accessed or modified during a particular time period;
      
      a specified level of fault tolerance; and
      
      one or more geographical locations or political states in which data storage devices for a cloud storage site exist; and
      
      selecting at least one of the two or more of the candidate cloud storage sites based at least in part on the set of preferences and storage criteria in the storage policy.
  - 13. The method of claim 11 wherein the contents of the buffer are transmitted to the cloud storage site using at least one of hypertext transfer protocol (HTTP) and HTTP over Transport Layer Security/Secure Sockets Layer.
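
The containerized block-level deduplication recited in claim 11 can be pictured with a short Python sketch. It is a simplified illustration under stated assumptions, not the patented implementation: fixed-size blocks, SHA-256 digests as block identifiers, an in-memory index, and the names `split_blocks`, `deduplicate` and `restore` are all hypothetical. The time-based selection of blocks from the gateway cache is omitted.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Cut the data set into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def deduplicate(blocks, container_size: int):
    """Store each unique block once, packed into containers of at most
    container_size bytes. Returns (containers, index, recipe)."""
    if any(len(b) > container_size for b in blocks):
        raise ValueError("container_size must be at least the block size")
    containers = [bytearray()]
    index = {}    # digest -> (container number, offset, length)
    recipe = []   # digests in original order, used to rebuild the data set
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        recipe.append(digest)
        if digest in index:
            continue  # duplicate block: keep only the reference
        if len(containers[-1]) + len(block) > container_size:
            containers.append(bytearray())  # current container is full
        index[digest] = (len(containers) - 1, len(containers[-1]), len(block))
        containers[-1] += block
    return [bytes(c) for c in containers], index, recipe


def restore(containers, index, recipe):
    """Rebuild the original data set from the containers and the recipe."""
    out = bytearray()
    for digest in recipe:
        number, offset, length = index[digest]
        out += containers[number][offset:offset + length]
    return bytes(out)
```

Packing many small unique blocks into a few container files of a chosen size means the storage site sees a handful of large objects rather than one object per block, which is consistent with the abstract's point about reducing strain on the system namespace.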

14. A system for creating a secondary copy of an original data set using a cloud storage site, wherein the original data set is received from one or more client computers, the system comprising:
- means for identifying sub-objects of the original data set that satisfy certain criteria, wherein the certain criteria are related to a storage policy;
  
  means for performing deduplication of the identified data sub-objects to create a deduplicated set of data; and
  
  means for storing the deduplicated set of data on the cloud storage site, wherein the means for storing includes:
  
  means for buffering data for later transmission to the cloud storage site;
  
  means for converting file system requests into application program interface calls associated with the cloud storage site; and
  
  means for transmitting the buffered data to the cloud storage site using the one or more application program interface calls associated with the cloud storage site.
- Dependent claims (15, 16):
  - 15. The system of claim 14, further comprising:
    - means for determining a size for a container file and for deduplicating at least some of the data sub-objects to create one or more container files containing deduplicated data, wherein at least one of the container files has the determined size.
  - 16. The system of claim 14 wherein the means for buffering further comprises:
    - means for receiving a file system request to write a group of data to the cloud storage site; and
      
      means for adding the group of data to the buffer.
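
The buffering, conversion and transmission elements of claims 14 and 16 (and the corresponding steps of claim 11) might look roughly like the following Python sketch. Everything concrete here is an assumption for illustration: the `BufferingGateway` class, the dictionary-shaped API call with `method`, `path` and `body` keys, the `backup` bucket name, and the `send` callback that stands in for the actual network transfer.

```python
from typing import Callable, Dict, List, Tuple

ApiCall = Dict[str, object]  # hypothetical call format: method, path, body


class BufferingGateway:
    """Collects file system write requests and ships them in batches."""

    def __init__(self, send: Callable[[List[ApiCall]], None],
                 capacity_bytes: int, bucket: str = "backup"):
        self._send = send                # stand-in for the real transport
        self._capacity = capacity_bytes  # buffer size that triggers a flush
        self._bucket = bucket
        self._buffer: List[Tuple[str, bytes]] = []
        self._buffered = 0

    def write(self, path: str, data: bytes) -> None:
        """Receive a file system write request and add its data to the buffer."""
        self._buffer.append((path, data))
        self._buffered += len(data)
        if self._buffered >= self._capacity:
            self.flush()  # buffer is full: transmit its contents

    def _to_api_call(self, path: str, data: bytes) -> ApiCall:
        """Convert one file system request into an assumed object-store PUT."""
        return {"method": "PUT",
                "path": f"/{self._bucket}/{path.lstrip('/')}",
                "body": data}

    def flush(self) -> None:
        """Transmit the buffered data using the converted API calls."""
        if not self._buffer:
            return
        self._send([self._to_api_call(p, d) for p, d in self._buffer])
        self._buffer.clear()
        self._buffered = 0
```

Batching writes this way trades a little latency on individual requests for fewer round trips over a wide area network, which matters when the link has appreciable latency or packet loss.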

17. A non-transitory computer-readable medium storing instructions that when executed by a processor perform a method for utilizing cloud storage resources to store at least a first portion of at least one data object within a network attached storage (NAS) device, wherein the NAS device includes a NAS file system and a non-volatile data store, and wherein the NAS device is communicatively coupled to access the cloud storage resources, the method comprising:
- accessing calls to or from the NAS file system for reading of data from or writing of data to the non-volatile data store of the NAS device, wherein the at least one data object consists of multiple data blocks, wherein the non-volatile data store of the NAS device stores the multiple data blocks of the at least one data object;
  
  wherein the NAS file system of the NAS device controls the reading of data from or the writing of data to the multiple data blocks of the at least one data object, and wherein the accessing includes identifying individual blocks or groups of blocks within the multiple data blocks of the at least one data object that the NAS file system of the NAS device reads data from or writes data to;
  
  based on the accessing, identifying a portion of the multiple data blocks of the at least one data object that satisfies a data storage criteria; and
  
  automatically transferring the identified portion of the multiple data blocks for storage by the cloud storage resources.
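
A rough Python sketch of the block-level tracking and transfer described in claim 17 follows. It assumes, purely for illustration, that the data storage criteria is an idle-time threshold and that both the NAS data store and the cloud storage resources can be modeled as dictionaries keyed by (object name, block number). The class and method names are hypothetical.

```python
class BlockTieringMonitor:
    """Tracks which blocks the NAS file system touches and moves idle ones out."""

    def __init__(self, local_store: dict, cloud_store: dict,
                 max_idle_seconds: float):
        self.local = local_store      # (object name, block number) -> bytes
        self.cloud = cloud_store      # same keys, held by the cloud resources
        self.max_idle = max_idle_seconds
        self.last_access = {}         # (object name, block number) -> timestamp

    def on_io(self, obj: str, block_numbers, now: float) -> None:
        """Record an intercepted read or write call against specific blocks."""
        for number in block_numbers:
            self.last_access[(obj, number)] = now

    def cold_blocks(self, now: float):
        """Blocks still held locally that satisfy the assumed idle-time criteria.
        Blocks with no recorded access are treated as idle."""
        return sorted(
            key for key in self.local
            if now - self.last_access.get(key, float("-inf")) > self.max_idle
        )

    def transfer_cold_blocks(self, now: float):
        """Move every qualifying block from the local store to the cloud store."""
        moved = self.cold_blocks(now)
        for key in moved:
            self.cloud[key] = self.local.pop(key)
        return moved
```

Because the decision is made per block rather than per file, a large object that is only partly active can keep its frequently used blocks on the NAS device while the rest is moved to cloud storage.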


Current Assignee: CommVault Systems Incorporated
Original Assignee: CommVault Systems Incorporated
Inventors: Prahlad, Anand; Muller, Marcus S.; Kottomtharayil, Rajiv; Gokhale, Parag; Vijayan, Manoj; Kavuri, Srinivas
Granted Patent: US 8,849,761 B2
US Class Current: 707/640
CPC Class Codes

G06F 11/3485   for I/O devices

G06F 16/122   using management policies b...

G06F 16/1748   De-duplication implemented ...

G06F 16/1827   Management specifically ada...

G06F 16/1844   Management specifically ada...

G06F 16/41   Indexing; Data structures t...

G06F 3/06   Digital input from, or digi...

G06F 3/0605   by facilitating the interac...

G06F 3/061   Improving I/O performance

G06F 3/0626   Reducing size or complexity...

G06F 3/0631   by allocating resources to ...

G06F 3/0641   De-duplication techniques

G06F 3/0649   Lifecycle management

G06F 3/0667   at data level, e.g. file, r...

G06F 3/067   Distributed or networked st...

G06Q 30/02   Marketing; Price estimation...

G06Q 30/0206   Price or cost determination...

G06Q 50/188   Electronic negotiation

H04L 63/0428   wherein the data content is...

H04L 67/02   based on web technology, e....

H04L 67/06   specially adapted for file ...

H04L 67/1095   Replication or mirroring of...

H04L 67/1097   for distributed storage of ...

H04L 67/535   Tracking the activity of th...

H04L 67/56   Provisioning of proxy servi...

H04L 67/5682   Policies or rules for updat...

H04L 69/08   Protocols for interworking;...

Y04S 40/20   Information technology spec...
