CLOUD GATEWAY SYSTEM FOR MANAGING DATA STORAGE TO CLOUD STORAGE SITES
Abstract
Systems and methods are disclosed for performing data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.
1161 Citations
19 Claims
1. A cloud gateway system for storing, on a target cloud storage site, a secondary copy of an original data set that comprises data blocks, wherein the cloud gateway system is coupled between one or more client computers and one or more cloud storage sites via a network, the cloud gateway system comprising:
a data reception component configured to receive the original data set from a client computer;
a local cache configured to buffer the original data set received from the client computer before the secondary copy of the original data set is stored on the target cloud storage site;
a callback layer configured to intercept calls for the original data set between a file system and the cache and to track the intercepted calls to provide information regarding when the original data set is changed, updated, and/or accessed by the file system;
a data migration component configured to transfer some or all of the original data set buffered in the cache, wherein the data migration component is further configured to receive information from the callback layer regarding when the original data set is changed, updated, and/or accessed by the file system; and
a media agent component, further comprising:
a network agent configured to establish and manage a network connection between the media agent and the target cloud storage site using at least one of HTTP and HTTP over Transport Layer Security/Secure Sockets Layer; and
a cloud storage submodule configured to open, read, and write data files stored on the target cloud storage site and direct the target cloud storage site to perform data storage operations, wherein the cloud storage submodule is further configured to:
convert received file system commands to store a copy of data blocks from the original data set into specific calls specified by an application programming interface utilized by the target cloud storage site; and
transfer at least some of the contents of the local cache over the network connection for storage at the target cloud storage site.
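The conversion step recited for the cloud storage submodule in claim 1 can be illustrated with a minimal sketch. It is not taken from the patent: the `ApiCall` type, the `to_api_call` function, the base URL, and the mapping of write/read/delete onto PUT/GET/DELETE are all assumptions chosen to resemble a typical REST-style object store reached over HTTP or HTTPS.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class ApiCall:
    """Hypothetical description of one HTTP request to a cloud storage site."""
    method: str
    url: str
    body: bytes = b""


def to_api_call(command: str, path: str, data: bytes = b"",
                base_url: str = "https://storage.example.com/bucket") -> ApiCall:
    """Translate a file-system-style command into a REST-style request.

    The command names and the verb mapping are assumptions for illustration;
    a real site would publish its own API.
    """
    methods = {"write": "PUT", "read": "GET", "delete": "DELETE"}
    if command not in methods:
        raise ValueError(f"unsupported file system command: {command}")
    # Percent-encode the path so it is safe to embed in a URL ("/" is kept).
    url = f"{base_url}/{quote(path.lstrip('/'))}"
    # Only writes carry a payload.
    body = data if command == "write" else b""
    return ApiCall(methods[command], url, body)
```

A caller would hand each returned `ApiCall` to whatever HTTP client maintains the connection to the storage site; that part is left out because the claim does not specify it.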
6. A method for storing a secondary copy of an original data set on a cloud storage site using a cloud gateway, wherein the cloud gateway is coupled between one or more client computers and one or more cloud storage sites via a network, the method comprising:
identifying data blocks within a cache of the cloud gateway that satisfy certain criteria, wherein the original data set comprises data blocks, wherein the certain criteria are from a storage policy, and wherein the certain criteria include time-based criteria;
performing block-level deduplication of the identified data blocks to create a deduplicated set of data, wherein the block-level deduplication includes:
determining a size for a container file to utilize when deduplicating the identified data blocks; and
deduplicating at least some of the identified data blocks to create one or more container files containing deduplicated data, wherein at least one of the container files has the determined size; and
storing the deduplicated set of data on the cloud storage site by:
buffering data for later transmission to the cloud storage site;
repeating the following steps while the data buffer is not full:
receiving a file system request to write a group of data to the cloud storage site; and
adding the group of data to the buffer;
converting a file system request to one or more application program interface calls associated with the cloud storage site; and
transmitting contents of the buffer to the cloud storage site using the one or more application program interface calls associated with the cloud storage site.
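The block-level deduplication into container files recited in claim 6 can be sketched as follows. This is an illustrative reading only: the function name, the use of SHA-256 to detect duplicate blocks, and the rule of starting a new container when the next block would not fit are assumptions, not details stated in the claim.

```python
import hashlib


def deduplicate_into_containers(blocks, container_size):
    """Pack unique blocks into containers of at most `container_size` bytes.

    Returns (containers, recipe). `containers` is a list of bytes objects.
    `recipe` has one (container_index, offset, length) entry per input block,
    so the original sequence can be rebuilt from the containers.
    A single block larger than `container_size` gets a container of its own.
    """
    containers = [bytearray()]
    seen = {}    # block digest -> (container index, offset, length)
    recipe = []
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if digest not in seen:
            # Start a new container if this block would overflow the current one.
            if containers[-1] and len(containers[-1]) + len(block) > container_size:
                containers.append(bytearray())
            seen[digest] = (len(containers) - 1, len(containers[-1]), len(block))
            containers[-1].extend(block)
        # Duplicate blocks only add a reference to the stored copy.
        recipe.append(seen[digest])
    return [bytes(c) for c in containers], recipe
```

Grouping many small unique blocks into a few larger containers means the storage site sees a handful of objects rather than one per block, which is the namespace benefit the abstract attributes to containerized deduplication.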
14. A system for creating a secondary copy of an original data set using a cloud storage site, wherein the original data set is received from one or more client computers, the system comprising:
means for identifying sub-objects of the original data set that satisfy certain criteria, wherein the certain criteria are related to a storage policy;
means for performing deduplication of the identified data sub-objects to create a deduplicated set of data; and
means for storing the deduplicated set of data on the cloud storage site, wherein the means for storing includes:
means for buffering data for later transmission to the cloud storage site;
means for converting file system requests into application program interface calls associated with the cloud storage site; and
means for transmitting the buffered data to the cloud storage site using the one or more application program interface calls associated with the cloud storage site.
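The buffering, converting, and transmitting elements of claim 14 can be pictured with a small sketch. Everything named here is hypothetical: `WriteBuffer`, its byte-count capacity, and the `put_object` callable that stands in for a cloud storage site's API call are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Callable


class WriteBuffer:
    """Collects file-system-style writes and sends them in one batch."""

    def __init__(self, capacity: int, put_object: Callable[[str, bytes], None]):
        self.capacity = capacity          # flush threshold in bytes (assumed)
        self._put_object = put_object     # stand-in for the site's API call
        self._pending: list[tuple[str, bytes]] = []
        self._used = 0

    def write(self, path: str, data: bytes) -> None:
        """Accept one write request; flush once the buffer is full."""
        self._pending.append((path, data))
        self._used += len(data)
        if self._used >= self.capacity:
            self.flush()

    def flush(self) -> None:
        """Convert each buffered request to an API call and transmit it."""
        for path, data in self._pending:
            key = path.lstrip("/")        # file path -> object key (assumed rule)
            self._put_object(key, data)
        self._pending.clear()
        self._used = 0
```

Batching writes this way trades a short delay for fewer round trips, which matters on the high-latency wide area links the abstract describes.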
17. A non-transitory computer-readable medium storing instructions that when executed by a processor perform a method for utilizing cloud storage resources to store at least a first portion of at least one data object within a network attached storage (NAS) device, wherein the NAS device includes a NAS file system and a non-volatile data store, and wherein the NAS device is communicatively coupled to access the cloud storage resources, the method comprising:
accessing calls to or from the NAS file system for reading of data from or writing of data to the non-volatile data store of the NAS device, wherein the at least one data object consists of multiple data blocks, wherein the non-volatile data store of the NAS device stores the multiple data blocks of the at least one data object; wherein the NAS file system of the NAS device controls the reading of data from or the writing of data to the multiple data blocks of the at least one data object, and wherein the accessing includes identifying individual blocks or groups of blocks within the multiple data blocks of the at least one data object that the NAS file system of the NAS device reads data from or writes data to;
based on the accessing, identifying a portion of the multiple data blocks of the at least one data object that satisfies a data storage criteria; and
automatically transferring the identified portion of the multiple data blocks for storage by the cloud storage resources.
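One way to read the identifying step of claim 17 is as per-block access tracking followed by a selection rule. The sketch below assumes an idle-time threshold as the data storage criteria and treats never-accessed blocks as eligible; the claim itself leaves the criteria open, and the class and method names are hypothetical.

```python
from __future__ import annotations


class BlockAccessTracker:
    """Records which blocks of one data object the file system touches."""

    def __init__(self) -> None:
        self.last_access: dict[int, float] = {}

    def on_call(self, first_block: int, count: int, now: float) -> None:
        """Note a read or write call covering `count` consecutive blocks."""
        for block in range(first_block, first_block + count):
            self.last_access[block] = now

    def blocks_to_transfer(self, total_blocks: int, now: float,
                           max_idle: float) -> list[int]:
        """Blocks idle for more than `max_idle` seconds, or never accessed."""
        return [
            b for b in range(total_blocks)
            if now - self.last_access.get(b, float("-inf")) > max_idle
        ]
```

Because the selection works on individual blocks, a frequently read portion of a large object can stay on the NAS device while the rest of the same object moves to cloud storage.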
18. A stand-alone cloud gateway device, coupled to one or more external computing devices over a network, wherein at least one cloud storage site is also connected to the cloud gateway device via the network, the cloud gateway device comprising:
at least one processor;
a communication component coupled to the at least one processor and associated with a network address for the stand-alone cloud gateway device, wherein the communication component receives data transfer commands from the one or more external computing devices on the network, wherein the one or more external computing devices direct the data transfer commands to the cloud gateway device via the network address for the cloud gateway device, and wherein the data transfer commands direct operation of the cloud gateway device;
a non-volatile, internal data store, coupled to the at least one processor, wherein the internal data store stores data objects;
a data storage component that comprises program code, which when executed by the processor, performs data storage tasks with respect to the internal data store;
a file system that comprises program code, which when executed by the processor, stores and organizes data objects stored in the internal data store;
a network agent that comprises program code, which when executed by the processor, establishes and manages network connections between the media agent component and the cloud storage site; and
a cloud storage submodule that comprises program code, which when executed by the processor:
converts received file system commands to specific application programming interface calls utilized by the cloud storage site; and
transfers the data objects stored in the internal data store to the cloud storage site over the network.
Specification