DATA OBJECT STORE AND SERVER FOR A CLOUD STORAGE ENVIRONMENT, INCLUDING DATA DEDUPLICATION AND DATA MANAGEMENT ACROSS MULTIPLE CLOUD STORAGE SITES
First Claim
1. A system to provide cloud-based data management services, wherein the system is communicatively coupled to multiple client computers via at least one network, and wherein the system is communicatively coupled to multiple cloud storage sites, the system comprising:
- an object server node, including a secondary storage computing device, configured to create, on multiple, different cloud storage sites, secondary copies from logical groups of data objects, wherein the object server node further comprises—
an object server agent configured to:
receive data objects from the multiple client computers; and
provide a web-based interface to the multiple client computers to permit the multiple client computers to write, read, retrieve, and manipulate the data objects received by the object server agent and stored as secondary copies of the data objects on the cloud storage sites;
a data ingestion database configured to record information about each data object received by the object server agent from the multiple client computers, wherein the recorded information for each data object includes two or more of the following:
a unique token or universal resource identifier that identifies the data object;
a client computer or a user from which the data object was received;
a sub-client identifier that identifies a logical container and associated storage policy parameters that dictate handling or management of data objects within the logical container;
a location of an instance of the data object within the multiple cloud storage sites;
a location of deduplication information pertaining to the data object; and
a cryptographically unique identifier for the data object; and
wherein the secondary storage computing device is further configured to perform at least one of the following operations before copying the logical group of data objects to at least one of the cloud storage sites—
content indexing each data object in the logical group;
performing deduplication on the data objects in the logical group; and
encrypting the data objects in the logical group.
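The per-object record kept by the data ingestion database can be pictured with a minimal sketch. The class name `IngestionRecord`, its field names, the helper `record_for`, and the use of SHA-256 as the cryptographically unique identifier are all illustrative assumptions; the claim only requires that two or more of the six listed items be recorded.

```python
import hashlib
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class IngestionRecord:
    """Hypothetical row in a data ingestion database (names are assumptions)."""
    object_uri: Optional[str] = None           # unique token or URI for the object
    source: Optional[str] = None               # client computer or user that sent it
    sub_client_id: Optional[str] = None        # logical container and its policy
    instance_location: Optional[str] = None    # where an instance sits in the cloud sites
    dedup_info_location: Optional[str] = None  # where deduplication information sits
    content_hash: Optional[str] = None         # cryptographically unique identifier

    def populated(self) -> List[str]:
        # Names of the fields that actually carry a value.
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]

    def is_valid(self) -> bool:
        # The claim calls for "two or more" of the listed items.
        return len(self.populated()) >= 2


def record_for(data: bytes, source: str) -> IngestionRecord:
    # SHA-256 is an assumed choice of hash; the claim does not name one.
    return IngestionRecord(source=source,
                           content_hash=hashlib.sha256(data).hexdigest())
```

For example, `record_for(b"hello", "client-a").is_valid()` is true because two fields are populated, while a record holding only `source` is not.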
Abstract
Systems and methods are disclosed for performing data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.
561 Citations
20 Claims
1. A system to provide cloud-based data management services, wherein the system is communicatively coupled to multiple client computers via at least one network, and wherein the system is communicatively coupled to multiple cloud storage sites, the system comprising:
an object server node, including a secondary storage computing device, configured to create, on multiple, different cloud storage sites, secondary copies from logical groups of data objects, wherein the object server node further comprises—
an object server agent configured to:
receive data objects from the multiple client computers; and
provide a web-based interface to the multiple client computers to permit the multiple client computers to write, read, retrieve, and manipulate the data objects received by the object server agent and stored as secondary copies of the data objects on the cloud storage sites;
a data ingestion database configured to record information about each data object received by the object server agent from the multiple client computers, wherein the recorded information for each data object includes two or more of the following:
a unique token or universal resource identifier that identifies the data object;
a client computer or a user from which the data object was received;
a sub-client identifier that identifies a logical container and associated storage policy parameters that dictate handling or management of data objects within the logical container;
a location of an instance of the data object within the multiple cloud storage sites;
a location of deduplication information pertaining to the data object; and
a cryptographically unique identifier for the data object; and
wherein the secondary storage computing device is further configured to perform at least one of the following operations before copying the logical group of data objects to at least one of the cloud storage sites—
content indexing each data object in the logical group;
performing deduplication on the data objects in the logical group; and
encrypting the data objects in the logical group.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method for managing a storage request from a requesting client computer to store a data object by an object store system, wherein the object store system is communicatively coupled between the client computer and one or more cloud storage sites via at least one network, the method comprising:
receiving a request to store the data object, and receiving an identifier for the data object and metadata associated with the data object;
determining one or more storage policy parameters applicable to the data object;
based on the received identifier, determining if the system currently has the data object stored in a manner consistent with the storage policy parameters applicable to the data object, and—
if the system currently has the data object stored in a manner consistent with the storage policy parameters applicable to the data object, then:
updating a deduplication database to associate previously stored blocks with the received request to store the data object,
storing the received metadata, and
storing in a local data store one or more references to at least one of:
(1) a stored copy of the data object, and
(2) constituent blocks that form a stored copy of the data object;
if the system does not currently have the data object stored in a manner consistent with the storage policy parameters applicable to the data object, then:
requesting a copy of the data object from the client computer,
receiving a copy of the data object, and
storing the received data object and metadata in the local data store;
aggregating the data object and the received metadata stored in the local data store into a logical group of data objects, wherein the logical group of data objects include additional data objects from the local data store, and wherein the additional data objects are logically related to the logical group; and
directing storage of the logical group of data objects, as a secondary copy, within the one or more cloud storage sites.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18
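The branching in claim 9 can be sketched roughly as follows. This is a simplified illustration, not the claimed implementation: `ObjectStore`, `handle_request`, `flush_group`, and `object_id_for` are hypothetical names, a storage policy is reduced to a single string, the identifier is assumed to be a SHA-256 digest computed by the client, and in-memory dictionaries stand in for the deduplication database, the local data store, and a cloud storage site.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


def object_id_for(data: bytes) -> str:
    # Assumed identifier scheme: a SHA-256 digest sent with the request.
    return hashlib.sha256(data).hexdigest()


@dataclass
class ObjectStore:
    # Local data store: identifier -> object bytes awaiting grouping.
    local: Dict[str, bytes] = field(default_factory=dict)
    # Metadata received with each request, keyed by identifier.
    metadata: Dict[str, List[dict]] = field(default_factory=dict)
    # Stand-in for the deduplication database: identifier -> request count.
    dedup_refs: Dict[str, int] = field(default_factory=dict)
    # (identifier, policy) pairs already stored consistently with the policy.
    satisfied: Set[Tuple[str, str]] = field(default_factory=set)
    # Identifiers waiting to be grouped, keyed by policy.
    pending: Dict[str, List[str]] = field(default_factory=dict)
    # Stand-in for a cloud storage site: policy -> stored logical groups.
    cloud: Dict[str, List[Dict[str, bytes]]] = field(default_factory=dict)

    def handle_request(self, object_id: str, meta: dict, policy: str,
                       fetch: Callable[[], bytes]) -> str:
        # The received metadata is stored on both branches.
        self.metadata.setdefault(object_id, []).append(meta)
        self.dedup_refs[object_id] = self.dedup_refs.get(object_id, 0) + 1
        if (object_id, policy) in self.satisfied:
            # Already stored under this policy: record a reference only.
            return "deduplicated"
        # Not stored yet: request the copy from the client and keep it locally.
        self.local[object_id] = fetch()
        self.satisfied.add((object_id, policy))
        self.pending.setdefault(policy, []).append(object_id)
        return "stored"

    def flush_group(self, policy: str) -> List[str]:
        # Aggregate pending objects sharing a policy into one logical group
        # and direct it to the (simulated) cloud site as a secondary copy.
        ids = self.pending.pop(policy, [])
        if ids:
            group = {oid: self.local[oid] for oid in ids}
            self.cloud.setdefault(policy, []).append(group)
        return ids
```

In this sketch the `fetch` callback is invoked only on the second branch, which mirrors the point of the claim: the object's bytes cross the network only when the system does not already hold a suitable copy.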
19. A system for managing a storage request from a requesting client computer to store a data object that comprises multiple blocks, wherein the system is communicatively coupled between the client computer and at least one cloud storage site, the system comprising:
means for receiving a request to store a data object, wherein the data object includes metadata associated therewith;
means for identifying a logical group for the received data object, wherein the means for identifying identifies the logical group based at least in part on the metadata received with the data object and a storage policy having storage policy parameters, and wherein the logical group aggregates for storage data objects sharing a common characteristic;
means for determining, based on a cryptographically unique identifier for the data object, whether a copy of the data object is already stored in an archive file, wherein the archive file is stored within a cloud storage site;
means for updating at least one of a deduplication database and the archive file when a copy of the data object is already stored in the archive file;
means for performing block level deduplication of the data object when a copy of the data object is not already stored in the archive file; and
means for updating an ingestion database to reflect at least the storage request and received metadata.
Dependent claims: 20
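Block level deduplication, as named in claim 19, can be illustrated with a short sketch. The fixed block size, the SHA-256 digests, and the function names `dedup_blocks` and `restore` are assumptions chosen for illustration; the claim does not specify how blocks are delimited or hashed.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # hypothetical block size, kept tiny so the example is readable


def dedup_blocks(data: bytes, block_store: Dict[str, bytes],
                 block_size: int = BLOCK_SIZE) -> List[str]:
    """Split data into fixed-size blocks and store each distinct block once.

    Returns the ordered list of block digests that reconstructs the object.
    """
    refs: List[str] = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        # Only the first occurrence of a block is written to the store.
        block_store.setdefault(digest, block)
        refs.append(digest)
    return refs


def restore(refs: List[str], block_store: Dict[str, bytes]) -> bytes:
    # Reassemble the object by following its block references in order.
    return b"".join(block_store[digest] for digest in refs)
```

With a block size of 4, the input `b"abcdabcdwxyz"` yields three references but only two stored blocks, since `abcd` repeats.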
Specification