Data object store and server for a cloud storage environment, including data deduplication and data management across multiple cloud storage sites
First Claim
1. A system for storing a set of data files to a cloud storage site, the system comprising memory and a processor that are configured to:
provide multiple requests for cloud storage to two or more cloud storage sites,
wherein the multiple requests each include a request for data storage to a cloud storage site;
wherein the multiple requests each include:
information associated with a total size of the set of data files to be stored, and
requirements for the data storage for the set of files;
wherein the multiple requests each include at least one pricing rate request; and
wherein the two or more cloud storage sites are respectively operated by two or more independent organizations;
receive a response from each of at least two of the two or more cloud storage sites,
wherein each of the responses from the at least two cloud storage sites includes:
a current capacity that is determined based at least in part on:
a capacity policy that specifies preferences or criteria associated with data storage for that cloud storage site, and
a quotation policy that includes a set of preferences and criteria associated with generating a quote in response to received requests; and
a pricing quote for a data storage job at that cloud storage site;
select one of the at least two cloud storage sites based on the received responses,
wherein the selecting is based at least in part on the pricing quote and the current capacity; and
provide to the selected cloud storage site the set of data files to be stored according to the provided request, the selected received response, or both the provided request and the selected received response.
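The selection step recited in claim 1 (choosing a site based at least in part on the pricing quote and the current capacity) can be illustrated with a minimal sketch. This is only one possible reading, not the patented implementation: the `QuoteResponse` fields, the per-GB pricing unit, and the rule "cheapest site with enough reported capacity" are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class QuoteResponse:
    """Hypothetical response from one cloud storage site."""
    site: str
    current_capacity_gb: float  # capacity the site reports as available
    price_per_gb: float         # assumed unit for the pricing quote


def select_site(responses, total_size_gb):
    """Return the cheapest response whose capacity fits the data set, or None."""
    eligible = [r for r in responses if r.current_capacity_gb >= total_size_gb]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.price_per_gb * total_size_gb)


# Example data (invented): three independent sites answer the same request.
responses = [
    QuoteResponse("site-a", 500.0, 0.030),
    QuoteResponse("site-b", 50.0, 0.010),   # cheapest, but too small
    QuoteResponse("site-c", 800.0, 0.025),
]
chosen = select_site(responses, total_size_gb=200.0)
```

Here `site-b` is excluded on capacity even though it quotes the lowest rate, which is the interaction between the two criteria that the claim language describes.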
Abstract
Data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, are performed within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.
14 Claims
1. A system for storing a set of data files to a cloud storage site (recited in full under First Claim). Dependent claims: 2, 3, 4
5. A system for identifying storage locations for a set of data files subject to a storage policy, wherein the set of data files is generated within a storage operation cell that has multiple client computers, and wherein the storage operation cell is coupled to multiple cloud storage sites via a network, the system comprising a processor that is configured to:
group the data files into at least one logical group of data files using a storage policy, wherein the storage policy defines performance-based classes of storage locations on which the set of data files may be stored, wherein the logical grouping of the set of data files facilitates deduplication of the set of data files;
determine aggregate storage requirements of a logical group of data files based at least in part on the storage policy;
estimate storage costs for each of two or more candidate storage sites based on historical or projected cost information stored within a storage manager computing device, wherein the storage manager computing device tracks and directs storage operations between client computing devices and secondary storage devices for the storage operation cell;
based on the estimated storage costs, identify some of the two or more candidate cloud storage sites to store a copy of the logical group of data files, wherein each of the two or more candidate cloud storage sites are operated by independent organizations;
generate a request for quotes for storing a copy of the logical group of data files on one of the candidate cloud storage sites, wherein the request for quotes includes the aggregate storage requirements of the logical group of data files;
identify a target cloud storage site from the two or more candidate cloud storage sites by evaluating, based at least in part on received quotes, storage costs of storing a copy of the logical group of data files;
wherein the received quotes are based on quotation policies that include a set of preferences and criteria associated with generating a quote in response to received requests and further comprise at least two of:
a first pricing rate for an initial upload to the candidate cloud storage sites,
a second pricing rate for downloads from the candidate cloud storage sites,
a third pricing rate for searching or accessing the candidate cloud storage sites, and
a fourth pricing rate for continued storage and maintenance of data on the candidate cloud storage sites;
wherein the storage costs include estimated monetary expenses associated with storing the logical group of data objects; and
transmit to storage at least some of the logical group of data files from a client computer to the target cloud storage site.
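Claim 5 describes a two-stage evaluation: candidate sites are first narrowed using historical or projected costs, and the target is then chosen from received quotes that carry at least two of four pricing rates. The sketch below shows one way such an evaluation could be arranged. The `Quote` fields, the rate units, the usage parameters, and the choice to count an omitted rate as zero are illustrative assumptions, not details taken from the claim.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quote:
    """Hypothetical quote; a rate left as None was not included by the site."""
    site: str
    upload_rate: Optional[float] = None    # per GB, initial upload
    download_rate: Optional[float] = None  # per GB downloaded
    access_rate: Optional[float] = None    # per search or access request
    storage_rate: Optional[float] = None   # per GB per month


def shortlist(historical_cost_per_gb, size_gb, keep=2):
    """Stage 1: rank sites by estimated cost from historical data, keep the cheapest."""
    ranked = sorted(historical_cost_per_gb,
                    key=lambda site: historical_cost_per_gb[site] * size_gb)
    return ranked[:keep]


def quoted_cost(quote, size_gb, months, download_gb, accesses):
    """Stage 2: total estimated expense; omitted rates count as zero (assumption)."""
    def charge(rate, quantity):
        return 0.0 if rate is None else rate * quantity
    return (charge(quote.upload_rate, size_gb)
            + charge(quote.download_rate, download_gb)
            + charge(quote.access_rate, accesses)
            + charge(quote.storage_rate, size_gb * months))


def pick_target(quotes, **usage):
    """Return the quote with the lowest total estimated expense."""
    return min(quotes, key=lambda q: quoted_cost(q, **usage))


# Example data (invented).
history = {"site-a": 0.05, "site-b": 0.03, "site-c": 0.04}
candidates = shortlist(history, size_gb=100.0)
quotes = [
    Quote("site-b", upload_rate=0.01, storage_rate=0.02),
    Quote("site-c", upload_rate=0.0, download_rate=0.05, storage_rate=0.015),
]
usage = dict(size_gb=100.0, months=12, download_gb=20.0, accesses=1000)
target = pick_target(quotes, **usage)
```

Counting an omitted rate as zero favors sites that quote fewer rates; a real evaluator would more plausibly substitute a projected rate for the missing component.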
6. A non-transitory computer-readable medium storing computer-implementable instructions for identifying storage locations for a set of data files, the method comprising:
determining storage requirements of a group of data files, wherein the group of data files are logically organized in a primary storage device to facilitate deduplication of the group of data files;
estimating storage costs for each of two or more candidate storage sites based on historical or projected cost information stored within a storage manager computing device, wherein the storage manager computing device tracks and directs storage operations between client computing devices and secondary storage devices for a storage operation cell;
based on estimating the storage costs, identifying some of the two or more candidate cloud storage sites to store a copy of the group of data files, wherein each of the two or more candidate cloud storage sites are operated by independent organizations;
generating a request for quotes for storing a copy of the group of data files on one of the candidate cloud storage sites, wherein the request for quotes includes the aggregate storage requirements of the group of data files, and wherein the request for quotes is provided to the two or more candidate cloud storage sites;
receiving one or more quotes from each of the two or more candidate cloud storage sites, wherein the received quotes are based on quotation policies that include a set of preferences and criteria associated with generating a quote in response to received requests, wherein the received quotes include at least price rates for different types of storage media used at the two or more candidate cloud storage sites, and wherein received quotes from individual candidate cloud storage sites further comprise at least two of:
a first pricing rate for an initial upload to the candidate cloud storage sites,
a second pricing rate for downloads from the candidate cloud storage sites,
a third pricing rate for searching or accessing the candidate cloud storage sites, and
a fourth pricing rate for continued storage and maintenance of data on the candidate cloud storage sites;
identifying a target cloud storage site from the two or more candidate cloud storage sites by evaluating, based at least in part on the received quotes, storage costs of storing a copy of the group of data files, wherein the storage costs include estimated monetary expenses associated with storing the group of data objects, and wherein evaluating storage costs is based at least in part on a reputation or reliability of the candidate cloud storage sites; and
causing at least some of the group of data files to be sent to the target cloud storage site for storage.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14
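Claim 6 adds two factors to the evaluation: quotes carry price rates per storage media type, and the cost comparison takes a site's reputation or reliability into account. The claim does not say how reliability enters the calculation, so the sketch below simply divides the quoted cost by a reliability score in (0, 1]. That weighting, the dictionary layout, and the media names are assumptions made for illustration only.

```python
def adjusted_cost(quote, size_gb, media):
    """Quoted cost for the given media, inflated for less reliable sites (assumed rule)."""
    return quote["media_rates"][media] * size_gb / quote["reliability"]


def pick_target(quotes, size_gb, media):
    """Return the quote with the lowest adjusted cost among sites offering the media."""
    offering = [q for q in quotes if media in q["media_rates"]]
    if not offering:
        return None
    return min(offering, key=lambda q: adjusted_cost(q, size_gb, media))


# Example data (invented): per-GB rates by media type and a reliability score.
quotes = [
    {"site": "site-x", "reliability": 0.99,
     "media_rates": {"disk": 0.030, "tape": 0.012}},
    {"site": "site-y", "reliability": 0.80,
     "media_rates": {"disk": 0.027, "tape": 0.008}},
]
disk_target = pick_target(quotes, size_gb=100.0, media="disk")
tape_target = pick_target(quotes, size_gb=100.0, media="tape")
```

With these numbers the more reliable site wins for disk despite a higher raw rate, while the price gap for tape is large enough that the less reliable site still comes out ahead.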
Specification