DISTRIBUTED COMPUTING BACKUP AND RECOVERY SYSTEM
Abstract
The distributed computing backup and recovery (DCBR) system and method provide backup and recovery for distributed computing models (e.g., NoSQL). The DCBR system extends the protections from server node-level failure and introduces persistence in time so that the evolving data set may be stored and recovered to a past point in time. The DCBR system, instead of performing backup and recovery for an entire dataset, may be configured to apply to a subset of data. Instead of keeping or recovering snapshots of the entire dataset, which requires the entire cluster, the DCBR system identifies the particular nodes and/or archive files where the dataset resides so that backup or recovery may be done with a much smaller number of nodes.
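The "persistence in time" idea in the abstract, applied to a subset of keys rather than the whole dataset, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `VersionedStore`, its methods, and the use of integer timestamps are hypothetical and do not come from the patent text.

```python
import bisect
from collections import defaultdict


class VersionedStore:
    """Hypothetical append-only store that keeps every version of each key."""

    def __init__(self):
        self._times = defaultdict(list)   # key -> sorted list of write timestamps
        self._values = defaultdict(list)  # key -> values, parallel to _times

    def put(self, key, value, ts):
        # Assumption: writes for a given key arrive in non-decreasing time order.
        self._times[key].append(ts)
        self._values[key].append(value)

    def get_as_of(self, key, ts):
        # Find the latest version written at or before ts, if any.
        idx = bisect.bisect_right(self._times.get(key, []), ts)
        return self._values[key][idx - 1] if idx else None

    def recover_subset(self, keys, ts):
        # Recover only the requested keys, as they stood at time ts.
        recovered = {}
        for key in keys:
            value = self.get_as_of(key, ts)
            if value is not None:
                recovered[key] = value
        return recovered


store = VersionedStore()
store.put("user:1", "a", 10)
store.put("user:1", "b", 20)
store.put("user:2", "x", 15)
store.put("order:9", "o", 5)

early = store.recover_subset(["user:1", "user:2"], 12)
late = store.recover_subset(["user:1", "user:2"], 20)
```

In this sketch only the keys named in the request are touched, which loosely mirrors the abstract's point that recovering a subset does not require a snapshot of the entire dataset.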
41 Claims
1-20. (canceled)
21. A method for distributed computing backup and recovery, comprising:
retrieving at least one user selectable preference;
identifying a first subset of data from within a data set according to the at least one user selectable preference, the first subset of data containing less than all of the data in the data set wherein the first subset is selectable by using the user selectable preference;
intercepting the first subset of data at an application programming interface (API);
encrypting the first subset of data by using the API;
receiving, into a memory via an interface controlled by a processor connected to a network in a computing environment wherein the encrypted first subset of data objects are within a second subset of data, the second subset of data containing less than all of the data in the first subset of data;
evaluating, using the processor, a hash function stored in the memory to determine network storage locations or network retrieval locations, or both for the data objects;
storing at a granular level, at each of the network storage locations, the data objects according to a data object request, when the data object request comprises a request to store the data objects, where the stored data objects are identified as a replica of the data objects stored at each of the network storage locations;
identifying the data objects within the second subset of data;
retrieving at a granular level from one of the network retrieval locations from a backup of the computing environment, using the processor connected to the network, the stored data objects identified by the one of the network retrieval locations, when the data object request comprises a request to retrieve the data objects, where the stored data objects are retrieved from the second subset of data.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29
30. A product for distributed computing backup and recovery, comprising:
a computer readable memory with processor executable instructions stored thereon, wherein the instructions when executed by the processor cause the processor to;
retrieve at least one user selectable preference;
identify a first subset of data from within a data set according to the at least one user selectable preference, the first subset of data containing less than all of the data in the data set wherein the first subset is selectable by using the user selectable preference;
intercept the first subset of data at an application programming interface (API);
encrypting the first subset of data by using the API;
receive, into a memory via an interface controlled by a processor connected to a network in a computing environment, a data object request that identifies data objects to store or retrieve, wherein the identified data objects are encrypted and are within a second subset of data, the second subset of data containing less than all of the data in the first subset of data;
evaluate, using the processor, a hash function stored in the memory to determine network storage locations or network retrieval locations, or both for the data objects, wherein the hash function uses a hash ring to map a first namespace into an evenly distributed second namespace using a hashing function wherein the evenly distributed second namespace is smaller than the first namespace;
identify the data objects within the second subset of data;
store at a granular level, at each of the network storage locations, the data objects according to the data object request, when the data object request comprises a request to store the data objects, where the stored data objects are identified as a replica of the data objects at each of the network storage locations;
retrieve at a granular level, from one of the network retrieval locations from a backup of the computing environment, using the processor connected to the network, the data objects identified by the one of the network retrieval locations, when the data object request comprises a request to recover or retrieve the data objects, where the stored data objects are retrieved from the second subset of data.
Dependent claims: 31, 32, 33, 34
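The hash ring recited in claim 30, which maps a large first namespace into a smaller, evenly distributed second namespace and yields the network storage locations, can be sketched as follows. This is an illustrative consistent-hashing sketch only, not the claimed implementation. The names `ring_position` and `HashRing`, the use of SHA-256, the 32-bit ring size, and the virtual-node count are all assumptions.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map an arbitrary string (first namespace) to a 32-bit ring slot
    (smaller second namespace). SHA-256 is an assumed choice of hash."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")  # always in [0, 2**32)


class HashRing:
    """Hypothetical ring that places each node at several virtual points."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets `vnodes` points so keys spread more evenly.
        self._points = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._points]
        self._node_count = len(set(nodes))

    def locations(self, key, replicas=3):
        """Return up to `replicas` distinct nodes, walking clockwise from the key."""
        want = min(replicas, self._node_count)
        start = bisect.bisect_right(self._positions, ring_position(key))
        found = []
        for step in range(len(self._points)):
            node = self._points[(start + step) % len(self._points)][1]
            if node not in found:
                found.append(node)
                if len(found) == want:
                    break
        return found


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
locs = ring.locations("customer:42", replicas=3)
```

Because the same key always hashes to the same ring position, a store request and a later retrieve request resolve to the same nodes, so only those nodes need to take part in backing up or recovering that object.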
35. A system for distributed computing backup and recovery (DCBR), comprising:
a processor to retrieve at least one user selectable preference, identify a first subset of data from within a data set according to the at least one user selectable preference, the first subset of data containing less than all of the data in the data set wherein the first subset is selectable by using the user selectable preference;
an application programming interface (API) to intercept the first subset of data, and encrypt the first subset of data;
a cluster of computing nodes in a computing environment;
an interface controlled by the processor connected to a network in the computing environment;
a memory coupled to the processor, wherein the memory comprises;
a data object request received through the interface for data objects wherein the data objects are encrypted and are within a second subset of data, the second subset of data containing less than all of the data in the first subset of data;
a hash function that is evaluated by the processor to determine network storage locations or network retrieval locations, or both for the data object, wherein the hash function uses a hash ring to map a first namespace into an evenly distributed second namespace using a hashing function wherein the evenly distributed second namespace is smaller than the first namespace;
instructions executable by the processor that cause the processor to;
identify the data objects within the second subset of data;
retrieve at a granular level from one of the network retrieval locations the data objects from a backup of the computing environment, when the request is a request to retrieve the data objects, where the stored data objects retrieved are identified by the one of the network retrieval locations, where the stored data objects are retrieved from the second subset of data;
store at a granular level the data objects, when the request is a request to store the data object;
where a copy of the data objects are located on one or more of the nodes, where the stored data objects are identified as a replica of the data objects at each of the network storage locations.
Dependent claims: 36, 37, 38, 39, 40, 41
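The store and retrieve branches of the data object request in claim 35 can be sketched as a small in-memory dispatcher. This is a rough illustration under stated assumptions, not the claimed system: `Node`, `placement`, and `handle_request` are hypothetical names, the request is modelled as a plain dictionary, and node selection uses a simple rank-by-hash stand-in (rendezvous-style hashing) rather than the hash ring the claim recites, purely to keep the example self-contained.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node holding object copies in memory."""
    name: str
    objects: dict = field(default_factory=dict)  # key -> {"value", "replica"}


def placement(key, nodes, copies):
    # Stand-in for the claimed hash ring: rank nodes by hash of (node, key).
    def score(node):
        return hashlib.sha256(f"{node.name}|{key}".encode("utf-8")).digest()
    return sorted(nodes, key=score)[:copies]


def handle_request(request, nodes, copies=2):
    """Dispatch a data object request of type 'store' or 'retrieve'."""
    kind, key = request["type"], request["key"]
    targets = placement(key, nodes, copies)
    if kind == "store":
        for node in targets:
            # Each stored copy is flagged as a replica at its storage location.
            node.objects[key] = {"value": request["value"], "replica": True}
        return [node.name for node in targets]
    if kind == "retrieve":
        # Read from the first location that actually holds the object.
        for node in targets:
            if key in node.objects:
                return node.objects[key]["value"]
        return None
    raise ValueError(f"unknown request type: {kind!r}")


cluster = [Node("n1"), Node("n2"), Node("n3")]
stored_on = handle_request(
    {"type": "store", "key": "k", "value": b"ciphertext"}, cluster
)
fetched = handle_request({"type": "retrieve", "key": "k"}, cluster)
```

The value is treated as opaque bytes here because the claim has the objects encrypted before they reach the store; encryption itself is deliberately left out of the sketch.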
Specification