Read, write, and recovery operations for replicated data
First Claim
1. A method of reading replicated data comprising:
receiving a request to read replicated data from a requestor;
issuing a message to each of a plurality of distributed storage devices wherein the message includes a timestamp and wherein each storage device has a version of the data and a timestamp that indicates when the version of data was last updated, wherein the timestamp maintained by each storage device is for a range of data blocks;
comparing the timestamp from the message to the timestamp at each storage device and, if the comparison indicates the storage device has the same version of the data, returning an affirmative response;
when at least a majority of the storage devices but not necessarily all of the storage devices have returned an affirmative response, providing the data to the requestor of the data.
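The read method of claim 1 can be illustrated with a minimal sketch. It is only an interpretation of the claim language, not the patented implementation: the names `Replica`, `check_read`, and `quorum_read` are hypothetical, "same version" is assumed to mean timestamp equality, and a single timestamp per replica stands in for the timestamp kept for a range of data blocks.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical stand-in for one storage device holding a replica."""
    data: bytes
    val_ts: int  # timestamp of the last update to the stored version

    def check_read(self, msg_ts: int) -> bool:
        # Assumption: "same version" means the timestamps are equal.
        return self.val_ts == msg_ts


def quorum_read(replicas, msg_ts, local_data):
    """Return local_data once a strict majority responds affirmatively, else None."""
    affirmative = sum(1 for r in replicas if r.check_read(msg_ts))
    # A majority is more than half of the devices; not all must agree.
    if affirmative > len(replicas) // 2:
        return local_data
    return None
```

With three replicas, two matching timestamps are enough for the read to succeed even if the third replica is stale.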
Abstract
Read, write and recovery operations for replicated data are provided. In one aspect, a system for redundant storage of data includes a plurality of storage devices and a communication medium for interconnecting the storage devices. At least two of the storage devices are designated devices for storing a block of data. Each designated device has a version of the data and a first timestamp that is indicative of when the version of data was last updated and a second timestamp that is indicative of any pending update to the block of data. The read, write and recovery operations are performed to the data using the first and second timestamps to coordinate the operations among the designated devices.
53 Citations
32 Claims
13. A method of reading replicated data comprising:
receiving a request to read replicated data from a requestor;
issuing a message to each of a plurality of distributed storage devices wherein the message includes a timestamp and wherein each storage device has a version of the data and a timestamp that indicates when the version of data was last updated, wherein the timestamp maintained by each storage device is for a range of data blocks;
comparing the timestamp from the message to the timestamp at each storage device and, if the comparison indicates the device has the same version of the data, returning an affirmative response; and
when at least a majority of the storage devices but not necessarily all of the storage devices have returned an affirmative response, providing the data to the requestor of the data; and
a coordinator device initiating a data recovery operation when less than a majority of the storage devices have returned an affirmative response.
14. A method of writing replicated data comprising:
receiving a request to write replicated data;
issuing a message to each of a plurality of distributed storage devices wherein the message includes a timestamp and wherein each storage device has a version of the data and a timestamp that indicates when the version of data was last updated, wherein each storage device also stores indicia of any pending update operation to the data, wherein the indicia of any pending update operation includes a timestamp that indicates a time associated with the pending update operation, and wherein each timestamp maintained by each storage device is for a range of data blocks;
comparing the timestamp from the message to the timestamp at each storage device and, if the comparison indicates the storage device has an earlier version of the data and the timestamp associated with the pending update is not higher than the timestamp of the message, returning an affirmative response, wherein an affirmative response is not returned when the timestamp associated with the pending update operation is higher than the timestamp of the message; and
when at least a majority of the storage devices but not necessarily all of the storage devices have returned an affirmative response, providing the data to at least the majority of the storage devices.
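The device-side comparison in claim 14 might be sketched as follows. This is a hedged reading of the claim rather than the patented implementation: `DeviceState`, `val_ts`, `ord_ts`, and `handle_write_request` are hypothetical names, and recording the message timestamp as the new pending timestamp on acceptance is an assumption the claim does not state.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Hypothetical per-device state for one range of data blocks."""
    val_ts: int  # when the stored version was last updated
    ord_ts: int  # timestamp associated with any pending update

    def handle_write_request(self, msg_ts: int) -> bool:
        # The device must hold an earlier version than the one being written.
        has_earlier_version = self.val_ts < msg_ts
        # The pending update must not be higher than the message timestamp.
        pending_not_higher = self.ord_ts <= msg_ts
        if has_earlier_version and pending_not_higher:
            # Assumption: an accepted request becomes the pending update.
            self.ord_ts = msg_ts
            return True
        return False
```

Under this reading, a device that has already accepted a pending update with timestamp 7 declines a later-arriving request stamped 6, which keeps competing writers from interleaving.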
19. A method of writing replicated data comprising:
receiving a request to write replicated data;
issuing a message to each of a plurality of distributed storage devices wherein the message includes a timestamp and wherein each storage device has a version of the data and a timestamp that indicates when the version of data was last updated, wherein the timestamp maintained by each storage device is for a range of data blocks;
comparing the timestamp from the message to the timestamp at each storage device and, if the comparison indicates the device has an earlier version of the data, returning an affirmative response; and
when at least a majority of the storage devices but not necessarily all of the storage devices have returned an affirmative response, providing the data to at least the majority of the storage devices; and
aborting a write associated with the write request when less than a majority of the storage devices return an affirmative response.
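The coordinator-side flow of claim 19, including the abort path, could look roughly like the sketch below. It is illustrative only: `coordinate_write` and the dictionary layout are hypothetical, devices are modeled in-process rather than over a network, and the data is sent only to the devices that responded affirmatively, which is one way to satisfy "at least the majority".

```python
def coordinate_write(devices, msg_ts, data):
    """Try to write data stamped msg_ts; return True on success, False on abort.

    devices: list of dicts with keys "val_ts" (int) and "data" (bytes),
    a hypothetical in-memory stand-in for the distributed storage devices.
    """
    # A device responds affirmatively if it holds an earlier version.
    accepted = [d for d in devices if d["val_ts"] < msg_ts]

    # Abort when fewer than a majority responded affirmatively.
    if len(accepted) <= len(devices) // 2:
        return False

    # Provide the data to the devices that accepted the request.
    for d in accepted:
        d["data"] = data
        d["val_ts"] = msg_ts
    return True
```

On an abort, no device's data or timestamp is modified in this sketch, so a later retry with a higher timestamp starts from unchanged state.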
21. A system for redundant storage of data comprising a plurality of distributed storage devices and a communication medium for interconnecting the storage devices, wherein at least two of the storage devices are designated storage devices for storing a block of replicated data and wherein each designated storage device has a version of the block of data and a first timestamp that is indicative of when the version of the block of data was last updated and a second timestamp that is indicative of any pending update to the block of data that has not yet been committed, wherein the first timestamp at each designated storage device is maintained for a range of data blocks,
wherein in response to a request to read the block of data, one of the storage devices is configured to issue a message to each of the designated storage devices, wherein the message includes a timestamp and wherein each of the designated storage devices is configured to compare the timestamp from the message to its first and second timestamps,
wherein when the comparison indicates the designated storage device has an earlier version of the block of data and a later update is not pending, the designated storage device is configured to return an affirmative response and, otherwise, the designated storage device is configured to return a negative response,
wherein when at least a majority of the designated storage devices but not necessarily all of the designated storage devices have returned an affirmative response, the block of data at one of the designated storage devices is provided to a requestor of the data.
28. A system for redundant storage of data comprising a plurality of distributed storage devices and a communication medium for interconnecting the storage devices wherein at least two of the storage devices are designated storage devices for storing a block of replicated data and wherein each designated storage device has a version of the block of data and a first timestamp that is indicative of when the version of the block of data was last updated and a second timestamp that is indicative of any pending update to the block of data that has not yet been committed, wherein the first timestamp at each designated storage device is maintained for a range of data blocks,
wherein in response to a message indicating that the block of data is to be recovered, the designated storage devices are configured to forward their first timestamps, and the system is configured to make a determination as to which version of the block of data is most-current based on the forwarded first timestamps.
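The recovery determination in claim 28 can be sketched as picking the highest forwarded first timestamp. This is a simplified illustration under stated assumptions: `plan_recovery` and the device identifiers are hypothetical, and breaking ties by the smallest device identifier is an arbitrary choice the claim does not specify.

```python
def plan_recovery(forwarded):
    """Decide which version is most current from forwarded first timestamps.

    forwarded: dict mapping a device identifier to its first timestamp.
    Returns (source_device, newest_timestamp, stale_devices).
    """
    if not forwarded:
        raise ValueError("no timestamps were forwarded")

    newest_ts = max(forwarded.values())
    # Assumption: ties are broken by the smallest device identifier.
    source = min(dev for dev, ts in forwarded.items() if ts == newest_ts)
    # Devices holding an older version would need the most-current block.
    stale = sorted(dev for dev, ts in forwarded.items() if ts < newest_ts)
    return source, newest_ts, stale
```

The returned list of stale devices identifies which replicas would be brought up to date from the source device.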
Specification