Storage scheme for a distributed storage system
First Claim
1. A system comprising:
one or more compute nodes executing one or more applications;
a plurality of storage nodes each hosting one or more storage devices;
a consistency database manager hosting a consistency database, the consistency database storing, for each storage node of the plurality of storage nodes—
an operational status of the each storage node;
a listing of one or more storage units assigned to the each storage node; and
an update status for each storage unit of the one or more storage units assigned to the each storage node;
wherein the consistency database manager is programmed to (a) receive notifications from each node of the one or more compute nodes and the plurality of storage nodes and (b) update the consistency database according to the notifications, each notification indicating at least one of:
that a source of the each notification is not current; and
that a storage node of the plurality of storage nodes is not responsive to the source of the each notification;
wherein each storage node of the plurality of storage nodes is further programmed to, for each first write IOP (input/output operation) from a first compute node of the one or more compute nodes, execute the each first write IOP with respect to a first copy of a first storage unit stored by the each storage node and referenced by the each first write IOP by—
assigning a first virtual block address (VBA) to a logical block address (LBA) referenced in the each first write IOP according to a first VBA counter;
incrementing the first VBA counter;
storing an association between the LBA and the first VBA;
writing data from the each first write IOP to a first physical storage location;
storing an association between the first physical storage location and the first VBA; and
transmitting the each first write IOP to a second storage node of the plurality of storage nodes with the first VBA.
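The write steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name StorageUnitCopy, the in-memory dictionaries, the append-only list standing in for a storage device, and the forward callback standing in for transmission to the second storage node are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StorageUnitCopy:
    """One node's copy of a storage unit (hypothetical in-memory model)."""
    vba_counter: int = 0                            # next VBA to hand out
    lba_to_vba: dict = field(default_factory=dict)  # LBA -> most recent VBA
    vba_to_pba: dict = field(default_factory=dict)  # VBA -> physical location
    media: list = field(default_factory=list)       # stand-in for a storage device

    def write(self, lba, data, forward=None):
        vba = self.vba_counter          # assign a VBA according to the counter
        self.vba_counter += 1           # increment the counter
        self.lba_to_vba[lba] = vba      # store the LBA-to-VBA association
        pba = len(self.media)           # choose a physical location (assumed append-only)
        self.media.append(data)         # write the data
        self.vba_to_pba[vba] = pba      # store the location-to-VBA association
        if forward is not None:
            forward(lba, data, vba)     # transmit the write together with its VBA
        return vba

    def read(self, lba):
        # Resolve LBA -> VBA -> physical location.
        return self.media[self.vba_to_pba[self.lba_to_vba[lba]]]


# Usage: two writes to the same LBA; the forwarded messages carry VBAs 0 and 1.
sent = []
primary = StorageUnitCopy()
primary.write(7, b"old", forward=lambda *msg: sent.append(msg))
primary.write(7, b"new", forward=lambda *msg: sent.append(msg))
```

Because every write receives a fresh VBA, an overwrite of the same LBA leaves the earlier VBA and its physical location in place; only the LBA-to-VBA association moves to the newer entry.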
Abstract
A system maintains a consistency database that records a status (current, down, stale) for copies of logical storage volumes stored on storage nodes. As failures are detected, the consistency database is updated. Copies are synchronized with one another using information in the consistency database. Write operations on a primary node for a slice of a logical storage node are assigned a virtual block address (VBA) that is mapped to a logical block address (LBA) within the slice. Consistency between the VBAs of the primary node and those of a secondary node is evaluated and used to detect currency. VBA holes are detected and the corresponding write commands are resent to maintain currency. Physical segments on the primary node are assigned virtual segment identifiers (VSIDs) that are kept consistent with the VSIDs on clone nodes so that they can be used for garbage collection and synchronization.
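One way to picture the VBA hole detection mentioned in the abstract is the sketch below. The abstract does not say which node detects holes or how, so everything here is an assumption: the class name SecondaryCopy, the choice to detect holes on the receiving side, and the convention that receive returns the list of missing VBAs for the caller to request again.

```python
class SecondaryCopy:
    """Hypothetical receiver of forwarded writes that tracks VBA continuity."""

    def __init__(self):
        self.expected_vba = 0   # lowest VBA not yet applied
        self.pending = {}       # out-of-order writes, keyed by VBA
        self.applied = []       # (vba, lba, data) in VBA order

    def receive(self, lba, data, vba):
        """Accept a forwarded write and return the VBAs that are still missing."""
        if vba >= self.expected_vba:            # ignore duplicates of applied writes
            self.pending[vba] = (lba, data)
        # Apply every buffered write that is now contiguous with what was applied.
        while self.expected_vba in self.pending:
            p_lba, p_data = self.pending.pop(self.expected_vba)
            self.applied.append((self.expected_vba, p_lba, p_data))
            self.expected_vba += 1
        if not self.pending:
            return []                           # no gap: this copy is contiguous
        # Any VBA below the highest buffered one that has not arrived is a hole.
        return [v for v in range(self.expected_vba, max(self.pending))
                if v not in self.pending]


# Usage: VBA 1 is lost in transit, reported as a hole, then resent.
secondary = SecondaryCopy()
holes_a = secondary.receive(10, b"a", 0)
holes_b = secondary.receive(12, b"c", 2)
holes_c = secondary.receive(11, b"b", 1)
```

In this sketch a non-empty return value is the cue to resend the corresponding write commands; once the gap is filled, the buffered writes are applied in VBA order.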
18 Claims
1. A system comprising the elements recited under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A method comprising:
providing one or more compute nodes executing one or more applications;
providing a plurality of storage nodes each hosting one or more storage devices;
storing, by a computing device, a consistency database storing, for each storage node of the plurality of storage nodes:
an operational status of the each storage node;
a listing of one or more storage units assigned to the each storage node; and
an update status for each storage unit of the one or more storage units assigned to the each storage node;
receiving, by the computing device, notifications from each node of the one or more compute nodes and the plurality of storage nodes;
updating, by the computing device, the consistency database according to the notifications, each notification indicating at least one of:
that a source of the each notification is not current; and
that a storage node of the plurality of storage nodes is not responsive to the source of the each notification; and
for each first write IOP (input/output operation) from a first compute node of the one or more compute nodes, executing, by a first storage node of the plurality of storage nodes, the each first write IOP with respect to a first copy of a first storage unit stored by the first storage node and referenced by the each first write IOP by:
assigning a first virtual block address (VBA) to a logical block address (LBA) referenced in the each first write IOP according to a first VBA counter;
incrementing the first VBA counter;
storing an association between the LBA and the first VBA;
writing data from the each first write IOP to a first physical storage location;
storing an association between the first physical storage location and the first VBA; and
transmitting the each first write IOP to a second storage node of the plurality of storage nodes with the first VBA.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18.
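The storing, receiving and updating steps of the method can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the claimed implementation: the names ConsistencyDatabase, NodeRecord and UnitStatus, the two notification methods, and the idea that a "not current" notification names the affected storage unit are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class UnitStatus(Enum):
    CURRENT = "current"
    STALE = "stale"


@dataclass
class NodeRecord:
    up: bool = True                                  # operational status of the node
    units: dict = field(default_factory=dict)        # storage unit id -> UnitStatus
    reported_by: set = field(default_factory=set)    # nodes that reported it unresponsive


class ConsistencyDatabase:
    """Hypothetical per-storage-node record keeping driven by notifications."""

    def __init__(self):
        self.nodes = {}   # storage node id -> NodeRecord

    def assign(self, node_id, unit_id):
        # Add a storage unit to the node's listing; assume a new copy starts current.
        record = self.nodes.setdefault(node_id, NodeRecord())
        record.units[unit_id] = UnitStatus.CURRENT

    def notify_not_current(self, source, unit_id):
        # The source storage node reports that its own copy of the unit is not current.
        self.nodes[source].units[unit_id] = UnitStatus.STALE

    def notify_unresponsive(self, source, target):
        # The source (compute or storage node) reports that target does not respond.
        record = self.nodes[target]
        record.up = False
        record.reported_by.add(source)


# Usage: storage node s2 reports a stale copy; compute node c1 reports s1 as unresponsive.
db = ConsistencyDatabase()
db.assign("s1", "u1")
db.assign("s2", "u1")
db.notify_not_current("s2", "u1")
db.notify_unresponsive("c1", "s1")
```

Whether an unresponsive node's copies should also be marked stale is left open here, since the claim text does not specify it.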
Specification