Data storage array employing block checksums and dynamic striping
Abstract
A storage system may include a plurality of storage devices each having a plurality of addressable locations for storing data. A storage controller may be coupled to the storage devices and configured to store and retrieve data from the storage devices. An indirection map may be stored within the system having a plurality of map entries each configured to map a virtual address to a physical address on the storage devices. Each map entry may also store a checksum for data stored at the physical address indicated by the map entry. The storage controller may receive storage requests specifying a virtual address and may access the indirection map for each storage request to obtain the corresponding physical address and checksum. Dynamic striping may be employed so that new writes form new parity groups. Thus, stripes of various sizes may be supported by the storage system.
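The indirection map described in the abstract can be pictured with a minimal sketch. The names `MapEntry` and `IndirectionMap`, the use of CRC-32 as the checksum, and the `(device, offset)` form of a physical address are all assumptions made for illustration; the source does not specify them.

```python
import zlib
from dataclasses import dataclass


@dataclass
class MapEntry:
    # Assumed layout: (device index, block offset on that device).
    physical_address: tuple
    # Checksum of the data stored at physical_address (CRC-32 is an assumption).
    checksum: int


class IndirectionMap:
    """Hypothetical in-memory map from virtual address to MapEntry."""

    def __init__(self):
        self._entries = {}

    def update(self, virtual_address, physical_address, data):
        # The address and the checksum are stored together in one entry.
        self._entries[virtual_address] = MapEntry(physical_address, zlib.crc32(data))

    def lookup(self, virtual_address):
        # One access yields both the physical address and the checksum.
        return self._entries[virtual_address]


imap = IndirectionMap()
imap.update(0x10, (2, 512), b"hello")
entry = imap.lookup(0x10)
```

Keeping the checksum in the same entry as the address means a request that already has to translate its virtual address pays no extra lookup to find the checksum.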
28 Claims
-
1. A storage system, comprising:
-
a plurality of storage devices, wherein each of the storage devices comprises a plurality of addressable locations for storing data, each addressable location having a physical address;
a storage controller coupled to said plurality of storage devices, wherein said storage controller is configured to store data to and retrieve data from said plurality of storage devices;
a storage configured to store an indirection map comprising a plurality of map entries, wherein each of said map entries maps a virtual address to one of said physical addresses, and wherein each map entry stores a checksum for data stored at the physical address indicated by that map entry; and
wherein said storage controller is further configured to receive storage requests specifying a virtual address, and wherein said storage controller is configured to access said indirection map for each storage request to obtain the corresponding physical address and to obtain or update the corresponding checksum.
2. The storage system as recited in claim 1, wherein:
said storage controller is configured to store a first stripe of data as a first plurality of data stripe units across ones of said plurality of storage devices;
wherein each data stripe unit is stored at a different one of said physical addresses;
for each data stripe unit, one of said map entries in said indirection map is configured to map a virtual address to the physical address where that data stripe unit is written and to store the checksum for that data stripe unit.
-
-
3. The storage system as recited in claim 2, wherein when one of said storage requests is a write transaction modifying a subset of said first plurality of data stripe units, said storage controller is configured to store said subset of said first plurality of data stripe units modified by the write transaction as a new stripe to new physical addresses across ones of said plurality of storage devices, wherein said storage controller is configured to update the map entries in said indirection map for each modified data stripe unit to store the checksum for each modified data stripe unit and to indicate the new physical address for each modified data stripe unit.
-
4. The storage system as recited in claim 3, wherein said first plurality of data stripe units includes a first plurality of data blocks and a first parity block which is calculated for said first plurality of data blocks, wherein said storage controller is configured to calculate a new parity block for said subset of said first plurality of data blocks modified by the write transaction, and wherein said storage controller is configured to only store said subset of said first plurality of data blocks modified by the write transaction and said new parity block as the new stripe.
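The partial-stripe write recited in claims 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: parity is taken to be byte-wise XOR, the checksum is CRC-32, and `write_partial_stripe`, `devices`, and `free_addresses` are hypothetical names. The sketch records map entries for the data blocks only.

```python
import zlib
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equal-length blocks (XOR parity is an assumption)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_partial_stripe(devices, imap, free_addresses, modified):
    """Store only the modified blocks plus a new parity block as a new stripe.

    modified maps virtual address -> new data; free_addresses lists unused
    (device, offset) locations. Old blocks are left in place, not overwritten.
    """
    parity = xor_parity(list(modified.values()))
    for vaddr, data in modified.items():
        device_id, offset = free_addresses.pop(0)
        devices[device_id][offset] = data
        # Repoint the virtual address and record the new checksum.
        imap[vaddr] = ((device_id, offset), zlib.crc32(data))
    device_id, offset = free_addresses.pop(0)
    devices[device_id][offset] = parity
    return (device_id, offset)


devices = {0: {0: b"\x01\x01"}, 1: {0: b"\x02\x02"}, 2: {0: b"\x04\x04"}, 3: {0: b"\x07\x07"}}
imap = {
    "A": ((0, 0), zlib.crc32(b"\x01\x01")),
    "B": ((1, 0), zlib.crc32(b"\x02\x02")),
    "C": ((2, 0), zlib.crc32(b"\x04\x04")),
}
parity_addr = write_partial_stripe(
    devices, imap, [(0, 1), (1, 1), (2, 1)], {"A": b"\x0f\x0f", "B": b"\xf0\xf0"}
)
```

In this example the new stripe holds two data blocks and one parity block, while the unmodified block C keeps its original mapping, so the array ends up with stripes of different sizes.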
-
5. The storage system as recited in claim 4, wherein each of said entries in said indirection map further comprises a pointer to a next entry in a same parity group.
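One way to picture the per-entry pointer of claim 5 is a linked chain through the map entries of a parity group. The field name `next_in_group`, the use of virtual addresses as pointers, and the traversal helper are assumptions; the claim does not say whether the chain is circular or terminated, so the sketch tolerates both.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MapEntry:
    physical_address: tuple
    checksum: int
    # Hypothetical pointer: virtual address of the next entry in the group.
    next_in_group: Optional[str] = None


def parity_group_members(imap, start):
    """Follow next_in_group pointers from start and list the group's members."""
    members = []
    vaddr = start
    # Stop at a terminator (None) or when a circular chain wraps around.
    while vaddr is not None and vaddr not in members:
        members.append(vaddr)
        vaddr = imap[vaddr].next_in_group
    return members


imap = {
    "A": MapEntry((0, 1), 0x1111, next_in_group="B"),
    "B": MapEntry((1, 1), 0x2222, next_in_group="P"),
    "P": MapEntry((2, 1), 0x3333, next_in_group="A"),  # circular in this example
}
group = parity_group_members(imap, "B")
```

With such a chain, the blocks needed to rebuild a lost member can be found from any surviving entry without a separate stripe table.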
-
6. The storage system as recited in claim 1, wherein said storage controller is configured to receive a write transaction as one of said storage requests, wherein said write transaction comprises one or more data blocks to be written and one or more virtual addresses for the one or more data blocks;
and wherein, for each data block of the write transaction, said storage controller is further configured to store a checksum in an entry in said indirection map specifying one of said physical addresses at which that data block is written.
-
7. The storage system as recited in claim 6, wherein the write transaction further comprises the checksum for each data block to be written, and wherein said storage controller is configured to verify that each of the one or more data blocks to be written matches its corresponding checksum before writing that data block to its physical address.
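The verify-before-write behavior of claim 7 might look like the sketch below. CRC-32 stands in for the unspecified checksum algorithm, and `verified_write` and `ChecksumMismatch` are hypothetical names.

```python
import zlib


class ChecksumMismatch(Exception):
    """Raised when a block does not match the checksum supplied with it."""


def verified_write(device, imap, vaddr, paddr, data, host_checksum):
    # Reject the block before it reaches the device if it was corrupted
    # somewhere between the host and the controller.
    if zlib.crc32(data) != host_checksum:
        raise ChecksumMismatch(f"block for virtual address {vaddr!r} failed verification")
    device[paddr] = data
    # The verified checksum is stored alongside the physical address.
    imap[vaddr] = (paddr, host_checksum)


device, imap = {}, {}
verified_write(device, imap, "A", 7, b"payload", zlib.crc32(b"payload"))

try:
    verified_write(device, imap, "B", 8, b"payl0ad", zlib.crc32(b"payload"))
    rejected = False
except ChecksumMismatch:
    rejected = True
```

Checking before the write means a corrupted block never reaches the device or the map.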
-
8. The storage system as recited in claim 1, wherein said storage controller is configured to receive, as one of said storage requests, a read request from a host system, wherein said read request comprises a virtual address for a data block to be read, wherein said storage controller is further configured to obtain, in a single access of said indirection map, the checksum and physical address for the data block to be read, and wherein said storage controller is configured to obtain the data block to be read from its physical address and return it and its checksum to the host system in response to the read transaction.
-
9. The storage system as recited in claim 1, wherein said plurality of storage devices comprise disk drives.
-
10. The storage system as recited in claim 1, wherein said storage configured to store an indirection map comprises one or more of said storage devices.
-
11. The storage system as recited in claim 10, wherein said storage controller is configured to cache a portion or all of said indirection map in a memory.
-
12. A system, comprising:
-
a plurality of storage devices configured in an array; and
an array controller coupled to said plurality of storage devices, wherein said array controller is configured to store a first stripe of data as a first parity group comprising a first plurality of data stripe units across ones of said plurality of storage devices, and wherein said array controller is further configured to store a checksum for each of said first plurality of data stripe units;
wherein said first plurality of data stripe units includes a first plurality of data blocks and a first parity block which is calculated for said first plurality of data blocks;
wherein said array controller is configured to receive a write transaction modifying a subset of said first plurality of data blocks;
wherein said array controller is configured to calculate a new parity block for said subset of said first plurality of data blocks modified by the write transaction;
wherein, in response to the write transaction, said array controller is configured to store only said subset of said first plurality of data blocks modified by the write transaction and said new parity block as a new parity group to new locations across ones of said plurality of storage devices; and
wherein said array controller is configured to store a new checksum for each block of said new parity group.
collecting a plurality of existing parity groups each one of which comprises a non-default number of data blocks stored across said storage devices;
forming a plurality of new parity groups from said plurality of existing parity groups, wherein each one of said plurality of new parity groups comprises a default number of data blocks;
calculating a plurality of new parity blocks for each one of said new parity groups;
storing each one of said plurality of new parity groups including said new parity blocks to new locations across ones of said plurality of storage devices; and
wherein said array controller is configured to store checksums in indirection map entries for each data and parity block of the new parity groups.
-
-
16. The data storage subsystem as recited in claim 15, wherein said array controller is further configured to maintain a plurality of versions of said plurality of existing parity groups which existed prior to a modification of ones of said data blocks in said plurality of existing parity groups.
-
17. The data storage subsystem as recited in claim 13, wherein said array controller is configured to receive a read request for one of said plurality of first data stripe units, wherein said read request specifies a virtual address for the requested data stripe unit, and wherein said array controller is configured to access said indirection map to obtain the physical address and checksum for the requested data stripe unit.
-
18. The data storage subsystem as recited in claim 12, wherein each one of said plurality of storage devices includes a disk head unit configured for reading and writing data, and wherein said array controller is further configured to select ones of a plurality of new locations closest in proximity to said disk head unit.
-
19. A method for storing data in a data storage subsystem, comprising:
-
storing a first stripe of data as a first plurality of data stripe units across a plurality of storage devices;
wherein said first plurality of data stripe units includes a first plurality of data blocks and a first parity block which is calculated for said first plurality of data blocks;
storing entries in an indirection map for each data stripe unit, wherein each entry maps a virtual address to a physical address for one of the data stripe units, and wherein each entry further stores a checksum for the data stripe unit corresponding to that entry;
receiving a write transaction specifying the virtual addresses of a subset of said first plurality of data blocks;
calculating a new parity block for said subset of said first plurality of data blocks;
storing only said subset of said first plurality of data blocks modified by the write transaction and said new parity block as a new parity group to new physical addresses across ones of said plurality of storage devices; and
updating the entries in the indirection map for the data blocks modified by the write transaction to indicate the new physical address and checksum for each modified data block.
20. The method as recited in claim 19, further comprising:
storing a second stripe of data as a second plurality of data stripe units across said ones of said plurality of storage devices, wherein said second plurality of data stripe units includes a second plurality of data blocks, which is different in number than said first plurality of data blocks, and a second parity block which is calculated for said second plurality of data blocks; and
storing entries in said indirection map for each second data stripe unit, wherein each entry for one of the second data stripe units maps a virtual address to a physical address, and stores a checksum for the second data stripe unit corresponding to that entry.
-
-
21. The method as recited in claim 19, further comprising remapping a plurality of parity groups by:
-
collecting a plurality of existing parity groups each one of which comprises a non-default number of data blocks stored across said storage devices;
forming a plurality of new parity groups from said plurality of existing parity groups, wherein each one of said plurality of new parity groups comprises a default number of data blocks;
calculating a plurality of new parity blocks for each one of said new parity groups; and
storing each one of said plurality of new parity groups and said new parity blocks to new physical addresses across ones of said plurality of storage devices.
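The remapping steps above can be sketched as a regrouping pass. XOR parity, the function name `remap_parity_groups`, and the decision to hold back blocks that do not fill a complete default-sized group are assumptions of this sketch; writing the new groups to new physical addresses is left out.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equal-length blocks (XOR parity is an assumption)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def remap_parity_groups(existing_groups, default_size):
    """Regroup data blocks from non-default-sized groups into default-sized ones.

    existing_groups is a list of lists of data blocks (old parity excluded).
    Returns (new_groups, leftover), where each new group is (data_blocks, parity).
    """
    # Collect: gather the data blocks of every group with a non-default size.
    pool = [blk for group in existing_groups if len(group) != default_size for blk in group]
    new_groups = []
    # Form: cut the pool into groups of exactly default_size blocks.
    while len(pool) >= default_size:
        data, pool = pool[:default_size], pool[default_size:]
        # Calculate: each new group gets its own parity block.
        new_groups.append((data, xor_parity(data)))
    return new_groups, pool


existing = [[b"\x01", b"\x02"], [b"\x04"], [b"\x08"]]
new_groups, leftover = remap_parity_groups(existing, default_size=4)
```

Here three short groups collapse into one four-block group with a single parity block, reclaiming the space the two extra parity blocks occupied.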
-
-
22. The method as recited in claim 21, further comprising updating the entries in the indirection map for each block of the new parity groups to indicate the new physical address of each block.
-
23. The method as recited in claim 19, further comprising maintaining a plurality of versions of said first plurality of data stripe units which existed prior to a modification of ones of said first plurality of data blocks.
-
24. The method as recited in claim 23, further comprising storing entries in said indirection map for each of said plurality of versions, wherein each entry for each of said plurality of versions maps a virtual address to a physical address, and stores a checksum for the data stripe unit corresponding to that entry.
-
25. The method as recited in claim 19, further comprising:
-
receiving a read request from a host system specifying a virtual address for one of the data stripe units;
accessing said indirection map to obtain the physical address mapped to the specified virtual address and to obtain the checksum for the requested data stripe unit;
obtaining the requested data stripe unit at the physical address mapped to the specified virtual address from one of the storage devices;
returning the requested data stripe unit and corresponding checksum to the host system.
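The read path in the steps above can be sketched as a single map lookup followed by a device read. The function name `read_block`, the dictionary-based devices, and CRC-32 as the checksum are assumptions for illustration.

```python
import zlib


def read_block(devices, imap, vaddr):
    """Return (data, checksum) for the block at a virtual address."""
    # A single map access yields both the physical address and the checksum.
    (device_id, offset), checksum = imap[vaddr]
    data = devices[device_id][offset]
    # The checksum comes from the map, not from the device, so the host
    # can detect data that the device returned incorrectly.
    return data, checksum


devices = {0: {5: b"stripe unit"}}
imap = {"A": ((0, 5), zlib.crc32(b"stripe unit"))}

data, checksum = read_block(devices, imap, "A")
host_accepts = zlib.crc32(data) == checksum

# Simulate silent corruption on the device: the stored checksum no longer matches.
devices[0][5] = b"stripe unit".replace(b"u", b"v")
bad_data, bad_checksum = read_block(devices, imap, "A")
host_rejects = zlib.crc32(bad_data) != bad_checksum
```

Because the checksum travels with the data to the host, the comparison covers the whole path from the device to the host rather than only the device itself.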
-
-
26. A method for storing data in a data storage subsystem, comprising:
-
storing a first plurality of data stripe units across a plurality of storage devices, wherein said first plurality of data stripe units includes a first plurality of data blocks;
storing entries in an indirection map for each data block, wherein each entry maps a virtual address to a physical address for one of the data blocks, and wherein each entry further stores a checksum for the data block corresponding to that entry;
receiving a read request specifying the virtual address of one of said first plurality of data blocks;
accessing said indirection map to obtain the physical address mapped to the specified virtual address and to obtain the corresponding checksum; and
in response to the read request, returning the data block at the physical address mapped to the specified virtual address and returning the corresponding checksum.
27. The method as recited in claim 26, further comprising:
receiving a write transaction specifying a virtual address and a data write for the data block corresponding to the specified virtual address;
performing the data write to a different physical address than currently mapped to the specified virtual address when the write transaction is received;
updating the indirection map entry corresponding to the specified virtual address to indicate the different physical address and a new checksum for the data block modified by the write transaction.
-
-
28. The method as recited in claim 26, wherein said first plurality of data blocks is part of a parity group also comprising a first parity block which is calculated for said first plurality of data blocks, the method further comprising:
-
receiving a write transaction specifying the virtual addresses of a subset of said first plurality of data blocks;
calculating a new parity block for said subset of said first plurality of data blocks;
storing only said subset of said first plurality of data blocks modified by the write transaction and said new parity block as a new parity group to new physical addresses across ones of said plurality of storage devices; and
updating the entries in the indirection map for the data blocks modified by the write transaction to indicate the new physical address and checksum for each modified data block.
-
Specification