Mass storage data integrity-assuring technique utilizing sequence and revision number metadata
Abstract
Sequence number metadata which identifies an input/output (I/O) operation, such as a full stripe write on a redundant array of independent disks (RAID) mass storage system, and revision number metadata which identifies an I/O operation such as a read modify write operation on user data recorded in components of the stripe, are used in an error detection and correction technique, along with parity metadata, to detect and correct silent errors arising from inadvertent data path and drive data corruption. An error arising after a full stripe write is detected by a difference in sequence numbers for all of the components of user data in the stripe. An error arising after a read modify write is detected by a revision number which occurred before the correct revision number. The errors in both cases are corrected by using the parity metadata for the entire collection of user data and the correct information from the other components of the user data and metadata, and applying this information to an error correcting algorithm. The technique may be executed in conjunction with a read I/O operation without incurring a substantial computational overhead penalty.
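The metadata layout described in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the class and function names (`UserBlock`, `ParityBlock`, `Stripe`, `full_stripe_write`, `read_modify_write`, `xor_blocks`) are hypothetical, parity is assumed to be a byte-wise XOR as in RAID 5, and the parity block is assumed to keep one copy of each user block's revision number.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks (assumed RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


@dataclass
class UserBlock:
    data: bytes
    seq: int      # identifies the full stripe write that wrote this block
    rev: int = 0  # incremented by each read-modify-write of this block


@dataclass
class ParityBlock:
    parity: bytes
    seq: int         # same sequence number as the user blocks of the stripe
    revs: List[int]  # assumed: one copy of each user block's revision number


@dataclass
class Stripe:
    users: List[UserBlock]
    parity: ParityBlock


def full_stripe_write(chunks: List[bytes], seq: int) -> Stripe:
    """Write every user block and the parity block with one sequence number."""
    users = [UserBlock(chunk, seq) for chunk in chunks]
    return Stripe(users, ParityBlock(xor_blocks(chunks), seq, [0] * len(chunks)))


def read_modify_write(stripe: Stripe, index: int, new_data: bytes) -> None:
    """Update one user block, its revision number, and the parity block."""
    block = stripe.users[index]
    # New parity = old parity XOR old data XOR new data.
    stripe.parity.parity = xor_blocks([stripe.parity.parity, block.data, new_data])
    block.data = new_data
    block.rev += 1
    stripe.parity.revs[index] = block.rev


stripe = full_stripe_write([b"\x01\x02", b"\x10\x20", b"\xff\x00"], seq=7)
read_modify_write(stripe, 1, b"\xaa\xbb")
```

After the partial write, only block 1 and the parity block carry the new revision number, while all four structures still share sequence number 7.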
21 Claims
1. A method of creating metadata from user data to detect errors arising from input/output (I/O) operations performed on information storage media of a mass storage system, comprising the steps of:
creating a plurality of user data structures, each user data structure containing user data and metadata, the metadata of each user data structure describing the user data contained in that same user data structure;
creating a parity data structure associated with the plurality of user data structures, the parity data structure containing metadata which describes separately the user data and metadata in each of the user data structures with which the parity data structure is associated;
writing the plurality of user data structures and the associated parity data structure to the storage media as an integral group of related data structures in a group-write I/O operation;
including a sequence number as part of the metadata in each user data structure and in the parity data structure of the group, the sequence number identifying the group-write I/O operation;
including a revision number as part of the metadata in each user data structure and in the parity data structure of the group, the revision number identifying a partial-write I/O operation in which the user data in each of less than all of the user data structures of the group is written while the user data in the other remaining user data structures of the group is not written; and
including parity information in the parity data structure which describes the parity of the collective user data in all of the user data structures of the group.
2. A method as defined in claim 1, further comprising the steps of:
reading the sequence number from each of the user data structures involved in the partial-write I/O operation and from the parity data structure during the partial-write I/O operation; and
determining whether the sequence numbers read from the user data structures and from the parity data structure match.
3. A method as defined in claim 2, further comprising the steps of:
reading the sequence number of another user data structure of the group when the sequence numbers read from one user data structure and the parity data structure do not match; and
determining a correct sequence number which is equal to two matching ones of the three sequence numbers read from the two user data structures and the parity data structure of the group.
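The two-out-of-three rule in claim 3 amounts to a majority vote. The sketch below is one way to express it; the function name `correct_sequence_number` is hypothetical, and returning `None` when no two values agree is an assumption, since the claim does not address that case.

```python
from collections import Counter
from typing import Optional, Sequence


def correct_sequence_number(seqs: Sequence[int]) -> Optional[int]:
    """Return the sequence number held by at least two structures, else None."""
    if not seqs:
        return None
    value, count = Counter(seqs).most_common(1)[0]
    return value if count >= 2 else None


# Two user blocks and the parity block; the second user block is stale.
agreed = correct_sequence_number([7, 6, 7])
```

Here `agreed` is 7, so the structure carrying 6 is the one treated as incorrect.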
4. A method as defined in claim 3 used additionally for correcting errors arising from I/O operations, further comprising the step of:
using the user data from the user data structure and the parity information from the parity data structure which both have correct sequence numbers to construct correct user data and metadata information for writing in another user data structure of the group which has the incorrect sequence number; and
using the user data from the user data structures of the group which have correct sequence numbers to construct correct parity information for the parity data structure which has an incorrect sequence number.
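Both reconstruction steps of claim 4 can be illustrated with XOR parity. This is a sketch under the assumption that the parity information is a byte-wise XOR of the user data, which the claim itself does not state; the helper names `xor_blocks`, `rebuild_user_data`, and `rebuild_parity` are hypothetical.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_user_data(parity: bytes, good_blocks: List[bytes]) -> bytes:
    """Recover one bad user block from the parity and all other user blocks."""
    return xor_blocks([parity, *good_blocks])


def rebuild_parity(user_blocks: List[bytes]) -> bytes:
    """Recompute the parity when the parity block itself is the bad one."""
    return xor_blocks(user_blocks)


d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = rebuild_parity([d0, d1, d2])
recovered = rebuild_user_data(parity, [d0, d2])
```

In this example `recovered` equals `d1`, the block assumed to have carried the incorrect sequence number.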
5. A method as defined in claim 4, further comprising the steps of:
writing the constructed correct information to the one of the user data structures or the parity data structure which previously had an incorrect sequence number.
6. A method as defined in claim 5, further comprising the step of:
executing the I/O operation after performing the aforesaid step of writing the constructed correct information.
7. A method as defined in claim 2, further comprising the steps of:
reading the revision number from the user data structures involved in the partial-write I/O operation and from the parity data structure during the partial-write I/O operation when the sequence numbers match; and
thereafter determining whether the revision numbers from the user data structures and the parity data structure match.
8. A method as defined in claim 7, further comprising the step of:
executing completely the partial-write I/O operation when the sequence numbers and the revision numbers match.
9. A method as defined in claim 7, further comprising the step of:
determining which of the revision numbers is indicative of a later-occurring partial-write I/O operation when the revision numbers do not match; and
thereafter attributing the revision number indicative of the later-occurring partial-write I/O operation as a correct revision number.
10. A method as defined in claim 9, further comprising the steps of:
determining whether the revision number read from the parity data structure occurred before the correct revision number.
11. A method as defined in claim 10 used additionally for correcting errors arising from I/O operations, further comprising the steps of:
reading the user data and parity information from the other user data structures of the group and reading the parity information from the parity data structure, when the revision number read from a user data structure involved in the partial-write I/O operation occurred before the correct revision number; and
thereafter constructing the correct user data for the user data structure involved in the partial-write I/O operation from the user data read from the other user data structures of the group and the parity information read from the parity data structure.
12. A method as defined in claim 11, further comprising the step of:
executing the partial-write I/O operation after performing the aforesaid step of writing the correct user data in the user data structure.
13. A method as defined in claim 10 used additionally for correcting errors arising from I/O operations, further comprising the steps of:
reading the user data and metadata from the user data structures of the group when the revision number read from the parity data structure occurred before the correct revision number; and
thereafter constructing correct metadata and parity information for the parity data structure from the user data and metadata read from the user data structures of the group.
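The decision made across claims 9 through 13 can be summarized as a comparison of two revision numbers: the later one is taken as correct, and whichever structure holds the earlier one is repaired. The sketch below assumes revision numbers increase monotonically and never wrap around, which the claims do not specify; the function name and the returned labels are hypothetical.

```python
def classify_revisions(user_rev: int, parity_rev: int) -> str:
    """Compare a user block's revision number with the parity block's copy."""
    if user_rev == parity_rev:
        return "match"
    # The later revision is treated as correct; the holder of the earlier
    # revision is stale (assumes no counter wrap-around).
    return "repair_user_data" if user_rev < parity_rev else "repair_parity"


decision = classify_revisions(user_rev=2, parity_rev=3)
```

Here `decision` is `"repair_user_data"`: the user block missed a write that the parity block recorded, so its data would be rebuilt from the other user blocks and the parity information (claim 11). The opposite result, `"repair_parity"`, corresponds to claim 13.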
14. A method as defined in claim 13, further comprising the steps of:
writing the correct metadata including the correct revision number and the correct parity information in the parity data structure.
15. A method as defined in claim 14, further comprising the step of:
executing the partial-write I/O operation after performing the aforesaid step of writing the correct metadata and parity information in the parity data structure.
16. A method as defined in claim 1 wherein the mass storage system comprises a redundant array of independent disks (RAID) mass storage system having at least one redundancy group which includes a plurality of disk drives, and the group-write operation is a full stripe write operation in which a stripe is written to the redundancy group.
17. A method as defined in claim 16 wherein each user data structure and the parity data structure of the stripe are each written on separate disk drives in the redundancy group.
18. A method as defined in claim 17 wherein the partial-write operation is a read modify write operation performed on each of the less than all of the user data structures of the group.
19. A method as defined in claim 17 wherein the RAID mass storage system includes a cache memory apart from the disk drives of each redundancy group, and said method further comprises the steps of:
recording sequence numbers and revision numbers identical to those within the parity data structure in the cache memory; and
referring to the sequence number and revision number in the cache memory during a read I/O operation of the user data structures written on the disk drives.
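The cache check in claim 19 lets a read be validated without fetching the parity block from disk. A minimal sketch follows, assuming a hypothetical in-memory dictionary keyed by stripe and block index; the names `MetadataCache`, `record_metadata`, and `matches_cache` are illustrative and not taken from the patent.

```python
from typing import Dict, Tuple

# Hypothetical cache layout: (stripe_id, block_index) -> (sequence, revision)
MetadataCache = Dict[Tuple[int, int], Tuple[int, int]]


def record_metadata(cache: MetadataCache, stripe_id: int, block_index: int,
                    seq: int, rev: int) -> None:
    """Mirror the parity block's sequence and revision numbers in the cache."""
    cache[(stripe_id, block_index)] = (seq, rev)


def matches_cache(cache: MetadataCache, stripe_id: int, block_index: int,
                  seq: int, rev: int) -> bool:
    """Check metadata read from disk against the cached copy.

    A cache miss is reported as a mismatch in this sketch.
    """
    return cache.get((stripe_id, block_index)) == (seq, rev)


cache: MetadataCache = {}
record_metadata(cache, stripe_id=0, block_index=1, seq=7, rev=2)
fresh = matches_cache(cache, 0, 1, seq=7, rev=2)
stale = matches_cache(cache, 0, 1, seq=7, rev=1)
```

In this sketch a mismatch or a cache miss would send the caller back to the on-disk parity data structure for the full comparison.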
20. A method of detecting and correcting errors arising from input/output (I/O) operations on user data stored in storage media of a mass information storage system, comprising the steps of:
writing sequence number metadata with each of a plurality of associated groups of user data to identify a group-write I/O operation which wrote the groups of user data;
writing revision number metadata with the user data to identify a partial-write I/O operation on each of less than all of the groups of user data while other remaining ones of the groups of user data are not written;
writing parity metadata associated with each group of user data to describe each group of user data;
writing separate sequence number metadata, revision number metadata and parity metadata on the storage media at a separate location from the groups of user data, the separate sequence number metadata and revision number metadata being substantial duplicates of the sequence number metadata and revision number metadata associated with each group of user data, the separate parity metadata describing the collective user data of all of the groups;
correcting the user data of any group having an incorrect sequence number which is different from the sequence numbers of two other groups by using the separate parity metadata and user data from the groups having correct sequence numbers; and
correcting the user data of any group having an incorrect revision number which occurred before the separate revision number metadata by using the separate parity metadata and the user data from all of the groups having correct revision numbers.
21. A redundant array of independent disks (RAID) mass information storage system, comprising an array controller and at least one redundancy group connected to the array controller, each redundancy group including a plurality of disk drives as storage media and a disk controller connected to the disk drives, the array controller and the disk controller each including a processor executing programmed instructions to detect and correct errors arising from input/output (I/O) operations on user data stored in a full stripe written on the plurality of disk drives of the redundancy group, the processors operatively:
writing sequence number metadata to each of the plurality of disk drives with the user data to identify the stripe which contains the user data collectively written to the plurality of disk drives;
writing revision number metadata to each of the plurality of disk drives with the user data to identify each read modify write I/O operation performed on the user data of a single disk drive after the stripe had been previously written;
writing separate sequence number metadata, revision number metadata and parity metadata to a separate disk drive of the redundancy group as part of the stripe, the separate sequence number metadata and revision number metadata being substantial duplicates of the sequence number metadata and revision number metadata associated with the user data of the stripe contained on each other disk drive of the redundancy group, the separate parity metadata of the separate disk drive describing all of the user data of the stripe contained on all of the other disk drives;
correcting the user data of the stripe obtained from a disk drive when the user data of the stripe has a sequence number which is different from the sequence numbers of the user data of the stripe obtained from two other disk drives by using the separate parity metadata and user data of the stripe having correct sequence numbers in an error correcting algorithm; and
correcting the user data of the stripe obtained from a disk drive when the user data of the stripe has an incorrect revision number which is different from the separate revision number metadata by using the parity metadata obtained from the separate disk drive and the user data from the disk drives of the stripe having correct revision numbers in an error correcting algorithm.
Specification