Parallel computer system including parallel storage subsystem including facility for correction of data in the event of failure of a storage device in parallel storage subsystem
Abstract
A parallel computer system comprising a plurality of processing elements and an input/output system interconnected by a communications network. The input/output system includes a plurality of input/output devices connected to the communications network for receiving messages from, and transmitting messages to, the processing elements over the communications network. The processing element during the storage operation generates a series of messages for transmission over the communications network to successive input/output devices. Each message includes a data item from one of a series of storage locations. The processing element generates for each stripe of a selected number of data items a check value for transmission in a message to another of the input/output devices. During a reconstruction operation, the processing element receives the data items and the check values from at least some of the input/output devices and performs a reconstruction operation in connection with each check value and a stripe of data values for which the check value was generated during the storage operation thereby to reconstruct data items stored in at least one other of the input/output devices.
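The storage and reconstruction operations summarized above can be illustrated with a minimal sketch. The abstract does not say how the check value is computed; the example assumes a bytewise XOR parity, which is one common choice, and assumes that exactly one data item per stripe is lost. The names `check_value`, `reconstruct`, and `STRIPE_WIDTH` are hypothetical and do not come from the patent.

```python
from functools import reduce
from operator import xor

STRIPE_WIDTH = 3  # assumed "predetermined number" of data items per stripe


def check_value(stripe):
    """Return the XOR of all data items in a stripe (assumed check function)."""
    return reduce(xor, stripe, 0)


def reconstruct(surviving_items, check):
    """Recover the single missing item from the surviving items and the check value."""
    return reduce(xor, surviving_items, check)


# Storage operation: split a processing element's memory into stripes
# and generate one check value per stripe.
memory = [0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC]
stripes = [memory[i:i + STRIPE_WIDTH] for i in range(0, len(memory), STRIPE_WIDTH)]
checks = [check_value(s) for s in stripes]

# Reconstruction operation: pretend the device holding slot 1 of every
# stripe has failed, and rebuild its items from the other devices.
lost = 1
recovered = [
    reconstruct(s[:lost] + s[lost + 1:], c)
    for s, c in zip(stripes, checks)
]
```

With XOR parity, a stripe can tolerate the loss of any one item, including the check value itself, because XOR-ing the surviving values reproduces the missing one.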
4 Claims
1. A parallel computer system comprising a plurality of processing elements and an input/output system interconnected by a communications network:
A. the input/output system including a plurality of input/output devices connected to said communications network for receiving messages from, and transmitting messages to, said processing elements over said communications network; and
B. each processing element including a memory including a series of storage locations for storing data items, the data items in the series of storage locations defining a series of stripes each including data items in a predetermined number of successive storage locations, the processors of all of said processing elements;
i. during a storage operation generating a series of messages for transmission over said communications network to successive input/output devices, each message including a data item from one of said storage locations, each processor generating for each stripe of data items a check value for transmission in a message to another of said input/output devices; and
ii. during a reconstruction operation, receiving said data items and said check values from at least some of said input/output devices and performing a reconstruction operation in connection with each check value and a stripe of data values for which the check value was generated during the storage operation thereby to reconstruct data items stored in at least one other of said input/output devices.
3. A parallel computer system comprising a plurality of processing elements and an input/output system interconnected by a communications network:
A. the input/output system including a plurality of input/output devices defining a reconstruction group, at least some of said input/output devices during a reconstruction operation transmitting messages over said communications network to each processing element including a different data item from a stripe of data items or a check value associated therewith; and
B. each processing element, during a reconstruction operation, receiving data items and check values transmitted thereto by said input/output devices and performing a reconstruction operation in connection with each check value and a stripe of data values for which the check value was generated during a storage operation thereby to reconstruct data items in at least one other of said input/output devices.
4. A parallel computer system comprising a plurality of processing elements and an input/output system interconnected by a communications network:
A. the input/output system including a plurality of input/output devices connected to said communications network for receiving messages from, and transmitting messages to, said processing elements over said communications network; and
B. each processing element including a memory including a series of storage locations for storing data items, the data items in the series of storage locations defining a series of stripes each including data items in a predetermined number of successive storage locations, each said processing element during a storage operation generating a series of messages for transmission over said communications network to successive input/output devices in said input/output system, each message including a data item from one of said storage locations, the processing element generating for each stripe of data items a check value for transmission in a message to another of said input/output devices, thereby to store different data items in each stripe and the associated check value on diverse ones of the input/output devices.
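The placement described in claim 4, where each data item of a stripe goes to a successive input/output device and the check value goes to yet another device, might be sketched as follows. This is only an illustration under stated assumptions: devices are modeled as integer indices, the check value is an XOR parity (the claims do not name a check function), and the check value is sent to the device immediately after the last data device. The function `storage_messages` and its parameters are hypothetical.

```python
from collections import defaultdict
from functools import reduce
from operator import xor


def storage_messages(memory, stripe_width, first_device=0):
    """Yield (device, stripe_no, slot, value) messages for one processing element."""
    for stripe_no, start in enumerate(range(0, len(memory), stripe_width)):
        stripe = memory[start:start + stripe_width]
        # One message per data item, each addressed to the next device in turn.
        for slot, item in enumerate(stripe):
            yield (first_device + slot, stripe_no, slot, item)
        # The check value goes to a further device (assumed: the one after the data devices).
        yield (first_device + stripe_width, stripe_no, "check", reduce(xor, stripe, 0))


# Model each device as a dictionary keyed by (stripe number, slot).
devices = defaultdict(dict)
for device, stripe_no, slot, value in storage_messages([1, 2, 3, 4, 5, 6], stripe_width=3):
    devices[device][(stripe_no, slot)] = value
```

Because no device holds more than one value from a given stripe, the failure of a single device removes at most one value per stripe, which is the condition under which a single check value per stripe suffices for reconstruction.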
Specification