System and method for coordinated bringup of a storage appliance in a cluster configuration
Abstract
A system and method for coordinated bringup of a storage appliance in a storage appliance cluster. The repaired storage appliance, during its initialization, sets a variety of state values in a predetermined memory location comprising a state data structure, which is detected by a remote direct memory access read operation by the surviving storage appliance. By the use of the RDMA operations, the repaired storage appliance and surviving storage appliance coordinate the bringup and giveback of data servicing functionality.
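The handshake summarized above, and recited step by step in the independent claims, can be sketched as a small simulation. This is an illustrative sketch only, not the patented implementation: the names `SharedState`, `Cluster`, `run_bringup` and the string constants are hypothetical, and an ordinary in-process object stands in for the memory that the surviving appliance would read over RDMA.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical labels for the values the claims call "first state",
# "second state", "first message" and "second message".
BOOTING = "BOOTING"
RESERVATIONS_RELEASED = "RESERVATIONS_RELEASED"
DISKS_INITIALIZED = "DISKS_INITIALIZED"
GIVEBACK_DONE = "GIVEBACK_DONE"


@dataclass
class SharedState:
    """Stand-in for the predetermined memory location (assumption: one
    shared object replaces the RDMA-accessible state data structure)."""
    state: Optional[str] = None
    message: Optional[str] = None


@dataclass
class Cluster:
    mem: SharedState = field(default_factory=SharedState)
    reserved: bool = False            # survivor holds disk reservations
    serving_for_failed: bool = False  # survivor services the failed node's requests
    repaired_online: bool = False     # repaired node has finished booting
    log: List[str] = field(default_factory=list)

    def survivor_takeover(self) -> None:
        # Failure detected: reserve the disks and take over request servicing.
        self.reserved = True
        self.serving_for_failed = True
        self.log.append("survivor: takeover")

    def repaired_begin_boot(self) -> None:
        if self.reserved:
            # Reservations found: halt the boot and publish the first state.
            self.mem.state = BOOTING
            self.log.append("repaired: boot halted, wrote first state")
        else:
            self.repaired_online = True

    def survivor_poll(self) -> None:
        if self.mem.state == BOOTING:
            # First state read: release reservations, write second state.
            self.reserved = False
            self.mem.state = RESERVATIONS_RELEASED
            self.log.append("survivor: released reservations")
        elif self.mem.message == DISKS_INITIALIZED:
            # First message read: stop servicing, give back, write second message.
            self.serving_for_failed = False
            self.mem.message = GIVEBACK_DONE
            self.log.append("survivor: giveback complete")

    def repaired_poll(self) -> None:
        if self.mem.state == RESERVATIONS_RELEASED and self.mem.message is None:
            # Second state seen: initialize disks, write first message.
            self.mem.message = DISKS_INITIALIZED
            self.log.append("repaired: disks initialized")
        elif self.mem.message == GIVEBACK_DONE:
            # Second message seen: finish booting and start serving requests.
            self.repaired_online = True
            self.log.append("repaired: boot complete")


def run_bringup() -> Cluster:
    cluster = Cluster()
    cluster.survivor_takeover()
    cluster.repaired_begin_boot()
    for _ in range(4):  # bounded polling loop for the sketch
        if cluster.repaired_online:
            break
        cluster.survivor_poll()
        cluster.repaired_poll()
    return cluster
```

Calling `run_bringup()` walks both nodes through the exchange and returns the final cluster object, whose `log` lists the steps in the order they occurred. A real system would poll asynchronously and handle timeouts, which this sketch omits.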
25 Claims
1. A system, comprising:
a first storage system;
a second storage system coupled to the first storage system and configured to detect a failure of the first storage system, upon detection of the failure of the first storage system the second storage system placing reservations on data storage devices used by the first storage system, and the second storage system to begin servicing data access requests directed to the first storage system;
after repair of the first storage system, the first storage system to begin a boot operation, the first storage system to detect the reservations placed by the second storage system on the data storage devices, and in response to detecting the reservations halting the boot operation and writing to a memory a first state indicating that the first storage system is booting;
the second storage system reading the first state, and in response to reading the first state releasing the reservations and writing a second state into the memory;
in response to the second state written into the memory, the first storage system initializing the data storage devices, and writing a first message into the memory;
in response to the first message, the second storage system to no longer service the data access requests directed to the first storage system, the second storage system to release resources including the data storage devices to the first storage system, and the second storage system to write a second message into the memory; and
in response to the second message, the first storage system completes the boot operation and begins processing the data access requests directed to the first storage system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

13. A method for operating a first storage system operatively coupled to a second storage system, comprising:
detecting a failure of the first storage system by the second storage system;
placing reservations by the second storage system on data storage devices used by the first storage system, and the second storage system servicing data access requests directed to the first storage system;
after repair of the first storage system, beginning a boot operation by the first storage system, the first storage system detecting the reservations placed by the second storage system on the data storage devices, and in response to detecting the reservations halting the boot operation and writing to a memory a first state indicating that the first storage system is booting;
reading by the second storage system the first state, and in response to reading the first state, the second storage system releasing the reservations and writing a second state into the memory;
in response to the second state written into the memory, initializing by the first storage system the data storage devices, and writing a first message into the memory;
in response to the first message, the second storage system no longer servicing the data access requests directed to the first storage system, and the second storage system releasing resources including the data storage devices to the first storage system, and the second storage system writing a second message into the memory; and
in response to the second message, the first storage system completing the boot operation, and beginning processing the data access requests directed to the first storage system.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24

25. A computer readable non-transitory storage media, comprising:
said computer readable storage media containing program instructions for execution on a processor for a method of operating a first storage system operationally coupled to a second storage system, the program instructions for,
detecting a failure of the first storage system by the second storage system;
placing reservations by the second storage system on data storage devices used by the first storage system, and the second storage system servicing data access requests directed to the first storage system;
after repair of the first storage system, beginning a boot operation by the first storage system, the first storage system detecting the reservations placed by the second storage system on the data storage devices, and in response to detecting the reservations halting the boot operation and writing to a memory a first state indicating that the first storage system is booting;
reading by the second storage system the first state, and in response to reading the first state, the second storage system releasing the reservations and writing a second state into the memory;
in response to the second state written into the memory, initializing by the first storage system the data storage devices, and writing a first message into memory;
in response to the first message, the second storage system no longer servicing the data access requests directed to the first storage system, and the second storage system releasing resources including the data storage devices to the first storage system, and the second storage system writing a second message into the memory; and
in response to the second message, the first storage system completing the boot operation, and beginning processing the data access requests directed to the first storage system.
Specification