System and method for monitoring cluster partner boot status over a cluster interconnect
Abstract
A method for detecting an un-bootable first computer is described. A failed first computer initiates a boot procedure, and the boot procedure is controlled by boot firmware of the first computer. A virtual interface is established by the boot firmware, the virtual interface having boot status data written therein as the failed computer boots. A second computer reads the boot status data in the virtual interface using a remote direct memory access procedure to access data in the virtual interface. The second computer determines, in response to the boot status data, whether the boot procedure of the first computer failed. If the boot procedure failed, the second computer performs a failover routine; if it succeeded, the second computer allows the failed computer to complete its boot procedure. In response to the boot procedure succeeding, another connection between the first computer and the second computer is opened using higher level software than the boot firmware.
52 Citations
17 Claims
1. A method for detecting a failed computer, comprising:
initiating a boot procedure by a first computer, the boot procedure controlled by boot firmware of the first computer;
establishing a virtual interface by the boot firmware, the virtual interface having boot status data written therein as the first computer proceeds through the boot procedure; and
reading the boot status data in the virtual interface by a second computer, the second computer using a remote direct memory access procedure to access data in the virtual interface.
Dependent claims: 2, 3, 4, 5, 6
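The mechanism recited in claim 1 can be illustrated with a small in-process simulation. This is only a sketch under stated assumptions: the stage codes, the 5-byte record layout, and the names `VirtualInterface`, `BootFirmware`, and `read_partner_status` are hypothetical and do not come from the patent, and a plain `bytearray` stands in for the memory region that a real virtual interface would expose to remote direct memory access.

```python
import struct
from enum import IntEnum


class BootStage(IntEnum):
    """Hypothetical boot stage codes (not taken from the patent)."""
    POWER_ON = 0
    MEMORY_TEST = 1
    DEVICE_INIT = 2
    KERNEL_LOAD = 3
    BOOT_COMPLETE = 4
    BOOT_FAILED = 255


# Assumed record layout: 1-byte stage code, 4-byte update counter.
STATUS_FORMAT = "<BI"


class VirtualInterface:
    """Stand-in for a memory region exposed through a virtual interface."""

    def __init__(self):
        self.region = bytearray(struct.calcsize(STATUS_FORMAT))

    def rdma_read(self):
        # Models a one-sided read: the reader copies the bytes and no
        # code runs on the computer that owns the region.
        return bytes(self.region)


class BootFirmware:
    """First computer: writes status into the region as boot proceeds."""

    def __init__(self, vi):
        self.vi = vi
        self.counter = 0

    def report(self, stage):
        self.counter += 1
        struct.pack_into(STATUS_FORMAT, self.vi.region, 0, stage, self.counter)


def read_partner_status(vi):
    """Second computer: decode the status record read from the region."""
    stage, counter = struct.unpack(STATUS_FORMAT, vi.rdma_read())
    return BootStage(stage), counter
```

The point of the one-sided read is that the second computer can observe progress even when the first computer is too early in its boot (or too broken) to answer a request itself.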
7. A system to detect a failed computer, comprising:
a first computer, the first computer initiating a boot procedure controlled by boot firmware of the first computer;
a virtual interface established by the boot firmware, the virtual interface having boot status data written therein as the first computer proceeds through the boot procedure; and
a second computer to read the boot status data in the virtual interface by using a remote direct memory access procedure to access data in the virtual interface.
Dependent claims: 8, 9, 10, 11, 12
13. A computer readable media, comprising:
said computer readable media containing instructions for execution on a processor for the practice of a method of detecting a failed computer, the method having the steps of,
initiating a boot procedure by a first computer, the boot procedure controlled by boot firmware of the first computer;
establishing a virtual interface by the boot firmware, the virtual interface having boot status data written therein as the first computer proceeds through the boot procedure; and
reading the boot status data in the virtual interface by a second computer, the second computer using a remote direct memory access procedure to access data in the virtual interface.
14. A method for detecting a failed computer, comprising:
initiating a boot procedure by a first computer, the boot procedure controlled by boot firmware of the first computer;
establishing a virtual interface by the boot firmware, the virtual interface having boot status data written therein as the first computer proceeds through the boot procedure;
reading the boot status data in the virtual interface of the first computer by a second computer to ascertain whether the first computer's boot procedure is progressing normally;
in response, if the second computer determines that the first computer's boot procedure is progressing normally, the first computer completes its initialization routine; and
in response, if the second computer determines that the first computer's boot procedure is not progressing normally, the second computer will perform a failover routine.
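The decision recited in claim 14 can be sketched as a function over successive status samples. The claim does not define "progressing normally", so the criterion here is an assumption made for illustration: an explicit failure code, or an update counter that stops advancing for several polls, counts as not progressing. The names `Sample` and `monitor_partner`, the stage codes, and the stall threshold are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical stage codes; the patent does not specify any values.
BOOT_COMPLETE = 4
BOOT_FAILED = 255


@dataclass
class Sample:
    stage: int    # stage code read from the partner's virtual interface
    counter: int  # update counter, bumped on every status write


def monitor_partner(samples, max_stalled_polls=3):
    """Return "allow_boot" or "failover" for a sequence of status samples.

    Assumption: a boot counts as "not progressing normally" when it
    reports BOOT_FAILED, when the counter is unchanged for
    max_stalled_polls consecutive polls, or when the samples run out
    before BOOT_COMPLETE is seen.
    """
    stalled = 0
    last_counter = None
    for sample in samples:
        if sample.stage == BOOT_FAILED:
            return "failover"
        if sample.stage == BOOT_COMPLETE:
            return "allow_boot"
        if sample.counter == last_counter:
            stalled += 1
            if stalled >= max_stalled_polls:
                return "failover"
        else:
            stalled = 0
            last_counter = sample.counter
    return "failover"
```

For example, `monitor_partner([Sample(1, 1), Sample(2, 2), Sample(4, 3)])` lets the boot finish, while a sequence whose counter never changes triggers failover once the stall threshold is reached.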
15. A method for detecting a failed computer, comprising:
assigning a first virtual interface (VI) for failure recovery to the first computer and a second VI for failure recovery to the second computer;
establishing during booting by the first computer, after a failure by the first computer, a VI for connection to the second VI of the second computer; and
reading status information by the second computer of the first computer through the first VI of the first computer, the second computer using its second VI to communicate with the first VI of the first computer.
Dependent claims: 16, 17
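The pairing of dedicated recovery VIs in claim 15 can be modeled as two nodes that each hold a pre-assigned VI and link them while the failed node reboots. This is a toy model under assumptions: the `Node` class, its method names, the VI numbers, and the status string are hypothetical, and ordinary object references stand in for a real cluster interconnect.

```python
class Node:
    """Toy cluster node holding one VI reserved for failure recovery."""

    def __init__(self, name, recovery_vi_id):
        self.name = name
        self.recovery_vi_id = recovery_vi_id  # assigned ahead of any failure
        self.peer = None                      # set once the VIs are connected
        self.status = None                    # status exposed through the VI

    def connect_recovery_vi(self, partner):
        """Called by the rebooting node to link its VI to the partner's VI."""
        self.peer = partner
        partner.peer = self

    def publish_status(self, status):
        """Rebooting node: make status visible through its recovery VI."""
        self.status = status

    def read_partner_status(self):
        """Surviving node: read the partner's status over the linked VIs."""
        if self.peer is None:
            raise ConnectionError("recovery VI is not connected")
        return self.peer.status


# Usage: node "a" failed and is rebooting; node "b" watches it.
a = Node("a", recovery_vi_id=1)
b = Node("b", recovery_vi_id=2)
a.connect_recovery_vi(b)
a.publish_status("device init")
print(b.read_partner_status())
```

Reserving the VI identifiers in advance means the rebooting node does not have to negotiate anything before its partner can start reading status.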
Specification