Method and apparatus for providing scaleable levels of application availability
Abstract
A method and an apparatus for providing scalable layers of highly available applications using loosely coupled, commercially available computers. The software running on the loosely coupled computers is divided into three layers: the system layer, the platform layer, and the application layer, each having its own process group activation and fault recovery strategy. A process group contains software processes that depend upon a set of resources common to the process group. In addition to depending upon a common set of resources, processes within a process group share a fault recovery strategy. Fault recovery is performed at the process group level, such that if one process within a process group fails, fault recovery takes place for all processes within the process group. In the preferred embodiment, an application layer process group may be paired with another application layer process group on a separate computer. As part of certain escalated process group fault recovery strategies, upon taking an application layer process group out of service, its paired application layer process group, if any exists, takes over performing the functions of the process group that was taken out of service.
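By way of illustration only, the layering and group-level recovery described in the abstract might be modeled as in the following Python sketch. The names `Layer` and `ProcessGroup`, the field names, and the choice of restarting every member as the recovery action are assumptions made for this example and are not taken from the specification.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    """The three software layers named in the abstract."""
    SYSTEM = "system"
    PLATFORM = "platform"
    APPLICATION = "application"


@dataclass
class ProcessGroup:
    """Processes that share a resource set and one fault recovery strategy."""
    name: str
    layer: Layer
    processes: list[str]
    resources: set[str] = field(default_factory=set)
    paired_with: str | None = None  # hypothetical: name of the paired group, if any
    restart_log: list[str] = field(default_factory=list)

    def recover(self, failed_process: str) -> list[str]:
        """Apply recovery to the whole group when any one member fails.

        Assumption: the shared strategy is "restart every member".
        """
        if failed_process not in self.processes:
            raise ValueError(f"{failed_process!r} is not in group {self.name!r}")
        self.restart_log.extend(self.processes)
        return list(self.processes)


group = ProcessGroup("billing", Layer.APPLICATION, ["db_writer", "api"], {"disk0"})
recovered = group.recover("api")  # a fault in "api" recovers both members
```

The point of the sketch is only that the unit of recovery is the group, not the individual process that failed.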
88 Claims
1. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said one or more process groups on said first computer; and
g. activating said paired process group on said second computer upon a system layer process group fault occurring on said first computer.
Dependent claims: 2, 3

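By way of illustration only, steps d, e, and g of claim 1 might be sketched as follows. The `Computer` class, the `pairs` mapping, and the counter standing in for a re-boot are hypothetical names introduced for this example; the claim does not prescribe any particular data structure.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Computer:
    name: str
    # Maps process group name -> whether the group is currently in service.
    groups: dict[str, bool] = field(default_factory=dict)
    reboots: int = 0  # stand-in for an actual re-boot


def handle_system_layer_fault(
    first: Computer, second: Computer, pairs: dict[str, str]
) -> list[str]:
    """React to a system layer process group fault on `first`."""
    # Step d: take every process group on the first computer out of service.
    for group_name in first.groups:
        first.groups[group_name] = False
    # Step e: re-boot the first computer (only counted here).
    first.reboots += 1
    # Step g: activate each paired group that lives on the second computer.
    activated = []
    for primary, standby in pairs.items():
        if primary in first.groups and standby in second.groups:
            second.groups[standby] = True
            activated.append(standby)
    return activated


first = Computer("A", {"os_monitor": True, "billing": True})
second = Computer("B", {"billing_standby": False})
activated = handle_system_layer_fault(first, second, {"billing": "billing_standby"})
```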
4. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. taking all of said process groups out of service on said first computer upon a fault in a resource depended upon by at least one of said system layer process groups on said first computer;
e. re-booting said first computer upon said fault in said resource depended upon by at least one of said system layer process groups on said first computer;
f. providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said process groups on said first computer; and
g. activating said paired process group on said second computer upon said fault in said resource depended upon by at least one of said system layer process groups on said first computer.
Dependent claims: 5, 6

7. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. providing, on at least said first computer, a platform layer having at least one process group;
g. taking all of said process groups, except each of said at least one process group in said system layer, out of service on said first computer upon a platform layer process group fault occurring on said first computer;
h. providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said process groups on said first computer; and
i. activating said paired process group on said second computer upon said platform layer process group fault occurring on said first computer.
Dependent claims: 8, 9, 10, 11

12. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. providing, on at least a first of said at least two computers, a platform layer having at least one process group;
g. taking all of said process groups, except each of said at least one process group in said system layer, out of service on said first computer upon a fault in a resource depended upon by at least one of said platform layer process groups on said first computer;
h. providing at least one paired process group on a second of said at least two computers, said at least one paired process group being paired with one of said process groups on said first computer; and
i. activating said at least one paired process group on said second computer upon said fault in said resource depended upon by at least one of said platform layer process groups on said first computer.
Dependent claims: 13, 14, 15, 16

17. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. providing, on at least a first of said at least two computers, a platform layer having at least one process group;
g. restarting at least one of said platform layer process groups upon a platform layer process group fault occurring on said first computer;
h. taking all of said process groups, except each of said at least one process group in said system layer, out of service on said first computer upon failure of said re-start to cure said platform layer process group fault;
i. providing at least one paired process group on a second of said at least two computers, said at least one paired process group being paired with one of said process groups on said first computer; and
j. activating said at least one paired process group on said second computer upon failure of said re-start to cure said platform layer process group fault.
Dependent claims: 18, 19, 20, 21

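By way of illustration only, the escalation recited in steps g, h, and j of claim 17 (restart first, then escalate if the restart does not cure the fault) might be sketched as follows. The function name, the dictionary layout, and the two callbacks are hypothetical and are introduced only for this example.

```python
from __future__ import annotations

from typing import Callable


def recover_platform_fault(
    groups: dict[str, dict],
    faulty: str,
    restart: Callable[[str], bool],
    activate_pair: Callable[[], None],
) -> str:
    """Handle a platform layer process group fault on the first computer.

    `groups` maps a group name to {"layer": str, "in_service": bool}.
    `restart` returns True when the restart cures the fault.
    """
    # Step g: try restarting the faulty platform layer process group.
    if restart(faulty):
        return "restarted"
    # Step h: the restart failed, so take every group except the
    # system layer groups out of service.
    for info in groups.values():
        if info["layer"] != "system":
            info["in_service"] = False
    # Step j: activate the paired process group on the second computer.
    activate_pair()
    return "failed_over"


groups = {
    "os_monitor": {"layer": "system", "in_service": True},
    "msg_bus": {"layer": "platform", "in_service": True},
    "billing": {"layer": "application", "in_service": True},
}
activations: list[str] = []
outcome = recover_platform_fault(
    groups, "msg_bus", lambda name: False, lambda: activations.append("standby")
)
```

In this sketch the system layer group stays in service throughout, which is the distinction the claim draws between a platform layer fault and a system layer fault.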
22. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least said first computer, a system layer having at least one process group;
d. taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. providing, on at least a first of said at least two computers, an application layer having at least one process group;
g. taking said at least one application layer process group out of service on said first computer upon a fault in said at least one application layer process group on said first computer;
h. providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said at least one application layer process group taken out of service on said first computer;
i. activating said paired process group on said second computer upon said fault in said at least one application layer process group on said first computer; and
j. re-initializing all of said process groups, except each of said at least one process group in said system layer, on said first computer upon not being able to take said application layer process group having said fault out of service on said first computer.
Dependent claims: 23, 24, 25

26. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least a first of said at least two computers, an application layer having at least two process groups;
d. defining a dependency by at least a first of said at least two application layer process groups upon at least a second of said at least two application layer process groups;
e. taking said first and said second application layer process groups out of service on said first computer upon a fault in said second application layer process group on said first computer;
f. providing at least one paired process group on a second of said at least two computers, said paired process group being paired with said second application layer process group on said first computer; and
g. activating said paired process group on said second computer upon said fault in said second application layer process group on said first computer.
Dependent claims: 27, 28, 29

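By way of illustration only, the dependency handling of steps d through g of claim 26 might be sketched as follows. The function names and the `depends_on` and `pairs` mappings are hypothetical. The sketch also follows dependencies transitively, which is an assumption that goes beyond the two groups recited in the claim.

```python
from __future__ import annotations


def groups_to_remove(depends_on: dict[str, set[str]], faulty: str) -> set[str]:
    """Return the faulty group plus every group that depends on it.

    `depends_on` maps a group name to the set of groups it depends upon.
    Assumption: dependencies are followed transitively.
    """
    affected = {faulty}
    changed = True
    while changed:
        changed = False
        for group, deps in depends_on.items():
            if group not in affected and deps & affected:
                affected.add(group)
                changed = True
    return affected


def handle_application_fault(
    depends_on: dict[str, set[str]], pairs: dict[str, str], faulty: str
) -> tuple[set[str], str | None]:
    """Step e: pick the groups to take out of service.
    Step g: pick the paired group to activate, if one is defined."""
    return groups_to_remove(depends_on, faulty), pairs.get(faulty)


depends_on = {"reporting": {"billing"}, "billing": set(), "audit": set()}
out_of_service, standby = handle_application_fault(
    depends_on, {"billing": "billing_standby"}, "billing"
)
```

Here `reporting` plays the role of the first (dependent) group and `billing` the second (faulty) group; `audit` has no dependency on `billing` and is left in service.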
30. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least a first of said at least two computers, an application layer having at least one process group;
d. taking said at least one application layer process group out of service on said first computer upon a fault in a resource depended upon by said at least one application layer process group;
e. providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said at least one application layer process group on said first computer; and
f. activating said paired process group on said second computer upon said fault in said resource depended upon by said at least one application layer process group.
Dependent claims: 31, 32, 33, 34

35. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least said first computer, a system layer having at least one process group;
d. taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. providing, on at least a first of said at least two computers, an application layer having at least one process group;
g. re-starting said at least one application layer process group on said first computer upon a fault in said at least one application layer process group;
h. taking said at least one application layer process group out of service on said first computer upon failure of said re-start to cure said fault in said at least one application layer process group;
i. providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said at least one application layer process group taken out of service on said first computer;
j. activating said paired process group on said second computer upon failure of said re-start to cure said fault in said at least one application layer process group taken out of service on said first computer; and
k. re-initializing all of said process groups, except each of said at least one process group in said system layer, on said first computer upon not being able to take said application layer process group having said fault out of service on said first computer.
Dependent claims: 36, 37, 38, 39

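By way of illustration only, the application layer escalation of steps g, h, j, and k of claim 35 might be sketched as a sequence of callbacks. The function name and the four callbacks are hypothetical, and the relative order of activating the pair and re-initializing the remaining groups is an assumption; the claim states the triggering conditions but not that ordering.

```python
from __future__ import annotations

from typing import Callable


def recover_application_fault(
    restart: Callable[[], bool],
    take_out_of_service: Callable[[], bool],
    activate_pair: Callable[[], None],
    reinitialize_non_system: Callable[[], None],
) -> list[str]:
    """Run the escalation and return the actions attempted, in order.

    `restart` returns True when the restart cures the fault.
    `take_out_of_service` returns True when the group could be removed.
    """
    actions = ["restart"]
    # Step g: re-start the faulty application layer process group.
    if restart():
        return actions
    # Step h: the re-start failed, so take the group out of service.
    actions.append("take_out_of_service")
    removed = take_out_of_service()
    # Step j: activate the paired group on the second computer.
    actions.append("activate_pair")
    activate_pair()
    # Step k: if the group could not be taken out of service,
    # re-initialize every group except the system layer groups.
    if not removed:
        actions.append("reinitialize_non_system")
        reinitialize_non_system()
    return actions


calls: list[str] = []
actions = recover_application_fault(
    restart=lambda: False,
    take_out_of_service=lambda: False,
    activate_pair=lambda: calls.append("pair"),
    reinitialize_non_system=lambda: calls.append("reinit"),
)
```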
40. A method for providing high availability applications comprising, in combination, the steps of:
a. running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. providing, on at least a first of said at least two computers, an application layer having at least two process groups;
d. defining a dependency by at least a first of said at least two application layer process groups upon at least a second of said at least two application layer process groups;
e. re-starting said second application layer process group on said first computer upon a fault in said second application layer process group;
f. taking said first and said second application layer process groups out of service on said first computer upon failure of said re-start to cure said fault in said second application layer process group;
g. providing at least one paired process group on a second of said at least two computers, said paired process group being paired with said first application layer process group on said first computer; and
h. activating said at least one paired process group on said second computer upon failure of said re-start to cure said fault in said second application layer process group.
Dependent claims: 41, 42, 43, 44

45. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. means for taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. means for re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. means for providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said one or more process groups on said first computer; and
g. means for activating said paired process group on said second computer upon a system layer process group fault occurring on said first computer.
Dependent claims: 46, 47

48. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. means for taking all of said process groups out of service on said first computer upon a fault in a resource depended upon by at least one of said system layer process groups on said first computer;
e. means for re-booting said first computer upon said fault in said resource depended upon by at least one of said system layer process groups on said first computer;
f. means for providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said process groups on said first computer; and
g. means for activating said paired process group on said second computer upon said fault in said resource depended upon by at least one of said system layer process groups on said first computer.
Dependent claims: 49, 50

51. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. means for taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. means for re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. means for providing, on at least said first computer, a platform layer having at least one process group;
g. means for taking all of said process groups, except each of said at least one process group in said system layer, out of service on said first computer upon a platform layer process group fault occurring on said first computer;
h. means for providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said process groups on said first computer; and
i. means for activating said paired process group on said second computer upon said platform layer process group fault occurring on said first computer.
Dependent claims: 52, 53, 54, 55

56. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. means for taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. means for re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. means for providing, on at least a first of said at least two computers, a platform layer having at least one process group;
g. means for taking all of said process groups, except each of said at least one process group in said system layer, out of service on said first computer upon a fault in a resource depended upon by at least one of said platform layer process groups on said first computer;
h. means for providing at least one paired process group on a second of said at least two computers, said at least one paired process group being paired with one of said process groups on said first computer; and
i. means for activating said at least one paired process group on said second computer upon said fault in said resource depended upon by at least one of said platform layer process groups on said first computer.
Dependent claims: 57, 58, 59, 60

61. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least a first of said at least two computers, a system layer having at least one process group;
d. means for taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. means for re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. means for providing, on at least a first of said at least two computers, a platform layer having at least one process group;
g. means for restarting at least one of said platform layer process groups upon a platform layer process group fault occurring on said first computer;
h. means for taking all of said process groups, except each of said at least one process group in said system layer, out of service on said first computer upon failure of said re-start to cure said platform layer process group fault;
i. means for providing at least one paired process group on a second of said at least two computers, said at least one paired process group being paired with one of said process groups on said first computer; and
j. means for activating said at least one paired process group on said second computer upon failure of said re-start to cure said platform layer process group fault.
Dependent claims: 62, 63, 64, 65

66. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least said first computer, a system layer having at least one process group;
d. means for taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. means for re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. means for providing, on at least a first of said at least two computers, an application layer having at least one process group;
g. means for taking said at least one application layer process group out of service on said first computer upon a fault in said at least one application layer process group on said first computer;
h. means for providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said at least one application layer process group taken out of service on said first computer;
i. means for activating said paired process group on said second computer upon said fault in said at least one application layer process group on said first computer; and
j. means for re-initializing all of said process groups, except each of said at least one process group in said system layer, on said first computer upon not being able to take said application layer process group having said fault out of service on said first computer.
Dependent claims: 67, 68, 69

70. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least a first of said at least two computers, an application layer having at least two process groups;
d. means for defining a dependency by at least a first of said at least two application layer process groups upon at least a second of said at least two application layer process groups;
e. means for taking said first and said second application layer process groups out of service on said first computer upon a fault in said second application layer process group on said first computer;
f. means for providing at least one paired process group on a second of said at least two computers, said paired process group being paired with said second application layer process group on said first computer; and
g. means for activating said paired process group on said second computer upon said fault in said second application layer process group on said first computer.
Dependent claims: 71, 72, 73, 74

75. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least a first of said at least two computers, an application layer having at least one process group;
d. means for taking said at least one application layer process group out of service on said first computer upon a fault in a resource depended upon by said at least one application layer process group;
e. means for providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said at least one application layer process group on said first computer; and
f. means for activating said paired process group on said second computer upon said fault in said resource depended upon by said at least one application layer process group.
Dependent claims: 76, 77, 78, 79

80. An apparatus for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least said first computer, a system layer having at least one process group;
d. means for taking all of said process groups out of service on said first computer upon a system layer process group fault occurring on said first computer;
e. means for re-booting said first computer upon a system layer process group fault occurring on said first computer;
f. means for providing, on at least a first of said at least two computers, an application layer having at least one process group;
g. means for re-starting said at least one application layer process group on said first computer upon a fault in said at least one application layer process group;
h. means for taking said at least one application layer process group out of service on said first computer upon failure of said re-start to cure said fault in said at least one application layer process group;
i. means for providing at least one paired process group on a second of said at least two computers, said paired process group being paired with one of said at least one application layer process group taken out of service on said first computer;
j. means for activating said paired process group on said second computer upon failure of said re-start to cure said fault in said at least one application layer process group taken out of service on said first computer; and
k. means for re-initializing all of said process groups, except each of said at least one process group in said system layer, on said first computer upon not being able to take said application layer process group having said fault out of service on said first computer.
Dependent claims: 81, 82, 83

84. A method for providing high availability applications comprising, in combination:
a. means for running, on at least two computers, one or more process groups, at least one of said process groups containing one or more processes that have a fault recovery strategy common to said at least one of said process groups;
b. means for running, on at least one of said at least two computers, a process group manager that initiates a fault recovery strategy for at least one of said one or more process groups;
c. means for providing, on at least a first of said at least two computers, an application layer having at least two process groups;
d. means for defining a dependency by at least a first of said at least two application layer process groups upon at least a second of said at least two application layer process groups;
e. means for re-starting said second application layer process group on said first computer upon a fault in said second application layer process group;
f. means for taking said first and said second application layer process groups out of service on said first computer upon failure of said re-start to cure said fault in said second application layer process group;
g. means for providing at least one paired process group on a second of said at least two computers, said paired process group being paired with said first application layer process group on said first computer; and
h. means for activating said at least one paired process group on said second computer upon failure of said re-start to cure said fault in said second application layer process group.
Dependent claims: 85, 86, 87, 88

Specification