Fault tolerance using logical checkpointing in computing systems
Abstract
A system and method for checkpointing a primary computer process to a backup computer process such that, if the primary process fails, the backup process can take over without interruption. In addition, upgrades to different versions of software or equipment can take place without interruption. The invention provides a lightweight checkpointing method that checkpoints only those external requests or messages that change the state of the service instance, thereby reducing overhead and performance penalties. In addition, the present invention checkpoints data for a primary and backups that do not share resources but are logically equivalent. All communication between the primary and the backups takes place using network protocols.
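The checkpointing idea in the abstract can be illustrated with a minimal sketch. It assumes a toy key-value service; the names `Request`, `Backup`, `Primary`, and `is_logical`, the "put"/"get" operations, and the direct method calls standing in for network messages are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    request_id: int
    op: str                      # "get" or "put" (hypothetical operations)
    key: str
    value: Optional[str] = None


@dataclass
class Backup:
    state: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def checkpoint(self, req: Request) -> bool:
        """Apply a logical request and acknowledge it to the primary."""
        self.state[req.key] = req.value
        self.log.append(req.request_id)
        return True  # acknowledgement


@dataclass
class Primary:
    backups: list
    state: dict = field(default_factory=dict)
    checkpoints_sent: int = 0

    @staticmethod
    def is_logical(req: Request) -> bool:
        # Assumption: only writes change the externally visible state.
        return req.op == "put"

    def handle(self, req: Request):
        if not self.is_logical(req):
            # Reads do not change state, so they are never checkpointed.
            return self.state.get(req.key)
        acks = [b.checkpoint(req) for b in self.backups]
        self.checkpoints_sent += len(acks)
        if not all(acks):
            raise RuntimeError("backup did not acknowledge; not committing")
        self.state[req.key] = req.value  # commit only after every ack
        return "ok"
```

In this sketch a read costs no checkpoint traffic at all, which is the overhead reduction the abstract describes; committing only after every acknowledgement is one possible policy, chosen here for simplicity.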
100 Citations
81 Claims
-
1. A service operating on a computer system to process requests from client processes, comprising:
-
a primary service instance operating on the computer system, including means for receiving and processing client requests, from the client processes; and
at least one backup service instance operating on the computer system, including means for receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the primary service instance includes means for determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change; and
means for communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
wherein each backup service instance includes means for communicating, to the primary service instance, acknowledgement of the determined logical requests. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A service operating on a computer system to process requests from client processes, comprising:
-
a primary service instance operating on the computer system, including means for receiving and processing client requests, from the client processes; and
at least one backup service instance operating on the computer system, including means for receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the primary service instance includes means for determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change;
means for communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance;
means for receiving an indication of a path failure between the primary service instance and a client process that occurs after a result of processing a particular client request is committed and before the client process receives the result from the service; and
means for receiving the particular client request again and for providing the response to the client request without again processing the client request.
-
-
9. A service operating on a computer system to process requests from client processes, comprising:
-
a primary service instance operating on the computer system, including means for receiving and processing client requests, from the client processes; and
at least one backup service instance operating on the computer system, including means for receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the primary service instance includes means for determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change;
means for communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
wherein each backup service instance includes means for receiving an indication of a failure of the primary service instance, and means for notifying the client process that each backup service instance has become a new primary service instance; and
the new primary service instance includes means for receiving client requests and, for client requests that have already been committed by the new primary service instance when the new primary service instance was a backup service instance, providing the committed response to the client request without again processing the request. - View Dependent Claims (10, 11, 12)
-
-
13. A service operating on a computer system to process requests from client processes, comprising:
-
a primary service instance operating on the computer system, including means for receiving and processing client requests, from the client processes; and
at least one backup service instance operating on the computer system, including means for receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the primary service instance includes means for determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change;
means for communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance;
means for creating an additional backup service instance, including means for replicating the logical view of the primary service instance to the additional backup service instance, wherein the means for replicating the logical view of the primary service instance to the additional backup service instance provides the determined logical requests to the additional backup service instance; and
processes a request history table to ensure that logical view of the additional backup service instance is the same as the logical view of the primary service instance. - View Dependent Claims (14, 15, 16, 17)
-
-
18. A service operating on a computer system to process requests from client processes, comprising:
-
a primary service instance operating on the computer system, and being adapted to receive and process client requests from the client processes;
at least one backup service instance operating on the computer system, and being adapted to receive and process the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the primary service instance determines which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicates with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
a mechanism for creating and operating an additional backup service instance, including a replicating mechanism for replicating the logical view of the primary service instance to the additional backup service instance, and for providing the determined logical requests to the additional backup service instance. - View Dependent Claims (19)
-
-
20. A service operating on a computer system to process requests from client processes, comprising:
-
a primary service instance operating on the computer system, and being adapted to receive and process client requests from the client processes;
at least one backup service instance operating on the computer system, and being adapted to receive and process the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the primary service instance determines which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicates with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
a mechanism for creating and operating an additional backup service instance, including a replicating mechanism for replicating the logical view of the primary service instance to the additional backup service instance, and for implementing the additional backup service instance to have a physical behavior different from the physical behavior of the primary service instance. - View Dependent Claims (21, 22, 23)
-
-
24. A service operating on a computer system to process requests from client processes, comprising:
-
a primary service instance operating on the computer system, and being adapted to receive and process client requests from the client processes;
at least one backup service instance operating on the computer system, and being adapted to receive and process the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the primary service instance determines which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicates with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
a mechanism for creating and operating an additional backup service instance, including a replicating mechanism for replicating the logical view of the primary service instance to the additional backup service instance, and for implementing the additional backup service instance to have a physical behavior the same as the physical behavior of the primary service instance. - View Dependent Claims (25, 26, 27)
-
-
28. A method of operating a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes; and
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
wherein the step of operating each backup service instance includes communicating, to the primary service instance, acknowledgement of the determined logical requests.
-
-
29. A method of operating a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes; and
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance, and committing the determined logical requests in response to the acknowledgements of the determined logical requests. - View Dependent Claims (30, 31, 32, 33, 34)
-
-
35. A method of operating a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes; and
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance, wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
wherein;
the step of operating the primary service instance includes receiving an indication of a path failure between the primary service instance and a client process that occurs after a result of processing a particular client request is committed and before the client process receives the result from the service, and receiving the particular client request again and providing the response to the client request without again processing the client request.
-
-
36. A method of operating a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes; and
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
wherein;
operating each backup service instance includes receiving an indication of a failure of the primary service instance, and notifying the client process that each backup service instance has become a new primary service instance; and
operating the new primary service instance includes receiving client requests and, for client requests that have already been committed by the new primary service instance when the new primary service instance was a backup service instance, providing the committed response to the client request without again processing the request. - View Dependent Claims (37, 38, 39)
-
-
40. A method of operating a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes;
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change; and
communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance;
wherein the step of replicating the logical view of the primary service instance to the additional backup service instance includes providing the determined logical requests to the additional backup service instance, and processing a request history table to ensure that logical view of the additional backup service instance is the same as the logical view of the primary service instance. - View Dependent Claims (41, 42, 43, 44)
-
-
45. A method of operating a service on a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests from the client processes;
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating and operating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance;
wherein;
the step of replicating the logical view of the primary service instance to the additional backup service instance includes providing the determined logical requests to the additional backup service instance.
-
-
46. A method of operating a service on a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests from the client processes;
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating and operating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance;
wherein;
the replicating step includes processing a request history table to ensure that logical view of the additional backup service instance is the same as the logical view of the primary service instance.
-
-
47. A method of operating a service on a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests from the client processes;
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating and operating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance;
wherein;
the step of replicating the logical view of the primary service instance to the additional backup service instance includes implementing the additional backup service instance to have a physical behavior different from the physical behavior of the primary service instance. - View Dependent Claims (48, 49, 50)
-
-
51. A method of operating a service on a computer system to process requests from client processes, comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests from the client processes;
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating and operating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance, wherein;
the step of replicating the logical view of the primary service instance to the additional backup service instance includes implementing the additional backup service instance to have a physical behavior the same as the physical behavior of the primary service instance. - View Dependent Claims (52, 53, 54)
-
-
55. A program storage device readable by a machine tangibly embodying a program of instructions executable by the machine to perform method steps for operating a computer system to process requests from client processes, said method steps comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes; and
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
wherein the step of operating each backup service instance includes communicating, to the primary service instance, acknowledgement of the determined logical requests. - View Dependent Claims (56, 57, 58, 59, 60, 61)
-
-
62. A program storage device readable by a machine tangibly embodying a program of instructions executable by the machine to perform method steps for operating a computer system to process requests from client processes, said method steps comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes; and
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance, receiving an indication of a path failure between the primary service instance and a client process that occurs after a result of processing a particular client request is committed and before the client process receives the result from the service, and receiving the particular client request again and providing the response to the client request without again processing the client request.
-
-
63. A program storage device readable by a machine tangibly embodying a program of instructions executable by the machine to perform method steps for operating a computer system to process requests from client processes, said method steps comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes; and
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance;
wherein the step of operating each backup service instance includes receiving an indication of a failure of the primary service instance, and notifying the client process that each backup service instance has become a new primary service instance; and
wherein the step of operating the new primary service instance includes receiving client requests and, for client requests that have already been committed by the new primary service instance when the new primary service instance was a backup service instance, providing the committed response to the client request without again processing the request. - View Dependent Claims (64, 65, 66)
-
-
67. A program storage device readable by a machine tangibly embodying a program of instructions executable by the machine to perform method steps for operating a computer system to process requests from client processes, said method steps comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests, from the client processes; and
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance;
wherein the step of replicating the logical view of the primary service instance to the additional backup service instance includes providing the determined logical requests to the additional backup service instance, and processing a request history table to ensure that logical view of the additional backup service instance is the same as the logical view of the primary service instance. - View Dependent Claims (68, 69, 70, 71)
-
-
72. A program storage device readable by a machine tangibly embodying a program of instructions executable by the machine to perform method steps for operating a computer system to process requests from client processes, said method steps comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests from the client processes;
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating and operating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance;
wherein;
the step of replicating the logical view of the primary service instance to the additional backup service instance includes providing the determined logical requests to the additional backup service instance. - View Dependent Claims (73)
-
-
74. A program storage device readable by a machine tangibly embodying a program of instructions executable by the machine to perform method steps for operating a computer system to process requests from client processes, said method steps comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests from the client processes;
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating and operating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance;
wherein;
the step of replicating the logical view of the primary service instance to the additional backup service instance includes implementing the additional backup service instance to have a physical behavior different from the physical behavior of the primary service instance. - View Dependent Claims (75, 76, 77)
-
-
78. A program storage device readable by a machine tangibly embodying a program of instructions executable by the machine to perform method steps for operating a computer system to process requests from client processes, said method steps comprising:
-
operating a primary service instance on the computer system, including receiving and processing client requests from the client processes;
operating at least one backup service instance on the computer system, including receiving and processing the client requests, wherein each backup service instance is logically equivalent to the primary service instance;
wherein the step of operating the primary service instance includes determining which external requests, when processed by the secondary service instance, are logical requests such that processing of the determined logical requests cause the external view of each backup service instance to change, and communicating with each backup service instance to provide an indication of the determined logical requests to each backup service instance; and
creating and operating an additional backup service instance, including replicating the logical view of the primary service instance to the additional backup service instance;
wherein;
the step of replicating the logical view of the primary service instance to the additional backup service instance includes implementing the additional backup service instance to have a physical behavior the same as the physical behavior of the primary service instance. - View Dependent Claims (79, 80, 81)
-
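Several of the claims above recite answering a repeated client request without processing it again, promoting a backup after the primary fails, and bringing up an additional backup from a request history table. The following is a minimal sketch of how those pieces might fit together, under stated assumptions: the `Instance` class, its fields, the `create_backup` helper, and the response strings are hypothetical, and the claims do not prescribe this layout.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    role: str = "backup"
    state: dict = field(default_factory=dict)
    # Request history table: request_id -> (key, value, committed response).
    history: dict = field(default_factory=dict)
    times_processed: int = 0

    def apply(self, request_id: int, key: str, value: str) -> str:
        """Process a logical request once; reuse the stored response on retries."""
        if request_id in self.history:
            return self.history[request_id][2]  # answer without reprocessing
        self.state[key] = value
        self.times_processed += 1
        response = f"stored {key}"
        self.history[request_id] = (key, value, response)
        return response

    def promote(self) -> str:
        """Called on a backup when it learns that the primary has failed."""
        self.role = "primary"
        return f"{self.name} is the new primary"  # notification for clients


def create_backup(primary: Instance, name: str) -> Instance:
    """Bring up an additional backup by replaying the primary's history table."""
    new = Instance(name)
    # Dicts preserve insertion order, so requests replay in commit order.
    for request_id, (key, value, _) in primary.history.items():
        new.apply(request_id, key, value)
    return new
```

Because the history table stores the committed response alongside each request, a client that retries after a path failure or a failover receives the same answer, and the state change is applied only once.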
Specification