Distributed computation recovery management system and method
First Claim
1. In a computer system having multiple application processes that interactively perform a distributed computation, the steps of the method comprising:
- modeling said multiple application processes as finite state machines by storing in a computer memory in said computer system model data corresponding to each application process, said model data identifying a set of states, identifying some states of said application process as final states from which the corresponding application process is allowed to terminate and identifying other states as intermediate states from which the corresponding application process must not be allowed to terminate;
said stored model data for each application process further including state transition data identifying state transitions between identified states of said each application process as being enabled by receiving a message from another application process, by unreliably sending a message to a destination external of said each application process, and by reliably sending a message to a destination external of said each application process;
said computer system modifying said model data by selecting, in accordance with a set of predefined state transition modification criteria, ones of said state transitions enabled by unreliably sending a message, and changing said state transition data to indicate that selected state transitions are enabled by reliably sending said message;
said computer system further modifying said model data by converting ones of said intermediate states into final states, said intermediate states converted into final states being selected in accordance with a predefined set of state modification criteria; and
said computer system when executing each application process, recording on stable storage information identifying reliably sent messages and information identifying state transitions by said each application process, said identifying information being recorded in accordance with which states are identified as being intermediate states in said modified model data.
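As a rough illustration of the model data recited in claim 1, the following minimal sketch stores each process's states, final states and transitions, and applies the two modification steps. All names (`ProcessModel`, `Transition`, `Enabler`, `make_reliable`, `promote_to_final`) are hypothetical, and the two predicates passed in merely stand in for the "predefined" modification criteria, which the claim does not spell out.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Set


class Enabler(Enum):
    """The three ways a state transition can be enabled, per claim 1."""
    RECEIVE = auto()          # receiving a message from another process
    UNRELIABLE_SEND = auto()  # unreliably sending a message
    RELIABLE_SEND = auto()    # reliably sending a message


@dataclass(frozen=True)
class Transition:
    source: str
    target: str
    enabler: Enabler
    message: str


@dataclass
class ProcessModel:
    """Hypothetical in-memory model data for one application process."""
    states: Set[str]
    final_states: Set[str]
    transitions: List[Transition] = field(default_factory=list)

    @property
    def intermediate_states(self) -> Set[str]:
        # Every state not marked final is treated as intermediate.
        return self.states - self.final_states

    def make_reliable(self, should_upgrade: Callable[[Transition], bool]) -> None:
        # Re-mark selected unreliable sends as reliable sends.
        # `should_upgrade` is an assumed placeholder for the claim's
        # "state transition modification criteria".
        self.transitions = [
            Transition(t.source, t.target, Enabler.RELIABLE_SEND, t.message)
            if t.enabler is Enabler.UNRELIABLE_SEND and should_upgrade(t)
            else t
            for t in self.transitions
        ]

    def promote_to_final(self, should_promote: Callable[[str], bool]) -> None:
        # Convert selected intermediate states into final states.
        # `should_promote` is an assumed placeholder for the claim's
        # "state modification criteria".
        promoted = {s for s in self.intermediate_states if should_promote(s)}
        self.final_states = self.final_states | promoted


# Example with made-up state and message names.
model = ProcessModel(
    states={"idle", "requested", "done"},
    final_states={"idle", "done"},
    transitions=[
        Transition("idle", "requested", Enabler.UNRELIABLE_SEND, "request"),
        Transition("requested", "done", Enabler.RECEIVE, "reply"),
    ],
)
```

In this sketch the intermediate states are derived rather than stored, so the two sets cannot drift out of sync after a modification step.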
Abstract
A protocol analysis system is provided with data specifying the defined states of processes participating in a distributed computation. State transitions between states are specified as being enabled by (A) receiving a message, (B) unreliably sending a message, or (C) performing an external action such as reliably sending a message. The specification data also identifies process states known to be final states, and all other states are initially denoted as intermediate states. First, the protocol analysis system determines if any intermediate states can be re-categorized as final states. Second, it determines if any state transitions initially identified as unreliable send operations must be treated as derived external actions, and thus made reliable. Third, for each derived external action, the states of the affected application process are re-evaluated to determine if derived final states need to be converted into intermediate states. The resulting determinations as to which states are final states and which messages must be reliably sent are recorded and used to govern execution of the application process. When executing the application process, state transitions entering and leaving intermediate states are normally recorded on stable storage before the state transition is carried out, and reliably sent messages are normally recorded on stable storage before being sent. A number of run-time journal optimization techniques reduce the number of state transitions and messages that need to be stored on stable storage.
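The run-time behavior described in the abstract can be sketched as a write-ahead journal. This is a minimal sketch under stated assumptions, not the patented implementation: a local file flushed with `os.fsync` stands in for stable storage, the names `Journal`, `touches_intermediate` and `do_transition` are hypothetical, and none of the journal optimization techniques mentioned in the abstract are modeled.

```python
import json
import os


class Journal:
    """Append-only log; a local file with fsync is an assumed stand-in
    for the 'stable storage' named in the abstract."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def record(self, entry):
        # Write one JSON line and force it to disk before returning.
        self._file.write(json.dumps(entry) + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())

    def close(self):
        self._file.close()


def touches_intermediate(source, target, intermediate):
    # True when the transition enters or leaves an intermediate state.
    return source in intermediate or target in intermediate


def do_transition(journal, intermediate, source, target,
                  message=None, reliable=False, send=None):
    """Journal first, then act; returns the new state."""
    if touches_intermediate(source, target, intermediate):
        # Record the transition before it is carried out.
        journal.record({"kind": "transition", "from": source, "to": target})
    if message is not None:
        if reliable:
            # Record a reliably sent message before sending it.
            journal.record({"kind": "reliable_send", "message": message})
        if send is not None:
            send(message)  # `send` is a caller-supplied transport callback
    return target
```

A transition between two final states writes nothing to the journal in this sketch, which is the reason re-categorizing intermediate states as final states reduces stable-storage traffic.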
5 Claims
Specification