Orchestration of software applications upgrade using automatic hang detection
First Claim
1. A method comprising:
monitoring execution of a first upgrade process on a first host machine, the first upgrade process upgrading a first software application on the first host machine;
accessing, by a computing system, a runtime execution time for the first upgrade process, the runtime execution time captured for the first upgrade process during the monitoring of the execution of the first upgrade process;
determining a first reference time defined for the first upgrade process for the first host machine, wherein the first reference time is different than a second reference time defined for the first upgrade process for a second host machine;
determining a latency tolerance time for the first upgrade process;
determining, based on the runtime execution time, that the first upgrade process continues execution after a total of the first reference time and the latency tolerance time has passed;
determining, by the computing system, that the first upgrade process executing on the first host machine is to be indicated as being in a hang state; and
generating, by the computing system, an alert message indicating the first upgrade process executing on the first host machine is in the hang state.
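The comparison recited in claim 1 can be illustrated with a short sketch. This is not the patented implementation, only a minimal example under stated assumptions: the `REFERENCE_SECONDS` table, the `UpgradeStatus` record, the process and host names, and the alert wording are all hypothetical. It shows a reference time that differs per host, a latency tolerance added to it, and an alert produced only when the process is still running past that total.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical reference times in seconds, keyed by (process, host).
# The same upgrade process has a different reference time on each host.
REFERENCE_SECONDS = {
    ("schema_upgrade", "host-a"): 600.0,
    ("schema_upgrade", "host-b"): 900.0,
}


@dataclass
class UpgradeStatus:
    process: str
    host: str
    runtime_seconds: float  # runtime captured while monitoring the process
    still_running: bool


def check_for_hang(status: UpgradeStatus,
                   latency_tolerance_seconds: float) -> Optional[str]:
    """Return an alert message if the process looks hung, else None."""
    reference = REFERENCE_SECONDS[(status.process, status.host)]
    limit = reference + latency_tolerance_seconds
    # Hang condition: still executing after reference time plus tolerance.
    if status.still_running and status.runtime_seconds > limit:
        return (f"ALERT: {status.process} on {status.host} appears hung "
                f"(running {status.runtime_seconds:.0f}s, limit {limit:.0f}s)")
    return None
```

With a 60-second tolerance, a process that has run for 700 seconds would be flagged on `host-a` (limit 660 seconds) but not on `host-b` (limit 960 seconds), which is the point of defining the reference time per host.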
Abstract
In an upgrade infrastructure performing an overall upgrade operation comprising multiple upgrade processes being executed, possibly concurrently, on multiple hosts for upgrading one or more software applications hosted by the hosts, automated hang detection mechanisms are disclosed for quickly, efficiently, and automatically detecting when one or more of the upgrade processes are in a hang state. Different hang detection techniques are described, including a metadata-driven hang detection mechanism and a code-driven hang detection mechanism.
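To make the multi-host setting concrete, the following sketch scans a snapshot of upgrade processes that are running concurrently on several hosts and reports which ones have exceeded their limits. It is an illustration only: the abstract does not specify data formats, so the dictionaries keyed by (process, host), the function name `find_hung`, and the use of plain timestamps in seconds are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (upgrade process name, host name), hypothetical


def find_hung(started_at: Dict[Key, float],
              reference: Dict[Key, float],
              tolerance: float,
              now: float) -> List[Key]:
    """Return the (process, host) pairs still running past their limit.

    started_at holds start timestamps of processes that are still running;
    reference holds the expected duration for each (process, host) pair.
    """
    hung = []
    for key, start in started_at.items():
        elapsed = now - start
        if elapsed > reference[key] + tolerance:
            hung.append(key)
    return sorted(hung)


# Example snapshot: three running processes across two hosts.
running = {
    ("patch_db", "h1"): 0.0,
    ("patch_db", "h2"): 0.0,
    ("patch_app", "h1"): 100.0,
}
expected = {
    ("patch_db", "h1"): 300.0,
    ("patch_db", "h2"): 500.0,
    ("patch_app", "h1"): 200.0,
}
suspects = find_hung(running, expected, tolerance=30.0, now=400.0)
```

Keeping the expected durations in data rather than in the checking logic means the same scan can serve every host; only the table entries change.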
63 Citations
13 Claims
1. A method comprising:
monitoring execution of a first upgrade process on a first host machine, the first upgrade process upgrading a first software application on the first host machine;
accessing, by a computing system, a runtime execution time for the first upgrade process, the runtime execution time captured for the first upgrade process during the monitoring of the execution of the first upgrade process;
determining a first reference time defined for the first upgrade process for the first host machine, wherein the first reference time is different than a second reference time defined for the first upgrade process for a second host machine;
determining a latency tolerance time for the first upgrade process;
determining, based on the runtime execution time, that the first upgrade process continues execution after a total of the first reference time and the latency tolerance time has passed;
determining, by the computing system, that the first upgrade process executing on the first host machine is to be indicated as being in a hang state; and
generating, by the computing system, an alert message indicating the first upgrade process executing on the first host machine is in the hang state.
Dependent claims: 2, 3, 4, 5
6. A non-transitory computer-readable storage memory storing a plurality of instructions executable by one or more processors, the plurality of instructions comprising:
instructions that cause at least one processor from the one or more processors to monitor execution of a first upgrade process on a first host machine, the first upgrade process upgrading a first software application on the first host machine;
instructions that cause at least one processor from the one or more processors to access runtime execution time for the first upgrade process, the runtime execution time captured for the first upgrade process during the monitoring of the execution of the first upgrade process;
instructions that cause at least one processor from the one or more processors to determine a first reference time defined for the first upgrade process for the first host machine, wherein the first reference time is different than a second reference time defined for the first upgrade process for a second host machine;
instructions that cause at least one processor from the one or more processors to determine a latency tolerance time for the first upgrade process;
instructions that cause at least one processor from the one or more processors to determine, based on the runtime execution time, that the first upgrade process continues execution after a total of the first reference time and the latency tolerance time has passed;
instructions that cause at least one processor from the one or more processors to determine that the first upgrade process executing on the first host machine is to be indicated as being in a hang state; and
instructions that cause at least one processor from the one or more processors to generate an alert message indicating the first upgrade process executing on the first host machine is in the hang state.
Dependent claims: 7, 8, 9, 10
11. A system comprising:
one or more processors; and
a memory coupled with and readable by the one or more processors, the memory configured to store a set of instructions which, when executed by the one or more processors, causes at least one processor from the one or more processors to:
monitor execution of a first upgrade process on a first host machine, the first upgrade process upgrading a first software application on the first host machine;
access runtime execution time for the first upgrade process, the runtime execution time captured for the first upgrade process during the monitoring of the execution of the first upgrade process;
determine a first reference time defined for the first upgrade process for the first host machine, wherein the first reference time is different than a second reference time defined for the first upgrade process for a second host machine;
determine a latency tolerance time for the first upgrade process;
determine, based on the runtime execution time, that the first upgrade process continues execution after a total of the first reference time and the latency tolerance time has passed;
determine that the first upgrade process executing on the first host machine is to be indicated as being in a hang state; and
generate an alert message indicating the first upgrade process executing on the first host machine is in the hang state.
Dependent claims: 12, 13
Specification