System and method for managing, scheduling, controlling and monitoring execution of jobs by a job scheduler utilizing a publish/subscription interface
First Claim
1. A method for job scheduling, comprising:
- defining a schedule that includes a plurality of commands that will execute a job comprising a first task, a second task, and a third task, wherein defining the schedule includes:
converting a job graph defined for the job into a series of subscriptions and publications that a publish and subscription system uses to route the plurality of commands that will execute the job; and
scheduling, by a workload manager, the first task for execution on a first computing device, the second task for execution on a second computing device, and the third task for execution on the first computing device;
subscribing a first agent executing on the first computing device to the first task and subscribing a second agent executing on the second computing device to the second task;
publishing a first command of the plurality of commands from the workload manager to the publish and subscription system in response to a scheduler receiving an activation trigger that initiates the execution of the job, wherein the publish and subscription system interfaces with the first agent executing on the first computing device and forwards the first command to the first agent in response to determining that the first agent has subscribed to the first task, and wherein the first agent executes the first task on the first computing device in response to receiving the first command from the publish and subscription system;
unsubscribing the first agent from the first task in response to the publish and subscription system receiving a first publication from the first agent indicating that the execution of the first task on the first computing device succeeded;
subscribing the first agent executing on the first computing device to the third task in response to unsubscribing the first agent from the first task;
publishing a second one of the plurality of commands from the workload manager to the publish and subscription system, which further interfaces with the second agent executing on the second computing device, wherein the publish and subscription system forwards the second command to the second agent in response to determining that the second agent has subscribed to the second task, and wherein the second agent executes the second task on the second computing device in response to receiving the second command from the publish and subscription system;
publishing a third one of the plurality of commands from the workload manager to the publish and subscription system that interfaces with the first agent executing on the first computing device, wherein the publish and subscription system forwards the third command to the first agent in response to determining that the first agent has subscribed to the third task, and wherein the first agent executes the third task on the first computing device in response to receiving the third command from the publish and subscription system;
receiving, at the publish and subscription system, a second publication from the second agent that indicates whether the execution of the second task succeeded or failed and a third publication from the first agent that indicates whether the execution of the third task succeeded or failed; and
generating a message indicating that the execution of the job succeeded in response to the second publication received from the second agent indicating that the execution of the second task on the second computing device succeeded and the third publication received from the first agent further indicating that the execution of the third task on the first computing device succeeded.
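The sequence recited in claim 1 can be illustrated with a minimal in-memory sketch. Everything named here (`Broker`, `Agent`, `run_job`, the task names `task1` to `task3`, and the command strings) is hypothetical and not taken from the patent. The sketch also assumes synchronous delivery and agents that always succeed, and the broker records each result on the agent's behalf as a stand-in for the agent's own publication.

```python
from collections import defaultdict


class Agent:
    """Hypothetical agent bound to one computing device."""

    def __init__(self, device):
        self.device = device
        self.log = []  # (task, command) pairs this agent has executed

    def execute(self, task, command):
        self.log.append((task, command))
        return True  # assumption: every task succeeds in this sketch


class Broker:
    """Hypothetical in-memory publish and subscription system."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # task -> subscribed agents
        self.results = {}                     # task -> True/False

    def subscribe(self, agent, task):
        self.subscribers[task].append(agent)

    def unsubscribe(self, agent, task):
        self.subscribers[task].remove(agent)

    def publish_command(self, task, command):
        # Forward the command only to agents subscribed to the task.
        for agent in list(self.subscribers[task]):
            succeeded = agent.execute(task, command)
            self.publish_result(agent, task, succeeded)

    def publish_result(self, agent, task, succeeded):
        # Stand-in for the publication an agent sends after executing a task.
        self.results[task] = succeeded
        if task == "task1" and succeeded:
            # Success of the first task frees the first agent for the third.
            self.unsubscribe(agent, "task1")
            self.subscribe(agent, "task3")

    def job_message(self):
        # The job succeeds when the second and third tasks both succeeded.
        if self.results.get("task2") and self.results.get("task3"):
            return "job succeeded"
        return None


def run_job():
    broker = Broker()
    agent1, agent2 = Agent("device1"), Agent("device2")
    broker.subscribe(agent1, "task1")
    broker.subscribe(agent2, "task2")
    # After the activation trigger, the workload manager publishes commands.
    broker.publish_command("task1", "cmd1")
    broker.publish_command("task2", "cmd2")
    broker.publish_command("task3", "cmd3")
    return broker, agent1, agent2
```

Note that the code playing the workload manager role in `run_job` never calls an agent directly; it only publishes to the broker, which decides delivery from the current subscriptions.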
Abstract
The invention relates to a system and a method for tracking and executing a job comprising a series of tasks. Each task may be executed on a separate computing device. The method comprises having a workload manager to identify an initial schedule of implementation for the job; having agents to selectively control execution of the tasks; and utilizing a publish/subscription interface between the workload manager and the agents to isolate the communications of the workload manager from the agents. The workload manager and the agents each subscribe and schedule execution of and reporting of the tasks through the publish/subscription interface.
217 Citations
24 Claims
-
1. A method for job scheduling, comprising:
-
- View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A system for job scheduling, comprising:
-
a plurality of computing devices that include a first computing device and a second computing device;
a workload manager configured to:
define a schedule that includes a plurality of commands to execute a job comprising a first task, a second task, and a third task;
convert a job graph defined for the job into a series of subscriptions and publications to route the plurality of commands to execute the job; and
schedule the first task for execution on the first computing device, the second task for execution on the second computing device, and the third task for execution on the first computing device;
a scheduler configured to initiate execution of the job; and
a publish and subscription system that interfaces with a first agent executing on the first computing device and a second agent executing on the second computing device, wherein the publish and subscription system is configured to use the series of subscriptions and publications converted from the job graph to:
subscribe the first agent executing on the first computing device to the first task and the second agent executing on the second computing device to the second task;
forward a first one of the plurality of commands to the first agent subscribed to the first task in response to the workload manager publishing the first command to the publish and subscription system, wherein the first agent is configured to execute the first task on the first computing device in response to receiving the first command from the publish and subscription system and the workload manager is further configured to publish the first command to the publish and subscription system in response to the scheduler receiving an activation trigger to initiate the execution of the job;
unsubscribe the first agent from the first task in response to receiving a first publication from the first agent indicating that the execution of the first task on the first computing device succeeded;
subscribe the first agent executing on the first computing device to the third task in response to unsubscribing the first agent from the first task;
forward a second one of the plurality of commands to the second agent subscribed to the second task in response to the workload manager publishing the second command to the publish and subscription system, wherein the second agent is configured to execute the second task on the second computing device in response to receiving the second command from the publish and subscription system;
forward a third one of the plurality of commands to the first agent subscribed to the third task in response to the workload manager publishing the third command to the publish and subscription system, wherein the first agent is further configured to execute the third task on the first computing device in response to receiving the third command from the publish and subscription system;
receive a second publication from the second agent that indicates whether the execution of the second task succeeded or failed and a third publication from the first agent that indicates whether the execution of the third task succeeded or failed; and
generate a message indicating that the execution of the job succeeded in response to the second publication received from the second agent indicating that the execution of the second task on the second computing device succeeded and the third publication received from the first agent indicating that the execution of the third task on the first computing device succeeded.
- View Dependent Claims (8, 9, 10, 11, 12)
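The claim does not say how a job graph is represented or what the converted series looks like, so the following is only one possible reading. It assumes a hypothetical graph format (task name mapped to a device and an optional `after` task on the same device) and a hypothetical record shape `(action, subject, task, condition)`; the names `convert_job_graph`, `JOB_GRAPH`, and `workload_manager` are illustrative.

```python
def convert_job_graph(job_graph):
    """Turn a job graph into an ordered series of subscription and
    publication records (hypothetical formats, see the lead-in)."""
    initial, conditional, publications = [], [], []
    for task, node in job_graph.items():
        agent = f"agent@{node['device']}"
        prev = node["after"]
        if prev is None:
            # The agent can be subscribed before the job is triggered.
            initial.append(("subscribe", agent, task, None))
        else:
            # The agent moves to this task once the earlier task succeeds.
            condition = f"{prev} succeeded"
            conditional.append(("unsubscribe", agent, prev, condition))
            conditional.append(("subscribe", agent, task, condition))
        # One command publication per task, issued by the workload manager.
        publications.append(("publish", "workload_manager", task, None))
    return initial + conditional + publications


# The three-task job from the claim: tasks 1 and 3 share the first device.
JOB_GRAPH = {
    "task1": {"device": "device1", "after": None},
    "task2": {"device": "device2", "after": None},
    "task3": {"device": "device1", "after": "task1"},
}

series = convert_job_graph(JOB_GRAPH)
```

For this graph the series holds two unconditional subscriptions, an unsubscribe/subscribe pair conditioned on the first task succeeding, and three command publications.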
-
-
13. A method for job scheduling, comprising:
-
scheduling, at a workload manager, a job that includes a first task to be executed on a first computing device, a second task to be executed on a second computing device, and a third task to be executed on the first computing device, wherein scheduling the job includes converting a job graph into a series of subscriptions and publications that a publish and subscription system will use to route commands to execute the scheduled job;
subscribing a first agent executing on the first computing device to the first task and a second agent executing on the second computing device to the second task;
publishing a first command associated with the scheduled job from the workload manager to the publish and subscription system in response to a scheduler receiving an activation trigger to initiate executing the scheduled job, wherein the publish and subscription system forwards the first command to the first agent subscribed to the first task to cause the first agent to execute the first task on the first computing device;
unsubscribing the first agent from the first task in response to the publish and subscription system receiving a first publication from the first agent indicating that the first task was successfully executed on the first computing device;
subscribing the first agent executing on the first computing device to the third task in response to unsubscribing the first agent from the first task;
publishing a second command associated with the scheduled job from the workload manager to the publish and subscription system, wherein the publish and subscription system forwards the second command to the second agent subscribed to the second task to cause the second agent to execute the second task on the second computing device;
publishing a third command associated with the scheduled job from the workload manager to the publish and subscription system, wherein the publish and subscription system forwards the third command to the first agent subscribed to the third task to cause the first agent to execute the third task on the first computing device; and
generating a message indicating whether the job was successfully executed in response to receiving a second publication from the second agent that indicates whether the second task was successfully executed and a third publication from the first agent that indicates whether the third task was successfully executed, wherein the message indicates that the job was successfully executed if the second publication indicates that the second agent successfully executed the second task and the third publication indicates that the first agent successfully executed the third task.
- View Dependent Claims (14, 15, 16, 17, 18)
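The final step of claim 13 reduces to a conjunction over the second and third publications. A minimal sketch follows, assuming a hypothetical `Publication` record and message shape; the `failed_tasks` field is an illustrative addition that the claim does not recite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    """Hypothetical result publication sent by an agent."""
    agent: str
    task: str
    succeeded: bool


def job_message(second, third):
    """Build the job outcome message from the second and third publications.

    The job counts as successful only if both tasks succeeded.
    """
    failed = [p.task for p in (second, third) if not p.succeeded]
    return {"job_succeeded": not failed, "failed_tasks": failed}


ok = job_message(
    Publication("agent2", "task2", True),
    Publication("agent1", "task3", True),
)
partial = job_message(
    Publication("agent2", "task2", True),
    Publication("agent1", "task3", False),
)
```

Here `ok` reports success, while `partial` reports failure and names `task3` as the task that failed.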
-
-
19. A system for job scheduling, comprising:
-
a computer configured to run a workload manager to schedule a job using a series of subscriptions and publications converted from a job graph, wherein the scheduled job includes a first task to be executed on a first computing device, a second task to be executed on a second computing device, and a third task to be executed on the first computing device; and
a publish and subscription server configured to use the series of subscriptions and publications converted from the job graph to route commands to execute the scheduled job, wherein to use the series of subscriptions and publications to execute the scheduled job, the publish and subscription server is further configured to:
subscribe a first agent executing on the first computing device to the first task and a second agent executing on the second computing device to the second task;
forward a first command associated with the scheduled job to the first agent subscribed to the first task to cause the first agent to execute the first task on the first computing device, wherein the computer is further configured to run the workload manager to publish the first command to the publish and subscription server in response to a scheduler receiving an activation trigger to initiate executing the scheduled job;
unsubscribe the first agent from the first task in response to receiving a first publication from the first agent indicating that the first task was successfully executed on the first computing device;
subscribe the first agent executing on the first computing device to the third task in response to unsubscribing the first agent from the first task;
forward a second command associated with the scheduled job to the second agent subscribed to the second task to cause the second agent to execute the second task on the second computing device in response to the workload manager publishing the second command to the publish and subscription server;
forward a third command associated with the scheduled job to the first agent subscribed to the third task to cause the first agent to execute the third task on the first computing device in response to the workload manager publishing the third command to the publish and subscription server; and
generate a message indicating whether the job was successfully executed in response to receiving a second publication from the second agent that indicates whether the second task was successfully executed and a third publication from the first agent that indicates whether the third task was successfully executed, wherein the message indicates that the job was successfully executed if the second publication indicates that the second agent successfully executed the second task and the third publication indicates that the first agent successfully executed the third task.
- View Dependent Claims (20, 21, 22, 23, 24)
-
Specification