Debugger launch and attach on compute clusters
Abstract
Launching a debugging process. A method includes, at a compute node on a cluster private network, receiving a debug job via a scheduler of a head node from a client on a public network. The head node is connected to both the cluster private network and the public network. The public network is external to the cluster private network. The method further includes beginning processing the debug job and, as a result, initiating debugging by starting one or more debugger remote agents at the compute node. The method further includes beginning processing a user job in the presence of the started debugger remote agents at the compute node. The client is informed that the one or more debugger remote agents are ready to debug the user job. A debugger client at the client is connected to the one or more debugger remote agents.
20 Claims
1. In a computing environment comprising a cluster computing system, a method of launching a debugging process, the method comprising:
at a compute node on a cluster private network, receiving a debug job via a scheduler of a head node, the debug job originating from a client on a public network, wherein the head node is connected to both the cluster private network and the public network, and wherein the public network is external to the cluster private network;
beginning processing the debug job at the compute node, and as a result, initiating debugging by starting a debugger remote agent at the compute node, including the debugger remote agent opening a network port and listening on the network port for debugger connection requests originating from a debugger client at the client;
beginning processing a user job at the compute node in the presence of the started debugger remote agent at the compute node;
informing the client that the debugger remote agent is ready to debug the user job; and
as a result of informing the client, the debugger remote agent at the compute node receiving a debugger connection request at the network port, and connecting the debugger client at the client to the debugger remote agent.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
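The listen, inform, and connect sequence recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `DEBUG-CONNECT` and `CONNECTED` messages, and the use of a local queue to stand in for "informing the client" are all assumptions made for illustration, and everything runs on the loopback interface rather than across a head node.

```python
import queue
import socket
import threading


def run_remote_agent(ready: queue.Queue) -> None:
    """Hypothetical remote agent: open a port, listen, announce readiness, serve one debugger."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))        # port 0 asks the OS for any free port
        server.listen(1)
        ready.put(server.getsockname()[1])   # stands in for "informing the client"
        conn, _ = server.accept()            # wait for the debugger connection request
        with conn:
            request = conn.recv(1024)
            # Illustrative handshake only; a real debugger protocol is far richer.
            conn.sendall(b"CONNECTED" if request == b"DEBUG-CONNECT" else b"REJECTED")


def connect_debugger_client(port: int) -> bytes:
    """Hypothetical debugger client: connect only after being told the agent is ready."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall(b"DEBUG-CONNECT")
        return sock.recv(1024)


def launch_and_connect() -> bytes:
    ready: queue.Queue = queue.Queue()
    agent = threading.Thread(target=run_remote_agent, args=(ready,), daemon=True)
    agent.start()                            # "starting a debugger remote agent"
    port = ready.get(timeout=5)              # client learns the agent is ready
    reply = connect_debugger_client(port)
    agent.join(timeout=5)
    return reply
```

The ordering is the point of the sketch: the client does not attempt a connection until the agent has opened its port and reported readiness.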
10. In a computing environment comprising a cluster computing system, a method of attaching a debugging process, the method comprising:
at a compute node on a cluster private network that is executing a user job, receiving a message to start a debug job from a client on a public network, via a head node that is connected to both the cluster private network and the public network, and wherein the public network is external to the cluster private network;
beginning processing the debug job at the compute node, and as a result, initiating debugging of the user job, by starting a debugger remote agent at the compute node, including the debugger remote agent opening a network port and listening on the network port for debugger connection requests originating from a debugger client at the client;
informing the client that the debugger remote agent is ready to debug the user job; and
as a result of informing the client, the debugger remote agent at the compute node receiving a debugger connection request at the network port, and connecting the debugger client at the client to the debugger remote agent.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
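The attach case in claim 10 differs from the launch case in that the user job is already executing when the debug start message arrives. The following in-memory Python model is a sketch of that ordering only. The class names, method names, job identifiers, and the starting port number 13000 are hypothetical, and no real networking or process attachment is performed.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Hypothetical compute node that tracks running user jobs and debugger agents."""
    running_jobs: set = field(default_factory=set)
    agents: dict = field(default_factory=dict)   # job id -> port the agent listens on
    next_port: int = 13000                       # arbitrary illustrative port number

    def start_user_job(self, job_id: str) -> None:
        self.running_jobs.add(job_id)

    def handle_debug_start(self, job_id: str) -> int:
        """Start an agent for a job that is already running and return its port."""
        if job_id not in self.running_jobs:
            raise LookupError(f"no running user job {job_id!r} to attach to")
        port = self.next_port
        self.next_port += 1
        self.agents[job_id] = port
        return port


@dataclass
class HeadNode:
    """Hypothetical head node bridging the public network and the private cluster network."""
    node: ComputeNode

    def forward_debug_start(self, job_id: str) -> int:
        # The client never reaches the compute node directly; the head node relays.
        return self.node.handle_debug_start(job_id)
```

For example, after `node.start_user_job("job-7")`, a call to `HeadNode(node).forward_debug_start("job-7")` returns the port the client would be told to connect to, while a job that is not running raises `LookupError`.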
19. A computer system, comprising:
a client computer system that implements a developer application, a head node computer system that implements a scheduler and a directory service, and a cluster including a set of compute nodes, wherein the developer application at the client computer system, the scheduler and the directory service at the head node computer system, and the cluster are configured to implement a method of launching a debugging process using NAT forwarding, the method including the following;
at the developer application, receiving user input requesting that a job be scheduled for execution on the cluster;
at the developer application, sending the job to the scheduler;
at the scheduler, queuing the job for execution on the cluster;
at the scheduler, assigning the set of compute nodes to the job causing the job to start running on each compute node;
wherein a first task in the job is a debug start task, executing the debug start task causing a remote agent process to be created at each compute node in the set of compute nodes;
each remote agent opening a first port and listening on the first port for debugger connections from a debugger at the developer application;
registering the first port on each compute agent with the directory service;
at the developer application polling the directory service for all the ports registered for the job until the developer application receives one mapped port for each remote agent, wherein the directory service creates port mappings via NAT as needed to fulfill poll requests;
the directory service periodically polling the scheduler to verify that the job has not terminated;
for each registered port registered at the directory service, the developer application connecting to the remote agent on a corresponding compute node and creating and debugging a user process with messages continuing back and forth between the developer application and the remote agent until the debugging session is complete; and
the directory service discovering that the debugging session is complete and deleting all forwarding ports.
Dependent claims: 20
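The directory service behavior in claim 19 (registering agent ports, creating mappings on demand when polled, and deleting them when the session ends) can be modeled with a small Python sketch. It is a bookkeeping model under stated assumptions, not an implementation of NAT: the class `DirectoryService`, the helper `wait_for_agents`, the public port range starting at 50000, and the node and job names are all hypothetical.

```python
import itertools


class DirectoryService:
    """Hypothetical directory service that maps private agent ports to public ports."""

    def __init__(self, first_public_port: int = 50000):
        self._registered = {}   # job id -> list of (node, private port)
        self._mappings = {}     # (job id, node, private port) -> public port
        self._public_ports = itertools.count(first_public_port)

    def register(self, job_id: str, node: str, private_port: int) -> None:
        """Called on behalf of a remote agent once it is listening."""
        self._registered.setdefault(job_id, []).append((node, private_port))

    def poll(self, job_id: str) -> list:
        """Return mapped public ports for the job, creating mappings as needed."""
        mapped = []
        for node, private_port in self._registered.get(job_id, []):
            key = (job_id, node, private_port)
            if key not in self._mappings:
                self._mappings[key] = next(self._public_ports)
            mapped.append(self._mappings[key])
        return mapped

    def end_session(self, job_id: str) -> None:
        """Delete every forwarding entry for a finished debugging session."""
        self._registered.pop(job_id, None)
        for key in [k for k in self._mappings if k[0] == job_id]:
            del self._mappings[key]


def wait_for_agents(directory: DirectoryService, job_id: str,
                    expected: int, max_polls: int = 10) -> list:
    """Poll until one mapped port per remote agent is available."""
    for _ in range(max_polls):
        ports = directory.poll(job_id)
        if len(ports) == expected:
            return ports
    raise TimeoutError(f"only some agents registered for job {job_id!r}")
```

In this model the developer application would call `wait_for_agents` with the number of compute nodes assigned to the job, connect to each returned port, and rely on `end_session` to clean up once debugging is complete.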
Specification