Systems and methods for monitoring distributed applications using diagnostic information
Abstract
Systems and methods for automated monitoring and management of distributed applications, client/server databases, networks and systems across heterogeneous environments. Distributed, automated intelligent monitoring agents use embedded sensing technology, which is knowledgeable of application protocols, to monitor the network environment continuously in real time. To this end, the monitoring agent can be located on each client and server in the network. The monitoring agent can couple to the communications stack for monitoring the data that is being passed between the client and the network, or a server in the network. The data can be collected and employed for troubleshooting, trend analysis, resource planning, security auditing, and accounting, as well as other applications. Also included is a controller for remotely coordinating the data gathering process from the various clients and servers.
44 Claims
1. A method of monitoring a distributed computer system, said method comprising:
defining a plurality of trigger events and for each one of said plurality of trigger events defining associated data to be collected;
monitoring a connection between a client and a first server;
while monitoring said connection, detecting at said client an occurrence of one of said plurality of trigger events;
as a result of detecting said occurrence, (i) collecting client data at said client and in accordance with the detected trigger event, the collected client data being the defined associated data for that detected trigger event;
(ii) notifying a controller of the occurrence of the detected trigger event and sending the client data to said controller;
(iii) notifying said first server of the occurrence of the detected trigger event;
(iv) gathering first server data by said first server; and
(v) sending the gathered first server data to said controller.
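As a reading aid, the sequence of steps recited in claim 1 can be sketched in Python as follows. This is a minimal, single-process illustration under stated assumptions, not the patented implementation: the class names (`Controller`, `Server`, `ClientAgent`), the trigger name `connect_timeout`, and the collector callables are all hypothetical, and direct method calls stand in for messages sent over a network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A collector returns the "associated data" defined for one trigger event.
Collector = Callable[[], Dict[str, object]]


@dataclass
class Controller:
    """Receives notifications and data from clients and servers."""
    reports: List[Tuple[str, str, Dict[str, object]]] = field(default_factory=list)

    def receive(self, source: str, event: str, data: Dict[str, object]) -> None:
        self.reports.append((source, event, data))


@dataclass
class Server:
    name: str
    controller: Controller

    def on_trigger(self, event: str) -> None:
        # (iv) gather first server data; (v) send it to the controller.
        data = {"open_requests": 0}  # placeholder value, for illustration only
        self.controller.receive(self.name, event, data)


@dataclass
class ClientAgent:
    name: str
    server: Server
    controller: Controller
    # Trigger event name -> collector for that event's associated data.
    triggers: Dict[str, Collector] = field(default_factory=dict)

    def define_trigger(self, event: str, collector: Collector) -> None:
        self.triggers[event] = collector

    def observe(self, event: str) -> bool:
        """Called while monitoring the connection; True if the event was handled."""
        collector = self.triggers.get(event)
        if collector is None:
            return False  # not one of the defined trigger events
        client_data = collector()                               # (i) collect client data
        self.controller.receive(self.name, event, client_data)  # (ii) notify controller
        self.server.on_trigger(event)                           # (iii) notify first server
        return True


controller = Controller()
server = Server("server-1", controller)
client = ClientAgent("client-1", server, controller)
client.define_trigger("connect_timeout", lambda: {"retries": 3})
client.observe("connect_timeout")
```

After the call to `observe`, the hypothetical controller holds two reports for the same event, one from the client and one from the server, which is the correlation the claim describes.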
gathering a client portion of said associated data by said client that can be computed by said client.
-
5. The method of claim 4, wherein said client portion includes information relevant to said connection between said client and said first server.
-
6. The method of claim 5, wherein said client portion includes resource metrics.
-
7. The method of claim 6, wherein said resource metrics include processor and memory usage information.
-
8. The method of claim 5, wherein said client portion includes network latency information relevant to said connection.
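The kinds of client-side data named in claims 6 to 8 (processor usage, memory usage, and network latency for the connection) might be gathered along the following lines. This is only a sketch: the function names are hypothetical, the `resource` module is available on Unix-like systems only, and the `round_trip` callable is an assumed stand-in for one request/response exchange on the monitored connection.

```python
import resource  # standard library, Unix-like systems only
import time
from typing import Callable, Dict


def resource_metrics() -> Dict[str, float]:
    """Processor and memory usage of the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "cpu_user_seconds": usage.ru_utime,
        "cpu_system_seconds": usage.ru_stime,
        # Peak resident set size: kilobytes on Linux, bytes on macOS.
        "max_rss": float(usage.ru_maxrss),
    }


def connection_latency(round_trip: Callable[[], object]) -> float:
    """Seconds taken by one request/response exchange on the connection."""
    start = time.perf_counter()
    round_trip()
    return time.perf_counter() - start


def client_portion(round_trip: Callable[[], object]) -> Dict[str, float]:
    """Combine resource metrics and latency into one client-side record."""
    data = resource_metrics()
    data["latency_seconds"] = connection_latency(round_trip)
    return data


# A short sleep stands in for a real network round trip.
sample = client_portion(lambda: time.sleep(0.01))
```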
-
9. The method of claim 3, further including:
sending to said controller client context information; and
determining, using said client context information, data to be obtained from said client.
-
10. The method of claim 9, further including:
said controller using data included in a data repository to determine what data is to be obtained from said client.
-
11. The method of claim 10, wherein said client context information includes a server identifier uniquely identifying said first server.
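Claims 9 to 11 describe a controller that uses client context information, including a server identifier, together with a data repository to decide what data to obtain from the client. One way to picture this is a simple lookup, sketched below. The repository contents, the `server_id` key, and the item names are all hypothetical, and a real repository would presumably be a database rather than an in-memory dictionary.

```python
from typing import Dict, List

# Hypothetical data repository: server identifier -> data items to request.
REPOSITORY: Dict[str, List[str]] = {
    "db-server-01": ["open_cursors", "last_sql_statement"],
    "web-server-07": ["last_url", "http_status"],
}

# Items requested when the repository has no entry for the server.
DEFAULT_ITEMS: List[str] = ["connection_state"]


def data_to_obtain(client_context: Dict[str, str]) -> List[str]:
    """Controller-side decision: which data items the client should supply."""
    server_id = client_context.get("server_id", "")
    # Return a copy so callers cannot mutate the repository entry.
    return list(REPOSITORY.get(server_id, DEFAULT_ITEMS))


requested = data_to_obtain({"server_id": "db-server-01", "user": "alice"})
```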
-
12. The method of claim 3, wherein said collecting data on said client includes:
specifying a machine executable program to perform said collecting data on said client; and
executing said machine executable program to collect said data on said client.
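Claim 12 recites specifying a machine executable program and executing it to collect the client data. A minimal sketch of that idea, assuming the collector program prints a JSON object on standard output, could look like this. The `run_collector` helper and the inline collector command are hypothetical; the Python interpreter is used as the collector only so that the example is self-contained.

```python
import json
import subprocess
import sys
from typing import Dict, List


def run_collector(command: List[str], timeout: float = 10.0) -> Dict[str, object]:
    """Execute the specified program and parse its JSON output as client data."""
    completed = subprocess.run(
        command,
        capture_output=True,  # capture stdout and stderr
        text=True,            # decode output as text
        timeout=timeout,      # do not wait forever on a hung collector
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


# The "specified" program: here, a one-line script that reports its process ID.
collector_command = [
    sys.executable,
    "-c",
    "import json, os; print(json.dumps({'pid': os.getpid()}))",
]
collected = run_collector(collector_command)
```

Passing the command as a list rather than a shell string avoids shell interpretation of the arguments.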
-
13. The method of claim 3 further comprising:
notifying one or more other servers of said occurrence of said trigger event;
gathering other server data by said one or more other servers; and
sending said other server data to said controller.
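The fan-out in claim 13, where one or more other servers are notified and each gathers its own data, can be illustrated with a small thread pool. This is a sketch under assumptions: each server is modeled as a hypothetical callable that takes the event name and returns its data, and the returned dictionary represents what would then be sent to the controller.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A gatherer models one server: given the event name, it returns that server's data.
Gatherer = Callable[[str], Dict[str, object]]


def notify_other_servers(
    event: str, servers: Dict[str, Gatherer]
) -> Dict[str, Dict[str, object]]:
    """Notify every server of the event in parallel and collect their data."""
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        futures = {name: pool.submit(gather, event) for name, gather in servers.items()}
        # result() blocks until that server's gatherer has finished.
        return {name: future.result() for name, future in futures.items()}


other_servers: Dict[str, Gatherer] = {
    "server-2": lambda event: {"event": event, "queue_depth": 4},
    "server-3": lambda event: {"event": event, "queue_depth": 0},
}
other_server_data = notify_other_servers("connect_timeout", other_servers)
```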
-
14. The method of claim 3 further comprising:
sending said client data to said controller.
-
15. The method of claim 3, wherein said first server data includes information about current open transactions being processed by said first server.
-
16. The method of claim 3, wherein said first server data includes usage information about said first server.
-
17. The method of claim 3 further comprising:
gathering, by said controller, information about the trigger event that caused an exception.
-
18. The method of claim 3, wherein said client and said first server are each associated with an agent process that gathers data.
-
19. The method of claim 3, wherein said controller is a coordinator for gathering client and first server data.
-
20. The method of claim 1, wherein said distributed computer system includes one or more clients and one or more servers, each of said one or more clients and said one or more servers being associated with a different computer processor and being a process executing on said computer processor.
-
21. The method of claim 20, wherein said controller is a process executing on a dedicated computer processor.
-
22. The method of claim 1, wherein said distributed computer system includes one or more clients and one or more servers, and a computer processor is associated with at least two clients, each of said two clients being a process executing on said computer processor.
-
23. A system for monitoring a distributed computer system, the monitoring system comprising:
machine executable code for defining a plurality of trigger events and for each one of said plurality of trigger events defining associated data to be collected;
machine executable code for monitoring a connection between a client and a first server;
machine executable code for detecting at said client and while monitoring said connection an occurrence of one of said plurality of trigger events;
machine executable code for collecting, as a result of detecting said occurrence, client data at said client and in accordance with the detected trigger event, the collected client data being the defined associated data for that detected trigger event;
machine executable code for notifying, as a result of detecting said occurrence, a controller of the occurrence of the detected trigger event, and sending the client data to said controller;
machine executable code for notifying, as a result of detecting said occurrence, said first server of the occurrence of the detected trigger event;
machine executable code for gathering, as a result of detecting said occurrence, first server data by said first server; and
machine executable code for sending, as a result of detecting said occurrence, the gathered first server data to said controller.
machine executable code for gathering a client portion of said associated data by said client that can be computed by said client.
-
25. The system of claim 24, wherein said client portion includes information relevant to said connection between said client and said first server.
-
26. The system of claim 25, wherein said client portion includes resource metrics.
-
27. The system of claim 26, wherein said resource metrics include processor and memory usage information.
-
28. The system of claim 25, wherein said client portion includes network latency information relevant to said connection.
-
29. The system of claim 23, further including:
machine executable code for sending to said controller client context information; and
machine executable code for determining, using said client context information, data to be obtained from said client.
-
30. The system of claim 29, further including:
said controller including machine executable code for accessing data included in a data repository to determine what data is to be obtained from said client.
-
31. The system of claim 30 wherein said client context information includes a server identifier uniquely identifying said first server.
-
32. The system of claim 23, wherein said distributed computer system includes one or more clients and one or more servers, each of said one or more clients and said one or more servers being associated with a different computer processor and being a process executing on said computer processor.
-
33. The system of claim 32, wherein said controller is a process executing on a dedicated computer processor.
-
34. The system of claim 23 further comprising:
machine executable code for sending said client data to said controller.
-
35. The system of claim 23, wherein said first server data includes information about current open transactions being processed by said first server.
-
36. The system of claim 23, wherein said first server data includes information about requests being serviced by said first server.
-
37. The system of claim 23, wherein said first server data includes usage information about said first server.
-
38. The system of claim 23 further comprising:
machine executable code for gathering, by said controller, information about the trigger event that caused the exception.
-
39. The system of claim 23, wherein said machine executable code for collecting client data performs remote data gathering by said controller.
-
40. The system of claim 23, wherein said machine executable code for collecting data on said client includes:
machine executable code for specifying a machine executable program to perform said collecting data on said client.
-
41. The system of claim 23 further comprising:
machine executable code for notifying one or more other servers of said occurrence of said trigger event;
machine executable code for gathering other server data by said one or more other servers; and
machine executable code for sending said other server data to said controller.
-
42. The system of claim 23, wherein said distributed computer system includes one or more clients and one or more servers, and a computer processor is associated with at least two clients, each of said two clients being a process executing on said computer processor.
-
43. The system of claim 23, wherein said client and said first server are each associated with an agent process that gathers data.
-
44. The system of claim 23, wherein said controller is a coordinator for gathering client and first server data.
Specification