System for monitoring and managing computer resources and applications across a distributed computing environment using an intelligent autonomous agent architecture

US 5,655,081 A
Filed: 03/08/1995
Issued: 08/05/1997
Est. Priority Date: 03/08/1995
Status: Expired due to Term


Abstract

A method and apparatus are disclosed for monitoring and managing the applications and resources on a distributed computer network. Preferably, at least one manager software system runs on at least one of the networked computer systems designated as a network management computer system or "console" system. An agent software system runs on each of the server computer systems in the network to be monitored. Each respective agent software system carries out tasks on the computer system in which it is installed such as discovering which resources and applications are present on the computer system, monitoring particular aspects of the resources and applications present on the computer system, and executing recovery actions automatically when such actions are warranted. The agents are capable of intelligent, autonomous operation. Knowledge modules are stored in a non-volatile storage device at the site of each agent software system and are loaded and unloaded into server memory dynamically as consoles register and de-register with the agents. Consoles may register to receive all information from the agents or only selected information. An event management procedure is disclosed for coordinating event management between the various consoles throughout the network.
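
The selective registration mentioned in the abstract (a console receives either all information or only selected information) can be illustrated with a minimal Python sketch. The `Event` and `Registration` classes, their field names, the `route` helper, and the sample console and event names are all hypothetical; the patent does not prescribe any particular data layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    app_type: str   # hypothetical label, e.g. "database"
    instance: str   # hypothetical label, e.g. "db01"
    name: str       # hypothetical label, e.g. "disk_full"


@dataclass(frozen=True)
class Registration:
    """What a console asked to hear about; None in a field means 'anything'."""
    app_type: Optional[str] = None
    instance: Optional[str] = None
    event_name: Optional[str] = None

    def wants(self, event: Event) -> bool:
        # Each field either is unset (None) or must equal the event's value.
        return (
            self.app_type in (None, event.app_type)
            and self.instance in (None, event.instance)
            and self.event_name in (None, event.name)
        )


def route(event, registrations):
    """Return the names of consoles whose registration matches the event."""
    return sorted(c for c, reg in registrations.items() if reg.wants(event))


registrations = {
    "console_a": Registration(),                     # all information
    "console_b": Registration(app_type="database"),  # one application type
    "console_c": Registration("database", "db02", "disk_full"),
}
recipients = route(Event("database", "db01", "disk_full"), registrations)
```

Here `recipients` contains `console_a` and `console_b`; `console_c` is skipped because it registered for a different instance.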

1013 Citations

27 Claims

1. A method for monitoring and managing computer system resources and applications in a computer network utilizing at least one console system and at least one agent system, said at least one console and at least one agent systems each comprising a random access memory and a non-volatile data storage device, the method comprising the steps of:
- (a) storing, in the non-volatile data storage device, a plurality of data sets corresponding to information for monitoring and managing a plurality of resources and applications;
  
  (b) transmitting a first request from the at least one console system to the at least one agent system, said first request specifying a first resource or application for the at least one agent system to monitor or manage;
  
  (c) determining whether a first data set corresponding to information for monitoring or managing said first resource or application already exists in the random access memory of the at least one agent system;
  
  (d) if the outcome of step (c) indicates that said first data set does not exist in the random access memory of the at least one agent system, loading said first data set from the non-volatile data storage device into the random access memory of the at least one agent system;
  
  (e) gathering information about said first resource or application responsive to the information contained in said first data set;
  
  (f) determining, responsive to a stored threshold and to information gathered in step (e), whether an event has occurred and, if so, what type of event;
  
  (g) transmitting a plurality of messages, from the at least one agent system to the at least one console system, said plurality of messages containing information about said first resource or application;
  
  (h) transmitting a second request from the at least one console system to the at least one agent system, said second request specifying that the at least one console system should not receive information about said first resource or application;
  
  (i) determining whether other of the at least one console systems should receive information about said first resource or application;
  
  (j) if the outcome of step (i) indicates that no other of the at least one console systems should receive information about said first resource or application, unloading said first data set from the random access memory of the at least one agent system.
- - 2. The method of claim 1 wherein said application portion of said first request includes information identifying a type of computer application.
  - 3. The method of claim 2 wherein said first request further includes information identifying an instance within the class of instances defined by said type of computer application.
  - 4. The method of claim 3 wherein said first request further includes information identifying a first event pertinent to said instance.
  - 5. The method of claim 4, further including the step of excluding, from said plurality of messages, any information about events that are detected in step (f) but that do not correspond to said first event.
  - 6. The method of claim 1, further including the step of storing, in the non-volatile data storage device of the at least one agent system, a first record of information gathered in step (e) and events detected in step (f).
  - 7. The method of claim 6 wherein, for cases in which said plurality of messages contains information indicating that said first event has occurred and in which the at least one console system wishes to communicate to other of the at least one console systems that the at least one console system has learned of the occurrence of said first event, the method further including the steps of:
    - transmitting a third request, from the at least one console system to the at least one agent system, said third request containing an acknowledgment that said first event has occurred;
      
      storing a second record of said acknowledgment that said first event has occurred in the non-volatile data storage device of the at least one agent system;
      
      determining whether other of the at least one console systems in the network should receive notification of said acknowledgment that said first event has occurred; and
      
      if so, transmitting a fourth request to at least one other console system in the network, said fourth request containing information about said acknowledgment that said first event has occurred.
  - 8. The method of claim 1 wherein the at least one console system runs on a first computer system and the at least one agent system runs on a second computer system.
  - 9. The method of claim 1 wherein the at least one agent system executes recovery actions responsive to events detected in step (f), said recovery actions specified by information contained in said first data set.
  - 10. The method of claim 1 wherein said first data set contains information specifying computer script programs for discovering, monitoring or managing said first resource or application.
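
Steps (a) through (j) of claim 1 can be read as loading a data set on first registration and unloading it when the last interested console de-registers. The Python sketch below is one possible reading, not the patented implementation: the `Agent` class, the `classify` helper, the dictionary standing in for the non-volatile store, the `warn` and `alarm` threshold keys, and the event type names are all assumptions made for illustration.

```python
def classify(reading, data_set):
    """Step (f): compare a reading with stored thresholds; None means no event."""
    if reading >= data_set["alarm"]:
        return "alarm"
    if reading >= data_set["warn"]:
        return "warning"
    return None


class Agent:
    def __init__(self, disk):
        self.disk = disk      # stands in for the non-volatile store, step (a)
        self.ram = {}         # data sets currently loaded
        self.consoles = {}    # resource -> consoles registered for it
        self.loads = 0        # counts loads from disk, for illustration only

    def register(self, console, resource):
        """Steps (b)-(d): load the data set only if it is not already in RAM."""
        if resource not in self.ram:
            self.ram[resource] = self.disk[resource]
            self.loads += 1
        self.consoles.setdefault(resource, set()).add(console)

    def poll(self, resource):
        """Steps (e)-(g): gather, classify, and address a message to each console."""
        data_set = self.ram[resource]
        reading = data_set["collect"]()
        event = classify(reading, data_set)
        return {c: (resource, reading, event) for c in self.consoles[resource]}

    def deregister(self, console, resource):
        """Steps (h)-(j): unload once no console is still registered."""
        remaining = self.consoles.get(resource, set())
        remaining.discard(console)
        if not remaining:
            self.consoles.pop(resource, None)
            self.ram.pop(resource, None)


# Hypothetical data set: a collector callable plus two thresholds.
disk = {"cpu": {"collect": lambda: 97, "warn": 80, "alarm": 95}}
agent = Agent(disk)
agent.register("console_1", "cpu")
agent.register("console_2", "cpu")    # already in RAM, so no second load
messages = agent.poll("cpu")
agent.deregister("console_1", "cpu")  # console_2 is still registered
still_loaded = "cpu" in agent.ram
agent.deregister("console_2", "cpu")  # last console gone, data set unloaded
```

Keeping a set of registered consoles per resource is what lets step (i) be answered with a simple emptiness check before step (j) unloads the data set.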

11. A method for monitoring and managing computer system resources and applications utilizing at least one agent system, said agent system comprising a random access memory and a non-volatile data storage device, the method comprising the steps of:
- a) storing, in the non-volatile data storage device, a plurality of data sets corresponding to information for monitoring and managing a plurality of computer resources and applications;
  
  b) storing, in the non-volatile data storage device, information indicating which computer resources or applications are to be monitored or managed by the at least one agent system;
  
  c) reading said information indicating which of said computer resources or applications are to be monitored or managed by the at least one agent system;
  
  d) responsive to information read in step (c), loading, into the random access memory, a first data set corresponding to a first computer resource or application to be monitored or managed by the at least one agent system;
  
  e) gathering information about said first computer resource or application responsive to information contained in said first data set;
  
  f) determining, responsive to a stored threshold and to information gathered in step (e), whether an event has occurred and, if so, what type of event; and
  
  g) storing, in the random access memory, information gathered in step (e) or information corresponding to events detected in step (f).
- - 12. The method of claim 11 further including the steps of:
    - (h) responsive to information read in step (c), determining whether a second data set corresponding to information for monitoring and managing a second computer resource or application already exists in the random access memory of the at least one agent system;
      
      (i) if the outcome of step (h) indicates that said second data set does not exist in the random access memory of the at least one agent system, loading said second data set from the non-volatile data storage device into the random access memory of the at least one agent system;
      
      (j) gathering information about said second computer resource or application responsive to information contained in said second data set;
      
      (k) determining, responsive to a stored threshold and to information gathered in step (j), whether an event has occurred, and if so, what type of event; and
      
      (l) storing, in the random access memory, information gathered in step (j) or information corresponding to events detected in step (k).
  - 13. The method of claim 11 wherein a plurality of data sets corresponding to a plurality of computer resources or applications to be monitored or managed by the at least one agent system are loaded into the random access memory of the at least one agent system.
  - 14. The method of claim 11 wherein said first data set contains information specifying computer script programs for discovering, monitoring or managing said first computer resource or application.
  - 15. The method of claim 11 further including the step of storing, in the non-volatile data storage device, information gathered in step (e) or information corresponding to events detected in step (f).
  - 16. The method of claim 11 further including the step of executing recovery actions responsive to events detected in step (f), said recovery actions specified by information contained in said first data set.
  - 17. The method of claim 11 wherein information about said first computer resource or application is gathered automatically according to a predetermined time schedule.
  - 18. The method of claim 11 further including at least one console system, said console system including a random access memory and a non-volatile data storage device, the method further comprising the steps of:
    - h) transmitting a first request from the at least one console system to the at least one agent system, said first request specifying a second computer resource or application for the at least one agent system to monitor or manage;
      
      i) determining whether a second data set corresponding to information for monitoring or managing said second computer resource or application already exists in the random access memory of the at least one agent system;
      
      j) if the outcome of step (i) indicates that said second data set does not exist in the random access memory of the at least one agent system, loading said second data set from the non-volatile data storage device of the at least one agent system into the random access memory of the at least one agent system;
      
      k) gathering information about said second computer resource or application responsive to information contained in said second data set;
      
      l) determining, responsive to a stored threshold and to information gathered in step (k), whether an event has occurred and, if so, what type of event;
      
      m) transmitting a plurality of messages, from the at least one agent system to the at least one console system, said plurality of messages containing information about said second computer resource or application;
      
      n) transmitting a second request from the at least one console system to the at least one agent system, said second request specifying that the at least one console system should not receive information about said second computer resource or application;
      
      o) determining whether other of the at least one console systems should receive information about said second computer resource or application; and
      
      p) if the outcome of step (o) indicates that no other of the at least one console systems should receive information about said second computer resource or application, unloading said second data set from the random access memory of the at least one agent system.
  - 19. The method of claim 18 further including the step of executing recovery actions responsive to events detected in step (l), said recovery actions specified by information contained in said data sets.
  - 20. The method of claim 11 further including at least one console system, said console system including a random access memory and a non-volatile data storage device, the method further including the step of transmitting information gathered about said first computer resource or application to the at least one console system.
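
Claim 11 differs from claim 1 in that the agent decides what to monitor from information in its own non-volatile storage rather than from a console request. A minimal Python sketch of that flow follows, assuming JSON files in a temporary directory as the store; the file names `monitored.json` and `<name>.json`, the `threshold` key, the single `threshold_exceeded` event type, and the `run_cycle` function are hypothetical choices, and the supplied `readings` dictionary stands in for a real collector.

```python
import json
import tempfile
from pathlib import Path


def run_cycle(store, readings, ram, log):
    """One monitoring pass driven only by files in the agent's own store."""
    targets = json.loads((store / "monitored.json").read_text())    # step (c)
    for name in targets:
        if name not in ram:                     # claim 12: load only if absent
            ram[name] = json.loads((store / f"{name}.json").read_text())  # step (d)
        value = readings[name]                  # step (e), stand-in collector
        # Step (f), simplified to a single event type for this sketch.
        event = "threshold_exceeded" if value > ram[name]["threshold"] else None
        log.append((name, value, event))        # step (g): keep results in RAM
    return log


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    (store / "monitored.json").write_text(json.dumps(["disk", "cpu"]))  # step (b)
    (store / "disk.json").write_text(json.dumps({"threshold": 90}))     # step (a)
    (store / "cpu.json").write_text(json.dumps({"threshold": 75}))
    ram, log = {}, []
    run_cycle(store, {"disk": 93, "cpu": 40}, ram, log)
    run_cycle(store, {"disk": 60, "cpu": 80}, ram, log)
```

Calling `run_cycle` from a timer would approximate the scheduled gathering of claim 17; the second call above reuses the data sets already held in `ram` instead of reading them again.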

21. A method for monitoring and managing computer system resources and applications utilizing at least one agent system, at least one intermediate agent system, and at least one console system, where the at least one agent system, intermediate agent system and console system each comprise a random access memory and a non-volatile data storage device, the method comprising the steps of:
- (a) registering the at least one intermediate agent system with the at least one agent system, said registration specifying resources and applications for the at least one agent system to monitor or manage;
  
  (b) registering the at least one console system with the at least one intermediate agent system, said registration specifying resources and applications for the at least one intermediate agent system to monitor or manage;
  
  (c) gathering, by the at least one agent system, information about said resources and applications monitored or managed by the at least one agent system;
  
  (d) determining by the at least one agent system, responsive to registration information received from the at least one intermediate agent system, whether the at least one intermediate agent system should receive information about said resources and applications monitored or managed by the at least one agent system;
  
  (e) responsive to the outcome of step (d), transmitting a plurality of messages from the at least one agent system to the at least one intermediate agent system, said plurality of messages containing information about said resources and applications monitored or managed by the at least one agent system;
  
  (f) determining by the at least one intermediate agent system, responsive to registration information received from the at least one console system, whether the at least one console system should receive information about said resources and applications monitored or managed by the at least one intermediate agent system;
  
  (g) responsive to the outcome of step (f), transmitting a plurality of messages from the at least one intermediate agent system to the at least one console system, said plurality of messages containing information about said resources and applications monitored or managed by the at least one intermediate agent system.
- - 22. The method of claim 21 further including a plurality of intermediate agent systems, the method further including the steps of:
    - (h) registering other of the plurality of intermediate agent systems with the at least one intermediate agent system, said registration specifying resources and applications for the at least one intermediate agent system to monitor or manage;
      
      (i) determining by the at least one intermediate agent system, whether other of the plurality of intermediate agent systems should receive information about resources and applications monitored or managed by the at least one intermediate agent system;
      
      (j) responsive to the outcome of step (i), transmitting a plurality of messages from the at least one intermediate agent system to other of the plurality of intermediate agent systems, said plurality of messages containing information about resources and applications monitored or managed by the at least one intermediate agent system.
  - 23. The method of claim 21 further including the steps of:
    - (h) registering the at least one console system with the at least one agent system, said registration specifying resources and applications for the at least one agent system to monitor or manage;
      
      (i) determining by the at least one agent system, whether the at least one console system should receive information about resources and applications monitored or managed by the at least one agent system;
      
      (j) responsive to the outcome of step (i), transmitting a plurality of messages from the at least one agent system to the at least one console system, said plurality of messages containing information about resources and applications monitored or managed by the at least one agent system.
  - 24. The method of claim 21 further including the steps of:
    - (h) de-registering the at least one console system from the at least one intermediate agent system;
      
      (i) de-registering the at least one intermediate agent system from the at least one agent system.
  - 25. The method of claim 21 wherein at least one intermediate agent system is used to interface between a plurality of tiers of a computer network including agent systems, intermediate agent systems or console systems.
  - 26. The method of claim 21 wherein the at least one console system runs on a first computer system, the at least one intermediate agent system runs on a second computer system, and the at least one agent system runs on a third computer system.
  - 27. The method of claim 21 wherein the registration information contains information specifying computer script programs for discovering, monitoring or managing said resources or applications.
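
The tiered registration of claim 21 can be sketched as a chain of nodes, each forwarding information only to downstream nodes that registered for it. In the Python sketch below, the `Node` class, its method names, and the resource names are hypothetical; a single class plays the agent, intermediate agent, and console roles purely to keep the example short.

```python
class Node:
    """An agent, intermediate agent, or console in a tiered network."""

    def __init__(self, name):
        self.name = name
        self.registrations = {}  # downstream node -> resources it asked for
        self.inbox = []          # information this node has received

    def register(self, subscriber, resources):
        """Steps (a)-(b): a downstream node states what it wants to receive."""
        self.registrations[subscriber] = set(resources)

    def deregister(self, subscriber):
        """Claim 24: stop sending information to a downstream node."""
        self.registrations.pop(subscriber, None)

    def publish(self, resource, info):
        """Steps (d)-(e) and (f)-(g): send only to nodes registered for it."""
        for subscriber, wanted in self.registrations.items():
            if resource in wanted:
                subscriber.receive(resource, info)

    def receive(self, resource, info):
        self.inbox.append((resource, info))
        self.publish(resource, info)  # relay to the next tier, if any


agent = Node("agent")
intermediate = Node("intermediate")
console = Node("console")

agent.register(intermediate, ["cpu", "disk"])  # step (a)
intermediate.register(console, ["cpu"])        # step (b)

agent.publish("cpu", 91)    # reaches the intermediate agent and the console
agent.publish("disk", 70)   # stops at the intermediate agent
agent.publish("mem", 5)     # nobody registered for it

agent.deregister(intermediate)
agent.publish("cpu", 99)    # no longer delivered anywhere
```

Because every tier applies the same registration check, additional intermediate agents (claim 22) can be chained without changing the agent or console logic.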

Current Assignee: BMC Software Incorporated (KKR & Co., Inc.)
Original Assignee: BMC Software Incorporated (KKR & Co., Inc.)
Inventors: Tatarinov, Kirill L.; Picard, Martin W.; Bonnell, David N.
Primary Examiner(s): Pan, Daniel H.
Application Number: US08/400,850
Time in Patent Office: 881 Days
Field of Search: 395/800, 395/187.01, 395/700, 395/650, 395/182.02, 395/115, 395/146, 395/166, 395/150, 395/200.01, 395/500, 395/733, 395/200.03, 395/292, 395/200.06, 395/200.15, 395/200.16, 364/DIG. 1, 364/DIG. 2, 340/825.31, 370/95.1, 370/54
US Class Current: 709/202
CPC Class Codes:
- G06F 11/0748: in a remote unit communicat...
- G06F 11/0781: Error filtering or prioriti...
- G06F 11/2294: by remote test
- G06F 11/3006: where the computing system ...
- G06F 11/3051: Monitoring arrangements for...
- G06F 11/3093: Configuration details there...
- G06F 11/327: Alarm or error message display
- G06F 11/3409: for performance assessment
- G06F 11/3495: for systems
- G06F 2201/81: Threshold
- G06F 2201/86: Event-based monitoring
- G06F 2201/885: Monitoring specific for caches
- H04L 43/00: Arrangements for monitoring...
- H04L 43/16: Threshold monitoring
