Problem determination method for local area network systems
First Claim
1. A limited multi-fault method of managing error events in a local area network (LAN) having one or more LAN servers, a plurality of LAN requesters, a LAN EXPERT server and a plurality of LAN EXPERT agents, the LAN EXPERT server being connected to said LAN and including an inference engine, a knowledge base containing relationships between possible causes and error messages, and a user interface for reporting problems and interacting with a user, the LAN EXPERT agents being installed on LAN servers and LAN requesters to monitor a status of the LAN servers and LAN requesters, said method comprising the steps of:
receiving by the LAN EXPERT agents error messages issued by said LAN servers and LAN requesters on which they are installed and sending the error messages to the LAN EXPERT server;
receiving by the LAN EXPERT server error messages sent by all LAN EXPERT agents, a received error message being an event to be diagnosed by the inference engine of the LAN EXPERT server;
forming by the inference engine of the LAN EXPERT server an event cluster for a received error message, wherein a cluster is a data structure that holds partial diagnostic results containing correlated events and possible causes and wherein both events and causes have associated variables;
accessing by the inference engine the knowledge base of the LAN EXPERT server to retrieve all related causes for an event corresponding to a received error message as defined in the knowledge base, wherein variables in causes can be instantiated by the event;
comparing by the inference engine of the LAN EXPERT server subsequent error messages with each cluster to determine whether subsequent events should join a cluster or not;
joining by the inference engine of the LAN EXPERT server a subsequent event to a cluster if a mathematical intersection of causes of the subsequent event and causes of the cluster is not empty, otherwise, forming by the inference engine a new event cluster for the subsequent event, whereby as more and more events are joined in a cluster by the inference engine, a number of causes decreases and variables are instantiated so that when a cluster contains only one fully instantiated cause, a diagnostic conclusion is reached; and
reporting by the user interface of the LAN EXPERT server diagnostic information generated by the inference engine.
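The clustering steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the knowledge base contents, the event and cause names (`E1`, `server_down`, and so on), and the `Cluster` and `diagnose` names are all hypothetical, and variables on events and causes are left out, so "one remaining cause" stands in for "one fully instantiated cause".

```python
from dataclasses import dataclass, field

# Hypothetical knowledge base: error message ID -> set of possible causes.
# The IDs and cause names are invented for illustration only.
KNOWLEDGE_BASE = {
    "E1": {"cable_fault", "adapter_failure", "server_down"},
    "E2": {"adapter_failure", "server_down"},
    "E3": {"server_down"},
    "E4": {"disk_full"},
}


@dataclass
class Cluster:
    """Partial diagnostic result: correlated events plus remaining causes."""
    events: list = field(default_factory=list)
    causes: set = field(default_factory=set)

    def concluded(self) -> bool:
        # Simplification: variables are ignored, so a single remaining
        # cause is treated as the diagnostic conclusion.
        return len(self.causes) == 1


def diagnose(events, kb=KNOWLEDGE_BASE):
    """Assign each event to the first cluster whose causes intersect its own."""
    clusters = []
    for event in events:
        causes = kb.get(event, set())  # retrieve all related causes
        for cluster in clusters:
            common = cluster.causes & causes
            if common:  # non-empty intersection: join and narrow the causes
                cluster.events.append(event)
                cluster.causes = common
                break
        else:  # no cluster shares a cause: form a new cluster
            clusters.append(Cluster([event], set(causes)))
    return clusters


if __name__ == "__main__":
    for c in diagnose(["E1", "E4", "E2", "E3"]):
        print(c.events, sorted(c.causes), c.concluded())
```

In this sketch an event joins the first matching cluster; the claim does not say how ties between several matching clusters are resolved, so that choice is an assumption.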
Abstract
A limited multi-fault system and method manage error recovery in a local area network system. The system includes a data structure that stores related error events, diagnostic problems, and causes. The method manages error events in real time, identifies their causes, and recommends actions. A knowledge base is used to determine the causes and recommended actions for each problem.
9 Claims
3. A diagnostic system for limited multi-fault management of error events in a local area network (LAN) comprising:
a plurality of LAN requesters;
one or more LAN servers, a LAN server providing service for LAN requesters;
a LAN EXPERT server connected to said LAN and including an inference engine, a knowledge base containing relationships between possible causes and error messages, and a user interface for reporting problems and interacting with a user; and
a plurality of LAN EXPERT agents installed on LAN servers and LAN requesters to monitor a status of the LAN servers and LAN requesters;
said LAN EXPERT agents receiving error messages issued by said LAN servers and LAN requestors on which they are installed and sending the error messages to the LAN EXPERT server;
said LAN EXPERT server receiving error messages sent by all LAN EXPERT agents, a received error message being an event to be diagnosed by the inference engine of the LAN EXPERT server;
said inference engine of the LAN EXPERT server forming an event cluster for a received error message, wherein a cluster is a data structure that holds partial diagnostic results containing correlated events and possible causes and wherein both events and causes have associated variables;
said inference engine accessing the knowledge base of the LAN EXPERT server to retrieve a cluster containing all related causes for an event corresponding to a received error message as defined in the knowledge base, wherein variables in causes may be instantiated by the event;
said inference engine of the LAN EXPERT server comparing subsequent error messages with each cluster to determine whether subsequent events should join a cluster or not;
said inference engine joining a subsequent event to a cluster if a mathematical intersection of causes of the subsequent event and causes of the cluster is not empty, otherwise, said inference engine forming a new event cluster for the subsequent event so that, as more and more events are joined in a cluster by the inference engine, the number of causes decreases and variables are instantiated and when a cluster contains only one fully instantiated cause, a diagnostic conclusion is reached; and
said user interface of said LAN EXPERT server reporting diagnostic information generated by the inference engine.
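The role of variables in the claim above (causes carry variables that events instantiate, and a conclusion requires a single fully instantiated cause) can be sketched as follows. This is an illustrative assumption, not the patent's data model: each cause is reduced to a name with a dictionary of variable bindings, `None` marks an unbound variable, and the names `unify`, `intersect`, and `fully_instantiated` are hypothetical.

```python
def unify(a, b):
    """Merge two binding dicts; return None if a variable has conflicting values."""
    merged = dict(a)
    for var, val in b.items():
        if val is None:
            merged.setdefault(var, None)   # still unbound on this side
        elif merged.get(var) is None:
            merged[var] = val              # instantiate the variable
        elif merged[var] != val:
            return None                    # same variable, different values
    return merged


def intersect(causes_a, causes_b):
    """Keep causes present in both sets whose variable bindings are compatible."""
    result = {}
    for name, bind_a in causes_a.items():
        if name in causes_b:
            merged = unify(bind_a, causes_b[name])
            if merged is not None:
                result[name] = merged
    return result


def fully_instantiated(bindings):
    """True when every variable of a cause has a value."""
    return all(v is not None for v in bindings.values())


if __name__ == "__main__":
    # Hypothetical cluster: two candidate causes, one with an unbound variable.
    cluster = {
        "server_down": {"server": None},
        "adapter_failure": {"station": "REQ7"},
    }
    # Hypothetical later event whose only cause names a specific server.
    event = {"server_down": {"server": "SRV1"}}
    remaining = intersect(cluster, event)
    print(remaining)
    concluded = len(remaining) == 1 and all(
        fully_instantiated(b) for b in remaining.values()
    )
    print("diagnostic conclusion reached:", concluded)
```

Keying causes by name allows only one instance of each cause per cluster, a simplification chosen to keep the sketch short.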
4. A diagnostic system for managing error events in a local area network (LAN) comprising:
at least one LAN server connected in said local area network;
a plurality of LAN requesters connected in said local area network;
a LAN EXPERT server, the LAN EXPERT server being connected in said local area network and including an inference engine, a knowledge base containing relationships between possible causes and error messages, and a user interface for reporting problems and interacting with a user; and
a plurality of LAN EXPERT agents, the LAN EXPERT agents being installed on LAN servers and LAN requesters to monitor a status of the LAN servers and LAN requesters, said LAN EXPERT agents transmitting error messages to said LAN EXPERT server, the inference engine of said LAN EXPERT server forming event clusters for received error messages, accessing said knowledge base to retrieve all related causes for an event corresponding to a received error message and joining events to a cluster in a process wherein a number of causes of events in clusters are decreased to reach a diagnostic conclusion, the diagnostic conclusion being reported via said user interface of the LAN EXPERT server.
6. A diagnostic method determining a cause of an error in a local area network (LAN) comprising the steps of:
receiving by LAN EXPERT agents error messages issued by LAN servers and LAN requesters on which the LAN EXPERT agents are installed and sending the error messages to a LAN EXPERT server connected to the local area network;
forming by an inference engine of the LAN EXPERT server an event cluster for a received error message as a data structure holding partial diagnostic results containing correlated events and possible causes;
accessing by the inference engine a knowledge base of the LAN EXPERT server to retrieve all related causes for an event corresponding to a received error message as defined in the knowledge base;
determining by the inference engine whether subsequent events should be joined in a cluster and joining by the inference engine those subsequent events determined that should be joined to clusters so that as more and more events are joined in a cluster by the inference engine, a number of causes decreases and a diagnostic conclusion is reached; and
reporting by a user interface diagnostic information generated by the inference engine.
Specification