Systems and methods for fast detection and diagnosis of system outages
First Claim
1. An apparatus comprising:
at least one processor; and
a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising:
computer readable program code configured to ascertain a system outage;
computer readable program code configured to categorize aberrant user activities, as possible contributors to the system outage, based on system impact in terms of at least one member taken from the group consisting of:
process forking and closure activities, file system activities, network activities, memory activities, user activity parameter values;
computer readable program code configured to learn user activities and system impact;
computer readable program code configured to compare user activities and system impact against predetermined rules;
computer readable program code configured to generate a system outage alert; and
computer readable program code configured to display a user activity responsible for the system outage; and
said computer readable program code being configured to:
measure system impact of user activities via employing data collector agents; and
aggregate data collected by the data collector agents based on process hierarchy.
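The final limitation of the claim, aggregating collected data based on process hierarchy, can be illustrated with a minimal sketch. The claim does not specify a data format or an algorithm, so everything here is an assumption made for illustration: the `Sample` record, its fields, and the `aggregate_by_hierarchy` function are hypothetical names, and rolling each process's usage up into all of its ancestors is only one plausible reading of "based on process hierarchy".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One measurement reported by a hypothetical data collector agent."""
    pid: int
    ppid: int            # parent process id; 0 marks a root process here
    user: str
    cpu_seconds: float
    bytes_written: int


def aggregate_by_hierarchy(samples):
    """Roll each process's usage up into every ancestor in its process tree.

    Returns a dict mapping pid to the combined usage of that process and
    all of its descendants.
    """
    by_pid = {s.pid: s for s in samples}
    totals = defaultdict(lambda: {"cpu_seconds": 0.0, "bytes_written": 0})
    for s in samples:
        node = s
        seen = set()  # guards against a malformed, cyclic parent chain
        while node is not None and node.pid not in seen:
            seen.add(node.pid)
            totals[node.pid]["cpu_seconds"] += s.cpu_seconds
            totals[node.pid]["bytes_written"] += s.bytes_written
            node = by_pid.get(node.ppid)
    return dict(totals)


# Illustrative data only: a three-level process tree and one unrelated process.
samples = [
    Sample(pid=100, ppid=0, user="alice", cpu_seconds=1.0, bytes_written=10),
    Sample(pid=101, ppid=100, user="alice", cpu_seconds=2.5, bytes_written=20),
    Sample(pid=102, ppid=101, user="alice", cpu_seconds=0.5, bytes_written=5),
    Sample(pid=200, ppid=0, user="bob", cpu_seconds=4.0, bytes_written=0),
]
totals = aggregate_by_hierarchy(samples)
```

With this roll-up, the entry for a top-level process such as pid 100 reflects the impact of everything it spawned, which is one way a single user activity could be tied to the combined load of its child processes.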
Abstract
Methods and arrangements for detecting and diagnosing system outages. A system outage is ascertained and aberrant user activities are categorized, as possible contributors to the system outage, based on system impact. User activities and system impact are learned, and user activities and system impact are compared against predetermined rules. A system outage alert is generated, and a user activity responsible for the system outage is displayed.
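As a rough illustration of the flow the abstract describes, the sketch below compares measured user activities against predetermined rules, categorizes those that exceed a rule, and builds an alert that names the suspect activities. It is only a sketch under stated assumptions: the `Rule` type, the metric names, the threshold values, and the `categorize` and `outage_alert` functions are hypothetical and do not come from the patent, and the learning step is omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A predetermined rule: a metric above its threshold counts as aberrant."""
    category: str
    metric: str
    threshold: float


# Hypothetical rules; the category names follow the claim language,
# while the metric names and thresholds are invented for illustration.
RULES = [
    Rule("process forking and closure activities", "forks_per_minute", 500),
    Rule("file system activities", "disk_write_mb", 10_000),
    Rule("network activities", "open_connections", 2_000),
    Rule("memory activities", "resident_memory_mb", 16_000),
]


def categorize(activity, rules=RULES):
    """Return the categories whose rule this activity's measurements exceed."""
    return [r.category for r in rules if activity.get(r.metric, 0) > r.threshold]


def outage_alert(outage_detected, activities, rules=RULES):
    """Build an alert listing aberrant activities, or None if no outage."""
    if not outage_detected:
        return None
    suspects = []
    for activity in activities:
        categories = categorize(activity, rules)
        if categories:
            suspects.append({
                "user": activity["user"],
                "command": activity["command"],
                "categories": categories,
            })
    return {"alert": "system outage", "suspects": suspects}


# Illustrative measurements only.
activities = [
    {"user": "alice", "command": "make -j64",
     "forks_per_minute": 1200, "resident_memory_mb": 20_000},
    {"user": "bob", "command": "vim notes.txt", "forks_per_minute": 2},
]
alert = outage_alert(True, activities)
```

Here the alert would list only the first activity, tagged with the forking and memory categories. Fixed thresholds are used purely to keep the example small; the abstract also describes learning user activities and system impact, which this sketch does not attempt.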
10 Claims
2. A computer program product comprising:
a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising:
computer readable program code configured to ascertain a system outage;
computer readable program code configured to categorize aberrant user activities, as possible contributors to the system outage, based on system impact in terms of at least one member taken from the group consisting of:
process forking and closure activities, file system activities, network activities, memory activities, user activity parameter values;
computer readable program code configured to learn user activities and system impact;
computer readable program code configured to compare user activities and system impact against predetermined rules;
computer readable program code configured to generate a system outage alert; and
computer readable program code configured to display a user activity responsible for the system outage; and
said computer readable program code being configured to:
measure system impact of user activities via employing data collector agents; and
aggregate data collected by the data collector agents based on process hierarchy.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10
Specification