Systems and methods for prioritizing error notification
Abstract
Errors occurring in computing clusters and other computing systems can impact system performance. Each error has an error type and each error type has a base cost estimating importance of correcting the error. Each error type also has a confidence indicating the level of agreement between those who fix the errors and those who assigned the base cost. An error type's actual cost is produced using the base cost and confidence. An error cascade map contains estimates that one error will cause another. An error type that causes other error types has a cascade cost. Upon detecting an error type, a repair order can be generated, depending on the cost involved. Repairs are then performed. Feedback mechanisms and correlations can be used to update the confidences and the error cascade map.
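As a rough illustration of how a base cost and a confidence might combine into an actual cost, the sketch below assumes the actual cost is the base cost scaled by a confidence in the range 0 to 1. The abstract does not state a formula, so this product rule, the `ErrorType` class, the `rank_by_actual_cost` helper, and the example error names and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorType:
    """An error type with an estimated repair importance (hypothetical model)."""
    name: str
    base_cost: float   # estimated importance of correcting the error
    confidence: float  # assumed to lie in [0, 1]; 1 means the base cost is fully trusted

    def actual_cost(self) -> float:
        # Assumption: actual cost = base cost * confidence.
        # The source only says the actual cost is produced using both values.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        return self.base_cost * self.confidence


def rank_by_actual_cost(error_types):
    """Return the error types ordered from highest to lowest actual cost."""
    return sorted(error_types, key=lambda e: e.actual_cost(), reverse=True)


if __name__ == "__main__":
    types = [
        ErrorType("disk_failure", base_cost=100.0, confidence=0.5),
        ErrorType("fan_warning", base_cost=20.0, confidence=1.0),
        ErrorType("link_flap", base_cost=80.0, confidence=0.75),
    ]
    for e in rank_by_actual_cost(types):
        print(e.name, e.actual_cost())
```

Under this assumed rule, an error type with a high base cost but low confidence can rank below one with a smaller but better-trusted base cost, which is one plausible reading of why the two values are kept separate.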
48 Citations
22 Claims
1. A method for prioritizing repairs of a computing system of networked computers comprising:
- associating a base cost and a confidence with each one of at least one error type that can impact the performance of the computing system such that an actual cost for each one of the at least one error type can be produced with at least one computer using the associated base cost and the associated confidence, wherein the associated confidence is a value indicating the degree to which the base cost associated with a respective error type is accurate;
- detecting an occurrence of any one of the at least one error type with at least one computer, creating a detected error associated with the error type that occurred, and using the detected error to update a detected error collection wherein the detected error collection comprises one or more detected errors;
- using the actual cost of the error type associated with each detected error in the detected error collection to create one or more repair orders with at least one computer; and
- detecting one or more repairs with at least one computer and updating the detected error collection accordingly.
Dependent claims: 2, 3, 4, 5, 21
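The detect, order, and repair steps of claim 1 can be sketched as a small in-memory collection. This is only an illustration under stated assumptions: the `DetectedErrorCollection` class, its method names, the (host, error type) keying, and the cost threshold for issuing repair orders are hypothetical and are not taken from the claim.

```python
from dataclasses import dataclass, field


@dataclass
class DetectedErrorCollection:
    """Tracks outstanding detected errors as (host, error type) pairs (hypothetical)."""
    actual_costs: dict                       # error type name -> actual cost
    errors: set = field(default_factory=set)

    def record_error(self, host: str, error_type: str) -> None:
        """Add a newly detected error to the collection."""
        if error_type not in self.actual_costs:
            raise KeyError(f"unknown error type: {error_type}")
        self.errors.add((host, error_type))

    def create_repair_orders(self, threshold: float = 0.0) -> list:
        """Return outstanding errors whose cost exceeds the threshold, costliest first.

        The threshold is an assumption; the source only says repair orders
        depend on the cost involved.
        """
        costly = [(h, t) for (h, t) in self.errors if self.actual_costs[t] > threshold]
        # Sort by descending cost, then by host and type for a deterministic order.
        return sorted(costly, key=lambda e: (-self.actual_costs[e[1]], e[0], e[1]))

    def record_repair(self, host: str, error_type: str) -> None:
        """Remove an error once its repair has been detected."""
        self.errors.discard((host, error_type))


if __name__ == "__main__":
    collection = DetectedErrorCollection({"disk_failure": 50.0, "fan_warning": 20.0})
    collection.record_error("node1", "fan_warning")
    collection.record_error("node2", "disk_failure")
    print(collection.create_repair_orders())
    collection.record_repair("node2", "disk_failure")
    print(collection.create_repair_orders())
```

Keeping the collection separate from the cost table means a detected repair only has to remove an entry; the next call to `create_repair_orders` then reflects the updated state.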
6. A method for prioritizing repairs of a computing system of networked computers comprising:
- associating a base cost and a confidence with each one of at least one error type that can impact the performance of the computing system such that an actual cost for each one of the at least one error type can be produced with at least one computer using the associated base cost and the confidence, wherein the associated confidence is a value indicating the degree to which the base cost associated with a respective error type is accurate;
- producing with at least one computer an error cascade map associating at least one causing error with at least one resulting error to produce at least one cascade association and associating a cascade probability with each one of the at least one cascade association wherein the causing error is one of the at least one error type, the resulting error is one of the at least one error type, and the cascade probability indicates the likelihood that the causing error will cause the resulting error to occur and wherein a cascade cost for each one of the at least one error type can be produced with at least one computer from the error cascade map and the actual cost of each of the at least one error type;
- detecting an occurrence of any one of the at least one error type with at least one computer, creating a detected error associated with the error type that occurred, and using the detected error to update a detected error collection wherein the detected error collection comprises one or more detected errors;
- using the cascade cost of the error type associated with each detected error in the detected error collection to create one or more repair orders with at least one computer; and
- detecting one or more repairs with at least one computer and updating the detected error collection accordingly.
Dependent claims: 7, 8, 9, 10, 11, 12, 22
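One way to picture the cascade cost of claim 6 is as an expected cost: the error type's own actual cost plus the probability-weighted cascade cost of every error type it may cause. The claim does not specify how the cascade cost is derived from the map, so the recursion below, the `cascade_cost` function, the nested-dictionary map layout, and the example values are assumptions made for illustration.

```python
def cascade_cost(error_type, actual_costs, cascade_map, _visiting=frozenset()):
    """Expected cost of an error type, including errors it may cause.

    actual_costs: dict mapping error type name -> actual cost
    cascade_map:  dict mapping causing error -> {resulting error: cascade probability}

    Assumption: cascade cost = actual cost
                + sum(probability * cascade cost of each resulting error).
    """
    if error_type in _visiting:
        # Break cycles in the map so that mutual causation cannot recurse forever.
        return 0.0
    visiting = _visiting | {error_type}
    total = actual_costs[error_type]
    for resulting, probability in cascade_map.get(error_type, {}).items():
        total += probability * cascade_cost(resulting, actual_costs, cascade_map, visiting)
    return total


if __name__ == "__main__":
    actual_costs = {"fan_failure": 10.0, "overheat": 40.0, "disk_failure": 100.0}
    cascade_map = {
        "fan_failure": {"overheat": 0.5},
        "overheat": {"disk_failure": 0.25},
    }
    for name in actual_costs:
        print(name, cascade_cost(name, actual_costs, cascade_map))
```

With these example numbers, a cheap fan failure ends up costlier than its own actual cost suggests because of the errors it may trigger downstream, which is the kind of reordering a cascade cost is meant to capture.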
13. A system for prioritizing repairs of a computing system comprising:
- at least two computers in the computing system wherein each of the at least two computers can communicate with any of the at least two computers via a communications network;
- at least one error type wherein an error type identifies one of at least one error that impacts the performance of the computing system and an error price association wherein each of the at least one error type is associated with a base cost and a confidence such that an actual cost can be produced for each one of the at least one error type, wherein the associated confidence is a value indicating the degree to which the base cost associated with a respective error type is accurate;
- a detected error collection comprising one or more detected errors wherein each of the at least one detected errors has an error type;
- at least one error detection module that detects an occurrence of at least one error and updates the detected error collection based on the at least one error;
- a repair assignment module that examines the detected error collection and creates one or more repair orders that cause one or more repairs wherein each one of the one or more repairs is associated with one or more of the one or more repair orders; and
- a repair detection module that detects an occurrence of at least one repair and updates the detected error collection.
Dependent claims: 14, 15, 16, 17, 18
19. A system for prioritizing repairs of a computing system comprising:
- a means of computing comprising at least two computers that can communicate over a communications network;
- a means of detecting at least one error in the means of computing and assigning a cost to each one of the at least one error based upon a base cost and a confidence associated with each at least one error, wherein the associated confidence is a value indicating the degree to which the base cost associated with a respective error is accurate;
- a means of tracking the at least one error;
- a means of prioritizing and assigning at least one repair wherein each one of the at least one repair removes at least one of the at least one error; and
- a means of detecting the completion of each one of the at least one repair.
Dependent claims: 20
Specification