Invariants-based learning method and system for failure diagnosis in large scale computing systems

  • US 8,185,781 B2
  • Filed: 04/05/2010
  • Issued: 05/22/2012
  • Est. Priority Date: 04/09/2009
  • Status: Active Grant
First Claim

1. A method for diagnosing a detected failure in a computer system, the method comprising:

  • comparing, in a computer process, a failure signature of the detected failure to an archived failure signature contained in a database to determine if the archived failure signature matches the failure signature of the detected failure;

    if the archived failure signature matches the failure signature of the detected failure, applying, in a computer process, an archived solution to the computer system that resolves the detected failure, the archived solution corresponding to a solution used to resolve a previously detected computer system failure corresponding to the archived failure signature in the database that matches the detected failure;

    wherein the archived failure signature is based on a set of broken computer system invariants, the set of broken computer system invariants corresponding to the previously detected computer system failure;

    further comprising constructing the database in a computer process prior to comparing the failure signature of the detected failure to the archived failure signature;

    wherein the constructing the database includes extracting invariants from the computer system;

    wherein the extracting the invariants includes:

    modeling invariants of the computer system;

    evaluating each of the invariants to determine whether it is broken;

    counting the broken invariants to determine whether the number of the broken invariants meets a predetermined threshold number;

    if the number of the broken invariants meets the predetermined threshold number deeming this result the previously detected computer system failure; and

    combining the broken invariants into the set of broken invariants forming the archived failure signature of the previously detected computer system failure.
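
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `Invariant` class, the metric names, the tolerance values, the stored solution string, and the helper names `failure_signature` and `diagnose` are all hypothetical. It also assumes that "meets a predetermined threshold" means "greater than or equal to" and that two signatures "match" only when their sets of broken invariants are identical; the claim itself does not fix either choice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional

Metrics = Dict[str, float]
Signature = FrozenSet[str]


@dataclass(frozen=True)
class Invariant:
    """A modeled relationship between metrics that holds in normal operation."""
    name: str
    holds: Callable[[Metrics], bool]


def failure_signature(invariants: List[Invariant], metrics: Metrics,
                      threshold: int) -> Optional[Signature]:
    """Evaluate every invariant, count the broken ones, and combine them
    into a signature if the count meets the threshold (assumed: >=)."""
    broken = frozenset(inv.name for inv in invariants if not inv.holds(metrics))
    return broken if len(broken) >= threshold else None


def diagnose(signature: Signature,
             archive: Dict[Signature, str]) -> Optional[str]:
    """Return the archived solution whose signature matches exactly
    (exact set equality is an assumption of this sketch)."""
    return archive.get(signature)


# Hypothetical invariants over hypothetical metrics.
invariants = [
    Invariant("db_queries~2*web_requests",
              lambda m: abs(m["db_queries"] - 2 * m["web_requests"]) <= 5),
    Invariant("cpu~0.5*web_requests",
              lambda m: abs(m["cpu"] - 0.5 * m["web_requests"]) <= 3),
    Invariant("net_out~net_in",
              lambda m: abs(m["net_out"] - m["net_in"]) <= 10),
]
THRESHOLD = 2

# Database construction: archive the signature of a previously detected failure
# together with the solution that resolved it.
past_failure = {"web_requests": 100, "db_queries": 40, "cpu": 90,
                "net_in": 50, "net_out": 52}
past_signature = failure_signature(invariants, past_failure, THRESHOLD)
archive = {past_signature: "restart database connection pool"}

# Diagnosis: compare the signature of a newly detected failure to the archive.
new_failure = {"web_requests": 80, "db_queries": 10, "cpu": 70,
               "net_in": 30, "net_out": 31}
new_signature = failure_signature(invariants, new_failure, THRESHOLD)
solution = diagnose(new_signature, archive) if new_signature else None

# A healthy sample breaks no invariants, so it yields no failure signature.
healthy = {"web_requests": 100, "db_queries": 200, "cpu": 50,
           "net_in": 50, "net_out": 50}
healthy_signature = failure_signature(invariants, healthy, THRESHOLD)
```

Representing a signature as a frozen set of invariant names makes it usable as a dictionary key, which keeps the lookup simple. A real system would more plausibly compare signatures with a similarity measure rather than exact equality.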
