Architecture for a self-healing computer system
Abstract
The self-healing system comprises a self-healing processor and an error mitigation system. The self-healing processor includes a code block associated with the operation of a portion of digital logic. The self-healing processor also includes a dynamic signature analysis circuit. The processor executes the code block. The dynamic signature analysis circuit creates a dynamic signature representing the operation of the portion of digital logic associated with the code block. The error mitigation system receives the dynamic signature from the dynamic signature analysis circuit. The error mitigation system compares the dynamic signature to a static signature to determine if the signatures match. If the signatures do not match, then the digital logic associated with the code block has an error. The error mitigation system retries execution of the code block. The error mitigation system stores log information describing the above events.
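The signature comparison described in the abstract can be illustrated with a short software sketch. The text does not say how the dynamic signature analysis circuit computes its signature, so this example assumes a CRC-32 fold over a list of observed logic values. The names `dynamic_signature` and `has_error` are hypothetical and do not come from the patent.

```python
import zlib

def dynamic_signature(observed_values):
    """Compact a sequence of observed 32-bit logic values into one signature.

    Assumption: CRC-32 stands in for whatever compaction the hardware
    circuit performs; the source does not specify the method.
    """
    sig = 0
    for value in observed_values:
        # Each value must be a non-negative integer that fits in 4 bytes.
        sig = zlib.crc32(value.to_bytes(4, "little"), sig)
    return sig

def has_error(dynamic_sig, static_sig):
    """Report an error when the dynamic signature differs from the static one."""
    return dynamic_sig != static_sig

# The static signature is recorded once from a known error-free run.
static_sig = dynamic_signature([1, 2, 3])
print(has_error(dynamic_signature([1, 2, 3]), static_sig))  # same values as the reference run
print(has_error(dynamic_signature([1, 2, 7]), static_sig))  # one corrupted value
```

In this sketch a match means the observed run reproduced the reference values, and a mismatch is treated as an error in the digital logic associated with the code block.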
21 Claims
1. A self-healing system, the system comprising:
a processor comprising a code block, a dynamic signature analysis circuit and an error mitigation system, the code block associated with the operation of a portion of digital logic and the dynamic signature analysis circuit, the processor coupled to execute the code block, the dynamic signature analysis circuit coupled to create a dynamic signature representing the operation of the portion of digital logic associated with the code block;
the error mitigation system coupled for receiving the dynamic signature from the dynamic signature analysis circuit, the error mitigation system having a static signature representing error-free execution of the code block, the error mitigation system comparing the dynamic signature to the static signature to detect an error in the digital logic based on whether the signatures match, the error mitigation system coupled to retry execution of the code block if the signatures do not match, the error mitigation system storing log information that includes (1) a description of the error, (2) the retrying execution of the code block, and (3) a result of the retrying execution, the log information further including one or more of a description of system temperature history that was recorded by the error mitigation system prior to the detection of the error in the digital logic, and a description about an amount of processor power used during retry attempts for the detected error.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method for detecting and mitigating an error in digital logic, the method comprising:
executing a code block, the code block associated with operation of a portion of the digital logic;
creating a dynamic signature representing operation of the portion of digital logic;
comparing the dynamic signature to a static signature to detect an error in the digital logic based on whether the signatures match, the static signature representing an error-free execution of the code block;
retrying execution of the code block responsive to detecting an error; and
storing log information that includes (1) a description of the error, (2) the retrying execution of the code block, and (3) a result of the retrying execution, the log information further including one or more of a description of system temperature history that was recorded by the error mitigation system prior to the detection of the error in the digital logic, and a description about an amount of processor power used during retry attempts for the detected error.
Dependent claims: 13, 14, 15, 16
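The log information recited in this claim can be pictured as a small record type. The following is a sketch only: the field names, the units (degrees Celsius, watts), and the class name `ErrorLogEntry` are assumptions made for illustration and do not appear in the claim. The check in `__post_init__` reflects the "one or more of" wording by requiring at least one of the two optional descriptions.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ErrorLogEntry:
    """Hypothetical layout for one log record; all field names are assumed."""
    error_description: str      # (1) a description of the error
    retry_attempted: bool       # (2) the retrying execution of the code block
    retry_result: str           # (3) a result of the retrying execution
    # Temperature samples recorded before the error was detected (assumed unit: Celsius).
    temperature_history_c: Optional[List[float]] = None
    # Processor power used during retry attempts (assumed unit: watts).
    retry_power_watts: Optional[float] = None

    def __post_init__(self):
        # The claim calls for one or more of the two optional descriptions.
        if self.temperature_history_c is None and self.retry_power_watts is None:
            raise ValueError(
                "log entry needs temperature history, retry power, or both"
            )

entry = ErrorLogEntry(
    error_description="signature mismatch",
    retry_attempted=True,
    retry_result="recovered",
    temperature_history_c=[61.5, 63.0, 66.2],
)
print(entry)
```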
17. A method for detecting and mitigating an error in digital logic, the method comprising:
executing a code block, the code block associated with operation of a portion of the digital logic;
creating a dynamic signature representing operation of the portion of digital logic;
comparing the dynamic signature to a static signature to detect an error in the digital logic based on whether the signatures match, the static signature representing an error-free execution of the code block;
retrying execution of the code block;
receiving a dynamic signature for the retried execution of the code block;
comparing the dynamic signature for the retried execution to the static signature to determine if the signatures match;
continuing to retry execution of the code block until either the signatures match or a predetermined number of retries are executed; and
storing log information that includes (1) a description of the error, (2) the retrying execution of the code block, and (3) a result of the retrying execution, the log information further including one or more of a description of system temperature history that was recorded by the error mitigation system prior to the detection of the error in the digital logic, and a description about an amount of processor power used during retry attempts for the detected error.
Dependent claims: 18, 19, 20, 21
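The bounded retry loop in this claim can be sketched in software as follows. This is an illustration under stated assumptions, not the claimed hardware: `run_block` is a hypothetical callable that executes the code block once and returns its dynamic signature, and the log is a plain dictionary whose keys are invented for this example.

```python
def execute_with_retry(run_block, static_signature, max_retries, temperature_history):
    """Run a code block, retrying on signature mismatch up to max_retries times.

    Returns None when the first execution matches the static signature,
    otherwise a log dictionary describing the error and the retry outcome.
    All parameter names and log keys are assumptions for this sketch.
    """
    if run_block() == static_signature:
        return None  # error-free execution, nothing to log

    retries = 0
    recovered = False
    # Keep retrying until the signatures match or the retry budget is spent.
    while retries < max_retries and not recovered:
        retries += 1
        recovered = run_block() == static_signature

    return {
        "error": "dynamic signature did not match static signature",
        "retries": retries,
        "result": "recovered" if recovered else "retry limit reached",
        # Temperature samples recorded before the error was detected.
        "temperature_history": list(temperature_history),
    }

# A transient fault: two bad signatures, then a good one.
signatures = iter([0xBAD, 0xBAD, 0x600D])
log = execute_with_retry(lambda: next(signatures), 0x600D, 3, [61.5, 63.0])
print(log["retries"], log["result"])  # 2 recovered
```

A persistent fault would exhaust the retry budget instead, and the log entry would record that the retry limit was reached.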
Specification