Method and apparatus for component association inference, failure diagnosis and misconfiguration detection based on historical failure data
Abstract
A method (which can be computer implemented) for inferring component associations among a plurality of components in a distributed computing system includes the steps of obtaining status information for each pertinent component of the plurality of components, forming an N by D matrix, X, based on the status information, and factorizing the matrix X to obtain a first matrix indicative of the component associations to be inferred and a second matrix indicative of failure explanations for corresponding ones of the probe instances. N is a number of probe instances associated with a given time frame. D is a number of the plurality of components for which the associations are to be inferred. Techniques are also presented for forming a database with the status information.
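The matrix-forming and factorizing steps summarized above can be illustrated with a small sketch. It is not the patented implementation: the abstract does not name a factorization algorithm, so non-negative matrix factorization with multiplicative updates is swapped in here as one plausible choice. The component names, the sample probe records, the rank `k`, and every function name are hypothetical. In this reading, `h` (K by D) plays the role of the first matrix (component associations) and `w` (N by K) the role of the second matrix (failure explanations per probe instance).

```python
import random


def build_matrix(records, components):
    """Form the N-by-D matrix X: one row per probe instance, one column per component.

    Each record maps a component name to 1 (failed) or 0 (healthy). Components
    missing from a record are treated as healthy, which is an assumption.
    """
    return [[float(rec.get(c, 0)) for c in components] for rec in records]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(x, k, iters=200, seed=0, eps=1e-9):
    """Approximate X (N x D) by W (N x K) times H (K x D), all entries non-negative."""
    rng = random.Random(seed)
    n, d = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(d)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), elementwise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(d)]
             for i in range(k)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def reconstruction_error(x, w, h):
    """Sum of squared differences between X and the product W H."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(x, approx) for a, b in zip(ra, rb))


# Hypothetical data: web1/db1 tend to fail together, as do web2/db2.
components = ["web1", "web2", "db1", "db2"]
records = [
    {"web1": 1, "db1": 1},
    {"web2": 1, "db2": 1},
    {"web1": 1, "db1": 1},
    {"web2": 1, "db2": 1},
    {"web1": 1, "web2": 1, "db1": 1, "db2": 1},
]
x = build_matrix(records, components)
w, h = nmf(x, k=2)
```

Each row of `h` weights the components that tend to fail together, and each row of `w` indicates which of those groups accounts for the failures seen in the corresponding probe instance. The rank `k` has to be chosen by the user in this sketch.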
35 Claims
1. A method for inferring component associations among a plurality of components in a distributed computing system, said method comprising the steps of:
obtaining status information for each pertinent component of said plurality of components;
forming an N by D matrix, X, based on said status information, where:
N comprises a number of probe instances associated with a given time frame; and
D comprises a number of said plurality of components for which said associations are to be inferred; and
factorizing said matrix X to obtain a first matrix indicative of said component associations to be inferred and a second matrix indicative of failure explanations for corresponding ones of said probe instances, wherein one or more steps are performed by a hardware device.
(Dependent claims: 2-21)
22. A method for forming a database useful in inferring component associations among a plurality of components in a distributed computing system, said method comprising the steps of:
obtaining a location of each of at least two of said plurality of components;
determining a topology of said at least two of said plurality of components based on said location;
monitoring status information for each pertinent component of said plurality of components, said monitoring comprising probing in a series of probe instances, each of said instances pertaining to a substantially contemporaneous time stamp; and
recording said status information in said database when predetermined conditions are present, wherein said status information is based on said determined topology and wherein one or more steps are performed by a hardware device.
(Dependent claims: 23-28)
29. A computer program product comprising a tangible computer useable readable storage medium including computer usable program code for inferring component associations among a plurality of components in a distributed computing system, said computer program product including:
computer usable program code for obtaining status information for each pertinent component of said plurality of components;
computer usable program code for forming an N by D matrix, X, based on said status information, where:
N comprises a number of probe instances associated with a given time frame; and
D comprises a number of said plurality of components for which said associations are to be inferred; and
computer usable program code for factorizing said matrix X to obtain a first matrix indicative of said component associations to be inferred and a second matrix indicative of failure explanations for corresponding ones of said probe instances.
(Dependent claims: 30, 31)
32. An apparatus for inferring component associations among a plurality of components in a distributed computing system, the apparatus comprising:
a memory; and
at least one processor, coupled to the memory, operative to:
obtain status information for each pertinent component of said plurality of components;
form an N by D matrix, X, based on said status information, where:
N comprises a number of probe instances associated with a given time frame; and
D comprises a number of said plurality of components for which said associations are to be inferred; and
factorize said matrix X to obtain a first matrix indicative of said component associations to be inferred and a second matrix indicative of failure explanations for corresponding ones of said probe instances.
(Dependent claims: 33, 34)
35. A computer program product comprising a tangible computer useable readable storage medium including computer usable program code for forming a database useful in inferring component associations among a plurality of components in a distributed computing system, said computer program product including:
computer usable program code for obtaining a location of each of at least two of said plurality of components;
computer usable program code for determining a topology of said at least two of said plurality of components based on said location;
computer usable program code for monitoring status information based on said determined topology for each pertinent component of said plurality of components, said monitoring comprising probing in a series of probe instances, each of said instances pertaining to a substantially contemporaneous time stamp; and
computer usable program code for recording said status information in said database when predetermined conditions are present, wherein said status information is based on said determined topology.
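The database-forming steps recited in claims 22 and 35 (obtain locations, determine a topology, probe under a shared time stamp, record when predetermined conditions are present) can be sketched as follows. This is only an illustration under stated assumptions: the claims do not define the location format, how topology is derived, the database schema, or what the predetermined conditions are. Here locations are hypothetical "site/rack" strings, topology is a simple grouping by location, the store is an in-memory SQLite table, and the assumed condition is "at least one component failed in this probe instance".

```python
import sqlite3

# Hypothetical component locations; the "site/rack" format is an assumption.
LOCATIONS = {
    "web1": "siteA/rack1",
    "db1": "siteA/rack1",
    "web2": "siteA/rack2",
}


def determine_topology(locations):
    """Group components by shared location, as a stand-in for topology discovery."""
    topology = {}
    for component, location in sorted(locations.items()):
        topology.setdefault(location, []).append(component)
    return topology


def open_database(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS status ("
        "probe_ts REAL, location TEXT, component TEXT, failed INTEGER)"
    )
    return conn


def should_record(statuses):
    """Assumed 'predetermined condition': at least one component failed."""
    return any(statuses.values())


def run_probe(conn, topology, probe_ts, check):
    """Probe every component once under one time stamp; record if the condition holds."""
    statuses = {c: bool(check(c)) for comps in topology.values() for c in comps}
    if not should_record(statuses):
        return False
    rows = [
        (probe_ts, location, c, int(statuses[c]))
        for location, comps in topology.items()
        for c in comps
    ]
    conn.executemany("INSERT INTO status VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return True


topology = determine_topology(LOCATIONS)
conn = open_database()
# Simulated probes: check(component) returns True when the component is failing.
recorded_1 = run_probe(conn, topology, 1000.0, lambda c: False)       # all healthy
recorded_2 = run_probe(conn, topology, 1060.0, lambda c: c == "db1")  # one failure
row_count = conn.execute("SELECT COUNT(*) FROM status").fetchone()[0]
```

In this sketch the all-healthy probe is skipped and the probe with one failure stores one row per component, so each recorded time stamp can later supply one row of the N by D matrix.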
Specification