SYSTEMS AND METHODS FOR PREDICTIVE RELIABILITY MINING

First Claim
1. A computer implemented method for predictive reliability mining in a population of connected machines, the method comprising:
 identifying sets of discriminative Diagnostic Trouble Codes (DTCs) from DTCs generated preceding failure, the sets of discriminative DTCs corresponding to associated predefined parts of the connected machines;
generating a temporal conditional dependence model based on temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs; and
predicting future failures based on the generated temporal conditional dependence model and occurrence and nonoccurrence of DTCs.
Abstract
Systems and methods for predictive reliability mining are provided that enable predicting of unexpected emerging failures in future without waiting for actual failures to start occurring in significant numbers. Sets of discriminative Diagnostic Trouble Codes (DTCs) from connected machines in a population are identified before failure of the associated parts. A temporal conditional dependence model based on the temporal dependence between the failure of the parts from past failure data and the identified sets of discriminative DTCs is generated. Future failures are predicted based on the generated temporal conditional dependence and root cause analysis of the predicted future failures is performed for predictive reliability mining. The probability of failure is computed based on both occurrence and nonoccurrence of DTCs. The root cause analysis enables identifying a subset of the population when an early warning is generated and also when an early warning is not generated.
17 Claims
 1. A computer implemented method for predictive reliability mining in a population of connected machines, the method comprising:
identifying sets of discriminative Diagnostic Trouble Codes (DTCs) from DTCs generated preceding failure, the sets of discriminative DTCs corresponding to associated predefined parts of the connected machines; generating a temporal conditional dependence model based on temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs; and predicting future failures based on the generated temporal conditional dependence model and occurrence and nonoccurrence of DTCs.
 15. A system for predictive reliability mining in a population of connected machines, the system comprising:
one or more processors; a communication interface device; one or more internal data storage devices operatively coupled to the one or more processors for storing: an input module configured to receive Diagnostic Trouble Codes (DTCs) from onboard diagnostic systems of predefined parts of the connected machines; a DTC pattern identifier configured to identify sets of discriminative DTCs from the DTCs generated preceding failure, the sets of discriminative DTCs corresponding to associated predefined parts of the connected machines; a Bayesian network generator configured to generate a temporal conditional dependence model based on temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs; and a failure predictor configured to predict future failures based on the generated temporal conditional dependence and occurrence and nonoccurrence of DTCs.
 17. A computer program product comprising a nontransitory computer readable medium having a computer readable program embodied therein, wherein the computer readable program, when executed on a computing device, causes the computing device to:
identify sets of discriminative Diagnostic Trouble Codes (DTCs) from DTCs generated preceding failure, the sets of discriminative DTCs corresponding to associated predefined parts of the connected machines; generate a temporal conditional dependence model based on temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs; and predict future failures based on the generated temporal conditional dependence model and occurrence and nonoccurrence of DTCs.
1 Specification
This U.S. patent application claims priority under 35 U.S.C. §119 to: India Application No. 3922/MUM/2015 filed on 15 Oct. 2015. The entire contents of the aforementioned application are incorporated herein by reference.
The embodiments herein generally relate to predictive reliability mining, and more particularly to systems and methods involving sensor augmented reliability models.
With the industrial internet shaping the future, it is only natural to have connected machines forming part of every aspect of technology. Traditionally, predictive reliability mining has been based on historical data on part failures from warranty claims, using distributions from the exponential family such as the Weibull or lognormal distribution. When the observed failures (in one or more parts) across a population of machines exceed the number expected based on such a model, this may serve as an early warning of a potential systemic problem with the population. Such early warnings rely on some exceptionally high number of failures having actually occurred. Further, it has been seen that significant deviations from expected failure counts may often occur only in some unknown subset of the population, for instance, a particular batch, or machines manufactured in a particular year or at a particular plant site, and the like. Such deviations are insignificant across the full population and remain unidentified when traditional reliability mining techniques are employed. It is a challenge not only to detect potential problems earlier than is possible using traditional reliability analysis but also to identify a subset of the population wherein an anomaly may have occurred that would otherwise be statistically hidden in the population.
Embodiments of the present disclosure present technological improvements as solutions to one or more of the abovementioned technical problems recognized by the inventors in conventional systems.
Systems and methods of the present disclosure enable predictive reliability mining in a population of connected machines. Due to the 'industrial internet', most industrial equipment has installed sensors that continuously monitor run time behavior of desired components in the field and also transmit predefined sensor information back to the manufacturer by various means, including over wireless cellular or metropolitan WiFi networks. In particular, modern automobiles have onboard electronic control modules that generate alphanumeric Diagnostic Trouble Codes (DTCs) to indicate abnormal sensor levels in various situations, some of which are indicative of actual or potential part malfunction. Such DTCs are typically triggered before actual part failure. Systems and methods of the present disclosure analyze such DTCs to correlate the DTCs with future failure times to serve as early warning indicators of possible future part failures.
In an aspect, there is provided a computer implemented method for predictive reliability mining in a population of connected machines, the method comprising identifying sets of discriminative Diagnostic Trouble Codes (DTCs) from DTCs generated preceding failure, the sets of discriminative DTCs corresponding to associated predefined parts of the connected machines; generating a temporal conditional dependence model based on temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs; and predicting future failures based on the generated temporal conditional dependence and occurrence and nonoccurrence of DTCs.
In an embodiment, the step of predicting future failures can be followed by performing root cause analysis of the predicted future failures for predictive reliability mining.
In an embodiment, the step of identifying sets of discriminative DTCs is based on association rule mining, wherein the association rule mining comprises use of Apriori technique.
In an embodiment, antecedents of rules identified by the association rule mining technique form the set of discriminative DTCs.
In an embodiment, the temporal conditional dependence model is a Bayesian network.
In an embodiment, the temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs follows a Weibull distribution.
In an embodiment, the step of generating a temporal conditional dependence model is followed by a step of segregating the population of connected machines into a first set comprising connected machines in which DTCs are not generated in a given time period and a second set comprising connected machines in which at least one DTC is generated in the given time period.
In an embodiment, the step of predicting future failures comprises the step of computing the probability of failure based on both occurrence and nonoccurrence of DTCs in the segregated population of connected machines and generating an early warning when predicted number of failures are greater than expected number of failures based on the past failure data by a predefined value. In an embodiment, the predefined value is based on the predicted number of failures and variance of a random variable representing number of failures of the predefined parts in a given time period.
In an embodiment, the step of performing root cause analysis comprises identifying a subset of the population when an early warning for at least one of the predefined parts is generated. In an embodiment, the step of performing root cause analysis comprises (i) calculating a first expected time of failure based on the past failure data; (ii) segregating the population of connected machines into a first set comprising connected machines in which DTCs are not generated in a given time period and a second set comprising connected machines in which at least one DTC is generated in the given time period; (iii) calculating a second expected time of failure based on expected time of failure of the second set that is further based on predefined delay parameters and occurrence time of the at least one DTC; (iv) defining an anomaly score for each of the connected machines based on the calculated first expected time of failure and the second expected time of failure; (v) iteratively performing steps (i) through (iv) for predefined features of the connected machines; and (vi) identifying the subset of the population having the anomaly score greater than a predefined threshold, the identified subset indicating possible reasons for the early warning for each of the predefined features.
In an embodiment, the step of performing root cause analysis comprises identifying a subset of the population with a possible anomaly when an early warning is not generated at the population level. In an embodiment, the step of performing root cause analysis comprises (i) defining an anomaly score for each of the connected machines; (ii) associating each of the connected machines with a record comprising a set of predefined features and the defined anomaly score; (iii) discretizing the defined anomaly score into either a predefined high level or a normal level; (iv) performing association rule mining to identify association rules with the high level anomaly score; (v) clustering the identified association rules using a density based technique to form rule clusters; (vi) selecting one or more rules from each of the rule clusters that have high support and confidence; and (vii) identifying the subset of the population with the high level anomaly score based on antecedents of the selected one or more rules that are indicative of potential reasons for the high level anomaly score.
In another aspect, there is provided a system for predictive reliability mining in a population of connected machines, the system comprising: one or more processors; a communication interface device; one or more internal data storage devices operatively coupled to the one or more processors for storing: an input module configured to receive Diagnostic Trouble Codes (DTCs) from onboard diagnostic systems of predefined parts of the connected machines; a DTC pattern identifier configured to identify sets of discriminative DTCs from DTCs generated preceding failure, the sets of discriminative DTCs corresponding to associated predefined parts of the connected machines; a Bayesian network generator configured to generate a temporal conditional dependence model based on temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs; and a failure predictor configured to predict future failures based on the generated temporal conditional dependence and occurrence and nonoccurrence of DTCs.
In an embodiment, the system described herein above can further comprise an analyzer configured to perform root cause analysis of the predicted future failures for predictive reliability mining.
In yet another aspect, there is provided a computer program product for processing data, comprising a nontransitory computer readable medium having program instructions embodied therein for identifying sets of discriminative Diagnostic Trouble Codes (DTCs) from DTCs generated preceding failure, the sets of discriminative DTCs corresponding to associated predefined parts of the connected machines; generating a temporal conditional dependence model based on temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs; predicting future failures based on the generated temporal conditional dependence and the occurrence and nonoccurrence of DTCs; and performing root cause analysis of the predicted future failures for predictive reliability mining.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The embodiments herein will be better understood from the following detailed description with reference to the drawings, in which:
It should be appreciated by those skilled in the art that any block diagram herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computing device or processor, whether or not such computing device or processor is explicitly shown.
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the leftmost digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope and spirit being indicated by the following claims.
Referring now to the drawings, and more particularly to
The I/O interface can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface can include one or more ports for connecting a number of devices to one another or to another server.
The memory may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or nonvolatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, the various modules of system 100 can be stored in the memory.
Connected machines are provided with onboard diagnostic (OBD) systems 110. There is typically a many to many mapping between Diagnostic Trouble Codes (DTCs) and parts of the connected machines, i.e., one DTC may indicate a malfunction in more than one part and a malfunction in a single part may trigger more than one DTC. Therefore, it is important to identify discriminative DTCs that can be correlated to future failures of each part, even if they are not uniquely associated with failures of a single part. At step 202, DTC pattern identifier 114 can identify sets of discriminative DTCs wherein the sets of discriminative DTCs correspond to associated predefined parts of the connected machines. In an embodiment, the step of identifying sets of discriminative DTCs is based on association rule mining. In an embodiment, the association rule mining can include identifying rules for discriminative DTCs for each part by examining DTC occurrences in situations leading to failures and comparing these with situations where no failures are observed.
In order to use DTCs effectively as an early warning indicator of part failure, at step 204, a temporal conditional dependence model is generated by Bayesian Network generator 116. In an embodiment, the temporal conditional dependence model is based on the temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs. Systems and methods of the instant disclosure thus use past failure data, the identified sets of discriminative DTCs and the temporal dependence therein to augment traditionally known reliability models. In an embodiment, the temporal dependence between failure of the predefined parts from past failure data and the identified sets of discriminative DTCs follows a Weibull distribution.
At step 206, future failures are predicted based on the generated temporal conditional dependence by failure predictor 118. In an embodiment, the step of predicting future failures comprises generating an early warning when the predicted number of failures is greater than the expected number of failures based on the past failure data by a predefined value. In an embodiment, failure predictor 118 is configured to identify parts of the connected machines associated with the early warning and accordingly identify an associated subset of the population.
At step 208, root cause analysis of the predicted future failures for predictive reliability mining is performed by analyzer 120.
Steps 202 through 208 of
Let I={i_{1}, i_{2}, . . . , i_{m}} be a set of m binary attributes called items. Let D={t_{1}, t_{2}, . . . , t_{n}} be a set of transactions called the database. Each transaction in D has a unique id and contains a subset of the items in I. An association rule r is an implication expression of the form X→Y, where X, Y⊂I and X∩Y=∅; the itemsets X and Y are called the antecedent and the consequent of the rule, respectively. The support s(X) of an itemset X is defined as the fraction of transactions in the database D that contain the itemset X. The confidence of a rule, conf(r), is s(X∪Y)/s(X). Further, the lift of a rule, l(r), measures its interestingness and is the ratio of its confidence to the support of its consequent, i.e., conf(r)/s(Y). The coverage of a rule r is the fraction of transactions in D containing X, and is given by s(X). Let R={r_{1}, r_{2}, . . . , r_{l}} be a set of l rules; then the coverage of R in D, represented by c_{R}, is the fraction of transactions in D containing the antecedent of at least one rule in R.
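The definitions above can be illustrated with a minimal Python sketch. The function names, the toy database and the item labels (D1, D2, D3, f, nf) are hypothetical and chosen only for illustration; they are not part of the disclosure.

```python
def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in db if itemset <= t) / len(db)

def confidence(x, y, db):
    """conf(X -> Y) = s(X u Y) / s(X)."""
    return support(frozenset(x) | frozenset(y), db) / support(x, db)

def lift(x, y, db):
    """l(X -> Y) = conf(X -> Y) / s(Y)."""
    return confidence(x, y, db) / support(y, db)

def rule_set_coverage(antecedents, db):
    """Fraction of transactions containing the antecedent of at least one rule."""
    ants = [frozenset(a) for a in antecedents]
    return sum(1 for t in db if any(a <= t for a in ants)) / len(db)

# Hypothetical toy database: DTC labels plus a failure (f) or nonfailure (nf) item.
db = [frozenset(t) for t in (
    {"D1", "D2", "f"},
    {"D1", "f"},
    {"D2", "nf"},
    {"D3", "nf"},
)]

s_d1 = support({"D1"}, db)                      # 2 of 4 transactions
conf_d1_f = confidence({"D1"}, {"f"}, db)       # every D1 transaction contains f
lift_d1_f = lift({"D1"}, {"f"}, db)
cov = rule_set_coverage([{"D1"}, {"D3"}], db)   # 3 of 4 transactions
```

In this toy database the rule D1→f has confidence 1 and lift 2, since the failure item appears in only half of all transactions.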
Step 202, wherein sets of discriminative Diagnostic Trouble Codes (DTCs) are identified, involves identifying DTCs that are discriminative for failure of, say, part P_{i}, i.e., DTCs that occur before failure of P_{i} but not before other part failures. In an embodiment, identifying discriminative DTCs is based on association rule mining. In an embodiment, the association rule mining uses the Apriori technique to identify rules of DTCs that lead to failure of part P_{i} with high confidence and lift. Since a single part can fail for different reasons in different vehicles, it is possible to obtain more than one high confidence rule for a single part. For part P_{i}, let I be the item set of all possible DTCs for the entire population of connected machines. At any given time t_{0}, let part P_{i} have failed in n connected machines. For each of the n connected machines, starting from the failure date of part P_{i}, all DTCs that were triggered in the past d days are collected in a transaction f_{i}, and an item f representing failure is added to each transaction. Thus, for n connected machines, a set of n transactions D_{f}={f_{1}, f_{2}, . . . , f_{n}}, called the failure set, is obtained, where each f_{i}⊂I. Similarly, another set D_{nf}, called the nonfailure set, is obtained for the connected machines in which parts other than P_{i} have failed, and an item n_{f} is added to every transaction of the set D_{nf}. For multiple occurrences of a single DTC in one transaction, only the first occurrence of that DTC is considered. A set of rules R is generated from D=D_{f}∪D_{nf} using the Apriori technique. Rules in the set R having confidence conf(r)>τ_{p} ∀r∈R, with the coverage c_{R} of the set R in D_{f} exceeding R_{c}, are identified. Here, τ_{p} is an a priori threshold for confidence and R_{c} is an a priori threshold for coverage.
The set of antecedents of all the rules in set R forms the discriminative set of DTCs disc_{i}={D_{1}, D_{2}, D_{3}, . . . , D_{N}} for the part P_{i},
where each set D_{j} contains at least one DTC and D_{1}∩D_{2}∩ . . . ∩D_{N}=∅.
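A minimal sketch of how candidate discriminative DTC sets might be selected from a failure set and a nonfailure set is given below. It enumerates small antecedents by brute force rather than using Apriori pruning, and it represents the failure item implicitly through set membership; the function names, the `max_size` parameter and the toy data are assumptions made for illustration only.

```python
from itertools import combinations

def mine_discriminative_sets(failure_set, nonfailure_set, tau_p, max_size=2):
    """Return (antecedent, confidence) pairs for rules X -> f with conf > tau_p.

    Brute-force enumeration stands in for Apriori here; the confidence of
    X -> f over D = D_f u D_nf is count(X in D_f) / count(X in D).
    """
    items = sorted(set().union(*failure_set))  # DTCs seen before failures of the part
    rules = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            x = frozenset(combo)
            n_fail = sum(1 for t in failure_set if x <= t)
            n_other = sum(1 for t in nonfailure_set if x <= t)
            if n_fail == 0:
                continue  # antecedent never precedes a failure of this part
            conf = n_fail / (n_fail + n_other)
            if conf > tau_p:
                rules.append((x, conf))
    return rules

def coverage_in_failure_set(rules, failure_set):
    """c_R restricted to D_f: share of failure transactions matched by any rule."""
    hit = sum(1 for t in failure_set if any(x <= t for x, _ in rules))
    return hit / len(failure_set)

# Hypothetical DTC labels A, B, C collected in the d days before each failure.
d_f = [frozenset({"A", "B"}), frozenset({"A"}), frozenset({"C"})]
d_nf = [frozenset({"B"}), frozenset({"C"}), frozenset({"B", "C"})]

rules = mine_discriminative_sets(d_f, d_nf, tau_p=0.8)
c_r = coverage_in_failure_set(rules, d_f)
accepted = c_r > 0.5  # R_c = 0.5 is an arbitrary illustrative threshold
```

Here only antecedents containing A survive the confidence threshold, and together they cover two of the three failure transactions.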
Since DTCs occur before actual failure, time delay between DTC occurrence and part failure defines the dependency between them. At step 204, a Bayesian reliability model or a Bayesian network is generated based on the dependence between failure of part P_{i }and its set of discriminative DTCs disc_{i }as illustrated in
F(t)=1−e^{−(t/α)^{β}} (1)
In the context of the present disclosure, a uniform prior distribution is assumed for α_{i} and β_{i}, with lower limit 0 and upper limits a>0 and b>0 respectively, i.e., P(α_{i})˜U(0, a) and P(β_{i})˜U(0, b).
Node τ_{i }in
P(τ_{ij}|α_{Dj},β_{Dj})˜Weibull(α_{Dj},β_{Dj}) (2)
In the context of the present disclosure, a uniform prior distribution is assumed for the delay parameters, with lower limit 0 and upper limits a_{Dj}>0 and b_{Dj}>0 respectively, i.e., P(α_{Dj})˜U(0, a_{Dj}) and P(β_{Dj})˜U(0, b_{Dj}).
Since DTCs occur before part failure, in the context of the present disclosure, node t_{D }represents variable t_{Dj }that refers to time at which DTC D_{j }occurs in terms of variables t_{pi }and τ_{ij}. In an embodiment, time t_{Dj }follows normal distribution with mean as t_{pi}−τ_{ij }with standard deviation σ_{j }as represented in equation (3) herein below.
P(t_{Dj}|τ_{ij},σ_{j})˜N(t_{pi}−τ_{ij},σ_{j}) (3)
wherein the prior for σ_{j} is a uniform distribution, i.e., σ_{j}˜U(0, s_{j}).
In accordance with the present disclosure, the Bayesian network approximates the joint distribution with one node represented by t_{pi}, one node for every DTC D_{j} in the discriminative set of P_{i} represented by τ_{ij}, one node for every D_{j}∈disc_{i} represented by t_{Dj}, and the rest of the nodes representing the parameters of these three random variables. All parameters of the model, namely the Weibull parameters for failures and DTCs, and the mean and variance of the normally distributed delays, are learned using Markov Chain Monte Carlo (MCMC) sampling given past failure data and DTC data from a population of connected machines.
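The generative direction of this network can be sketched by forward sampling, assuming the parameters are already known. The sketch below does not perform the MCMC learning described above; it only draws a failure time, a delay and a DTC time in the order implied by equations (1) to (3). The parameter values and the function name are hypothetical.

```python
import math
import random

def sample_machine(rng, alpha_i, beta_i, alpha_d, beta_d, sigma):
    """Draw (t_pi, tau_ij, t_Dj) for one machine by ancestral sampling."""
    t_fail = rng.weibullvariate(alpha_i, beta_i)  # failure time, scale alpha_i, shape beta_i
    delay = rng.weibullvariate(alpha_d, beta_d)   # delay between DTC and failure
    t_dtc = rng.gauss(t_fail - delay, sigma)      # DTC time, normal around t_pi - tau_ij
    return t_fail, delay, t_dtc

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Illustrative parameter values only: failure ~ Weibull(1000, 1.5), delay ~ Weibull(30, 2).
samples = [sample_machine(rng, 1000.0, 1.5, 30.0, 2.0, 2.0) for _ in range(20000)]

# The average gap between failure and DTC should be close to the mean delay,
# which for a Weibull distribution is alpha * Gamma(1 + 1/beta).
mean_gap = sum(t_fail - t_dtc for t_fail, _, t_dtc in samples) / len(samples)
expected_gap = 30.0 * math.gamma(1.0 + 1.0 / 2.0)
```

In a learning setting the same structure would be inverted: observed failure and DTC times are fixed and the Weibull and normal parameters are sampled, for example with an MCMC library.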
At step 206, future failures are predicted based on both past failure data and discriminative DTCs that have been modeled as the Bayesian network in step 204. Let N_{pi} represent the number of failures of part P_{i} during a particular time frame in the future based on DTCs, and let M_{pi} represent the expected number of failures in the same time frame using the traditional reliability model. By statistically comparing N_{pi} and M_{pi}, an early warning is generated by system 100 when N_{pi} is higher than M_{pi} by a predetermined value.
Given m connected machines and time t_{0}, suppose P_{i} has failed in n out of m connected machines, and let V be the set of the remaining r (=m−n) connected machines. The probability that the part P_{i} will fail in [t_{1}, t_{2}] (t_{2}>t_{1}>t_{0}), given that it has survived till t_{0}, is represented by equation (4) herein below.
P(t_{1}≤t≤t_{2}|t>t_{0})=(F(d_{2})−F(d_{1}))/(1−F(d_{0})) (4)
where d_{k }is the time calculated till t_{k }from some initial time and F(t) is the probability that the part will fail before time t.
F(t), as known in the art, can be represented by equation (5) herein below.
F(t)=1−e^{−(t/α)^{β}} (5)
where α and β are scale and shape parameters of Weibull distribution and S(t) is called survival probability which can be represented by equation (6) herein below.
S(t)=1−F(t) (6)
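The Weibull failure and survival probabilities of equations (5) and (6) can be sketched as follows. The conditional failure probability is written here in the standard form (F(d_{2})−F(d_{1}))/S(d_{0}), which is an assumption about the form of equation (4); the function names are hypothetical.

```python
import math

def weibull_cdf(t, alpha, beta):
    """F(t) = 1 - exp(-(t/alpha)^beta), with scale alpha and shape beta."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / alpha) ** beta))

def survival(t, alpha, beta):
    """S(t) = 1 - F(t)."""
    return 1.0 - weibull_cdf(t, alpha, beta)

def conditional_failure_probability(d0, d1, d2, alpha, beta):
    """P(failure in [d1, d2] | survived until d0), assuming d0 < d1 < d2.

    d0, d1, d2 are the times t0, t1, t2 measured from the chosen initial time.
    """
    return (weibull_cdf(d2, alpha, beta) - weibull_cdf(d1, alpha, beta)) / survival(d0, alpha, beta)

# Illustrative check with beta = 1, where the Weibull reduces to an exponential.
p = conditional_failure_probability(100.0, 150.0, 250.0, alpha=100.0, beta=1.0)
```

With shape parameter 1 the distribution is memoryless, so the result depends only on the offsets of the interval from d0.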
Failure probability of part P_{i} for every v∈V is calculated using the traditional basic reliability model (BRM) as well as the sensor-augmented reliability model (SARM) that incorporates DTCs. In the conventional BRM, only past failure data of part P_{i} is utilized to calculate the failure probability in the r vehicles. The 'failure parameters' are used as scale and shape parameters, and the selling time t_{s}^{v} of every vehicle v∈V is used as the initial time, i.e., d_{k}^{v}=t_{k}−t_{s}^{v} in equations 4 and 5. Then p_{i}^{v}, the failure probability for every v∈V, is calculated.
In the SARM of the present disclosure, both past failure data and DTCs are utilized. The remaining r vehicles in which part P_{i} has not failed till t_{0} are divided into two sets: 1) V_{1}: the set of vehicles in which at least one of D_{j}∈disc_{i} has occurred in [t_{0}−d, t_{0}], and 2) V_{0}: the set of vehicles in which none of D_{j}∈disc_{i} has occurred in [t_{0}−d, t_{0}].
For vehicles in V_{1}, since at least one of the D_{j}∈disc_{i} has occurred in [t_{0}−d, t_{0}], for every v∈V_{1} the occurrence time t_{Dj} of DTC D_{j} is used as the initial time, i.e., d_{k}=t_{k}−t_{Dj}. Further, using 'delay parameters' as scale and shape parameters in equations 4 and 5, p_{i1}^{v}, the failure probability of part P_{i} for every v∈V_{1}, is calculated given that the DTC D_{j} occurred at time t_{Dj}. In case more than one discriminative DTC has occurred in a vehicle, the DTC with the highest confidence is used. Further, since the confidence conf(r) of rule r: D_{j}→P_{i} implies that in a fraction (1−conf(r)) of cases DTC D_{j} will lead to a failure other than that of P_{i}, p_{i1}^{v} is marginalized with equation 7 given herein below.
p′_{i1}^{v}=conf(r)p_{i1}^{v}+(1−conf(r))p_{i}^{v} (7)
wherein p_{i}^{v} is the probability calculated for every v∈V_{1} as in the case of BRM.
For vehicles in V_{0}, the failure probability of part P_{i} for every v∈V_{0} is calculated using only past failure data, i.e., using 'failure parameters' as scale and shape parameters in equations 4 and 5, p_{i0}^{v} for every v∈V_{0} is calculated. The selling time t_{s}^{v} of a vehicle v∈V_{0} is used as the initial time, i.e., d_{k}^{v}=t_{k}−t_{s}^{v}. For every v∈V_{0}, no DTC D_{j}∈disc_{i} has occurred in [t_{0}−d, t_{0}]; however, by the definition of c_{R}, in a fraction c_{R} of cases at least one of D_{j}∈disc_{i} will occur in [t_{pi}−d, t_{pi}] before the failure of part P_{i}. The probability p_{i0}^{v} is therefore marginalized using c_{R} with equation 8 given herein below.
p′_{i0}^{v}=p_{i0}^{v}(1−c_{R})(1−p_{ij})+p_{i0}^{v}c_{R}p_{ij} (8)
wherein p_{ij} is the probability that at least one of the DTCs D_{j}∈disc_{i} will occur in [t_{0}, t_{1}]. p_{ij} for the part P_{i} is then learned using the n failures of part P_{i} that occurred till t_{0}.
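Equations 7 and 8 translate directly into two small functions, sketched below. The function and argument names are hypothetical, and the numeric inputs are illustrative values rather than data from the disclosure.

```python
def marginalize_with_dtc(conf_r, p_i1, p_i):
    """Equation 7: blend the DTC-based and BRM probabilities by rule confidence.

    conf_r: confidence of the rule D_j -> P_i
    p_i1:   failure probability from the delay parameters (DTC observed)
    p_i:    failure probability from the BRM for the same vehicle
    """
    return conf_r * p_i1 + (1.0 - conf_r) * p_i

def marginalize_without_dtc(p_i0, c_r, p_ij):
    """Equation 8: adjust the BRM probability for a vehicle with no DTC so far.

    p_i0: failure probability from the failure parameters
    c_r:  coverage of the rule set in the failure set
    p_ij: probability that a discriminative DTC occurs in [t0, t1]
    """
    return p_i0 * (1.0 - c_r) * (1.0 - p_ij) + p_i0 * c_r * p_ij

# Illustrative values only.
p_v1 = marginalize_with_dtc(conf_r=0.8, p_i1=0.5, p_i=0.1)
p_v0 = marginalize_without_dtc(p_i0=0.1, c_r=0.9, p_ij=0.2)
```

With these inputs, a vehicle that has shown a discriminative DTC keeps most of its elevated probability, while a vehicle without a DTC has its baseline probability scaled down.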
To determine whether the volumes of failures predicted by the BRM and by the SARM of the present disclosure differ significantly enough to declare an early warning, failure of part P_{i} in a vehicle is considered as a Bernoulli distributed random variable with parameter p (the probability of the failure of P_{i}). Since the probability of the failure of P_{i} differs across vehicles, the failures of P_{i} in r vehicles form r independent and non-identically distributed Bernoulli random variables. The sum of these r variables forms another random variable X representing the number of failures of P_{i} in [t_{1}, t_{2}], which follows a Poisson binomial distribution. The mean of X for both cases is the sum of the per-vehicle failure probabilities, as shown in equations 9 and 10 herein below.
M_{Pi}=Σ_{v∈V} p_{i}^{v} (9)
N_{Pi}=Σ_{v∈V_{1}} p′_{i1}^{v}+Σ_{v∈V_{0}} p′_{i0}^{v} (10)
Similarly, the variance of X for both cases is given as
Var_{MPi}=Σ_{v∈V} p_{i}^{v}(1−p_{i}^{v}) (11)
Var_{NPi}=Σ_{v∈V_{1}} p′_{i1}^{v}(1−p′_{i1}^{v})+Σ_{v∈V_{0}} p′_{i0}^{v}(1−p′_{i0}^{v}) (12)
In accordance with the present disclosure, early warning for part P_{i }is reported if N_{Pi}−M_{Pi}>τ_{Pi}, where τ_{Pi }is decided based on N_{Pi }and Var_{NPi}.
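A minimal sketch of the early warning test follows. It uses the standard Poisson binomial mean (sum of the probabilities) and variance (sum of p(1−p)). The disclosure states only that τ_{Pi} is decided based on N_{Pi} and Var_{NPi}; the choice τ_{Pi}=k·sqrt(Var_{NPi}) and the value of k below are assumptions made purely for illustration.

```python
import math

def poisson_binomial_mean(probs):
    """Expected number of failures for independent Bernoulli(p_v) variables."""
    return sum(probs)

def poisson_binomial_variance(probs):
    """Variance of the number of failures: sum of p_v * (1 - p_v)."""
    return sum(p * (1.0 - p) for p in probs)

def early_warning(sarm_probs, brm_probs, k=2.0):
    """Report a warning if N_Pi - M_Pi exceeds a hypothetical threshold tau_Pi.

    tau_Pi = k * sqrt(Var_NPi) is an assumed rule, not the disclosed one.
    """
    n_pi = poisson_binomial_mean(sarm_probs)
    m_pi = poisson_binomial_mean(brm_probs)
    tau_pi = k * math.sqrt(poisson_binomial_variance(sarm_probs))
    return n_pi - m_pi > tau_pi

# Illustrative population of 100 vehicles: the BRM sees a uniform 1% risk,
# while the SARM raises the risk to 50% for 20 vehicles with discriminative DTCs.
brm = [0.01] * 100
sarm = [0.01] * 80 + [0.5] * 20

warn = early_warning(sarm, brm)
no_warn = early_warning(brm, brm)
```

In this example the SARM predicts about 10.8 failures against 1.0 from the BRM, which exceeds the assumed threshold, whereas identical predictions never trigger a warning.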
Once an early warning for part P_{i} is detected, possible root causes are determined in step 208 by identifying and characterizing a subset of the connected machines by rules that point to possible causes for the anomaly. In some cases, the subset that deviates significantly from the rest of the population of connected machines in terms of failure rate is small compared to the full population, so that no early warning results when the analysis is performed at the full population level. System 100 of the present disclosure addresses two possible scenarios: 1) root cause analysis to find a subset of the connected machines when an early warning for some part is identified at the full population level, and 2) rule learning to find a subset of vehicles when an early warning is not visible at the full population level, i.e., the difference between M_{Pi} and N_{Pi} is insignificant at the population level, but there is a small unknown subset of the population that deviates significantly from the rest of the population in terms of failure rate.
In accordance with the present disclosure, to find a subset of vehicles that could be a possible reason for an already identified early warning, the probabilities computed in equations 7 and 8 herein above are used to determine vehicle-level expected failure times, i.e., e_{i}^{v}, the expected time of failure of part P_{i}, is computed given that it has survived till time t_{0} for every v∈V, which is given by equation 13 herein below.
wherein t_{v} and t′_{v} are times of the vehicle till t_{0} starting from some initial time and S(t) is the survival probability.
Failure probability of part P_{i }for every vεV is analyzed using traditional basic reliability model (BRM) as well as the sensoraugmented reliability model (SARM) that incorporates DTCs. In BRM, failure parameters are used as scale and shape parameters in equation 13 to calculate e_{i}_{1}^{v }for every vεV. In an embodiment, selling time of vehicle v is used as initial time i.e. t′_{v}=t_{v}=t_{0}−t_{s}^{v}.
In SARM, expected time of failure e_{i}_{2}^{v }of part P_{i }is determined by dividing r vehicles into sets V_{0 }and V_{1}. For every vεV_{0}, expected time of failure is calculated. But for the vehicles in V_{1}, delay parameters are used in equation 13 to calculate expected time of failure of P_{1}. Also, occurrence time t_{Dj }of DTC D_{j }is used as initial time i.e. t_{v}=t_{0}−t_{D}_{j }and t′_{v}=t_{D}_{j}″−t_{s}^{v},
Thus, for every v∈V, there are two expected times of failure, e_{i1}^{v} and e_{i2}^{v}, of part P_{i}, calculated using BRM and SARM respectively as described herein above. Further, e_{i1}^{v} and e_{i2}^{v} are used to define the anomaly score of part P_{i} for v∈V, which is given by the following equation.
α_{i}^{v}=e_{i1}^{v}−e_{i2}^{v} (14)
Vehicles from the population that have an anomaly score greater than a predefined threshold are selected to find a subset of vehicles that could point to possible reasons for the early warning. Using feature-by-feature analysis, a collection of features of the connected machines (for instance, model, year of manufacture, plant of manufacture, geography, supplier, etc.) that differ statistically between the two sets, i.e., the entire population and the subset exhibiting high anomaly scores, is determined.
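By way of a non-limiting illustration, the selection of high-anomaly vehicles and the feature-by-feature comparison may be sketched as follows. The disclosure does not name a particular statistical test; the z-score of the subset share against the population share used here is an assumed choice, and the function names, records, scores, and threshold are hypothetical.

```python
import math
from collections import Counter

def high_anomaly_subset(vehicles, scores, threshold):
    """Vehicles whose anomaly score exceeds the predefined threshold."""
    return [v for v, a in zip(vehicles, scores) if a > threshold]

def feature_deviation(population, subset, feature):
    """Per feature value: (share in population, share in subset, z-score)."""
    if not subset:
        return {}
    pop_counts = Counter(v[feature] for v in population)
    sub_counts = Counter(v[feature] for v in subset)
    n_pop, n_sub = len(population), len(subset)
    result = {}
    for value, count in pop_counts.items():
        p_pop = count / n_pop
        p_sub = sub_counts.get(value, 0) / n_sub
        # Assumed test: subset share against the population share.
        se = math.sqrt(p_pop * (1.0 - p_pop) / n_sub)
        z = (p_sub - p_pop) / se if se > 0 else 0.0
        result[value] = (p_pop, p_sub, z)
    return result

# Hypothetical vehicles and anomaly scores (in days).
vehicles = [
    {"model_year": 2012, "plant": 1}, {"model_year": 2012, "plant": 2},
    {"model_year": 2011, "plant": 1}, {"model_year": 2011, "plant": 2},
    {"model_year": 2013, "plant": 1}, {"model_year": 2013, "plant": 2},
    {"model_year": 2012, "plant": 1}, {"model_year": 2011, "plant": 2},
]
scores = [40, 35, 2, 1, 3, 0, 38, 4]

subset = high_anomaly_subset(vehicles, scores, threshold=30)
by_year = feature_deviation(vehicles, subset, "model_year")
by_plant = feature_deviation(vehicles, subset, "plant")
```

In this hypothetical data every high-anomaly vehicle is of model year 2012, so that feature value shows a large positive z-score, whereas the plant shares in the subset stay close to those of the population.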
It is also possible that only a small and unknown subset of vehicles deviates significantly from the rest of the population in terms of failure rate, while the count difference between N_{Pi} and M_{Pi} remains insignificant at the population level. To ensure that such situations are detected, a subgroup of the population exhibiting a high anomaly score is to be identified. In accordance with an embodiment, a subgroup discovery technique is utilized, wherein association rules are clustered to obtain a small set of rules that nevertheless cover a large fraction of the data. As described above, each connected machine or a part thereof can be ascribed an anomaly score, and in accordance with the present disclosure, subgroups of vehicles with high anomaly scores, which are characterized by rules based on machine/part attributes or features such as model, year of manufacture, plant of manufacture, geography, supplier, etc., are to be identified. Each connected machine can be viewed as a record with its features as fields and an anomaly score for part P_{i} that is discretized into two levels, either ‘high’ or ‘normal’, i.e., the anomaly field divides the set of vehicles V into two sets, E_{i} representing an early warning set and NE_{i} representing a non-early-warning set for part P_{i}, wherein
E_{i}={v∈V:α_{i}^{v}>τ_{a}} (15)
and NE_{i}=E_{i}^{c}, i.e., the complement set of E_{i}.
In accordance with an embodiment, association rules with the ‘high’ anomaly set E_{i} as the consequent, e.g. X & Y ... → E_{i}, which also satisfy a reasonable minimum confidence (e.g. conf(r)>0.75), are first mined. These rules are then sorted in decreasing order of support, to choose a leading subset that covers a large enough fraction of the data (e.g. at least 50%). Next, the rules are clustered using a density-based technique such as DBSCAN and a distance measure that is inversely proportional to the degree of overlap between two rules, i.e., the number of records that satisfy both rules. As a result, each rule cluster contains rules that strongly overlap with each other; conversely, rules from different clusters have low mutual overlap. Finally, one or more rules that have high support and confidence are selected from each cluster, arriving at a small set of rules that each identify a subset of connected machines with predominantly high anomaly scores; the antecedents of such rules point to potential causes for the high anomaly scores observed. Further, each rule can be refined with its ‘exceptions’ by rerunning the above procedure on only the data covered by the antecedents of the rule, but this time using NE_{i} as the consequent. The rule-learning procedure, as described herein above, has to be executed regardless of whether an anomaly is detected at the level of the entire population or not, since it is designed especially for the situation where an anomaly is not visible at the population level.
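By way of a non-limiting illustration, the rule-learning steps described above may be sketched as follows. Several details are assumptions made for illustration: antecedents are enumerated by brute force up to two conditions, coverage of the leading subset is measured over the high-anomaly records, the distance between two rules is taken as one minus the Jaccard overlap of the records they cover, and rules are grouped by linking any two rules within a distance eps, which is a simple stand-in for a density-based technique such as DBSCAN rather than DBSCAN itself. All function names and records are hypothetical.

```python
from itertools import combinations

def mine_rules(records, features, target="high", min_conf=0.75, max_len=2):
    """Rules 'antecedent -> target' as (antecedent, covered_ids, support, confidence)."""
    items = sorted({(f, r[f]) for r in records for f in features}, key=str)
    rules = []
    for k in range(1, max_len + 1):
        for antecedent in combinations(items, k):
            if len({f for f, _ in antecedent}) < k:
                continue  # skip antecedents that constrain one feature twice
            covered = frozenset(i for i, r in enumerate(records)
                                if all(r[f] == v for f, v in antecedent))
            if not covered:
                continue
            hits = sum(records[i]["anomaly"] == target for i in covered)
            confidence = hits / len(covered)
            if confidence > min_conf:
                rules.append((antecedent, covered, len(covered), confidence))
    return sorted(rules, key=lambda rule: -rule[2])  # decreasing support

def leading_rules(rules, records, target="high", min_cover=0.5):
    """Shortest prefix of the sorted rules covering min_cover of the target records."""
    target_ids = {i for i, r in enumerate(records) if r["anomaly"] == target}
    chosen, covered = [], set()
    for rule in rules:
        chosen.append(rule)
        covered |= rule[1] & target_ids
        if target_ids and len(covered) / len(target_ids) >= min_cover:
            break
    return chosen

def overlap_distance(rule_a, rule_b):
    """Assumed distance: 1 - Jaccard overlap of the covered records."""
    return 1.0 - len(rule_a[1] & rule_b[1]) / len(rule_a[1] | rule_b[1])

def cluster_rules(rules, eps=0.5):
    """Group rules linked by distance <= eps (stand-in for DBSCAN)."""
    unassigned, clusters = set(range(len(rules))), []
    while unassigned:
        seed = min(unassigned)
        unassigned.remove(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            near = sorted(j for j in unassigned
                          if overlap_distance(rules[current], rules[j]) <= eps)
            unassigned.difference_update(near)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(cluster)
    return clusters

def representatives(rules, clusters):
    """One rule per cluster, preferring high support and then high confidence."""
    return [max((rules[i] for i in c), key=lambda rule: (rule[2], rule[3]))
            for c in clusters]

# Hypothetical records: (model year, geography, plant, discretized anomaly).
rows = [(2012, "US", 1, "high"), (2012, "US", 1, "high"), (2012, "US", 2, "high"),
        (2012, "US", 2, "high"), (2012, "EU", 1, "normal"), (2011, "US", 1, "normal"),
        (2011, "EU", 2, "normal"), (2011, "US", 2, "normal"), (2013, "EU", 3, "high"),
        (2013, "EU", 3, "high"), (2013, "US", 3, "high"), (2013, "EU", 1, "normal"),
        (2011, "EU", 1, "normal"), (2013, "US", 1, "normal")]
records = [dict(model_year=y, geography=g, plant=p, anomaly=a) for y, g, p, a in rows]

rules = mine_rules(records, ["model_year", "geography", "plant"])
leading = leading_rules(rules, records, min_cover=0.9)
chosen = representatives(leading, cluster_rules(leading))
```

In this hypothetical data the procedure arrives at two representative rules, one with the antecedent model year 2012 and one with the antecedent plant 3. Exceptions to a chosen rule could be explored in the same manner by calling mine_rules with target="normal" on only the records that the rule covers.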
Systems and methods of the present disclosure were validated by predicting future failures on a real-life dataset of an automobile manufacturer and by comparing the predictions against actual failures and against expected failures calculated using BRM. Three scenarios encountered in real-world data were considered, viz., 1) when actual failure volumes are significantly higher than expected, i.e., a case of early warning, 2) when actual failure volumes are lower than expected, and 3) when actual and expected failure volumes match. The ‘expected’ failure volumes referred to herein are those predicted by a traditional basic reliability model (BRM) based on past failure data alone. In each of the above cases, it was seen that the augmented model (SARM) of the present disclosure predicts future volumes that are closer to actual numbers than those of the traditional model (BRM). The SARM model of the present disclosure was also validated for root cause analysis and rule learning on synthetic data.
Real-Life Data:
Three datasets including sales data, DTC data and claims data were considered. Table I shows the number of unique vehicles, i.e., vehicle identification numbers (VINs), and the time period for which all three datasets are available.
Based on the availability of data, two parts P_{G} and P_{B} were chosen to validate methods and systems of the present disclosure, and results were computed for the three scenarios mentioned herein above. Data profiling for these two parts is given in Table II.
In accordance with the methods of the present disclosure, firstly a discriminative set of DTCs for each part is identified. Table III shows the rules identified for parts P_{G }and P_{B}.
It shows that there are three rules with τ_{p}=0.7 and R_{c}=0.8 for both the parts. It also shows the support of each rule in the whole database D and in a failure set D_{f}. So, the disc_{i} for the part P_{G} is {P2162, P07E7, P07E0} and for the part P_{B} is {B1304, B100D, B1D21}. For each part, the delay and failure parameters of every DTC in disc_{i} are learnt using a Bayesian graphical model as explained herein above with reference to step 204 of the method of the present disclosure. For the experiment, the Python library ‘pymc’ was used to estimate the parameters of the Bayesian model via Markov Chain Monte Carlo (MCMC) sampling.
As described herein above for the Bayesian model, it is assumed that the delay between part failure and DTC occurrence follows a Weibull distribution with the delay parameters as the scale and shape parameters of the distribution. It was empirically confirmed that the Weibull distribution is a good fit for the variable τ_{ij}. The goodness of fit of the Weibull distribution was compared with that of a Gaussian distribution for the variable τ_{ij}. FIG. 4 illustrates Q-Q (quantile-quantile) plots between the actual data of variable τ_{ij} and data sampled from Weibull and Gaussian distributions, with parameters estimated from the Bayesian model, for all six DTCs. It clearly shows that the Weibull distribution is a better fit than a Gaussian distribution for the variable τ_{ij}. Apart from the Q-Q plots, the Kolmogorov-Smirnov (KS) test was used to check goodness of fit for the variable τ_{ij}. Table IV shows the p-values of the Weibull and Gaussian distributions for the variable τ_{ij}. It also shows the values of the delay and failure parameters in terms of days calculated from the selling time of the vehicle, which were learned for the parts and their discriminative DTCs using the Bayesian model.
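By way of a non-limiting illustration, the comparison of a Weibull fit against a Gaussian fit by means of the Kolmogorov-Smirnov statistic may be sketched as follows. The delay samples are synthetic, the Weibull parameters are the hypothetical values used to generate them rather than parameters estimated from a Bayesian model, and only the KS statistic D is computed, not the p-value reported in Table IV. A statistics package such as SciPy (scipy.stats.kstest) would typically be used to obtain p-values.

```python
import math
import random
import statistics

def weibull_cdf(x, scale, shape):
    """Weibull cumulative distribution function."""
    return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0

def normal_cdf(x, mean, std):
    """Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of samples and a reference CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for k, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from k/n to (k+1)/n at x; check both sides.
        d = max(d, (k + 1) / n - f, f - k / n)
    return d

# Hypothetical delays (in days) between DTC occurrence and part failure.
rng = random.Random(7)
delays = [rng.weibullvariate(30.0, 0.8) for _ in range(2000)]

mu, sigma = statistics.mean(delays), statistics.pstdev(delays)
d_weibull = ks_statistic(delays, lambda x: weibull_cdf(x, 30.0, 0.8))
d_normal = ks_statistic(delays, lambda x: normal_cdf(x, mu, sigma))
```

A smaller statistic indicates a closer fit. For skewed, strictly positive delays such as these, the Gaussian reference places a substantial share of its probability mass below zero, which appears as a much larger D than that of the Weibull reference.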
In accordance with the present disclosure, the future numbers of failures of parts P_{G} and P_{B} are then predicted using the delay and failure parameters shown in Table IV. For part P_{G}, claim data and DTC data till t_{0} are used to identify the discriminative set of DTCs and to learn the Bayesian model parameters. The number of failures for the month M1 (>t_{0}) is predicted and compared against the expected number of failures and the actual failures. Similarly, for part P_{B}, the numbers of failures for two months, M2 and M3, are predicted.
Synthetic Data:
Three scenarios, 1) NW: non-early warning, 2) EW1: early warning 1, and 3) EW2: early warning 2, were simulated. For NW, 0.1 million records were generated for vehicles with purchase date, miles driven, DTC occurrence dates, and part failure dates, assuming purchase dates to be uniformly distributed over a period of 5 years. The number of miles driven per day for each vehicle was drawn from a uniform distribution. Failure times were generated for 10 parts using Weibull distributions with different scale and shape parameters. DTC data with DTC codes that become observable prior to the part failure were also generated. The delay between part failure and DTC occurrence was generated using a Weibull distribution with different parameters for each DTC. Attributes or features of the vehicles generated are shown in Table V.
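By way of a non-limiting illustration, synthetic records of the kind described for the NW scenario may be generated as sketched below. The sketch covers a single part with a single DTC rather than 10 parts, and the start date, the mileage range, and all Weibull scale and shape values are hypothetical, since the disclosure does not state the values used.

```python
import random
from datetime import date, timedelta

def generate_vehicles(n, seed=0, years=5, start=date(2010, 1, 1)):
    """Synthetic vehicle records with purchase, DTC and part failure dates."""
    rng = random.Random(seed)
    fleet = []
    for vehicle_id in range(n):
        # Purchase dates uniformly distributed over the sales period.
        purchase = start + timedelta(days=rng.randrange(365 * years))
        miles_per_day = rng.uniform(10.0, 60.0)
        # Days from purchase to part failure; weibullvariate(scale, shape).
        failure_age = rng.weibullvariate(1500.0, 1.8)
        # Days by which the DTC precedes the failure.
        delay = rng.weibullvariate(40.0, 1.2)
        dtc_age = max(0.0, failure_age - delay)
        fleet.append({
            "vehicle_id": vehicle_id,
            "purchase_date": purchase,
            "miles_per_day": miles_per_day,
            "dtc_date": purchase + timedelta(days=dtc_age),
            "failure_date": purchase + timedelta(days=failure_age),
        })
    return fleet

fleet = generate_vehicles(1000)
```

An early warning of the kind used in EW1 could then be induced by shortening failure_age for the vehicles of a chosen model year before the dates are derived.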
In the case of EW1, an early warning in vehicles of model year ‘2012’ was induced by bringing forward the failure times of part P_{i} in these vehicles. Similarly, in the case of EW2, an early warning in vehicles of model year ‘2012’, of geography ‘US’, and manufactured at plants 1 and 2 was induced by bringing forward the failure times of part P_{i} in these vehicles. Given the time t_{0}, failures of part P_{i} were predicted using the early warning approach and compared against expected failures and actual failures.
Table VI contains the average anomaly scores for vehicles across the whole population for each of the above three cases.
It is seen that the anomaly score for EW1 is almost twice the score of the no-warning case (NW), and is therefore visible at the full population level. However, the EW2 warning is not visible in the full population, with its anomaly score equivalent to that of the NW or no-early-warning case.
Root cause analysis was performed to reveal the potential cause for EW1. EW2 can be detected only by rule learning since it is invisible from average anomaly scores at a whole population level. Root cause analysis for the part P_{i }is demonstrated through a visual analytics workbench illustrated in
For rule learning of the part P_{i} in the case of EW2, vehicles in which part P_{i} has not failed till t_{0} are divided into two sets, ‘early warning’ and ‘non-early warning’, using a threshold τ_{a} taken as the average of the anomaly scores of all vehicles. Further, rules and exceptions over the attributes of vehicles are identified using the techniques explained with reference to equations 13, 14 and 15.
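By way of a non-limiting illustration, the division of vehicles into the two sets using the average anomaly score as the threshold may be sketched as follows. The function name, vehicle identifiers, and scores are hypothetical.

```python
from statistics import mean

def split_by_mean_anomaly(scores):
    """Split vehicles into early warning and non-early-warning sets.

    scores maps a vehicle identifier to its anomaly score; the threshold
    tau_a is taken as the average score over all vehicles.
    """
    tau_a = mean(scores.values())
    early = {v for v, a in scores.items() if a > tau_a}
    non_early = set(scores) - early
    return tau_a, early, non_early

# Hypothetical anomaly scores (in days) for four vehicles.
tau_a, early_set, non_early_set = split_by_mean_anomaly(
    {"VIN1": 1.0, "VIN2": 2.0, "VIN3": 12.0, "VIN4": 1.0}
)
```

The two sets are complements of each other by construction, consistent with equation 15.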
Systems and methods of the present disclosure provide a complete framework to predict early warnings of unexpected failure volumes using sensor-based DTCs together with part failure data, and combine early warnings of failure surges with root cause analysis as well as subgroup discovery. Systems and methods of the present disclosure exploit the availability of indicator sensors embedded in modern vehicles to signal such early warnings, by comparing traditional reliability predictions with those from a model augmented with sensor information as collected and transmitted over the ‘industrial internet’ of highly connected populations of machines. When the two models differ significantly, it is indicative of a population-level anomaly, and perhaps an indicator of the possible need for, say, a recall in the future. In addition to detecting anomalies in the above manner, systems and methods of the present disclosure enable drilling down to discover potential root causes of the anomaly, so that it can be addressed in advance. Finally, systems and methods of the present disclosure identify anomalies in small subgroups that are not statistically visible in the overall population. The efficacy of the systems and methods of the present disclosure is apparent from the experiments performed using real-life data, wherein actual early warnings have been detected, and from experiments using synthetically generated data, wherein root cause analysis and rule discovery were accurately established.
Although systems and methods of the present disclosure have been illustrated with reference to failure prediction in the automobile industry via experimental data herein above, it may be understood by persons skilled in the art that failure prediction plays an important role in many domains, and systems and methods of the present disclosure may be applied to the software, health and insurance domains, and the like.
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments of the invention. The scope of the subject matter embodiments defined here may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language.
It is, however, to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software modules located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the invention may be implemented on different hardware devices, e.g. using a plurality of CPUs.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include, but are not limited to, firmware, resident software, microcode, etc. The functions performed by various modules comprising the system of the present disclosure and described herein may be implemented in other modules or combinations of other modules. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The various modules described herein may be implemented as either software and/or hardware modules and may be stored in any type of non-transitory computer-readable medium or other storage device. Some non-limiting examples of non-transitory computer-readable media include CDs, DVDs, Blu-ray discs, flash memory, and hard disk drives.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Further, although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being indicated by the following claims.