Detection of abnormal behavior using probabilistic distribution estimation
Abstract
Supplied with a string of vector data as input data, a probabilistic distribution estimation apparatus successively reads the string and estimates, using a stochastic model having hidden variables, the probabilistic distribution from which each datum occurs. Specifically, for each value of the input data, the apparatus reads the parameter values of the stochastic model, calculates with the model the certainty with which the input data occurs, renews the parameters in response to the newly read data while forgetting past data, and outputs several parameter values. Using the parameter values received from the probabilistic distribution estimation apparatus, an abnormality detection unit calculates the information amount of the data and outputs it as the abnormal behavior degree.
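The "information amount" used as the abnormal behavior degree can be illustrated as a negative log-likelihood. The following is a minimal sketch, assuming the stochastic model is a Gaussian mixture with diagonal covariances (the hidden variable being the component index); the abstract does not fix the model family, and the function names and example parameters are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at vector x (assumed model)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def information_amount(x, weights, means, variances):
    """Return -log p(x) under the mixture; a larger value means a more unusual x."""
    p = sum(w * gaussian_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances))
    return -math.log(max(p, 1e-300))  # floor avoids log(0) on underflow

# Hypothetical parameter values, standing in for the estimator's output.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

typical = information_amount([0.1, -0.2], weights, means, variances)
unusual = information_amount([20.0, -20.0], weights, means, variances)
```

A point close to one of the learned patterns receives a small score, while a point far from every pattern receives a large one.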
33 Claims
1. An abnormal behavior detection apparatus for detecting abnormalities in data output from a system, comprising:
a probabilistic distribution estimation apparatus for responding to, as input data, a string of vector data representing behavior to estimate, using a stochastic model, a probabilistic distribution occurred in each data by successively reading said string of vector data, said probabilistic distribution estimation apparatus comprising
a parameter storage unit for storing all of parameters for the stochastic model having hidden variables,
certainty calculation means for calculating, in response to said input data, a certainty occurred in said input data and that represents a level of assuredness that said input data is correct and does not contain errors, as the probabilistic distribution, using said stochastic model by reading the parameters of said stochastic model from said parameter storage unit,
parameter renewal means for renewing the parameters in said parameter storage unit in accordance with new read data and with past data previously read using the certainty calculated by said certainty calculation means and using each parameter of said stochastic model read from said parameter storage unit, and
parameter output means for outputting the parameters for the stochastic model stored in said parameter storage unit,
wherein the parameters in said parameter storage unit are renewed based on a parameter renewal rule that utilizes both the certainty and the parameters of said stochastic model, said parameter renewal rule being an oblivious type algorithm for calculating a conditional expected value of a statistical amount weighted with an oblivion coefficient indicative of an accuracy of the past data as time progresses, the oblivious type algorithm being used to treat current data and the past data with appropriate weights in data analysis; and
abnormality detection means for receiving the parameters for the stochastic model output from said parameter output means and for calculating a score indicative of an abnormal behavior degree with respect to the new read data by using the parameters of the probabilistic distribution estimated by said probabilistic distribution estimation apparatus to produce the abnormal behavior degree of said new read data, the abnormal behavior degree being a degree where the new read data is out of whole patterns, the abnormal behavior degree being an indicator of a level of abnormalities that exist in the data output from the system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 25, 26, 27
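The interplay of certainty calculation and parameter renewal in claim 1 can be sketched as an online update with a forgetting factor. This is one plausible reading, assuming a diagonal-covariance Gaussian mixture, with the "certainty" taken as the posterior probability of each hidden component and the "oblivion coefficient" as a constant `r` in (0, 1). The class name `ForgettingMixture`, the helper `responsibilities`, and all numeric values are hypothetical and not taken from the claim.

```python
import math

def responsibilities(x, weights, means, variances):
    """Posterior P(component | x): the assumed reading of the 'certainty'."""
    joint = []
    for w, m, v in zip(weights, means, variances):
        log_p = sum(-0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
                    for xi, mi, vi in zip(x, m, v))
        joint.append(w * math.exp(log_p))
    total = sum(joint) or 1e-300  # guard against total underflow
    return [j / total for j in joint]

class ForgettingMixture:
    """Gaussian mixture whose statistics decay by (1 - r) at every step."""

    def __init__(self, means, variances, r=0.05):
        k = len(means)
        self.r = r  # assumed constant oblivion coefficient
        self.weights = [1.0 / k] * k
        self.means = [list(m) for m in means]
        self.variances = [list(v) for v in variances]
        # Discounted sufficient statistics: weighted first and second moments.
        self.s1 = [[self.weights[j] * mi for mi in m]
                   for j, m in enumerate(self.means)]
        self.s2 = [[self.weights[j] * (vi + mi * mi)
                    for mi, vi in zip(self.means[j], self.variances[j])]
                   for j in range(k)]

    def update(self, x):
        """Renew parameters from one new vector x; return the certainties."""
        r = self.r
        gamma = responsibilities(x, self.weights, self.means, self.variances)
        for j, g in enumerate(gamma):
            # Old statistics are down-weighted, the new datum enters with weight r.
            self.weights[j] = (1 - r) * self.weights[j] + r * g
            for d, xd in enumerate(x):
                self.s1[j][d] = (1 - r) * self.s1[j][d] + r * g * xd
                self.s2[j][d] = (1 - r) * self.s2[j][d] + r * g * xd * xd
                mean = self.s1[j][d] / self.weights[j]
                var = self.s2[j][d] / self.weights[j] - mean * mean
                self.means[j][d] = mean
                self.variances[j][d] = max(var, 1e-6)  # keep variance positive
        return gamma

# Usage: a single-component model tracks a level shift from 0 to 10.
model = ForgettingMixture(means=[[0.0]], variances=[[1.0]], r=0.1)
for _ in range(100):
    model.update([10.0])
```

A larger `r` forgets past data faster and adapts more quickly; a smaller `r` gives smoother but slower-moving estimates.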
11. A method of detecting abnormal behavior of data output from a system, comprising the steps of:
inputting a string of vector data representing behavior as input data;
calculating, using a model in which each data occurs is input thereto by successively reading the string of vector data, a certainty occurred in the input data and that represents a level of assuredness that said input data is correct and does not contain errors, as a probabilistic distribution, on the basis of parameters of said model;
renewing, by using the certainty obtained from the calculating step and the parameters of said stochastic model, the parameters in response to new read data and with past data previously read, the renewing being made based on a parameter renewal rule that utilizes both the certainty and the parameters of said model, said parameter renewal rule being an oblivious type algorithm for calculating a conditional expected value of a statistical amount weighted with an oblivion coefficient indicative of an accuracy of the past data as time progresses, the oblivious type algorithm being used to treat current data and the past data with appropriate weights in data analysis; and
calculating, by using parameters of an estimated probabilistic distribution, an abnormal behavior degree of new read data using a score indicative of the abnormal behavior degree with respect to the new read data to produce the abnormal behavior degree of said new read data, the abnormal behavior degree being a degree where the new read data is out of whole patterns, the abnormal behavior degree being an indicator of a level of abnormalities that exist in the data output from the system.
Dependent claims: 12, 13, 14, 15, 28, 29, 30
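The renewal rule recited in this claim can be written compactly. The notation below is an assumption introduced for illustration: $x_t$ is the datum read at time $t$, $z$ the hidden variable, $\theta_{t-1}$ the parameters before the update, $s(x, z)$ the "statistical amount", $r \in (0, 1)$ the oblivion coefficient, and $S_t$ the discounted statistic from which the new parameters are derived.

```latex
% Recursive form: old statistics shrink by (1 - r), the new datum enters with weight r.
S_t = (1 - r)\, S_{t-1} + r\, \mathbb{E}\bigl[\, s(x_t, z) \mid x_t ;\, \theta_{t-1} \bigr]

% Unrolled form: data read i steps ago carries weight r (1 - r)^{i}.
S_t = (1 - r)^{t} S_0 + r \sum_{i=1}^{t} (1 - r)^{t-i}\,
      \mathbb{E}\bigl[\, s(x_i, z) \mid x_i ;\, \theta_{i-1} \bigr]

% One possible score for the new datum, evaluated before the parameters are renewed.
\mathrm{score}(x_t) = -\log p(x_t \mid \theta_{t-1})
```

Under this reading the weights on past data decay geometrically, so recent data dominates the estimate without any data being discarded outright.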
16. A method of detecting abnormal behavior of data output from a system, comprising the steps of:
inputting a string of vector data representing behavior as input data;
calculating, using a time series model having a continuous time distribution and hidden variables as a probabilistic distribution in which each data occurs by successively reading the string of vector data, a certainty occurred in said input data and that represents a level of assuredness that said input data is correct and does not contain errors, as the probabilistic distribution, on the basis of parameters of said time series model;
renewing, by using said certainty obtained from the calculating step and the parameters of said time series model, the parameters in response to new read data and with past data previously read, the renewing being made based on a parameter renewal rule that utilizes both the certainty and the parameters of said time series model, said parameter renewal rule being an oblivious type algorithm for calculating a conditional expected value of a statistical amount weighted with an oblivion coefficient indicative of accuracy of the past data as time progresses, the oblivious type algorithm being used to treat current data and the past data with appropriate weights in data analysis; and
calculating, by using parameters of an estimated probabilistic distribution, an abnormal behavior degree of the new read data using a score indicative of the abnormal behavior degree with respect to the new read data to produce the abnormal behavior degree of said new read data, the abnormal behavior degree being a degree where the new read data is out of whole patterns, the abnormal behavior degree being an indicator of a level of abnormalities that exist in the data output from the system.
Dependent claims: 17, 18, 19, 20
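For the time series variant, the certainty and score steps can be sketched with a hidden Markov model. This is an illustration under stated assumptions only: a discrete-time HMM with Gaussian emissions and scalar observations is used for brevity, the parameters are held fixed (the renewal step is omitted), and the function names and numeric values are hypothetical. The claim itself does not name a specific model.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def hmm_scores(xs, init, trans, means, variances):
    """Return -log p(x_t | x_1..x_{t-1}) for each observation via forward filtering."""
    k = len(init)
    belief = list(init)  # distribution over the hidden state before seeing x_0
    scores = []
    for t, x in enumerate(xs):
        if t > 0:
            # Predict: push the filtered belief through the transition matrix.
            belief = [sum(belief[i] * trans[i][j] for i in range(k))
                      for j in range(k)]
        joint = [belief[j] * normal_pdf(x, means[j], variances[j])
                 for j in range(k)]
        total = sum(joint)
        if total > 0.0:
            scores.append(-math.log(total))
            belief = [p / total for p in joint]  # posterior over hidden states
        else:
            scores.append(math.inf)  # density underflowed; keep predicted belief
    return scores

# Two sticky hidden states with emission means 0 and 5 (hypothetical values).
init = [0.5, 0.5]
trans = [[0.95, 0.05], [0.05, 0.95]]
scores = hmm_scores([0.1, -0.3, 0.2, 5.1, 4.8, 0.0],
                    init, trans, means=[0.0, 5.0], variances=[1.0, 1.0])
```

In this example the observation at index 3 jumps to the other state and scores higher than its neighbors, because a state switch has low predicted probability under the sticky transition matrix.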
21. A computer readable medium storing an abnormal behavior detection program for execution by a computer for determining an abnormality behavior of data output from a system, wherein, when executing the behavior detection program, the computer performs the steps of:
performing a probabilistic distribution estimation for responding to, as input data, a string of vector data representing behavior to estimate, using a stochastic model, a probabilistic distribution occurred in each data by successively reading said string of vector data, said probabilistic distribution estimation step comprising
a parameter storage step for storing all of parameters for the stochastic model having hidden variables,
a certainty calculation step for calculating, in response to said input data, a certainty occurred in said input data and that represents a level of assuredness that said input data is correct and does not contain errors, as the probabilistic distribution, using said stochastic model by reading the parameters of said stochastic model from said parameter storage step, and
a parameter renewal step for renewing the parameters in said parameter storage step in accordance with new read data and with past data previously read using the certainty calculated by said certainty calculation step and using each parameter of said stochastic model read from said parameter storage step, the renewing being made based on a parameter renewal rule that utilizes both the certainty and the parameters of said stochastic model, said parameter renewal rule being an oblivious type algorithm for calculating a conditional expected value of a statistical amount weighted with an oblivion coefficient indicative of accuracy of the past data as time progresses, the oblivious type algorithm being used to treat current data and the past data with appropriate weights in data analysis; and
performing an abnormality detection for calculating a score indicative of an abnormal behavior degree with respect to the new read data by using the parameters of the probabilistic distribution estimated by said probabilistic distribution estimation step to produce the abnormal behavior degree of said new read data, the abnormal behavior degree being a degree where the new read data is out of whole patterns, the abnormal behavior degree being an indicator of a level of abnormalities that exist in the data output from the system.
Dependent claims: 22, 23, 24, 31, 32, 33
Specification