METHOD AND SYSTEM FOR TRAINING A BIG DATA MACHINE TO DEFEND
First Claim
1. A method for training a big data machine to defend an enterprise system comprising:
retrieving log lines belonging to one or more log line parameters from one or more enterprise system data sources and from incoming data traffic to the enterprise system;
computing one or more features from the log lines;
wherein computing one or more features includes one or more statistical processes;
applying the one or more features to an adaptive rules model;
wherein the adaptive rules model comprises one or more identified threat labels;
further wherein applying the one or more features to an adaptive rules model comprises:
blocking one or more features that has one or more identified threat labels;
generating a features matrix from said applying the one or more features to an adaptive rule module;
executing at least one detection method from a first group of statistical outlier detection methods and at least one detection method from a second group of statistical outlier detection methods on one or more features matrix, to identify statistical outliers;
wherein the first group of statistical outlier detection methods includes a matrix decomposition-based outlier process, a replicator neural networks process and a joint probability process and the second group of statistical outlier detection methods includes a matrix decomposition-based outlier process, a replicator neural networks process and a joint probability process;
wherein the at least one detection method from a first group of statistical outlier detection methods and the at least one detection method from a second group of statistical outlier detection methods are different;
generating an outlier scores matrix from each detection method of said first and second group of statistical outlier detection methods;
converting each outlier scores matrix to a top scores model;
combining each top scores model using a probability model to create a single top scores vector;
generating a GUI output of at least one of:
an output of the single top scores vector and the adaptive rules model;
labeling the said output to create one or more labeled features matrix;
creating a supervised learning module with the one or more labeled features matrix to update the one or more identified threat labels for performing at least one of:
further refining adaptive rules model for identification of statistical outliers; and
preventing access by categorized threats by detecting new threats in real time and reducing the time elapsed between threat detection of the enterprise system.
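The opening steps of the claim (retrieving log lines, computing features with statistical processes, applying the adaptive rules model, and generating the features matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the log format (`entity,bytes_sent,http_status`), the three features chosen, and the names `LOG_LINES`, `THREAT_LABELS`, `compute_features`, and `apply_adaptive_rules` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log format: "entity,bytes_sent,http_status" (an assumption).
LOG_LINES = [
    "alice,120,200",
    "alice,80,200",
    "bob,5000,403",
    "bob,7000,403",
    "carol,100,200",
]

# Hypothetical adaptive rules model: entities already carrying a threat label.
THREAT_LABELS = {"bob": "credential_stuffing"}


def compute_features(log_lines):
    """Aggregate log lines per entity into simple statistical features."""
    grouped = defaultdict(list)
    for line in log_lines:
        entity, size, status = line.split(",")
        grouped[entity].append((int(size), int(status)))
    features = {}
    for entity, rows in grouped.items():
        sizes = [size for size, _ in rows]
        errors = sum(1 for _, status in rows if status >= 400)
        # Features: request count, mean bytes sent, fraction of error responses.
        features[entity] = [len(rows), mean(sizes), errors / len(rows)]
    return features


def apply_adaptive_rules(features, threat_labels):
    """Block entities with an identified threat label; pass the rest on as a matrix."""
    blocked = {e: threat_labels[e] for e in features if e in threat_labels}
    entities = sorted(e for e in features if e not in threat_labels)
    matrix = [features[e] for e in entities]
    return blocked, entities, matrix


features = compute_features(LOG_LINES)
blocked, entities, matrix = apply_adaptive_rules(features, THREAT_LABELS)
```

In this sketch `blocked` holds the entities stopped by an existing threat label, while `matrix` (one row per remaining entity) is what the outlier detection methods would consume.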
Abstract
Disclosed herein are a method and system for training a big data machine to defend an enterprise or e-commerce system. Embodiments retrieve log lines belonging to log line parameters of a system's data source and from incoming data traffic, compute features from the log lines, apply an adaptive rules model with identified threat labels to produce a features matrix, identify statistical outliers by executing statistical outlier detection methods, and may generate an outlier scores matrix. Embodiments may combine a top scores model and a probability model to create a single top scores vector. The single top scores vector and the adaptive rules model may be displayed on a GUI for labeling of malicious or non-malicious scores. Labeled output may be transformed into a labeled features matrix to create a supervised learning module for detecting new threats in real time and reducing the time elapsed between threat detection of the enterprise or e-commerce system.
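Combining per-detector scores into a single top scores vector can be illustrated with a short sketch. The text above does not specify the probability model, so this example swaps in an assumed stand-in: each detector's raw scores are mapped to rank-based empirical probabilities, the probabilities are averaged across detectors, and the highest-ranked entities are kept. The function names and the sample scores are hypothetical.

```python
def to_rank_probabilities(scores):
    """Map raw outlier scores to (0, 1]: fraction of entities scoring at or below each one."""
    n = len(scores)
    return [sum(1 for other in scores if other <= s) / n for s in scores]


def combine_top_scores(score_lists, top_k):
    """Average the per-detector rank probabilities and keep the top_k entities."""
    per_detector = [to_rank_probabilities(scores) for scores in score_lists]
    combined = [sum(column) / len(column) for column in zip(*per_detector)]
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return [(i, combined[i]) for i in ranked[:top_k]]


# Hypothetical raw scores from two different detectors for the same four entities.
detector_a = [0.1, 0.9, 0.3, 0.2]    # e.g. a reconstruction error
detector_b = [5.0, 40.0, 4.0, 30.0]  # e.g. a negative log-likelihood

top = combine_top_scores([detector_a, detector_b], top_k=2)
```

Rank-based normalization is used here because the two detectors report scores on unrelated scales; any other calibrated probability model could take its place.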
19 Claims
1. A method for training a big data machine to defend an enterprise system, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. An apparatus for training a big data machine to defend an enterprise system, the apparatus comprising:
one or more processors;
system memory coupled to the one or more processors;
one or more non-transitory memory units coupled to the one or more processors; and
threat identification and detection code stored on the one or more non-transitory memory units that when executed by the one or more processors are configured to perform a method, comprising:
retrieving log lines belonging to one or more log line parameters from one or more enterprise system data sources and from incoming data traffic to the enterprise system;
computing one or more features from the log lines;
wherein computing one or more features includes one or more statistical processes;
applying the one or more features to an adaptive rules model;
wherein the adaptive rules model comprises one or more identified threat labels;
further wherein the applying the one or more features to an adaptive rules model comprises:
blocking one or more features that has one or more identified threat labels, investigating one or more features, or a combination thereof;
generating a features matrix from said applying the one or more features to an adaptive rule module;
executing at least one detection method from a first group of statistical outlier detection methods and at least one detection method from a second group of statistical outlier detection methods on one or more features matrix, to identify statistical outliers;
wherein the first group of statistical outlier detection methods includes a matrix decomposition-based outlier process, a replicator neural networks process and a joint probability density process and the second group of statistical outlier detection methods includes a matrix decomposition-based outlier process, a replicator neural networks process and a density-based process;
wherein the at least one detection method from a first group of statistical outlier detection methods and the at least one detection method from a second group of statistical outlier detection methods are different;
generating an outlier scores matrix from each detection method of said first and second group of statistical outlier detection methods;
converting each outlier scores matrix to a top scores model;
combining each top scores model using a probability model to create a single top scores vector;
generating a GUI output of at least one of:
an output of the single top scores vector and the adaptive rules model;
labeling the said output to create one or more labeled features matrix;
creating a supervised learning model with the one or more labeled features matrix to update the one or more identified threat labels for performing at least one of:
further refining adaptive rules model for identification of statistical outliers; and
preventing access by categorized threats by detecting new threats in real time and reducing the time elapsed between threat detection of the enterprise system.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
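Claim 11 names a joint probability density process and a density-based process among the outlier detection methods without detailing them here. One way such scoring might look is sketched below, under a loud simplifying assumption: each feature is modeled as an independent Gaussian, so the joint log density is the sum of per-feature log densities. The function names and the sample features matrix are hypothetical, and the claimed process may differ.

```python
import math
from statistics import NormalDist


def fit_feature_densities(matrix):
    """Fit one Gaussian per feature column (independence is an assumption)."""
    return [NormalDist.from_samples(column) for column in zip(*matrix)]


def log_pdf(dist, x):
    """Log of the Gaussian density, computed directly to avoid underflow."""
    z = (x - dist.mean) / dist.stdev
    return -0.5 * z * z - math.log(dist.stdev * math.sqrt(2 * math.pi))


def density_outlier_scores(matrix, densities):
    """Score each row by its negative log joint density; higher means more unusual."""
    return [
        -sum(log_pdf(dist, x) for dist, x in zip(densities, row))
        for row in matrix
    ]


# Hypothetical features matrix: [request count, mean bytes sent] per entity.
matrix = [
    [2, 100.0],
    [3, 110.0],
    [2, 90.0],
    [3, 105.0],
    [40, 9000.0],
]

densities = fit_feature_densities(matrix)
scores = density_outlier_scores(matrix, densities)
most_unusual = scores.index(max(scores))
```

Each feature column needs at least two distinct values for the fit to succeed, since a zero standard deviation leaves the density undefined.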
Specification