METHOD AND SYSTEM FOR ADAPTIVELY REMOVING OUTLIERS FROM DATA USED IN TRAINING OF PREDICTIVE MODELS
Abstract
Described is an improved approach for removing data outliers by filtering out data correlated to detrimental events within a system. One or more detrimental event conditions are defined to identify and handle abnormal transient states in data collected from a monitored system.
21 Claims
1. A method for generating training data for a machine learning system, comprising:
collecting data from a monitored system;
identifying one or more filter conditions, where the one or more filter conditions correspond to identified events that occur within the monitored system;
filtering the data pursuant to the one or more filter conditions, wherein the data is filtered by scanning candidate datasets to identify a datapoint temporally associated with the identified events, and the datapoint is not placed within a filtered dataset; and
performing model training with the filtered data.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A system for generating training data for a machine learning system, comprising:
a processor;
a memory for holding programmable code; and
wherein the programmable code includes instructions for collecting data from a monitored system;
identifying one or more filter conditions, where the one or more filter conditions correspond to identified events that occur within the monitored system;
filtering the data pursuant to the one or more filter conditions, wherein the data is filtered by scanning candidate datasets to identify a datapoint that is a manifestation of the identified events any of the given detrimental events, and the datapoint is not placed within a filtered dataset; and
performing model training with the filtered data.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. A computer program product embodied on a computer readable medium, the computer readable medium having stored thereon a sequence of instructions which, when executed by a processor, executes a method comprising:
collecting data from a monitored system;
identifying one or more filter conditions, where the one or more filter conditions correspond to identified events that occur within the monitored system;
filtering the data pursuant to the one or more filter conditions, wherein the data is filtered by scanning candidate datasets to identify a datapoint that is a manifestation of the identified events any of the given detrimental events, and the datapoint is not placed within a filtered dataset; and
performing model training with the filtered data.
Dependent claims: 16, 17, 18, 19, 20, 21.
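The filtering step recited in the independent claims can be illustrated with a minimal Python sketch. It is only one possible reading, not the implementation described in the specification. The names `DataPoint`, `FilterCondition`, and `filter_training_data` are hypothetical, and "temporally associated" is assumed here to mean that a datapoint's timestamp falls within a fixed window around an identified event.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class DataPoint:
    """One collected sample from the monitored system (hypothetical shape)."""
    timestamp: datetime
    value: float


@dataclass(frozen=True)
class FilterCondition:
    """A time window around one identified event (an assumed representation)."""
    event_time: datetime
    before: timedelta
    after: timedelta

    def matches(self, point: DataPoint) -> bool:
        # A point is treated as temporally associated with the event
        # when its timestamp lies inside the inclusive window.
        start = self.event_time - self.before
        end = self.event_time + self.after
        return start <= point.timestamp <= end


def filter_training_data(
    candidates: Iterable[DataPoint],
    conditions: Iterable[FilterCondition],
) -> List[DataPoint]:
    """Scan the candidate data and keep only points that match no condition."""
    conditions = list(conditions)
    return [
        point
        for point in candidates
        if not any(cond.matches(point) for cond in conditions)
    ]


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1)
    # Six samples, one per minute; an event is assumed at minute 3.
    samples = [DataPoint(t0 + timedelta(minutes=i), float(i)) for i in range(6)]
    event = FilterCondition(
        event_time=t0 + timedelta(minutes=3),
        before=timedelta(minutes=1),
        after=timedelta(minutes=1),
    )
    filtered = filter_training_data(samples, [event])
    print([p.value for p in filtered])
```

In this sketch the filtered list would then be passed to whatever model training routine is in use. The window widths are illustrative values, not parameters taken from the claims.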
Specification