System and method for learning models from scarce and skewed training data
Abstract
A system and method for learning models from scarce and/or skewed training data includes partitioning a data stream into a sequence of time windows. A most likely current class distribution to classify portions of the data stream is determined based on observing training data in a current time window and based on concept drift probability patterns using historical information.
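The windowing step in the abstract can be illustrated with a minimal sketch. It groups labeled records into fixed-width time windows and computes the observed class distribution in each one. The fixed `window_size`, the `(timestamp, label)` record shape, and both function names are assumptions made for illustration; the abstract does not specify them. The sketch also covers only the observed-data part of the estimate, not the concept drift probability patterns.

```python
from collections import Counter, defaultdict

def partition_into_windows(stream, window_size):
    """Group (timestamp, label) records into consecutive time windows.

    window_size is a hypothetical fixed width; the abstract does not fix one.
    Windows that contain no records are skipped.
    """
    windows = defaultdict(list)
    for timestamp, label in stream:
        windows[int(timestamp // window_size)].append(label)
    return [windows[k] for k in sorted(windows)]

def class_distribution(labels):
    """Empirical class distribution of the labeled records in one window."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

# Illustrative data only.
stream = [(0.5, "a"), (1.2, "b"), (2.1, "a"), (2.7, "a"), (3.9, "b"), (4.0, "b")]
windows = partition_into_windows(stream, window_size=2.0)
distributions = [class_distribution(w) for w in windows]
```

With scarce training data, a single window's empirical distribution is noisy, which is presumably why the abstract combines it with historical drift patterns rather than using it alone.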
20 Claims
1. A method for learning models from scarce and/or skewed training data, comprising:
building a classifier based on accumulated training data;
estimating a most likely current class distribution using historical training data; and
selecting historical classifiers based on the most likely class distribution to form a set of classifiers used to classify streaming data with evolving concepts.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer program product for learning models from scarce and/or skewed training data comprising a computer useable medium including a computer readable program, wherein the computer readable program when executed on a computer causes the computer to perform steps of:
building a classifier based on accumulated training data;
estimating a most likely current class distribution using historical training data; and
selecting historical classifiers based on the most likely class distribution to form a set of classifiers used to classify streaming data with evolving concepts.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 19, 20
18. A system for learning models and classifying evolving data, comprising:
a partition module configured to receive a data stream and partition the data stream into a sequence of time windows, each time window including a feature space partitioned into regions; and
at least one classifier having a weight based on a number of classes in each region, the at least one classifier being configured to determine a most likely current class distribution for each window by employing observations of training data in the data stream and employing historical patterns using a concept drift probability model to classify portions of the data stream.
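The selection step recited in claims 1 and 10 can be sketched as follows. Each historical classifier is stored with the class distribution of the window it was trained on, and the classifiers whose stored distribution is closest to the estimated current one are chosen to vote. The claims do not say how closeness is measured or how votes are combined, so the total variation distance, the top-`k` cutoff, the unweighted majority vote, and all names below are assumptions for illustration only, not the patented method.

```python
from collections import Counter

def total_variation(p, q):
    """Total variation distance between two class distributions (dicts)."""
    classes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in classes)

def select_classifiers(history, current_dist, k):
    """Pick the k historical classifiers whose stored class distribution
    is closest to the estimated current one (an assumed selection rule)."""
    ranked = sorted(history, key=lambda item: total_variation(item[0], current_dist))
    return [clf for _, clf in ranked[:k]]

def ensemble_predict(classifiers, x):
    """Unweighted majority vote over the selected classifiers."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Hypothetical history: (class distribution of the training window, classifier).
# The lambdas stand in for trained models.
history = [
    ({"pos": 0.9, "neg": 0.1}, lambda x: "pos"),
    ({"pos": 0.5, "neg": 0.5}, lambda x: "pos" if x > 0 else "neg"),
    ({"pos": 0.1, "neg": 0.9}, lambda x: "neg"),
    ({"pos": 0.2, "neg": 0.8}, lambda x: "neg" if x < 5 else "pos"),
]
current_dist = {"pos": 0.15, "neg": 0.85}  # assumed estimate for the current window
selected = select_classifiers(history, current_dist, k=3)
prediction = ensemble_predict(selected, 1.0)
```

Here the classifier trained on the 90% "pos" window is left out because its distribution is farthest from the current estimate. Claim 18 instead describes weighting classifiers by region, which this sketch does not attempt.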
Specification