Automatic mapping from data to preprocessing algorithms
Abstract
One embodiment is a method to identify a preprocessing algorithm for raw data. The method may include the steps of providing an algorithm knowledge database including preprocessing algorithm data and feature set data associated with the preprocessing algorithm data, analyzing raw data to produce analyzed data, extracting from the analyzed data features that characterize the data, and selecting a preprocessing algorithm using the algorithm knowledge database and features extracted from the analyzed data. Another embodiment is a data mining system for identifying a preprocessing algorithm for raw data using this method. Still another embodiment is a data mining application with improved preprocessing algorithm selection, including (a) an algorithm knowledge database containing preprocessing algorithm data and feature set data associated with the preprocessing algorithm data; (b) a data analysis module adapted to receive control of the data mining application when the data mining application begins; (c) a feature extraction module adapted to receive control of the data mining application from the data analysis module and available to identify a set of features; and (d) an algorithm selection module available to receive control from the feature extraction module and available to identify a preprocessing algorithm based upon the set of features identified by the feature extraction module using the algorithm knowledge database.
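The four steps of the method can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and does not come from the patent: the `KnowledgeEntry` structure, the feature names `has_missing` and `has_outliers`, the algorithm names, the two-standard-deviation outlier test, and the "largest matching feature set" selection rule.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class KnowledgeEntry:
    """One hypothetical row of the algorithm knowledge database."""
    algorithm: str          # preprocessing algorithm data
    feature_set: frozenset  # feature set data associated with the algorithm


# Illustrative entries only; the patent does not list these.
KNOWLEDGE_DB = [
    KnowledgeEntry("impute_missing_values", frozenset({"has_missing"})),
    KnowledgeEntry("remove_outliers", frozenset({"has_outliers"})),
    KnowledgeEntry("impute_then_remove_outliers",
                   frozenset({"has_missing", "has_outliers"})),
    KnowledgeEntry("no_preprocessing", frozenset()),
]


def analyze(raw):
    """Analyze raw data to produce analyzed data (here: summary statistics)."""
    present = [x for x in raw if x is not None]
    return {
        "values": present,
        "missing_count": len(raw) - len(present),
        "mean": mean(present),   # assumes at least one non-missing value
        "std": pstdev(present),
    }


def extract_features(analyzed):
    """Extract features that characterize the data from the analyzed data."""
    features = set()
    if analyzed["missing_count"] > 0:
        features.add("has_missing")
    mu, sigma = analyzed["mean"], analyzed["std"]
    # Assumed rule: a value more than 2 standard deviations from the mean
    # counts as an outlier.
    if sigma > 0 and any(abs(x - mu) > 2 * sigma for x in analyzed["values"]):
        features.add("has_outliers")
    return frozenset(features)


def select_algorithm(features, db=KNOWLEDGE_DB):
    """Select the entry whose feature set is the largest subset of `features`."""
    candidates = [e for e in db if e.feature_set <= features]
    return max(candidates, key=lambda e: len(e.feature_set)).algorithm


def identify_preprocessing(raw):
    """Run analysis, feature extraction, and selection in order."""
    return select_algorithm(extract_features(analyze(raw)))
```

Under these assumptions, `identify_preprocessing([1.0, 2.0, None, 3.0])` selects the imputation entry, because the only extracted feature is `has_missing`. A real knowledge database would presumably hold richer feature sets and a more careful matching rule.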
163 Citations
37 Claims
1. A method to identify a preprocessing algorithm for raw data, the method comprising:
providing an algorithm knowledge database including preprocessing algorithm data and feature set data associated with the preprocessing algorithm data;
analyzing raw data to produce analyzed data;
extracting from the analyzed data features that characterize the data;
selecting a preprocessing algorithm using the algorithm knowledge database and features extracted from the analyzed data.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A data mining system for identifying a preprocessing algorithm for raw data comprising:
at least one memory containing an algorithm knowledge database and raw data for processing;
random access memory having stored therein a computer program and which is coupled to the at least one memory such that the random access memory is adapted to receive:
at least one data analysis program to analyze raw data, a feature extraction program to extract features from raw data, and an algorithm selection program to identify a preprocessing algorithm.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 17, 18, 19, 20, 21
16. A data mining system for identifying a preprocessing algorithm for raw data, the data mining system comprising:
a means for storing an algorithm knowledge database, a means for storing raw data;
a means for data analysis on the raw data to produce analyzed data;
a means for feature extraction from the analyzed data to produce a feature set;
a means for algorithm selection using the feature set and the algorithm knowledge database.
22. A data mining application comprising:
a) an algorithm knowledge database including preprocessing algorithm data and feature set data associated with the preprocessing algorithm data;
b) a data analysis module that is adapted to receive control of the data mining application when the data mining application begins;
c) a feature extraction module that is adapted to receive control of the data mining application from the data analysis module and that is available to identify a set of features; and
d) an algorithm selection module that is adapted to receive control from the feature extraction module and that is adapted to identify a preprocessing algorithm based upon the set of features identified by the feature extraction module using the algorithm knowledge database.
Dependent claims: 23, 24, 25, 26, 27, 28, 29
30. A data mining product embedded in a computer readable medium, comprising:
at least one computer readable medium having an algorithm knowledge database embedded therein and having a computer readable program code embedded therein to identify a preprocessing algorithm for raw data, the computer readable program code in the computer program product comprising:
computer readable program code for data analysis to produce analyzed data from the raw data;
computer readable program code for feature extraction to identify a feature set from the analyzed data; and
computer readable program code for algorithm selection to identify a preprocessing algorithm using the analyzed data and the algorithm knowledge database.
Dependent claims: 31, 32, 33, 34, 35, 36, 37
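The control handoff recited in claim 22 (data analysis module, then feature extraction module, then algorithm selection module) can be sketched as a chain of objects that each pass control to the next. This is one possible reading, not the patented implementation: the class names, the `receive_control` method, the dictionary-based knowledge database, and the feature names are all hypothetical.

```python
class AlgorithmSelectionModule:
    """Receives control last and looks the feature set up in the database."""

    def __init__(self, knowledge_db):
        # Hypothetical layout: frozenset of feature names -> algorithm name.
        self.knowledge_db = knowledge_db

    def receive_control(self, features):
        return self.knowledge_db.get(frozenset(features), "no_match")


class FeatureExtractionModule:
    """Receives control from data analysis and identifies a set of features."""

    def __init__(self, next_module):
        self.next_module = next_module

    def receive_control(self, analyzed):
        # Keep only the characteristics that the analysis flagged as present.
        features = {name for name, flag in analyzed.items() if flag}
        return self.next_module.receive_control(features)


class DataAnalysisModule:
    """Receives control when the application begins."""

    def __init__(self, next_module):
        self.next_module = next_module

    def receive_control(self, raw):
        # Assumed analysis: two simple checks on the raw values.
        analyzed = {
            "has_missing": any(v is None for v in raw),
            "has_text": any(isinstance(v, str) for v in raw),
        }
        return self.next_module.receive_control(analyzed)


def run_application(raw, knowledge_db):
    """Wire the modules in the claimed order and start with data analysis."""
    selector = AlgorithmSelectionModule(knowledge_db)
    extractor = FeatureExtractionModule(selector)
    analyzer = DataAnalysisModule(extractor)
    return analyzer.receive_control(raw)
```

A usage example under the same assumptions: with `db = {frozenset({"has_missing"}): "impute", frozenset(): "none"}`, calling `run_application([1, None, 3], db)` passes control through all three modules and returns `"impute"`.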
Specification