METHOD FOR VISUALIZING FEATURE RANKING OF A SUBSET OF FEATURES FOR CLASSIFYING DATA USING A LEARNING MACHINE
First Claim
1. A method for enhancing knowledge obtained from a dataset by visualizing subsets of features selected from a plurality of features that describe the dataset, the method comprising:
downloading the dataset into a processor programmed for executing one or more learning machine classifiers;
training the one or more classifiers with each subset of features;
calculating a success rate of the one or more classifiers trained on each subset of features;
assigning a rank to each subset of features according to the success rate of the trained classifier in accurately classifying the dataset;
assigning a visually distinguishable characteristic to each rank; and
displaying a graph at a user interface display, the graph comprising a plurality of representations of subsets of features, wherein each representation of the subset of features comprises the visually distinguishable characteristic corresponding to the rank of the subset of features.
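The steps recited in the claim can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the claim does not name a classifier, so a simple nearest-centroid classifier is swapped in; the success rate is measured on the same rows used for training; candidate subsets are enumerated exhaustively up to a small size; and the palette, function names, and toy data are hypothetical. Printed text stands in for the graph shown at a user interface display.

```python
from itertools import combinations

def nearest_centroid_success_rate(rows, labels, subset):
    """Train a nearest-centroid classifier (an assumed stand-in) on the
    chosen feature columns and return the fraction of rows it gets right."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(subset))
        for i, col in enumerate(subset):
            acc[i] += row[col]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def predict(row):
        # Pick the class whose centroid is closest in squared distance.
        return min(centroids, key=lambda lab: sum(
            (row[col] - c) ** 2 for col, c in zip(subset, centroids[lab])))

    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)

def rank_subsets(rows, labels, max_size=2):
    """Score every feature subset up to max_size and rank by success rate.
    Subsets with equal success rates share a rank (rank 1 is best)."""
    n_features = len(rows[0])
    subsets = [s for k in range(1, max_size + 1)
               for s in combinations(range(n_features), k)]
    scored = [(s, nearest_centroid_success_rate(rows, labels, s)) for s in subsets]
    scored.sort(key=lambda item: item[1], reverse=True)
    ranked, rank, previous = [], 0, None
    for subset, rate in scored:
        if rate != previous:
            rank, previous = rank + 1, rate
        ranked.append((subset, rate, rank))
    return ranked

# Hypothetical palette: one visually distinguishable characteristic per rank.
PALETTE = ["dark green", "light green", "yellow", "red"]

def color_for_rank(rank):
    """Map a rank to a color; ranks past the palette reuse the last color."""
    return PALETTE[min(rank, len(PALETTE)) - 1]

# Toy dataset (made up): feature 0 separates the two classes cleanly.
rows = [[1.0, 5.0, 0.3], [1.2, 1.0, 0.9], [3.0, 5.2, 0.4], [3.1, 0.9, 0.8]]
labels = ["a", "a", "b", "b"]

ranked = rank_subsets(rows, labels)
for subset, rate, rank in ranked:
    print(f"features {subset}: success rate {rate:.2f}, rank {rank}, "
          f"shown in {color_for_rank(rank)}")
```

Any classifier and any validation scheme could replace the nearest-centroid scoring function without changing the ranking and color-assignment steps.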
Abstract
A method for enhancing knowledge discovery from a dataset uses visualization of a subset of features within the dataset that provides the best separation of the dataset into classes. One or more classifiers are trained using each subset of features, and the success rate of the classifiers in accurately classifying the dataset is calculated. The success rate is converted into a ranking that is represented as a visually distinguishable characteristic. One or more tree structures may be displayed with a node representing each feature, and the visually distinguishable characteristic is used to indicate the scores for each feature subset. Connectors between the nodes may be used to indicate unconstrained and constrained feature sets. Nodes within a constrained path may be substituted for a feature within the preferred, unconstrained path if that feature is impractical to measure.
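One way to picture the tree described in the abstract is as a small data structure. The sketch below is an assumption-laden illustration, not the structure defined in the specification: it assumes each node carries one feature, the rank of the subset formed by the path from the root to that node, and the kind of connector ("unconstrained" or "constrained") linking it to its parent. The substitution rule shown (take the best-ranked constrained sibling when the preferred feature is impractical) is likewise a guess at one reasonable reading, and all feature names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    feature: str
    rank: int                      # rank of the subset along the path to this node
    edge: str = "unconstrained"    # connector from the parent node
    children: List["Node"] = field(default_factory=list)

def path_avoiding(root: Node, impractical: Optional[str] = None) -> List[str]:
    """Follow unconstrained connectors from the root; where the preferred
    feature is impractical to measure, step onto the best-ranked
    constrained alternative instead and continue from there."""
    if root.feature == impractical:
        raise ValueError("this sketch does not handle an impractical root feature")
    path, node = [root.feature], root
    while node.children:
        main = [c for c in node.children if c.edge == "unconstrained"]
        alternatives = [c for c in node.children
                        if c.edge == "constrained" and c.feature != impractical]
        if main and main[0].feature != impractical:
            node = main[0]
        elif alternatives:
            node = min(alternatives, key=lambda c: c.rank)
        else:
            break
        path.append(node.feature)
    return path

# Hypothetical tree: f1 -> f2 -> f3 is the preferred path,
# with f4 -> f5 as a constrained alternative to f2.
tree = Node("f1", rank=3, children=[
    Node("f2", rank=2, children=[Node("f3", rank=1)]),
    Node("f4", rank=3, edge="constrained", children=[Node("f5", rank=2)]),
])

print(path_avoiding(tree))         # preferred, unconstrained path
print(path_avoiding(tree, "f2"))   # path taken when f2 cannot be measured
```

A display layer would then draw each node with the visually distinguishable characteristic assigned to its rank and style the two connector kinds differently.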
25 Claims
1. A method for enhancing knowledge obtained from a dataset by visualizing subsets of features selected from a plurality of features that describe the dataset, the method comprising:
downloading the dataset into a processor programmed for executing one or more learning machine classifiers;
training the one or more classifiers with each subset of features;
calculating a success rate of the one or more classifiers trained on each subset of features;
assigning a rank to each subset of features according to the success rate of the trained classifier in accurately classifying the dataset;
assigning a visually distinguishable characteristic to each rank; and
displaying a graph at a user interface display, the graph comprising a plurality of representations of subsets of features, wherein each representation of the subset of features comprises the visually distinguishable characteristic corresponding to the rank of the subset of features.
Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A computer program product embodied on a computer readable medium for enhancing knowledge discovered from a dataset by visualizing nested subsets of features selected from a plurality of features that describe a dataset, the computer program product comprising instructions for executing learning machine classifiers and further for causing a computer processor to:
receive the dataset;
train the one or more classifiers with each subset of features;
calculate a success rate of the one or more classifiers trained on each subset of features;
assign a rank to each subset of features according to the success rate of the trained classifier in accurately classifying the dataset;
assign a visually distinguishable characteristic to each rank; and
display one or more trees at a user interface, each tree comprising a plurality of nodes, each node representing a feature, wherein a representation of the subset of features comprises the visually distinguishable characteristic corresponding to the rank of the subset of features.
Dependent Claims: 18, 19, 20, 21, 22, 23, 24, 25
Specification