Data classification methods and apparatus for use with data fusion
Abstract
Methods and apparatus for classifying data for use in data fusion processes are disclosed. An example method of classifying data selectively groups nodes of a classification tree so that each node is assigned to only one of a plurality of groups and so that at least one of the groups includes at least two of the nodes. Data is classified based on the classification tree and the selective grouping of the nodes, and the results are displayed.
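The grouping constraint in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the node identifiers, the `group_nodes` helper, the record fields, and the hand-written `leaf_for` rule standing in for a classification tree are all hypothetical.

```python
from collections import defaultdict


def group_nodes(node_to_group):
    """Invert a node -> group mapping and check the grouping constraint."""
    groups = defaultdict(set)
    for node, group in node_to_group.items():
        # A dict maps each key to one value, so every node is in exactly one group.
        groups[group].add(node)
    if not any(len(members) >= 2 for members in groups.values()):
        raise ValueError("at least one group must contain two or more nodes")
    return dict(groups)


def leaf_for(record):
    """Hypothetical stand-in for a fitted tree with terminal nodes n1, n2, n3."""
    if record["age"] < 35:
        return "n1"
    return "n2" if record["hours_tv"] < 10 else "n3"


# Assumed grouping: terminal nodes n1 and n3 share group A, n2 is alone in group B.
node_to_group = {"n1": "A", "n2": "B", "n3": "A"}
groups = group_nodes(node_to_group)

records = [
    {"age": 28, "hours_tv": 4},
    {"age": 50, "hours_tv": 3},
    {"age": 61, "hours_tv": 20},
]

# Classify each record by its terminal node, then by that node's group.
labels = [node_to_group[leaf_for(r)] for r in records]
print(labels)
```

Here the records fall into groups A, B, and A respectively, so two different terminal nodes feed the same group.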
24 Claims
1. A method, comprising:
generating a classification tree having a plurality of terminal nodes, each of the terminal nodes having a frequency distribution related to a plurality of classes at the respective terminal node;
at each terminal node, combining the frequency distribution of the terminal node with an overall population frequency distribution to generate an index value for each of the classes at the terminal node such that each of the index values represents a difference between a corresponding portion of the frequency distribution and a corresponding portion of the overall population frequency distribution;
modifying the classification tree based on the index values; and
associating each data record of a first dataset with one of the terminal nodes based on the modified classification tree.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
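The index-value step of claim 1 can be sketched as follows. The claim does not state a formula, so this sketch assumes a ratio-style index, 100 times the class share at the node divided by the class share in the overall population, where 100 means no difference from the population. The merge rule (combine terminal nodes that over-index on the same class) is likewise only one assumed way of modifying the tree based on the index values. All names and counts are hypothetical.

```python
from collections import Counter


def shares(counts):
    """Convert class counts into class proportions."""
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def index_values(node_counts, population_counts):
    """Assumed index: 100 * (share at node) / (share in population), per class."""
    node_share = shares(node_counts)
    pop_share = shares(population_counts)
    return {
        cls: 100.0 * node_share.get(cls, 0.0) / pop_share[cls]
        for cls in pop_share
    }


def merge_by_top_index(indices):
    """Assumed merge rule: group terminal nodes by their highest-index class."""
    merged = {}
    for node, idx in indices.items():
        top_class = max(idx, key=idx.get)
        merged.setdefault(top_class, []).append(node)
    return merged


# Hypothetical class counts observed at four terminal nodes.
nodes = {
    "n1": Counter(light=30, heavy=10),
    "n2": Counter(light=10, heavy=30),
    "n3": Counter(light=25, heavy=15),
    "n4": Counter(light=15, heavy=25),
}

# Overall population distribution: the sum of all terminal-node counts.
population = sum(nodes.values(), Counter())

indices = {node: index_values(counts, population) for node, counts in nodes.items()}
merged = merge_by_top_index(indices)

for node, idx in indices.items():
    print(node, idx)
print(merged)
```

With these counts the population is split evenly between the two classes, so node n1 has an index of 150 for `light` and 50 for `heavy`, and the merge rule pairs n1 with n3 and n2 with n4.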
9. A tangible machine readable storage medium comprising instructions that, when executed, cause a machine to at least:
generate a classification tree having a plurality of terminal nodes, the terminal nodes having respective frequency distributions related to a plurality of classes at the respective terminal node;
for each of the terminal nodes, combine the frequency distribution of the respective terminal node with an overall population frequency distribution to generate an index value for respective ones of the classes at the terminal node such that the index values respectively represent a difference between a corresponding portion of the frequency distribution and a corresponding portion of the overall population frequency distribution;
modify the classification tree based on the index values; and
associate data records of a first dataset with respective ones of the terminal nodes based on the modified classification tree.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. An apparatus, comprising:
a tree generator to generate a classification tree having a plurality of terminal nodes, the terminal nodes having respective frequency distributions related to a plurality of classes at the respective terminal node;
a node analyzer to combine the frequency distribution related to the classes at a first one of the terminal nodes with an overall population frequency distribution to generate an index value for each of the classes at the first terminal node such that each of the index values represents a difference between a corresponding portion of the frequency distribution and a corresponding portion of the overall population frequency distribution;
a node grouper to modify the classification tree based on the index values; and
an assignor to associate data records of a first dataset with respective ones of the terminal nodes.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
Specification