Method for selecting node variables in a binary decision tree structure
Abstract
A method for selecting node variables in a binary decision tree structure is provided. The binary decision tree is formed by mapping node variables to known outcome variables. The method calculates a statistical measure of the significance of each input variable in an input data set and then selects an appropriate node variable on which to base the structure of the binary decision tree using an averaged statistical measure of the input variable and any co-linear input variables of the data set.
15 Claims
1. A method of selecting node variables for use in building a binary decision tree, comprising the steps of:
(a) providing an input data set including a plurality of input variables and an associated decision state;
(b) calculating a statistical measure of the significance of each of the input variables to the associated decision state;
(c) averaging the statistical measures for each of the input variables to form an averaged statistical measure for each input variable;
(d) selecting the input variable with the largest averaged statistical measure; and
(e) using the selected input variable as a node variable for splitting the input data set into two subsets that are used in building the binary decision tree.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 15
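Steps (b) through (e) of claim 1 can be illustrated with a short sketch. This is not the patented implementation. The claim does not name a particular statistical measure, so a Pearson chi-square statistic on binary (0/1) variables is assumed here. The `colinear` mapping, which lists the variables whose scores are averaged together, is a hypothetical input supplied by the caller, following the abstract's mention of co-linear input variables. All function names and the example data are invented for illustration.

```python
from statistics import mean


def chi_square_2x2(xs, ys):
    """Pearson chi-square statistic for two binary (0/1) sequences."""
    n = len(xs)
    # Observed counts for each (x, y) combination.
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for x, y in zip(xs, ys):
        counts[(x, y)] += 1
    stat = 0.0
    for (a, b), observed in counts.items():
        row_total = counts[(a, 0)] + counts[(a, 1)]
        col_total = counts[(0, b)] + counts[(1, b)]
        expected = row_total * col_total / n
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
    return stat


def select_node_variable(columns, decision, colinear):
    """Return the variable with the largest averaged statistic, plus all averages."""
    # Step (b): one significance score per input variable.
    raw = {name: chi_square_2x2(values, decision)
           for name, values in columns.items()}
    # Step (c): average each score with those of its (assumed) co-linear variables.
    averaged = {
        name: mean([raw[name]] + [raw[other] for other in colinear.get(name, [])])
        for name in columns
    }
    # Step (d): pick the variable with the largest averaged score.
    best = max(averaged, key=averaged.get)
    return best, averaged


def split_rows(columns, name):
    """Step (e): split row indices into two subsets on the chosen variable."""
    left = [i for i, v in enumerate(columns[name]) if v == 0]
    right = [i for i, v in enumerate(columns[name]) if v == 1]
    return left, right


# Made-up example data: three binary input variables and a binary decision state.
decision = [1, 1, 1, 0, 0, 0, 1, 0]
columns = {
    "a": [1, 1, 1, 0, 0, 0, 1, 0],
    "b": [1, 1, 1, 0, 0, 0, 0, 1],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
colinear = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

best, averaged = select_node_variable(columns, decision, colinear)
left, right = split_rows(columns, best)
print(best, averaged, left, right)
```

Each of the two index lists returned by `split_rows` would then be treated as the input data set for the next node, so applying the same selection recursively builds the tree.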
10. A method for mapping genomic markers to a phenotypical trait, comprising the steps of:
(a) receiving a structured data set having a plurality of genomic markers;
(b) determining a first correlating statistic for each genomic marker where the magnitude of the correlating statistic is proportional to the capability of the genomic marker to map the phenotype;
(c) calculating a second correlating statistic for each genomic marker using values of the genomic marker and adjacent genomic markers; and
(d) selecting the largest second correlating statistic from the genomic markers;
the genomic marker having the largest second correlating statistic being used as a decision node of a binary decision tree, thereby splitting the data set into two subsets.
Dependent claims: 11, 12, 13, 14
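Steps (c) and (d) of claim 10 can be sketched in the same spirit. The claim does not say how the second statistic combines a marker with its adjacent markers, so this sketch assumes a plain mean over a window of neighbouring positions. It also assumes the markers are ordered by genomic position, coded 0/1 per sample, and that the first statistics are already computed. The function names, the `half_width` parameter, and all numbers are hypothetical.

```python
def windowed_statistic(first_stats, half_width=1):
    """Mean of each marker's statistic with neighbours within half_width positions."""
    n = len(first_stats)
    second = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = first_stats[lo:hi]  # shorter at either end of the marker list
        second.append(sum(window) / len(window))
    return second


def select_marker(first_stats, half_width=1):
    """Index of the marker with the largest second statistic, plus all values."""
    second = windowed_statistic(first_stats, half_width)
    best = max(range(len(second)), key=second.__getitem__)
    return best, second


def split_by_marker(genotypes, marker):
    """Split sample indices into two subsets on the chosen marker's 0/1 value."""
    with_allele = [i for i, row in enumerate(genotypes) if row[marker] == 1]
    without_allele = [i for i, row in enumerate(genotypes) if row[marker] == 0]
    return with_allele, without_allele


# Made-up first statistics for seven markers in positional order.
first_stats = [0.5, 9.0, 0.5, 4.0, 6.0, 5.0, 1.0]
# Made-up genotypes: one row per sample, one 0/1 column per marker.
genotypes = [
    [0, 1, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 0, 1],
]

best_marker, second = select_marker(first_stats)
with_allele, without_allele = split_by_marker(genotypes, best_marker)
print(best_marker, second, with_allele, without_allele)
```

In this invented example the marker with the largest first statistic (index 1) is not selected, because its neighbours score low. The marker at index 4 has the largest windowed value and becomes the decision node.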
Specification