Method for selecting node variables in a binary decision tree structure
Abstract
A method for selecting node variables in a binary decision tree structure is provided. The binary decision tree is formed by mapping node variables to known outcome variables. The method calculates a statistical measure of the significance of each input variable in an input data set and then selects an appropriate node variable on which to base the structure of the binary decision tree using an averaged statistical measure of the input variable and any co-linear input variables of the data set.
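The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that the significance scores are already computed, that co-linear variables sit next to each other in the input order, and that a plain unweighted average over a fixed window is an acceptable stand-in for the "averaged statistical measure". The function name `select_node_variable`, the `window` parameter, and the sample scores are all hypothetical and do not come from the patent text.

```python
def select_node_variable(scores, window=1):
    """Pick the variable whose neighborhood-averaged score is highest.

    scores: per-variable significance scores, ordered so that co-linear
            variables are adjacent (an assumption of this sketch).
    window: number of neighbors on each side included in the average
            (hypothetical parameter).
    Returns (best_index, averaged_scores).
    """
    n = len(scores)
    averaged = []
    for i in range(n):
        # Truncate the window at the ends of the list.
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        block = scores[lo:hi]
        averaged.append(sum(block) / len(block))
    best = max(range(n), key=averaged.__getitem__)
    return best, averaged


# Hypothetical scores: an isolated spike at index 1, a consistent cluster at 4..6.
scores = [0.5, 9.0, 0.4, 0.6, 6.0, 6.5, 6.2, 0.3]
best, averaged = select_node_variable(scores)
raw_best = max(range(len(scores)), key=scores.__getitem__)
print("raw maximum at index", raw_best, "averaged maximum at index", best)
```

With these sample values the raw maximum is the isolated spike, while the averaged maximum falls inside the cluster of mutually supporting variables, which is the behavior the abstract attributes to averaging over co-linear inputs.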
59 Citations
20 Claims
1. A computer-implemented method for mapping one or more genomic markers to a phenotypical trait, the method comprising:
- receiving, using one or more processors, a structured data set having a plurality of genomic markers;
- determining, using one or more processors, a first correlating statistic for each genomic marker where the magnitude of the correlating statistic is proportional to the capability of the genomic marker to map a phenotype; and
- calculating, using one or more processors, a second correlating statistic for each genomic marker from a smoothing mathematical function of the determined first correlating statistic of the genomic marker and the first correlating statistic of adjacent genomic markers, wherein calculating includes:
  - providing a neighbor parameter indicating how many adjacent genomic markers to use in calculating the second correlating statistic, and
  - providing a weight parameter indicating a weight to apply to each of the adjacent genomic markers used in calculating the second correlating statistic, and
  - calculating the second correlating statistic for each genomic marker according to the following equation;
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 17, 18
13. A computer-implemented system for mapping genomic markers to a phenotypical trait, comprising:
- one or more processors;
- one or more computer-readable storage mediums containing software instructions executable on the one or more processors to cause the one or more processors to perform operations including:
  - receiving a structured data set having a plurality of genomic markers;
  - determining a first correlating statistic for each genomic marker where the magnitude of the correlating statistic is proportional to the capability of the genomic marker to map a phenotype; and
  - calculating a second correlating statistic for each genomic marker from a smoothing mathematical function of the determined first correlating statistic of the genomic marker and the first correlating statistic of adjacent genomic markers, wherein calculating includes:
    - providing a neighbor parameter indicating how many adjacent genomic markers to use in calculating the second correlating statistic, and
    - providing a weight parameter indicating a weight to apply to each of the adjacent genomic markers used in calculating the second correlating statistic, and
    - calculating the second correlating statistic for each genomic marker according to the following equation;
Dependent claims: 19, 20
14. A computer-readable storage medium encoded with instructions that when executed on one or more processors within a computer system, perform a method for mapping one or more genomic markers to a phenotypical trait, the method comprising:
- receiving a structured data set having a plurality of genomic markers;
- determining a first correlating statistic for each genomic marker where the magnitude of the correlating statistic is proportional to the capability of the genomic marker to map a phenotype;
- calculating a second correlating statistic for each genomic marker from a smoothing mathematical function of the determined first correlating statistic of the genomic marker and the first correlating statistic of adjacent genomic markers;
- wherein calculating includes:
  - providing a neighbor parameter indicating how many adjacent genomic markers to use in calculating the second correlating statistic,
  - providing a weight parameter indicating a weight to apply to each of the adjacent genomic markers used in calculating the second correlating statistic, and
  - calculating the second correlating statistic for each genomic marker according to the following equation;
Dependent claims: 15, 16
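The independent claims refer to an equation that is not reproduced in this text, so the exact smoothing function is unknown here. As an illustration only, the sketch below assumes one plausible form: a normalized weighted average in which the marker itself has weight 1 and every adjacent marker within the neighbor distance gets the same weight. The names `smooth_statistics`, `neighbors`, and `weight`, the edge handling, and the sample values are assumptions of this sketch, not the claimed formula.

```python
def smooth_statistics(stats, neighbors=2, weight=0.5):
    """Smooth per-marker statistics with their adjacent markers.

    stats:     first correlating statistic for each marker, in genomic order.
    neighbors: how many adjacent markers on each side to include
               (plays the role of the claimed neighbor parameter).
    weight:    weight applied to each adjacent marker
               (plays the role of the claimed weight parameter).

    Assumed form (NOT the equation from the patent):
        second[i] = (stats[i] + weight * sum(adjacent)) / (1 + weight * count)
    Markers near either end simply use the neighbors that exist.
    """
    if neighbors < 0 or weight < 0:
        raise ValueError("neighbors and weight must be non-negative")
    n = len(stats)
    smoothed = []
    for i in range(n):
        total = stats[i]   # the marker itself carries weight 1
        norm = 1.0
        for d in range(1, neighbors + 1):
            for j in (i - d, i + d):
                if 0 <= j < n:  # skip positions beyond the ends
                    total += weight * stats[j]
                    norm += weight
        smoothed.append(total / norm)
    return smoothed


# Hypothetical first statistics with a single sharp peak at index 2.
first = [1.0, 1.0, 10.0, 1.0, 1.0]
second = smooth_statistics(first, neighbors=1, weight=0.5)
print(second)
```

In this assumed form, setting `weight` to 0 or `neighbors` to 0 returns the first statistic unchanged, and larger values spread each peak across its neighborhood. The patent's own equation may weight or normalize differently.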
Specification