Quantifying gene relatedness via nonlinear prediction of gene
Abstract
Relatedness between genes is quantified by constructing nonlinear models predicting gene expression. Effectiveness of the model is evaluated to provide a measurement of the relatedness of genes associated with the model. Various types of models, including full-logic or neural networks, can be constructed. A graphical user interface presents results of the analysis to allow evaluation by a user. Each gene's contribution to the measurement of relatedness can be shown on a graph, and graphical representations of models used to predict gene expression can be displayed.
81 Claims
1. A computer-implemented method for quantifying relative gene relatedness for a plurality of candidate genes for which a plurality of gene expression level observations have been collected, the method comprising:
for a predicted candidate gene selected from the plurality of candidate genes, selecting a plurality of subset gene combinations of the plurality of candidate genes, and performing (a)–(c) for each subset gene combination to generate a plurality of quantifications of relative relatedness for the predicted candidate gene and the subset combinations;
(a) based on data comprising the plurality of gene expression level observations for the plurality of candidate genes, constructing a multivariate nonlinear model predicting gene expression for the predicted candidate gene, wherein the multivariate nonlinear model accepts gene expression levels for the subset gene combination of the plurality of candidate genes as inputs and produces a gene expression level for the predicted candidate gene as an output;
(b) predicting gene expression of the predicted candidate gene with the multivariate nonlinear model; and
(c) measuring effectiveness of the multivariate nonlinear model in accurately predicting the gene expression level of the predicted candidate gene as compared to the gene expression level observations, the effectiveness being a quantification of relative gene relatedness of the predicted gene and the subset gene combination of the plurality of candidate genes with respect to other subset gene combinations of the plurality of candidate genes as determined by comparing effectiveness of the multivariate nonlinear model with effectiveness of other multivariate nonlinear models constructed for other subset gene combinations, the other multivariate nonlinear models being of a same type as the multivariate nonlinear model; and
presenting a ranked plurality of the plurality of quantifications of relative gene relatedness for a plurality of the subset gene combinations of the plurality of candidate genes.
Dependent claims: 2-29.

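The loop recited in claim 1 (select subset gene combinations, build one model of the same type per subset, measure each model's effectiveness, rank the results) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not taken from the patent: the function names `fit_lookup_model`, `effectiveness` and `rank_subsets`, the toy ternary data, and the choice of a lookup-table predictor as a stand-in for the nonlinear model. Effectiveness is measured on the same observations used to build the model, which tends to be optimistic; a held-out estimate could be substituted.

```python
from collections import defaultdict
from itertools import combinations, product
from statistics import mean

def fit_lookup_model(rows, inputs, target):
    """Lookup-table predictor: map each observed input pattern to the mean target level."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[tuple(row[g] for g in inputs)].append(row[target])
    fallback = mean(row[target] for row in rows)  # used for unseen patterns
    table = {pattern: mean(levels) for pattern, levels in buckets.items()}
    return lambda row: table.get(tuple(row[g] for g in inputs), fallback)

def effectiveness(rows, predict, target):
    """1 - (squared error of the model / squared error of the constant mean predictor)."""
    baseline = mean(row[target] for row in rows)
    err_model = sum((row[target] - predict(row)) ** 2 for row in rows)
    err_base = sum((row[target] - baseline) ** 2 for row in rows)
    return 1.0 - err_model / err_base if err_base else 0.0

def rank_subsets(rows, genes, target, size=2):
    """Score every subset of the given size and return (score, subset) pairs, best first."""
    candidates = [g for g in genes if g != target]
    scored = []
    for subset in combinations(candidates, size):
        model = fit_lookup_model(rows, subset, target)   # step (a)
        score = effectiveness(rows, model, target)       # steps (b) and (c)
        scored.append((score, subset))
    return sorted(scored, reverse=True)

# Hypothetical ternary observations: g3 is the product of g1 and g2,
# while g4 = |g1| carries no information about the sign of g3.
genes = ["g1", "g2", "g3", "g4"]
rows = [{"g1": a, "g2": b, "g3": a * b, "g4": abs(a)}
        for a, b in product([-1, 0, 1], repeat=2)]

ranked = rank_subsets(rows, genes, target="g3")
for score, subset in ranked:
    print(subset, round(score, 3))
```

On this toy data the pair (g1, g2) ranks first because g3 is fully determined by it, while the pairs that include g4 do no better than the constant predictor.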
30. A computer-implemented method for identifying genes related to a target gene by analyzing gene expression level observations for the genes, the method comprising:
based on the gene expression level observations, constructing multivariate nonlinear predictors that predict an expression level for the target gene, wherein the predictors accept gene expression levels for other genes as predictive elements;
estimating a coefficient of determination for sets of predictive elements and the target gene by comparing results of the multivariate nonlinear predictors with gene expression level observations for the target gene, wherein the predictive elements comprise expression level observations for groups of genes other than the target gene; and
ranking the groups of genes other than the target gene by coefficient of determination to present the groups of genes other than the target gene in order of likelihood of relatedness to the target gene.
Dependent claim: 31.

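Claim 30 ranks gene groups by an estimated coefficient of determination but does not state a formula in the claim text. One common form, given here only as an assumption for illustration (the symbols are not taken from the claim), compares the error of a predictor that uses a gene group against the error of the best predictor that ignores it:

```latex
\theta = \frac{\varepsilon_0 - \varepsilon}{\varepsilon_0}
```

Under this reading, \varepsilon_0 is the error of the best constant estimate of the target gene's expression level and \varepsilon is the error of the multivariate nonlinear predictor built on the gene group. A value near 1 suggests the group predicts the target well, a value near 0 suggests it adds little over the constant estimate, and an estimate computed from held-out data can be negative.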
32. A computer-implemented method for analyzing gene expression level observations for a set of genes comprising a target gene, the method comprising:
estimating a coefficient of determination for an optimal multivariate nonlinear model predicting gene expression of the target gene by constructing a multivariate nonlinear model from the gene expression level observations of gene expression for the target gene, wherein the optimal multivariate nonlinear model and the constructed multivariate nonlinear model predict gene expression of the target gene based on variables representing gene expression levels of genes other than the target gene.
Dependent claim: 33.

34. A method for identifying related genes out of a set of genes for which gene expression level observations have been collected, the method comprising:
for at least one predicted gene out of the set of genes, training an artificial intelligence function to predict gene expression for the predicted gene, wherein the artificial intelligence function takes one or more predictive elements as inputs and produces a gene expression level for the predicted gene as an output, wherein at least one of the predictive elements is a gene expression level for a gene other than the predicted gene;
testing effectiveness of the artificial intelligence function in predicting expression of the predicted gene to rate relative relatedness of the predicted gene and at least one gene associated with the predictive elements; and
presenting the relative relatedness in a computer user interface showing a ranking of relative relatedness for a plurality of gene groups.
Dependent claims: 35, 36.

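The train-then-test pattern in claim 34 can be sketched with a deliberately small stand-in for the "artificial intelligence function": a single tanh neuron trained by stochastic gradient descent. This is an illustrative assumption, not the patent's network. The names `train_neuron` and `held_out_score`, the synthetic observations, the 80/40 split, and the learning rate are all hypothetical. The score is computed on observations withheld from training, so it rates how well each gene group predicts the target rather than how well the model memorized the data.

```python
import math
import random

def train_neuron(samples, epochs=300, lr=0.1):
    """Fit y ~ tanh(w.x + b) by stochastic gradient descent on squared error."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            out = math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = (out - y) * (1.0 - out * out)  # gradient w.r.t. the pre-activation
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return lambda x: math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Synthetic observations (hypothetical): the target depends on g1 and g2 only;
# g3 is unrelated by construction.
rng = random.Random(0)
observations = []
for _ in range(120):
    g1, g2, g3 = (rng.uniform(-1, 1) for _ in range(3))
    observations.append(((g1, g2, g3), math.tanh(2 * g1 - g2)))
train, test = observations[:80], observations[80:]

def held_out_score(columns):
    """1 - (test error of the trained neuron / test error of the training-mean predictor)."""
    pick = lambda x: tuple(x[i] for i in columns)
    model = train_neuron([(pick(x), y) for x, y in train])
    base = sum(y for _, y in train) / len(train)
    err_model = sum((y - model(pick(x))) ** 2 for x, y in test)
    err_base = sum((y - base) ** 2 for _, y in test)
    return 1.0 - err_model / err_base

score_related = held_out_score((0, 1))    # inputs: g1 and g2
score_unrelated = held_out_score((2,))    # input: g3 only
print(score_related, score_unrelated)
```

Sorting gene groups by such a score would give the kind of ranking the claim presents in a user interface; the interface itself is outside the scope of this sketch.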
37. For a plurality of observed genes for which expression levels have been observed, a method of presenting an analysis of the expression levels to assist in identifying related genes, the method comprising:
denoting a particular observed gene as a predicted gene;
for the predicted gene, constructing a plurality of nonlinear multivariate models predicting expression of the observed gene, wherein the nonlinear multivariate models comprise a variety of predictive elements chosen from permutations of expression levels of observed genes other than the predicted gene;
measuring effectiveness of the nonlinear multivariate models in predicting expression of the predicted gene to quantify relatedness between the predicted gene and the set of genes associated with the predictive elements of the models; and
presenting a quantification of relatedness between the predicted gene and a set of genes associated with the predictive elements of at least one of the models.
Dependent claims: 38-47.

48. For a plurality of observed genes for which expression levels have been observed, a method of performing an analysis of the expression levels to assist in identifying related genes, the method comprising:
(a) for a plurality of the observed genes, denoting a particular observed gene as a predicted gene and performing at least (b) and (c);
(b) for the predicted gene, constructing a plurality of nonlinear multivariate models predicting expression of the predicted gene, wherein the nonlinear multivariate models have a variety of predictive elements chosen from permutations of expression levels of observed genes other than the predicted gene;
(c) measuring effectiveness of the nonlinear multivariate models in predicting expression of the predicted gene to provide a quantification of relative relatedness between the predicted gene and genes associated with the predictive elements of the models.
Dependent claims: 49, 50.

51. A system for quantifying gene relatedness for a plurality of candidate genes for which a plurality of gene expression level observations have been collected, the system comprising:
means for constructing a nonbinary, nonlinear model predicting gene expression based on data comprising the plurality of gene expression level observations for the plurality of candidate genes;
means for predicting gene expression with the nonbinary, nonlinear model; and
means for measuring effectiveness of the nonbinary, nonlinear model in predicting gene expression, the effectiveness indicating gene relatedness for the plurality of candidate genes.
Dependent claim: 52.

53. A computer-implemented method of ranking the relatedness of a plurality of genes based on gene expression level observations associated with the plurality of genes, the method comprising:
based on the gene expression level observations, constructing a plurality of multivariate nonlinear predictors to predict the expression of a plurality of target genes out of the genes, wherein the multivariate nonlinear predictors comprise predictive elements comprising an observed gene, thereby associating the multivariate nonlinear predictor with the target gene and at least one observed gene;
testing effectiveness of the plurality of multivariate nonlinear predictors in predicting gene expression to quantify relative gene relatedness between the genes associated with the predictors by estimating a coefficient of determination; and
displaying a ranked list of relative gene relatedness among the genes as determined by testing the plurality of multivariate nonlinear predictors.

54. A computer-implemented method for analyzing a plurality of candidate genes for which a plurality of gene expression level observations have been collected to determine which out of the genes are more related, the method comprising:
for a plurality of selected permutations of the plurality of candidate genes, performing (a)–(c) for each permutation;
(a) based on data comprising the plurality of gene expression level observations for the plurality of candidate genes, constructing a nonlinear model predicting gene expression for the permutation of the plurality of candidate genes;
(b) predicting gene expression with the nonlinear model; and
(c) measuring effectiveness of the nonlinear model in predicting gene expression, the effectiveness being a quantification indicating relative gene relatedness for the plurality of candidate genes of the permutation; and
presenting at least one of the permutations of genes as related and an indication of the quantification indicating relative gene relatedness for the at least one of the permutations.
Dependent claim: 55.

56. A computer-implemented method comprising:
constructing a first multivariate nonlinear model for predicting a gene expression level for a candidate gene selected out of an observed set of genes, wherein the gene expression level for the candidate gene is predicted by the first multivariate nonlinear model via observed gene expression levels for a first permutation subset of genes selected out of the observed set of genes, the first nonlinear model taking observed gene expression levels for the first permutation subset of genes as inputs;
measuring effectiveness of the first multivariate nonlinear model in predicting the observed gene expression level of the candidate gene;
constructing a second multivariate nonlinear model for predicting the gene expression level for the candidate gene selected out of the observed set of genes, wherein the gene expression level for the candidate gene is predicted by the second multivariate nonlinear model via observed gene expression levels for a second permutation subset of genes selected out of the observed set of genes, the second nonlinear model taking observed gene expression levels for the second permutation subset of genes as inputs, wherein the first multivariate nonlinear model and the second multivariate nonlinear model are of a same model type, wherein the first multivariate nonlinear model and the second multivariate nonlinear model predict the gene expression level for the candidate gene, and wherein the first multivariate nonlinear model and the second multivariate nonlinear model are constructed via gene expression data from a same set of experiments;
measuring effectiveness of the second multivariate nonlinear model in predicting the observed gene expression level of the candidate gene;
ordering the first and second multivariate nonlinear models by ranking the effectiveness of the first multivariate nonlinear model with respect to the effectiveness of the second multivariate nonlinear model; and
presenting results of the ordering, wherein the results indicate that observed gene expression levels for a permutation subset of genes associated with a higher-ranking multivariate nonlinear model have a higher effectiveness in predicting the observed gene expression level of the candidate gene than those of observed gene expression levels for a permutation subset of genes associated with a lower-ranking multivariate nonlinear model.
Dependent claims: 57-81.

Specification