Methods for analyzing high dimensional data for classifying, diagnosing, prognosticating, and/or predicting diseases and other biological states
Abstract
A method of diagnosing, predicting, or prognosticating about a disease that includes obtaining experimental data, wherein the experimental data is high dimensional data, filtering the data, reducing the dimensionality of the data through use of one or more methods, training a supervised pattern recognition method, ranking individual data points from the data, wherein the ranking is dependent on the outcome of the supervised pattern recognition method, choosing multiple data points from the data, wherein the choice is based on the relative ranking of the individual data points, and using the multiple data points to determine if an unknown set of experimental data indicates a diseased condition, a predilection for a diseased condition, or a prognosis about a diseased condition.
31 Claims
1. A method of diagnosing, predicting, or prognosticating about a disease comprising the steps of:
(a) obtaining experimental data, wherein said experimental data comprises high dimensional data;
(b) filtering said data;
(c) reducing the dimensionality of said data through use of one or more methods;
(d) training a supervised pattern recognition method;
(e) ranking individual data points from said high dimensional data, wherein said ranking is dependent on an outcome of said supervised pattern recognition method;
(f) choosing multiple data points from said high dimensional data, wherein said choice is based on said relative ranking of said individual data points; and
(g) using said multiple data points to determine if an unknown set of experimental data indicates a diseased condition, a predilection for a diseased condition, or a prognosis about a diseased condition.
Dependent claims: 2, 3, 4, 6, 7, 8, 9, 10, 11, 12
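Step (g) of claim 1 can be illustrated with a short sketch. The claim does not name a specific classifier for the unknown sample, so the nearest-centroid rule used here is a stand-in chosen for brevity, not the patented method. The function name `classify_unknown`, the labels, and all numbers are hypothetical.

```python
def classify_unknown(unknown, reference, selected):
    """Assign `unknown` to the class whose centroid is nearest,
    comparing only the selected data points (squared Euclidean distance).

    unknown:   list of measurements for the unknown sample.
    reference: dict mapping a class label to a list of labeled samples.
    selected:  indices of the data points chosen in step (f).
    """
    best_label, best_dist = None, float("inf")
    for label, rows in reference.items():
        # Class centroid computed over the selected data points only.
        center = [sum(r[j] for r in rows) / len(rows) for j in selected]
        dist = sum((unknown[j] - c) ** 2 for j, c in zip(selected, center))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Made-up reference data: three data points per sample, two classes.
reference = {
    "diseased": [[5.0, 1.0, 9.0], [6.0, 7.0, 8.0]],
    "healthy": [[1.0, 2.0, 2.0], [0.0, 8.0, 1.0]],
}
selected = [0, 2]  # assumed outcome of the ranking and choosing steps
print(classify_unknown([5.5, 0.0, 7.0], reference, selected))  # diseased
```

Data point 1 is ignored because it was not selected, which is the purpose of steps (e) and (f): only the highly ranked data points contribute to the decision.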
5. The method of claim 4, wherein said step of filtering said gene expression data is based on the intensity of the spots on said microarray.
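Intensity-based filtering of the kind recited in claim 5 might look like the sketch below. The claim does not give a threshold or a rule for combining samples, so the cutoff value, the name `filter_by_intensity`, and the requirement that a gene pass in every sample are assumptions made for illustration.

```python
def filter_by_intensity(intensities, min_intensity):
    """Return indices of genes whose spot intensity is at least
    `min_intensity` in every sample.

    intensities: list of samples, each a list of per-gene spot intensities.
    """
    n_genes = len(intensities[0])
    return [
        g for g in range(n_genes)
        if all(sample[g] >= min_intensity for sample in intensities)
    ]


# Three samples, four genes (made-up numbers for illustration only).
samples = [
    [120.0, 15.0, 300.0, 45.0],
    [110.0, 30.0, 280.0, 10.0],
    [95.0, 25.0, 310.0, 60.0],
]
kept = filter_by_intensity(samples, min_intensity=20.0)
print(kept)  # [0, 2]
```

Genes 1 and 3 are dropped because each has at least one spot below the assumed cutoff of 20.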
13. A method of diagnosing, predicting, or prognosticating about a disease comprising the steps of:
(a) obtaining experimental data, wherein said experimental data comprises gene expression data;
(b) filtering said gene expression data;
(c) reducing the dimensionality of said data through use of one or more methods;
(d) training an artificial neural network;
(e) ranking individual genes from said gene expression data, wherein said ranking is dependent on the outcome of said artificial neural network;
(f) choosing multiple genes from said gene expression data, wherein said choice is based on the relative ranking of said individual genes; and
(g) using said multiple genes to determine if an unknown set of experimental gene expression data indicates a diseased condition, a predilection for a diseased condition, or a prognosis about a diseased condition.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
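Steps (d) and (e) of claim 13 can be sketched with a very small network. The claim does not fix an architecture or a ranking formula, so this example assumes a single logistic unit with no hidden layer and ranks genes by the mean absolute derivative of the output with respect to each input. The names `train_perceptron` and `rank_genes`, the learning rate, the epoch count, and the data are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_perceptron(samples, labels, epochs=500, lr=0.5):
    """Fit a single logistic unit by batch gradient descent."""
    d = len(samples[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def rank_genes(samples, weights, bias):
    """Order gene indices by mean |d(output)/d(input)|, largest first."""
    d = len(weights)
    sensitivity = [0.0] * d
    for x in samples:
        out = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for j in range(d):
            # Derivative of a logistic unit: out * (1 - out) * weight.
            sensitivity[j] += abs(out * (1.0 - out) * weights[j]) / len(samples)
    return sorted(range(d), key=lambda j: sensitivity[j], reverse=True)


# Made-up expression values: gene 0 separates the two classes.
samples = [[-1.0, 0.3, -0.2], [-1.0, -0.3, 0.1],
           [1.0, 0.3, 0.2], [1.0, -0.3, -0.1]]
labels = [0, 0, 1, 1]
weights, bias = train_perceptron(samples, labels)
ranking = rank_genes(samples, weights, bias)
print(ranking[0])  # 0
```

For a single logistic unit this sensitivity is proportional to the absolute weight of each gene. A network with hidden layers would need the derivative propagated through every layer.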
21. A computer-based method of determining a set of multiple data points for use in diagnosing, predicting, or prognosticating about a disease, the computer-based method comprising:
(a) receiving experimental data representing high dimensional data, wherein said experimental data comprises gene expression data;
(b) filtering the experimental data;
(c) reducing the dimensionality of the experimental data using one or more methods;
(d) dividing the experimental data into a training data set and a validation data set;
(e) training an artificial neural network using the training data set to generate a trained artificial neural network;
(f) validating the performance of the trained neural network using the validation data set;
(g) generating a ranking data value corresponding to a relative ranking for individual data points using the data from the trained artificial neural network; and
(h) choosing the set of multiple data points from the data using the ranking data values.
Dependent claims: 22, 23, 24, 25, 26, 27, 28
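Steps (d), (f), and (h) of claim 21 are mostly bookkeeping around the trained network and can be sketched on their own. The split fraction, the fixed random seed, and the helper names `split_data`, `validation_accuracy`, and `choose_top` are assumptions. The `predict` callable stands in for whatever trained network is being validated.

```python
import random


def split_data(samples, labels, validation_fraction=1 / 3, seed=0):
    """Shuffle sample indices and split into training and validation sets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_val = max(1, int(round(len(indices) * validation_fraction)))

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[n_val:]), pick(indices[:n_val])


def validation_accuracy(predict, val_samples, val_labels):
    """Fraction of validation samples that `predict` labels correctly."""
    hits = sum(predict(x) == y for x, y in zip(val_samples, val_labels))
    return hits / len(val_labels)


def choose_top(ranking_values, k):
    """Indices of the k data points with the largest ranking values."""
    order = sorted(range(len(ranking_values)),
                   key=lambda j: ranking_values[j], reverse=True)
    return order[:k]


# Made-up one-dimensional data and a threshold rule standing in for a network.
samples = [[-1.0], [-0.5], [0.5], [1.0], [2.0], [-2.0]]
labels = [0, 0, 1, 1, 1, 0]
(train_x, train_y), (val_x, val_y) = split_data(samples, labels)
accuracy = validation_accuracy(lambda x: int(x[0] > 0), val_x, val_y)
print(len(train_x), len(val_x), accuracy)  # 4 2 1.0
print(choose_top([0.1, 0.9, 0.4, 0.7], k=2))  # [1, 3]
```

Keeping the validation samples out of training is what lets step (f) measure how the network performs on data it has not seen.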
29. A computer data product readable by a computing system and encoding instructions for implementing a computer method for determining a set of multiple genes for use in diagnosing a disease, the computer-based method comprising:
receiving data representing experimental gene expression data;
filtering the gene expression data;
reducing the dimensionality of the gene expression data using principal component analysis;
dividing the gene expression data into a training data set and a validation data set;
training an artificial neural network using the training data set to generate a trained artificial neural network;
validating the performance of the trained neural network using the validation data set;
generating a ranking data value corresponding to a relative ranking for individual genes using the gene expression data in the trained artificial neural network; and
choosing the set of multiple genes from the gene expression data using the ranking data values.
Dependent claims: 30, 31
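The dimensionality reduction step of claim 29 can be illustrated with a minimal sketch. It estimates only the leading principal component, using power iteration on the sample covariance matrix, whereas a practical analysis would keep several components and would normally rely on a numerical library. The function name `first_principal_component`, the iteration count, and the data are assumptions.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the leading principal component of `rows` by power iteration.

    rows: list of samples, each a list of feature values.
    Returns (component, scores): a unit-length direction and the projection
    of each mean-centered sample onto it.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Start vector; the iteration fails if this happens to be orthogonal
    # to the leading component, which a robust implementation would guard.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along the current direction
        v = [x / norm for x in w]

    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Made-up data lying on the line y = 2x, so one component captures everything.
component, scores = first_principal_component([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
print([round(x, 3) for x in component])  # [0.447, 0.894]
```

Each sample is reduced from two values to one score along the component, which is the sense in which the step reduces dimensionality before the network is trained.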
Specification