Analysis of transcriptomic data using similarity based modeling
First Claim
1. A method for discovery of biomarkers indicative of a medical condition in a species of organism, from a candidate set of molecular constituents extracted from such organisms, comprising the steps of:
determining a set of molecular constituents to test as biomarkers;
obtaining first measurements of the quantity of each of the molecular constituents present in each of a first plurality of samples extracted from such organisms free of the condition;
training and storing in a computer memory, a kernel-based estimation model using said first measurements as multivariate training observations, each multivariate observation comprising the measurements from a single condition-free sample;
obtaining second measurements of the quantity of each of the molecular constituents present in each of a second plurality of samples extracted from such organisms free of the condition;
obtaining third measurements of the quantity of each of the molecular constituents present in each of a third plurality of samples extracted from such organisms having the medical condition;
estimating with said model, using a microprocessor, an estimated quantity for each of at least some of the molecular constituents in a sample, responsive to inputting to said model a multivariate input observation comprising the measurements from that sample, for all of said second plurality of samples and all of said third plurality of samples;
computing residuals corresponding to at least one molecular constituent, each residual being the difference between an estimated quantity of that molecular constituent for a sample and the measured quantity of that molecular constituent for that sample; and
comparing, for at least one molecular constituent, residuals from said second plurality of samples with residuals from said third plurality of samples, to identify whether that molecular constituent is a biomarker of said medical condition.
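The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a Gaussian kernel with a Nadaraya-Watson style weighted average as the "kernel-based estimation model", takes each residual as measured minus estimated quantity (the claim does not fix a sign), and compares the two residual groups with a Welch t-statistic. All function names, the bandwidth, and the sample values are hypothetical and synthetic.

```python
import math
from statistics import mean, stdev


def similarity(x, y, bandwidth):
    """Gaussian kernel: 1.0 for identical observations, near 0 for distant ones."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def estimate(training, observation, bandwidth=1.0):
    """Kernel-weighted average of the condition-free training observations."""
    weights = [similarity(t, observation, bandwidth) for t in training]
    total = sum(weights)
    if total == 0.0:  # input far from every training point: use a plain mean
        weights, total = [1.0] * len(training), float(len(training))
    return [
        sum(w * t[j] for w, t in zip(weights, training)) / total
        for j in range(len(observation))
    ]


def residuals(training, samples, bandwidth=1.0):
    """Measured minus estimated quantity, per sample and per constituent."""
    return [
        [m - e for m, e in zip(s, estimate(training, s, bandwidth))]
        for s in samples
    ]


def welch_t(a, b):
    """Welch t-statistic comparing the means of two residual groups."""
    diff = mean(a) - mean(b)
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    if se == 0.0:
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def score_candidates(training, healthy, diseased, bandwidth=1.0):
    """One score per constituent: diseased residuals versus healthy residuals."""
    r_healthy = residuals(training, healthy, bandwidth)
    r_diseased = residuals(training, diseased, bandwidth)
    return [
        welch_t([r[j] for r in r_diseased], [r[j] for r in r_healthy])
        for j in range(len(training[0]))
    ]


# Synthetic measurements of three constituents (illustration only).
training = [  # first plurality: condition-free, used to train the model
    [1.0, 2.0, 3.0], [1.2, 2.1, 3.1], [0.9, 1.9, 2.9],
    [1.1, 2.0, 3.0], [1.0, 2.2, 3.2], [0.8, 1.8, 2.8],
]
healthy = [[1.0, 2.1, 3.0], [1.1, 1.9, 3.1], [0.9, 2.0, 2.9]]   # second plurality
diseased = [[1.0, 2.0, 4.5], [1.1, 2.1, 4.8], [0.9, 1.9, 4.4]]  # third plurality

scores = score_candidates(training, healthy, diseased)
best = max(range(len(scores)), key=lambda j: abs(scores[j]))
print("candidate biomarker index:", best)
```

In this toy data only the third constituent is shifted in the diseased samples, so its residuals separate the two groups while the other two constituents show residuals of similar size in both. A real analysis would need many more samples and a correction for multiple comparisons.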
Abstract
An analytic apparatus and method is provided for diagnosis, prognosis and biomarker discovery using transcriptome data such as mRNA expression levels from microarrays, proteomic data, and metabolomic data. The invention provides for model-based analysis, especially using kernel-based models, and more particularly similarity-based models. Model-derived residuals advantageously provide a unique new tool for insights into disease mechanisms. Localization of models provides for improved model efficacy. The invention is capable of extracting useful information heretofore unavailable by other methods, relating to dynamics in cellular gene regulation, regulatory networks, biological pathways and metabolism.
21 Claims
1. (Recited in full under First Claim above.)
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for diagnosing a medical condition in a suspect biological specimen, comprising the steps of:
storing in a computer memory a kernel-based estimation model comprising multivariate observations of first measurements of the quantity of each of a set of molecular constituents present in each of a first plurality of samples extracted from multiple biological specimens free of the condition, each multivariate observation comprising the measurements from a single condition-free sample;
obtaining second measurements of the quantity of each of the molecular constituents present in said suspect biological specimen;
estimating with said model, using a microprocessor, an estimated quantity expected for each of at least some of the molecular constituents in said suspect biological specimen, responsive to inputting to said model a multivariate input observation comprising said second measurements from said suspect biological specimen;
computing a residual corresponding to each of at least one molecular constituent, each residual being the difference between the estimated quantity of that molecular constituent and the measured quantity of that molecular constituent; and
testing said residuals to determine if said suspect biological specimen has said medical condition.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
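The diagnostic steps of claim 11 can be sketched in the same spirit. The claim does not say how the residuals are tested, so this example assumes one simple option: each residual of the suspect specimen is scaled by the spread of leave-one-out residuals among the condition-free reference observations, and a constituent is flagged when the scaled value exceeds a threshold. The Gaussian kernel, the function names, the threshold of 3.0, and all sample values are hypothetical.

```python
import math
from statistics import stdev


def kernel_estimate(reference, observation, bandwidth=1.0):
    """Gaussian-kernel weighted average of the stored reference observations."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(ref, observation))
                 / (2.0 * bandwidth ** 2))
        for ref in reference
    ]
    total = sum(weights)
    if total == 0.0:  # input far from every reference point: use a plain mean
        weights, total = [1.0] * len(reference), float(len(reference))
    return [
        sum(w * ref[j] for w, ref in zip(weights, reference)) / total
        for j in range(len(observation))
    ]


def residual(reference, observation, bandwidth=1.0):
    """Measured minus estimated quantity for each constituent."""
    est = kernel_estimate(reference, observation, bandwidth)
    return [m - e for m, e in zip(observation, est)]


def leave_one_out_spread(reference, bandwidth=1.0):
    """Per-constituent standard deviation of residuals obtained when each
    condition-free observation is estimated from the remaining ones."""
    loo = [
        residual(reference[:i] + reference[i + 1:], obs, bandwidth)
        for i, obs in enumerate(reference)
    ]
    return [stdev(column) for column in zip(*loo)]


def diagnose(reference, suspect, bandwidth=1.0, z_threshold=3.0):
    """Return (has_condition, flagged_constituent_indices) for one specimen."""
    spread = leave_one_out_spread(reference, bandwidth)
    res = residual(reference, suspect, bandwidth)
    flagged = []
    for j, (r, s) in enumerate(zip(res, spread)):
        z = r / s if s > 0.0 else (0.0 if r == 0.0 else math.inf)
        if abs(z) > z_threshold:
            flagged.append(j)
    return bool(flagged), flagged


# Synthetic condition-free reference observations (illustration only).
reference = [
    [1.0, 2.0, 3.0], [1.2, 2.1, 3.1], [0.9, 1.9, 2.9],
    [1.1, 2.0, 3.0], [1.0, 2.2, 3.2], [0.8, 1.8, 2.8],
]

normal_result = diagnose(reference, [1.0, 2.0, 3.1])
abnormal_result = diagnose(reference, [1.0, 2.0, 4.5])
print(normal_result, abnormal_result)
```

The threshold test on scaled residuals is only one possible reading of "testing said residuals"; any other statistical test could be substituted.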
Specification