Ratio-based decisions and the quantitative analysis of cDNA micro-array images
Abstract
Gene expression can be quantitatively analyzed by hybridizing fluor-tagged mRNA to targets on a cDNA micro-array. Comparison of gene expression levels arising from co-hybridized samples is achieved by taking ratios of average expression levels for individual genes. In an image-processing phase, a method of image segmentation identifies cDNA target sites in a cDNA micro-array image. The resulting cDNA target sites are analyzed based on a hypothesis test and confidence interval to quantify the significance of observed differences in expression ratios. In particular, the probability density of the ratio and the maximum-likelihood estimator for the distribution are derived, and an iterative procedure for signal calibration is developed.
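For orientation, the density and estimator mentioned in the abstract can be sketched as follows. This assumes a null model in which the two channel intensities for a gene are independent, normally distributed, share the same mean, and have a common coefficient of variation c. The symbols t, c, n and the exact form of the density are assumptions made for illustration, not text taken from the abstract or the claims.

```latex
% Assumed density of the expression ratio T under the null model
f_T(t; c) = \frac{(1+t)\sqrt{1+t^{2}}}{c\,(1+t^{2})^{2}\sqrt{2\pi}}
            \exp\!\left[-\frac{(t-1)^{2}}{2c^{2}(1+t^{2})}\right]

% Log-likelihood for ratio samples t_1, ..., t_n (terms free of c dropped)
\ell(c) = -n\log c - \frac{1}{2c^{2}}\sum_{i=1}^{n}\frac{(t_i-1)^{2}}{t_i^{2}+1}

% Setting d\ell/dc = 0 gives the maximum-likelihood estimator
\hat{c} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\frac{(t_i-1)^{2}}{t_i^{2}+1}}
```

Under this form the estimator needs only the ratio samples themselves, which is why the confidence interval in the claims can be driven by a single parameter.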
20 Claims
-
1. A method for analyzing gene expression in a cDNA micro-array image, the method comprising:
-
identifying target sites in the cDNA micro-array image, wherein the target sites are associated with a set of genes;
computing a maximum-likelihood estimator for a coefficient of variation of expression level ratio samples, where the expression level ratio samples are taken from a collection of expression values for each gene in the set of genes associated with the target sites identified in the micro-array image, the expression level ratio samples indicate a ratio of an expression level for a first cell type to an expression level for a second cell type for a corresponding gene, and the expression levels for the corresponding gene are taken from a target site associated with the corresponding gene in the cDNA micro-array image;
computing a confidence interval for the expression level ratio samples based on the maximum likelihood estimator for a coefficient of variation of expression level ratio samples; and
identifying genes corresponding to expression level ratio samples outside the confidence interval.
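The steps recited in claim 1 can be sketched in Python as below. The sketch assumes a ratio density under which u = (t - 1)/sqrt(1 + t^2) is approximately normal with standard deviation c, which gives a closed-form estimator and interval. That density, the function names, and the 99% default level are illustrative assumptions rather than language from the claims.

```python
import numpy as np
from statistics import NormalDist


def cv_mle(ratios):
    """Maximum-likelihood estimate of the coefficient of variation c
    under the assumed ratio density (null hypothesis: true ratio is 1)."""
    t = np.asarray(ratios, dtype=float)
    return float(np.sqrt(np.mean((t - 1.0) ** 2 / (t ** 2 + 1.0))))


def ratio_confidence_interval(c, level=0.99):
    """Two-sided interval for the ratio at the given confidence level.

    Assumes u = (t - 1) / sqrt(1 + t^2) ~ N(0, c^2); solving u = +/- z*c
    for t yields two roots that are reciprocals of each other.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    u = z * c
    if u >= 1.0:
        raise ValueError("c is too large for this approximation")
    a = 1.0 - u * u
    upper = (1.0 + np.sqrt(1.0 - a * a)) / a
    return float(1.0 / upper), float(upper)


def flag_outliers(ratios, level=0.99):
    """Return (c, (lower, upper), indices of ratios outside the interval)."""
    t = np.asarray(ratios, dtype=float)
    c = cv_mle(t)
    lower, upper = ratio_confidence_interval(c, level)
    outside = np.flatnonzero((t < lower) | (t > upper))
    return c, (lower, upper), outside
```

Because the interval is symmetric on a reciprocal scale, a ratio of r and a ratio of 1/r are treated alike, which matches the intuition that two-fold over-expression and two-fold under-expression are equally notable.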
-
2. The method of claim 1 further including:
-
initializing an estimated mean for the expression level ratio samples;
(a) calibrating expression level ratio samples by computing a scaling factor between the expression levels of the first and second cell types and adjusting the expression level ratio samples with the scaling factor to generate adjusted expression level ratio samples;
(b) using the maximum likelihood estimator to compute a coefficient of variation of the adjusted expression level ratio samples; and
(c) using the coefficient of variation to compute an estimated mean value of the adjusted expression level ratio samples;
repeating the above computations at least once; and
after determining the mean and coefficient of variation of the adjusted expression level ratio samples, using the mean and coefficient of variation to compute the confidence interval.
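The iteration in steps (a) through (c) can be illustrated with the following Python sketch. It is not the patent's stated procedure: the re-estimation step uses a second-order (delta-method) approximation, E[T] ≈ m(1 + c^2) for a ratio with gain m, as a stand-in for however the specification relates the mean to the coefficient of variation. The names `calibrate_iteratively`, `n_iter`, and `tol` are hypothetical.

```python
import numpy as np


def cv_mle(calibrated):
    """MLE of the coefficient of variation for ratios calibrated to a
    true ratio of 1 (assumed form of the estimator)."""
    t = np.asarray(calibrated, dtype=float)
    return float(np.sqrt(np.mean((t - 1.0) ** 2 / (t ** 2 + 1.0))))


def calibrate_iteratively(ratios, n_iter=10, tol=1e-9):
    """Alternate between calibrating the ratios and re-estimating c.

    Returns (gain, c, adjusted_ratios).
    """
    t = np.asarray(ratios, dtype=float)
    sample_mean = float(t.mean())
    gain = sample_mean                      # initial estimate of the mean
    for _ in range(n_iter):
        adjusted = t / gain                 # (a) calibrate with scaling factor
        c = cv_mle(adjusted)                # (b) MLE of coefficient of variation
        # (c) re-estimate the mean/gain from c; delta-method assumption
        new_gain = sample_mean / (1.0 + c * c)
        converged = abs(new_gain - gain) < tol
        gain = new_gain
        if converged:
            break
    adjusted = t / gain
    return gain, cv_mle(adjusted), adjusted
```

The returned coefficient of variation and adjusted ratios are what a subsequent confidence-interval step would consume.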
-
-
3. The method of claim 2 further including:
-
capturing the micro-array image for an array of cDNAs hybridized with a first set of labeled mRNAs extracted from the first cell type and a second set of labeled mRNAs extracted from the second cell type, where intensity values of the first set of labeled mRNAs in the micro-array image represent an expression level of the first cell type, and intensity values of the second set of labeled mRNAs represent an expression level of the second cell type; and
computing an expression level ratio for each gene in the set as a ratio of an average of the first intensity values to an average of the second intensity values for pixel locations in the identified target site associated with the gene.
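A minimal Python sketch of the ratio computation in claim 3 follows. It assumes the two channel images are NumPy arrays of equal shape and that segmentation has already produced an integer label image in which 0 marks background and each positive integer marks one target site. The function name `target_site_ratios` and the label-image convention are hypothetical, and background subtraction is deliberately left out.

```python
import numpy as np


def target_site_ratios(first, second, labels):
    """Ratio of mean first-channel to mean second-channel intensity
    over the pixels of each labeled target site.

    labels: non-negative integer array, 0 = background.
    Returns {site_label: ratio} for every site that has pixels.
    """
    flat = np.asarray(labels).ravel()
    n = int(flat.max()) + 1
    counts = np.bincount(flat, minlength=n)
    first_sum = np.bincount(
        flat, weights=np.asarray(first, dtype=float).ravel(), minlength=n
    )
    second_sum = np.bincount(
        flat, weights=np.asarray(second, dtype=float).ravel(), minlength=n
    )
    ratios = {}
    for site in range(1, n):
        if counts[site] == 0:
            continue  # label value not used by any pixel
        first_mean = first_sum[site] / counts[site]
        second_mean = second_sum[site] / counts[site]
        # A zero second-channel mean is left for the caller to handle.
        ratios[site] = float(first_mean / second_mean)
    return ratios
```

For example, a site whose pixels average 15 in the first channel and 5 in the second yields a ratio of 3.0.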
-
-
4. The method of claim 1 wherein each gene in the set has two or more expression ratios, and each expression ratio represents a ratio of an expression level for a different cell type to an expression level for a reference cell type.
-
5. The method of claim 1 wherein the expression levels for each gene are determined from intensity values in a portion of the cDNA micro-array image representing a tagged mRNA hybridized to the target site for the gene in a cDNA micro-array.
-
6. The method of claim 1 wherein mRNAs of the first and second cell types are labeled using a radioactive labeling technique and the expression levels of the first and second cell types for each gene are measured based on presence of radioactive label.
-
7. The method of claim 1 wherein mRNAs of the first and second cell types are labeled using physical labels and the expression levels of the first and second cell types for each gene are measured based on the physical labels.
-
8. The method of claim 1 wherein mRNAs of the first and second cell types are labeled using fluorescent or chemifluorescent labels and the expression levels of the first and second cell types for each gene are measured based on the fluorescent or chemifluorescent labels.
-
9. A computer readable medium having software for performing the method of claim 1.
-
10. A programmatic method for analyzing gene expression of two or more cell types in an array of target gene sites, the method comprising:
-
programmatically extracting expression levels of at least first and second cell types at the target gene sites from an image of the array;
programmatically determining expression level ratios of the first and second cell types at the target gene sites;
programmatically determining a coefficient of variation from the expression level ratios; and
based on the coefficient of variation, determining a confidence interval for identifying target sites with expression level ratios that fall outside the confidence interval.
-
11. The method of claim 10 further including:
calibrating the expression levels for first and second cell types at each target site by modifying a measure of the expression level for each cell type by a constant factor for all target sites.
-
-
12. The method of claim 10 further including:
iteratively calibrating the expression levels by estimating a constant gain factor between expression levels of the first and second cell types for each gene and then using the estimated constant gain factor to determine a maximum likelihood estimator of the coefficient of variation.
-
13. A programmatic method for analyzing gene expression of two or more cell types in an array of target gene sites of housekeeping genes, the method comprising:
-
programmatically extracting expression levels of at least first and second cell types at the target gene sites from an image of the array;
programmatically determining expression levels of the first and second cell types at the target gene sites;
programmatically determining a distribution of the expression levels and a limit in the distribution that is used to identify target sites with an abnormal expression level; and
based on the distribution of the expression levels, identifying target sites with expression levels that fall outside the limit.
-
-
19. A computer-implemented method for analyzing data comprising a plurality of gene expression level ratio samples collected by analyzing a cDNA micro-array image indicating gene expression of a plurality of genes, wherein the ratio samples are associated with the plurality of genes, the method comprising:
-
(a) choosing an initial estimate of a mean of a density of the gene expression level ratio samples;
(b) calibrating the gene expression level ratio samples by applying a gain factor based on the estimate of the mean of the density of the gene expression level ratio samples;
(c) estimating a coefficient of variation for the gene expression level ratio samples via a maximum-likelihood estimator and the calibrated gene expression level ratio samples;
(d) re-estimating the mean via the coefficient of variation for the gene expression level ratio samples;
(e) repeating (b)-(d) at least once; and
(f) determining a confidence interval via the coefficient of variation estimated by repeating (b)-(d) at least once to identify outlier genes in the data.
-
Specification