Methods for classifying samples and ascertaining previously unknown classes
Abstract
Methods and apparatus for classifying or predicting the classes for samples based on gene expression are described. Also described are methods and apparatus for ascertaining or discovering new, previously unknown classes based on gene expression.
12 Claims
1. A method of identifying a set of informative genes whose expression correlates with a class distinction between samples, comprising the steps of:
a. sorting genes using a neighborhood analysis, wherein the genes are sorted by degree to which their expression in said samples correlate with a class distinction; and
b. determining whether said correlation is stronger than expected by chance;
wherein genes whose expression correlate with a class distinction more strongly than expected by chance are informative genes, thereby identifying a set of informative genes.
Dependent claims: 2, 3, 4, 5, 6, 7
a. defining an idealized expression pattern corresponding to a gene, wherein said idealized expression pattern is expression of said gene that is uniformly high in a first class and uniformly low in a second class; and
b. determining whether there is a high density of genes having an expression pattern similar to said idealized expression pattern, as compared to an equivalent random expression pattern, wherein the high density of genes are genes having a high statistical significance in a permutation test.
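The two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the function name `neighborhood_analysis`, the use of Pearson correlation as the similarity measure, and the list of correlation thresholds are all hypothetical choices made here. The sketch builds the idealized pattern (1 in the first class, 0 in the second), counts how many genes fall within each correlation threshold of it, and compares those counts with the counts obtained after randomly permuting the class labels.

```python
import numpy as np

def neighborhood_analysis(expr, labels, thresholds, n_perm=200, seed=0):
    """Count genes near an idealized pattern and compare with shuffled labels.

    expr: (genes x samples) array; labels: 0/1 class label per sample.
    Assumption: no gene is constant across samples (correlation undefined).
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    thresholds = np.asarray(thresholds, dtype=float)
    rng = np.random.default_rng(seed)

    # Center each gene once; permuting labels does not change these values.
    centered = expr - expr.mean(axis=1, keepdims=True)
    gene_norm = np.sqrt((centered ** 2).sum(axis=1))

    def counts(lab):
        # Idealized pattern: uniformly high (1) in class 0, low (0) in class 1.
        ideal = (lab == 0).astype(float)
        y = ideal - ideal.mean()
        # Pearson correlation of every gene with the idealized pattern
        # (an assumed similarity measure).
        corr = centered @ y / (gene_norm * np.sqrt((y ** 2).sum()))
        # Neighborhood size: number of genes at or above each threshold.
        return (corr[:, None] >= thresholds[None, :]).sum(axis=0)

    observed = counts(labels)
    permuted = np.array([counts(rng.permutation(labels)) for _ in range(n_perm)])
    # Fraction of permutations whose neighborhood is at least as dense.
    p_values = (1 + (permuted >= observed).sum(axis=0)) / (1 + n_perm)
    return observed, p_values
```

A small p-value at a given threshold indicates that the neighborhood around the idealized pattern holds more genes than equivalent random patterns usually do.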
7. The method of claim 6, further including a signal to noise routine performed according to:
-
8. A method of identifying a set of informative genes whose expression correlates with a class distinction between samples, comprising the steps of:
a. sorting genes using a neighborhood analysis, wherein the genes are sorted by degree to which their expression in said samples correlate with a disease class distinction; and
b. determining whether said correlation is stronger than expected by chance;
wherein genes whose expression correlate with a disease class distinction more strongly than expected by chance are informative genes, thereby identifying a set of informative genes.
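As a rough illustration of the sort-then-test structure of this claim, the Python sketch below ranks genes by a score and estimates, gene by gene, how often a random relabeling of the samples produces a score at least as large. Everything named here is an assumption made for illustration: the function `rank_informative_genes`, the score (absolute difference of class means), the number of permutations, and the cutoff `alpha`. The per-gene p-values are not corrected for multiple testing.

```python
import numpy as np

def rank_informative_genes(expr, labels, n_perm=500, alpha=0.01, seed=0):
    """Rank genes by class separation and flag those beating chance.

    expr: (genes x samples) array; labels: 0/1 class label per sample.
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)

    def score(lab):
        # Hypothetical score: absolute difference between the class means.
        return np.abs(expr[:, lab == 0].mean(axis=1)
                      - expr[:, lab == 1].mean(axis=1))

    observed = score(labels)

    # For each gene, count shuffles that score at least as high as observed.
    exceed = np.zeros(expr.shape[0], dtype=int)
    for _ in range(n_perm):
        exceed += score(rng.permutation(labels)) >= observed
    p_values = (1 + exceed) / (1 + n_perm)

    # Step a: sort genes by degree of correlation with the class distinction.
    order = np.argsort(-observed)
    # Step b: keep genes whose score is stronger than expected by chance.
    informative = [int(g) for g in order if p_values[g] <= alpha]
    return order, p_values, informative
```

The returned `informative` list preserves the ranking from step a, so the strongest genes come first.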
9. A method of identifying a set of informative genes whose expression correlates with a class distinction between samples, comprising the steps of:
a. sorting genes using a neighborhood analysis, wherein the genes are sorted by degree to which their expression in said samples correlate with a cancer disease class distinction; and
b. determining whether said correlation is stronger than expected by chance;
wherein genes whose expression correlate with a cancer disease class distinction more strongly than expected by chance are informative genes, thereby identifying a set of informative genes.
10. A method of identifying a set of informative genes whose expression correlates with a class distinction between samples, comprising the steps of:
a. defining an idealized expression pattern corresponding to a gene, wherein said idealized expression pattern is expression of said gene that is uniformly high in a first class and uniformly low in a second class; and
b. determining whether there is a high density of genes having an expression pattern similar to said idealized expression pattern, as compared to an equivalent random expression pattern, wherein the high density of genes are genes having a high statistical significance in a permutation test;
wherein genes whose expression correlate with a disease class distinction more strongly than expected by chance are informative genes, thereby identifying a set of informative genes.
Dependent claims: 11
12. A method of identifying a set of informative genes whose expression correlates with a class distinction between samples, comprising the steps of:
a. sorting genes using a neighborhood analysis, wherein the genes are sorted by degree to which their expression in said samples correlate with a cancer disease class distinction, said neighborhood analysis includes performing a signal to noise routine according to;
Specification