Methods and systems for differential clustering
Abstract
Methods, systems and computer readable media are provided for differential clustering of gene expression response profiles to identify potential functionally variant genes in a high throughput manner. Gene expression data is provided for a number of samples. Gene expression response profiles are generated for various sets of the samples and then differentially clustered across those sets to find genes whose expression response profiles change cluster membership from one set to another. Statistical analysis determines whether each change in cluster membership is statistically significant. If it is, the gene represented by the analyzed gene expression response profiles is identified as a potential functionally variant gene. The nature of the function change may also be identified by the present systems, methods and computer readable media.
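The differential clustering idea summarized in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the toy data, the helper names (`make_profiles`, `cluster_profiles`, `changed_membership`), the use of Pearson correlation with a single-linkage cutoff `min_corr` as the clustering method, and the rule that pairs each cluster in the first set with the cluster in the second set sharing the most members.

```python
from collections import Counter, defaultdict
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def make_profiles(expr, tissues):
    """Build one response profile per gene from the given set of tissues."""
    return {gene: [values[t] for t in tissues] for gene, values in expr.items()}

def cluster_profiles(profiles, min_corr=0.9):
    """Single-linkage clustering: link genes whose correlation >= min_corr.

    Returns a mapping gene -> cluster id (the id is an arbitrary member gene).
    """
    genes = sorted(profiles)
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= min_corr:
                parent[find(b)] = find(a)
    return {g: find(g) for g in genes}

def changed_membership(labels_1, labels_2):
    """Genes that leave the set-2 cluster holding most of their set-1 cluster mates."""
    overlap = defaultdict(Counter)
    for gene, c1 in labels_1.items():
        overlap[c1][labels_2[gene]] += 1
    match = {c1: counts.most_common(1)[0][0] for c1, counts in overlap.items()}
    return sorted(g for g, c1 in labels_1.items() if labels_2[g] != match[c1])

# Hypothetical toy data: five genes measured in six tissues.
expr = {
    "geneA": {"t1": 1, "t2": 2, "t3": 3, "t4": 1, "t5": 2, "t6": 3},
    "geneB": {"t1": 2, "t2": 4, "t3": 6, "t4": 2, "t5": 4, "t6": 6},
    "geneC": {"t1": 1, "t2": 3, "t3": 5, "t4": 5, "t5": 3, "t6": 1},
    "geneD": {"t1": 3, "t2": 2, "t3": 1, "t4": 3, "t5": 2, "t6": 1},
    "geneE": {"t1": 6, "t2": 4, "t3": 2, "t4": 6, "t5": 4, "t6": 2},
}
labels_first = cluster_profiles(make_profiles(expr, ["t1", "t2", "t3"]))
labels_second = cluster_profiles(make_profiles(expr, ["t4", "t5", "t6"]))
movers = changed_membership(labels_first, labels_second)
```

In this toy data, geneC rises with geneA and geneB across the first set of tissues but falls with geneD and geneE across the second set, so it is the only gene reported in `movers`. A reported gene is only a candidate at this stage; the statistical significance test described in the abstract still has to be applied to it.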
24 Claims
1. A high throughput method of identifying genes that are potentially functionally variant, said method comprising the steps of:
providing gene expression values for a plurality of genes for each of a number of tissues wherein the expression values are given for the same genes in each of the number of tissues;
dividing the number of tissues into at least first and second groups of tissues;
generating a gene expression response profile for each gene representative of all gene expression values for that gene across all tissue samples in the first group;
clustering the gene expression response profiles generated with respect to the first group;
generating a gene expression response profile for each gene representative of all gene expression values for that gene across all tissue samples in the second group;
clustering the gene expression response profiles generated with respect to the second group;
comparing gene expression response profile members in clusters generated with respect to the first group with gene expression response profile members in clusters generated with respect to the second group and identifying those members that change cluster membership in the second group relative to the first group;
statistically calculating whether the move of a member from membership in a first cluster to membership in a second cluster is significant relative to the variance within the first and second clusters and the variance between the first and second clusters; and
if the move is calculated to be significant, identifying the gene represented by the member as a potential functionally variant gene.
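The claim requires the significance of a move to be judged relative to the variance within and between the first and second clusters, but this text does not give a formula. The following is therefore only one possible reading, offered as a sketch: a Calinski-Harabasz-style pseudo-F ratio of between-cluster to within-cluster variance for the two clusters involved. The function names, the `threshold` parameter and its default value are hypothetical. A real analysis would more likely calibrate the statistic against a reference distribution (for example by permutation) than use a fixed cutoff.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length profiles."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def sq_dist(u, v):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def pseudo_f(cluster_1, cluster_2):
    """Between-cluster variance over within-cluster variance for two clusters.

    Computed as [B / (k - 1)] / [W / (n - k)] with k = 2 clusters, where B is
    the size-weighted squared distance of each cluster centroid to the grand
    centroid and W is the summed squared distance of members to their own
    centroid. Requires at least three profiles in total.
    """
    points = cluster_1 + cluster_2
    n, k = len(points), 2
    if n <= k:
        raise ValueError("need at least three profiles in total")
    grand = centroid(points)
    between = 0.0
    within = 0.0
    for members in (cluster_1, cluster_2):
        c = centroid(members)
        between += len(members) * sq_dist(c, grand)
        within += sum(sq_dist(p, c) for p in members)
    if within == 0:
        return float("inf")  # perfectly tight clusters
    return (between / (k - 1)) / (within / (n - k))

def move_is_significant(source, destination, threshold=10.0):
    """Assumed decision rule: flag the move if the pseudo-F ratio reaches threshold.

    source: profiles of the gene's former cluster mates.
    destination: profiles of the gene's new cluster, including the gene itself.
    """
    return pseudo_f(source, destination) >= threshold

# Hypothetical usage: two tight, well-separated clusters.
old_mates = [[0.0, 0.0], [0.0, 1.0]]
new_cluster = [[10.0, 0.0], [10.0, 1.0]]
significant = move_is_significant(old_mates, new_cluster)
```

Both cluster arguments are assumed to hold profiles from the same group of tissues, so that distances are comparable. Under this rule a gene is flagged only when its new cluster is clearly separated from its old one relative to the spread inside each cluster. Raw expression magnitudes dominate Euclidean distances, so profiles would normally be standardized first.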
14. A high throughput method of identifying genes that are potentially functionally variant, said method comprising the steps of:
providing gene expression values for a plurality of genes for each of a number of tissues wherein the expression values are given for the same genes in each of the number of tissues;
differentially clustering gene expression response profiles generated based upon at least a first set of tissues taken from the number of tissues and then from at least a second set of tissues taken from the number of tissues;
comparing gene expression response profile members in clusters generated with respect to one of said sets with gene expression response profile members in clusters generated with respect to another of said sets and identifying those members that change cluster membership in said another of said sets relative to said one of said sets;
statistically calculating whether the move of a member from membership in a first cluster to membership in a second cluster is significant relative to the variance within the first and second clusters and the variance between the first and second clusters; and
if the move is calculated to be significant, identifying the gene represented by the member as a potential functionally variant gene.
22. A system for high throughput identification of genes that are potentially functionally variant, said system comprising:
means for differentially clustering gene expression response profiles generated from expression values taken from at least a first set of tissues taken from a dataset providing gene expression values for a number of tissues and then clustering gene expression response profiles generated from expression values taken from at least a second set of tissues taken from the number of tissues;
means for comparing gene expression response profile members in clusters generated with respect to one of said sets with gene expression response profile members in clusters generated with respect to another of said sets and identifying those members that change cluster membership in said another of said sets relative to said one of said sets;
means for statistically calculating whether the move of a member from membership in a first cluster to membership in a second cluster is significant relative to the variance within the first and second clusters and the variance between the first and second clusters; and
means for identifying a gene as a potential functionally variant gene if the move is calculated to be significant.
23. A computer readable medium carrying one or more sequences of instructions for high throughput identification of genes that are potentially functionally variant, wherein execution of one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
dividing gene expression values, provided for a plurality of genes for each of a number of tissues wherein the expression values are given for the same genes in each of the number of tissues, into at least first and second groups of tissues;
generating a gene expression response profile for each gene representative of all gene expression values for that gene across all tissue samples in the first group;
clustering the gene expression response profiles generated with respect to the first group;
generating a gene expression response profile for each gene representative of all gene expression values for that gene across all tissue samples in the second group;
clustering the gene expression response profiles generated with respect to the second group;
comparing gene expression response profile members in clusters generated with respect to the first group with gene expression response profile members in clusters generated with respect to the second group and identifying those members that change cluster membership in the second group relative to the first group;
statistically calculating whether the move of a member from membership in a first cluster to membership in a second cluster is significant relative to the variance within the first and second clusters and the variance between the first and second clusters; and
if the move is calculated to be significant, identifying the gene represented by the member as a potential functionally variant gene.
24. A computer readable medium carrying one or more sequences of instructions for high throughput identification of genes that are potentially functionally variant, wherein execution of one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
differentially clustering gene expression response profiles, provided for a plurality of genes for each of a number of tissues wherein the expression values are given for the same genes in each of the number of tissues, generated based upon at least a first set of tissues taken from the number of tissues and then from at least a second set of tissues taken from the number of tissues;
comparing gene expression response profile members in clusters generated with respect to one of said sets with gene expression response profile members in clusters generated with respect to another of said sets and identifying those members that change cluster membership in said another of said sets relative to said one of said sets;
statistically calculating whether the move of a member from membership in a first cluster to membership in a second cluster is significant relative to the variance within the first and second clusters and the variance between the first and second clusters; and
if the move is calculated to be significant, identifying the gene represented by the member as a potential functionally variant gene.
Specification