Methods and systems for determining the biological function of cell constituents using response profiles
Abstract
The invention relates to methods and systems (e.g., computer systems and computer program products) for determining the biological function of uncharacterized cellular constituents, particularly genes and gene products, by using “response profiles,” i.e., measurements of pluralities of cellular constituents in cells having a modified gene or gene product, as phenotypic markers for the gene or gene product. Methods are provided for clustering such response profiles so that similar or correlated response profiles are organized into the same cluster. The invention also provides databases or “compendiums” of response profiles to which the response profile of an uncharacterized gene or gene product can be compared. In one embodiment, steps of the methods comprise comparing the measured response profiles to response profiles stored in the databases or compendiums, and determining the biological function of the response profiles in the databases that are most similar to the measured response profiles.
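The comparison step described above (score a measured response profile against each stored landmark profile, then read off the known function of the closest matches) can be sketched as follows. This is a minimal illustration, not the patented implementation: the use of Pearson correlation as the measure of similarity is an assumption, as are the function names `pearson` and `most_similar_landmarks`, the gene labels, the function labels, and the toy measurements.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length profiles.

    Assumption: correlation stands in for the unspecified
    "measure of similarity"; other measures could be substituted.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den if den else 0.0


def most_similar_landmarks(query, landmarks, top_k=1):
    """Rank landmark profiles by similarity to the query profile.

    `landmarks` maps a perturbed-constituent label to a tuple of
    (profile, known_function). Returns the top_k entries as
    (similarity, label, known_function), most similar first.
    """
    scored = [
        (pearson(query, profile), label, function)
        for label, (profile, function) in landmarks.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical toy compendium: labels, functions, and values are invented.
landmarks = {
    "geneA": ([1.0, 2.0, 3.0, 4.0], "function A"),
    "geneB": ([4.0, 3.0, 2.0, 1.0], "function B"),
    "geneC": ([1.0, -1.0, 1.0, -1.0], "function C"),
}
query = [0.9, 2.1, 2.9, 4.2]  # profile of the uncharacterized constituent

best = most_similar_landmarks(query, landmarks, top_k=1)
print(best[0][1], best[0][2])
```

A real compendium would hold profiles for many more perturbations, and each profile would cover many more measured constituents than the four values used here.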
48 Claims
1. A method for determining a biological function of a cellular constituent that is perturbed in one or more landmark response profiles that are most similar to a first response profile, comprising:
(a) receiving a first response profile comprising measured amounts of a plurality of different cellular constituents in a first cell of a cell type or type of organism, said first cell having a first perturbation to said first cellular constituent;
(b) comparing said first response profile to a plurality of landmark response profiles stored in a database to determine a measure of similarity between said first response profile and each said landmark response profile in said plurality of landmark response profiles, each said landmark response profile comprising measured amounts of said plurality of different cellular constituents in a second cell of said cell type or type of organism, said second cell having a second perturbation to a second cellular constituent associated with a known biological function, wherein said plurality of landmark response profiles comprises landmark response profiles corresponding to respective perturbations to at least 118 different genes of a cell of said cell type or type of organism, and wherein said measured amounts in said first response profile and in said plurality of landmark response profiles are all measured amounts of transcripts or are all measured amounts of proteins;
(c) determining one or more landmark response profiles most similar to said first response profile based on the measures of similarity determined in step (b); and
(d) identifying the known biological function associated with the one or more second cellular constituents that are perturbed in the one or more second cells corresponding to the respective one or more landmark response profiles determined to be most similar to said first response profile in step (c);
wherein steps (a), (b), (c), and (d) are implemented on a suitably programmed computer.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18

2. A method for determining a biological function of a cellular constituent that is perturbed in one or more landmark response profiles that are most similar to a first response profile, comprising:
(a) comparing a first response profile to a plurality of landmark response profiles stored in a database to determine a measure of similarity between said first response profile and each said landmark response profile in said plurality of landmark response profiles;
wherein said first response profile comprises measured amounts of a plurality of different cellular constituents in a first cell of a cell type or type of organism, said first cell having a first perturbation to said first cellular constituent;
wherein each landmark response profile comprises measured amounts of said plurality of different cellular constituents in a second cell of said cell type or type of organism, said second cell having a second perturbation to a second cellular constituent associated with a known biological function, wherein said plurality of landmark response profiles comprises landmark response profiles corresponding to respective perturbations to at least 118 different genes of a cell of said cell type or type of organism, and wherein said measured amounts in said first response profile and in said plurality of landmark response profiles are all measured amounts of transcripts or are all measured amounts of proteins;
(b) determining one or more landmark response profiles most similar to said first response profile based on the measures of similarity determined in step (a); and
(c) identifying the known biological function associated with the one or more second cellular constituents that are perturbed in the one or more second cells corresponding to the respective one or more landmark response profiles determined to be most similar to said first response profile in step (b);
wherein steps (a), (b), and (c) are implemented on a suitably programmed computer.

19. A method for characterizing a first cellular constituent as being associated with a particular biological function, comprising:
(a) measuring a first response profile comprising measured amounts of a plurality of different cellular constituents in a first cell of a cell type or type of organism, said first cell having a first perturbation to said first cellular constituent;
(b) clustering a plurality of response profiles, which plurality comprises said first response profile and a plurality of landmark response profiles, each landmark response profile comprising measured amounts of said plurality of different cellular constituents in a second cell of said cell type or type of organism, said second cell having a second perturbation to a second cellular constituent associated with a known biological function, wherein said plurality of landmark response profiles comprises landmark response profiles corresponding to respective perturbations to at least 118 different genes of a cell of said cell type or type of organism, and wherein said measured amounts in said first response profile and in said plurality of landmark response profiles are all measured amounts of transcripts or are all measured amounts of proteins;
(c) identifying one or more landmark response profiles in said plurality of landmark response profiles that cluster with the first response profile; and
(d) characterizing the first cellular constituent as being associated with said known biological function associated with the one or more second cellular constituents that are perturbed in the one or more second cells corresponding to said respective one or more landmark response profiles identified as clustered with said first response profile in step (c).
Dependent claims: 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32

20. A method for characterizing a first cellular constituent as being associated with a particular biological function, comprising:
(a) clustering a plurality of response profiles, which plurality comprises;
(i) a first response profile comprising measured amounts of a plurality of different cellular constituents in a first cell of a cell type or type of organism, said first cell having a first perturbation to said first cellular constituent; and
(ii) a plurality of landmark response profiles, each landmark response profile comprising measured amounts of said plurality of different cellular constituents in a second cell of said cell type or type of organism, said second cell having a second perturbation to a second cellular constituent associated with a known biological function, wherein said plurality of landmark response profiles comprises landmark response profiles corresponding to respective perturbations to at least 118 different genes of a cell of said cell type or type of organism, and wherein said measured amounts in said first response profile and in said plurality of landmark response profiles are all measured amounts of transcripts or are all measured amounts of proteins;
(b) identifying one or more landmark response profiles in said plurality of landmark response profiles that cluster with the first response profile;
(c) characterizing the first cellular constituent as being associated with said known biological function associated with the one or more second cellular constituents that are perturbed in the one or more second cells corresponding to said respective one or more landmark response profiles identified as clustered with said first response profile in step (b); and
(d) identifying the biological function with which said first cellular constituent is associated, as characterized in step (c);
wherein steps (a), (b), (c), and (d) are implemented on a suitably programmed computer.

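The clustering variant recited in claims 19 and 20 (cluster the first response profile together with the landmark profiles, then take the known functions of the landmarks that fall in the same cluster) can be sketched as below. The claims do not name a clustering algorithm, so this sketch assumes single-linkage clustering at a fixed correlation-distance cutoff, implemented as connected components with a union-find structure. The function names, the cutoff value, the labels, and the toy measurements are all hypothetical.

```python
from math import sqrt


def correlation_distance(x, y):
    """1 - Pearson correlation; 0 means perfectly correlated profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    r = num / den if den else 0.0
    return 1.0 - r


def single_linkage_clusters(profiles, cutoff):
    """Group profiles whose pairwise distance chains stay within `cutoff`.

    Cutting a single-linkage tree at `cutoff` is equivalent to taking the
    connected components of the graph that links every pair of profiles
    at distance <= cutoff, which is what the union-find below computes.
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative (path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if correlation_distance(profiles[a], profiles[b]) <= cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


def functions_clustered_with(query_name, profiles, annotations, cutoff):
    """Known functions of the landmarks sharing a cluster with the query."""
    for cluster in single_linkage_clusters(profiles, cutoff):
        if query_name in cluster:
            return sorted({annotations[n] for n in cluster if n in annotations})
    return []


# Hypothetical toy data: one uncharacterized profile plus three landmarks.
profiles = {
    "query": [0.9, 2.1, 2.9, 4.2],
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [4.0, 3.0, 2.0, 1.0],
    "geneC": [1.0, -1.0, 1.0, -1.0],
}
annotations = {"geneA": "function A", "geneB": "function B", "geneC": "function C"}

print(functions_clustered_with("query", profiles, annotations, cutoff=0.2))
```

Single linkage is chosen here only because it is short to write without external libraries; average or complete linkage would be equally consistent with the claim language.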
33. A computer system for identifying a biological function of a cellular constituent that is perturbed in one or more landmark response profiles that are most similar to a first response profile, said computer system comprising:
one or more processor units; and
one or more memory units connected to said one or more processor units, said one or more memory units containing one or more programs which cause said one or more processor units to execute steps comprising;
(a) receiving a data structure for a first response profile comprising measured amounts of a plurality of different cellular constituents in a first cell of a cell type or type of organism, said first cell having a first perturbation to said first cellular constituent;
(b) comparing said first response profile to a plurality of landmark response profiles stored in a database to determine a measure of similarity between said first response profile and each said landmark response profile in said plurality of landmark response profiles, each said landmark response profile comprising measured amounts of said plurality of different cellular constituents in a second cell of said cell type or type of organism, said second cell having a second perturbation to a second cellular constituent associated with a known biological function, wherein said plurality of landmark response profiles comprises landmark response profiles corresponding to respective perturbations to at least 118 different genes of a cell of said cell type or type of organism, and wherein said measured amounts in said first response profile and in said plurality of landmark response profiles are all measured amounts of transcripts or are all measured amounts of proteins;
(c) determining one or more landmark response profiles most similar to said first response profile based on the measures of similarity determined in step (b);
(d) determining the known biological function associated with the one or more second cellular constituents that are perturbed in the one or more second cells corresponding to the respective one or more landmark response profiles determined to be most similar to said first response profile in step (c); and
(e) outputting or displaying the known biological function determined in step (d).
Dependent claims: 34, 36, 37, 38, 39

35. A computer system for identifying a biological function of a cellular constituent that is perturbed in one or more landmark response profiles that are most similar to a first response profile, said computer system comprising:
one or more processor units; and
one or more memory units connected to said one or more processor units, said one or more memory units containing one or more programs which cause said one or more processor units to execute steps comprising;
(a) comparing a first response profile to a plurality of landmark response profiles stored in a database to determine a measure of similarity between said first response profile and each said landmark response profile in said plurality of landmark response profiles;
wherein said first response profile comprises measured amounts of a plurality of different cellular constituents in a first cell of a cell type or type of organism, said first cell having a first perturbation to said first cellular constituent;
wherein each said landmark response profile comprises measured amounts of said plurality of different cellular constituents in a second cell of said cell type or type of organism, said second cell having a second perturbation to a second cellular constituent associated with a known biological function, wherein said plurality of landmark response profiles comprises landmark response profiles corresponding to respective perturbations to at least 118 different genes of a cell of said cell type or type of organism, and wherein said measured amounts in said first response profile and in said plurality of landmark response profiles are all measured amounts of transcripts or are all measured amounts of proteins;
(b) determining one or more landmark response profiles most similar to said first response profile based on the measures of similarity determined in step (a);
(c) determining the known biological function associated with the one or more second cellular constituents that are perturbed in the one or more second cells corresponding to the respective one or more landmark response profiles determined to be most similar to said first response profile in step (b); and
(d) outputting or displaying the known biological function determined in step (c).
Dependent claims: 40

41. A computer program product for use in conjunction with a computer having one or more memory units and one or more processor units, the computer program product comprising a computer readable storage medium having a computer program mechanism encoded thereon, wherein said computer program mechanism can be loaded into the one or more memory units of said computer and cause the one or more processor units of the computer to execute steps comprising:
(a) receiving a data structure for a first response profile comprising measured amounts of a plurality of different cellular constituents in a first cell of a cell type or type of organism, said first cell having a first perturbation to a first cellular constituent; and
(b) comparing said first response profile to a plurality of landmark response profiles stored in a database to determine a measure of similarity between said first response profile and each said landmark response profile in said plurality of landmark response profiles, each landmark response profile comprising measured amounts of said plurality of different cellular constituents in a second cell of said cell type or type of organism, said second cell having a second perturbation in a second cellular constituent associated with a known biological function, wherein said plurality of landmark response profiles comprises landmark response profiles corresponding to respective perturbations to at least 118 different genes of a cell of said cell type or type of organism, and wherein said measured amounts in said first response profile and in said plurality of landmark response profiles are all measured amounts of transcripts or are all measured amounts of proteins;
(c) determining one or more landmark response profiles most similar to said first response profile based on the measures of similarity determined in step (b);
(d) determining the known biological function associated with the one or more second cellular constituents that are perturbed in the one or more second cells corresponding to the respective one or more landmark response profiles determined to be most similar to said first response profile in step (c); and
(e) outputting or displaying the known biological function determined in step (d).
Dependent claims: 42, 44, 45, 46, 47

43. A computer program product for use in conjunction with a computer having one or more memory units and one or more processor units, the computer program product comprising a computer readable storage medium having a computer program mechanism encoded thereon, wherein said computer program mechanism can be loaded into the one or more memory units of said computer and cause the one or more processor units of the computer to execute steps comprising:
(a) comparing a first response profile to a plurality of landmark profiles stored in a database to determine a measure of similarity between said first response profile and each said landmark response profile in said plurality of landmark response profiles;
wherein said first response profile comprises measured amounts of a plurality of different cellular constituents in a first cell of a cell type or type of organism, said first cell having a first perturbation to a first cellular constituent;
wherein each landmark response profile comprises measured amounts of said plurality of different cellular constituents in a second cell of said cell type or type of organism, said second cell having a second perturbation to a second cellular constituent associated with a known biological function, wherein said plurality of landmark response profiles comprises landmark response profiles corresponding to respective perturbations to at least 118 different genes of a cell of said cell type or type of organism, and wherein said measured amounts in said first response profile and in said plurality of landmark response profiles are all measured amounts of transcripts or are all measured amounts of proteins;
(b) determining one or more landmark response profiles most similar to said first response profile based on the measures of similarity determined in step (a);
(c) determining the known biological function associated with the one or more second cellular constituents that are perturbed in the one or more second cells corresponding to the respective one or more landmark response profiles determined to be most similar to said first response profile in step (b); and
(d) outputting or displaying the known biological function determined in step (c).
Dependent claims: 48

Specification