Identification and comparison of protein-protein interactions that occur in populations and identification of inhibitors of these interactors
Abstract
Methods are described for detecting protein-protein interactions among two populations of proteins, each having a complexity of at least 100. Encoded proteins are fused either to the DNA-binding domain of a transcriptional activator or to the activation domain of a transcriptional activator. Two yeast strains, of opposite mating types and each carrying one type of the fusion proteins, are mated together. Productive interactions between the two halves due to protein-protein interactions lead to the reconstitution of the transcriptional activator, which in turn leads to the activation of a reporter gene containing a binding site for the DNA-binding domain. This analysis can be carried out for two or more populations of proteins. The differences in the genes encoding the proteins involved in the protein-protein interactions are characterized, thus leading to the identification of specific protein-protein interactions, and the genes encoding the interacting proteins, relevant to a particular tissue, stage or disease. Furthermore, inhibitors that interfere with these protein-protein interactions are identified by their ability to inactivate a reporter gene. The screening for such inhibitors can be in a multiplexed format where a set of inhibitors will be screened against a library of interactors. Further, information-processing methods and systems are described. These methods and systems provide for identification of the genes coding for detected interacting proteins, for assembling a unified database of protein-protein interaction data, and for processing this unified database to obtain protein interaction domain and protein pathway information.
33 Citations
17 Claims
1. A method of detecting one or more protein-protein interactions comprising
(a) recombinantly expressing within a population of host cells (i) a first population of first fusion proteins, each said first fusion protein comprising a first protein sequence and a DNA binding domain in which the DNA binding domain is the same in each said first fusion protein, and in which said first population of first fusion proteins has a complexity of at least 100; and
(ii) a second population of second fusion proteins, each said second fusion protein comprising a second protein sequence and a transcriptional regulatory domain of a transcriptional regulator, in which the transcriptional regulatory domain is the same in each said second fusion protein, such that a first fusion protein is co-expressed with a second fusion protein in host cells, and wherein said host cells contain at least one nucleotide sequence operably linked to a promoter driven by one or more DNA binding sites recognized by said DNA binding domain such that interaction of a first fusion protein with a second fusion protein results in regulation of transcription of said at least one nucleotide sequence by said regulatory domain, and in which said second population of second fusion proteins has a complexity of at least 100;
wherein a host cell comprises a first fusion protein and a second fusion protein, and the number of host cells in the population is sufficiently large to provide confidence at the level of 50% or greater that every pair wise combination of a first fusion protein and a second fusion protein is represented in the population; and
(b) detecting said regulation of transcription of said at least one nucleotide sequence, thereby detecting cells in which an interaction between a first fusion protein and a second fusion protein has occurred. - View Dependent Claims (17)
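The population-size limitation of claim 1 can be made concrete with a rough calculation. The sketch below is illustrative only and is not taken from the patent: it assumes that every pairwise combination is equally likely to arise in a given cell, and it models the number of cells carrying each combination as an independent Poisson count. The function name `cells_needed` and its parameters are hypothetical.

```python
import math


def cells_needed(n_first: int, n_second: int, confidence: float = 0.5) -> int:
    """Estimate how many co-expressing cells cover every pairwise combination.

    Assumptions (not from the source): all n_first * n_second pairs are
    equally likely, and per-pair cell counts are independent Poisson
    variables with mean N / pairs, so that
        P(all pairs present) = (1 - exp(-N / pairs)) ** pairs.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    pairs = n_first * n_second
    # Allowed probability that one given pair is absent:
    # 1 - confidence ** (1 / pairs), computed with expm1 so the result
    # stays accurate when pairs is large.
    miss = -math.expm1(math.log(confidence) / pairs)
    # Solve exp(-N / pairs) = miss for N and round up to whole cells.
    return math.ceil(-pairs * math.log(miss))


if __name__ == "__main__":
    for complexity in (100, 500):
        n = cells_needed(complexity, complexity, confidence=0.5)
        print(f"complexity {complexity} x {complexity}: about {n:,} cells")
```

Under these assumptions, two populations of complexity 100 (10,000 pairs) call for on the order of 10^5 co-expressing cells at the 50% level. Real libraries are rarely uniformly represented, so more cells would generally be needed in practice.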
2. A method of detecting one or more protein-protein interactions comprising
(a) recombinantly expressing in a first population of yeast cells of a first mating type, a first population of first fusion proteins, each first fusion protein comprising a first protein sequence and a DNA binding domain, in which the DNA binding domain is the same in each said first fusion protein; wherein said first population of yeast cells contains a first nucleotide sequence operably linked to a promoter driven by one or more DNA binding sites recognized by said DNA binding domain such that an interaction of a first fusion protein with a second fusion protein, said second fusion protein comprising a transcriptional activation domain, results in increased transcription of said first nucleotide sequence, and in which said first population of first fusion proteins has a complexity of at least 100;
(b) negatively selecting to reduce the number of those yeast cells expressing said first population of first fusion proteins in which said increased transcription of said first nucleotide sequence occurs in the absence of said second fusion protein;
(c) recombinantly expressing in a second population of yeast cells of a second mating type different from said first mating type, a second population of said second fusion proteins, each second fusion protein comprising a second protein sequence and an activation domain of a transcriptional activator, in which the activation domain is the same in each said second fusion protein, and in which said second population of second fusion proteins has a complexity of at least 100;
(d) mating said first population of yeast cells with said second population of yeast cells to form a third population of diploid yeast cells, wherein a diploid cell comprises a first fusion protein and a second fusion protein, wherein said population of diploid yeast cells contains a second nucleotide sequence operably linked to a promoter driven by a DNA binding site recognized by said DNA binding domain such that an interaction of a first fusion protein with a second fusion protein results in increased transcription of said second nucleotide sequence, in which the first and second nucleotide sequences can be the same or different, and wherein the number of diploid cells in the third population is sufficiently large to provide confidence at the level of 50% or greater that every pair wise combination of a first fusion protein and a second fusion protein is represented in the population; and
(e) detecting said increased transcription of said first and/or second nucleotide sequence, thereby detecting cells in which an interaction between a first fusion protein and a second fusion protein has occurred. - View Dependent Claims (3, 4, 5, 6, 8, 10)
(f) designating each colony in which an interaction between a first fusion protein and a second fusion protein is detected as one point of a multidimensional array in which the intersection of axes in each dimension uniquely identifies a single said colony;
(g) pooling all colonies along a simple axis to form a plurality of pooled colonies;
(h) amplifying from a first aliquot of each pooled colony a plurality of first nucleic acids, each first nucleic acid comprising a sequence encoding said first fusion protein or a portion thereof comprising said first protein sequence;
(i) amplifying from a second aliquot of each pooled colony a plurality of second nucleic acids, each second nucleic acid comprising a sequence encoding said second fusion protein or a portion thereof comprising said second protein sequence;
(j) subjecting said first nucleic acids from each pooled colony to size separation;
(k) subjecting said second nucleic acids from each pooled colony to size separation;
(l) identifying which at least one of said first nucleic acids are present in samples of first nucleic acids from a pooled colony from axes in each dimension, thereby indicating that said at least one first nucleic acid is present in said array in the colony designated at the intersection of said axes in each dimension;
(m) identifying which at least one of said second nucleic acids are present in samples of a second nucleic acid from a pooled colony from axes in each dimension, thereby indicating that the said at least one second nucleic acid is present in said array in the colony designated at the intersection of said axes in each dimension;
in which the first and second nucleic acids that are indicated to be present in said array in a colony designated at the same intersection are indicated to encode interacting protein sequences.
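Steps (f) through (m) above describe locating each amplified nucleic acid by pooling colonies along the axes of an array and intersecting the pools in which it appears. The following is a minimal sketch of that idea for a two-dimensional array, assuming that each colony yields one amplified fragment and that the fragment is identified by its size after size separation. The grid layout, the fragment sizes, and the function names `pool_by_axis` and `deconvolute` are hypothetical and do not come from the patent.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Coord = Tuple[int, int]  # (row, column) position of a colony in the array


def pool_by_axis(grid: Dict[Coord, int]) -> Tuple[Dict[int, Set[int]], Dict[int, Set[int]]]:
    """Pool fragment sizes along each axis: one pool per row, one per column."""
    row_pools: Dict[int, Set[int]] = defaultdict(set)
    col_pools: Dict[int, Set[int]] = defaultdict(set)
    for (row, col), size in grid.items():
        row_pools[row].add(size)
        col_pools[col].add(size)
    return dict(row_pools), dict(col_pools)


def deconvolute(row_pools: Dict[int, Set[int]],
                col_pools: Dict[int, Set[int]]) -> Dict[int, List[Coord]]:
    """Map each observed fragment size to the intersections where it may sit.

    A size seen in exactly one row pool and one column pool is assigned
    to a single colony; a size seen in several pools stays ambiguous.
    """
    all_sizes: Set[int] = set().union(*row_pools.values())
    candidates: Dict[int, List[Coord]] = {}
    for size in sorted(all_sizes):
        rows = [r for r, sizes in sorted(row_pools.items()) if size in sizes]
        cols = [c for c, sizes in sorted(col_pools.items()) if size in sizes]
        candidates[size] = [(r, c) for r in rows for c in cols]
    return candidates


if __name__ == "__main__":
    # Hypothetical 2 x 2 array of positive colonies with fragment sizes in bp.
    grid = {(0, 0): 450, (0, 1): 800, (1, 0): 1200, (1, 1): 620}
    rows, cols = pool_by_axis(grid)
    for size, coords in deconvolute(rows, cols).items():
        print(size, coords)
```

Running the same procedure separately on the first and second nucleic acids, then pairing the fragments assigned to the same intersection, corresponds to the final clause above. With n colonies per axis, a two-dimensional array needs only 2n pooled amplifications instead of n squared, at the cost of ambiguity when the same fragment size occurs in more than one row and column.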
10. The method according to claim 8 which further comprises subjecting said pooled colonies of first nucleic acids to a method for identifying, classifying, or quantifying one or more nucleic acids in a sample, said method comprising:
(a) probing said sample with one or more recognition means, each recognition means causing recognition of a target nucleotide subsequence or a set of target nucleotide subsequences;
(b) generating one or more signals from said sample probed by said recognition means, each generated signal arising from a nucleic acid in said sample and comprising a representation of (i) the identities of effective subsequences, each said effective subsequence being a subsequence comprising a target subsequence, or the identities of sets of effective subsequences, each said set having member effective subsequences each of which comprises a different target subsequence from one of said sets of target sequences, and (ii) the length between occurrences of effective subsequences in said nucleic acid or between one occurrence of one effective subsequence and the end of said nucleic acid; and
(c) searching a nucleotide sequence database to determine sequences that match or the absence of any sequences that match said one or more generated signals, said database comprising a plurality of known nucleotide sequences of nucleic acids that may be present in the sample, a sequence from said database matching a generated signal when the sequence from said database has both (i) the same length between occurrences of effective subsequences or the same length between one occurrence of one effective target subsequence and the end of the sequence as is represented by the generated signal, and (ii) the same effective subsequences as are represented by the generated signal, or effective subsequences that are members of the same sets of effective subsequences as are represented by the generated signal, whereby said one or more nucleic acids in said sample are identified, classified, or quantified.
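The database search of step (c) can be illustrated with a small sketch. It assumes that a generated signal is a triple of two recognized subsequences and the length between them, that length is measured between the start positions of the two occurrences, and that a match must be exact. These conventions, the toy sequences, and the function names `predicted_signals` and `search_database` are assumptions made for illustration and are not specified in the claim.

```python
from typing import Dict, List, Set, Tuple

# (left subsequence, right subsequence or "END", length between them)
Signal = Tuple[str, str, int]


def predicted_signals(seq: str, targets: List[str]) -> Set[Signal]:
    """Predict the signals a known sequence would generate.

    One signal is emitted for each pair of adjacent target occurrences,
    plus one for the last occurrence and the end of the sequence.
    """
    hits: List[Tuple[int, str]] = []
    for target in targets:
        pos = seq.find(target)
        while pos != -1:
            hits.append((pos, target))
            pos = seq.find(target, pos + 1)
    hits.sort()
    signals: Set[Signal] = set()
    for (p1, t1), (p2, t2) in zip(hits, hits[1:]):
        signals.add((t1, t2, p2 - p1))
    if hits:
        last_pos, last_target = hits[-1]
        signals.add((last_target, "END", len(seq) - last_pos))
    return signals


def search_database(db: Dict[str, str], targets: List[str],
                    observed: Signal) -> List[str]:
    """Return names of database sequences matching the observed signal.

    An empty list reports the absence of any matching sequence.
    """
    return sorted(name for name, seq in db.items()
                  if observed in predicted_signals(seq, targets))


if __name__ == "__main__":
    # Toy database and recognition subsequences, invented for this example.
    db = {
        "geneA": "AAGAATTCTTTTTGGATCCAA",
        "geneB": "GAATTCTTGGATCCAAAA",
    }
    targets = ["GAATTC", "GGATCC"]
    print(search_database(db, targets, ("GAATTC", "GGATCC", 11)))
    print(search_database(db, targets, ("GAATTC", "GGATCC", 5)))
```

A practical implementation would also need a tolerance on measured lengths and support for sets of target subsequences, both of which this sketch omits.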
7. A method of detecting one or more protein-protein interactions comprising
(a) introducing into a first population of cells of Saccharomyces cerevisiae a first population of first plasmids, each said first plasmid encoding and capable of expressing in the first population of cells (i) TRP1, and (ii) a first population of first fusion proteins, each said first fusion protein comprising a GAL4 DNA binding domain and a first protein sequence, in which said first population of first fusion proteins has a complexity of at least 100, and in which said first population of cells (i) is of a first mating type selected from the group consisting of a and α, (ii) is mutant in endogenous URA3 and HIS3, (iii) contains functional URA3 coding sequences under the control of a promoter containing GAL4 binding sites, and (iv) contains functional lacZ coding sequences under the control of a promoter containing GAL4 binding sites;
(b) introducing into a second population of cells of Saccharomyces cerevisiae a second population of second plasmids, each said second plasmid encoding and capable of expressing in the second population of cells (i) LEU2, and (ii) a second population of second fusion proteins, each said second fusion protein comprising a GAL4 transcriptional activation domain and a second protein sequence, in which said second population of second fusion proteins has a complexity of at least 100, and in which said second population of cells (i) is of a second mating type different from said first mating type and selected from the group consisting of a and α, (ii) is mutant in endogenous URA3 and HIS3, (iii) contains functional HIS3 coding sequences under the control of a promoter containing GAL4 binding sites, and (iv) contains functional lacZ coding sequences under the control of a promoter containing GAL4 binding sites;
(c) after step (a), incubating said first population of cells in an environment lacking tryptophan and containing 5-fluoroorotic acid;
(d) pooling surviving cells from said first population after step (c);
(e) after step (b), incubating said second population of cells in an environment lacking leucine;
(f) pooling surviving cells from said second population after step (e);
(g) mating the pooled cells from said first population and the pooled cells from said second population by mixing the cells together, applying the cells to a solid medium and incubating the cells, to form diploid cells, wherein a diploid cell comprises a first fusion protein and a second fusion protein and wherein the number of diploid cells is sufficiently large to provide confidence at the level of 50% or greater that every pair wise combination of a first fusion protein and a second fusion protein is represented in the population; and
(h) incubating the diploid cells in an environment lacking uracil, histidine, tryptophan and leucine, to select diploid cells containing a said first plasmid and a said second plasmid and in which transcription of the URA3 and HIS3 coding sequences has been activated, thereby indicating that a first fusion protein has interacted with a second fusion protein within the diploid cell, thereby detecting one or more protein-protein interactions. - View Dependent Claims (9, 11)
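The selection steps of claim 7 can be summarized as simple survival rules. The sketch below is a simplified boolean model, not part of the patent: it relies on the fact that 5-fluoroorotic acid is toxic to cells expressing URA3, treats each marker as fully on or off, and ignores the lacZ reporter. The `Cell` class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    has_trp1_plasmid: bool = False  # first plasmid (DNA binding domain fusion)
    has_leu2_plasmid: bool = False  # second plasmid (activation domain fusion)
    ura3_on: bool = False           # URA3 reporter transcribed
    his3_on: bool = False           # HIS3 reporter transcribed


def survives_step_c(cell: Cell) -> bool:
    """No tryptophan, plus 5-fluoroorotic acid: needs TRP1, must not express URA3.

    This removes first-population cells whose fusion protein activates
    the URA3 reporter without any interaction partner.
    """
    return cell.has_trp1_plasmid and not cell.ura3_on


def survives_step_e(cell: Cell) -> bool:
    """No leucine: needs the LEU2-marked second plasmid."""
    return cell.has_leu2_plasmid


def survives_step_h(cell: Cell) -> bool:
    """No uracil, histidine, tryptophan or leucine: needs both plasmids
    and both reporters activated, which indicates an interaction."""
    return (cell.has_trp1_plasmid and cell.has_leu2_plasmid
            and cell.ura3_on and cell.his3_on)


if __name__ == "__main__":
    autoactivator = Cell(has_trp1_plasmid=True, ura3_on=True)
    interacting_diploid = Cell(True, True, ura3_on=True, his3_on=True)
    print(survives_step_c(autoactivator))        # removed before mating
    print(survives_step_h(interacting_diploid))  # selected in step (h)
```

The ordering matters in this model: the same URA3 reporter that is selected against in step (c), before mating, is selected for in step (h), after mating.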
(f) designating each colony in which an interaction between a first fusion protein and a second fusion protein is detected as one point of a multidimensional array in which the intersection of axes in each dimension uniquely identifies a single said colony;
(g) pooling all colonies along a simple axis to form a plurality of pooled colonies;
(h) amplifying from a first aliquot of each pooled colony a plurality of first DNA molecules, each first DNA molecule comprising a sequence encoding said first fusion protein or a portion thereof comprising said first protein sequence;
(i) amplifying from a second aliquot of each pooled colony a plurality of second DNA molecules, each second DNA molecule comprising a sequence encoding said second fusion protein or a portion thereof comprising said second protein sequence;
(j) subjecting said first DNA molecules from each pooled colony to size separation;
(k) subjecting said second DNA molecules from each pooled colony to size separation;
(l) identifying which at least one of said first DNA molecules are present in samples of first DNA molecules from a pooled colony from axes in each dimension, thereby indicating that said at least one first DNA molecule is present in said array in the colony designated at the intersection of said axes in each dimension;
(m) identifying which at least one of said second DNA molecules are present in samples of a second DNA molecule from a pooled colony from axes in each dimension, thereby indicating that the said at least one second DNA molecule is present in said array in the colony designated at the intersection of said axes in each dimension;
in which the first and second DNA molecules that are indicated to be present in said array in a colony designated at the same intersection are indicated to encode interacting protein sequences.
11. The method according to claim 9 which further comprises subjecting said pooled colonies of first DNA molecules to a method for identifying, classifying, or quantifying one or more DNA molecules in a sample, said method comprising:
(a) probing said sample with one or more recognition means, each recognition means causing recognition of a target nucleotide subsequence or a set of target nucleotide subsequences;
(b) generating one or more signals from said sample probed by said recognition means, each generated signal arising from a nucleic acid in said sample and comprising a representation of (i) the identities of effective subsequences, each said effective subsequence being a subsequence comprising a target subsequence, or the identities of sets of effective subsequences, each said set having member effective subsequences each of which comprises a different target subsequence from one of said sets of target sequences, and (ii) the length between occurrences of effective subsequences in said nucleic acid or between one occurrence of one effective subsequence and the end of said nucleic acid; and
(c) searching a nucleotide sequence database to determine sequences that match or the absence of any sequences that match said one or more generated signals, said database comprising a plurality of known nucleotide sequences of nucleic acids that may be present in the sample, a sequence from said database matching a generated signal when the sequence from said database has both (i) the same length between occurrences of effective subsequences or the same length between one occurrence of one effective target subsequence and the end of the sequence as is represented by the generated signal, and (ii) the same effective subsequences as are represented by the generated signal, or effective subsequences that are members of the same sets of effective subsequences as are represented by the generated signal, whereby said one or more nucleic acids in said sample are identified, classified, or quantified.
12. A method of detecting one or more protein-protein interactions comprising
(a) recombinantly expressing within a population of host cells (i) a first population of first fusion proteins, each said first fusion protein comprising a first protein sequence and a DNA binding domain in which the DNA binding domain is the same in each said first fusion protein, and in which said first population of first fusion proteins has a complexity of at least 500; and
(ii) a second population of second fusion proteins, each said second fusion protein comprising a second protein sequence and a transcriptional regulatory domain of a transcriptional regulator, in which the transcriptional regulatory domain is the same in each said second fusion protein, such that a first fusion protein is co-expressed with a second fusion protein in host cells, and wherein said host cells contain at least one nucleotide sequence operably linked to a promoter driven by one or more DNA binding sites recognized by said DNA binding domain such that interaction of a first fusion protein with a second fusion protein results in regulation of transcription of said at least one nucleotide sequence by said regulatory domain, and in which said second population of second fusion proteins has a complexity of at least 500;
wherein a host cell comprises a first fusion protein and a second fusion protein, and the number of host cells in the population is sufficiently large to provide confidence at the level of 50% or greater that every pair wise combination of a first fusion protein and a second fusion protein is represented in the population; and
(b) detecting said regulation of transcription of said at least one nucleotide sequence, thereby detecting cells in which an interaction between a first fusion protein and a second fusion protein has occurred.
13. A method of detecting one or more protein-protein interactions comprising
(a) recombinantly expressing in a first population of yeast cells of a first mating type, a first population of first fusion proteins, each first fusion protein comprising a first protein sequence and a DNA binding domain, in which the DNA binding domain is the same in each said first fusion protein; wherein said first population of yeast cells contains a first nucleotide sequence operably linked to a promoter driven by one or more DNA binding sites recognized by said DNA binding domain such that an interaction of a first fusion protein with a second fusion protein, said second fusion protein comprising a transcriptional activation domain, results in increased transcription of said first nucleotide sequence, and in which said first population of first fusion proteins has a complexity of at least 500;
(b) negatively selecting to reduce the number of those yeast cells expressing said first population of first fusion proteins in which said increased transcription of said first nucleotide sequence occurs in the absence of said second fusion protein;
(c) recombinantly expressing in a second population of yeast cells of a second mating type different from said first mating type, a second population of said second fusion proteins, each second fusion protein comprising a second protein sequence and an activation domain of a transcriptional activator, in which the activation domain is the same in each said second fusion protein, and in which said second population of second fusion proteins has a complexity of at least 500;
(d) mating said first population of yeast cells with said second population of yeast cells to form a third population of diploid yeast cells, wherein a diploid cell comprises a first fusion protein and a second fusion protein, wherein said population of diploid yeast cells contains a second nucleotide sequence operably linked to a promoter driven by a DNA binding site recognized by said DNA binding domain such that an interaction of a first fusion protein with a second fusion protein results in increased transcription of said second nucleotide sequence, in which the first and second nucleotide sequences can be the same or different, and wherein the number of diploid cells in the third population is sufficiently large to provide confidence at the level of 50% or greater that every pair wise combination of a first fusion protein and a second fusion protein is represented in the population; and
(e) detecting said increased transcription of said first and/or second nucleotide sequence, thereby detecting cells in which an interaction between a first fusion protein and a second fusion protein has occurred. - View Dependent Claims (15)
(f) designating each colony in which an interaction between a first fusion protein and a second fusion protein is detected as one point of a multidimensional array in which the intersection of axes in each dimension uniquely identifies a single said colony;
(g) pooling all colonies along a simple axis to form a plurality of pooled colonies;
(h) amplifying from a first aliquot of each pooled colony a plurality of first nucleic acids, each first nucleic acid comprising a sequence encoding said first fusion protein or a portion thereof comprising said first protein sequence;
(i) amplifying from a second aliquot of each pooled colony a plurality of second nucleic acids, each second nucleic acid comprising a sequence encoding said second fusion protein or a portion thereof comprising said second protein sequence;
(j) subjecting said first nucleic acids from each pooled colony to size separation;
(k) subjecting said second nucleic acids from each pooled colony to size separation;
(l) identifying which at least one of said first nucleic acids are present in samples of first nucleic acids from a pooled colony from axes in each dimension, thereby indicating that said at least one first nucleic acid is present in said array in the colony designated at the intersection of said axes in each dimension;
(m) identifying which at least one of said second nucleic acids are present in samples of a second nucleic acid from a pooled colony from axes in each dimension, thereby indicating that the said at least one second nucleic acid is present in said array in the colony designated at the intersection of said axes in each dimension;
in which the first and second nucleic acids that are indicated to be present in said array in a colony designated at the same intersection are indicated to encode interacting protein sequences.
14. A method of detecting one or more protein-protein interactions comprising
(a) introducing into a first population of cells of Saccharomyces cerevisiae a first population of first plasmids, each said first plasmid encoding and capable of expressing in the first population of cells (i) TRP1, and (ii) a first population of first fusion proteins, each said first fusion protein comprising a GAL4 DNA binding domain and a first protein sequence, in which said first population of first fusion proteins has a complexity of at least 500, and in which said first population of cells (i) is of a first mating type selected from the group consisting of a and α, (ii) is mutant in endogenous URA3 and HIS3, (iii) contains functional URA3 coding sequences under the control of a promoter containing GAL4 binding sites, and (iv) contains functional lacZ coding sequences under the control of a promoter containing GAL4 binding sites;
(b) introducing into a second population of cells of Saccharomyces cerevisiae a second population of second plasmids, each said second plasmid encoding and capable of expressing in the second population of cells (i) LEU2, and (ii) a second population of second fusion proteins, each said second fusion protein comprising a GAL4 transcriptional activation domain and a second protein sequence, in which said second population of second fusion proteins has a complexity of at least 500, and in which said second population of cells (i) is of a second mating type different from said first mating type and selected from the group consisting of a and α, (ii) is mutant in endogenous URA3 and HIS3, (iii) contains functional HIS3 coding sequences under the control of a promoter containing GAL4 binding sites, and (iv) contains functional lacZ coding sequences under the control of a promoter containing GAL4 binding sites;
(c) after step (a), incubating said first population of cells in an environment lacking tryptophan and containing 5-fluoroorotic acid;
(d) pooling surviving cells from said first population after step (c);
(e) after step (b), incubating said second population of cells in an environment lacking leucine;
(f) pooling surviving cells from said second population after step (e);
(g) mating the pooled cells from said first population and the pooled cells from said second population by mixing the cells together, applying the cells to a solid medium and incubating the cells, to form diploid cells, wherein a diploid cell comprises a first fusion protein and a second fusion protein and wherein the number of diploid cells is sufficiently large to provide confidence at the level of 50% or greater that every pair wise combination of a first fusion protein and a second fusion protein is represented in the population; and
(h) incubating the diploid cells in an environment lacking uracil, histidine, tryptophan and leucine, to select diploid cells containing a said first plasmid and a said second plasmid and in which transcription of the URA3 and HIS3 coding sequences has been activated, thereby indicating that a first fusion protein has interacted with a second fusion protein within the diploid cell, thereby detecting one or more protein-protein interactions. - View Dependent Claims (16)
(f) designating each colony in which an interaction between a first fusion protein and a second fusion protein is detected as one point of a multidimensional array in which the intersection of axes in each dimension uniquely identifies a single said colony;
(g) pooling all colonies along a simple axis to form a plurality of pooled colonies;
(h) amplifying from a first aliquot of each pooled colony a plurality of first DNA molecules, each first DNA molecule comprising a sequence encoding said first fusion protein or a portion thereof comprising said first protein sequence;
(i) amplifying from a second aliquot of each pooled colony a plurality of second DNA molecules, each second DNA molecule comprising a sequence encoding said second fusion protein or a portion thereof comprising said second protein sequence;
(j) subjecting said first DNA molecules from each pooled colony to size separation;
(k) subjecting said second DNA molecules from each pooled colony to size separation;
(l) identifying which at least one of said first DNA molecules are present in samples of first DNA molecules from a pooled colony from axes in each dimension, thereby indicating that said at least one first DNA molecule is present in said array in the colony designated at the intersection of said axes in each dimension;
(m) identifying which at least one of said second DNA molecules are present in samples of a second DNA molecule from a pooled colony from axes in each dimension, thereby indicating that the said at least one second DNA molecule is present in said array in the colony designated at the intersection of said axes in each dimension;
in which the first and second DNA molecules that are indicated to be present in said array in a colony designated at the same intersection are indicated to encode interacting protein sequences.
Specification