Gene discovery using microarrays
Abstract
The invention relates to methods and systems (e.g., computer systems and computer program products) for identifying and characterizing genes using microarrays. In particular, the invention provides for improved, robust methods for detecting genes through the use of microarrays to analyze the expression state of the genome. Genes which are expressed can be mapped to their respective positions in the genome, and the structure of such genes can be determined.
41 Claims
-
1. A method of identifying the location of exons within the genome of a species of organism comprising:
(a) contacting a sample comprising RNAs or nucleic acids derived therefrom from one or more cells of said species of organism with an array, said array comprising a positionally-addressable ordered array of polynucleotide probes bound to a solid support, said polynucleotide probes comprising a first plurality of at least 100 polynucleotide probes of different, predetermined nucleotide sequences, each said different nucleotide sequence comprising a sequence complementary and hybridizable to a different genomic sequence of the same species of organism, said respective genomic sequences for the probes being found at sequential predetermined sites in said genome of said species of organism, said contacting being under conditions conducive to hybridization between said RNAs or nucleic acids derived therefrom and said probes;
(b) identifying the one or more probes to which hybridization of one or more of said RNAs or nucleic acids derived therefrom occurs; and
(c) identifying said genomic sequences for each said identified probe as the location of an exon within the genome of said species of organism.
Dependent claims: 2 to 39.
(a) said polynucleotide probes further comprise a second plurality of polynucleotide probes comprising a sequence complementary and hybridizable to said first plurality; and
(b) said identifying step comprises using a hybridization signal generated in said contacting step from said second plurality to filter a hybridization signal generated in said contacting step from said first plurality.
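The filtering recited in step (b) above can be illustrated with a minimal sketch. The claim does not specify a filtering rule, so the rule below is an assumption: each first-plurality probe is paired with one second-plurality probe, and a probe is kept only when its signal is at least `min_ratio` times the paired signal. The function name `filter_by_control` and the parameter `min_ratio` are hypothetical.

```python
def filter_by_control(first_signals, second_signals, min_ratio=2.0):
    """Return indices of first-plurality probes that pass the filter.

    Assumed rule (not taken from the patent): a probe passes when its
    signal is positive and at least min_ratio times the signal of its
    paired second-plurality probe.
    """
    if len(first_signals) != len(second_signals):
        raise ValueError("signal lists must be the same length")
    kept = []
    for index, (signal, control) in enumerate(zip(first_signals, second_signals)):
        if signal > 0 and signal >= min_ratio * control:
            kept.append(index)
    return kept


# Example with made-up intensities: only the first probe clears the filter.
passing = filter_by_control([100.0, 50.0, 10.0], [10.0, 40.0, 20.0])
```

A ratio test is only one possible choice; subtracting the paired signal and thresholding the difference would fit the claim language equally well.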
-
32. The method of claim 1, wherein:
(a) said sample comprises RNAs or nucleic acids derived therefrom from (i) a first cell or cells of a first tissue type or subject to a first condition, and (ii) a second cell or cells of a second tissue type different from said first tissue type or subject to a second condition different from said first condition; and
(b) said identifying step comprises comparing a hybridization signal generated in said contacting step from said first cell or cells to a hybridization signal generated in said contacting step from said second cell or cells.
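The comparison in step (b) of claim 32 could be sketched as a per-probe log ratio between the two samples. This is an illustration under stated assumptions, not the patent's procedure: the claim names no statistic, and the names `log2_ratios`, `differential_probes`, `floor`, and `min_abs_log2` are hypothetical.

```python
import math


def log2_ratios(signal_a, signal_b, floor=1.0):
    """Per-probe log2(a / b); values below `floor` are raised to `floor`
    so that zero or negative intensities do not break the logarithm."""
    if len(signal_a) != len(signal_b):
        raise ValueError("signal lists must be the same length")
    return [
        math.log2(max(a, floor) / max(b, floor))
        for a, b in zip(signal_a, signal_b)
    ]


def differential_probes(signal_a, signal_b, min_abs_log2=1.0):
    """Indices of probes whose signal differs between the two samples by
    at least 2**min_abs_log2 fold in either direction (assumed cutoff)."""
    ratios = log2_ratios(signal_a, signal_b)
    return [i for i, r in enumerate(ratios) if abs(r) >= min_abs_log2]


# Example with made-up intensities for two tissue types or conditions.
changed = differential_probes([400.0, 100.0, 50.0], [100.0, 100.0, 200.0])
```

Probes flagged this way hybridize differently in the two tissue types or conditions, which is the kind of evidence the claim uses to identify them.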
-
33. The method of claim 1, wherein said plurality of probes is tiled across a sequence predicted to contain, or known to contain, exons.
-
34. The method of claim 1, wherein said plurality of probes includes known expressed sequence tags (ESTs) or predicted exons.
-
35. The method of claim 1, wherein each of said plurality of probes is complementary and hybridizable to a predicted or known exon.
-
36. The method of claim 1, further comprising a sample comprising a population of cellular RNA or nucleic acid derived therefrom on the surface of said solid support such that said sample is in contact with said polynucleotide probes, under conditions conducive to hybridization between said population and said polynucleotide probes.
-
37. The method of claim 36 wherein said population is labeled.
-
38. The method of claim 36 wherein said population comprises total cellular mRNA or nucleic acid derived therefrom.
-
39. The method of claim 36 wherein said population comprises nucleic acids of at least 10,000 different sequences.
-
40. A computer system for identifying the location of exons within the genome of a species of organism, said computer system comprising:
one or more processor units; and
one or more memory units connected to said one or more processor units, said one or more memory units containing one or more programs which cause said one or more processor units to execute the steps of:
(a) receiving a first data structure comprising a first plurality of measured hybridization signals from an array comprising a positionally-addressable ordered array of polynucleotide probes bound to a solid support, said polynucleotide probes comprising a second plurality of at least 100 polynucleotide probes of different, predetermined nucleotide sequences, each said different nucleotide sequence comprising a sequence complementary and hybridizable to a different genomic sequence of the same species of organism, said respective genomic sequences for the probes being found at sequential predetermined sites in said genome of said species of organism, said array contacting a sample comprising RNAs or nucleic acids derived therefrom from one or more cells of said species of organism with said array, said contacting being under conditions conducive to hybridization between said RNAs or nucleic acids derived therefrom and said probes;
(b) receiving a second data structure comprising the nucleotide sequence of said genome of said organism;
(c) receiving a third data structure comprising the nucleotide sequence of said second plurality of polynucleotide probes, said third data structure identifying the positional location of each said probe on said array;
(d) identifying the one or more probes to which hybridization of one or more of said RNAs or nucleic acids derived therefrom occurs;
(e) identifying said genomic sequences for each said identified probe as the location of an exon within the genome of said species of organism; and
(f) outputting the locations of said exons with respect to the nucleotide sequence of said genome of said organism.
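Steps (a) through (f) above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the claimed system: the three data structures are modeled as a dict of intensities keyed by array position, a genome string, and a list of `Probe` records; hybridization is called with a fixed intensity `threshold`; and each probe is mapped to the genome by exact string match of its reverse complement. All names (`Probe`, `reverse_complement`, `locate_exons`, `threshold`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    sequence: str    # probe sequence, written 5' to 3'
    position: tuple  # (row, column) of the probe on the array


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence over the alphabet ACGT."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_exons(signals, genome, probes, threshold):
    """Return sorted (start, end) genome intervals for hybridized probes.

    signals: first data structure, array position -> measured intensity
    genome:  second data structure, the genome sequence as a string
    probes:  third data structure, probe sequences with array positions
    Coordinates are 0-based and end-exclusive.
    """
    hits = []
    for probe in probes:
        # Step (d): keep probes whose signal reaches the assumed threshold.
        if signals.get(probe.position, 0.0) < threshold:
            continue
        # Step (e): the probe is complementary to its genomic target, so
        # search for the reverse complement (first exact match only).
        target = reverse_complement(probe.sequence)
        start = genome.find(target)
        if start != -1:
            hits.append((start, start + len(target)))
    # Step (f): report locations relative to the genome sequence.
    return sorted(hits)


# Example with a toy genome and two probes; only the first is hybridized.
exons = locate_exons(
    {(0, 0): 900.0, (0, 1): 50.0},
    "TTGACCATGGCAATCG",
    [Probe("GGTC", (0, 0)), Probe("ATTG", (0, 1))],
    threshold=200.0,
)
```

Exact matching on one strand keeps the sketch short; a real implementation would more likely store each probe's genomic coordinates at design time, since the claim states that the probe sites are predetermined.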
-
41. A computer program product for identifying the location of exons within the genome of a species of organism, the computer program product for use in conjunction with a computer having a memory and a processor, the computer program product comprising a computer readable storage medium having a computer program mechanism encoded thereon, wherein said computer program mechanism can be loaded into the one or more memory units of a computer and cause the one or more processor units of the computer to execute the steps of:
(a) receiving a first data structure comprising a first plurality of measured hybridization signals from an array comprising a positionally-addressable ordered array of polynucleotide probes bound to a solid support, said polynucleotide probes comprising a second plurality of at least 100 polynucleotide probes of different, predetermined nucleotide sequences, each said different nucleotide sequence comprising a sequence complementary and hybridizable to a different genomic sequence of the same species of organism, said respective genomic sequences for the probes being found at sequential predetermined sites in said genome of said species of organism, said array contacting a sample comprising RNAs or nucleic acids derived therefrom from one or more cells of said species of organism with said array, said contacting being under conditions conducive to hybridization between said RNAs or nucleic acids derived therefrom and said probes;
(b) receiving a second data structure comprising the nucleotide sequence of said genome of said organism;
(c) receiving a third data structure comprising the nucleotide sequence of said second plurality of polynucleotide probes, said third data structure identifying the positional location of each said probe on said array;
(d) identifying the one or more probes to which hybridization of one or more of said RNAs or nucleic acids derived therefrom occurs;
(e) identifying said genomic sequences for each said identified probe as the location of an exon within the genome of said species of organism; and
(f) outputting the locations of exon boundaries with respect to the nucleotide sequence of said genome of said organism.
-
Specification