Method of sequencing genomes by hybridization of oligonucleotide probes
First Claim
1. A plurality of oligonucleotide probes each having a predetermined sequence and the same predetermined length n, each probe having a contiguous overlapping common sequence of length n-1 with at least one other probe of the plurality, wherein, when the probes are hybridized to a nucleic acid, the probe sequences or a subset of the probe sequences determine a sequence of the nucleic acid that is longer than n.
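The overlap principle recited in this claim can be illustrated with a minimal sketch. The function names `spectrum` and `assemble` are hypothetical and do not appear in the patent. The sketch assumes an idealized case: every probe of length n that hybridizes is detected without error, and every overlap of length n-1 is unique, so the chain of probes never branches.

```python
def spectrum(target, n):
    """Return the set of all length-n substrings of target (the 'positive' probes)."""
    return {target[i:i + n] for i in range(len(target) - n + 1)}


def assemble(probes):
    """Chain equal-length probes through their (n-1)-base overlaps.

    Assumption: each (n-1)-base prefix is carried by exactly one probe,
    so each probe has at most one possible successor.
    """
    probes = set(probes)
    if not probes:
        raise ValueError("no probes given")
    n = len(next(iter(probes)))
    by_prefix = {}
    for p in probes:
        if len(p) != n:
            raise ValueError("all probes must have the same length n")
        if p[:-1] in by_prefix:
            raise ValueError("ambiguous overlap: two probes share a prefix")
        by_prefix[p[:-1]] = p

    # The first probe is the one whose prefix is not the suffix of any probe.
    suffixes = {p[1:] for p in probes}
    starts = [p for p in probes if p[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting probe")

    current = starts[0]
    seq = current
    seen = {current}
    while True:
        nxt = by_prefix.get(current[1:])
        if nxt is None or nxt in seen:
            break
        seq += nxt[-1]  # each overlapping probe extends the read by one base
        seen.add(nxt)
        current = nxt
    if len(seen) != len(probes):
        raise ValueError("probes do not form a single chain")
    return seq


if __name__ == "__main__":
    probes = spectrum("ATGCGTAC", 4)
    print(assemble(probes))
```

Five probes of length 4 suffice here to determine an 8-base sequence, that is, a sequence longer than n. Real hybridization data would contain repeats and false signals, which this sketch deliberately rejects rather than resolves.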
Abstract
The conditions under which oligonucleotides hybridize only with entirely homologous sequences are recognized. The sequence of a given DNA fragment is read by hybridization and by assembly of the positively hybridizing probes through their overlapping portions. By simultaneous hybridization of DNA molecules, applied as dots and bound onto a filter and representing single-stranded phage vector with the cloned insert, with about 50,000 to 100,000 groups of probes, the main type of which is (A,T,C,G)(A,T,C,G)N8(A,T,C,G), the information for computer determination of a DNA sequence having the complexity of a mammalian genome is obtained in one step. To obtain a maximally completed sequence, three libraries cloned into the M13 bacteriophage vector are used: one with 0.5 kb inserts, one with 7 kb inserts, and one with inserts consisting of two sequences separated by an average distance of 100 kb in genomic DNA (jumping subclones). For a million bp of genomic DNA, 25,000 subclones of 0.5 kb are required, as well as 700 subclones of 7 kb and 170 jumping subclones. Subclones of 0.5 kb are applied on a filter in groups of 20 each, so that the total number of samples is 2,120 per million bp. The process can be easily and entirely robotized for factory reading of complex genomic fragments or DNA molecules.
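The sample count quoted in the abstract can be checked with a short calculation. The function name and parameter names below are hypothetical. The sketch assumes that only the 0.5 kb subclones are pooled (20 per dot) and that each 7 kb and jumping subclone is applied as its own sample, which is the reading under which the quoted total is reproduced.

```python
import math


def samples_per_megabase(short_subclones=25_000, pool_size=20,
                         long_subclones=700, jumping_subclones=170):
    """Number of filter samples needed per million bp of genomic DNA.

    Assumption: short (0.5 kb) subclones are pooled pool_size per sample;
    long (7 kb) and jumping subclones are one sample each.
    """
    pooled_samples = math.ceil(short_subclones / pool_size)
    return pooled_samples + long_subclones + jumping_subclones


if __name__ == "__main__":
    # 25,000 / 20 = 1,250 pooled samples, plus 700 and 170 individual samples
    print(samples_per_megabase())  # 2120
```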
217 Citations
18 Claims
- 1. A plurality of oligonucleotide probes each having a predetermined sequence and the same predetermined length n, each probe having a contiguous overlapping common sequence of length n-1 with at least one other probe of the plurality, wherein, when the probes are hybridized to a nucleic acid, the probe sequences or a subset of the probe sequences determine a sequence of the nucleic acid that is longer than n.
- 9. A plurality of oligonucleotide probes, each having a predetermined sequence and a predetermined length, each probe having a contiguous overlapping common sequence with at least one other probe in the plurality of oligonucleotide probes, wherein at least a subset of said plurality of probes hybridizes to a completely complementary region of a target nucleic acid, and when the contiguous overlapping common sequences of said oligonucleotide probes that hybridize to said target nucleic acid are overlapped, said subset of oligonucleotide probes defines a contiguous stretch of the sequence of the target nucleic acid.
- 11. A plurality of oligonucleotide probes, each having a predetermined sequence and the same predetermined length n, each probe having a contiguous overlapping common sequence with at least one other probe of the plurality of oligonucleotide probes, wherein, when the probes are hybridized to a target nucleic acid that includes an insert of a subclone, the probe sequences or a subset of the probe sequences determine a terminal sequence of said insert.
- 16. A set of oligonucleotide probes comprising 65,356 subsets of probes, wherein each subset is composed of the complete set of 64 probes having the formula NN(N8)N, wherein each N is independently selected from the group consisting of A, T, C and G and N8 is the same octanucleotide, wherein the octanucleotide of each subset is unique, and wherein, when the probes are hybridized to a nucleic acid, the probe sequences or a subset of the probe sequences determine a sequence of the nucleic acid.
- 17. A set of oligonucleotide probes comprising three groups of subsets of probes, wherein, when the probes are hybridized to a nucleic acid, the probe sequences or a subset of the probe sequences determine a sequence of the nucleic acid and wherein:
  - (i) the first group comprises 1024 first subsets, wherein each first subset is composed of the complete set of 4 probes having the formula N(N10), wherein N is selected from the group consisting of A, T, C and G and N10 is the same decanucleotide having a G+C content of zero, and wherein the decanucleotide of each first subset is unique;
  - (ii) the second group comprises 23044 second subsets, wherein each second subset is composed of the complete set of 16 probes having the formula N(N9)N, wherein each N is independently selected from the group consisting of A, T, C and G and N9 is the same nonanucleotide having a G+C content of 1 or 2, and wherein the nonanucleotide of each second subset is unique; and
  - (iii) the third group comprises 56064 third subsets, wherein each third subset is composed of the complete set of 64 probes having the formula NN(N8)N, wherein each N is independently selected from the group consisting of A, T, C and G and N8 is the same octanucleotide having a G+C content of at least 3, and wherein the octanucleotide of each third subset is unique.
- 18. A plurality of oligonucleotide probes having the same predetermined length n, wherein:
  - (i) n is an integer from 11 to 20;
  - (ii) the probes are divided into a plurality of groups, each group defined by a different predetermined subsequence of fixed length m, wherein m is an integer from 5 to n, the subsequence starting at a predetermined position within the oligonucleotide probes;
  - (iii) each group including all possible combinations of oligonucleotide probes of length n having the same subsequence of length m beginning at the predetermined position within the oligonucleotide probes; and
  - (iv) each subsequence including a contiguous overlapping common sequence of length m-1 with a subsequence of another oligonucleotide probe in the plurality of oligonucleotide probes;
  wherein, when the oligonucleotide probes are hybridized to a nucleic acid, the subsequences of the oligonucleotide probes or a subset of the probe sequences that hybridize to the nucleic acid determine a contiguous sequence, longer than m, of the nucleic acid.
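The subset sizes recited in claims 16 and 17 follow from counting core sequences by their G+C content, and can be tallied with a short sketch. The function names are hypothetical. The sketch assumes that "G+C content" means the number of G or C bases in the core sequence and that every qualifying core is included. Under that reading the counts come to 1,024 decanucleotides, 23,040 nonanucleotides, and 56,064 octanucleotides, with 65,536 octanucleotides in total, so the first and third figures match the claims exactly while the other two differ slightly from the recited 23044 and 65,356.

```python
from math import comb


def cores_with_gc(length, gc):
    """Count sequences over {A, T, C, G} of the given length with exactly gc G/C bases.

    Choose the gc positions, then 2 options (G or C) at each of them
    and 2 options (A or T) at each remaining position.
    """
    return comb(length, gc) * 2**gc * 2**(length - gc)


def cores_in_gc_range(length, gc_min, gc_max):
    """Count cores whose G+C count lies in [gc_min, gc_max]."""
    return sum(cores_with_gc(length, k) for k in range(gc_min, gc_max + 1))


if __name__ == "__main__":
    print(cores_in_gc_range(10, 0, 0))  # decanucleotides with no G or C
    print(cores_in_gc_range(9, 1, 2))   # nonanucleotides with 1 or 2 G/C
    print(cores_in_gc_range(8, 3, 8))   # octanucleotides with at least 3 G/C
    print(4**8)                         # all octanucleotides
    # Probes per subset: 4 for N(N10), 16 for N(N9)N, 64 for NN(N8)N
    print(4**1, 4**2, 4**3)
```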
Specification