Efficient shotgun sequencing methods
Abstract
Methods are provided for efficient shotgun sequencing to allow efficient selection and sequencing of nucleic acids of interest contained in a library. The nucleic acids of interest can be defined at any time before or after preparation of the library. One example of nucleic acids of interest is missing or low confidence genome sequences resulting from an initial sequencing procedure. Other nucleic acids of interest include subsets of genomic DNA, RNA or cDNAs (exons, genes, gene sets, transcriptomes). By designing an efficient (simple to implement, speedy, high specificity, low cost) selection procedure, a more complete sequence is achieved with less effort than by using highly redundant shotgun sequencing in an initial sequencing procedure.
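The "missing or low confidence" sequences mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the initial sequencing procedure yields a per-base read depth for the primary assembly, and that a region counts as missing when its depth is zero and as low confidence when its depth falls below a chosen threshold. The function name `find_target_regions` and the parameter `min_depth` are hypothetical and do not come from the patent, which does not prescribe any particular criterion.

```python
from typing import List, Tuple


def find_target_regions(depth: List[int], min_depth: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals whose read depth is below min_depth.

    Assumption: `depth[i]` is the number of reads covering base i of the
    primary assembly. min_depth=1 flags uncovered (missing) bases; a larger
    value also flags thinly covered (low confidence) bases.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # start index of the region currently being extended, if any
    for i, d in enumerate(depth):
        if d < min_depth:
            if start is None:
                start = i  # open a new region
        elif start is not None:
            regions.append((start, i))  # close the region at the first good base
            start = None
    if start is not None:
        regions.append((start, len(depth)))  # region runs to the end
    return regions


if __name__ == "__main__":
    depth = [5, 6, 0, 0, 4, 1, 1, 7, 0]
    print(find_target_regions(depth, min_depth=1))  # missing bases only
    print(find_target_regions(depth, min_depth=3))  # missing plus low depth
```

The intervals returned here would be the input to a later selection step; a real pipeline would likely combine depth with base quality or consensus agreement rather than depth alone.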
34 Claims
1. A method of determining sequence of a target nucleic acid comprising:
(a) sequencing the target nucleic acid to produce primary sequence information for the target nucleic acid;
(b) identifying missing sequences and/or low confidence sequences in the target nucleic acid from the primary sequence information determined in step (a);
(c) synthesizing a plurality of target-specific oligonucleotides, wherein each of said plurality of oligonucleotides corresponds to at least one of the sequences identified in step (b);
(d) selecting, from a library of fragments of the target nucleic acid, fragments that hybridize with the target-specific oligonucleotides synthesized in step (c);
(e) sequencing fragments selected in step (d) to produce sequence information for the selected fragments; and
(f) assembling sequence information for the selected fragments determined in step (e) with the primary sequence information determined in step (a) to produce an assembled sequence, thereby determining sequence of the target nucleic acid.
Dependent claims: 2-25
26. A method for sequencing a target nucleic acid, comprising:
(a) obtaining nucleotide sequence information for at least a portion of the target nucleic acid;
(b) identifying missing sequences and/or low confidence sequences in the target nucleic acid from the nucleotide sequence information obtained in step (a);
(c) enriching fragments of the target nucleic acid from a fragment library according to whether they correspond to a sequence of interest identified in step (b);
(d) obtaining nucleotide sequence information for the fragments enriched in step (c); and
(e) assembling nucleotide sequence information determined in step (d) and step (a).
Dependent claims: 27-31
32. An improved method for sequencing a human genome, wherein the method comprises preparing overlapping fragments of the genome, obtaining multiple sequence reads for said overlapping fragments of the genome, and assembling the reads into assembled sequence information, the improvement comprising:
(a) assembling sequence reads from fragments of the genome to obtain a primary assembly;
(b) identifying missing sequences, low confidence sequences, and/or sequences that differ between the primary assembly and a reference sequence in said human genome from the primary assembly;
(c) synthesizing a plurality of target-specific oligonucleotides, each of which corresponds to a sequence identified in step (b);
(d) selecting fragments from a library of fragments of the target nucleic acid that hybridize with the oligonucleotides synthesized in step (c);
(e) obtaining sequence reads for the fragments selected in step (d); and
(f) assembling sequence reads obtained in step (e) with the primary assembly, thereby obtaining more complete sequence information.
33. A computer controlled apparatus configured and programmed for sequencing a genome of a human organism according to a method that comprises the following steps:
(a) assembling sequence reads from fragments of the genome to obtain a primary assembly;
(b) identifying missing sequences and/or low confidence sequences in said human genome from the primary assembly;
(c) synthesizing a plurality of target-specific oligonucleotides, each of which corresponds to a sequence identified in step (b);
(d) selecting fragments from a library of fragments of the target nucleic acid that hybridize with the oligonucleotides synthesized in step (c);
(e) obtaining sequence reads for the fragments selected in step (d); and
(f) assembling sequence reads obtained in step (e) with the primary assembly, thereby obtaining more complete sequence information.
Dependent claims: 34
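The oligonucleotide design and fragment selection steps recited in the claims above, (c) and (d), can be sketched in software as a toy model. The sketch makes several assumptions that are not taken from the claims: each oligonucleotide is derived from the known sequence immediately upstream of an identified region, hybridization is approximated by an exact reverse-complement match, and the names `design_oligos`, `select_fragments` and the parameter `length` are hypothetical. Real hybridization depends on melting temperature, mismatches and reaction conditions, none of which is modeled here.

```python
from typing import List, Tuple

# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_oligos(assembly: str, regions: List[Tuple[int, int]], length: int = 8) -> List[str]:
    """Design one capture oligo per region from the upstream flank.

    Assumption: each region is a half-open (start, end) interval in the
    primary assembly, and the `length` bases just before `start` are known.
    Regions without enough upstream sequence are skipped.
    """
    oligos = []
    for start, _end in regions:
        if start >= length:
            flank = assembly[start - length:start]
            oligos.append(reverse_complement(flank))  # oligo anneals to the flank
    return oligos


def select_fragments(library: List[str], oligos: List[str]) -> List[str]:
    """Keep fragments containing the exact complement target of any oligo."""
    targets = [reverse_complement(o) for o in oligos]
    return [frag for frag in library if any(t in frag for t in targets)]


if __name__ == "__main__":
    assembly = "ACGTTGCAAGGCNNNNNNTTAGC"  # N marks a missing stretch
    oligos = design_oligos(assembly, [(12, 18)], length=8)
    library = ["TGCAAGGCTTACGA", "GGGGCCCCAAAA", "CCTGCAAGGCAT"]
    print(oligos)
    print(select_fragments(library, oligos))
```

Fragments kept by `select_fragments` extend from the known flank into the unresolved region, so sequencing them and assembling the reads with the primary assembly corresponds to steps (e) and (f).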
Specification