SEQUENCE TAG DIRECTED SUBASSEMBLY OF SHORT SEQUENCING READS INTO LONG SEQUENCING READS
Abstract
The invention provides compositions and methods for preparing DNA sequencing libraries. In particular, the method relates to preparing DNA sequencing libraries from kilobase scale nucleic acids. The invention also provides methods for assembling short read sequencing data into longer contiguous sequences. The method is useful for various applications in genomics, including genome assembly, full length cDNA sequencing, metagenomics, and the analysis of repetitive sequences of assembled genomes.
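The tag-directed grouping that the title and abstract refer to can be illustrated with a minimal Python sketch. It assumes each short read has already been paired with the sequence tag observed alongside it; the `(tag, read)` input format and the function name `group_reads_by_tag` are hypothetical and are not taken from the patent. Reads that share a tag are collected into one group, which a separate assembler could then join into a longer contiguous sequence.

```python
from collections import defaultdict


def group_reads_by_tag(read_pairs):
    """Group short reads by the sequence tag observed alongside them.

    read_pairs: iterable of (tag, read) string tuples (hypothetical format).
    Returns a dict mapping each tag to the list of reads that carry it.
    """
    groups = defaultdict(list)
    for tag, read in read_pairs:
        groups[tag].append(read)
    return dict(groups)


# Made-up example data: two reads share the tag "ACGTAC".
pairs = [
    ("ACGTAC", "TTGACCA"),
    ("GGTCAA", "CCATGGA"),
    ("ACGTAC", "ACCAGGT"),
]
groups = group_reads_by_tag(pairs)
```

Exact string matching on the tag is a simplification; real data would likely need some tolerance for sequencing errors within the tag.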
28 Claims
1. A method for preparing a DNA sequencing library, comprising:
(a) circularizing a target fragment library with a plurality of adaptor molecules to produce a population of circularized double-stranded DNA molecules, wherein the plurality of adaptor molecules comprises a first defined sequence P1, a degenerate sequence tag, and a second defined sequence P2, such that at least one circularized double-stranded DNA molecule comprises a non-degenerate sequence tag and a member of the target fragment library;
(b) amplifying the population of circularized double-stranded DNA molecules to produce a plurality of copies of each circularized double-stranded DNA molecule, wherein the copies of each circularized double-stranded DNA molecule comprise the same non-degenerate sequence tag;
(c) fragmenting the plurality of copies of each circularized double-stranded DNA molecule to produce a plurality of linear double-stranded DNA molecules, wherein the plurality of linear double-stranded DNA molecules may be the same or different, and at least one of the plurality of linear double-stranded DNA molecules contains the non-degenerate sequence tag present in the plurality of copies of each circularized double-stranded DNA molecule;
(d) adding a third defined sequence P3 to at least one of a first end and a second end of at least one of the plurality of linear double-stranded DNA molecules from step (c); and
(e) amplifying a region of at least one of the plurality of linear double-stranded DNA molecules to produce a plurality of amplicons, wherein at least one amplicon comprises the non-degenerate sequence tag and sequence complementary to a portion of a single member of the target fragment library.
Dependent claims: 2, 3, 4, 5, 6.
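Steps (a) through (e) of claim 1 can be mimicked in silico with a toy Python model. This is a sketch under stated assumptions, not the claimed laboratory procedure: the P1, P2, and P3 strings are hypothetical placeholders, a circular molecule is represented as a string whose last base joins its first, fragmentation is reduced to a single break per copy, and strand complementarity is ignored. All function names are illustrative.

```python
import random

# Hypothetical placeholder sequences; the claims do not specify P1, P2, or P3.
P1, P2, P3 = "AAAACCCC", "GGGGTTTT", "CACACACA"


def circularize(fragment, rng, tag_length=8):
    """Step (a): join a fragment to an adaptor P1-tag-P2.

    The returned string is read as circular (its last base joins its first).
    """
    tag = "".join(rng.choice("ACGT") for _ in range(tag_length))
    return tag, P1 + tag + P2 + fragment


def amplify(circle, copies):
    """Step (b): every copy of a circle carries the same tag."""
    return [circle] * copies


def break_circle(circle, cut):
    """Step (c), simplified to a single break that linearizes the circle."""
    return circle[cut:] + circle[:cut]


def tagged_amplicon(linear, tag):
    """Steps (d) and (e), simplified.

    P3 is appended to one end, and the span from the tag through P3 is kept.
    Returns None when the break destroyed the tag-P2 junction.
    """
    with_p3 = linear + P3
    start = with_p3.find(tag + P2)
    return None if start < 0 else with_p3[start:]


rng = random.Random(0)
fragment = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # made-up target fragment
tag, circle = circularize(fragment, rng)
adaptor_length = len(P1) + len(tag) + len(P2)

amplicons = []
for copy in amplify(circle, 5):
    cut = rng.randrange(adaptor_length, len(circle))  # break inside the fragment
    amplicons.append(tagged_amplicon(break_circle(copy, cut), tag))
```

In this model every amplicon begins with the same tag and ends at a different break point, so each one pairs the tag with a different portion of the same target fragment.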
7. A method for preparing a DNA sequencing library, comprising:
(a) circularizing a target fragment library with a plurality of adaptor molecules to produce a population of first circularized double-stranded DNA molecules, wherein the plurality of adaptor molecules comprises a first defined sequence P1 comprising a first restriction enzyme recognition site R1, a degenerate sequence tag, and a second defined sequence P2 comprising a second restriction enzyme recognition site R2, such that at least one of the first circularized double-stranded DNA molecule comprises a non-degenerate sequence tag and a member of the target fragment library;
(b) amplifying the population of first circularized double-stranded DNA molecules to produce a plurality of copies of each first circularized double-stranded DNA molecule, wherein the copies of each first circularized double-stranded DNA molecule comprise the same non-degenerate sequence tag;
(c) fragmenting the plurality of copies of each first circularized double-stranded DNA molecule to produce a plurality of first linear double-stranded DNA molecules, wherein the plurality of first linear double-stranded DNA molecules may be the same or different, and at least one of the plurality of first linear double-stranded DNA molecules contains the non-degenerate sequence tag present in the plurality of copies of each first circularized double-stranded DNA molecule;
(d) adding a third defined sequence P3 to at least one of a first end and a second end of at least one of the plurality of first linear double-stranded DNA molecules from step (c);
(e) digesting at least one of the first linear double-stranded DNA molecules from step (d) with restriction enzyme R1, thereby producing an R1 digested double-stranded DNA molecule;
(f) circularizing the R1 digested double-stranded DNA molecule with a first bridging oligonucleotide B1 to generate a second circularized double-stranded DNA molecule;
(g) amplifying the second circularized double-stranded DNA molecule of step (f) to produce a plurality of copies of the second circularized double-stranded DNA molecule;
(h) fragmenting the plurality of copies of the second circularized double-stranded DNA molecule to produce a plurality of second linear double-stranded DNA molecules, wherein at least one of the plurality of second linear double-stranded DNA molecules contains the non-degenerate sequence tag present in the plurality of copies of the second circularized double-stranded DNA molecule;
(i) adding a fourth defined sequence P4 to at least one of a first end and a second end of at least one of the plurality of second linear double-stranded DNA molecules; and
(j) amplifying a region of at least one of the plurality of second linear double-stranded DNA molecules to produce a plurality of amplicons, wherein each amplicon comprises the non-degenerate sequence tag and sequence complementary to a portion of a single member of the target fragment library.
Dependent claims: 8, 9, 10, 11, 12.
13. A method for preparing a DNA sequencing library, comprising:
(a) providing a population of circular double-stranded DNA molecules; wherein each circular double-stranded DNA molecule comprises a vector sequence and a sequence of interest, the sequence of interest having a first end joined to a first end of the vector sequence, an internal portion, and a second end joined to a second end of the vector sequence;
(b) fragmenting a portion of the population of circular double-stranded DNA molecules to produce a plurality of linear double-stranded DNA molecules;
(c) adding a common adaptor sequence to at least one end of at least one of the plurality of linear double-stranded DNA molecules; and
(d) amplifying a region of at least one of the plurality of linear double-stranded DNA molecules to produce a plurality of amplicons, wherein at least one amplicon comprises sequence complementary to the sequence of interest.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24.
25. A method for preparing a DNA sequencing library, comprising:
(a) incorporating at least one first nucleic acid adaptor molecule into at least one member of a target library comprising a plurality of nucleic acid molecules, wherein at least a portion of the first adaptor molecule comprises a first defined sequence;
(b) amplifying the plurality of nucleic acid molecules to produce an input library comprising a first plurality of amplified DNA molecules, wherein the amplified molecules comprise sequence identical to or complementary to at least a portion of the first adaptor molecule and sequence identical to or complementary to at least a portion of at least one member of the target library;
(c) fragmenting the input library to produce a plurality of linear DNA fragments having a first end and a second end;
(d) attaching at least one second nucleic acid adaptor molecule to one or both ends of at least one of the plurality of linear DNA fragments, wherein at least a portion of the second adaptor molecule comprises a second defined sequence;
(e) amplifying the plurality of linear DNA fragments to produce a sequencing library comprising a second plurality of amplified DNA molecules, wherein at least one of the plurality of amplified DNA molecules comprises sequence identical to or complementary to at least a portion of the first adaptor molecule, sequence identical to or complementary to at least a portion of the second adaptor molecule, and sequence identical to or complementary to at least a portion of a member of the target library.
26. A kit for preparing a DNA sequencing library, comprising a mixture of double-stranded, partially degenerate adaptor molecules, wherein each adaptor molecule comprises a first defined sequence P1, a sequence tag that is fully or partially degenerate within the mixture of adaptor molecules, and a second defined sequence P2, wherein the degenerate sequence tag comprises from 5 to 50 randomly selected nucleotides.
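The tag length range recited in claim 26 (5 to 50 nucleotides) bounds how many distinct tags a mixture can contain. The Python sketch below works through that arithmetic, assuming a fully degenerate tag whose four bases are drawn uniformly and independently at every position; that assumption, the standard birthday approximation, and the function names are illustrative and are not stated in the claims.

```python
import math


def tag_space(length):
    """Number of distinct tags of the given length over the alphabet A, C, G, T."""
    return 4 ** length


def collision_probability(num_molecules, length):
    """Approximate chance that at least two molecules share a tag.

    Uses the birthday approximation 1 - exp(-m(m-1) / (2N)), where m is the
    number of tagged molecules and N is the number of possible tags.
    """
    n = tag_space(length)
    exponent = num_molecules * (num_molecules - 1) / (2.0 * n)
    return -math.expm1(-exponent)  # equals 1 - exp(-exponent), stable for tiny values


shortest = tag_space(5)    # lower bound of the claimed range
longest = tag_space(50)    # upper bound of the claimed range
p_shared = collision_probability(1000, 20)
```

Under these assumptions a 5-nucleotide tag allows 1,024 distinct sequences, and longer tags make it correspondingly less likely that two different library members receive the same tag.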
27. A kit for preparing a DNA sequencing library, comprising a cloning vector comprising restriction enzyme recognition sites oriented such that digestion using the cognate restriction enzymes results in digestion of an insert sequence cloned into the cloning vector, thereby producing an end portion of the insert sequence that remains attached to the vector sequence after digestion.
28. A kit for preparing a DNA sequencing library, comprising at least one of a plurality of first nucleic acid adaptor molecules, and at least one of a plurality of second nucleic acid adaptor molecules.
Specification