Compositions and methods for identification of a duplicate sequencing read
First Claim
1. A kit comprising:
in a suitable container;
a plurality of synthetic forward adaptors, wherein each forward adaptor is at least partially double-stranded and comprises:
(i) an indexing primer binding site;
(ii) an indexing site;
(iii) an identifier site consisting of between 3 and 8 nucleotides, wherein each identifier site has a sequence that differs from every other identifier site sequence in the kit at at least three nucleotide positions; and
(iv) a target sequencing primer binding site comprising SEQ ID NO: 3;
wherein each indexing site is unique amongst a subset of the adaptors and is an index for multiple polynucleotides, thus allowing for samples to be pooled together for a multiplexed sequencing run; and
wherein the sequence of the identifier site is variable in sequence content in the plurality of adaptors and is used to identify duplicate sequence reads in the multiplexed sequencing run.
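The identifier-site constraint in element (iii) can be read as a minimum pairwise Hamming distance of three between identifier sequences. The following is a minimal sketch of such a check, not the patentee's method. It assumes all identifier sites in the set share one length (the claim does not say so), and the function names `hamming` and `valid_identifier_set` are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # Assumption: identifiers are compared position by position,
        # so unequal lengths are rejected rather than aligned.
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def valid_identifier_set(identifiers, min_len=3, max_len=8, min_distance=3):
    """Return True if every identifier is 3 to 8 nt long and every pair
    of identifiers differs at min_distance or more positions."""
    ids = list(identifiers)
    if any(not (min_len <= len(s) <= max_len) for s in ids):
        return False
    return all(hamming(a, b) >= min_distance for a, b in combinations(ids, 2))


# Illustrative sequences only; they are not taken from the patent.
print(valid_identifier_set(["AAAAAA", "CCCAAA", "AAACCC", "CCCCCC"]))
print(valid_identifier_set(["AAAAAA", "AAAACC"]))
```

With a minimum distance of three, a single sequencing error in an identifier site cannot turn one valid identifier into another, which is one plausible reason for the constraint.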
Abstract
The present invention provides methods, compositions and kits for detecting duplicate sequencing reads. In some embodiments, the duplicate sequencing reads are removed.
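As an illustration of how duplicate reads might be detected, the sketch below flags a read as a duplicate when an earlier read has the same indexing site, the same identifier site and the same mapped position. This is an assumption made for the example: the abstract does not specify the comparison, and the field names `index`, `identifier` and `position` and the function `mark_duplicates` are hypothetical.

```python
def mark_duplicates(reads):
    """Return one boolean per read, True where the read repeats an
    earlier (index, position, identifier) combination.

    Assumption: each read is a dict with the hypothetical keys
    'index' (sample index sequence), 'identifier' (identifier-site
    sequence) and 'position' (mapped start coordinate).
    """
    seen = set()
    flags = []
    for read in reads:
        key = (read["index"], read["position"], read["identifier"])
        flags.append(key in seen)  # duplicate if this key was seen before
        seen.add(key)
    return flags


# Illustrative data only.
reads = [
    {"index": "ACGT", "identifier": "AAAAAA", "position": 100},
    {"index": "ACGT", "identifier": "AAAAAA", "position": 100},  # repeat
    {"index": "ACGT", "identifier": "CCCAAA", "position": 100},  # new identifier
    {"index": "TGCA", "identifier": "AAAAAA", "position": 100},  # new sample
]
flags = mark_duplicates(reads)
unique_reads = [r for r, dup in zip(reads, flags) if not dup]
print(flags)
```

Under this reading, two reads at the same position with different identifier sites are kept as independent molecules, and only exact repeats of all three fields are removed.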
14 Claims
1. A kit comprising: (independent claim, recited in full under First Claim above)
Dependent claims: 2 to 14
Specification