SAFE SEQUENCING SYSTEM

US 20140227705A1
Filed: 04/12/2012
Published: 08/14/2014
Est. Priority Date: 04/15/2011
Status: Active Grant


3 Assignments

0 Petitions

Abstract

The identification of mutations that are present in a small fraction of DNA templates is essential for progress in several areas of biomedical research. Though massively parallel sequencing instruments are in principle well-suited to this task, the error rates in such instruments are generally too high to allow confident identification of rare variants. We here describe an approach that can substantially increase the sensitivity of massively parallel sequencing instruments for this purpose. One example of this approach, called “Safe-SeqS” (for Safe-Sequencing System), includes (i) assignment of a unique identifier (UID) to each template molecule; (ii) amplification of each uniquely tagged template molecule to create UID-families; and (iii) redundant sequencing of the amplification products. PCR fragments with the same UID are considered truly mutant (“super-mutants”) if ≥95% of them contain the identical mutation. We illustrate the utility of this approach for determining the fidelity of a polymerase, the accuracy of oligonucleotides synthesized in vitro, and the prevalence of mutations in the nuclear and mitochondrial genomes of normal cells.
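The three steps in the abstract lend themselves to a short illustration. The following is a minimal sketch, not the applicants' implementation: it assumes reads arrive as hypothetical `(uid, sequence)` pairs that are already trimmed to the amplicon, and it adds a `min_family_size` cutoff that the abstract does not specify. The function name and parameters are illustrative only.

```python
from collections import Counter, defaultdict


def call_super_mutants(reads, reference, min_family_size=2, min_fraction=0.95):
    """Group reads into UID families and report families that look truly mutant.

    reads: iterable of (uid, sequence) pairs (hypothetical input format).
    reference: the expected wild-type sequence of the amplicon.
    min_family_size: assumption, not from the abstract; a family of one read
        cannot be checked redundantly.
    min_fraction: fraction of family members that must share the identical
        sequence (the abstract uses 95%).
    """
    families = defaultdict(list)
    for uid, seq in reads:
        families[uid].append(seq)

    calls = {}
    for uid, members in families.items():
        if len(members) < min_family_size:
            continue
        # Most frequent sequence in the family and how many members carry it.
        seq, count = Counter(members).most_common(1)[0]
        if seq != reference and count / len(members) >= min_fraction:
            calls[uid] = seq
    return calls
```

A family in which a variant appears in only some members is not reported, which is how sequencing and late-cycle PCR errors are filtered out in this sketch.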

233 Citations

50 Claims

1. A method to analyze nucleic acid sequences, comprising:
- attaching a unique identifier nucleic acid sequence (UID) to a first end of each of a plurality of analyte nucleic acid fragments to form uniquely identified analyte nucleic acid fragments;
  
  redundantly determining nucleotide sequence of a uniquely identified analyte nucleic acid fragment, wherein determined nucleotide sequences which share a UID form a family of members;
  
  identifying a nucleotide sequence as accurately representing an analyte nucleic acid fragment when at least 1% of members of the family contain the sequence.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 49, 50)
- - 2. The method of claim 1 wherein prior to the step of redundantly determining, the uniquely identified analyte nucleic acid fragments are amplified.
  - 3. The method of claim 1 wherein the nucleotide sequence is identified when at least 5% of members of the family contain the sequence.
  - 4. The method of claim 1 wherein the nucleotide sequence is identified when at least 25% of members of the family contain the sequence.
  - 5. The method of claim 1 wherein the nucleotide sequence is identified when at least 50% of members of the family contain the sequence.
  - 6. The method of claim 1 wherein the nucleotide sequence is identified when at least 70% of members of the family contain the sequence.
  - 7. The method of claim 1 wherein the nucleotide sequence is identified when at least 90% of members of the family contain the sequence.
  - 8. The method of claim 1 wherein the nucleotide sequence is identified when 100% of members of the family contain the sequence.
  - 9. The method of claim 1 wherein the step of attaching is performed by polymerase chain reaction.
  - 10. The method of claim 1 wherein a first universal priming site is attached to a second end of each of a plurality of analyte nucleic acid fragments.
  - 11. The method of claim 9 wherein at least two cycles of polymerase chain reaction are performed such that a family is formed of uniquely identified analyte nucleic acid fragments that have a UID on the first end and a first universal priming site on a second end.
  - 12. The method of claim 1 wherein the UID is covalently linked to a second universal priming site.
  - 13. The method of claim 10 wherein the UID is covalently linked to a second universal priming site.
  - 14. The method of claim 13 wherein prior to the step of redundantly determining, the uniquely identified analyte nucleic acid fragments are amplified using a pair of primers which are complementary to the first and the second universal priming sites, respectively.
  - 15. The method of claim 12 wherein the UID is attached to the 5′ end of an analyte nucleic acid fragment and the second universal priming site is 5′ to the UID.
  - 16. The method of claim 12 wherein the UID is attached to the 3′ end of an analyte nucleic acid fragment and the second universal priming site is 3′ to the UID.
  - 17. The method of claim 1 wherein the analyte nucleic acid fragments are formed by applying a shear force to analyte nucleic acid.
  - 18. The method of claim 9 wherein prior to the step of redundantly determining, the uniquely identified analyte nucleic acid fragments are subjected to amplification, and wherein prior to said amplification, a single strand-specific exonuclease is used to digest excess primers used to attach the UID to the analyte nucleic acid fragments.
  - 19. The method of claim 18 wherein prior to the step of redundantly determining, the uniquely identified analyte nucleic acid fragments are subject to amplification, and wherein prior to said amplification, the single strand-specific exonuclease is inactivated, inhibited, or removed.
  - 20. The method of claim 19 wherein the single strand-specific exonuclease is inactivated by heat treatment.
  - 21. The method of claim 18 wherein primers used in said amplification comprise one or more chemical modifications rendering them resistant to exonucleases.
  - 22. The method of claim 18 wherein primers used in said amplification comprise one or more phosphorothioate linkages.
  - 49. The method of claim 2, 23, or 36 wherein prior to the amplification, the analyte DNA is treated with bisulfite to convert unmethylated cytosine bases to uracil.
  - 50. The method of claim 1, 23, or 36 further comprising the step of comparing number of families representing a first analyte DNA fragment to number of families representing a second analyte DNA fragment to determine a relative concentration of a first analyte DNA fragment to a second analyte DNA fragment in the plurality of analyte DNA fragments.
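The family-counting comparison in claim 50 can be illustrated with a short sketch. It assumes reads have already been assigned to a fragment of interest and are supplied as hypothetical `(uid, fragment_name)` pairs; the function name and input format are illustrative and do not come from the application. Because each UID family is taken to derive from one template molecule, the sketch counts distinct UIDs rather than reads.

```python
from collections import defaultdict


def relative_concentration(reads, fragment_a, fragment_b):
    """Ratio of UID-family counts for fragment_a versus fragment_b.

    reads: iterable of (uid, fragment_name) pairs (hypothetical input format).
    """
    uids_by_fragment = defaultdict(set)
    for uid, fragment in reads:
        # A set keeps one entry per family regardless of read depth.
        uids_by_fragment[fragment].add(uid)

    n_a = len(uids_by_fragment[fragment_a])
    n_b = len(uids_by_fragment[fragment_b])
    if n_b == 0:
        raise ValueError("no families observed for " + fragment_b)
    return n_a / n_b
```

Counting families instead of raw reads means that a template amplified more efficiently than another does not inflate the estimate.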

23. A method to analyze nucleic acid sequences, comprising:
- attaching a unique identifier sequence (UID) to a first end of each of a plurality of analyte DNA fragments using at least two cycles of amplification with first and second primers to form uniquely identified analyte DNA fragments, wherein the UIDs are in excess of the analyte DNA fragments during amplification, wherein the first primers comprise:
  
  a first segment complementary to a desired amplicon;
  
  a second segment containing the UID;
  
  a third segment containing a universal priming site for subsequent amplification;
  
  and wherein the second primers comprise a universal priming site for subsequent amplification;
  
  wherein each cycle of amplification attaches one universal priming site to a strand;
  
  amplifying the uniquely identified analyte DNA fragments to form a family of uniquely identified analyte DNA fragments from each uniquely identified analyte DNA fragment; and
  
  determining nucleotide sequences of a plurality of members of the family.
- Dependent Claims (24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35)
- - 24. The method of claim 23 wherein the second primers each comprise a UID.
  - 25. The method of claim 23 further comprising the steps of:
    - comparing sequences of a family of uniquely identified analyte DNA fragments; and
      
      identifying a nucleotide sequence as accurately representing an analyte DNA fragment when at least 1% of members of the family contain the sequence.
  - 26. The method of claim 25 wherein the nucleotide sequence is identified when at least 5% of members of the family contain the sequence.
  - 27. The method of claim 25 wherein the nucleotide sequence is identified when at least 25% of members of the family contain the sequence.
  - 28. The method of claim 25 wherein the nucleotide sequence is identified when at least 50% of members of the family contain the sequence.
  - 29. The method of claim 25 wherein the nucleotide sequence is identified when at least 70% of members of the family contain the sequence.
  - 30. The method of claim 25 wherein the nucleotide sequence is identified when at least 90% of members of the family contain the sequence.
  - 31. The method of claim 23 wherein the UIDs are from 2 to 4000 bases inclusive.
  - 32. The method of claim 23 wherein prior to the step of amplifying the uniquely identified analyte DNA fragments, a single strand-specific exonuclease is used to digest excess primers used to attach the UID to the analyte DNA fragments.
  - 33. The method of claim 32 wherein prior to the step of amplifying the single strand-specific exonuclease is inactivated, inhibited, or removed.
  - 34. The method of claim 33 wherein the single strand-specific exonuclease is inactivated by heat treatment.
  - 35. The method of claim 32 wherein primers used in the step of amplifying comprise one or more phosphorothioate linkages.
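The three-segment first primer recited in claim 23 can be pictured with a small sketch. It assumes a 5′ to 3′ layout of universal priming site, then UID, then the segment complementary to the amplicon; that ordering, the random-base UID, the default length of 12, and all names here are assumptions made for illustration rather than details taken from the claim. The length check mirrors the 2 to 4000 base range of claim 31.

```python
import random

BASES = "ACGT"


def make_first_primer(universal_site, target_segment, uid_length=12, rng=random):
    """Assemble a hypothetical first primer and return (primer, uid).

    universal_site: third segment, used for subsequent amplification.
    target_segment: first segment, complementary to the desired amplicon.
    uid_length: number of random bases in the UID (assumed default of 12).
    rng: any object with a choice() method, e.g. random.Random(seed).
    """
    if not 2 <= uid_length <= 4000:
        raise ValueError("UID length must be from 2 to 4000 bases inclusive")
    # Second segment: the UID, drawn here as uniformly random bases.
    uid = "".join(rng.choice(BASES) for _ in range(uid_length))
    # Assumed 5' -> 3' order: universal site, UID, target-specific segment.
    return universal_site + uid + target_segment, uid
```

Passing a seeded `random.Random` instance makes the generated UIDs reproducible, which is convenient when testing downstream family-grouping code.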

36. A method to analyze DNA using endogenous unique identifier sequences (UIDs), comprising:
- attaching adapter oligonucleotides to ends of fragments of analyte DNA of between 30 to 2000 bases, inclusive, to form adapted fragments, wherein each end of a fragment before said attaching is an endogenous UID for the fragment;
  
  amplifying the adapted fragments using primers complementary to the adapter oligonucleotides to form families of adapted fragments;
  
  determining nucleotide sequence of a plurality of members of a family;
  
  comparing nucleotide sequences of the plurality of members of the family; and
  
  identifying a nucleotide sequence as accurately representing an analyte DNA fragment when at least 1% of members of the family contain the sequence.
- Dependent Claims (37, 38, 39, 40, 41)
- - 37. The method of claim 36 further comprising:
    - enriching for fragments representing one or more selected genes by means of capturing a subset of the fragments using capture oligonucleotides complementary to selected genes in the analyte DNA.
  - 38. The method of claim 36 further comprising:
    - enriching for fragments representing one or more selected genes by means of amplifying fragments complementary to selected genes.
  - 39. The method of claim 37 or 38 wherein the step of attaching is prior to the step of enriching.
  - 40. The method of claim 36 wherein the fragments are formed by shearing.
  - 41. The method of claim 36 wherein a nucleotide sequence is identified as accurately representing an analyte DNA fragment when at least 5% of members of the family contain the sequence.
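For the endogenous-UID variant of claim 36, one way to picture family formation is to key each read on where its fragment begins and ends. The sketch below assumes reads have already been aligned and are supplied as hypothetical `(chrom, start, end, sequence)` tuples with half-open coordinates, so that the two fragment ends serve as the identifier; the alignment step, the tuple format, and the function name are assumptions and are not described in the claim.

```python
from collections import defaultdict


def group_by_endogenous_uid(alignments, min_len=30, max_len=2000):
    """Group aligned reads into families keyed by their fragment ends.

    alignments: iterable of (chrom, start, end, sequence) tuples with
        half-open coordinates (hypothetical input format).
    min_len, max_len: fragment size range from claim 36 (30 to 2000 bases).
    """
    families = defaultdict(list)
    for chrom, start, end, seq in alignments:
        if not min_len <= end - start <= max_len:
            continue  # outside the claimed fragment size range
        # The pair of end positions plays the role of the UID.
        families[(chrom, start, end)].append(seq)
    return dict(families)
```

Each resulting family can then be compared member by member against whatever agreement threshold is chosen.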

42. A population of primer pairs, wherein each pair comprises a first and second primer for amplifying and identifying a gene or gene portion, wherein:
- the first primer comprises a first portion of 10-100 nucleotides complementary to the gene or gene portion and a second portion of 10 to 100 nucleotides comprising a site for hybridization to a third primer;
  
  the second primer comprises a first portion of 10-100 nucleotides complementary to the gene or gene portion and a second portion of 10 to 100 nucleotides comprising a site for hybridization to a fourth primer, wherein interposed between the first portion and the second portion of the second primer is a third portion consisting of 2 to 4000 nucleotides forming a unique identifier (UID);
  
  wherein the unique identifiers in the population have at least 4 different sequences, wherein the first and second primers are complementary to opposite strands of the gene or gene portion.
- Dependent Claims (43, 44, 45, 46, 47, 48)
- - 43. The population of claim 42 wherein the first primer further comprises a unique identifier (UID).
  - 44. The population of claim 42 wherein the unique identifiers in the population have at least 16, at least 64, at least 256, at least 1,024, at least 4,096, at least 16,384, at least 65,536, at least 262,144, at least 1,048,576, at least 4,194,304, at least 16,777,216, or at least 67,108,864 different sequences.
  - 45. A kit comprising the population of primers of claim 42 and the third and fourth primers complementary to the second portions of each of the first and second primers.
  - 46. The population of claim 42 wherein the UID comprises randomly selected sequences.
  - 47. The population of claim 42 wherein the UID comprises pre-defined nucleotide sequences.
  - 48. The population of claim 42 wherein the UID comprises both randomly selected sequences and pre-defined nucleotides.
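The sequence counts listed in claims 42 and 44 are successive powers of four, which is the number of distinct sequences available from a run of fully random bases (an observation about the numbers, not a statement from the claims). The sketch below shows that arithmetic; both function names are illustrative.

```python
def uid_diversity(n_random_bases):
    """Number of distinct sequences possible with n fully random bases."""
    return 4 ** n_random_bases


def min_uid_length(n_sequences):
    """Smallest number of random bases giving at least n_sequences UIDs."""
    n = 0
    while uid_diversity(n) < n_sequences:
        n += 1
    return n


# Counts for 2 through 13 random bases.
table = {n: uid_diversity(n) for n in range(2, 14)}
```

In practice the UID space would be chosen well in excess of the number of template molecules so that two templates rarely receive the same identifier.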

Current Assignee
Johns Hopkins University
Original Assignee
Johns Hopkins University
Inventors
Vogelstein, Bert; Kinzler, Kenneth W.; Papadopoulos, Nickolas; Kinde, Isaac

Granted Patent

US 9,476,095 B2
US Class Current

435/6.12
CPC Class Codes

C12Q 1/6806   Preparing nucleic acids for...

C12Q 1/6869   Methods for sequencing

C12Q 1/6874   involving nucleic acid arra...

C12Q 1/6876   Nucleic acid products used ...

C12Q 2521/501   Ligase

C12Q 2525/155   incorporating/generating a ...

C12Q 2525/179   incorporating arbitrary or ...

C12Q 2525/191   incorporating an adaptor

C12Q 2535/122   Massive parallel sequencing

C12Q 2563/179   the label being a nucleic acid

C12Q 2565/514   characterised by the use of...

C12Q 2600/158   Expression markers
