High throughput genome sequencing on DNA arrays

US 20090005252A1
Filed: 10/31/2007
Published: 01/01/2009
Est. Priority Date: 02/24/2006
Status: Active Grant

Abstract

The present invention is directed to methods and compositions for acquiring nucleotide sequence information of target sequences using adaptors interspersed in target polynucleotides. The sequence information can be new, e.g. sequencing unknown nucleic acids, re-sequencing, or genotyping. The invention preferably includes methods for inserting a plurality of adaptors at spaced locations within a target polynucleotide or a fragment of a polynucleotide. Such adaptors may serve as platforms for interrogating adjacent sequences using various sequencing chemistries, such as those that identify nucleotides by primer extension, probe ligation, and the like. Encompassed in the invention are methods and compositions for the insertion of known adaptor sequences into target sequences, such that there is an interruption of contiguous target sequence with the adaptors. By sequencing both “upstream” and “downstream” of the adaptors, identification of entire target sequences may be accomplished.

80 Claims

1-42. (canceled)

43. A method of inserting multiple adaptors in a target nucleic acid comprising:
- a) ligating a first adaptor to one terminus of said target nucleic acid to form a first linear construct, wherein said adaptor comprises a recognition site for a first restriction enzyme;
  
  b) circularizing first linear construct to create a first circular polynucleotide;
  
  c) cleaving said first circular polynucleotide with said first restriction enzyme to form a second linear construct, wherein said first restriction enzyme is able to bind to said recognition site within said first adaptor;
  
  d) ligating a second adaptor to said second linear construct to form a third linear construct, wherein said second adaptor comprises a recognition site for a second restriction enzyme;
  
  e) circularizing said third linear construct to create a second circular polynucleotide.
- Dependent claims (44, 45, 46, 47):
  - 44. The method of claim 43, wherein steps (c) through (e) are optionally repeated with additional adaptors, each comprising a recognition site for a restriction enzyme.
  - 45. The method of claim 44, wherein said recognition sites for a restriction enzyme are unique for each adaptor.
  - 46. The method of claim 43, wherein said second circular polynucleotide is used to form a concatemer through rolling circle replication.
  - 47. The method of claim 46, wherein said concatemer comprises multiple copies of a monomer, wherein said monomer comprises said target nucleic acid, said first adaptor and said second adaptor.
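
Steps (a) through (e) of claim 43, together with the rolling circle concatemer of claims 46 and 47, can be sketched as string operations. This is a minimal illustration, not the patented protocol: it models a single strand, it assumes the restriction enzyme cuts a fixed number of bases (`reach`, a hypothetical parameter) past the end of the adaptor, and the adaptor sequences and function names are invented for the example. Adaptors are lowercase only so that they stand out from the uppercase target.

```python
def ligate(adaptor: str, linear: str) -> str:
    """Ligate an adaptor to one terminus (here, the left end) of a linear construct."""
    return adaptor + linear


def cleave_circle(circle: str, adaptor: str, reach: int) -> str:
    """Linearize a circle by cutting `reach` bases past the end of `adaptor`.

    A circle is stored as a plain string whose last base is joined to its first.
    The fixed cut offset is an assumption made for this sketch.
    """
    start = (circle + circle).find(adaptor)  # doubled so a junction-spanning adaptor is found
    if start < 0:
        raise ValueError("adaptor not present in circle")
    cut = (start + len(adaptor) + reach) % len(circle)
    return circle[cut:] + circle[:cut]


def rotate_to(circle: str, adaptor: str) -> str:
    """Rewrite a circle so that its string form starts at `adaptor`."""
    start = (circle + circle).find(adaptor)
    if start < 0:
        raise ValueError("adaptor not present in circle")
    return circle[start:] + circle[:start]


def insert_two_adaptors(target: str, first: str, second: str, reach: int) -> str:
    """Follow steps (a)-(e) and return the second circular polynucleotide."""
    linear1 = ligate(first, target)                  # (a) first linear construct
    circle1 = linear1                                # (b) circularize: ends are now joined
    linear2 = cleave_circle(circle1, first, reach)   # (c) cut directed by the first adaptor
    linear3 = ligate(second, linear2)                # (d) third linear construct
    circle2 = linear3                                # (e) circularize again
    return circle2


def rolling_circle(circle: str, copies: int) -> str:
    """Model a concatemer as `copies` tandem repeats of the circle's monomer."""
    return circle * copies


target = "ACGTACGTTGCA"
circle2 = insert_two_adaptors(target, "gcatc", "ttagg", reach=4)
monomer = rotate_to(circle2, "gcatc")
print(monomer)                      # gcatcACGTttaggACGTTGCA
print(rolling_circle(monomer, 3))   # three tandem copies of the monomer
```

In the printed monomer the second adaptor sits four bases into the target, so the contiguous target sequence is interrupted by a known adaptor, as the abstract describes. Repeating steps (c) through (e), as in claim 44, would correspond to further `cleave_circle` and `ligate` calls.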

48. A method of inserting multiple adaptors in a target nucleic acid comprising:
- a) ligating a first adaptor to one terminus of said target nucleic acid to form a first linear construct, wherein said adaptor comprises;
  
  i. a recognition site for a first restriction enzyme, and
  
  ii. a first secondary structure sequence;
  
  b) circularizing first linear construct to create a first circular polynucleotide;
  
  c) cleaving said first circular polynucleotide with said first restriction enzyme to form a second linear construct, wherein said first restriction enzyme is able to bind to said recognition site within said first adaptor;
  
  d) ligating a second adaptor to said second linear construct to form a third linear construct, wherein said second adaptor comprises;
  
  iii. a recognition site for a second restriction enzyme, and
  
  iv. a second secondary structure sequence;
  
  e) circularizing said third linear construct to create a second circular polynucleotide.
- Dependent claims (49, 50, 51, 52):
  - 49. The method of claim 48, wherein steps (c) through (e) are optionally repeated with additional adaptors, each comprising a recognition site for a restriction enzyme and a secondary structure sequence.
  - 50. The method of claim 49, wherein the recognition site for a restriction enzyme is unique for each adaptor.
  - 51. The method of claim 48, wherein each adaptor comprises the same secondary structure sequence.
  - 52. The method of claim 48, wherein the adaptors comprise secondary structure sequences resulting in interaction between different adaptors within a concatemer.

53. A method of identifying a nucleotide at a detection position of a target nucleic acid, wherein said target nucleic acid comprises a plurality of detection positions, said method comprising:
- a) providing a plurality of concatemers, wherein each concatemer comprises a plurality of monomers and each monomer comprises;
  
  i) a first target domain of said target nucleic acid comprising a first set of target detection positions;
  
  ii) a first adaptor comprising a Type IIs endonuclease restriction site;
  
  iii) a second target domain of said target nucleic acid comprising a second set of target detection positions; and
  
  iv) a second interspersed adaptor comprising a Type IIs endonuclease restriction site;
  
  b) identifying said nucleotide.
- Dependent claims (54):
  - 54. The method of claim 53, wherein said concatemers are immobilized on a surface.

55. A method of identifying a nucleotide sequence of a target nucleic acid, said method comprising:
- a) providing a plurality of immobilized concatemers, wherein each concatemer comprises;
  
  i) multiple copies of a fragment of said target nucleic acid,
  
  ii) a plurality of interspersed adaptors at predetermined sites, wherein each interspersed adaptor comprises a secondary structure sequence;
  
  b) identifying a sequence of at least a portion of each fragment adjacent to at least one interspersed adaptor in at least one concatemer;
  
  thereby identifying a nucleotide sequence of the target nucleic acid.
- Dependent claims (56, 57, 58, 59, 60):
  - 56. The method of claim 55, wherein each interspersed adaptor comprises a palindrome.
  - 57. The method of claim 55, wherein the secondary structure sequence in each adaptor results in an interaction between different adaptors within a concatemer.
  - 58. The method of claim 55, wherein different concatemers comprise different fragments of said target nucleic acid.
  - 59. The method of claim 55, wherein said fragments represent substantially all of said target nucleic acid.
  - 60. The method of claim 55, further comprising a step of reconstructing said nucleotide sequence of said target nucleic acid from identities of sequences of said fragments of said plurality of concatemers.
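
Claim 60 recites reconstructing the target sequence from the fragment sequences but does not say how. One common way to do this is greedy overlap merging, sketched below. The choice of algorithm, the function names, and the `min_len` threshold are all assumptions made for illustration and are not taken from the claims.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`, or 0 if shorter than `min_len`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the largest overlap.

    Returns the remaining contigs; a single contig means every read was joined.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining pair overlaps enough to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


fragments = ["ACGTTGCA", "TGCATGCC", "TGCCAGT"]
print(greedy_assemble(fragments))  # ['ACGTTGCATGCCAGT']
```

Greedy merging can go wrong on repetitive sequence, so this should be read only as a picture of what "reconstructing from identities of sequences of said fragments" could involve, not as a description of the method used in practice.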

61. A method for identifying a nucleotide sequence of a target nucleic acid, said method comprising:
- a) providing a target nucleic acid comprising a plurality of interspersed adaptors, wherein each interspersed adaptor has at least one boundary with said target nucleic acid; and
  
  b) identifying at least one nucleotide adjacent to at least one boundary of at least two interspersed adaptors,
  
  thereby identifying a nucleotide sequence of said target nucleic acid.
- Dependent claims (62, 63, 64):
  - 62. The method of claim 61, wherein said identifying step comprises:
    - a) contacting said concatemers with a set of sequencing probes, wherein each sequencing probe comprises;
      
      i) a first domain complementary to one of said adaptors;
      
      ii) a unique nucleotide at a first interrogation position; and
      
      iii) a label;
      
      wherein said contacting occurs under conditions such that if said unique nucleotide is complementary to said first nucleotide, a sequencing probe hybridizes to said concatemer; and
      
      b) identifying said first nucleotide by identifying at least a portion of a sequence of said hybridized sequencing probe.
  - 63. The method of claim 62, wherein the first domain complementary to one of said adaptors is adjacent or close to the 3′ end of the target nucleic acid.
  - 64. The method of claim 62, wherein the first domain complementary to one of said adaptors is adjacent or close to the 5′ end of the target nucleic acid.

65. A method of identifying a nucleotide sequence of a target nucleic acid, said method comprising:
- a) providing a plurality of amplicons, wherein;
  
  i) each amplicon comprises multiple copies of a fragment of the target nucleic acid,
  
  ii) each amplicon comprises a plurality of interspersed adaptors at predetermined sites within the fragment, each adaptor comprising at least one anchor probe hybridization site, and
  
  iii) said plurality of amplicons comprise fragments that substantially cover the target nucleic acid;
  
  b) providing a random array of said amplicons fixed to a surface at a density such that at least a majority of said amplicons are optically resolvable;
  
  c) hybridizing one or more anchor probes to said random array;
  
  d) hybridizing one or more sequencing probes to said random array to form perfectly matched duplexes between said one or more sequencing probes and fragments of target nucleic acid;
  
  e) ligating the anchor probes to the sequencing probes; and
  
  f) identifying at least one nucleotide adjacent to at least one interspersed adaptor; and
  
  g) repeating steps (c) and (f) until a nucleotide sequence of said target nucleic acid is identified.
- Dependent claims (66, 67, 68, 69):
  - 66. The method of claim 65, wherein the anchor probe hybridization sites on said adaptors are adjacent or close to the 3′ end of the fragment of the target nucleic acid.
  - 67. The method of claim 65, wherein the anchor probe hybridization sites on said adaptors are adjacent or close to the 5′ end of the fragment of the target nucleic acid.
  - 68. The method of claim 65, wherein each adaptor comprises two anchor probe hybridization sites, one adjacent or close to the 5′ end of the fragment of the target nucleic acid and one adjacent or close to the 5′ end of the fragment of the target nucleic acid.
  - 69. The method of claim 65, further comprising reconstructing said nucleotide sequence of said target nucleic acid from identities of nucleotide determined in step f).

70. A method of identifying a nucleotide sequence of a target nucleic acid, said method comprising:
- a) providing a plurality of amplicons, wherein;
  
  i) each amplicon comprises multiple copies of a fragment of the target nucleic acid,
  
  ii) each amplicon comprises a plurality of interspersed adaptors at predetermined sites within the fragment, each adaptor comprising at least one anchor probe hybridization site, and
  
  iii) said plurality of amplicons comprise fragments that substantially cover the target nucleic acid;
  
  b) providing a random array of said amplicons fixed to a surface at a density such that at least a majority of said amplicons are optically resolvable;
  
  c) hybridizing one or more anchor probes to said random array to form perfectly matched duplexes between said one or more anchor probes and anchor probe hybridization sites on said interspersed adaptors,
  
  d) identifying at least one nucleotide adjacent to at least one interspersed adaptor by extending said one or more anchor probes in a sequence specific reaction; and
  
  e) repeating steps (c) and (d) until a nucleotide sequence of said target nucleic acid is identified.

71. A method of identifying a nucleotide sequence of a target nucleic acid, said method comprising:
- a) providing a random array of concatemers fixed to a planar surface, wherein;
  
  i) said surface has an array of optically resolvable discrete spaced apart regions,
  
  ii) each discrete spaced apart region has an area of less than 1 μm²,
  
  iii) substantially all of said discrete spaced apart regions have at most one of said concatemers attached,
  
  iv) each of said concatemers comprises multiple copies of a fragment of said target nucleic acid,
  
  v) each concatemer comprises a plurality of interspersed adaptors at predetermined sites within each fragment, and
  
  vi) said fragments of said concatemers substantially cover said target nucleic acid;
  
  b) hybridizing one or more probes from a first set of probes to the random array under conditions that permit the formation of perfectly matched duplexes between the one or more probes and complementary sequences on the concatemers;
  
  c) hybridizing one or more probes from a second set of probes to the random array under conditions that permit the formation of perfectly matched duplexes between the one or more probes and complementary sequences on the concatemers;
  
  d) ligating probes from said first and second sets which are hybridized to a concatemer at contiguous sites;
  
  e) identifying sequences of said ligated probes;
  
  thereby identifying a nucleotide sequence of said target nucleic acid.
- Dependent claims (72):
  - 72. The method of claim 71, wherein said steps (b) through (e) are repeated a number of times.

73. A method for identifying multiple nucleotides in a target nucleic acid, comprising the steps of:
- (a) providing a target nucleic acid comprising two or more adapters interspersed within said target nucleic acid, wherein each of said adaptors comprises at least one anchor site;
  
  (b) hybridizing anchor probes and sequencing probes to multiple anchor sites within said adaptors,
  
  (c) ligating anchor probes and sequencing probes that are hybridized to adjacent sites within the target nucleic acid to create anchor-probe ligations;
  
  (d) removing any probes that are not anchor-probe ligations;
  
  (e) identifying the probe-anchor ligation at each anchor site independently within the target nucleic acid;
  
  wherein each anchor-probe ligation is used to identify one or more nucleotides at a defined distance from each adaptor in the target nucleic acid.
- Dependent claims (74, 75, 76):
  - 74. The method of claim 73, wherein the method further comprises:
    - (f) removing the anchor-probe ligations from the target nucleic acid; and
      
      (g) repeating steps (b)-(e) with a sequencing probe that will read a nucleotide different from that read in the previous steps.
  - 75. The method of claim 73, wherein each adaptor within the target nucleic acid comprises an anchor site at each end of the adaptor.
  - 76. The method of claim 73, wherein overlapped and/or shifted pairs of sequencing probes and anchors may be used to read each base two or more times.
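
The ligation-based read-out in claims 65 and 73 can be pictured with a small simulation. It is a sketch under simplifying assumptions, none of which come from the claims: the anchor probe is taken to sit on the adaptor with its ligation end at the adaptor boundary, there are four sequencing probe pools each carrying one known base at the interrogated position and degenerate positions (`N`) elsewhere, and the label is simply the probe's known base. Probes are written in the order that aligns them position by position with the template, and all names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pairs_perfectly(probe: str, site: str) -> bool:
    """True if every probe position pairs with the template site; 'N' pairs with any base."""
    return len(probe) == len(site) and all(
        p == "N" or COMPLEMENT[p] == s for p, s in zip(probe, site)
    )


def read_position(template: str, adaptor: str, distance: int, probe_len: int = 5) -> str:
    """Identify the template base `distance` bases (1-based) past the adaptor boundary."""
    if not 1 <= distance <= probe_len:
        raise ValueError("distance must lie within the sequencing probe")
    junction = template.index(adaptor) + len(adaptor)  # where anchor and sequencing probe meet
    site = template[junction:junction + probe_len]
    if len(site) < probe_len:
        raise ValueError("not enough template next to the adaptor")
    ligated = []
    for base in "ACGT":  # one labeled probe pool per candidate base
        probe = "N" * (distance - 1) + base + "N" * (probe_len - distance)
        if pairs_perfectly(probe, site):  # only a perfect match is ligated and survives the wash
            ligated.append(base)
    (label,) = ligated  # exactly one pool should remain
    return COMPLEMENT[label]  # the label reports the probe base; the template base is its complement


template = "GGAT" + "ttagg" + "CATGAC"  # lowercase marks the adaptor
reads = "".join(read_position(template, "ttagg", d) for d in range(1, 6))
print(reads)  # CATGA
```

Each call stands for one cycle of hybridize, ligate, wash, and detect. Looping over `distance` mirrors claim 74, where the ligated probes are removed and the cycle is repeated with a sequencing probe that reads a different nucleotide.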

77. A method for identifying multiple nucleotides in target nucleic acid in a single ligation reaction, the method comprising:
- a) providing a target nucleic acid comprising a plurality of interspersed adaptors;
  
  b) hybridizing two or more labeled anchor probes to the target nucleic acid, wherein the different anchor probes can be distinguished based on differential properties of the probes;
  
  c) ligating the labeled probes to identify a nucleotide position relative to each anchor;
  
  d) identifying the labeled probes hybridized to the adaptors;
  
  e) removing one or more probes based on the differential properties of the probes;
  
  f) identifying the labeled probes remaining after removal of the probes having differential properties;
  
  g) identifying additional nucleotide positions in the target targets by comparison of the probes identified in steps (d) and (f), thereby providing identification of two or more nucleotides in a single ligation reaction.
- Dependent claims (78, 79):
  - 78. The method of claim 77, wherein the differential properties of the probes are based on competitive hybridization of the probes.
  - 79. The method of claim 77, wherein the differential properties of the probes are differences in melting temperature of the probes upon hybridization with the adaptor sequences.

80. A method for identifying multiple nucleotides in a target nucleic acid, comprising the steps of:
- (a) providing a target nucleic acid comprising two or more adapters interspersed within said target nucleic acid, wherein each of said adaptors comprises at least one anchor site;
  
  (b) hybridizing two-part anchor probes to the multiple anchor sites within said adaptors, wherein the first part of the anchor probe is substantially complementary to the adaptor, and wherein the second part of the anchor probes comprises one or more degenerate nucleotides at the ligation end of the probe; and
  
  (c) hybridizing sequencing probes to the target nucleic acid;
  
  (d) ligating anchor probes and sequencing probes that are hybridized to adjacent sites within the target nucleic acid to create anchor-probe ligations;
  
  (e) removing any hybridized probes that are not anchor-probe ligations;
  
  (f) identifying the probe-anchor ligation at each anchor site independently within the target nucleic acid;
  
  wherein each anchor-probe ligation is used to identify one or more nucleotides at a defined distance from each adaptor in the target nucleic acid.

Current Assignee
Complete Genomics Incorporated (BGI Genomics Co., Ltd.)
Original Assignee
Complete Genomics Incorporated (BGI Genomics Co., Ltd.)
Inventors
Drmanac, Radoje T., Callow, Matthew, Drmanac, Snezana

Granted Patent

US 8,722,326 B2
Field of Search
US Class Current

506/3
CPC Class Codes

C12Q 1/6837   using probe arrays or probe...

C12Q 1/6874   involving nucleic acid arra...

C12Q 2521/313   Type II endonucleases, i.e....

C12Q 2525/151   repeat or repeated sequence...

C12Q 2531/125   Rolling circle
