Methods for genome assembly, haplotype phasing, and target independent nucleic acid detection
First Claim
1. A method of assaying for nucleic acid species diversity in a heterogeneous sample comprising at least two species, comprising
a) obtaining a stabilized nucleic acid sample comprising a diverse plurality of nucleic acids from at least two species stabilized such that, for at least a first member of the plurality, a first nucleic acid segment and a second nucleic acid segment are held together independent of their common phosphodiester backbone, wherein said phosphodiester backbone is cleaved between said first nucleic acid segment and said second nucleic acid segment, and for at least a second member of the plurality, a third nucleic acid segment and a fourth nucleic acid segment are held together independent of their common phosphodiester backbone, wherein said phosphodiester backbone is cleaved between said third nucleic acid segment and said fourth nucleic acid segment;
b) tagging said first nucleic acid segment with a first tag and said second nucleic acid segment with a second tag, such that said first nucleic acid segment and said second nucleic acid segment are identifiable as arising from a common nucleic acid of the diverse plurality of nucleic acids, and tagging said third nucleic acid segment with a third tag and said fourth nucleic acid segment with a fourth tag, such that said third nucleic acid segment and said fourth nucleic acid segment are identifiable as arising from a common nucleic acid of the diverse plurality of nucleic acids;
c) sequencing at least an identifiable portion of said first nucleic acid segment and said first tag, of said second nucleic acid segment and said second tag, of said third nucleic acid segment and said third tag, and of said fourth nucleic acid segment and said fourth tag;
d) constructing at least a first sequence scaffold comprising said first nucleic acid segment and said second nucleic acid segment and a second sequence scaffold comprising said third nucleic acid segment and said fourth nucleic acid segment;
such that a plurality of segments of said diverse plurality of nucleic acids are assigned to at least one of the first or second sequence scaffold; and
e) counting a plurality of sequence scaffolds constructed, wherein nucleic acid segments tagged such that they are identifiable as arising from a common nucleic acid of the diverse plurality of nucleic acids are assigned to a common scaffold; and
wherein the number of scaffolds generated indicates the species diversity in the heterogeneous sample.
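Steps d) and e) of the claim can be illustrated with a toy sketch. It is not the patented procedure, only one way to picture the grouping and counting. It assumes the tag comparison has already been done upstream and has produced pairs of segment identifiers judged to arise from a common nucleic acid; the function name `build_scaffolds`, the segment names, and the use of a union-find structure are all hypothetical choices made for illustration. Real scaffolding would also rely on sequence content and link frequencies, which this sketch ignores.

```python
from collections import defaultdict


def build_scaffolds(linked_pairs):
    """Group segments into scaffolds from tag-derived links.

    linked_pairs: iterable of (segment_a, segment_b) tuples, each meaning
    "these two segments carry tags identifying them as arising from a
    common nucleic acid" (an assumed, precomputed input).
    Returns a list of sets, one set of segment ids per scaffold.
    """
    parent = {}

    def find(x):
        # Register unseen segments as their own root, then walk to the root
        # while shortening the path (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Step d): merge segments that share a molecule of origin.
    for a, b in linked_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each merged group.
    groups = defaultdict(set)
    for segment in parent:
        groups[find(segment)].add(segment)
    return list(groups.values())


# Hypothetical input: five segments linked by their tags.
links = [("seg1", "seg2"), ("seg3", "seg4"), ("seg2", "seg5")]
scaffolds = build_scaffolds(links)

# Step e): the scaffold count serves as the diversity indicator.
scaffold_count = len(scaffolds)
print(scaffold_count)
```

In this made-up input, seg1, seg2, and seg5 fall into one scaffold and seg3 and seg4 into another, so the count is 2. Under the claim's reading, that count is what indicates species diversity in the sample.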
Abstract
The disclosure provides methods to assemble genomes of eukaryotic or prokaryotic organisms. The disclosure provides methods for haplotype phasing and meta-genomics assemblies. The disclosure provides a streamlined method for accomplishing these tasks, such that intermediates need not be labeled by an affinity label to facilitate binding to a solid surface. The disclosure also provides methods and compositions for the de novo generation of scaffold information, linkage information, and genome information for unknown organisms in heterogeneous metagenomic samples or samples obtained from multiple individuals. Practice of the methods can allow de novo sequencing of entire genomes of uncultured or unidentified organisms in heterogeneous samples, or the determination of linkage information for nucleic acid molecules in samples comprising nucleic acids obtained from multiple individuals.
23 Claims
Specification