Compositions and methods for identifying nucleic acid molecules
First Claim
1. A method for sequencing at least a portion of a population of sample nucleic acid molecules, wherein the method comprises:
forming a reaction mixture comprising the population of sample nucleic acid molecules and a set of Molecular Index Tags (MITs), wherein the MITs are nucleic acid molecules, wherein the number of different MITs in the set of MITs is between 10 and 1,000, and wherein a ratio of the total number of sample nucleic acid molecules in the population of sample nucleic acid molecules to the number of different MITs in the set of MITs is at least 1,000:1;
attaching at least one MIT from the set of MITs to a sample nucleic acid molecule or segment thereof for at least 50% of the sample nucleic acid molecules to form a population of tagged nucleic acid molecules, wherein the at least one MIT is located 5′ and/or 3′ to the sample nucleic acid molecule or segment thereof on each tagged nucleic acid molecule and wherein the population of tagged nucleic acid molecules comprises at least one copy of each MIT of the set of MITs;
amplifying the population of tagged nucleic acid molecules to create a library of tagged nucleic acid molecules;
and determining the sequences of the attached MITs and at least a portion of the sample nucleic acid molecule or segment thereof of the tagged nucleic acid molecules in the library of tagged nucleic acid molecules.
Abstract
The present disclosure provides methods and compositions for sequencing nucleic acid molecules and identifying individual sample nucleic acid molecules using Molecular Index Tags (MITs). Furthermore, reaction mixtures, kits, and adapter libraries are provided.
19 Claims
1. A method for sequencing at least a portion of a population of sample nucleic acid molecules, wherein the method comprises:
forming a reaction mixture comprising the population of sample nucleic acid molecules and a set of Molecular Index Tags (MITs), wherein the MITs are nucleic acid molecules, wherein the number of different MITs in the set of MITs is between 10 and 1,000, and wherein a ratio of the total number of sample nucleic acid molecules in the population of sample nucleic acid molecules to the number of different MITs in the set of MITs is at least 1,000:1;
attaching at least one MIT from the set of MITs to a sample nucleic acid molecule or segment thereof for at least 50% of the sample nucleic acid molecules to form a population of tagged nucleic acid molecules, wherein the at least one MIT is located 5′ and/or 3′ to the sample nucleic acid molecule or segment thereof on each tagged nucleic acid molecule and wherein the population of tagged nucleic acid molecules comprises at least one copy of each MIT of the set of MITs;
amplifying the population of tagged nucleic acid molecules to create a library of tagged nucleic acid molecules;
and determining the sequences of the attached MITs and at least a portion of the sample nucleic acid molecule or segment thereof of the tagged nucleic acid molecules in the library of tagged nucleic acid molecules.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
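The two numeric limits recited in claim 1 can be illustrated with a short check. This is only a sketch for orientation, not a legal reading of the claim: the function name `within_claim_1_limits` and its parameters are hypothetical, and it assumes "between 10 and 1,000" is inclusive at both ends.

```python
def within_claim_1_limits(n_sample_molecules: int, n_distinct_mits: int) -> bool:
    """Check the two numeric limits recited in claim 1.

    Assumption: "between 10 and 1,000" is treated as inclusive here.
    """
    # Number of different MITs in the set.
    if not 10 <= n_distinct_mits <= 1_000:
        return False
    # Ratio of total sample molecules to number of different MITs,
    # at least 1,000:1 (compared without division to stay in integers).
    return n_sample_molecules >= 1_000 * n_distinct_mits


print(within_claim_1_limits(1_000_000, 200))  # ratio 5,000:1
print(within_claim_1_limits(50_000, 200))     # ratio 250:1
```

At these ratios there are far fewer different MITs than sample molecules, so many molecules necessarily carry the same MIT sequence.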
15. A method for identifying amplification errors from sample preparation for high-throughput sequencing or identifying base-calling errors in a high-throughput sequencing reaction of a population of tagged nucleic acid molecules derived from a sample, wherein the method comprises:
forming a reaction mixture comprising the population of sample nucleic acid molecules and a set of Molecular Index Tags (MITs), wherein the MITs are double-stranded nucleic acid molecules, wherein the number of different MITs in the set of MITs is between 10 and 1,000, and wherein a ratio of the total number of sample nucleic acid molecules in the population of sample nucleic acid molecules to the diversity of MITs in the set of MITs is greater than 1,000:1;
attaching at least one MIT from the set of MITs to a sample nucleic acid molecule or segment thereof for a plurality of sample nucleic acid molecules to form a population of tagged nucleic acid molecules, wherein the at least one MIT is located 5′ and/or 3′ to the sample nucleic acid molecule or segment thereof on each tagged nucleic acid molecule and wherein the population of tagged nucleic acid molecules comprises at least one copy of each MIT in the set of MITs;
amplifying the population of tagged nucleic acid molecules to create a library of tagged nucleic acid molecules;
determining, using high-throughput sequencing, the sequences of the attached MITs and at least a portion of the sample nucleic acid molecule or segment thereof of the tagged nucleic acid molecules in the library of tagged nucleic acid molecules, wherein the sequence of the at least one MIT on each tagged nucleic acid molecule identifies the individual sample nucleic acid molecule that gave rise to the tagged nucleic acid molecule;
and identifying tagged nucleic acid molecules having amplification errors or base-calling errors by identifying tagged nucleic acid molecules in which the sample nucleic acid molecule or segment thereof has a nucleotide sequence that is found in less than 25% of tagged nucleic acid molecules derived from the same initial sample nucleic acid molecule.
Dependent claims: 16, 17, 18, 19
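The final step of claim 15 can be sketched as a grouping and counting operation. The sketch below is an illustration under stated assumptions rather than the patented implementation: the function `flag_probable_errors`, the `(mit_key, insert_seq)` read representation, and the example sequences are all hypothetical, and `mit_key` simply stands in for whatever identifies the originating sample molecule.

```python
from collections import Counter, defaultdict


def flag_probable_errors(reads, max_fraction=0.25):
    """Return indices of reads whose insert sequence is found in less than
    max_fraction of the reads derived from the same original molecule.

    reads: list of (mit_key, insert_seq) tuples (hypothetical format).
    """
    # Group reads into families that share the same originating molecule.
    families = defaultdict(list)
    for index, (mit_key, insert_seq) in enumerate(reads):
        families[mit_key].append((index, insert_seq))

    flagged = []
    for members in families.values():
        counts = Counter(seq for _, seq in members)
        family_size = len(members)
        for index, seq in members:
            # Minority sequences within a family are treated as probable
            # amplification or base-calling errors.
            if counts[seq] / family_size < max_fraction:
                flagged.append(index)
    return sorted(flagged)


reads = [
    ("ACGT", "TTGACCA"),
    ("ACGT", "TTGACCA"),
    ("ACGT", "TTGACCA"),
    ("ACGT", "TTGACCA"),
    ("ACGT", "TTGATCA"),  # 1 of 5 reads in this family (20%)
    ("GGCA", "CCATTGA"),
    ("GGCA", "CCATTGA"),
]
print(flag_probable_errors(reads))  # prints [4]
```

The 25% cutoff is taken from the claim; the strict "less than" comparison follows its wording.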
Specification