INCREASING CONFIDENCE OF ALLELE CALLS WITH MOLECULAR COUNTING
First Claim
1. A method of estimating the number of starting polynucleotide molecules sequenced from multiple samples, the method comprising:
attaching an adapter to starting polynucleotide molecules in multiple different samples, wherein the adapter for each sample comprises:
a unique MID specific for the sample; and
a degenerate base region (DBR) comprising at least one nucleotide base selected from: R, Y, S, W, K, M, B, D, H, V, N, and modified versions thereof;
pooling the multiple different adapter-attached samples to generate a pooled sample;
amplifying the adapter-attached polynucleotides in the pooled sample;
sequencing a plurality of the amplified adapter-attached polynucleotides, wherein the sequence of the MID, the DBR and at least a portion of the polynucleotide is obtained for each of the plurality of adapter-attached polynucleotides; and
determining the number of distinct DBR sequences present in the plurality of sequenced adapter-attached polynucleotides from each sample to determine the number of starting polynucleotides from each sample that were sequenced in the sequencing step.
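The final counting step of the claim can be illustrated with a minimal sketch. It assumes the sequencing reads have already been parsed into (MID, DBR, insert) tuples; the function name `count_starting_molecules` and the example sequences are hypothetical and are not taken from the patent. The sketch ignores sequencing errors in the DBR, and two starting molecules that happen to carry the same DBR would be counted once, so the result is best read as an estimate.

```python
from collections import defaultdict


def count_starting_molecules(reads):
    """Return {MID: number of distinct DBR sequences seen for that MID}.

    `reads` is an iterable of (mid, dbr, insert) tuples. This is an
    illustrative assumption about the input format, not a format
    defined by the patent.
    """
    dbrs_by_mid = defaultdict(set)
    for mid, dbr, _insert in reads:
        # Amplification copies share the DBR of their starting molecule,
        # so a set collapses them to a single entry.
        dbrs_by_mid[mid].add(dbr)
    return {mid: len(dbrs) for mid, dbrs in dbrs_by_mid.items()}


# Hypothetical pooled reads from two samples (MIDs "ACGT" and "TGCA").
reads = [
    ("ACGT", "AGCT", "TTGACC"),
    ("ACGT", "AGCT", "TTGACC"),  # amplification duplicate of the read above
    ("ACGT", "GGTA", "TTGACC"),
    ("TGCA", "CCAT", "TTGACC"),
]
counts = count_starting_molecules(reads)
```

Grouping by MID first mirrors the pooling step: reads from all samples arrive together, and the MID is what separates them again before the DBRs are counted.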
Abstract
Aspects of the present invention include methods and compositions for determining the number of individual polynucleotide molecules originating from the same genomic region of the same original sample that have been sequenced in a particular sequence analysis configuration or process. In these aspects of the invention, a degenerate base region (DBR) is attached to the starting polynucleotide molecules that are subsequently sequenced (e.g., after certain process steps are performed, e.g., amplification and/or enrichment). The number of different DBR sequences present in a sequencing run can be used to determine/estimate the number of different starting polynucleotides that have been sequenced. DBRs can be used to enhance numerous different nucleic acid sequence analysis applications, including allowing higher confidence allele call determinations in genotyping applications.
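As a hedged illustration of how DBR counts could inform an allele call, the sketch below tallies distinct DBRs per (locus, allele) pair for one sample. The tuple layout, the function name `dbr_support_per_allele`, and the example data are assumptions made for illustration only; the abstract does not prescribe a data format or a calling rule.

```python
from collections import defaultdict


def dbr_support_per_allele(reads):
    """Return {(locus, allele): number of distinct DBRs supporting it}.

    `reads` is an iterable of (locus, allele, dbr) tuples for a single
    sample. The layout is hypothetical.
    """
    support = defaultdict(set)
    for locus, allele, dbr in reads:
        support[(locus, allele)].add(dbr)
    return {key: len(dbrs) for key, dbrs in support.items()}


# Hypothetical reads: allele "A" has three reads but only one DBR,
# allele "G" has two reads carrying two different DBRs.
reads = [
    ("locus1", "A", "AGCT"),
    ("locus1", "A", "AGCT"),
    ("locus1", "A", "AGCT"),
    ("locus1", "G", "TTCA"),
    ("locus1", "G", "GGAC"),
]
support = dbr_support_per_allele(reads)
```

In this made-up example, read depth alone favors allele "A" (three reads against two), while the DBR counts indicate that "A" traces back to a single starting molecule and "G" to two.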
24 Claims
Claim 1 is the independent claim reproduced above under First Claim. Claims 2 through 24 depend, directly or indirectly, on claim 1.
Specification