Increased confidence of allele calls with molecular counting
First Claim
1. A method for assessing the presence of a genetic variation in a tagged viral sample, comprising:
(a) amplifying a population of target DNA molecules from the tagged viral sample, thereby producing a population of amplified target DNA molecules, wherein at least some of the target DNA molecules are tagged with different degenerate base region (DBR) sequences, wherein said DBR sequences comprises at least one nucleotide base selected from: R, Y, S, W, K, M, B, D, H, V, N and modified versions thereof and wherein each of the amplified target DNA molecules comprises an associated DBR sequence of said DBR sequences;
(b) sequencing at least some of the amplified target DNA molecules of step (a), thereby producing a plurality of sequence reads, wherein the sequencing step provides, for each of the amplified target DNA molecules that are sequenced: (i) the nucleotide sequence of a target DNA molecule in the amplified target DNA molecules and (ii) the nucleotide sequence of an associated DBR sequence of said DBR sequences;
(c) examining the sequence reads of step (b), thereby identifying a potential genetic variation; and
(d) assessing, using a computer, the presence of the genetic variation in said tagged viral sample, based on:
(i) a determination of the number of said different DBR sequences that are associated with said genetic variation; and
(ii) a determination of the number of said sequence reads that comprise each of the different DBR sequences that are associated with said genetic variation.
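The two determinations in step (d) can be illustrated with a minimal Python sketch. It assumes each sequence read has already been reduced to a hypothetical (DBR, allele) pair at a single position; the function name `tally_dbr_support` and the sample reads are illustrative assumptions, not part of the claim.

```python
from collections import Counter, defaultdict

def tally_dbr_support(reads):
    """Group reads by allele, then count reads per DBR for each allele.

    `reads` is an iterable of (dbr, allele) pairs: a hypothetical,
    simplified stand-in for the sequence reads of step (b).
    Returns {allele: Counter({dbr: read_count})}.
    """
    support = defaultdict(Counter)
    for dbr, allele in reads:
        support[allele][dbr] += 1
    return support

# Illustrative data only: six reads at one position, with "G" as the
# candidate variant allele.
reads = [
    ("ACGT", "A"), ("ACGT", "A"), ("TTGA", "A"),
    ("GGCA", "G"), ("GGCA", "G"), ("CATG", "G"),
]
support = tally_dbr_support(reads)
variant = support["G"]
num_dbrs = len(variant)        # (d)(i): distinct DBRs associated with the variant
reads_per_dbr = dict(variant)  # (d)(ii): reads comprising each of those DBRs
print(num_dbrs, reads_per_dbr)
```

Under these assumptions, a variant supported by many reads that all share one DBR would point to a single amplified starting molecule, while the same variant seen across several DBRs would point to several independent starting molecules.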
Abstract
Aspects of the present invention include methods and compositions for determining the number of individual polynucleotide molecules originating from the same genomic region of the same original sample that have been sequenced in a particular sequence analysis configuration or process. In these aspects of the invention, a degenerate base region (DBR) is attached to the starting polynucleotide molecules that are subsequently sequenced (e.g., after process steps such as amplification and/or enrichment are performed). The number of different DBR sequences present in a sequencing run can be used to determine or estimate the number of different starting polynucleotides that have been sequenced. DBRs can be used to enhance numerous different nucleic acid sequence analysis applications, including allowing higher confidence allele call determinations in genotyping applications.
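As a rough illustration of the counting idea in the abstract, the following Python sketch computes how many distinct sequences a degenerate DBR design can encode and how many distinct DBRs appear in a set of reads. The IUPAC code table is standard; the pattern `NNNNRY`, the function names, and the observed DBRs are hypothetical and chosen only for the example.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def dbr_capacity(pattern):
    """Number of distinct sequences a degenerate pattern can encode."""
    total = 1
    for code in pattern:
        total *= len(IUPAC[code])
    return total

def count_distinct_dbrs(observed):
    """Number of different DBR sequences seen among the reads."""
    return len(set(observed))

pattern = "NNNNRY"                  # hypothetical DBR design
capacity = dbr_capacity(pattern)    # 4 * 4 * 4 * 4 * 2 * 2
observed = ["ACGTAC", "ACGTAC", "TTGAGT", "CCATAC"]  # illustrative reads
distinct = count_distinct_dbrs(observed)
print(capacity, distinct)
```

In this simplified view, the distinct count serves as an estimate of the number of starting molecules sequenced. It ignores two effects a real analysis would have to handle: different molecules receiving the same DBR by chance, and sequencing errors within the DBR itself.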
21 Claims
17. A method for assessing the presence of a genetic variation in a tagged microbial sample, comprising:
(a) amplifying a population of target DNA molecules from the tagged microbial sample, thereby producing a population of amplified target DNA molecules, wherein at least some of the target DNA molecules are tagged with different degenerate base region (DBR) sequences, wherein said DBR sequences comprises at least one nucleotide base selected from: R, Y, S, W, K, M, B, D, H, V, N and modified versions thereof and wherein each of the amplified target DNA molecules comprises an associated DBR sequence of said DBR sequences;
(b) sequencing at least some of the amplified target DNA molecules of step (a), thereby producing a plurality of sequence reads, wherein the sequencing step provides, for each of the amplified target DNA molecules that are sequenced: (i) the nucleotide sequence of a target DNA molecule in the amplified target DNA molecules and (ii) the nucleotide sequence of an associated DBR sequence of said DBR sequences;
(c) examining the sequence reads of step (b), thereby identifying a potential genetic variation; and
(d) assessing, using a computer, the presence of the genetic variation in said tagged microbial sample, based on:
(i) a determination of the number of said different DBR sequences that are associated with said genetic variation; and
(ii) a determination of the number of said sequence reads that comprise each of the different DBR sequences that are associated with said genetic variation.
Specification