Increasing Confidence of Allele Calls with Molecular Counting
Abstract
Aspects of the present invention include methods and compositions for determining the number of individual polynucleotide molecules originating from the same genomic region of the same original sample that have been sequenced in a particular sequence analysis configuration or process. In these aspects of the invention, a degenerate base region (DBR) is attached to the starting polynucleotide molecules that are subsequently sequenced (e.g., after certain process steps are performed, e.g., amplification and/or enrichment). The number of different DBR sequences present in a sequencing run can be used to determine/estimate the number of different starting polynucleotides that have been sequenced. DBRs can be used to enhance numerous different nucleic acid sequence analysis applications, including allowing higher confidence allele call determinations in genotyping applications.
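The counting idea in the abstract can be illustrated with a minimal sketch. It assumes each sequence read has already been parsed into a (DBR sequence, genomic region) pair. The function name `count_starting_molecules` and the input layout are hypothetical and do not come from the patent. The number of distinct DBR sequences seen for a region serves as an estimate of how many starting molecules from that region were sequenced.

```python
from collections import defaultdict


def count_starting_molecules(reads):
    """Estimate the number of starting molecules per region.

    `reads` is an iterable of (dbr_sequence, region) pairs (assumed layout).
    Reads that share a DBR and a region are treated as copies of one
    starting molecule, so the estimate is the number of distinct DBRs.
    """
    dbrs_by_region = defaultdict(set)
    for dbr, region in reads:
        dbrs_by_region[region].add(dbr)
    return {region: len(dbrs) for region, dbrs in dbrs_by_region.items()}


# Illustrative data only: five reads from one region carrying three distinct DBRs.
example_reads = [
    ("ACGT", "region_1"),
    ("ACGT", "region_1"),
    ("TTAG", "region_1"),
    ("GGCA", "region_1"),
    ("GGCA", "region_1"),
]
estimates = count_starting_molecules(example_reads)
```

In this toy input, five reads collapse to three distinct DBRs, so the estimate for `region_1` is three starting molecules rather than five. The sketch ignores sequencing errors within the DBR and the chance that two molecules receive the same DBR, which is why the abstract speaks of a determination or an estimate.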
43 Claims
-
1-24. (canceled)
-
25. A method for determining the presence of an allele, comprising:
a) amplifying a population of initial target DNA molecules from a tagged genomic sample to produce a population of amplified target DNA molecules, wherein the initial target DNA molecules comprise a polymorphic target region and wherein each of the initial target DNA molecules that comprises said polymorphic target region is tagged with a different degenerate base region (DBR) sequence, wherein said DBR sequence comprises at least one nucleotide base selected from: R, Y, S, W, K, M, B, D, H, V, N and modified versions thereof;
b) sequencing a plurality of the amplified target DNA molecules, thereby producing a plurality of sequence reads, wherein the sequencing step provides, for each of the amplified target DNA molecules that are sequenced: (i) the nucleotide sequence of at least a portion of the polymorphic target region and (ii) a DBR sequence;
c) determining, for one allele of the polymorphic target region, the number of different DBR sequences that are associated with said allele;
d) determining, for said allele of the polymorphic target region, the number of sequence reads that comprise each of the different DBR sequences;
e) calculating the likelihood that said allele is present in said tagged genomic sample using the number of different sequences counted in step c) and the number of sequence reads counted in d); and
f) making an allele call based on the likelihood calculated in step e), wherein a higher likelihood increases the confidence of said allele call.
Dependent claims: 26-43.
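Steps c) through e) of claim 25 can be sketched as follows. The claim does not state a likelihood formula, so the binomial model here is purely an assumption for illustration: each distinct DBR is treated as one independent starting molecule, and the likelihood of the observed molecule counts is compared under three hypothetical genotypes. The names `dbr_support` and `allele_likelihoods` and the parameters `error_rate` and `min_reads` are also hypothetical.

```python
from collections import defaultdict
from math import comb


def dbr_support(reads):
    """Tally reads per allele and per DBR.

    `reads` is an iterable of (dbr_sequence, allele) pairs (assumed layout).
    Returns {allele: {dbr: read_count}}: the number of keys per allele is
    the count in step c), and the read counts are the counts in step d).
    """
    support = defaultdict(lambda: defaultdict(int))
    for dbr, allele in reads:
        support[allele][dbr] += 1
    return {allele: dict(dbrs) for allele, dbrs in support.items()}


def allele_likelihoods(support, allele, error_rate=0.01, min_reads=1):
    """Step e) under an ASSUMED binomial model (not taken from the claim).

    A DBR counts as a molecule only if at least `min_reads` reads carry it.
    Returns the likelihood of the molecule counts under three hypotheses.
    """
    def molecules(a):
        return sum(1 for n in support.get(a, {}).values() if n >= min_reads)

    k = molecules(allele)                   # molecules supporting the allele
    n = sum(molecules(a) for a in support)  # molecules across all alleles

    def binom(p):
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    return {
        "absent": binom(error_rate),           # allele seen only through errors
        "heterozygous": binom(0.5),            # allele on half of the molecules
        "homozygous": binom(1 - error_rate),   # allele on nearly all molecules
    }


# Illustrative data only: allele "A" on 2 distinct DBRs, allele "G" on 2.
example_reads = (
    [("AAC", "A")] * 3 + [("GTT", "A")] * 2
    + [("CCA", "G")] * 4 + [("TGA", "G")]
)
support = dbr_support(example_reads)
likelihoods = allele_likelihoods(support, "A")
```

An allele call in the sense of step f) could then pick the hypothesis with the highest likelihood. Raising `min_reads` is one possible way to use the per-DBR read counts from step d) to discard DBRs that are supported by too few reads.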
-
Specification