METHODS AND SYSTEMS FOR DETECTING GENETIC VARIANTS
First Claim
1. A method for estimating a total number of double-stranded deoxyribonucleic acid (DNA) molecules in a sample, comprising:
(a) determining a quantitative measure of individual DNA molecules for which both strands are detected;
(b) determining a quantitative measure of individual DNA molecules for which only one strand is detected;
(c) using said quantitative measures determined in (a) and (b) to estimate a total number of double-stranded DNA molecules in the sample, wherein said total number comprises individual DNA molecules for which neither DNA strand is detected.
Abstract
Disclosed herein are methods and systems for determining genetic variants (e.g., copy number variation) in a polynucleotide sample. A method for determining copy number variations includes tagging double-stranded polynucleotides with duplex tags, sequencing polynucleotides from the sample, and estimating the total number of polynucleotides mapping to selected genetic loci. The estimate of the total number of polynucleotides can involve estimating the number of double-stranded polynucleotides in the original sample for which no sequence reads are generated. This number can be generated using the number of polynucleotides for which reads for both complementary strands are detected and the number for which reads for only one of the two complementary strands are detected.
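The relationship between detected and undetected molecules can be illustrated with a simple model. As an assumption that is not stated in the abstract, suppose each strand of each of the N original molecules is detected independently with the same probability p. The symbols N, p, P (molecules with both strands detected), U (molecules with only one strand detected) and Z (molecules with neither strand detected) are introduced here for illustration only.

```latex
\begin{align*}
\mathbb{E}[P] &= N p^{2}, \qquad
\mathbb{E}[U] = 2 N p (1 - p), \qquad
\mathbb{E}[Z] = N (1 - p)^{2} \\
2P + U &\approx 2 N p
\;\Rightarrow\;
\hat{p} = \frac{2P}{2P + U} \\
\hat{N} &= \frac{P}{\hat{p}^{2}} = \frac{(2P + U)^{2}}{4P}, \qquad
\hat{Z} = \hat{N} - P - U = \frac{U^{2}}{4P}
\end{align*}
```

Under this model, P = 400 and U = 400 would give an estimated p of 2/3, an estimated N of 900, and an estimated Z of 100.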
30 Claims
1. A method for estimating a total number of double-stranded deoxyribonucleic acid (DNA) molecules in a sample, comprising:
(a) determining a quantitative measure of individual DNA molecules for which both strands are detected;
(b) determining a quantitative measure of individual DNA molecules for which only one strand is detected;
(c) using said quantitative measures determined in (a) and (b) to estimate a total number of double-stranded DNA molecules in the sample, wherein said total number comprises individual DNA molecules for which neither DNA strand is detected.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
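Steps (a) to (c) of claim 1 could be sketched in Python as follows. This is a minimal illustration, not the claimed method itself: it assumes that each strand is detected independently with the same probability, which the claim does not state, and the names `MoleculeEstimate` and `estimate_total_molecules` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoleculeEstimate:
    strand_detection_prob: float  # estimated per-strand detection probability p
    total: float                  # estimated total double-stranded molecules N
    unseen: float                 # estimated molecules with neither strand detected


def estimate_total_molecules(paired: int, unpaired: int) -> MoleculeEstimate:
    """Estimate the total number of double-stranded molecules in a sample.

    Assumption (not from the claim): each strand is detected independently
    with the same probability p, so that
        paired   ~ N * p**2           (both strands detected)
        unpaired ~ N * 2 * p * (1-p)  (exactly one strand detected)
    """
    if paired <= 0:
        raise ValueError("need at least one molecule with both strands detected")
    if unpaired < 0:
        raise ValueError("unpaired count cannot be negative")

    # 2*paired + unpaired counts detected strands, which is about 2*N*p.
    detected_strands = 2 * paired + unpaired
    p = 2 * paired / detected_strands
    total = detected_strands ** 2 / (4 * paired)
    unseen = unpaired ** 2 / (4 * paired)
    return MoleculeEstimate(p, total, unseen)
```

For instance, `estimate_total_molecules(400, 400)` yields an estimated total of 900 molecules, of which 100 are inferred to have had neither strand detected.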
9. A method, comprising:
(a) providing a sample comprising a set of double-stranded polynucleotide molecules, each double-stranded polynucleotide molecule including first and second complementary strands;
(b) tagging said double-stranded polynucleotide molecules with a set of duplex tags, wherein each duplex tag differently tags said first and second complementary strands of a double-stranded polynucleotide molecule in said set;
(c) sequencing at least some of said tagged strands to produce a set of sequence reads;
(d) reducing and/or tracking redundancy in said set of sequence reads;
(e) sorting sequence reads into paired reads and unpaired reads, wherein (i) each paired read corresponds to sequence reads generated from a first tagged strand and a second differently tagged complementary strand derived from a double-stranded polynucleotide molecule in said set, and (ii) each unpaired read represents a first tagged strand having no second differently tagged complementary strand derived from a double-stranded polynucleotide molecule represented among said sequence reads in said set of sequence reads;
(f) determining quantitative measures of at least two of (i) said paired reads, (ii) said unpaired reads that map to each of one or more genetic loci, (iii) read depth of said paired reads and (iv) read depth of said unpaired reads; and
(g) estimating with a programmed computer processor a quantitative measure of total double-stranded polynucleotide molecules in said set that map to each of said one or more genetic loci based on said quantitative measures of said at least two of (i) said paired reads, (ii) said unpaired reads mapping to each locus, (iii) said read depth of said paired reads and (iv) said read depth of said unpaired reads.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18
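Steps (d) to (f) of claim 9 could be sketched as below. This is an illustration under stated assumptions rather than the claimed implementation: the `Read` record, its fields, and the function `count_paired_unpaired` are hypothetical, and each read is assumed to already carry a molecule identifier and a strand label derived from its duplex tag.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Read(NamedTuple):
    molecule_id: str  # hypothetical: identifies the original duplex molecule
    strand: str       # hypothetical: "+" or "-", inferred from the duplex tag
    locus: str        # genetic locus the read maps to


def count_paired_unpaired(reads: Iterable[Read]) -> Dict[str, Dict[str, int]]:
    """Count paired and unpaired molecules per locus."""
    # Collapse redundant reads: a set keeps each strand once per molecule.
    strands_seen: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for read in reads:
        strands_seen[(read.locus, read.molecule_id)].add(read.strand)

    # A molecule is "paired" when both strand labels were observed.
    counts: Dict[str, Dict[str, int]] = {}
    for (locus, _molecule_id), strands in strands_seen.items():
        locus_counts = counts.setdefault(locus, {"paired": 0, "unpaired": 0})
        key = "paired" if {"+", "-"} <= strands else "unpaired"
        locus_counts[key] += 1
    return counts
```

The per-locus `paired` and `unpaired` counts produced here are the kind of quantitative measures that step (g) would take as input.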
19. A method for detecting copy number variation in deoxyribonucleic acid (DNA) molecules in a biological sample of a subject, comprising:
(a) attaching adapters to ends of fragments generated from said DNA molecules in said biological sample of said subject, wherein said adapters tag a 5′ end of a strand of an individual fragment among said fragments with a first tag and a 3′ end of a complementary strand of said individual fragment with a second tag, thereby providing tagged fragment molecules;
(b) sequencing at least a portion of each of said tagged fragment molecules to provide a plurality of sequencing reads;
(c) mapping said plurality of sequencing reads to a first genetic locus and at least one second genetic locus in a reference genome, wherein said first tag and said second tag are indicative of which strand of said tagged fragment molecules each of said plurality of sequencing reads is derived;
(d) using a programmed computer to determine a first total number of tagged fragment molecules for said first genetic locus and a second total number of tagged fragment molecules for said at least one second genetic locus, wherein each of said first total number and second total number is based on (i) a number of tagged fragments for which sequencing reads from both strands of said tagged fragment molecules are detected and (ii) a number of tagged fragment for which sequencing reads from only one strand of said tagged fragment molecules are detected, and wherein said first total number and said second total number comprises tagged fragment molecules for which neither strand was sequenced; and
(e) comparing said first total number of tagged fragment molecules to said second total number of tagged fragment molecules to determine copy number variation in said DNA molecules.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
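Steps (d) and (e) of claim 19 could be sketched as a comparison of estimated totals at two loci. The claim does not specify how the totals are computed or compared, so this sketch makes two assumptions: that strands are detected independently with equal probability, and that the comparison is a simple ratio. The names `estimated_total` and `copy_ratio` are hypothetical.

```python
from typing import Tuple


def estimated_total(paired: int, unpaired: int) -> float:
    """Estimated total molecules at a locus, including unseen ones.

    Assumes (not from the claim) independent, equal per-strand detection,
    which gives N = (2*paired + unpaired)**2 / (4*paired).
    """
    if paired <= 0:
        raise ValueError("need at least one molecule with both strands detected")
    if unpaired < 0:
        raise ValueError("unpaired count cannot be negative")
    return (2 * paired + unpaired) ** 2 / (4 * paired)


def copy_ratio(first: Tuple[int, int], second: Tuple[int, int]) -> float:
    """Ratio of estimated totals: first locus over second (reference) locus.

    Each argument is a (paired, unpaired) pair of molecule counts.
    """
    return estimated_total(*first) / estimated_total(*second)


if __name__ == "__main__":
    # Illustrative counts only, not data from the patent.
    ratio = copy_ratio(first=(300, 600), second=(400, 400))
    print(f"estimated copy ratio: {ratio:.3f}")
```

With these illustrative counts the estimated totals are 1200 and 900, a ratio of about 1.333, whereas the detected molecules alone (900 and 800) would give 1.125. The difference shows why accounting for molecules with neither strand sequenced can change the comparison when detection efficiency differs between loci.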
Specification