Systems and methods to detect rare mutations and copy number variation
Abstract
The present disclosure provides a system and method for the detection of rare mutations and copy number variations in cell free polynucleotides. Generally, the systems and methods comprise sample preparation, or the extraction and isolation of cell free polynucleotide sequences from a bodily fluid; subsequent sequencing of cell free polynucleotides by techniques known in the art; and application of bioinformatics tools to detect rare mutations and copy number variations as compared to a reference. The systems and methods also may contain a database or collection of different rare mutations or copy number variation profiles of different diseases, to be used as additional references in aiding detection of rare mutations, copy number variation profiling or general genetic profiling of a disease.
326 Citations
30 Claims
1. A method for detecting a presence or absence of one or more somatic genetic variants in cell-free deoxyribonucleic acid (cfDNA) molecules from a bodily fluid sample of a subject, comprising:
(a) non-uniquely tagging a plurality of cfDNA molecules from a population of cfDNA molecules obtained from the bodily fluid sample with molecular barcodes from a set of molecular barcodes to produce non-uniquely tagged parent polynucleotides, wherein the non-uniquely tagging comprises ligating molecular barcodes from the set of molecular barcodes to both ends of a cfDNA molecule from the plurality of cfDNA molecules using more than a 10× molar excess of molecular barcodes relative to the population of cfDNA molecules, wherein the cfDNA molecules that map to a mappable base position of a reference sequence are tagged with a number of different molecular barcodes ranging from at least 2 and fewer than a number of cfDNA molecules that map to the mappable base position, and wherein at least 20% of the cfDNA molecules from the population of cfDNA molecules are attached to molecular barcodes;
(b) amplifying a plurality of the non-uniquely tagged parent polynucleotides to produce progeny polynucleotides with associated molecular barcodes;
(c) sequencing a plurality of the progeny polynucleotides to produce sequencing reads of the progeny polynucleotides with associated molecular barcodes;
(d) mapping a plurality of the sequencing reads to the reference sequence to generate mapped sequencing reads;
(e) grouping a plurality of the mapped sequencing reads into a plurality of families based on sequence information from the molecular barcodes and at least (1) a start base position of a given mapped sequencing read from among the mapped sequencing reads at which the given mapped sequencing read is determined to start mapping to the reference sequence and/or (2) a stop base position of the given mapped sequencing read at which the given mapped sequencing read is determined to stop mapping to the reference sequence; and
(f) detecting, from among the mapped sequencing reads in a plurality of the families, the presence or absence of the one or more somatic genetic variants.
Dependent claims: 2-19.
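For illustration only, step (e) of claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the `MappedRead` record, the `group_into_families` helper, the barcode strings, and the coordinates are all hypothetical, and the family key here uses the barcode together with both the start and the stop position (the claim also permits using only one of the two).

```python
from collections import defaultdict
from typing import NamedTuple


class MappedRead(NamedTuple):
    """Hypothetical record for one mapped sequencing read."""
    barcode: str   # barcode sequence information read from the fragment ends
    start: int     # reference position at which the read starts mapping
    stop: int      # reference position at which the read stops mapping
    sequence: str  # aligned bases (assumed gap-free for simplicity)


def group_into_families(reads):
    """Group reads that share a barcode and the same start/stop positions.

    Because tagging is non-unique, the same barcode may appear on different
    parent molecules; combining it with mapping coordinates separates them.
    """
    families = defaultdict(list)
    for read in reads:
        families[(read.barcode, read.start, read.stop)].append(read)
    return dict(families)


# Made-up reads: the first two are assumed to be amplification copies of one
# parent molecule; the third reuses the same barcode at a different position.
reads = [
    MappedRead("ACGT+TTAG", 100, 104, "ACGTA"),
    MappedRead("ACGT+TTAG", 100, 104, "ACGTA"),
    MappedRead("ACGT+TTAG", 250, 254, "GGCTA"),
    MappedRead("CCAT+GGAC", 100, 104, "ACTTA"),
]

families = group_into_families(reads)
for key, members in sorted(families.items()):
    print(key, len(members))
```

In this toy input the barcode `ACGT+TTAG` occurs at two different mapping positions, so it yields two separate families even though the barcode itself is not unique.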
20. A method for quantifying single nucleotide variant tumor markers in cell-free deoxyribonucleic acid (cfDNA) molecules from a bodily fluid sample of a subject, comprising:
(a) non-uniquely tagging a plurality of cfDNA molecules from a population of cfDNA molecules obtained from the bodily fluid sample with molecular barcodes from a set of molecular barcodes to produce non-uniquely tagged parent polynucleotides, wherein the non-uniquely tagging comprises ligating molecular barcodes from the set of molecular barcodes to both ends of a cfDNA molecule from the plurality of cfDNA molecules using more than a 10× molar excess of molecular barcodes relative to the population of cfDNA molecules, wherein the cfDNA molecules that map to a mappable base position of a reference sequence are tagged with a number of different molecular barcodes ranging from at least 2 and fewer than a number of cfDNA molecules that map to the mappable base position, and wherein at least 20% of the cfDNA molecules from the population of cfDNA molecules are attached to molecular barcodes;
(b) amplifying a plurality of the non-uniquely tagged parent polynucleotides to produce progeny polynucleotides with associated molecular barcodes;
(c) sequencing at least a subset of the progeny polynucleotides to produce sequencing reads of the progeny polynucleotides with associated molecular barcodes;
(d) mapping a plurality of the sequencing reads to the reference sequence to generate mapped sequencing reads;
(e) grouping a plurality of the mapped sequencing reads into a plurality of families based on sequence information from the molecular barcodes and at least (1) a start base position of a given mapped sequencing read from among the mapped sequencing reads at which the given mapped sequencing read is determined to start mapping to the reference sequence and/or (2) a stop base position of the given mapped sequencing read at which the given mapped sequencing read is determined to stop mapping to the reference sequence;
(f) generating a consensus sequence for each family from among the plurality of families;
(g) identifying a subset of the plurality of consensus sequences comprising a single nucleotide variant as compared to the reference sequence; and
(h) calculating a number of the subset of the plurality of consensus sequences comprising the single nucleotide variant, thereby quantifying single nucleotide variant tumor markers in the cfDNA molecules from the bodily fluid sample of the subject.
Dependent claims: 21-30.
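For illustration only, steps (f) through (h) of claim 20 can be sketched in Python. The claim does not specify how a consensus sequence is generated, so this sketch assumes a simple per-position majority vote; the `consensus` and `count_snv_consensus` helpers, the toy reference, the 0-based coordinates, and the family contents are all hypothetical.

```python
from collections import Counter


def consensus(sequences):
    """Per-position majority vote over equal-length reads of one family.

    Assumption: majority voting stands in for whatever consensus method is
    actually used. Ties resolve to the base seen first in the family.
    """
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


def count_snv_consensus(families, reference, position, alt_base):
    """Count families whose consensus carries alt_base at a reference position.

    families maps (barcode, start, stop) -> list of read sequences, with
    0-based inclusive coordinates on `reference`.
    Returns (families carrying the variant, families covering the position).
    """
    variant, covering = 0, 0
    for (_barcode, start, stop), sequences in families.items():
        if not (start <= position <= stop):
            continue
        covering += 1
        base = consensus(sequences)[position - start]
        if base == alt_base and reference[position] != alt_base:
            variant += 1
    return variant, covering


reference = "ACGTACGTAC"  # toy reference; position 4 is 'A'
families = {
    # One read carries an isolated error at position 4; the vote removes it.
    ("AAT+GCA", 2, 6): ["GTACG", "GTACG", "GTTCG"],
    # Every read in these two families carries A>T at position 4.
    ("CCG+TTA", 2, 6): ["GTTCG", "GTTCG"],
    ("AAT+GCA", 3, 7): ["TTCGT", "TTCGT", "TTCGT"],
}

variant_count, covering_count = count_snv_consensus(families, reference, 4, "T")
print(variant_count, covering_count)
```

Counting consensus sequences rather than raw reads means that a family contributes one observation regardless of how many amplification copies were sequenced, and an error confined to a minority of reads in a family is voted out before counting.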
Specification