Systems and methods to detect rare mutations and copy number variation
First Claim
1. A system comprising:
a nucleic acid sequencer configured to sequence a nucleic acid library comprising non-uniquely tagged parent polynucleotides and non-uniquely tagged progeny polynucleotides derived therefrom, to generate sequencing reads corresponding to the non-uniquely tagged parent polynucleotides, wherein the non-uniquely tagged parent polynucleotides comprise nucleic acid molecules ligated to non-unique molecular barcodes, wherein the non-unique molecular barcodes have from 2 to 1000 different sequences and lengths of at least 5 nucleotides, wherein the sequencing reads comprise (1) nucleic acid sequences of the nucleic acid molecules, and (2) nucleic acid sequences of the non-unique molecular barcodes attached to the nucleic acid molecules;
a communication interface that receives, over a communication network, the sequencing reads generated by the nucleic acid sequencer; and
a computer in communication with the communication interface, wherein the computer comprises one or more computer processors and a computer-readable medium comprising machine-executable code that, upon execution by the one or more computer processors, implements a method comprising:
(i) receiving, over the communication network, the sequencing reads generated by the nucleic acid sequencer;
(ii) mapping the sequencing reads to one or more reference sequences from a human genome;
(iii) grouping the sequencing reads into a plurality of families, wherein a family of the plurality of families comprises sequencing reads comprising identical nucleic acid sequences of the non-unique molecular barcodes and having the same start or stop positions, wherein the family of the plurality of families comprises sequencing reads corresponding to a same original cell-free nucleic acid molecule, wherein the family of the plurality of families comprises sequencing reads of non-uniquely tagged progeny polynucleotides amplified from a unique polynucleotide among the non-uniquely tagged parent polynucleotides; and
(iv) generating a base call for at least the family of the plurality of families at a genetic locus of a plurality of genetic loci in the one or more reference sequences.
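Steps (iii) and (iv) can be illustrated with a minimal sketch. It is not the patented implementation; it only shows one plausible reading of the grouping and base-calling steps. The `MappedRead` record, its field names, the majority-vote rule, and the `min_fraction` threshold are all assumptions made for illustration. The claim requires the same start or stop positions; this sketch keys on both for simplicity, and it assumes gapless alignments so that a read's base at a locus can be found by offset from its start.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class MappedRead(NamedTuple):
    """A mapped sequencing read (hypothetical record, names are assumptions)."""
    barcode: str   # sequence of the non-unique molecular barcode
    start: int     # mapped start position on the reference (0-based)
    stop: int      # mapped stop position on the reference (exclusive)
    sequence: str  # read bases aligned gaplessly to start..stop


def group_into_families(reads):
    """Group reads that share a barcode sequence and start/stop positions."""
    families = defaultdict(list)
    for read in reads:
        families[(read.barcode, read.start, read.stop)].append(read)
    return dict(families)


def family_base_call(family, locus, min_fraction=0.5):
    """Majority-vote base call at `locus` for one family.

    Returns None when no read covers the locus, or when no base is
    seen in more than `min_fraction` of the covering reads.
    """
    bases = Counter(
        read.sequence[locus - read.start]
        for read in family
        if read.start <= locus < read.stop
    )
    if not bases:
        return None
    base, count = bases.most_common(1)[0]
    return base if count / sum(bases.values()) > min_fraction else None


# Toy input: three reads from one tagged molecule, one from another.
reads = [
    MappedRead("ACGTA", 100, 104, "ACGT"),
    MappedRead("ACGTA", 100, 104, "ACGT"),
    MappedRead("ACGTA", 100, 104, "ACTT"),  # differs at reference position 102
    MappedRead("TTGCA", 100, 104, "ACGT"),  # same positions, different barcode
]
families = group_into_families(reads)
calls = {key: family_base_call(fam, 102) for key, fam in families.items()}
```

In this toy input the barcode alone separates the two families even though all four reads map to the same positions. Within the first family, the single discordant read is outvoted at position 102.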
Abstract
The present disclosure provides a system and method for the detection of rare mutations and copy number variations in cell-free polynucleotides. Generally, the systems and methods comprise sample preparation, or the extraction and isolation of cell-free polynucleotide sequences from a bodily fluid; subsequent sequencing of cell-free polynucleotides by techniques known in the art; and application of bioinformatics tools to detect rare mutations and copy number variations as compared to a reference. The systems and methods may also include a database or collection of different rare mutations or copy number variation profiles of different diseases, to be used as additional references in aiding detection of rare mutations, copy number variation profiling or general genetic profiling of a disease.
28 Claims
Claim 1 (independent) is recited in full under First Claim above; claims 2 to 28 depend on it.
Specification