Detecting genetic copy number variation
First Claim
1. A method of detecting copy number variation (CNV) in the DNA of a plurality of patients, the method comprising:
receiving a plurality of samples, each sample containing DNA from a single patient;
from each sample, generating a plurality of fragments of DNA;
barcoding each of the fragments with an identifier that uniquely identifies the respective patient from whom the DNA was received;
pooling the plurality of samples into a DNA library;
subjecting the DNA library to one or more stages of filtering to increase the relative concentration of fragments within a plurality of selected regions of interest;
producing sequencing data for the plurality of patients by sequencing the filtered DNA library;
demultiplexing the sequencing data;
for each patient, generating coverage data by identifying, for each of the regions of interest, coverage of each region of interest in the sequencing data;
generating normalized coverage data from the coverage data;
generating reference coverage, common to all samples, for each region of interest, the generation of the reference coverage being based upon the normalized coverage data;
automatically detecting CNV for at least one subsequence of at least one of the regions of interest of at least one of the patients based upon comparing the reference coverage to the normalized coverage data; and
providing output that identifies the patient, the subsequence, and the CNV.
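The analysis steps of the claim (normalizing coverage, building a reference common to all samples, and comparing each patient against it) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the claim: normalization divides by each patient's total coverage, the reference is the per-region median across patients, the `loss` and `gain` thresholds are arbitrary, and each region of interest is treated as a single unit rather than being split into subsequences.

```python
from statistics import median


def normalize(coverage):
    """Scale each patient's per-region coverage by that patient's total coverage.

    Assumed normalization: the claim does not say how normalization is done.
    """
    normalized = {}
    for patient, regions in coverage.items():
        total = sum(regions.values())
        normalized[patient] = {name: count / total for name, count in regions.items()}
    return normalized


def reference_coverage(normalized):
    """Build one reference value per region, shared by all samples.

    Assumed statistic: the median of the normalized coverage across patients.
    """
    region_names = next(iter(normalized.values())).keys()
    return {
        name: median(patient[name] for patient in normalized.values())
        for name in region_names
    }


def detect_cnv(normalized, reference, loss=0.75, gain=1.25):
    """Flag regions whose ratio to the reference crosses a hypothetical threshold."""
    calls = []
    for patient, regions in normalized.items():
        for name, value in regions.items():
            if reference[name] == 0:
                continue  # no usable reference for this region
            ratio = value / reference[name]
            if ratio <= loss:
                calls.append((patient, name, "loss", round(ratio, 2)))
            elif ratio >= gain:
                calls.append((patient, name, "gain", round(ratio, 2)))
    return calls


# Hypothetical read counts per region of interest for three patients.
coverage = {
    "P1": {"exon1": 100, "exon2": 100, "exon3": 100, "exon4": 100},
    "P2": {"exon1": 200, "exon2": 200, "exon3": 200, "exon4": 200},
    "P3": {"exon1": 100, "exon2": 50, "exon3": 100, "exon4": 100},
}
normalized = normalize(coverage)
calls = detect_cnv(normalized, reference_coverage(normalized))
for patient, region, cnv, ratio in calls:
    print(patient, region, cnv, ratio)
```

With this toy input, patient P2 has twice the sequencing depth of P1 but the same normalized profile, so only the halved region in P3 is reported. Each reported tuple carries the patient, the region, and the call, mirroring the output step of the claim.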
Abstract
Embodiments of the invention include systems, apparatus, and methods for detecting copy number variation (CNV) in the genomes of one or more patients. Samples of DNA may be taken from several patients, and then sections of the patients' DNA may be sequenced, e.g., through a process that may include, for each patient, one or more of: purifying, concentrating, fragmenting, labeling, filtering, and amplifying that patient's DNA. Fragments from several patients may be pooled, and the fragments in the pool may be sequenced.
The sequencing data is then subjected to analysis, which includes several normalization steps. The normalized data are then examined to identify CNV, which is reported.
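Because fragments from several patients are pooled before sequencing, the reads have to be assigned back to patients before any per-patient analysis. The following is a minimal sketch of that assignment, assuming each read begins with a fixed-length barcode and that barcodes must match exactly; the barcode sequences, the patient labels, and the function name `demultiplex` are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_patient, barcode_len):
    """Group reads by patient using the barcode at the start of each read.

    Assumes exact barcode matches; reads with an unknown barcode are set aside.
    """
    per_patient = defaultdict(list)
    unassigned = []
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        patient = barcode_to_patient.get(barcode)
        if patient is None:
            unassigned.append(read)
        else:
            per_patient[patient].append(insert)
    return dict(per_patient), unassigned


# Hypothetical 4-base barcodes and pooled reads.
barcodes = {"ACGT": "P1", "TTAG": "P2"}
pooled = ["ACGTTTGACC", "TTAGGGCATA", "ACGTCCCGGG", "NNNNAAAA"]
by_patient, unassigned = demultiplex(pooled, barcodes, barcode_len=4)
print(by_patient)
print(unassigned)
```

Exact matching keeps the sketch short. Tolerating sequencing errors in the barcode would need a distance-based lookup, which is outside what the abstract describes.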
12 Claims
1. A method of detecting copy number variation (CNV) in the DNA of a plurality of patients, the method comprising:
receiving a plurality of samples, each sample containing DNA from a single patient;
from each sample, generating a plurality of fragments of DNA;
barcoding each of the fragments with an identifier that uniquely identifies the respective patient from whom the DNA was received;
pooling the plurality of samples into a DNA library;
subjecting the DNA library to one or more stages of filtering to increase the relative concentration of fragments within a plurality of selected regions of interest;
producing sequencing data for the plurality of patients by sequencing the filtered DNA library;
demultiplexing the sequencing data;
for each patient, generating coverage data by identifying, for each of the regions of interest, coverage of each region of interest in the sequencing data;
generating normalized coverage data from the coverage data;
generating reference coverage, common to all samples, for each region of interest, the generation of the reference coverage being based upon the normalized coverage data;
automatically detecting CNV for at least one subsequence of at least one of the regions of interest of at least one of the patients based upon comparing the reference coverage to the normalized coverage data; and
providing output that identifies the patient, the subsequence, and the CNV.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
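The coverage step recited in claim 1 can be pictured as counting, for one patient, how many aligned reads overlap each region of interest. This is a minimal sketch under stated assumptions: reads are already aligned and given as half-open `(chrom, start, end)` intervals, coverage is a simple overlap count, and the region names and coordinates are hypothetical. The claim itself does not fix how coverage is measured.

```python
def region_coverage(alignments, regions):
    """Count the aligned reads that overlap each region of interest.

    alignments: iterable of (chrom, start, end) half-open intervals for one patient.
    regions: mapping of region name -> (chrom, start, end), also half-open.
    """
    counts = {name: 0 for name in regions}
    for chrom, start, end in alignments:
        for name, (r_chrom, r_start, r_end) in regions.items():
            # Two half-open intervals overlap when each starts before the other ends.
            if chrom == r_chrom and start < r_end and end > r_start:
                counts[name] += 1
    return counts


# Hypothetical regions of interest and aligned reads for a single patient.
regions = {"roi_a": ("chr1", 100, 200), "roi_b": ("chrX", 500, 600)}
alignments = [
    ("chr1", 90, 140),    # overlaps roi_a
    ("chr1", 150, 250),   # overlaps roi_a
    ("chr1", 200, 260),   # starts exactly where roi_a ends: no overlap
    ("chrX", 590, 640),   # overlaps roi_b
]
print(region_coverage(alignments, regions))
```

The nested loop is quadratic and only suitable for small inputs. An interval index would be the usual choice for real sequencing data.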
12. A method of detecting copy number variation (CNV) in one or more regions of respective X chromosomes of a plurality of patients, the method comprising:
receiving a plurality of samples, each sample containing DNA from a single patient;
from each sample, generating a plurality of fragments of DNA;
barcoding each of the fragments with an identifier that uniquely identifies the respective patient from whom the DNA was received;
pooling the plurality of samples into a DNA library;
subjecting the DNA library to one or more stages of filtering to increase the relative concentration of fragments within a plurality of selected regions of interest, at least one of the selected regions of interest being a region known to exist within the X chromosome;
producing sequencing data for the plurality of patients by sequencing the filtered DNA library;
demultiplexing the sequencing data;
for each patient, generating coverage data by identifying, for each of the regions of interest, coverage of each region of interest in the sequencing data;
generating normalized coverage data from the coverage data;
generating reference coverage, common to all samples, for each region of interest, the generation of the reference coverage being based upon the normalized coverage data;
automatically detecting CNV for at least one subsequence of at least one of the regions of interest of at least one of the patients based upon comparing the reference coverage to the normalized coverage data, at least one of the at least one subsequences being within the region known to exist within the X chromosome; and
providing output that identifies the patient, at least one of the subsequences within the region of interest known to exist within the X chromosome, and the CNV of the at least one of the subsequences within the region of interest.
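Claim 12 differs from claim 1 mainly in requiring that at least one region of interest, and at least one reported subsequence, lie within the X chromosome. A minimal sketch of the final output step under that constraint follows. It assumes CNV calls have already been produced as `(patient, region name, call)` tuples and that each region carries a chromosome label; the `Region` type, the `"chrX"` label, the coordinates, and the tab-separated output format are all hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A region of interest with hypothetical genomic coordinates."""
    chrom: str
    start: int
    end: int


def x_chromosome_report(calls, regions):
    """Format the CNV calls that fall in regions annotated as X-chromosome.

    calls: iterable of (patient, region_name, cnv) tuples from an earlier step.
    regions: mapping of region name -> Region.
    """
    lines = []
    for patient, region_name, cnv in calls:
        region = regions[region_name]
        if region.chrom == "chrX":
            location = f"{region.chrom}:{region.start}-{region.end}"
            lines.append(f"{patient}\t{location}\t{cnv}")
    return lines


# Hypothetical regions and calls.
regions = {
    "roi_a": Region("chr1", 100, 200),
    "roi_b": Region("chrX", 500, 600),
}
calls = [("P1", "roi_a", "gain"), ("P3", "roi_b", "loss")]
for line in x_chromosome_report(calls, regions):
    print(line)
```

Each output line names the patient, the location within the X-chromosome region, and the CNV call, which are the three items the output step of the claim requires.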
Specification