
Detecting fetal sub-chromosomal aneuploidies

  • US 10,318,704 B2
  • Filed: 05/29/2015
  • Issued: 06/11/2019
  • Est. Priority Date: 05/30/2014
  • Status: Active Grant
First Claim

1. A method, implemented at a computer system that includes one or more processors and system memory, for evaluation of copy number of a sequence of interest in a test sample comprising nucleic acids, the method comprising:

    (a) receiving, by the computer system, sequence reads obtained by sequencing DNA in the test sample;

    (b) aligning, by the computer system, the sequence reads of the test sample to a reference genome comprising the sequence of interest, thereby providing test sequence tags, wherein the reference genome is divided into a plurality of bins, wherein the sequence of interest is in a sub-chromosomal genomic region in which a copy number variation is associated with a genetic syndrome;

    (c) determining, by the computer system, coverages of the test sequence tags for the bins in the reference genome including the sequence of interest;

    (d) adjusting, by the computer system, the coverages of the test sequence tags for the bins in the reference genome by employing expected coverages for the bins obtained from a subset of a training set of unaffected training samples sequenced and aligned in substantially the same manner as the test sample, wherein the expected coverages for the bins in the reference genome were obtained by:

    (i) selecting a plurality of bins outside the sequence of interest, wherein each selected bin has a correlation in coverage meeting a first criterion with a bin in the sequence of interest, and wherein the first criterion excludes one or more bins outside the sequence of interest from being selected,

    (ii) selecting training samples from the training set to form the subset of the training set, wherein the selected training samples have correlations meeting a second criterion with each other in their coverages in the plurality of bins outside the sequence of interest, and wherein the second criterion excludes one or more training samples from being selected, and

    (iii) obtaining the expected coverages for the bins in the reference genome based on the subset of the training set's coverages in the bins in the reference genome; and

    (e) making, by the computer system, a call of the copy number variation of the sequence of interest in the test sample based on the adjusted coverages from (d).
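Steps (c) through (e) of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not specify the correlation measure, the two selection criteria, how expected coverages are aggregated, how coverages are adjusted, or the calling statistic. The sketch assumes Pearson correlation with fixed thresholds, a median across the selected training samples, a simple ratio adjustment, and a z-score call. All function names, parameters, and cutoffs are hypothetical, and training coverages are assumed to be depth-normalized already, shaped as samples by bins.

```python
import numpy as np


def bin_coverages(tag_positions, genome_length, bin_size):
    """Step (c): count aligned tag positions per fixed-width bin.
    Assumes every position is smaller than genome_length."""
    n_bins = -(-genome_length // bin_size)  # ceiling division
    idx = np.asarray(tag_positions, dtype=np.int64) // bin_size
    return np.bincount(idx, minlength=n_bins)


def select_bins(train, roi_bins, min_corr):
    """Step (d)(i): keep bins outside the region of interest whose coverage
    across training samples has Pearson correlation >= min_corr with at
    least one bin inside the region (assumed first criterion)."""
    corr = np.corrcoef(train, rowvar=False)  # bins x bins
    outside = np.setdiff1d(np.arange(train.shape[1]), roi_bins)
    best = corr[np.ix_(outside, roi_bins)].max(axis=1)
    return outside[best >= min_corr]


def select_samples(train, bins, min_corr):
    """Step (d)(ii): keep training samples whose coverage profile over the
    selected bins has a median pairwise correlation >= min_corr with the
    other samples (assumed second criterion)."""
    corr = np.corrcoef(train[:, bins])  # samples x samples
    np.fill_diagonal(corr, np.nan)      # ignore self-correlation
    return np.flatnonzero(np.nanmedian(corr, axis=1) >= min_corr)


def expected_coverages(train, samples):
    """Step (d)(iii): per-bin median over the selected training samples
    (the median is an assumption; the claim only says 'based on')."""
    return np.median(train[samples], axis=0)


def adjust(test_cov, expected):
    """Step (d): ratio of observed to expected coverage per bin.
    Assumes every expected coverage is strictly positive."""
    return np.asarray(test_cov, dtype=float) / np.asarray(expected, dtype=float)


def call_copy_number(adjusted, roi_bins, z_cutoff=3.0):
    """Step (e): z-score of the mean adjusted coverage in the region of
    interest against the bins outside it (assumed calling rule)."""
    outside = np.setdiff1d(np.arange(adjusted.size), roi_bins)
    mu = adjusted[outside].mean()
    se = adjusted[outside].std(ddof=1) / np.sqrt(len(roi_bins))
    z = (adjusted[roi_bins].mean() - mu) / se
    if z >= z_cutoff:
        return "gain", z
    if z <= -z_cutoff:
        return "loss", z
    return "no call", z
```

In this sketch the functions would be chained in claim order: `select_bins`, then `select_samples` on the bins it returns, then `expected_coverages`, `adjust`, and `call_copy_number`. Both selection helpers can exclude candidates, mirroring the claim's requirement that each criterion excludes one or more bins or training samples.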
