Determination of copy number variations using binomial probability calculations

US 8,712,697 B2
Filed: 09/06/2012
Issued: 04/29/2014
Est. Priority Date: 09/07/2011
Status: Active Grant

Abstract

This invention relates to a binomial calculation of copy number of data obtained from a mixed sample having a first source and a second source.

202 Citations

7 Claims

1. A computer-implemented process for determining copy number variation (CNV) of one or more genomic regions in a single source in a mixed sample, wherein at least one processor coupled to a memory executes a software component that performs the process comprising:
- accessing by the software component a first data set comprising frequency data based on identification of distinguishing regions of two or more informative loci from a first source in the single source in the mixed sample;
- accessing by the software component a second data set comprising frequency data based on identification of distinguishing regions of two or more informative loci from a second source in the single source in the mixed sample;
- calculating by the software component an estimated source contribution of cell free nucleic acids based on a binomial distribution of counts of the distinguishing regions from first and second data sets;
- accessing by the software component a third data set comprising frequency data for one or more genomic regions from the single source in the mixed sample; and
- calculating by the software component a presence or absence of a CNV for the one or more genomic regions by comparison of the frequency data from the single source to the estimated contribution of cell free nucleic acids in the mixed sample.

2. The process of claim 1, wherein the CNV is calculated based on empirical frequency data for the one or more genomic regions from the single source in the mixed sample.
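
The estimation step recited in claim 1 (a source contribution calculated from a binomial distribution of counts at informative loci) can be illustrated with a minimal Python sketch. The specification is not reproduced on this page, so everything below is an assumption for illustration and not the patented method: the function names are hypothetical, the read counts are made up, and the model assumes each informative locus is homozygous in the major source and heterozygous in the minor source, so that the minor-source allele is expected in half of the minor source's share of reads.

```python
import math


def binom_logpmf(k, n, p):
    """Return log P(K = k) for K ~ Binomial(n, p)."""
    if p <= 0.0:
        return 0.0 if k == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if k == n else -math.inf
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log1p(-p)


def log_likelihood(fraction, minor_counts, total_counts):
    """Joint binomial log-likelihood of the counts for a given minor-source fraction.

    Assumption: minor_counts[i] ~ Binomial(total_counts[i], fraction / 2).
    """
    p = fraction / 2.0
    return sum(binom_logpmf(k, n, p) for k, n in zip(minor_counts, total_counts))


def estimate_minor_fraction(minor_counts, total_counts):
    """Maximum-likelihood estimate of the minor-source fraction under the assumed model.

    With a shared success probability p = fraction / 2, the MLE of p is the
    pooled proportion, so the fraction estimate is twice that proportion.
    """
    k_total = sum(minor_counts)
    n_total = sum(total_counts)
    if n_total == 0:
        raise ValueError("no reads at informative loci")
    return min(1.0, 2.0 * k_total / n_total)


# Hypothetical counts at four informative loci (illustrative values only).
minor_counts = [52, 47, 55, 46]          # reads carrying the minor-source allele
total_counts = [1000, 1000, 1000, 1000]  # all reads at each locus
fraction = estimate_minor_fraction(minor_counts, total_counts)
print(f"estimated minor-source fraction: {fraction:.3f}")
```

The closed-form estimate is used because, under the assumed model, all loci share one success probability; `log_likelihood` is included only so the estimate can be compared against other candidate fractions.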

3. A computer-implemented process for determining copy number variation (CNV) of one or more genomic regions in a single source in a mixed sample, wherein at least one processor coupled to a memory executes a software component that performs the process, the process comprising:
- accessing by the software component a first data set comprising frequency data for one or more informative loci from a maternal source in the mixed sample;
- accessing by the software component a second data set comprising frequency data for one or more informative loci from a fetal source in the mixed sample;
- calculating by the software component an estimated fetal source contribution of cell free nucleic acids based on a binomial distribution of the counts of distinguishing regions from first and second data sets;
- accessing by the software component a third data set comprising frequency data for one or more genomic regions from the single source in the mixed sample; and
- calculating by the software component the presence or absence of a CNV for the one or more genomic regions in the fetus by comparison of the frequency of the genomic regions from the single source to the estimated fetal source contribution of cell free nucleic acids in the mixed sample.

4. An executable software product stored on a non-transitory computer-readable medium containing program instructions, which when executed by a computer directs performance of steps for estimating copy number variation (CNV) of one or more genomic regions in a mixed sample, the steps comprising:
- accessing by the software component a first data set comprising frequency data based on identification of distinguishing regions from copies of one or more informative loci from a first source;
- accessing by the software component a second data set comprising frequency data based on identification of distinguishing regions from copies of one or more informative loci from a second source;
- calculating by the software component an estimated source contribution of cell free nucleic acids based on a binomial distribution of the first and second data sets; and
- calculating by the software component the CNV for the one or more genomic regions by comparison of the at least one of the first source and the second source counts for the genomic region to the estimated source contribution of cell free nucleic acids from the at least one of the first source and the second source.

5. The process of claim 4, wherein the CNV is calculated based on empirical frequency data for the one or more genomic regions from the single source in the mixed sample.

6. A system, comprising:
- a memory;
- a processor coupled to the memory; and
- a software component executed by the processor that is configured to:
  - access a first data set comprising frequency data based on identification of distinguishing regions from copies of one or more informative loci from a first source in a mixed sample;
  - access a second data set comprising frequency data based on identification of distinguishing regions from copies of one or more informative loci from a second source in the mixed sample;
  - calculate an estimated contribution of cell free nucleic acids from at least one of the first source and the second source based on a binomial distribution of counts of the distinguishing regions from the first and second data sets; and
  - calculate a copy number variation for one or more genomic regions of a single source in the mixed sample by comparison of the frequency data for the one or more genomic regions to the estimated source contribution of cell free nucleic acids in the mixed sample.

7. A computer software product including a non-transitory computer-readable storage medium having fixed therein a sequence of instructions which when executed by a computer directs performance of steps of:
- creating a first data set representing a quantity of informative loci from a first source in a mixed sample;
- creating a second data set representing a quantity of informative loci from a second source in the mixed sample;
- calculating an estimated source contribution of cell free nucleic acids from the first source and the second source in the mixed sample based on a binomial distribution of the quantities of informative loci from the first and second data sets; and
- calculating a presence or absence of a copy number variation for a genomic region by comparison of the quantity of one or more informative loci from the first and second data sets to the estimated source contribution of cell free nucleic acids in the mixed sample.
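
The final step shared by the independent claims (comparing frequency data for a genomic region against the estimated source contribution) might be sketched as follows. This is a hedged illustration only, since the specification is not shown here: the function names, the candidate copy numbers, the `baseline_share` parameter, and the example counts are all hypothetical. The sketch assumes the major source carries two copies of the region, the minor (for example fetal) source carries 1, 2, or 3 copies, and reads split between the region and a two-copy reference according to a binomial distribution.

```python
import math


def binom_logpmf(k, n, p):
    """Return log P(K = k) for K ~ Binomial(n, p), with 0 < p < 1."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log1p(-p)


def expected_region_share(fraction, minor_copies, baseline_share=0.5):
    """Expected share of reads from the region of interest (region vs. reference).

    Assumption: the major source has 2 copies, the minor source has
    `minor_copies`, and `baseline_share` is the share seen when both
    sources have 2 copies.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Relative dosage of the region compared with the all-two-copies case.
    dosage = ((1.0 - fraction) * 2 + fraction * minor_copies) / 2.0
    odds = dosage * baseline_share / (1.0 - baseline_share)
    return odds / (1.0 + odds)


def classify_minor_copies(region_count, reference_count, fraction,
                          candidates=(1, 2, 3)):
    """Return the candidate minor-source copy number with the highest binomial likelihood."""
    n = region_count + reference_count
    return max(
        candidates,
        key=lambda c: binom_logpmf(region_count, n,
                                   expected_region_share(fraction, c)),
    )


# Hypothetical read counts with an estimated minor-source fraction of 10%.
copies = classify_minor_copies(region_count=51200, reference_count=48800,
                               fraction=0.10)
print("most likely minor-source copy number:", copies)
print("CNV present" if copies != 2 else "CNV absent")
```

Picking the most likely candidate is the simplest possible decision rule; a practical caller would presumably also require a minimum likelihood ratio before reporting a CNV, and that threshold is not specified in the claims.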

Current Assignee: Roche Molecular Systems (Roche Holding AG)
Original Assignee: Ariosa Diagnostics Incorporated (Roche Holding AG)
Inventors: Struble, Craig; Stuelpnagel, John
Primary Examiner(s): SIMS, JASON M
Application Number: US13/605,505
Publication Number: US 20130060483A1
Time in Patent Office: 600 Days
Field of Search: None
US Class Current: 702/19
CPC Class Codes:
- G16B 20/00   ICT specially adapted for f...
- G16B 20/10   Ploidy or copy number detec...
- G16B 20/20   Allele or variant detection...
- G16B 20/40   Population genetics; Linkag...
- G16B 30/00   ICT specially adapted for s...
- G16B 5/20   Probabilistic models
