Methods for identifying DNA copy number changes

  • US 7,822,555 B2
  • Filed: 12/05/2005
  • Issued: 10/26/2010
  • Est. Priority Date: 11/11/2002
  • Status: Active Grant
First Claim

1. A method of estimating in a sample the copy number of a plurality of genomic regions in a genome, wherein each genomic region contains at least one single nucleotide polymorphism (SNP) from a plurality of SNPs, wherein each SNP in the plurality has an A and a B allele in a population, said method comprising:

  • (a) genotyping the sample using a high density genotyping array comprising a plurality of perfect match and mismatch probes for the A allele of each SNP in the plurality of SNPs (PMA and MMA) and a plurality of perfect match and mismatch probes for the B allele (PMB and MMB) to obtain a raw intensity measurement for each PMA, MMA, PMB and MMB probe for each SNP in the plurality of SNPs, and to obtain a genotyping call for each SNP in the plurality of SNPs;

    (b) transforming each raw intensity measurement to its natural log to obtain a transformed intensity value for each probe;

    (c) normalizing the transformed intensity values using the MMB transformed intensity values for all SNPs from the plurality of SNPs that are called BB in the sample to obtain normalized PMA intensities;

    (d) normalizing each PMB transformed intensity value using the MMA transformed intensities for all SNPs from the plurality that are called AA in the sample to obtain normalized PMB intensities;

    (e) using a plurality of reference samples, identifying a set of PMA probes and a set of PMB probes for each SNP in the plurality of SNPs that show linear correlation between copy number and intensity;

    (f) calculating for each SNP in the plurality of SNPs an average of the PMA probes in the set of PMA probes and an average of the PMB probes in the set of PMB probes to obtain a PMA average intensity and a PMB average intensity for each SNP in the plurality of SNPs;

    (g) performing linear regression against a model equation derived from a plurality of reference samples to obtain an estimated A allele copy number and an estimated B allele copy number for each SNP in the plurality of SNPs;

    (h) adding the estimated A allele copy number to the estimated B allele copy number to obtain an estimated total copy number of the genomic region of each SNP in the plurality of SNPs, thereby calculating an estimated total copy number for each of a plurality of genomic regions in a genome; and

    (i) applying regression tree analysis to the estimated total copy numbers obtained in (h) to partition the genome into genomic regions having the same estimated total copy number, wherein steps (b)-(i) are performed by a computer and wherein the computer outputs the estimated total copy number of a plurality of genomic regions in a computer readable format.
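
The computational steps (b) through (i) can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not give the normalization formula, the form of the model equation, or the tree-splitting rule, so the sketch assumes a z-score normalization against the mismatch background, a straight-line model (intensity = intercept + slope × copy number) fitted to reference samples, and a one-dimensional regression tree that splits on SNP order to minimize squared error. All function names, the `min_gain` threshold, and the example numbers are hypothetical.

```python
import math
from statistics import mean, pstdev

def log_transform(raw):
    """Step (b): natural log of each raw probe intensity."""
    return [math.log(x) for x in raw]

def normalize(values, background):
    """Steps (c)/(d): scale by mismatch background values (z-score form is an assumption)."""
    mu, sigma = mean(background), pstdev(background)
    return [(v - mu) / sigma for v in values]

def fit_line(copy_numbers, intensities):
    """Reference model (assumed linear): least-squares intercept and slope."""
    mx, my = mean(copy_numbers), mean(intensities)
    sxy = sum((x - mx) * (y - my) for x, y in zip(copy_numbers, intensities))
    sxx = sum((x - mx) ** 2 for x in copy_numbers)
    slope = sxy / sxx
    return my - slope * mx, slope

def estimate_copy_number(avg_intensity, intercept, slope):
    """Step (g): invert the reference model for one allele of one SNP."""
    return (avg_intensity - intercept) / slope

def sse(vals):
    """Sum of squared deviations from the mean."""
    m = mean(vals)
    return sum((v - m) ** 2 for v in vals)

def segment(values, start=0, min_gain=0.5):
    """Step (i): recursive binary splits over SNP order.

    Returns (first_index, end_index_exclusive, mean_copy_number) tuples.
    """
    best_gain, best_k = 0.0, None
    total = sse(values)
    for k in range(1, len(values)):
        gain = total - sse(values[:k]) - sse(values[k:])
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None or best_gain < min_gain:
        return [(start, start + len(values), mean(values))]
    return (segment(values[:best_k], start, min_gain)
            + segment(values[best_k:], start + best_k, min_gain))

# Hypothetical usage: per-SNP lists of normalized PMA / PMB probe intensities.
intercept, slope = fit_line([0, 1, 2], [0.0, 1.0, 2.0])   # made-up reference data
pma_probes = [[1.0, 1.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0]]
pmb_probes = [[1.0, 1.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0]]
totals = [
    # step (f): average probes per allele; step (h): add A and B estimates
    estimate_copy_number(mean(a), intercept, slope)
    + estimate_copy_number(mean(b), intercept, slope)
    for a, b in zip(pma_probes, pmb_probes)
]
regions = segment(totals)
```

In this sketch `regions` holds one tuple per run of consecutive SNPs that the tree assigns a common mean total copy number. A production analysis would more likely use an established regression-tree or segmentation library, and would select the probe sets of step (e) from real reference data rather than using every probe.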
