GENOMIC CLASSIFICATION OF COLORECTAL CANCER BASED ON PATTERNS OF GENE COPY NUMBER ALTERATIONS

  • US 20100145894A1
  • Filed: 10/28/2009
  • Published: 06/10/2010
  • Est. Priority Date: 10/31/2008
  • Status: Active Grant
First Claim

1. A method for obtaining a database of colorectal cancer genomic subgroups, the method comprising the steps of:

    (a) obtaining a plurality of m samples comprising at least one CRC cell, wherein the samples comprise cell lines or tumors;

    (b) acquiring a data set comprising copy number alteration information from at least one locus from each chromosome from each sample obtained in step (a);

    (c) identifying in the data set samples contaminated by normal cells and eliminating the contaminated samples from the data set, wherein the identifying and eliminating comprises:

        (1) applying a machine learning algorithm tuned to parameters that represent the differences between tumor and normal samples to the data;

        (2) assigning a probability score for normal cell contamination to each sample as determined by the machine learning algorithm;

        (3) eliminating data from the data set for each sample scoring 50% or greater probability of containing normal cells;

    (d) estimating a number of subgroups, r, in the data set by applying an unsupervised clustering algorithm using a Pearson linear dissimilarity algorithm to the data set;

    (e) assigning each sample in the data set to at least one cluster using a modified genomic Non-negative Matrix Factorization (gNMF) algorithm, wherein the modified gNMF algorithm comprises:

        (1) calculating divergence of the algorithm after every 100 steps of multiplicative updating using formula (11);
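
The contamination filter of step (c) can be illustrated with a minimal sketch. The claim does not name the machine learning algorithm, so the scoring function here is a hypothetical placeholder (`toy_score`), and the names `drop_contaminated` and `threshold` are likewise assumptions made for illustration. Only the 50% cutoff comes from the claim text.

```python
from typing import Callable, Dict, Sequence


def drop_contaminated(
    samples: Dict[str, Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
    threshold: float = 0.5,
) -> Dict[str, Sequence[float]]:
    """Return only the samples whose contamination score is below threshold.

    score_fn is assumed to return the probability (0.0 to 1.0) that a sample
    contains normal cells, as in step (c)(2).
    """
    kept = {}
    for name, profile in samples.items():
        probability = score_fn(profile)
        # Step (c)(3): a score of 50% or greater eliminates the sample.
        if probability < threshold:
            kept[name] = profile
    return kept


def toy_score(profile: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained classifier (NOT the patented model).

    Flat copy number profiles (values near zero) are treated as more likely
    to be dominated by normal cells.
    """
    mean_abs = sum(abs(x) for x in profile) / len(profile)
    return 1.0 / (1.0 + 10.0 * mean_abs)


# Invented example data: log-ratio style copy number values per locus.
samples = {
    "tumor_A": [0.8, -0.6, 0.5],
    "mixed_B": [0.05, 0.0, -0.05],
}
clean = drop_contaminated(samples, toy_score)
```

In practice `score_fn` would wrap whatever trained classifier produces the per-sample probability. The filter itself only needs that probability and the cutoff.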
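
Step (d) can be sketched in a similar way. This excerpt does not define "Pearson linear dissimilarity" or name the clustering algorithm. The sketch below assumes the common definition d = (1 - rho) / 2, where rho is the Pearson correlation, and uses average-linkage agglomerative clustering as one possible unsupervised method. The function names and the `cut` parameter are illustrative assumptions.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def pearson_dissimilarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed definition: (1 - rho) / 2, ranging from 0 (rho = 1) to 1 (rho = -1)."""
    return (1.0 - pearson(x, y)) / 2.0


def count_clusters(profiles: List[Sequence[float]], cut: float) -> int:
    """Average-linkage agglomerative clustering, stopped at a dissimilarity cut.

    Returns the number of clusters remaining once the closest pair of
    clusters is farther apart than cut. That count is one rough estimate of r.
    """
    d = [[pearson_dissimilarity(a, b) for b in profiles] for a in profiles]
    clusters = [[i] for i in range(len(profiles))]

    def linkage(c1: List[int], c2: List[int]) -> float:
        return sum(d[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > cut:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return len(clusters)


# Invented example: two pairs of samples with opposite copy number patterns.
profiles = [
    [1.0, 0.9, -0.8, -1.0],
    [0.9, 1.0, -1.0, -0.7],
    [-1.0, -0.8, 0.9, 1.0],
    [-0.9, -1.0, 1.0, 0.8],
]
estimated_r = count_clusters(profiles, cut=0.25)
```

The value of `cut` is arbitrary here. A real analysis would more likely inspect the full dendrogram to pick a plausible range for r rather than rely on a single threshold.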
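
Step (e)(1) refers to formula (11), which is not reproduced in this excerpt. The sketch below therefore substitutes the standard generalized Kullback-Leibler divergence and the matching Lee-Seung multiplicative updates for non-negative matrix factorization; it is not the patent's modified gNMF. It shows only the idea of evaluating the divergence once every 100 update steps. The function names, the random initialization, and the cluster assignment by largest scaled coefficient are all assumptions.

```python
import math
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH); an assumed stand-in for formula (11)."""
    total = 0.0
    for v_row, wh_row in zip(V, WH):
        for v, wh in zip(v_row, wh_row):
            if v > 0:
                total += v * math.log(v / (wh + eps))
            total += wh - v
    return total


def nmf_with_divergence_log(V, r, steps=500, check_every=100, seed=0):
    """Factor V (loci x samples) into W (loci x r) and H (r x samples).

    The divergence is recorded only after every check_every update steps.
    """
    rng = random.Random(seed)
    n, m, eps = len(V), len(V[0]), 1e-12
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(r)]
    history = []

    def ratio(W, H):
        WH = matmul(W, H)
        return [[V[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]

    for step in range(1, steps + 1):
        Q = ratio(W, H)  # multiplicative update of H
        H = [[H[a][j] * sum(W[i][a] * Q[i][j] for i in range(n))
              / (sum(W[i][a] for i in range(n)) + eps)
              for j in range(m)] for a in range(r)]
        Q = ratio(W, H)  # multiplicative update of W
        W = [[W[i][a] * sum(H[a][j] * Q[i][j] for j in range(m))
              / (sum(H[a]) + eps)
              for a in range(r)] for i in range(n)]
        if step % check_every == 0:
            history.append(kl_divergence(V, matmul(W, H)))
    return W, H, history


def assign_clusters(W, H):
    """Assign each sample (column of H) to the component with the largest
    coefficient, after scaling by the column sums of W (an assumed rule)."""
    r = len(H)
    scale = [sum(row[a] for row in W) for a in range(r)]
    return [max(range(r), key=lambda a: H[a][j] * scale[a]) for j in range(len(H[0]))]


# Invented example: 4 loci x 4 samples with two obvious sample groups.
V = [
    [5.0, 5.0, 0.1, 0.1],
    [4.0, 5.0, 0.1, 0.2],
    [0.1, 0.2, 5.0, 4.0],
    [0.1, 0.1, 4.0, 5.0],
]
W, H, history = nmf_with_divergence_log(V, r=2)
labels = assign_clusters(W, H)
```

Recording the divergence only at intervals keeps the cost of the check small relative to the updates. This excerpt does not state what the claimed method does with the recorded values, so the sketch simply returns them.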
