METHODS FOR ASSEMBLING PANELS OF CANCER CELL LINES FOR USE IN TESTING THE EFFICACY OF ONE OR MORE PHARMACEUTICAL COMPOSITIONS

US 20100144554A1
Filed: 10/28/2009
Published: 06/10/2010
Est. Priority Date: 10/31/2008
Status: Active Grant

Abstract

The present invention relates to algorithms for use in defining genomic subgroups of tumors and cancer cell lines. The present invention also relates to methods for assembling panels of tumors and cancer cell lines according to genomic subgroups for use in testing the efficacy of one or more pharmaceutical compounds in the treatment of subjects suffering from at least one cancer.

8 Claims

1. An algorithm for use in clustering tumors and cell lines to define genomic subgroups, the method comprising the steps of:
- (a) obtaining a plurality of m samples comprising at least one tumor or cancer cell line;
  
  (b) acquiring a data set comprising copy number alteration information from at least one locus from each chromosome from each sample obtained in step (a);
  
  (c) identifying in the data set, copy number alteration information obtained from samples contaminated by normal cells and eliminating the contaminated samples from the data set, wherein the identifying and eliminating comprises;
  
  (1) applying a machine learning algorithm tuned to parameters that represent the differences between tumor and normal samples to the data;
  
  (2) assigning a probability score for normal cell contamination to each sample as determined by the machine learning algorithm;
  
  (3) eliminating data from the data set for each sample scoring 50% or greater probability of containing normal cells;
  
  (d) estimating a number of subgroups, r, in the data set by applying an unsupervised clustering algorithm using Pearson linear dissimilarity algorithm to the data set;
  
  (e) assigning each sample in the data set to at least one cluster using a modified genomic non-negative matrix factorization (gNMF) algorithm, wherein the modified gNMF algorithm comprises;
  
  (1) calculating divergence of the algorithm after every 100 steps of multiplicative updating using the formula (1);
- Dependent claims (2, 3, 4, 5)
  - 2. The algorithm of claim 1, wherein the unsupervised clustering algorithm is a hierarchical clustering.
  - 3. The algorithm of claim 1, wherein Cophenetic correlation is used to provide a final number of clusters from the data set.
  - 4. The algorithm of claim 1, wherein Bayesian Information Criterion is used to provide a final number of clusters from the data set.
  - 5. The algorithm of claim 1, wherein Cophenetic correlation and Bayesian Information Criterion are used to provide a final number of clusters from the data set.
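
Steps (c)(3) and (d) of claim 1 can be illustrated with a small sketch. It is not the patented implementation: the contamination scores are taken as given rather than produced by a trained classifier, the Pearson linear dissimilarity is assumed to be (1 - r) / 2, average linkage is an assumed choice of hierarchical clustering, and the number of subgroups is passed in rather than estimated. The names `drop_contaminated`, `pearson_dissimilarity`, and `average_linkage`, and all sample values, are hypothetical.

```python
from math import sqrt

def drop_contaminated(samples, scores, cutoff=0.5):
    """Step (c)(3): discard samples scoring `cutoff` (50%) or greater."""
    return {name: profile for name, profile in samples.items()
            if scores[name] < cutoff}

def pearson_dissimilarity(x, y):
    """Assumed definition: (1 - r) / 2, so 0 means perfectly correlated.

    Raises ZeroDivisionError for a constant profile (zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return (1.0 - cov / (sx * sy)) / 2.0

def average_linkage(samples, r):
    """Merge the two closest clusters until r clusters remain."""
    names = sorted(samples)
    dist = {(a, b): pearson_dissimilarity(samples[a], samples[b])
            for a in names for b in names if a < b}
    clusters = [[n] for n in names]

    def between(c1, c2):
        # Mean pairwise dissimilarity between two clusters.
        pairs = [(min(a, b), max(a, b)) for a in c1 for b in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > r:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: between(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical copy number profiles (one value per locus) and scores.
samples = {
    "T1": [2.0, 2.1, 0.0, -1.0],
    "T2": [1.9, 2.2, 0.1, -0.9],
    "T3": [-1.0, -0.8, 1.5, 2.0],
    "T4": [-1.1, -0.9, 1.4, 2.2],
    "N1": [0.0, 0.1, 0.0, -0.1],
}
scores = {"T1": 0.05, "T2": 0.10, "T3": 0.20, "T4": 0.49, "N1": 0.93}

kept = drop_contaminated(samples, scores)   # N1 is removed
groups = average_linkage(kept, r=2)
print(groups)
```

In practice the value of r would come from inspecting the clustering itself, or from the Cophenetic correlation and Bayesian Information Criterion named in claims 3 to 5; that selection step is left out of this sketch.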

6. A method for assembling panels of tumor and cancer cell lines according to genomic subgroups, the method comprising the steps of:
- (a) obtaining a plurality of m samples comprising at least one tumor or cancer cell line;
  
  (b) acquiring a data set comprising copy number alteration information from at least one locus from each chromosome from each sample obtained in step (a);
  
  (c) identifying in the data set, copy number alteration information obtained from samples contaminated by normal cells and eliminating the contaminated samples from the data set, wherein the identifying and eliminating comprises;
  
  (1) applying a machine learning algorithm tuned to parameters that represent the differences between tumor and normal samples to the data;
  
  (2) assigning a probability score for normal cell contamination to each sample, as determined by the machine learning algorithm;
  
  (3) eliminating data from the data set for each sample scoring 50% or greater probability of containing normal cells;
  
  (d) estimating a number of subgroups, r, in the data set by applying unsupervised clustering using Pearson linear dissimilarity algorithm to the data set;
  
  (e) assigning each sample in the data set to at least one cluster using a modified genomic non-negative matrix factorization (gNMF) algorithm, wherein the modified gNMF algorithm comprises;
  
  (1) calculating divergence of the algorithm after every 100 steps of multiplicative updating using the formula (1);
- Dependent claims (7, 8)
  - 7. The method of claim 6, wherein the cancer is selected from the group consisting of small cell lung carcinoma, non-small cell lung carcinoma, colorectal cancer, and melanoma.
  - 8. The method of claim 6, wherein the copy number alteration is a gain or loss of copy number.
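
Step (e), shared by claims 1 and 6, refers to multiplicative updating with a divergence check every 100 steps. The sketch below shows one way such a loop could look. Formula (1) and the remainder of the modified gNMF steps are not reproduced in this record, so the sketch substitutes the standard generalized Kullback-Leibler divergence and the Lee and Seung multiplicative updates; the stopping tolerance, the rows-as-loci and columns-as-samples layout, and the names `nmf_kl` and `assign_clusters` are all assumptions.

```python
import math
import random

def matmul(W, H):
    """Plain-Python product of an n x r and an r x m matrix."""
    return [[sum(W[i][a] * H[a][j] for a in range(len(H)))
             for j in range(len(H[0]))] for i in range(len(W))]

def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH); assumed stand-in for formula (1)."""
    total = 0.0
    for v_row, wh_row in zip(V, WH):
        for v, wh in zip(v_row, wh_row):
            if v > 0:
                total += v * math.log(v / (wh + eps))
            total += wh - v
    return total

def ratio(V, WH, eps=1e-12):
    """Element-wise V / (WH), guarded against division by zero."""
    return [[v / (wh + eps) for v, wh in zip(vr, wr)] for vr, wr in zip(V, WH)]

def nmf_kl(V, r, max_steps=2000, check_every=100, tol=1e-6, seed=0):
    """Factor V (loci x samples) into W (n x r) and H (r x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(r)]
    history = []
    for step in range(1, max_steps + 1):
        Q = ratio(V, matmul(W, H))
        col_sums = [sum(W[i][a] for i in range(n)) for a in range(r)]
        for a in range(r):
            for j in range(m):
                H[a][j] *= sum(W[i][a] * Q[i][j] for i in range(n)) / col_sums[a]
        Q = ratio(V, matmul(W, H))
        row_sums = [sum(H[a][j] for j in range(m)) for a in range(r)]
        for i in range(n):
            for a in range(r):
                W[i][a] *= sum(H[a][j] * Q[i][j] for j in range(m)) / row_sums[a]
        if step % check_every == 0:
            # Divergence is evaluated only every `check_every` steps.
            history.append(kl_divergence(V, matmul(W, H)))
            # Assumed stopping rule: stop when the improvement is below tol.
            if len(history) > 1 and history[-2] - history[-1] < tol:
                break
    return W, H, history

def assign_clusters(W, H):
    """Label each sample by its largest component, weighted by W column mass."""
    weights = [sum(W[i][a] for i in range(len(W))) for a in range(len(H))]
    labels = []
    for j in range(len(H[0])):
        contrib = [H[a][j] * weights[a] for a in range(len(H))]
        labels.append(contrib.index(max(contrib)))
    return labels

# Hypothetical non-negative data: 4 loci (rows) x 4 samples (columns).
V = [[5.0, 5.0, 0.1, 0.1],
     [4.0, 4.0, 0.1, 0.1],
     [0.1, 0.1, 6.0, 6.0],
     [0.1, 0.1, 3.0, 3.0]]
W, H, history = nmf_kl(V, r=2)
labels = assign_clusters(W, H)
print(labels, history[-1])
```

Checking the divergence only at intervals keeps the cost of the convergence test small relative to the updates themselves. Weighting H by the column sums of W before taking the maximum is a choice made here so that the labels do not depend on how scale is split between W and H.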

Current Assignee: Abbvie Incorporated
Original Assignee: Abbott Laboratories Incorporated
Inventors: Zhang, Ke; Lu, Xin; Semizarov, Dimitri; Lesniewski, Rick R.

Granted Patent: US 9,002,653 B2
US Class Current: 506/24
CPC Class Codes:
G06F 17/10   Complex mathematical operat...
G06F 17/11   for solving equations, e.g...
G06F 17/15   Correlation function comput...
G06F 17/16   Matrix or vector computatio...
G06F 17/17   Function evaluation by appr...
G16B 20/00   ICT specially adapted for f...
G16B 20/10   Ploidy or copy number detec...
G16B 20/20   Allele or variant detection...
G16B 40/00   ICT specially adapted for b...
G16B 40/30   Unsupervised data analysis
G16B 5/00   ICT specially adapted for m...
G16B 5/20   Probabilistic models
G16H 50/50   for simulation or modelling...
G16H 50/70   for mining of medical data,...
