GENOMIC CLASSIFICATION OF MALIGNANT MELANOMA BASED ON PATTERNS OF GENE COPY NUMBER ALTERATIONS
Abstract
The invention is directed to methods and kits that allow for classification of malignant melanoma cells according to genomic profiles, and methods of diagnosing, predicting clinical outcomes, and stratifying patient populations for clinical testing and treatment using the same.
23 Claims
1. A method for obtaining a database of malignant melanoma genomic subgroups, the method comprising the steps of:
(a) obtaining a plurality of m samples comprising at least one MM cell;
(b) acquiring a data set comprising copy number alteration information from at least one locus from each chromosome from each sample obtained in step (a);
(c) identifying in the data set samples contaminated by normal cells and eliminating the contaminated samples from the data set, wherein the identifying and eliminating comprises:
    (1) applying a machine learning algorithm tuned to parameters that represent the differences between tumor and normal samples to the data;
    (2) assigning a probability score for normal cell contamination to each sample as determined by the machine learning algorithm;
    (3) eliminating data from the data set for each sample scoring 50% or greater probability of containing normal cells;
(d) estimating a number of subgroups, r, in the data set by applying an unsupervised clustering algorithm using Pearson linear dissimilarity algorithm to the data set;
(e) assigning each sample in the data set to at least one cluster using a modified genomic Non-negative Matrix Factorization (gNMF) algorithm, wherein the modified gNMF algorithm comprises:
    (1) calculating divergence of the algorithm after every 100 steps of multiplicative updating using the formula;
Dependent claims: 3, 4, 5, 6, 7, 8
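Step (c)(3) of claim 1 reduces to a threshold rule once each sample has a contamination probability. The claim does not name the machine learning algorithm, so the following minimal sketch assumes the scores have already been produced by some classifier. The function name `drop_contaminated`, the sample IDs, and the score values are hypothetical and serve only as illustration.

```python
def drop_contaminated(data, scores, threshold=0.5):
    """Keep only samples whose contamination probability is below the threshold.

    data      : dict mapping sample ID -> copy number profile (list of floats)
    scores    : dict mapping sample ID -> probability of normal cell contamination,
                assumed to come from an upstream classifier (not shown here)
    threshold : samples scoring at or above this value are eliminated
                (0.5 mirrors the "50% or greater" wording of the claim)
    """
    return {sid: profile for sid, profile in data.items() if scores[sid] < threshold}


# Hypothetical example values, for illustration only.
data = {"S1": [0.1, 0.4], "S2": [0.0, 0.0], "S3": [0.9, -0.3], "S4": [0.2, 0.7]}
scores = {"S1": 0.10, "S2": 0.50, "S3": 0.90, "S4": 0.49}
kept = drop_contaminated(data, scores)
print(sorted(kept))
```

A score of exactly 0.5 is eliminated here, which follows the "50% or greater" wording of sub-step (3).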
2. A method of classifying a MM tumor or cell line, comprising:
(a) providing a database, developed through a method comprising:
    (i) obtaining a plurality of m samples comprising at least one MM tumor or cell line;
    (ii) acquiring a first data set comprising copy number alteration information from at least one locus from each chromosome from each sample obtained in step (i);
    (iii) identifying in the first data set samples contaminated by normal cells and eliminating the contaminated samples from the first data set, wherein the identifying and eliminating comprises:
        (1) applying a machine learning algorithm tuned to parameters that represent the differences between tumor and normal samples to the data;
        (2) assigning a probability score for normal cell contamination to each sample as determined by the machine learning algorithm;
        (3) eliminating data from the first data set for each sample scoring 50% or greater probability of containing normal cells;
    (iv) estimating a number of subgroups, r, in the data set by applying an unsupervised clustering algorithm using Pearson linear dissimilarity algorithm to the data set;
    (v) assigning each sample in the data set to at least one cluster using a modified genomic Non-negative Matrix Factorization (gNMF) algorithm, wherein the modified gNMF algorithm comprises:
        (1) calculating divergence of the algorithm after every 100 steps of multiplicative updating using the formula;
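Step (iv) relies on a Pearson-based dissimilarity between sample profiles. The claim text in this listing does not spell out the formula, so the sketch below assumes one common definition, d(x, y) = (1 - ρ(x, y)) / 2, where ρ is the Pearson correlation coefficient. The function names and the example profiles are hypothetical. The resulting matrix could then be passed to a hierarchical clustering routine to examine candidate values of r.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def pearson_dissimilarity(x, y):
    """Assumed definition: 0 for perfectly correlated profiles, 1 for anti-correlated."""
    return (1.0 - pearson(x, y)) / 2.0


def dissimilarity_matrix(profiles):
    """Symmetric matrix of pairwise dissimilarities between sample profiles."""
    n = len(profiles)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = pearson_dissimilarity(profiles[i], profiles[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical copy number profiles (one list per sample), for illustration only.
profiles = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
D = dissimilarity_matrix(profiles)
```

In this toy input the first two profiles are perfectly correlated and the third is anti-correlated with both, so the dissimilarities fall at the two extremes of the [0, 1] range.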
9. A method of classifying a therapeutic intervention for arresting or killing malignant melanoma (MM) cells, comprising:
(a) from a panel of MM cells classified according to genomic subgroups, selecting at least one MM cell line from each subgroup, wherein the panel is assembled from a method comprising:
    (i) obtaining a plurality of m samples comprising MM cells;
    (ii) acquiring a first data set comprising copy number alteration information from at least one locus from each chromosome from each sample obtained in step (i);
    (iii) identifying in the first data set samples contaminated by normal cells and eliminating the contaminated samples from the first data set, wherein the identifying and eliminating comprises:
        (1) applying a machine learning algorithm tuned to parameters that represent the differences between tumor and normal samples to the data;
        (2) assigning a probability score for normal cell contamination to each sample as determined by the machine learning algorithm;
        (3) eliminating data from the first data set for each sample scoring 50% or greater probability of containing normal cells;
    (iv) estimating a number of subgroups, r, in the data set by applying an unsupervised clustering algorithm using Pearson linear dissimilarity algorithm to the data set;
    (v) assigning each sample in the data set to at least one cluster using a modified genomic Non-negative Matrix Factorization (gNMF) algorithm, wherein the modified gNMF algorithm comprises:
        (1) calculating divergence of the algorithm after every 100 steps of multiplicative updating using the formula;
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
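Step (v) describes multiplicative updating with a divergence check every 100 steps, but the divergence formula is cut off in this listing. The sketch below is therefore not the claimed modified gNMF algorithm. It assumes the standard generalized Kullback-Leibler divergence and the standard multiplicative update rules for non-negative matrix factorization, a non-negative input matrix V (rows are loci, columns are samples), and a hypothetical relative tolerance `rel_tol` as the stopping rule. All function and parameter names are illustrative.

```python
import math
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def divergence(V, WH, eps=1e-12):
    """Assumed form: D(V || WH) = sum(V * log(V / WH) - V + WH)."""
    total = 0.0
    for v_row, wh_row in zip(V, WH):
        for v, wh in zip(v_row, wh_row):
            if v > 0:
                total += v * math.log(v / (wh + eps))
            total += wh - v
    return total


def nmf_sketch(V, r, max_steps=2000, check_every=100, rel_tol=1e-5, seed=0, eps=1e-12):
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(r)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(r)]
    history = [divergence(V, matmul(W, H))]
    for step in range(1, max_steps + 1):
        # Multiplicative update of H (cluster-by-sample matrix).
        WH = matmul(W, H)
        Q = [[V[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]
        w_sums = [sum(W[i][a] for i in range(n)) + eps for a in range(r)]
        H = [[H[a][j] * sum(W[i][a] * Q[i][j] for i in range(n)) / w_sums[a]
              for j in range(m)] for a in range(r)]
        # Multiplicative update of W (locus-by-cluster matrix).
        WH = matmul(W, H)
        Q = [[V[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]
        h_sums = [sum(H[a]) + eps for a in range(r)]
        W = [[W[i][a] * sum(H[a][j] * Q[i][j] for j in range(m)) / h_sums[a]
              for a in range(r)] for i in range(n)]
        # Recompute the divergence only after every `check_every` steps.
        if step % check_every == 0:
            previous = history[-1]
            current = divergence(V, matmul(W, H))
            history.append(current)
            if previous - current <= rel_tol * previous:  # assumed stopping rule
                break
    # Assign each sample (column) to the cluster with the largest weight in H.
    labels = [max(range(r), key=lambda a: H[a][j]) for j in range(m)]
    return W, H, labels, history


# Hypothetical non-negative data: 4 loci (rows) by 4 samples (columns).
V = [[5.0, 5.0, 1.0, 1.0],
     [4.0, 6.0, 1.0, 2.0],
     [1.0, 1.0, 5.0, 4.0],
     [2.0, 1.0, 6.0, 5.0]]
W, H, labels, history = nmf_sketch(V, r=2)
```

Checking the divergence only every 100 steps keeps the cost of the convergence test small relative to the updates themselves. Copy number data expressed as signed log ratios would need to be transformed to non-negative values before a factorization of this kind can be applied.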
20. A method of assembling a probe panel for classifying a MM cell from a sample, comprising:
(a) assembling a database, comprising:
    (i) obtaining a plurality of m samples comprising at least one MM cell;
    (ii) acquiring a first data set comprising copy number alteration information from at least one locus from each chromosome from each sample obtained in step (i);
    (iii) identifying in the first data set samples contaminated by normal cells and eliminating the contaminated samples from the first data set, wherein the identifying and eliminating comprises:
        (1) applying a machine learning algorithm tuned to parameters that represent the differences between tumor and normal samples to the data;
        (2) assigning a probability score for normal cell contamination to each sample as determined by the machine learning algorithm;
        (3) eliminating data from the first data set for each sample scoring 50% or greater probability of containing normal cells;
    (iv) estimating a number of subgroups, r, in the data set by applying an unsupervised clustering algorithm using Pearson linear dissimilarity algorithm to the data set;
    (v) assigning each sample in the data set to at least one cluster using a modified genomic Non-negative Matrix Factorization (gNMF) algorithm, wherein the modified gNMF algorithm comprises:
        (1) calculating divergence of the algorithm after every 100 steps of multiplicative updating using the formula;
Dependent claims: 21, 22
23. A kit for classifying a MM tumor sample or a cell line, comprising:
(a) instructions to assemble a database, comprising instructions for:
    (i) obtaining a plurality of m samples comprising at least one MM cell;
    (ii) acquiring a first data set comprising copy number alteration information from at least one locus from each chromosome from each sample obtained in step (i);
    (iii) identifying in the first data set samples contaminated by normal cells and eliminating the contaminated samples from the first data set, wherein the identifying and eliminating comprises:
        (1) applying a machine learning algorithm tuned to parameters that represent the differences between tumor and normal samples to the data;
        (2) assigning a probability score for normal cell contamination to each sample as determined by the machine learning algorithm;
        (3) eliminating data from the first data set for each sample scoring 50% or greater probability of containing normal cells;
    (iv) estimating a number of subgroups, r, in the data set by applying an unsupervised clustering algorithm using Pearson linear dissimilarity algorithm to the data set;
    (v) assigning each sample in the data set to at least one cluster using a modified genomic Non-negative Matrix Factorization (gNMF) algorithm, wherein the modified gNMF algorithm comprises:
        (1) calculating divergence of the algorithm after every 100 steps of multiplicative updating using the formula;
Specification