Method and system for determining cancer status
First Claim
1. A computing platform for utilizing CpG cancer methylation data for generation of a cancer CpG methylation profile database, comprising:
(a) a first computing device comprising a processor, a memory module, an operating system, and a computer program including instructions executable by the processor to create a data acquisition application for generating CpG methylation data from a set of biological samples, the data acquisition application comprising:
(1) a sequencing module operating a sequencing device to perform CpG methylation by hybridizing at least one probe sequence selected from SEQ ID NOs: 1-1775 and 1830-2321 to an extracted genomic DNA treated with a deaminating agent, wherein the extracted genomic DNA is obtained from a set of biological samples, wherein the set comprises a first cancerous biological sample, a second cancerous biological sample, a third cancerous biological sample, a first normal biological sample, a second normal biological sample, and a third normal biological sample;
wherein the first, second, and third cancerous biological samples are different; and
wherein the first, second, and third normal biological samples are different; and
(2) a data receiving module receiving:
(i) a first pair of CpG methylation datasets generated from the first cancerous biological sample and the first normal biological sample, wherein CpG methylation data generated from the first cancerous biological sample form a first dataset within the first pair of datasets, CpG methylation data generated from the first normal biological sample form a second dataset within the first pair of datasets, and the first cancerous biological sample and the first normal biological sample are from the same biological sample source;
(ii) a second pair of CpG methylation datasets generated from the second normal biological sample and the third normal biological sample, wherein CpG methylation data generated from the second normal biological sample form a third dataset within the second pair of datasets, CpG methylation data generated from the third normal biological sample form a fourth dataset within the second pair of datasets, and the first, second, and third normal biological samples are different; and
(iii) a third pair of CpG methylation datasets generated from the second cancerous biological sample and the third cancerous biological sample, wherein CpG methylation data generated from the second cancerous biological sample form a fifth dataset within the third pair of datasets, CpG methylation data generated from the third cancerous biological sample form a sixth dataset within the third pair of datasets, and the first, second, and third cancerous biological samples are different; and
(b) a second computing device comprising a processor, a memory module, an operating system, and a computer program including instructions executable by the processor to create a data analysis application for generating a cancer CpG methylation profile database, the data analysis application comprising a data analysis module to:
(1) generate a pair-wise methylation difference dataset from the first, second, and third pair of datasets; and
(2) analyze the pair-wise methylation difference dataset with a control dataset by a machine learning method to generate the cancer CpG methylation profile database, wherein
(i) the machine learning method comprises: identifying a plurality of markers and a plurality of weights based on a top score, and classifying the samples based on the plurality of markers and the plurality of weights; and
(ii) the cancer CpG methylation profile database comprises a set of CpG methylation profiles and each CpG methylation profile represents a cancer type.
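As an informal illustration only, and not part of the claim language, the pair-wise methylation difference dataset recited in element (b)(1) might be sketched as follows. The claim does not say how a difference is computed, so this sketch assumes a simple per-site subtraction of methylation fractions. The `Dataset` type, the function names, the pair labels, the CpG site IDs, and all numeric values are hypothetical.

```python
from typing import Dict, Tuple

# Assumption: a dataset maps a CpG site ID to a methylation fraction in [0, 1].
Dataset = Dict[str, float]


def pairwise_difference(a: Dataset, b: Dataset) -> Dataset:
    """Per-site difference (a - b) over the CpG sites present in both datasets."""
    shared_sites = sorted(a.keys() & b.keys())
    return {site: a[site] - b[site] for site in shared_sites}


def build_difference_dataset(
    pairs: Dict[str, Tuple[Dataset, Dataset]]
) -> Dict[str, Dataset]:
    """Compute one difference vector per labeled pair of datasets."""
    return {label: pairwise_difference(x, y) for label, (x, y) in pairs.items()}


# Toy values for the six datasets named in the claim (all invented).
cancer_1 = {"cg01": 0.90, "cg02": 0.20}  # first dataset
normal_1 = {"cg01": 0.10, "cg02": 0.25}  # second dataset, same source as cancer_1
normal_2 = {"cg01": 0.12, "cg02": 0.30}  # third dataset
normal_3 = {"cg01": 0.08, "cg02": 0.28}  # fourth dataset
cancer_2 = {"cg01": 0.85, "cg02": 0.15}  # fifth dataset
cancer_3 = {"cg01": 0.95, "cg02": 0.22}  # sixth dataset

differences = build_difference_dataset(
    {
        "cancer_vs_normal": (cancer_1, normal_1),  # first pair
        "normal_vs_normal": (normal_2, normal_3),  # second pair
        "cancer_vs_cancer": (cancer_2, cancer_3),  # third pair
    }
)

for label, diff in differences.items():
    print(label, {site: round(value, 2) for site, value in diff.items()})
```

Under these assumptions, the cancer-versus-normal pair shows a large difference at a site where the two within-group pairs show small ones, which is the kind of contrast a downstream marker-selection step could use.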
Abstract
Disclosed herein are methods, systems, platforms, non-transitory computer-readable media, services, and kits for determining a cancer type in an individual. Also described herein are methods, systems, platforms, non-transitory computer-readable media, and compositions for generating a CpG methylation profile database.
17 Claims
12. A computer-implemented method for generating a cancer CpG methylation profile database, comprising:
a) hybridizing at least one probe sequence selected from SEQ ID NOs: 1-1775 and 1830-2321 to an extracted genomic DNA treated with a deaminating agent to generate CpG methylation data, wherein the extracted genomic DNA is obtained from a set of biological samples, wherein the set comprises a first cancerous biological sample, a second cancerous biological sample, a third cancerous biological sample, a first normal biological sample, a second normal biological sample, and a third normal biological sample;
wherein the first, second, and third cancerous biological samples are different; and
wherein the first, second, and third normal biological samples are different;
b) obtaining a first pair of CpG methylation datasets, with a first processor, generated from the first cancerous biological sample and the first normal biological sample, wherein CpG methylation data generated from the first cancerous biological sample form a first dataset within the first pair of datasets, CpG methylation data generated from the first normal biological sample form a second dataset within the first pair of datasets, and the first cancerous biological sample and the first normal biological sample are from the same biological sample source;
c) obtaining a second pair of CpG methylation datasets, with the first computing device, generated from the second normal biological sample and the third normal biological sample, wherein CpG methylation data generated from the second normal biological sample form a third dataset within the second pair of datasets, CpG methylation data generated from the third normal biological sample form a fourth dataset within the second pair of datasets, and the first, second, and third normal biological samples are different;
d) obtaining a third pair of CpG methylation datasets, with the first computing device, generated from the second cancerous biological sample and the third cancerous biological sample, wherein CpG methylation data generated from the second cancerous biological sample form a fifth dataset within the third pair of datasets, CpG methylation data generated from the third cancerous biological sample form a sixth dataset within the third pair of datasets, and the first, second, and third cancerous biological samples are different;
e) generating a pair-wise methylation difference dataset, with a second processor, from the first, second, and third pair of datasets; and
f) analyzing the pair-wise methylation difference dataset with a control dataset by a machine learning method to generate the cancer CpG methylation profile database, wherein
(1) the machine learning method comprises: identifying a plurality of markers and a plurality of weights based on a top score, and classifying the samples based on the plurality of markers and the plurality of weights; and
(2) the cancer CpG methylation profile database comprises a set of CpG methylation profiles and each CpG methylation profile represents a cancer type.
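As an informal illustration only, and not part of the claim language, the marker-and-weight step recited in element f)(1) might be sketched as follows. The claim does not define the "top score" or name a specific machine learning method. This sketch therefore assumes a deliberately simple stand-in: each CpG site is scored by the absolute difference in mean methylation between the cancer and normal groups, the top-scoring sites become markers, the signed mean difference is used as each marker's weight, and a sample is classified by thresholding its weighted sum. All function names, site IDs, and values are hypothetical.

```python
from statistics import mean
from typing import Dict, List

# Assumption: a sample maps a CpG site ID to a methylation fraction in [0, 1].
Sample = Dict[str, float]


def select_markers(cancer: List[Sample], normal: List[Sample], top_k: int) -> Dict[str, float]:
    """Return the top_k sites by |mean(cancer) - mean(normal)|, with signed weights."""
    sites = set(cancer[0])
    for sample in cancer + normal:
        sites &= set(sample)  # keep only sites measured in every sample
    weights = {
        site: mean(s[site] for s in cancer) - mean(s[site] for s in normal)
        for site in sites
    }
    # Rank by absolute weight, breaking ties by site ID for determinism.
    ranked = sorted(weights, key=lambda site: (-abs(weights[site]), site))
    return {site: weights[site] for site in ranked[:top_k]}


def weighted_score(sample: Sample, weights: Dict[str, float]) -> float:
    """Weighted sum of the sample's methylation values over the selected markers."""
    return sum(w * sample[site] for site, w in weights.items())


def classify(sample: Sample, weights: Dict[str, float], threshold: float) -> str:
    """Label a sample by comparing its weighted score against a threshold."""
    return "cancer" if weighted_score(sample, weights) > threshold else "normal"


# Toy training data (invented values).
cancer_samples = [
    {"cg01": 0.9, "cg02": 0.2, "cg03": 0.5},
    {"cg01": 0.8, "cg02": 0.3, "cg03": 0.5},
]
normal_samples = [
    {"cg01": 0.1, "cg02": 0.8, "cg03": 0.5},
    {"cg01": 0.2, "cg02": 0.7, "cg03": 0.4},
]

marker_weights = select_markers(cancer_samples, normal_samples, top_k=2)

# Assumption: threshold at the midpoint between the two groups' mean scores.
threshold = (
    mean(weighted_score(s, marker_weights) for s in cancer_samples)
    + mean(weighted_score(s, marker_weights) for s in normal_samples)
) / 2

new_sample = {"cg01": 0.85, "cg02": 0.25, "cg03": 0.5}
print(sorted(marker_weights), classify(new_sample, marker_weights, threshold))
```

A real pipeline would more likely use an established classifier and cross-validated feature selection. The linear score here is chosen only because it makes the roles of "markers" and "weights" easy to see.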
Specification