Systems and methods for generating biomarker signatures with integrated dual ensemble and generalized simulated annealing techniques

  • US 10,373,708 B2
  • Filed: 06/21/2013
  • Issued: 08/06/2019
  • Est. Priority Date: 06/21/2012
  • Status: Active Grant
First Claim

1. A computer-implemented method of classifying a data set into two or more classes executed by a processor, comprising:

    (a) receiving a training data set associated with the data set and having a set of known labels, wherein the data set comprises gene set data, and each gene set data corresponds to one of a plurality of biological state classes, and wherein the labels identify the biological state classes of the gene set data;

    (b) generating a first classifier for the training data set by applying a first machine learning technique to the training data set, wherein the first machine learning technique identifies a first set of classification methods, wherein each classification method votes on the training data set;

    (c) classifying elements in the training data set according to the first classifier to obtain a first set of predicted labels for the training data set;

    (d) computing a first objective value from the first set of predicted labels and the set of known labels;

    (e) for each of a plurality of iterations, performing the following steps (i)-(v);

    (i) generating a second classifier for the training data set by applying a second machine learning technique to the training data set, wherein the second machine learning technique identifies a second set of classification methods that is different from the first set of classification methods by at least one classification method, wherein each classification method votes on the training data set;

    (ii) classifying the elements in the training data set according to the second classifier to obtain a second set of predicted labels for the training data set;

    (iii) computing a second objective value from the second set of predicted labels and the set of known labels;

    (iv) comparing the first objective value to the second objective value to determine whether the second classifier outperforms the first classifier; and

    (v) replacing the first set of predicted labels with the second set of predicted labels and replacing the first objective value with the second objective value when the second classifier outperforms the first classifier, and return to step (i); and

    (f) when a desired number of iterations has been reached, outputting the first set of predicted labels.
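
The loop recited in steps (a) through (f) can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that do not come from the claim: the "classification methods" are hypothetical one-feature threshold rules, the ensemble combines them by majority vote, the objective value is plain accuracy, and each candidate ensemble is derived from the current best one by toggling a single method. The claim itself only requires that the second set of methods differ from the first by at least one method. All function names (`make_stump`, `vote`, `objective`, `perturb`, `select_ensemble`) and the toy data are illustrative.

```python
import random
from collections import Counter


def make_stump(feature, threshold):
    """Hypothetical classification method: predict 1 if the feature exceeds the threshold."""
    return lambda row: 1 if row[feature] > threshold else 0


def vote(methods, rows):
    """Majority vote of the given methods on every row; ties go to the lowest label."""
    predictions = []
    for row in rows:
        counts = Counter(method(row) for method in methods)
        top = max(counts.values())
        predictions.append(min(label for label, c in counts.items() if c == top))
    return predictions


def objective(predicted, known):
    """Assumed objective value: fraction of predicted labels matching the known labels."""
    return sum(p == k for p, k in zip(predicted, known)) / len(known)


def perturb(selected, n_methods, rng):
    """Toggle one method index so the new set differs by one method.

    Requires n_methods >= 2; empty ensembles are rejected and redrawn.
    """
    while True:
        candidate = set(selected) ^ {rng.randrange(n_methods)}
        if candidate:
            return candidate


def select_ensemble(rows, known, methods, iterations=200, seed=0):
    rng = random.Random(seed)
    # (b) first classifier: an ensemble holding one randomly chosen method
    selected = {rng.randrange(len(methods))}
    # (c), (d) first set of predicted labels and first objective value
    best_labels = vote([methods[i] for i in sorted(selected)], rows)
    best_score = objective(best_labels, known)
    for _ in range(iterations):                                        # (e)
        candidate = perturb(selected, len(methods), rng)               # (i)
        labels = vote([methods[i] for i in sorted(candidate)], rows)   # (ii)
        score = objective(labels, known)                               # (iii)
        if score > best_score:                                         # (iv)
            # (v) keep the better labels and objective value; retaining the
            # method set as well is an assumption of this sketch
            selected, best_labels, best_score = candidate, labels, score
    return best_labels, best_score, selected                           # (f)


# Toy training data: two features per sample, binary known labels.
rows = [(0.1, 5.0), (0.4, 1.0), (0.6, 4.0), (0.9, 2.0)]
known = [0, 0, 1, 1]
methods = [make_stump(0, 0.5), make_stump(1, 3.0), make_stump(1, 0.0)]
labels, score, chosen = select_ensemble(rows, known, methods)
```

The acceptance rule here is strictly greedy, matching step (v) as worded in the claim. The simulated annealing named in the patent title would also accept some worse candidates with a temperature-dependent probability, which this sketch does not attempt.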
