Feature selection method using support vector machine classifier

  • US 7,542,959 B2
  • Filed: 08/21/2007
  • Issued: 06/02/2009
  • Est. Priority Date: 05/01/1998
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for predicting patterns in biological data, wherein the data comprises a large set of features that describe the data and a sample set from which the biological data is obtained is much smaller than the large set of features, the method comprising:

  • identifying a determinative subset of features that are most correlated to the patterns, comprising:

    (a) inputting the data into a computer processor programmed for executing support vector machine classifiers;

    (b) training a support vector machine classifier with a training data set comprising at least a portion of the sample set and having known outcomes with respect to the patterns, wherein the classifier comprises weights having weight values that correspond to the features in the data set and removal of a subset of features affects the weight values;

    (c) ranking the features according to their corresponding weight values;

    (d) removing one or more features corresponding to the smallest weight values;

    (e) training a new classifier with the remaining features;

    (f) repeating steps (c) through (e) for a plurality of iterations until a final subset having a pre-determined number of features remains; and

    generating at a printer or display device a report comprising a listing of the features in the final subset, wherein the final subset comprises the determinative subset of features for determining biological characteristics of the sample set.
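
Steps (b) through (f) of the claim describe a recursive elimination loop, which can be sketched as follows. This is a minimal illustration and not the patented implementation. It assumes a linear support vector machine trained by plain subgradient descent on the hinge loss, and it assumes the squared weight value as the ranking criterion; the claim itself specifies neither. The names `train_linear_svm`, `svm_rfe`, `n_keep`, `n_drop`, the `gene_*` labels, and the toy data are all hypothetical.

```python
import random

def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Fit a linear SVM (hinge loss + L2 penalty) by subgradient descent.
    Assumed trainer for illustration; labels y must be +1 or -1."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1
            for j in range(n_features):
                grad = lam * w[j] - (yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b

def svm_rfe(X, y, feature_names, n_keep, n_drop=1):
    """Repeat train / rank / remove until n_keep features remain."""
    remaining = list(range(len(feature_names)))
    while len(remaining) > n_keep:
        # Steps (b)/(e): train a classifier on the surviving features only.
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Step (c): rank features by weight (assumed criterion: w_j squared).
        order = sorted(range(len(remaining)), key=lambda k: w[k] ** 2)
        # Step (d): remove the features with the smallest weight values.
        k = min(n_drop, len(remaining) - n_keep)
        dropped = set(order[:k])
        remaining = [f for pos, f in enumerate(remaining) if pos not in dropped]
    return [feature_names[j] for j in remaining]

# Toy data: 6 samples, 10 features, so samples are fewer than features.
# Feature 0 is built to track the label; the rest are small random noise.
rng = random.Random(0)
y = [1, -1, 1, -1, 1, -1]
X = [[2.0 * label] + [rng.uniform(-0.5, 0.5) for _ in range(9)] for label in y]
names = ["gene_%d" % j for j in range(10)]

selected = svm_rfe(X, y, names, n_keep=2)

# Final step: report the features in the final subset.
print("Final feature subset:")
for name in selected:
    print("  " + name)
```

Setting `n_drop` above 1 removes several features per iteration, which matches the "one or more features" wording of step (d) and reduces the number of retraining passes when the feature set is large.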
