
Selection of features predictive of biological conditions using protein mass spectrographic data

  • US 7,676,442 B2
  • Filed: 10/30/2007
  • Issued: 03/09/2010
  • Est. Priority Date: 05/01/1998
  • Status: Expired due to Fees
First Claim

1. A method for identification distinguishing between different biological conditions using protein expression data contained in a plurality of mass spectra generated from mass spectrographic measurement of a plurality of samples from subjects having the different biological conditions, the method comprising:

  • downloading the plurality of mass spectra into a computer system comprising a processor and a storage device, wherein the processor is programmed to perform the steps of:

    aligning the plurality of spectra, comprising:

    selecting a first spectrum of the plurality of spectra as a baseline example;

    sliding each spectral peak of a second spectrum of the plurality of spectra one at a time along a plurality of peaks within the baseline example;

    constructing a similarity measure for comparing pairs of spectra, wherein the similarity measure includes a scoring function for obtaining a similarity score between each spectral peak of the second spectrum and the peaks within the baseline example, the similarity score being examined according to the relationship $S(x_i, x_0) = \|x_i - x_0\|_2^2$, where $x_i$ and $x_0$ are feature vectors corresponding to peaks of an ith spectrum and the baseline spectrum, respectively;

    offsetting the second spectrum relative to the baseline example according to the similarity score achieved for the second spectrum;

    repeating the step of aligning the spectra for at least one additional spectrum to create a set of aligned spectra;

    applying a feature selection algorithm to the set of aligned spectra to select a subset of spectral peaks that discriminate between the different biological conditions, wherein the feature selection algorithm is selected from SVM-recursive feature elimination and l0-norm minimization; and

    training at least one support vector machine to discriminate between the plurality of different sample classes using the selected subset of spectral peaks, wherein the at least one support vector machine comprises a kernel;

    processing the plurality of spectra using the at least one support vector machine;

    generating a listing for display on a graphical display of at least one predictive feature within the plurality of spectra for distinguishing between the different biological conditions.
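
The alignment steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each spectrum is a list of intensities sampled on a shared m/z grid, that the whole spectrum is shifted by an integer number of grid points (a simplification of sliding individual peaks), and that positions shifted in from outside the grid are zero-filled. The names `squared_distance`, `shift`, `best_offset`, `align_all`, and the `max_shift` parameter are hypothetical. The score is the squared Euclidean distance, and the offset with the lowest score is kept.

```python
def squared_distance(x, y):
    """Similarity score S(x, y) = ||x - y||_2^2 (lower means more similar)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def shift(spectrum, offset):
    """Shift a spectrum by `offset` grid points, zero-filling the gaps.

    Zero-filling is an assumption of this sketch, not part of the claim.
    """
    n = len(spectrum)
    return [spectrum[j - offset] if 0 <= j - offset < n else 0.0
            for j in range(n)]


def best_offset(baseline, spectrum, max_shift):
    """Try every offset in [-max_shift, max_shift] and keep the best score.

    Ties are broken in favour of the smallest absolute shift.
    """
    candidates = range(-max_shift, max_shift + 1)
    return min(
        candidates,
        key=lambda k: (squared_distance(shift(spectrum, k), baseline), abs(k)),
    )


def align_all(spectra, max_shift=3):
    """Use the first spectrum as the baseline and align the rest to it."""
    baseline = spectra[0]
    aligned = [list(baseline)]
    for s in spectra[1:]:
        aligned.append(shift(s, best_offset(baseline, s, max_shift)))
    return aligned


if __name__ == "__main__":
    baseline = [0, 0, 5, 0, 0, 3, 0, 0]
    left_shifted = [0, 5, 0, 0, 3, 0, 0, 0]    # peaks one point too early
    right_shifted = [0, 0, 0, 0, 5, 0, 0, 3]   # peaks two points too late
    for row in align_all([baseline, left_shifted, right_shifted]):
        print(row)
```

The aligned spectra produced this way would then be the input to the feature selection step (SVM-recursive feature elimination or l0-norm minimization) and to the support vector machine training recited in the claim; those steps are not sketched here.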
