
Effective multi-class support vector machine classification

  • US 7,386,527 B2
  • Filed: 04/10/2003
  • Issued: 06/10/2008
  • Est. Priority Date: 12/06/2002
  • Status: Active Grant
First Claim

1. In a computer-based system, a method of training a multi-category classifier using a binary SVM algorithm, said method comprising:

    storing a plurality of user-defined categories in a memory of a computer;

    analyzing a plurality of training examples for each category so as to identify one or more features associated with each category;

    calculating at least one feature vector for each of said examples;

    transforming each of said at least one feature vectors using a first mathematical function so as to provide desired information about each of said training examples; and

    building a SVM classifier for each one of said plurality of categories, wherein said process of building a SVM classifier comprises;

    assigning each of said examples in a first category to a first class and all other examples belonging to other categories to a second class, wherein if any one of said examples belongs to both said first category and another category, such examples are assigned to the first class only;

    optimizing at least one tunable parameter of a SVM classifier for said first categories, wherein said SVM classifier is trained using said first and second classes after the at least one tunable parameter has been optimized; and

    optimizing a second mathematical function that converts the output of the binary SVM classifier into a probability of category membership;

    calculating a solution for the SVM classifier for the first category using predetermined initial value(s) for said at least one tunable parameter; and

    testing said solution for said first category to determine if the solution is characterized by either over-generalization or over-memorization;

    wherein the SVM classifier is used on real world data, the probability of category membership of the real world data being output to at least one of a user, another system, and another process;

    wherein the test to determine whether said SVM classifier solution for said first category is characterized by either over-generalization or over-memorization is based on a difference between a harmonic mean of first and second estimated probabilities on the one hand, and an arithmetic mean of said first and second estimated probabilities on the other hand;

    wherein the first estimated probability is indicative of class membership and the second estimated probability is indicative of non-class membership for training examples.
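Three of the steps recited above can be illustrated with a minimal Python sketch: the one-versus-rest class assignment, the conversion of a raw SVM output into a probability, and the comparison of the harmonic and arithmetic means of two estimated probabilities. This is an illustration under stated assumptions, not the patented implementation. The function names, the two-parameter sigmoid form, and its parameters `a` and `b` are hypothetical; the claim only refers to "a second mathematical function" and does not state a threshold for the difference or how that difference separates over-generalization from over-memorization.

```python
import math


def one_vs_rest_labels(example_categories, first_category):
    """Assign +1 (first class) or -1 (second class) to each training example.

    example_categories holds one set of category names per example. An example
    that belongs to first_category and to another category is still labeled +1,
    mirroring the claim's "assigned to the first class only" rule.
    """
    return [1 if first_category in cats else -1 for cats in example_categories]


def sigmoid_probability(svm_output, a, b):
    """Map a raw binary SVM output to a value in (0, 1).

    Assumption: a two-parameter sigmoid is used here as one possible
    "second mathematical function"; a and b would be fitted on training data.
    """
    return 1.0 / (1.0 + math.exp(a * svm_output + b))


def mean_gap(p_member, p_nonmember):
    """Return the arithmetic mean minus the harmonic mean of two probabilities.

    The gap is zero when the two estimates are equal and grows as they diverge.
    """
    total = p_member + p_nonmember
    if total == 0:
        return 0.0  # both estimates are zero, so both means are zero
    arithmetic = total / 2.0
    harmonic = 2.0 * p_member * p_nonmember / total
    return arithmetic - harmonic


if __name__ == "__main__":
    # Hypothetical training examples tagged with one or more categories.
    examples = [{"invoice"}, {"invoice", "letter"}, {"letter"}]
    print(one_vs_rest_labels(examples, "invoice"))
    print(sigmoid_probability(0.0, -1.0, 0.0))
    print(mean_gap(0.9, 0.1))
```

Because the arithmetic mean of two non-negative numbers is never smaller than their harmonic mean, the gap computed here is never negative; how the gap is interpreted in the test is left to the patent's specification.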
