
Text classification by weighted proximal support vector machine based on positive and negative sample sizes and weights

  • US 7,707,129 B2
  • Filed: 03/20/2006
  • Issued: 04/27/2010
  • Est. Priority Date: 03/20/2006
  • Status: Expired due to Fees
First Claim

1. A system for training a text classifier, the system comprising:

  • a memory storing computer-executable instructions that implement:

    a text data preprocessor that preprocesses raw training text to produce an input matrix, the raw training text including documents and indications of whether each document is a positive or a negative training example of a classification; and

    a module for solving a weighted proximal support vector machine equation comprising:

    a weighting module that generates a weighted matrix by re-weighting the input matrix based on how many training examples are positive and how many training examples are negative wherein the weighting is based on satisfying the following equation:

    $N_{+}\delta_{+}^{2} = N_{-}\delta_{-}^{2}$

    where $N_{+}$ and $N_{-}$ denote numbers of positive and negative training examples and $\delta_{+}$ and $\delta_{-}$ denote weights of the positive and negative training examples; and

    a model-vector generator that iteratively calculates a model vector based on the weighted matrix using a proximal support vector machine model; and

    a processor for executing the computer-executable instructions stored in the memory.
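
The weighting condition in the claim can be illustrated with a small sketch. The claim fixes only the ratio between the two weights, so the normalization used here (δ+² = N−/N and δ−² = N+/N, with N = N+ + N−) is an assumption chosen for illustration, as are the function names `balanced_weights` and `reweight_rows` and the convention that labels are +1 or -1. Scaling each row of the input matrix by its class weight is likewise one plausible reading of "re-weighting", not necessarily the patented procedure.

```python
import math


def balanced_weights(labels):
    """Return (delta_pos, delta_neg) with N+ * delta_pos^2 == N- * delta_neg^2.

    Assumption: labels are +1 (positive example) or -1 (negative example).
    Assumption: weights are normalized so delta_pos^2 + delta_neg^2 == 1.
    """
    n_pos = sum(1 for y in labels if y > 0)
    n_neg = sum(1 for y in labels if y < 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    total = n_pos + n_neg
    # The rarer class receives the larger weight.
    delta_pos = math.sqrt(n_neg / total)
    delta_neg = math.sqrt(n_pos / total)
    return delta_pos, delta_neg


def reweight_rows(matrix, labels):
    """Scale each document row by the weight of its class (illustrative only)."""
    delta_pos, delta_neg = balanced_weights(labels)
    return [
        [(delta_pos if y > 0 else delta_neg) * value for value in row]
        for row, y in zip(matrix, labels)
    ]


# One positive and three negative documents, one feature each.
labels = [1, -1, -1, -1]
matrix = [[2.0], [4.0], [4.0], [4.0]]
delta_pos, delta_neg = balanced_weights(labels)
weighted = reweight_rows(matrix, labels)
```

With one positive and three negative examples, the positive weight comes out larger than the negative weight, so the minority class is not swamped when the model vector is fitted to the weighted matrix.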
