Large scale semi-supervised linear support vector machines
Abstract
A computerized system and method for large-scale semi-supervised learning are provided. The training set comprises a mix of labeled and unlabeled examples. Linear classifiers based on support vector machine principles are built using these examples. One embodiment is a fast linear transductive support vector machine that uses multiple switching. In another embodiment, mean field annealing is used to form a very effective semi-supervised support vector machine. In both embodiments, the finite Newton method is used as the base method for achieving fast training.
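The mean field annealing idea in the abstract can be illustrated with a small sketch. It rests on assumptions that are not stated in the text above: the loss is taken to be the squared hinge loss, the unknown label of an unlabeled example is relaxed from a discrete value in {-1, +1} to a probability `p` of being positive, and an entropy term scaled by a temperature `T` smooths the objective. Any class-balance constraint is omitted. The function names are hypothetical.

```python
import math


def squared_hinge(margin: float) -> float:
    """Squared hinge loss max(0, 1 - margin)^2 (assumed loss, not from the source)."""
    return max(0.0, 1.0 - margin) ** 2


def positive_prob(output: float, temperature: float) -> float:
    """Probability p that an unlabeled example is positive, for a fixed output o.

    p minimizes p*l(o) + (1-p)*l(-o) - T*H(p), which gives
    p = 1 / (1 + exp((l(o) - l(-o)) / T)).
    """
    z = (squared_hinge(output) - squared_hinge(-output)) / temperature
    # Numerically stable evaluation of 1 / (1 + exp(z)).
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def smoothed_unlabeled_loss(output: float, temperature: float) -> float:
    """Expected loss over the relaxed label minus T times the label entropy."""
    p = positive_prob(output, temperature)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q)
    expected = p * squared_hinge(output) + (1.0 - p) * squared_hinge(-output)
    return expected - temperature * entropy
```

At high temperature the relaxed label stays close to 0.5 and the loss is smooth in the classifier output; as `T` is lowered toward zero, `p` hardens toward 0 or 1 and the smoothed loss approaches the smaller of `l(o)` and `l(-o)`, the loss obtained by choosing the best discrete label.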
20 Claims
1. A computerized method for semi-supervised learning for web page classification, comprising:
- receiving a set of web pages as training elements;
- labeling some of the elements of the set of training elements that are determined to fall within a classification group, the set of training elements thereby having labeled elements and unlabeled elements;
- using selected labeled elements and unlabeled elements as examples in a semi-supervised support vector machine implemented using a mean field annealing method, constructing a continuous loss function from a non-continuous loss function, to train a linear classifier;
- receiving unclassified web pages; and
- classifying the unclassified web pages using the trained linear classifier.
Dependent claims: 2-18.
19. A system for semi-supervised learning for web page classification, comprising:
- an input device operative to receive a set of web pages as training elements;
- a processor operative to label some of the elements of the set of training elements that are determined to fall within a classification group, the set of training elements thereby having labeled elements and unlabeled elements;
- the processor further operative to use selected labeled elements and unlabeled elements as examples in a semi-supervised support vector machine implemented using a mean field annealing method, constructing a continuous loss function from a non-continuous loss function, to train a linear classifier;
- the input device further operative to receive unclassified web pages; and
- the processor further operative to classify the received unclassified web pages using the trained linear classifier.
20. A computer program product stored on a computer-readable medium having instructions for performing a semi-supervised learning method for web page classification, the method comprising:
- receiving a set of web pages as training elements;
- labeling some of the elements of the set of training elements that are determined to fall within a classification group, the set of training elements thereby having labeled elements and unlabeled elements;
- using selected labeled elements and unlabeled elements as examples in a semi-supervised support vector machine implemented using a mean field annealing method, constructing a continuous loss function from a non-continuous loss function, to train a linear classifier;
- receiving unclassified web pages; and
- classifying the unclassified web pages using the trained linear classifier.
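The sequence of steps recited in the independent claims (receive training elements, label some of them, train a linear classifier on labeled and unlabeled examples, then classify new elements) can be sketched end to end as follows. This is a toy illustration under several assumptions, not the claimed implementation: web pages are stood in for by short numeric feature vectors, the loss is the squared hinge loss, plain gradient descent is swapped in for the finite Newton method named in the abstract, no class-balance constraint is applied, and all function names, parameters, and default values are hypothetical.

```python
import math


def sq_hinge(m):
    return max(0.0, 1.0 - m) ** 2


def sq_hinge_grad(m):
    # Derivative of max(0, 1 - m)^2 with respect to m.
    return -2.0 * max(0.0, 1.0 - m)


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def positive_prob(o, T):
    # Relaxed label: minimizer of p*l(o) + (1-p)*l(-o) - T*H(p).
    z = (sq_hinge(o) - sq_hinge(-o)) / T
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def train(labeled, unlabeled, dim, lam=0.01, lam_u=0.1,
          temps=(1.0, 0.3, 0.1, 0.03), steps=200, lr=0.05):
    """Anneal the temperature; at each level alternate label and weight updates."""
    w = [0.0] * dim
    for T in temps:                      # annealing schedule (assumed values)
        for _ in range(steps):
            grad = [lam * wi for wi in w]        # L2 regularizer
            for x, y in labeled:                 # labeled examples
                g = sq_hinge_grad(y * dot(w, x)) * y / len(labeled)
                for k in range(dim):
                    grad[k] += g * x[k]
            for x in unlabeled:                  # unlabeled examples
                o = dot(w, x)
                p = positive_prob(o, T)          # label update for current w
                g = p * sq_hinge_grad(o) - (1.0 - p) * sq_hinge_grad(-o)
                g *= lam_u / len(unlabeled)
                for k in range(dim):
                    grad[k] += g * x[k]
            w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def classify(w, x):
    return 1 if dot(w, x) >= 0 else -1


# Toy data: two labeled and four unlabeled "pages" as 2-D feature vectors.
labeled = [([2.0, 0.5], 1), ([-2.0, -0.5], -1)]
unlabeled = [[2.0, 1.0], [1.5, 0.0], [-2.0, -1.0], [-1.5, 0.0]]
w = train(labeled, unlabeled, dim=2)
predictions = [classify(w, [3.0, 1.0]), classify(w, [-3.0, -1.0])]
```

Holding the relaxed labels fixed while the weights take a gradient step, then recomputing the labels, keeps each step simple; a faster solver such as the finite Newton method mentioned in the abstract would replace the inner gradient step.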
Specification