Spread Kernel Support Vector Machine
Abstract
Disclosed is a parallel support vector machine technique for solving problems with a large set of training data where the kernel computation, as well as the kernel cache and the training data, are spread over a number of distributed machines or processors. A plurality of processing nodes are used to train a support vector machine based on a set of training data. Each of the processing nodes selects a local working set of training data based on data local to the processing node, for example a local subset of gradients. Each node transmits selected data related to the working set (e.g., gradients having a maximum value) and receives an identification of a global working set of training data. The processing node optimizes the global working set of training data and updates a portion of the gradients of the global working set of training data. The updating of a portion of the gradients may include generating a portion of a kernel matrix. These steps are repeated until a convergence condition is met. Each of the local processing nodes may store all, or only a portion of, the training data. While the steps of optimizing the global working set of training data, and updating a portion of the gradients of the global working set, are performed in each of the local processing nodes, the function of generating a global working set of training data is performed in a centralized fashion based on the selected data (e.g., gradients of the local working set) received from the individual processing nodes.
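The local-selection and centralized-merge steps described in the abstract (each node proposing its maximum-value gradients, a central step identifying the global working set) might be sketched as follows. This is a minimal illustration, not the patent's exact procedure: the function names, the top-k selection rule, and the use of raw gradient magnitude as the selection score are all assumptions.

```python
import numpy as np

def select_local_working_set(gradients, indices, k=2):
    # Each node ranks only its own gradient slice (local data) and
    # proposes its k largest entries as (global_index, gradient) pairs.
    order = np.argsort(gradients)[::-1][:k]
    return [(int(indices[i]), float(gradients[i])) for i in order]

def merge_global_working_set(candidates_per_node, k=2):
    # Centralized step: pool every node's proposals and keep the k
    # overall-largest gradients; the resulting index list identifies
    # the global working set sent back to all nodes.
    merged = [c for cands in candidates_per_node for c in cands]
    merged.sort(key=lambda t: t[1], reverse=True)
    return [idx for idx, _ in merged[:k]]
```

Because each node transmits only k (index, gradient) pairs rather than its full gradient slice, the communication cost per iteration stays small even when the training data is large.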
26 Claims
1. A method for training a support vector machine based on a set of training data at one of a plurality of processing nodes, comprising the steps of:
a) selecting a local working set of training data based on local data;
b) transmitting selected data related to said local working set;
c) receiving an identification of a global working set of training data;
d) optimizing said global working set of training data;
e) updating a portion of gradients of said global working set of training data; and
f) repeating said steps a) through e) until a convergence condition is met.

(Dependent claims 2 through 9 not reproduced here.)
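Step e) of claim 1, updating only a portion of the gradients while generating only the needed portion of the kernel matrix, could look like the sketch below. The RBF kernel choice, the function name, and the parameter layout are illustrative assumptions; the claim itself does not fix a kernel.

```python
import numpy as np

def update_local_gradients(G_local, X_local, X_ws, y_local, y_ws,
                           delta_alpha, gamma=1.0):
    # Each node updates only the gradients of its own training points.
    # Only the kernel rows between local points and the (small) global
    # working set are computed, never the full kernel matrix.
    d2 = ((X_local[:, None, :] - X_ws[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)                     # local slice of kernel matrix
    Q = (y_local[:, None] * y_ws[None, :]) * K  # label-signed kernel entries
    return G_local + Q @ delta_alpha            # G_i += sum_j Q_ij * dalpha_j
```

Since the working set is small, this per-node update touches only a thin slice of the kernel matrix, which is what lets the kernel computation and cache be spread across the processing nodes.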
10. A method for training a support vector machine based on a set of training data using a plurality of processing nodes, comprising the steps of:
a) selecting, at each of said plurality of processing nodes, a local working set of training data based on local data;
b) generating a global working set of training data using selected data related to said local working sets;
c) optimizing, at each of said plurality of processing nodes, said global working set of training data;
d) updating, at each of said plurality of processing nodes, a portion of gradients of said global working set of training data; and
e) repeating steps a) through d) until a convergence condition is met.

(Dependent claims 11 through 19 not reproduced here.)
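The full loop of claim 10 can be simulated in a single process as below. This is a hedged sketch under stated assumptions: an RBF kernel, and a bias-free L1-SVM dual (whose box-constrained problem has no equality constraint), so plain coordinate ascent on the global working set stands in for the patent's unspecified working-set optimizer. All names and the toy data layout are illustrative.

```python
import numpy as np

def rbf(Xa, Xb, gamma):
    # Gaussian kernel matrix between two sets of row vectors.
    d2 = ((Xa[:, None, :] - Xb[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_spread_svm(X, y, n_nodes=2, C=10.0, gamma=0.5, ws_size=2,
                     tol=1e-4, max_outer=200):
    n = len(y)
    parts = np.array_split(np.arange(n), n_nodes)  # training data spread over nodes
    alpha = np.zeros(n)
    G = np.ones(n)  # dual gradient G_i = 1 - (Q alpha)_i; conceptually each
                    # node holds only its own slice G[parts[k]]
    for _ in range(max_outer):
        # a) each node selects local candidates by gradient-based violation
        proposals = []
        for p in parts:
            up = (G[p] > 0) & (alpha[p] < C)    # alpha_i may still increase
            down = (G[p] < 0) & (alpha[p] > 0)  # alpha_i may still decrease
            viol = np.where(up | down, np.abs(G[p]), 0.0)
            top = np.argsort(viol)[::-1][:ws_size]
            proposals.extend(zip(p[top], viol[top]))  # transmitted candidates
        # b) centralized merge generates the global working set
        proposals.sort(key=lambda t: t[1], reverse=True)
        if proposals[0][1] < tol:               # e) convergence condition met
            break
        ws = [int(i) for i, v in proposals[:ws_size] if v > 0]
        # c) optimize the global working set (one coordinate-ascent sweep)
        for i in ws:
            new = np.clip(alpha[i] + G[i], 0.0, C)  # Q_ii = 1 for the RBF kernel
            delta = new - alpha[i]
            if delta == 0.0:
                continue
            alpha[i] = new
            # d) each node updates the gradients of its own points, computing
            # only the kernel-matrix column it actually needs
            for p in parts:
                Qcol = y[p] * y[i] * rbf(X[p], X[i:i + 1], gamma).ravel()
                G[p] -= Qcol * delta
    return alpha
```

On a two-point toy problem the loop drives the maximum violation below the tolerance in a handful of iterations, with each simulated node ever touching only its own gradient slice and kernel columns.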
20. Apparatus for training a support vector machine based on a set of training data at one of a plurality of processing nodes, comprising:
a) means for selecting a local working set of training data based on local data;
b) means for transmitting selected data related to said local working set;
c) means for receiving an identification of a global working set of training data;
d) means for optimizing said global working set of training data;
e) means for updating a portion of gradients of said global working set of training data; and
f) means for repeating said steps a) through e) until a convergence condition is met.

(Dependent claims 21 through 26 not reproduced here.)