Spread Kernel Support Vector Machine
Abstract
Disclosed is a parallel support vector machine technique for solving problems with a large set of training data where the kernel computation, as well as the kernel cache and the training data, are spread over a number of distributed machines or processors. A plurality of processing nodes are used to train a support vector machine based on a set of training data. Each of the processing nodes selects a local working set of training data based on data local to the processing node, for example a local subset of gradients. Each node transmits selected data related to the working set (e.g., gradients having a maximum value) and receives an identification of a global working set of training data. The processing node optimizes the global working set of training data and updates a portion of the gradients of the global working set of training data. The updating of a portion of the gradients may include generating a portion of a kernel matrix. These steps are repeated until a convergence condition is met. Each of the local processing nodes may store all, or only a portion of, the training data. While the steps of optimizing the global working set of training data, and updating a portion of the gradients of the global working set, are performed in each of the local processing nodes, the function of generating a global working set of training data is performed in a centralized fashion based on the selected data (e.g., gradients of the local working set) received from the individual processing nodes.
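The local-selection and centralized-merge steps described in the abstract (each node proposing its maximum-value gradients, a central step identifying the global working set) might be sketched as follows. This is a minimal illustration, not the patent's exact procedure: the function names, the top-k selection rule, and the use of raw gradient magnitude as the selection score are all assumptions.

```python
import numpy as np

def select_local_working_set(gradients, indices, k=2):
    # Each node ranks only its own gradient slice (local data) and
    # proposes its k largest entries as (global_index, gradient) pairs.
    order = np.argsort(gradients)[::-1][:k]
    return [(int(indices[i]), float(gradients[i])) for i in order]

def merge_global_working_set(candidates_per_node, k=2):
    # Centralized step: pool every node's proposals and keep the k
    # overall-largest gradients; the resulting index list identifies
    # the global working set sent back to all nodes.
    merged = [c for cands in candidates_per_node for c in cands]
    merged.sort(key=lambda t: t[1], reverse=True)
    return [idx for idx, _ in merged[:k]]
```

Because each node transmits only k (index, gradient) pairs rather than its full gradient slice, the communication cost per iteration stays small even when the training data is large.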
26 Claims
1. A method for training a support vector machine based on a set of training data at one of a plurality of processing nodes, comprising the steps of:
a) selecting a local working set of training data based on local data;
b) transmitting selected data related to said local working set;
c) receiving an identification of a global working set of training data;
d) optimizing said global working set of training data;
e) updating a portion of gradients of said global working set of training data; and
f) repeating said steps a) through e) until a convergence condition is met.

(Dependent claims 2 through 9 not reproduced here.)
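Step e) of claim 1, updating only a portion of the gradients while generating only the needed portion of the kernel matrix, could look like the sketch below. The RBF kernel choice, the function name, and the parameter layout are illustrative assumptions; the claim itself does not fix a kernel.

```python
import numpy as np

def update_local_gradients(G_local, X_local, X_ws, y_local, y_ws,
                           delta_alpha, gamma=1.0):
    # Each node updates only the gradients of its own training points.
    # Only the kernel rows between local points and the (small) global
    # working set are computed, never the full kernel matrix.
    d2 = ((X_local[:, None, :] - X_ws[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)                     # local slice of kernel matrix
    Q = (y_local[:, None] * y_ws[None, :]) * K  # label-signed kernel entries
    return G_local + Q @ delta_alpha            # G_i += sum_j Q_ij * dalpha_j
```

Since the working set is small, this per-node update touches only a thin slice of the kernel matrix, which is what lets the kernel computation and cache be spread across the processing nodes.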
10. A method for training a support vector machine based on a set of training data using a plurality of processing nodes, comprising the steps of:
a) selecting, at each of said plurality of processing nodes, a local working set of training data based on local data;
b) generating a global working set of training data using selected data related to said local working sets;
c) optimizing, at each of said plurality of processing nodes, said global working set of training data;
d) updating, at each of said plurality of processing nodes, a portion of gradients of said global working set of training data; and
e) repeating steps a) through d) until a convergence condition is met.

(Dependent claims 11 through 19 not reproduced here.)
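The full loop of claim 10 can be simulated in a single process as below. This is a hedged sketch under stated assumptions: an RBF kernel, and a bias-free L1-SVM dual (whose box-constrained problem has no equality constraint), so plain coordinate ascent on the global working set stands in for the patent's unspecified working-set optimizer. All names and the toy data layout are illustrative.

```python
import numpy as np

def rbf(Xa, Xb, gamma):
    # Gaussian kernel matrix between two sets of row vectors.
    d2 = ((Xa[:, None, :] - Xb[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_spread_svm(X, y, n_nodes=2, C=10.0, gamma=0.5, ws_size=2,
                     tol=1e-4, max_outer=200):
    n = len(y)
    parts = np.array_split(np.arange(n), n_nodes)  # training data spread over nodes
    alpha = np.zeros(n)
    G = np.ones(n)  # dual gradient G_i = 1 - (Q alpha)_i; conceptually each
                    # node holds only its own slice G[parts[k]]
    for _ in range(max_outer):
        # a) each node selects local candidates by gradient-based violation
        proposals = []
        for p in parts:
            up = (G[p] > 0) & (alpha[p] < C)    # alpha_i may still increase
            down = (G[p] < 0) & (alpha[p] > 0)  # alpha_i may still decrease
            viol = np.where(up | down, np.abs(G[p]), 0.0)
            top = np.argsort(viol)[::-1][:ws_size]
            proposals.extend(zip(p[top], viol[top]))  # transmitted candidates
        # b) centralized merge generates the global working set
        proposals.sort(key=lambda t: t[1], reverse=True)
        if proposals[0][1] < tol:               # e) convergence condition met
            break
        ws = [int(i) for i, v in proposals[:ws_size] if v > 0]
        # c) optimize the global working set (one coordinate-ascent sweep)
        for i in ws:
            new = np.clip(alpha[i] + G[i], 0.0, C)  # Q_ii = 1 for the RBF kernel
            delta = new - alpha[i]
            if delta == 0.0:
                continue
            alpha[i] = new
            # d) each node updates the gradients of its own points, computing
            # only the kernel-matrix column it actually needs
            for p in parts:
                Qcol = y[p] * y[i] * rbf(X[p], X[i:i + 1], gamma).ravel()
                G[p] -= Qcol * delta
    return alpha
```

On a two-point toy problem the loop drives the maximum violation below the tolerance in a handful of iterations, with each simulated node ever touching only its own gradient slice and kernel columns.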
20. Apparatus for training a support vector machine based on a set of training data at one of a plurality of processing nodes, comprising:
a) means for selecting a local working set of training data based on local data;
b) means for transmitting selected data related to said local working set;
c) means for receiving an identification of a global working set of training data;
d) means for optimizing said global working set of training data;
e) means for updating a portion of gradients of said global working set of training data; and
f) means for repeating said steps a) through e) until a convergence condition is met.

(Dependent claims 21 through 26 not reproduced here.)