Multi-class classification learning on several processors
First Claim
1. An interactive voice response system comprising:
a first computing unit configured to:
receive a training data set;
sort classes of the training data set by a frequency distribution to yield sorted classes; and
distribute the sorted classes as a plurality of groups across a plurality of processors using a round robin partition, wherein each group includes classes different from classes in each other group, and each group is distributed to a different processor of the plurality of processors, wherein each of the processors is located within a different computing unit, and each processor is configured to process the distributed group of sorted classes to produce learning data and distribute the learning data to each of the other processors;
a second computing unit configured to merge results of the processing into a model; and
a third computing unit configured to receive the model and apply the model.
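The sort-then-distribute step of claim 1 can be sketched as follows. This is a minimal illustration only, not the patent's implementation; the function name, the `(example, label)` data shape, and the return format are assumptions made for the sketch.

```python
from collections import Counter

def round_robin_partition(training_data, num_processors):
    """Sort classes by frequency, then deal them round-robin into groups.

    `training_data` is assumed to be a list of (example, class_label)
    pairs. Each resulting group holds classes different from every other
    group's, as the claim requires, and would be sent to one processor.
    """
    # Sort classes by descending frequency so the largest classes are
    # spread evenly across processors rather than piling onto one.
    freq = Counter(label for _, label in training_data)
    sorted_classes = [cls for cls, _ in freq.most_common()]

    # Round-robin partition: class i goes to group i mod num_processors.
    groups = [[] for _ in range(num_processors)]
    for i, cls in enumerate(sorted_classes):
        groups[i % num_processors].append(cls)
    return groups
```

Because the groups are disjoint, each processor can train on its own classes independently before exchanging learning data with the others.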
Abstract
The time taken to learn a model from training examples is often unacceptable. For instance, training language understanding models with Adaboost or SVMs can take weeks or longer, depending on the number of training examples. Parallelization through the use of multiple processors may improve learning speed. The disclosure describes systems for distributed multiclass classification learning on several processors. These systems are applicable to multiclass models whose training can be split into the training of independent binary classifiers.
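The abstract's premise is that a multiclass problem decomposes into independent binary problems. A common way to do this is a one-vs-rest relabeling, sketched below; the helper name and data shapes are assumptions for illustration, and the patent contemplates training each binary task with a learner such as Adaboost or an SVM.

```python
def one_vs_rest_tasks(examples, labels):
    """Split a multiclass problem into independent binary problems.

    Each task relabels the data as +1 for one class and -1 for all
    others. The tasks share no state, so they can be trained on
    separate processors and the resulting binary classifiers merged
    into a single multiclass model.
    """
    classes = sorted(set(labels))
    tasks = {}
    for c in classes:
        tasks[c] = [(x, 1 if y == c else -1) for x, y in zip(examples, labels)]
    return tasks
```

The number of independent tasks equals the number of classes, which bounds the useful degree of parallelism for this decomposition.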
15 Claims
10. An interactive voice response system comprising:
a first computing unit configured to:
receive a training data set;
split the training data set along examples to yield a split training data set along examples;
split the split training data set along classes to yield a split training data set along classes;
separate the split training data set along classes as a training set into subsets of equal size; and
distribute the subsets across a plurality of processors, such that one subset is distributed to one processor to yield a distributed subset, wherein each of the plurality of processors is located within a different computing unit, and each processor is configured to determine all classifiers of a distributed subset;
a second computing unit configured to merge results of the determining into a model and output the model to a cache operatively connected to the second computing unit; and
a third computing unit configured to receive the model from the cache and apply the model.
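Claim 10's two-level split, first along examples and then along classes, can be sketched as below. This is an illustrative sketch under stated assumptions: the function name, the `(x, y)` pair representation, and the contiguous assignment of classes to blocks are not taken from the patent.

```python
def split_along_examples_and_classes(data, n_example_shards, n_class_blocks):
    """Two-level split of (x, y) pairs: first by example, then by class.

    Returns a flat list of n_example_shards * n_class_blocks subsets,
    one per processor; each subset contains only its shard's examples
    restricted to its block's classes.
    """
    classes = sorted({y for _, y in data})
    # Assign classes to blocks contiguously (illustrative choice).
    block_of = {c: i * n_class_blocks // len(classes)
                for i, c in enumerate(classes)}

    # First split: partition the examples into shards.
    shard_size = (len(data) + n_example_shards - 1) // n_example_shards
    shards = [data[i * shard_size:(i + 1) * shard_size]
              for i in range(n_example_shards)]

    # Second split: within each shard, keep only one class block per subset.
    subsets = []
    for shard in shards:
        for b in range(n_class_blocks):
            subsets.append([(x, y) for x, y in shard if block_of[y] == b])
    return subsets
```

When classes and examples are balanced, the subsets come out equal in size, matching the claim's requirement that each processor receive a subset of equal size.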
Specification