Speech models generated using competitive training, asymmetric training, and data boosting

US 7,693,713 B2
Filed: 06/17/2005
Issued: 04/06/2010
Est. Priority Date: 06/17/2005
Status: Active Grant

Abstract

Speech models are trained using one or more of three different training systems: competitive training, which reduces a distance between a recognized result and a true result; data boosting, which divides and weights training data; and asymmetric training, which trains different model components differently.
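
The data boosting idea in the abstract (dividing training data into groups and weighting each group) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `boost_training_data`, the representation of a sample as a `(class_pair, features)` tuple, and the weight values are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def boost_training_data(samples, group_weights, default_weight=1.0):
    """Divide samples into groups by class pair and attach a per-group weight.

    samples: iterable of (class_pair, features) tuples (hypothetical layout).
    group_weights: mapping from class_pair to the weight for that group.
    Returns a list of (features, class_pair, weight) tuples.
    """
    # Step 1: divide the training data into groups keyed by class pair.
    groups = defaultdict(list)
    for class_pair, features in samples:
        groups[class_pair].append(features)

    # Step 2: weight every sample according to the group it belongs to.
    weighted = []
    for class_pair, items in groups.items():
        weight = group_weights.get(class_pair, default_weight)
        weighted.extend((features, class_pair, weight) for features in items)
    return weighted


# Illustrative data: ("b", "p") stands in for a pair of close classes.
samples = [
    (("b", "p"), [0.1, 0.2]),
    (("a", "s"), [0.9, 0.4]),
    (("b", "p"), [0.2, 0.1]),
]
weighted = boost_training_data(samples, {("b", "p"): 3.0})
```

In this sketch the close pair receives a larger weight so that a downstream trainer would emphasize it; how the weights are actually chosen is not specified here.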

10 Claims

1. A computer implemented method of training a speech model that detects differences between different classes of speech signals, using a computer with a processor, comprising:
- dividing, with the processor, the speech model into a plurality of sub-model groups based on at least one predetermined criterion, a first of the sub-model groups detecting a difference between a corresponding first class of speech signal and a corresponding second class of speech signal, a second of the sub-model groups detecting a difference between a corresponding third class of speech signals and a corresponding fourth class of speech signal wherein the first and second classes of speech signal are closer to one another than the third and fourth classes of speech signal;
- performing, with the processor, different training on each of the plurality of sub-model groups to increase performance of each of the plurality of sub-model groups in detecting differences specific to the corresponding classes of speech signal to obtain a plurality of modified sub-models; and
- combining, with the processor, the plurality of modified sub-models to obtain a modified model.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method of claim 1 wherein performing different training comprises:
    - performing group-specific training on each of the plurality of sub-model groups.
  - 3. The method of claim 1 wherein performing different training comprises:
    - performing a common training method on the plurality of sub-model groups, using different training settings for each of the plurality of sub-model groups.
  - 4. The method of claim 1 wherein performing different training comprises:
    - performing different training methods on each of the plurality of sub-model groups.
  - 5. The method of claim 1 wherein performing different training comprises:
    - performing competitive training, to reduce a distance between a true model result and an actual model result, on at least one of the sub-model groups.
  - 6. The method of claim 1 wherein performing different training comprises:
    - performing competitive function training on at least one of the sub-model groups to train a competitive function used by the speech model.
  - 7. The method of claim 1 wherein performing different training comprises:
    - performing data boost training on at least one of the sub-model groups by dividing training data into data groups and weighting each data group, each training data group including corresponding pairs of training data in the first and second classes of speech signal or the third and fourth classes of speech signal, respectively, and wherein the training data groups are weighted based on which classes of speech signal the corresponding pairs of training data are in, and training at least one sub-model group using the weighted training data.
  - 8. The method of claim 1 wherein dividing the speech model comprises:
    - dividing the speech model into groups of individual model components so the modified sub-models comprise sets of modified individual model components, and wherein combining comprises forming a superset of the sets of modified individual model components.
  - 9. The method of claim 1 and further comprising:
    - performing common model training at least either on the speech model before dividing the speech model into the plurality of sub-model groups, or on the modified model after combining the modified sub-models.
  - 10. The method of claim 1 wherein performing different training comprises:
    - dividing training data into first, second, third and fourth sets representing the first, second, third and fourth classes of speech signals, respectively; and
    - training the first sub-model group using the first and second sets of training data and training the second sub-model group using the third and fourth sets of training data by optimizing an objective function for each sub-model group, without optimizing an objective function for all sub-model groups combined.
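
The divide, train, and combine steps recited in claim 1 can be sketched on a toy model. This is only an illustration under stated assumptions: the model is reduced to a dictionary of scalar parameters, the "training" is a simple nudge toward a target value, and the names `divide_model`, `train_group`, `combine`, the grouping criterion, and the settings are all hypothetical rather than drawn from the patent. It loosely follows claim 3 (one training method with different settings per group) and claim 8 (combining as a superset of modified components).

```python
def divide_model(model, criterion):
    """Split a {component_name: parameter} model into sub-model groups."""
    groups = {}
    for name, param in model.items():
        groups.setdefault(criterion(name), {})[name] = param
    return groups


def train_group(sub_model, targets, learning_rate, steps):
    """Toy training: move each parameter toward its target value.

    The learning rate and step count are the group-specific settings.
    """
    trained = dict(sub_model)
    for _ in range(steps):
        for name in trained:
            trained[name] += learning_rate * (targets[name] - trained[name])
    return trained


def combine(sub_models):
    """Form the superset of all modified components."""
    combined = {}
    for sub_model in sub_models:
        combined.update(sub_model)
    return combined


# Hypothetical components and targets; "b"/"p" stand in for close classes.
model = {"b": 0.0, "p": 0.0, "a": 0.0, "s": 0.0}
targets = {"b": 1.0, "p": -1.0, "a": 2.0, "s": -2.0}


def criterion(name):
    # Assumed predetermined criterion: group components by class closeness.
    return "close" if name in {"b", "p"} else "distant"


# Assumed settings: the close group gets more training steps.
settings = {
    "close": {"learning_rate": 0.5, "steps": 10},
    "distant": {"learning_rate": 0.5, "steps": 1},
}

groups = divide_model(model, criterion)
modified = [train_group(g, targets, **settings[key]) for key, g in groups.items()]
modified_model = combine(modified)
```

Each group is optimized on its own, with no joint objective over all groups, which mirrors the per-group optimization described in claim 10.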

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Wu, Jian; He, Xiaodong
Primary Examiner(s): Sked, Matthew J
Application Number: US11/156,106
Publication Number: US 20060287856A1
Time in Patent Office: 1,754 Days
Field of Search: None
US Class Current: 704/243
CPC Class Codes: G10L 15/063 (Training)
