Recognition unit model training based on competing word and word string models

US 5,579,436 A
Filed: 03/15/1993
Issued: 11/26/1996
Est. Priority Date: 03/02/1992
Status: Expired due to Term

Abstract

A system for pattern-based speech recognition, e.g., a hidden Markov model (HMM) based speech recognizer using Viterbi scoring. The principle of minimum recognition error rate is applied by the present invention using discriminative training. Various issues related to the special structure of HMMs are presented. Parameter update expressions for HMMs are provided.
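
For readers unfamiliar with the Viterbi scoring mentioned in the abstract, the following is a minimal sketch for a discrete-observation HMM, working in the log domain. The function name `viterbi_log_score` and its argument layout are assumptions made for illustration; the patent text reproduced here does not give an implementation.

```python
import math

def viterbi_log_score(obs, log_init, log_trans, log_emit):
    """Return the log-probability of the single best state path.

    obs       : list of observation symbol indices
    log_init  : log_init[s]     = log P(first state = s)
    log_trans : log_trans[r][s] = log P(next state = s | state = r)
    log_emit  : log_emit[s][o]  = log P(symbol o | state = s)
    (All names and the table layout are hypothetical.)
    """
    n_states = len(log_init)
    # delta[s] is the best log score of any path that ends in state s.
    delta = [log_init[s] + log_emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        delta = [
            max(delta[r] + log_trans[r][s] for r in range(n_states))
            + log_emit[s][o]
            for s in range(n_states)
        ]
    return max(delta)

# Toy two-state, two-symbol model (values chosen only for illustration).
toy_init = [math.log(0.5), math.log(0.5)]
toy_trans = [[math.log(0.5), math.log(0.5)],
             [math.log(0.5), math.log(0.5)]]
toy_emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.2), math.log(0.8)]]
toy_score = viterbi_log_score([0, 1], toy_init, toy_trans, toy_emit)
```

Unlike a full forward pass, this keeps only the best predecessor at each step, so the result is the score of one path rather than a sum over all paths.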

23 Claims

1. A method of making a speech recognizer recognition unit model database based on one or more known speech signals and a set of current recognizer recognition unit models, the method comprising the steps of:
- receiving a known speech signal;
- generating a first recognizer scoring signal based on the known speech signal and a current recognition unit model for that signal;
- generating one or more other recognizer scoring signals, each such scoring signal based on the known speech signal and another current recognition unit model;
- generating a misrecognition signal based on the first and other recognizer scoring signals;
- based on a value of a predetermined loss function when applied to the misrecognition signal and the known speech signal, modifying one or more of the current recognition unit models to decrease the likelihood of misrecognizing an unknown speech signal; and
- storing one or more modified recognition unit models in memory.
- Dependent claims:
  - 2. The method of claim 1 wherein the step of generating a misrecognition signal comprises the step of forming a difference between:
    - a. the first recognizer scoring signal; and
    - b. an average of the one or more other recognizer scoring signals.
  - 3. The method of claim 1 wherein the first recognizer scoring signal reflects how well the known speech signal matches the current recognition unit models for that signal.
  - 4. The method of claim 1 wherein the one or more other scoring signals reflect how well the known speech signal matches one or more other current recognition unit models.
  - 5. The method of claim 1 wherein the step of modifying one or more of the current speech recognition unit models comprises the steps of:
    - a. determining a gradient of a function relating
      - i. recognizer scoring of known speech based on a current recognition unit model for that speech to
      - ii. recognizer scoring of known speech based on one or more other current recognition unit models; and
    - b. adjusting one or more parameters of the current speech recognition unit models based on the gradient.
  - 6. The method of claim 5 wherein the step of adjusting one or more parameters is further based on a matrix of current recognition unit model parameters.
  - 7. The method of claim 6 wherein the matrix of current recognition unit model parameters comprises variances of the models.
  - 8. The method of claim 5 wherein the step of adjusting one or more parameters comprises the step of adjusting transformations of recognition unit model parameters to adhere to recognition unit model constraints.
  - 9. The method of claim 1 wherein the set of current recognition unit models comprises one or more hidden Markov models.
  - 10. The method of claim 1 wherein the set of current recognition unit models comprises one or more templates.
  - 11. The method of claim 1 wherein the current recognition unit models comprise the output of a recognition unit model trainer.
  - 12. The method of claim 1 wherein the current recognition unit models comprise a modified set of recognition unit models.
  - 13. The method of claim 1 wherein the step of modifying current recognition unit models comprises the step of modifying recognition unit models a plurality of times prior to storing modified recognition unit models in memory, each of the plurality of modifications based on a distinct known speech signal.
  - 14. The method of claim 1 further comprising the steps of:
    - recognizing an unknown speech signal based on current recognition unit models;
    - providing the recognized speech signal to be received as a known speech signal.
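
The training loop described in claims 1, 2, and 5 can be sketched as follows, using template models (claim 10) to keep the example short. Several choices here are assumptions rather than anything the claims specify: the score is a negative squared Euclidean distance, the misrecognition measure is the arithmetic mean of the competing scores minus the correct-model score, the loss is a sigmoid, and all function and parameter names are hypothetical.

```python
import math

def score(x, template):
    """Recognizer score: negative squared distance (higher is better)."""
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, template))

def misrecognition(x, label, models):
    """Mean competing score minus the correct-model score.

    A positive value suggests the known sample would be misrecognized.
    """
    others = [score(x, m) for k, m in models.items() if k != label]
    return sum(others) / len(others) - score(x, models[label])

def sigmoid_loss(d, gamma=1.0):
    """Smooth 0-to-1 loss applied to the misrecognition measure (assumed form)."""
    return 1.0 / (1.0 + math.exp(-gamma * d))

def update(x, label, models, step=0.1, gamma=1.0):
    """One gradient-descent step on the loss; returns a new model dict."""
    d = misrecognition(x, label, models)
    loss = sigmoid_loss(d, gamma)
    dloss_dd = gamma * loss * (1.0 - loss)   # derivative of the sigmoid
    n_others = len(models) - 1
    new_models = {}
    for k, t in models.items():
        # d depends on score_k with weight -1 (correct) or 1/n_others (competitor).
        dd_dscore = -1.0 if k == label else 1.0 / n_others
        # d(score)/d(t_i) = 2 * (x_i - t_i)
        new_models[k] = [
            ti - step * dloss_dd * dd_dscore * 2.0 * (xi - ti)
            for xi, ti in zip(x, t)
        ]
    return new_models

# Toy data: the sample is labeled "yes" but lies closer to the "no" template.
models = {"yes": [0.0, 0.0], "no": [1.0, 1.0]}
sample, label = [0.6, 0.6], "yes"
before = misrecognition(sample, label, models)
updated = update(sample, label, models)
after = misrecognition(sample, label, updated)
```

In this sketch the correct template moves toward the sample and the competing template moves away, so the misrecognition measure shrinks after the step. Claims 6 to 8 refer to variance-based scaling and constraint-preserving parameter transformations, which this example omits.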

15. A speech recognizer trainer for providing a speech recognizer database based on one or more known speech signals and a set of current recognition unit models, the trainer comprising:
- means for generating a first recognizer scoring signal based on the known speech signal and a current recognition unit model for that signal;
- means, coupled to the means for generating a first recognizer scoring signal, for generating one or more other recognizer scoring signals, each such scoring signals based on the known speech signal and another current recognition unit model;
- means, coupled to the means for generating a first and other recognizer scoring signals, for generating a misrecognition signal based on the first and other recognizer scoring signals;
- means, coupled to the means for generating a misrecognition signal, for modifying one or more of the recognition unit models, based on a value of a predetermined loss function when applied to the misrecognition signal and the known speech signal, to decrease the likelihood of misrecognizing an unknown speech signal; and
- means, coupled to the means for modifying, for storing one or more modified recognition unit models.
- Dependent claims:
  - 16. The trainer of claim 15 wherein the means for generating a misrecognition signal comprises means for forming a difference between:
    - a. the first recognizer scoring signal; and
    - b. an average of the one or more other recognizer scoring signals.
  - 17. The trainer of claim 15 wherein the means for modifying one or more of the current speech recognition unit model comprises:
    - a. means for determining a gradient of a function relating
      - i. recognizer scoring of known speech based on a current recognition unit model for that speech to
      - ii. recognizer scoring of known speech based on one or more other current recognition unit models; and
    - b. means for adjusting one or more parameters of the current speech recognition unit models based on the gradient.
  - 18. The trainer of claim 15 wherein the set of current recognition unit models comprises one or more hidden Markov models.
  - 19. The trainer of claim 15 wherein the set of current recognition unit models comprises one or more templates.
  - 20. The trainer of claim 15 wherein the current recognition unit models comprise the output of a recognition unit model trainer.
  - 21. The trainer of claim 15 wherein the current recognition unit models comprise a modified set of recognition unit models.
  - 22. The trainer of claim 15 further comprising:
    - means for recognizing an unknown speech signal based on current recognition unit models;
    - means for providing the recognized speech signal to be received as a known speech signal.

23. A speech recognition system comprising
- a. a feature extractor for receiving an unknown speech signal and identifying features characterizing the signal;
- b. a first memory means for storing current recognition unit models;
- c. a second memory means for storing known speech training samples;
- d. a scoring comparator, coupled to the feature extractor and the first memory means, for comparing a plurality of current recognition unit models with one or more features of the unknown speech signal to determine a comparison score for each such model;
- e. a score processor, coupled to the scoring comparator, for selecting the highest comparison score and recognizing speech based on the highest score; and
- f. a trainer, coupled to the first and second memory means, the trainer comprising;
  - i. means for generating a first recognizer scoring signal based on a known speech signal and a current recognition unit model for that signal;
  - ii. means, coupled to the means for generating a first recognizer scoring signal, for generating one or more other recognizer scoring signals, each such scoring signal based on the known speech signal and another current recognition unit model;
  - iii. means, coupled to the means for generating a first and other recognizer scoring signals, for generating a misrecognition signal based on the first and other recognizer scoring signals;
  - iv. means, coupled to the means for generating a misrecognition signal, for modifying one or more of the current recognition unit models, based on a value of a predetermined loss function when applied to the misrecognition signal and the known speech signal, to decrease the likelihood of misrecognizing an unknown speech signal; and
  - v. means, coupled to the means for modifying, for storing one or more modified recognition unit models in the first memory means.
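
Elements d and e of claim 23 (scoring comparator and score processor) amount to scoring every stored model against the extracted features and picking the highest score. A minimal sketch, assuming template models and a negative squared-distance score; `recognize` and `neg_sq_distance` are hypothetical names, not taken from the patent:

```python
def neg_sq_distance(features, template):
    """Comparison score: higher means a closer match (assumed scoring rule)."""
    return -sum((f - t) ** 2 for f, t in zip(features, template))

def recognize(features, models, score_fn=neg_sq_distance):
    """Score each model (comparator), then select the best one (processor)."""
    scores = {label: score_fn(features, model) for label, model in models.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores

# Toy model store standing in for the "first memory means".
stored_models = {"yes": [0.0, 0.0], "no": [1.0, 1.0]}
best, all_scores = recognize([0.9, 0.8], stored_models)
```

Passing the scoring function as a parameter keeps the selection step independent of the model type, so an HMM-based scorer could be substituted for the template distance.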

Current Assignee
Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee
Lucent Technologies, Inc. (Nokia Corporation)
Inventors
Chou, Wu; Juang, Biing-Hwang
Primary Examiner(s)
Knepper, David D.

Application Number

US08/030,895
Time in Patent Office

1,352 Days
Field of Search

395/2, 395/2.4, 395/2.49-2.53, 395/2.6, 395/2.64, 395/2.65, 395/2.54, 381/41-43
US Class Current

704/244
CPC Class Codes

G10L 15/063 Training

G10L 15/144 Training of HMMs
