Speech recognition system employing discriminatively trained models

US 6,260,013 B1
Filed: 03/14/1997
Issued: 07/10/2001
Est. Priority Date: 03/14/1997
Status: Expired due to Term


10 Assignments

0 Petitions

Abstract

A speech recognition system has vocabulary word models having for each word model state both a discrete probability distribution function and a continuous probability distribution function. Word models are initially aligned with an input utterance using the discrete probability distribution functions, and an initial matching performed. From well scoring word models, a ranked scoring of those models is generated using the respective continuous probability distribution functions. After each utterance, preselected continuous probability distribution function parameters are discriminatively adjusted to increase the difference in scoring between the best scoring and the next ranking models.

In the event a user subsequently corrects a prior recognition event by selecting a different word model from that generated by the recognition system, a re-adjustment of the continuous probability distribution function parameters is performed by adjusting the current state of the parameters opposite to the adjustment performed with the original recognition event, and adjusting the current parameters to that which would have been performed if the user correction associated word had been the best scoring model.

265 Citations

9 Claims

1. A method for a speech recognition system with word models having descriptive parameters and associated continuous probability density functions (PDFs) to dynamically adjust the word model descriptive parameters, the method comprising:

a. converting an input utterance into a sequence of representative vectors;

b. comparing the sequence of representative vectors with a plurality of word model state sequences and using the continuous PDFs to score each word model state sequence for a likelihood that such state sequence represents the sequence of representative vectors;

c. selecting the word model state sequence having the best score as a recognition result for output to a user;

d. automatically performing a discriminative adjustment to the descriptive parameters of the best scoring word model state sequence and the descriptive parameters of at least one inferior scoring word model state sequence; and

e. if the user corrects the recognition result by selecting a different word sequence,

i. automatically performing an adjustment to the descriptive parameters modified in step (d) that substantially undoes the discriminative adjustment performed in step (d), and

ii. automatically performing a discriminative adjustment to the descriptive parameters of the word model state sequences for the words in the user corrected word sequence and the descriptive parameters of at least one other word model state sequence.

2. A method as in claim 1, wherein in step (d) the at least one inferior scoring word model state sequence is the word model state sequence having the second best score.

3. A method as in claim 1, wherein in step (e)(ii) the at least one other word model state sequence is the word model state sequence having the next best score to the word model state sequence of the user corrected word sequence.

4. A method as in claim 1, wherein the discriminative adjustment uses a gradient descent technique.
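The adjust, undo, and re-adjust cycle of claims 1 to 4 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes each word model is a single state with a unit-variance Gaussian whose mean is the only descriptive parameter, it uses a plain gradient step on the log-likelihood, and it takes the best scoring model other than the corrected word as the competitor. The names `WordModel`, `discriminative_step`, `undo`, `recognize_and_adapt`, and `apply_correction` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordModel:
    word: str
    mean: List[float]  # assumed: one state, unit-variance Gaussian mean


def score(model: WordModel, x: List[float]) -> float:
    """Log-likelihood of x up to a constant (unit variance assumed)."""
    return -0.5 * sum((xi - mi) ** 2 for xi, mi in zip(x, model.mean))


def discriminative_step(winner, rival, x, lr=0.1) -> List[Tuple[WordModel, List[float]]]:
    """One gradient step that raises the winner's score and lowers the rival's.

    Returns the applied deltas so the step can later be undone.
    """
    deltas = []
    for model, sign in ((winner, 1.0), (rival, -1.0)):
        # d(score)/d(mean) = (x - mean); move along it for the winner, against it for the rival.
        delta = [sign * lr * (xi - mi) for xi, mi in zip(x, model.mean)]
        model.mean = [mi + di for mi, di in zip(model.mean, delta)]
        deltas.append((model, delta))
    return deltas


def undo(deltas) -> None:
    """Apply the opposite of a recorded step to the current parameters (step e.i)."""
    for model, delta in deltas:
        model.mean = [mi - di for mi, di in zip(model.mean, delta)]


def recognize_and_adapt(models, x, lr=0.1):
    """Steps b to d: rank models, pick the best, adjust best against second best."""
    ranked = sorted(models, key=lambda m: score(m, x), reverse=True)
    best, second = ranked[0], ranked[1]
    return best, discriminative_step(best, second, x, lr)


def apply_correction(models, x, deltas, corrected_word, lr=0.1):
    """Step e: undo the earlier step, then adjust in favor of the corrected word."""
    undo(deltas)
    corrected = next(m for m in models if m.word == corrected_word)
    rival = max((m for m in models if m is not corrected), key=lambda m: score(m, x))
    return discriminative_step(corrected, rival, x, lr)
```

Storing the deltas rather than a snapshot of the parameters means the undo still behaves sensibly if other utterances have adjusted the same models in the meantime, which is one reading of "substantially undoes" in step (e)(i).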

5. A method for a speech recognition system to convert an input utterance into a representative word sequence text, the method comprising:

a. converting the input utterance into a sequence of representative vectors;

b. quantizing the sequence of representative vectors into a sequence of standard prototype vectors;

c. using discrete probability distribution functions (PDFs) of vocabulary word models to generate an alignment of the sequence of standard prototype vectors with a plurality of word model state sequences and to calculate initial match scores representative of a likelihood that a given word model state sequence alignment represents the sequence of standard prototype vectors;

d. while retaining the alignment established in step (c), rescoring word model state sequences having an initial match score within a selected threshold value of the word model state sequence having the best score by comparing the word model state sequences to be rescored with the sequence of representative vectors using continuous PDFs of the word models; and

e. selecting the word model state sequence having the best rescore as a recognition result for output to a user.

6. A method as in claim 5, further comprising:

7. A method as in claim 6, wherein in step (f) the at least one inferior scoring word model state sequence is the word model state sequence having the second best score.

8. A method as in claim 6, wherein in step (g)(ii) the at least one other word model state sequence is the word model state sequence having the next best score to the word model state sequence of the user corrected word sequence.

9. A method as in claim 5, wherein the discriminative adjustment uses a gradient descent technique.
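The two-pass flow of claim 5 (quantize, align with discrete PDFs, then rescore a shortlist with continuous PDFs along the same alignment) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the alignment is a simple left-to-right Viterbi pass without transition probabilities, each state's continuous PDF is a unit-variance Gaussian, and the names `quantize`, `align`, `rescore`, `recognize`, and the `"discrete"` / `"means"` dictionary keys are hypothetical.

```python
from typing import Dict, List, Tuple

NEG_INF = float("-inf")


def quantize(x: List[float], codebook: List[List[float]]) -> int:
    """Step b: index of the nearest prototype vector (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, codebook[k])))


def align(symbols: List[int], discrete_logp: List[List[float]]) -> Tuple[float, List[int]]:
    """Step c: left-to-right Viterbi alignment using discrete PDFs.

    discrete_logp[s][k] is log P(prototype k | state s).
    Returns the initial match score and the state index assigned to each frame.
    """
    n_states, n_frames = len(discrete_logp), len(symbols)
    best = [[NEG_INF] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    best[0][0] = discrete_logp[0][symbols[0]]  # must start in the first state
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = best[t - 1][s]
            move = best[t - 1][s - 1] if s > 0 else NEG_INF
            back[t][s] = s if stay >= move else s - 1
            best[t][s] = max(stay, move) + discrete_logp[s][symbols[t]]
    path = [n_states - 1]  # must end in the last state
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[n_frames - 1][n_states - 1], path


def rescore(vectors: List[List[float]], path: List[int], means: List[List[float]]) -> float:
    """Step d: continuous-PDF score along the retained alignment (unit variance assumed)."""
    return sum(-0.5 * sum((a - b) ** 2 for a, b in zip(x, means[s]))
               for x, s in zip(vectors, path))


def recognize(vectors, codebook, models: Dict[str, dict], threshold: float):
    """Steps b to e: returns the best word and the rescored shortlist."""
    symbols = [quantize(x, codebook) for x in vectors]
    first_pass = {w: align(symbols, m["discrete"]) for w, m in models.items()}
    top = max(s for s, _ in first_pass.values())
    # Only models whose initial score is within `threshold` of the best are rescored.
    shortlist = {w: p for w, (s, p) in first_pass.items() if top - s <= threshold}
    rescored = {w: rescore(vectors, p, models[w]["means"]) for w, p in shortlist.items()}
    return max(rescored, key=rescored.get), rescored
```

The second pass never re-aligns: it only re-evaluates each shortlisted model against the original, unquantized vectors along the path found in the first pass, so the cost of the continuous PDFs is paid for a few candidates rather than the whole vocabulary.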


Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Lernout & Hauspie Speech Products NV (Intel Corporation)
Inventors: Sejnoha, Vladimir
Primary Examiner(s): Tsang, Fan
Assistant Examiner(s): Opsasnick, Michael N.
Application Number: US08/818,072
Time in Patent Office: 1,579 Days
Field of Search: 704/240, 704/241, 704/222, 704/251, 704/255, 704/244, 704/243
US Class Current: 704/240
CPC Class Codes: G10L 15/063 (Training); G10L 15/144 (Training of HMMs)
