Discriminatively trained mixture models in continuous speech recognition

US 6,490,555 B1
Filed: 04/05/2000
Issued: 12/03/2002
Est. Priority Date: 03/14/1997
Status: Expired due to Term

Abstract

A method of a continuous speech recognition system is given for discriminatively training hidden Markov models for a system recognition vocabulary. An input word phrase is converted into a sequence of representative frames. A correct state sequence alignment with the sequence of representative frames is determined, the correct state sequence alignment corresponding to models of words in the input word phrase. A plurality of incorrect recognition hypotheses is determined representing words in the recognition vocabulary that do not correspond to the input word phrase, each hypothesis being a state sequence based on the word models in an acoustic model database. A correct segment of the correct word model state sequence alignment is selected for discriminative training. A frame segment of frames in the sequence of representative frames is determined that corresponds to the correct segment. An incorrect segment of a state sequence in an incorrect recognition hypothesis is selected, the incorrect segment corresponding to the frame segment. A discriminative adjustment is performed on selected states in the correct segment and the corresponding states in the incorrect segment.
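The segment-selection steps in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the frame-level alignment representation, the `WordSegment` type, the function names, and the choice to pair only frames where the two state labels differ are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordSegment:
    """Hypothetical record of one word in the correct alignment."""
    word: str
    start: int  # first frame index, inclusive
    end: int    # last frame index, exclusive


def frame_segment(words: List[WordSegment], word_index: int) -> Tuple[int, int]:
    """Frame span [start, end) covered by the selected correct segment."""
    seg = words[word_index]
    return seg.start, seg.end


def paired_states(correct_states: List[str],
                  hyp_states: List[str],
                  span: Tuple[int, int]) -> List[Tuple[int, str, str]]:
    """Pair each correct state with the hypothesis state on the same frame.

    Assumption: only frames where the two alignments disagree are kept,
    since identical states offer nothing to discriminate between.
    """
    start, end = span
    return [(t, correct_states[t], hyp_states[t])
            for t in range(start, end)
            if correct_states[t] != hyp_states[t]]


# Toy data: six frames, two words in the correct phrase.
correct = ["a1", "a1", "a2", "b1", "b2", "b2"]
words = [WordSegment("a", 0, 3), WordSegment("b", 3, 6)]
hypothesis = ["a1", "a1", "a2", "c1", "c1", "b2"]

span = frame_segment(words, 1)          # frames of the second word
pairs = paired_states(correct, hypothesis, span)
```

Each resulting triple names a frame, the state to reinforce, and the competing state to penalize, which is the input a discriminative adjustment step would need.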

9 Claims

1. A method of a continuous speech recognition system for discriminatively training hidden Markov models for a system recognition vocabulary, the method comprising:

converting an input word phrase into a sequence of representative frames;

determining a correct state sequence alignment with the sequence of representative frames, the correct state sequence alignment corresponding to models of words in the input word phrase;

determining a plurality of incorrect recognition hypotheses representing words in the recognition vocabulary that do not correspond to the input word phrase, each hypothesis being a state sequence based on the word models in an acoustic model database;

selecting a correct segment of the correct word model state sequence alignment for discriminative training;

determining a frame segment of frames in the sequence of representative frames that corresponds to the correct segment;

selecting an incorrect segment of a state sequence in an incorrect recognition hypothesis, the incorrect segment corresponding to the frame segment;

performing a discriminative adjustment on selected states in the correct segment and the corresponding states in the incorrect segment.

2. A method according to claim 1, wherein performing a discriminative adjustment occurs in a batch training mode at the end of a user session with the speech recognition system, and the discriminative adjustment performed on the selected and corresponding states represents a sum of calculated adjustments over the session.

3. A method according to claim 1, wherein performing a discriminative adjustment occurs in an on-line mode in which the selected and corresponding states are discriminatively adjusted for each input word phrase.

4. A method according to claim 1, wherein performing a discriminative adjustment includes using a language model weighting of the selected and corresponding states.

5. A method according to claim 4, wherein when the selected segment of an incorrect recognition hypothesis is a fractional portion of a word model state sequence, the language model weighting for the fractional portion corresponds to the fractional amount of the word model that the fractional portion represents.

6. A method according to claim 1, wherein the discriminative adjustment includes performing a gradient adjustment to selected branches of a selected state in the correct hypothesis model and a corresponding state in the incorrect hypothesis.

7. A method according to claim 6, wherein the gradient adjustment is to the best branch in each state model.

8. A method according to claim 1, wherein the hidden Markov models are speaker independent models.

9. A method according to claim 1, wherein the hidden Markov models are speaker dependent models.
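Claims 2 through 7 can be made concrete with a rough sketch. The claims do not give an update formula, so everything here is an assumption: each state is modeled as a list of branch mean vectors, the "best branch" is taken to be the mean nearest the frame, the gradient step simply moves the correct branch toward the frame and the incorrect branch away from it, and the fractional language model weight is scaled in direct proportion. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Models = Dict[str, List[Vector]]  # state name -> branch mean vectors (assumed layout)
Triple = Tuple[str, str, Vector]  # (correct state, incorrect state, frame)


def best_branch(means: List[Vector], frame: Vector) -> int:
    """Index of the branch whose mean is nearest the frame (claim 7, assumed metric)."""
    dists = [sum((m - x) ** 2 for m, x in zip(mean, frame)) for mean in means]
    return dists.index(min(dists))


def pair_deltas(models: Models, correct: str, incorrect: str, frame: Vector,
                step: float = 0.1, lm_weight: float = 1.0) -> Dict[Tuple[str, int], Vector]:
    """Stand-in gradient step for one frame; assumes correct != incorrect."""
    deltas = {}
    for state, sign in ((correct, 1.0), (incorrect, -1.0)):
        b = best_branch(models[state], frame)
        mean = models[state][b]
        deltas[(state, b)] = [sign * step * lm_weight * (x - m)
                              for m, x in zip(mean, frame)]
    return deltas


def apply_deltas(models: Models, deltas: Dict[Tuple[str, int], Vector]) -> None:
    """Add each delta to the branch mean it belongs to, in place."""
    for (state, b), d in deltas.items():
        models[state][b] = [m + di for m, di in zip(models[state][b], d)]


def train_online(models: Models, triples: List[Triple], **kw) -> None:
    """Claim 3 style: adjust the models immediately after each input."""
    for correct, incorrect, frame in triples:
        apply_deltas(models, pair_deltas(models, correct, incorrect, frame, **kw))


def train_batch(models: Models, triples: List[Triple], **kw) -> None:
    """Claim 2 style: sum the adjustments over a session, then apply once."""
    total: Dict[Tuple[str, int], Vector] = {}
    for correct, incorrect, frame in triples:
        for key, d in pair_deltas(models, correct, incorrect, frame, **kw).items():
            prev = total.get(key, [0.0] * len(d))
            total[key] = [p + di for p, di in zip(prev, d)]
    apply_deltas(models, total)


def fractional_lm_weight(word_weight: float, states_in_segment: int,
                         states_in_word: int) -> float:
    """Claim 5, assumed proportional: weight scaled by the covered fraction of the word."""
    return word_weight * states_in_segment / states_in_word
```

In this sketch the batch variant computes every delta against the unmodified session-start models, whereas the on-line variant lets each adjustment influence the next, which is the practical difference between the two modes.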

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: ScanSoft, Inc. n/k/a Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Yegnanarayanan, Girija; Sarukkai, Ramesh; Sejnoha, Vladimir
Primary Examiner(s): Chawan, Vijay
Assistant Examiner(s): Opsasnick, Michael N.
Application Number: US09/543,202
Time in Patent Office: 972 Days
Field of Search: 704/231, 704/236, 704/238, 704/240
US Class Current: 704/231
CPC Class Codes: G10L 15/144 Training of HMMs; G10L 15/146 with insufficient amount of...
