Unsupervised HMM adaptation based on speech-silence discrimination

US 6,076,057 A
Filed: 05/21/1997
Issued: 06/13/2000
Est. Priority Date: 05/21/1997
Status: Expired due to Term

Abstract

An unsupervised, discriminative, sentence-level HMM adaptation method based on speech-silence classification is presented. Silence and speech regions are determined using either a speech end-pointer or the segmentation obtained from the recognizer in a first pass. A discriminative training procedure, using Generalized Probabilistic Descent (GPD) or any other discriminative training algorithm in conjunction with the HMM-based recognizer, is then used to increase the discrimination between silence and speech.
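The abstract mentions a speech end-pointer as one way to obtain the silence and speech regions. The following is a minimal sketch of a simple energy-threshold end-pointer, not the method described in the patent; the function names, the non-overlapping frames, the -30 dB threshold, and the toy signal are all assumptions chosen for illustration.

```python
import math

def frame_energies(samples, frame_len):
    """Log-energy (in dB) of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))  # small floor avoids log(0)
    return energies

def label_frames(energies, threshold_db):
    """Label a frame 'speech' if its log-energy exceeds the threshold, else 'silence'."""
    return ["speech" if e > threshold_db else "silence" for e in energies]

def to_regions(labels):
    """Merge runs of identical labels into (label, first_frame, last_frame) tuples."""
    regions = []
    for i, lab in enumerate(labels):
        if regions and regions[-1][0] == lab:
            regions[-1] = (lab, regions[-1][1], i)
        else:
            regions.append((lab, i, i))
    return regions

# Toy signal (hypothetical values): quiet, loud, quiet; 4 samples per frame.
signal = [0.001] * 8 + [0.5, -0.5] * 4 + [0.001] * 8
labels = label_frames(frame_energies(signal, frame_len=4), threshold_db=-30.0)
regions = to_regions(labels)
print(regions)
```

A practical end-pointer would typically add smoothing and hangover logic so that brief dips in energy do not split a word; that is omitted here to keep the sketch short.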

88 Citations

12 Claims

1. A method for discriminating between speech and background regions, comprising:
- segmenting an input utterance into speech and background regions without knowledge of the lexical content of the input utterance to create a segmented input string;
- introducing insertion errors into the background regions that are error prone to generate error laden background strings;
- statistically modeling the segmented input string and the error laden background strings using a discriminative training algorithm to generate a model with adapted parameters;
- decoding the input utterance using the model with the adapted parameters; and
- outputting a recognized string based on the decoding step.

2. The method as recited in claim 1, wherein the model with adapted parameters is generated using Hidden Markov Models.

3. The method as recited in claim 1, wherein the segmenting step uses Viterbi decoding.

4. The method as recited in claim 1, wherein the discriminative training algorithm is a minimum string-error training algorithm using N competing string models.

5. The method as recited in claim 1, wherein the decoding step uses Viterbi decoding.

6. The method as recited in claim 1, wherein the statistically modeling step uses a Generalized Probabilistic Descent algorithm.
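Claims 3 and 5 refer to Viterbi decoding. As a rough illustration only, the sketch below runs a textbook log-domain Viterbi search over a two-state (silence/speech) discrete HMM. The state names, the "lo"/"hi" observation symbols, and every probability are hypothetical values chosen for the example; none of them come from the patent.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely state sequence for obs under a discrete HMM (log domain)."""
    # best[t][s]: best log-score of any path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def logs(table):
    """Convert a dict of probabilities into a dict of log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

states = ("sil", "sp")
log_start = logs({"sil": 0.5, "sp": 0.5})
log_trans = {"sil": logs({"sil": 0.8, "sp": 0.2}),
             "sp": logs({"sil": 0.2, "sp": 0.8})}
log_emit = {"sil": logs({"lo": 0.9, "hi": 0.1}),
            "sp": logs({"lo": 0.2, "hi": 0.8})}

# Quantized frame energies: low, low, high, high, high, low, low.
path = viterbi(["lo", "lo", "hi", "hi", "hi", "lo", "lo"],
               states, log_start, log_trans, log_emit)
print(path)
```

In a recognizer the emission scores would come from the acoustic models rather than a two-symbol table, but the dynamic-programming recursion and the traceback have the same shape.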

7. A system for decoding of speech information comprising:
- means for segmenting an input utterance into speech and background regions without knowledge of the lexical content of the input utterance to create a segmented input string;
- means for introducing insertion errors into the background regions that are error prone to generate error laden background strings;
- means for statistically modeling the segmented input string and the error laden background strings using a discriminative training algorithm to generate a model with adapted parameters;
- means for decoding the input utterance using the model with the adapted parameters; and
- means for outputting a recognized string based on the decoded input utterance.

8. The system as recited in claim 7, wherein the means for statistically modeling generates the model with adapted parameters using Hidden Markov Models.

9. The system as recited in claim 7, wherein the means for segmenting the input utterances into the segments uses Viterbi decoding.

10. The system as recited in claim 7, wherein the discriminative training algorithm includes a minimum string-error training algorithm using N competing string models.

11. The system as recited in claim 7, wherein means for decoding the input utterance uses Viterbi decoding.

12. The system as recited in claim 7, wherein the means for statistically modeling uses a Generalized Probabilistic Descent algorithm.
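Claims 4, 6, 10, and 12 name minimum string-error training with N competing strings and Generalized Probabilistic Descent. The sketch below shows the form these commonly take in the discriminative-training literature: a misclassification measure that compares the score of the reference string with a smoothed average over N competitors, a sigmoid loss, and a gradient step on that loss. It is an assumption-laden toy, not the patent's procedure: the scores are made-up log-likelihoods, and the single adapted parameter `b` (a bias added to the reference string's score) is hypothetical.

```python
import math

def misclassification(g_correct, g_competitors, eta=1.0):
    """d = -g_correct + (1/eta) * log(mean(exp(eta * g_k))) over N competing strings."""
    m = max(g_competitors)  # subtract the max for numerical stability
    mean_exp = sum(math.exp(eta * (g - m)) for g in g_competitors) / len(g_competitors)
    return -g_correct + m + math.log(mean_exp) / eta

def sigmoid_loss(d, gamma=1.0):
    """Smooth 0-1 loss: near 1 when d > 0 (string error), near 0 when d < 0."""
    return 1.0 / (1.0 + math.exp(-gamma * d))

def loss_for(b, base_correct, g_competitors):
    """Loss when the hypothetical bias b is added to the reference string's score."""
    return sigmoid_loss(misclassification(base_correct + b, g_competitors))

# Made-up log-likelihood scores: the reference string currently loses to two competitors.
base_correct, competitors = -12.0, [-11.0, -11.5, -13.0]
b, step = 0.0, 0.5
before = loss_for(b, base_correct, competitors)
for _ in range(20):
    loss = loss_for(b, base_correct, competitors)
    grad_b = -loss * (1.0 - loss)  # d(loss)/db with gamma = 1, because d(d)/db = -1
    b -= step * grad_b             # descent step: move b against the gradient
after = loss_for(b, base_correct, competitors)
print(before, after, b)
```

In an actual HMM system the gradient would be taken with respect to the model parameters (for example, Gaussian means and variances) rather than a single bias, but the chain rule through the sigmoid and the misclassification measure is the same.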

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: AT&T Corporation (AT&T, Inc.)
Inventors: Narayanan, Shrikanth Sambasivan; Potamianos, Alexandros; Zeljkovic, Ilija
Primary Examiner(s): Voeltz, Emanuel Todd
Assistant Examiner(s): Sofocleous, Michael D
Application Number: US08/861,413
Time in Patent Office: 1,119 Days
Field of Search: 704/210, 704/214, 704/244, 704/256, 704/231, 704/233
US Class Current: 704/256.2
CPC Class Codes:
- G10L 15/065: Adaptation
- G10L 15/142: Hidden Markov Models [HMMs]
- G10L 25/78: Detection of presence or ab...
