Method of active learning for automatic speech recognition

US 7,149,687 B1
Filed: 12/24/2002
Issued: 12/12/2006
Est. Priority Date: 07/29/2002
Status: Active Grant

Abstract

State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.

12 Claims

1. A method for reducing the transcription effort for training an automatic speech recognition module, the method comprising:
- (1) training acoustic and language models using a small set of transcribed data S_t;
  
  (2) recognizing utterances in a set S_u that are candidates for transcription using the acoustic and language models;
  
  (3) computing confidence scores of the utterances;
  
  (4) selecting k utterances that have the smallest confidence scores from S_u and transcribing them into a new set S_i;
  
  (5) redefining S_i as the union of S_t and S_i;
  
  (6) redefining S_u as S_u minus S_i; and
  
  (7) returning to step (1) if word accuracy has not converged.

2. The method of claim 1, wherein k is one.

3. The method of claim 1, wherein k is more than one.

4. The method of claim 1, wherein selecting k utterances that have the smallest confidence scores from S_u further comprises leaving out utterances with confidence scores indicating that the utterances were correctly recognized.

5. The method of claim 1, wherein word posterior probability estimates are used for word confidence scores associated with the utterances.

6. The method of claim 5, wherein a word is considered to be correctly recognized if its posterior probability is higher than a threshold value.
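
The iterative procedure of claim 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the callables `train`, `confidence`, `transcribe`, and `word_accuracy` are hypothetical placeholders for the recognizer and the human transcriber, the parameters `tol` and `max_rounds` are invented convergence controls, and the sketch assumes that the union formed in step (5) is the data used for retraining in the next pass. The optional `skip_at_or_above` argument is one possible reading of claim 4.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def select_least_confident(
    scores: Dict[str, float],
    k: int,
    skip_at_or_above: Optional[float] = None,
) -> List[str]:
    """Step (4): ids of the k utterances with the smallest confidence scores.

    If skip_at_or_above is set, utterances scoring at or above it are treated
    as correctly recognized and left out (an assumed reading of claim 4).
    """
    candidates = [
        utt for utt, score in scores.items()
        if skip_at_or_above is None or score < skip_at_or_above
    ]
    # Ties are broken by utterance id so the result is deterministic.
    return sorted(candidates, key=lambda utt: (scores[utt], utt))[:k]


def active_learning_loop(
    transcribed: Dict[str, str],                  # S_t: utterance id -> transcript
    untranscribed: Set[str],                      # S_u: utterance ids
    train: Callable[[Dict[str, str]], object],    # hypothetical trainer
    confidence: Callable[[object, str], float],   # hypothetical recognizer + scorer
    transcribe: Callable[[str], str],             # hypothetical human transcriber
    word_accuracy: Callable[[object], float],     # hypothetical held-out evaluation
    k: int = 1,
    tol: float = 1e-3,                            # assumed convergence tolerance
    max_rounds: int = 100,                        # assumed safety cap
) -> Tuple[object, Dict[str, str], Set[str]]:
    s_t = dict(transcribed)
    s_u = set(untranscribed)
    model, prev_acc = None, None
    for _ in range(max_rounds):
        model = train(s_t)                                   # step (1)
        acc = word_accuracy(model)
        if prev_acc is not None and abs(acc - prev_acc) < tol:
            break                                            # step (7): converged
        prev_acc = acc
        scores = {u: confidence(model, u) for u in s_u}      # steps (2)-(3)
        chosen = select_least_confident(scores, k)           # step (4)
        if not chosen:
            break                                            # nothing left to transcribe
        s_i = {u: transcribe(u) for u in chosen}
        s_t.update(s_i)                                      # step (5): union with S_t
        s_u -= set(s_i)                                      # step (6): S_u minus S_i
    return model, s_t, s_u
```

Setting `k=1` corresponds to claim 2 and `k>1` to claim 3. The convergence test here compares word accuracy between consecutive rounds, which is only one way to decide that accuracy "has not converged".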

7. A method for reducing the transcription effort for training an automatic speech recognition module, the method comprising:
- (1) pre-processing unlabeled examples of transcribed utterances using a computer device;
  
  (2) using a lattice output from a speech recognizer, automatically estimating a confidence score for each word associated with a set of selected examples;
  
  (3) computing utterance confidence scores based on the estimated confidence score for each word; and
  
  (4) selecting utterances to be transcribed using the utterance confidence scores.

8. The method of claim 7, wherein the speech recognizer was trained on a small set of transcribed data.
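
Steps (3) and (4) of claim 7 can be illustrated with a short Python sketch. The claims do not fix how word scores are combined into an utterance score, so the two aggregation functions below (an arithmetic mean and a thresholded vote) are assumptions chosen for illustration, as are all function names, the default `threshold` of 0.5, and the choice to pick the lowest-scoring utterances. The word scores are taken as given; estimating them from a recognizer lattice is outside this sketch.

```python
from typing import Callable, Dict, List, Sequence


def mean_confidence(word_scores: Sequence[float]) -> float:
    """Utterance score as the arithmetic mean of its word confidence scores."""
    if not word_scores:
        return 0.0  # assumption: an empty hypothesis is treated as least confident
    return sum(word_scores) / len(word_scores)


def voting_confidence(word_scores: Sequence[float], threshold: float = 0.5) -> float:
    """Utterance score as the fraction of words whose score exceeds a threshold."""
    if not word_scores:
        return 0.0  # assumption: an empty hypothesis is treated as least confident
    accepted = sum(1 for score in word_scores if score > threshold)
    return accepted / len(word_scores)


def select_for_transcription(
    word_scores_by_utt: Dict[str, Sequence[float]],
    k: int,
    aggregate: Callable[[Sequence[float]], float] = mean_confidence,
) -> List[str]:
    """Return the ids of the k utterances with the lowest aggregated score."""
    utt_scores = {utt: aggregate(ws) for utt, ws in word_scores_by_utt.items()}
    # Ties are broken by utterance id so the result is deterministic.
    return sorted(utt_scores, key=lambda utt: (utt_scores[utt], utt))[:k]
```

With the mean, a single badly recognized word lowers the score only in proportion to utterance length; with voting, each word contributes a binary accept or reject, in the spirit of the threshold test described in claim 6.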

9. A computer-readable medium that stores a program for controlling a computer device to perform the following method to reduce the transcription effort for training an automatic speech recognition module, the method comprising:
- (1) training acoustic and language models using a small set of transcribed data S_t;
  
  (2) recognizing utterances in a set S_u that are candidates for transcription using the acoustic and language models;
  
  (3) computing confidence scores of the utterances;
  
  (4) selecting k utterances that have the smallest confidence scores from S_u and transcribing them into a new set S_i;
  
  (5) redefining S_i as the union of S_t and S_i;
  
  (6) redefining S_u as S_u minus S_i; and
  
  (7) returning to step (1) if word accuracy has not converged.

10. A computer-readable medium that stores a program for controlling a computer device to perform the following method to reduce the transcription effort for training an automatic speech recognition module, the method comprising:
- (1) pre-processing unlabeled examples of transcribed utterances using a computer device;
  
  (2) using a lattice output from a speech recognizer, automatically estimating a confidence score for each word associated with a set of selected examples;
  
  (3) computing utterance confidence scores based on the estimated confidence score for each word; and
  
  (4) selecting utterances to be transcribed using the utterance confidence scores.

11. An automatic speech recognition module trained using a method of reducing the transcription effort for training an automatic speech recognition module, the method comprising:
- (1) training acoustic and language models using a small set of transcribed data S_t;
  
  (2) recognizing utterances in a set S_u that are candidates for transcription using the acoustic and language models;
  
  (3) computing confidence scores of the utterances;
  
  (4) selecting k utterances that have the smallest confidence scores from S_u and transcribing them into a new set S_i;
  
  (5) redefining S_i as the union of S_t and S_i;
  
  (6) redefining S_u as S_u minus S_i; and
  
  (7) returning to step (1) if word accuracy has not converged.

12. An automatic speech recognition module trained using a method of reducing the transcription effort for training an automatic speech recognition module, the method comprising:
- (1) pre-processing unlabeled examples of transcribed utterances using a computer device;
  
  (2) using a lattice output from a speech recognizer, automatically estimating a confidence score for each word associated with a set of selected examples;
  
  (3) computing utterance confidence scores based on the estimated confidence score for each word; and
  
  (4) selecting utterances to be transcribed using the utterance confidence scores.

Current Assignee: Interactions, LLC
Original Assignee: AT&T Corporation (AT&T, Inc.)
Inventors: Riccardi, Giuseppe; Gorin, Allen Louis; Hakkani-Tur, Dilek Z.
Primary Examiner(s): Knepper, David D.
Application Number: US10/329,139
Time in Patent Office: 1,449 Days
Field of Search: None
US Class Current: 704/243
CPC Class Codes: G10L 15/063 Training
