Active learning for spoken language understanding

US 7,742,918 B1
Filed: 07/05/2007
Issued: 06/22/2010
Est. Priority Date: 10/25/2002
Status: Expired due to Fees

Abstract

Disclosed is a system and method of training a spoken language understanding module. Such a module may be utilized in a spoken dialog system. The method of training a spoken language understanding module comprises training acoustic and language models using a small set of transcribed data S_t, recognizing utterances in a set S_u that are candidates for transcription using the acoustic and language models, computing confidence scores of the utterances, selecting k utterances that have the smallest confidence scores from S_u and transcribing them into a new set S_i, redefining S_t as the union of S_t and S_i, redefining S_u as S_u minus S_i, and returning to the step of training acoustic and language models if word accuracy has not converged.
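
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name, the injected callables (`train`, `confidence`, `transcribe`, `word_accuracy`), the convergence test (absolute change in word accuracy below `tol`), and stopping when the pool is empty are all assumptions made for the example.

```python
from typing import Callable, Hashable, List, Set, Tuple


def active_learning_loop(
    transcribed: Set[Hashable],
    pool: Set[Hashable],
    train: Callable[[Set[Hashable]], object],
    confidence: Callable[[object, Hashable], float],
    transcribe: Callable[[Hashable], None],
    word_accuracy: Callable[[object], float],
    k: int,
    tol: float = 1e-3,
) -> Tuple[object, List[float]]:
    """Hypothetical sketch of the train / score / select / transcribe cycle."""
    s_t, s_u = set(transcribed), set(pool)  # work on copies of S_t and S_u
    history: List[float] = []
    while True:
        model = train(s_t)  # train models on the current S_t
        history.append(word_accuracy(model))
        # Assumed convergence test: accuracy changed by less than tol.
        converged = len(history) > 1 and abs(history[-1] - history[-2]) < tol
        if converged or not s_u:
            return model, history
        # Select the k utterances in S_u with the smallest confidence scores.
        s_i = set(sorted(s_u, key=lambda u: confidence(model, u))[:k])
        for utterance in s_i:
            transcribe(utterance)  # stands in for manual transcription
        s_t |= s_i  # S_t = S_t ∪ S_i
        s_u -= s_i  # S_u = S_u − S_i
```

Passing the training, scoring, and evaluation steps in as callables keeps the sketch independent of any particular recognizer or toolkit.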

15 Claims

1. A non-transitory computer-readable storage medium storing instructions for controlling a computing device to generate a classifier, the instructions comprising:
- (1) training a classifier using current training data S_t, the training data S_t generated by sampling a plurality of utterances;
  
  (2) classifying utterances in a pool S_u using the trained classifier;
  
  (3) computing a call type confidence score for each utterance;
  
  (4) sorting candidate utterances with respect to the confidence score of the maximum scoring call type;
  
  (5) selecting the lowest scored k utterances from S_u using the confidence scores and labeling them to define a labeled set S_i;
  
  (6) redefining S_t = S_t ∪ S_i; and
  
  (7) redefining S_u = S_u − S_i.
- Dependent claims (2, 3, 4, 5, 6)
  - 2. The non-transitory computer-readable storage medium of claim 1, wherein steps 1 through 7 are practiced until labelers and utterances are no longer available.
  - 3. The non-transitory computer-readable storage medium of claim 1, wherein k is more than one.
  - 4. The non-transitory computer-readable storage medium of claim 1, wherein selecting k utterances from S_u further comprises leaving out utterances with confidence scores indicating that the utterances were correctly recognized.
  - 5. The non-transitory computer-readable storage medium of claim 1, wherein selecting k utterances from S_u further comprises selecting the lowest scoring k utterances from S_u.
  - 6. The non-transitory computer-readable storage medium of claim 1, wherein selecting k utterances from S_u further comprises selecting utterances according to a confidence score distribution that is closest to a prior distribution.
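
Steps (3) through (7) of claim 1 can be illustrated with a short sketch. It assumes the classifier's output is available as a mapping from utterance id to per-call-type confidence scores; the function names, the call types `billing` and `repair`, the utterance ids, and the tie-break by utterance id are all hypothetical choices made for the example.

```python
from typing import Dict, List, Set, Tuple


def select_least_confident(
    call_type_scores: Dict[str, Dict[str, float]], k: int
) -> List[str]:
    """Return the k utterances whose top call-type score is lowest."""
    # Confidence of an utterance = score of its maximum scoring call type.
    top = {u: max(scores.values()) for u, scores in call_type_scores.items()}
    # Sort ascending by that score; ties broken by utterance id (assumption).
    ranked = sorted(top, key=lambda u: (top[u], u))
    return ranked[:k]


def update_sets(
    s_t: Set[str], s_u: Set[str], s_i: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Apply S_t = S_t ∪ S_i and S_u = S_u − S_i."""
    return s_t | s_i, s_u - s_i


# Illustrative scores only; not taken from the patent.
example_scores = {
    "u1": {"billing": 0.90, "repair": 0.05},
    "u2": {"billing": 0.40, "repair": 0.35},
    "u3": {"billing": 0.20, "repair": 0.60},
    "u4": {"billing": 0.55, "repair": 0.50},
}
picked = select_least_confident(example_scores, k=2)  # ["u2", "u4"]
new_s_t, new_s_u = update_sets({"u0"}, set(example_scores), set(picked))
```

Here `u2` and `u4` are picked because their best call-type scores (0.40 and 0.55) are the two lowest in the pool, so they move from the unlabeled pool into the training set once labeled.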

7. A non-transitory computer-readable storage medium storing instructions for controlling a computing device to generate a spoken language understanding module, the instructions comprising, from a small amount of training data S_t and a larger amount of unlabeled data S_u:
- (1) training a plurality of classifiers independently using a training data set S_t, the training data S_t generated by sampling a plurality of utterances;
  
  (2) classifying utterances in a set S_u using the plurality of classifiers and computing a call type confidence score for all utterances;
  
  (3) sorting candidate utterances with respect to a score of the maximum scoring call type according to one of the classifiers if the classifiers disagree;
  
  (4) selecting and labeling the lowest scored k utterances from S_u to define a labeled set S_i and redefining S_t and S_u as follows;
  
  (5) S_t = S_t ∪ S_i; and
  
  (6) S_u = S_u − S_i, wherein the labeled utterances are used to generate the spoken language understanding module.
- Dependent claims (8, 9, 10)
  - 8. The non-transitory computer-readable storage medium of claim 7, wherein the steps occur only while labelers and utterances are available.
  - 9. The non-transitory computer-readable storage medium of claim 7, wherein selecting k utterances from S_u further comprises selecting utterances according to a confidence score distribution that is closest to a prior distribution.
  - 10. The non-transitory computer-readable storage medium of claim 7, wherein selecting k utterances from S_u further comprises selecting the lowest scoring k utterances from S_u.
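
The disagreement-based selection in steps (2) through (4) of claim 7 might look like the sketch below. It is one reading of the claim, under stated assumptions: exactly two classifiers, candidates limited to utterances where their top call types differ, and ranking by the first classifier's top score. The function name, the call types, and the scores are hypothetical.

```python
from typing import Dict, List

Scores = Dict[str, Dict[str, float]]  # utterance id -> call type -> score


def select_by_disagreement(scores_a: Scores, scores_b: Scores, k: int) -> List[str]:
    """Pick up to k low-confidence utterances on which two classifiers disagree."""
    candidates = []
    for utterance, by_type_a in scores_a.items():
        by_type_b = scores_b[utterance]
        best_a = max(by_type_a, key=by_type_a.get)  # top call type, classifier A
        best_b = max(by_type_b, key=by_type_b.get)  # top call type, classifier B
        if best_a != best_b:
            # Rank by classifier A's maximum score (assumed choice of
            # "one of the classifiers").
            candidates.append((by_type_a[best_a], utterance))
    candidates.sort()  # lowest score first
    return [utterance for _, utterance in candidates[:k]]


# Illustrative scores only; not taken from the patent.
committee_a = {
    "u1": {"billing": 0.90, "repair": 0.10},
    "u2": {"billing": 0.60, "repair": 0.40},
    "u3": {"billing": 0.30, "repair": 0.70},
    "u4": {"billing": 0.55, "repair": 0.45},
}
committee_b = {
    "u1": {"billing": 0.80, "repair": 0.20},
    "u2": {"billing": 0.30, "repair": 0.70},
    "u3": {"billing": 0.65, "repair": 0.35},
    "u4": {"billing": 0.20, "repair": 0.80},
}
to_label = select_by_disagreement(committee_a, committee_b, k=2)  # ["u4", "u2"]
```

Both classifiers agree on `u1`, so it is never a candidate; of the three disputed utterances, `u4` and `u2` have the lowest top scores under the first classifier.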

11. A method of generating a spoken dialog understanding module, the method causing a processor of a computing device to perform steps comprising, from a small amount of training data S_t and a larger amount of unlabeled data S_u:
- classifying via the processor of the computing device utterances in an unlabelled data set S_u using a plurality of classifiers;
  
  computing via the processor of the computing device a call type confidence score for all utterances;
  
  selecting utterances for labeling from the unlabeled data S_u based on whether the classification from the plurality of classifiers disagree;
  
  redefining S_t = S_t ∪ a labeled set S_i;
  
  redefining S_u = S_u − S_i;
  
  labeling the selected utterances; and
  
  generating a spoken language understanding module using the labeled utterances.
- Dependent claims (12, 13, 14, 15)
  - 12. The method of claim 11, wherein the selected utterances are the lowest scored k utterances from S_u to the final label set S_i, wherein the method further causes the processor of the computing device to perform steps comprising redefining S_t and S_u as follows:
    - S_t = S_t ∪ S_i; and
      
      S_u = S_u − S_i, wherein the labeled utterances are used to generate the spoken language understanding module.
  - 13. The method of claim 12, wherein the steps occur only while labelers and utterances are available.
  - 14. The method of claim 12, wherein selecting k utterances from S_u further comprises selecting utterances according to a confidence score distribution that is closest to a prior distribution.
  - 15. The method of claim 12, wherein selecting k utterances from S_u further comprises selecting the lowest scoring k utterances from S_u.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Hakkani-Tur, Dilek Z.; Tur, Gokhan; Schapire, Robert Elias
Primary Examiner(s): Han, Qi
Application Number: US11/773,681
Time in Patent Office: 1,083 Days
Field of Search: 704/245, 704/243, 704/231, 704/251
US Class Current: 704/245
CPC Class Codes: G10L 15/063 (Training)
