System and method for targeted tuning module of a speech recognition system

US 7,580,837 B2
Filed: 08/12/2004
Issued: 08/25/2009
Est. Priority Date: 08/12/2004
Status: Active Grant

Abstract

A system and method are disclosed for targeted tuning of a speech recognition system. A method incorporating teachings of the present disclosure may include deploying a speech recognition module to apply an appropriate interpretation to a plurality of utterance types. The method may also include accessing information representing a collection of recorded utterances and an assigned interpretation for each of the plurality of recorded utterances. The assigned interpretation for each of the plurality of recorded utterances may then be compared to an accurate interpretation for each of the plurality of utterances, and a separate accuracy value may be determined for each of the plurality of utterance types. With some implementations, if the separate accuracy value for a given type of utterance is too low, a selection of utterances having the given type may be used to tune the speech recognition module.

26 Claims

1. A method of tuning a speech system comprising:
- accessing, from a database, information representing a plurality of utterances for at least one speech-enabled application, the plurality of utterances comprising at least a first type of utterance and a second type of utterance;
  
  accessing, from the database, interpretive information representing an assigned interpretation for at least a portion of the plurality of utterances;
  
  determining, by a training tool subsystem, an appropriate interpretation for the portion of the plurality of utterances;
  
  comparing, by the training tool subsystem, the assigned interpretation for the portion of the plurality of utterances to the appropriate interpretation for the portion of the plurality of utterances;
  
  determining, by the training tool subsystem, a frequency value for the second type of utterance that represents the percentage of occurrence of the second type of utterance in the plurality of utterances;
  
  determining, by the training tool subsystem, that the speech-enabled application more accurately responds to the first type of utterance; and
  
  electing, by the training tool subsystem, to apply a targeted tuning to the speech-enabled application to improve recognition of the second type of utterance when the frequency value for the second type of utterance is greater than a frequency threshold value.
  - 2. The method of claim 1, further comprising tuning the speech-enabled application to improve recognition of the second type of utterance by feeding a collection of the second type of utterances into a learning module of the speech-enabled application.
  - 3. The method of claim 2, wherein the tuning step comprises avoiding a feeding of the first type of utterances into the learning module.
  - 4. The method of claim 2, wherein feeding the collection of the second type of utterances into the learning module comprises:
    - playing a file representing a second type of utterance recording; and
      
      inputting the appropriate interpretation for the recording.
  - 5. The method of claim 1, further comprising improving recognition of the second type of utterance without degrading recognition of the first type of utterance.
  - 6. The method of claim 1, wherein the speech-enabled application executes at an automated call router.
  - 7. The method of claim 1, wherein the speech-enabled application executes at a voice activated services platform.
  - 8. The method of claim 1, wherein the speech-enabled application executes in connection with a call center.
  - 9. The method of claim 1, wherein the assigned interpretation comprises an action to be performed.
  - 10. The method of claim 9, wherein the action-object to be performed is selected from a group consisting of a pay bill action, a transfer to agent action, an inquire about balance action, a change service action, an acquire service action, a cancel service action, an inquire about a bill action, an inquire about an account action, a schedule payment action, a reconnect service action, and another business-related combination of an action and an object to be acted upon in accordance with the action.
  - 11. The method of claim 1, wherein the plurality of utterances comprises an accumulation of utterances received via a deployed speech-enabled application, further wherein the portion of the plurality of utterances comprises all of the accumulation of utterances.
  - 12. The method of claim 1, further comprising storing information representing the plurality of utterances as discrete audio files.
  - 13. The method of claim 1, wherein the at least one speech-enabled application comprises an application deployed in an operational environment.
  - 14. The method of claim 1, wherein determining that the speech-enabled application more accurately responds to the first type of utterance comprises:
    - calculating a system hit rate for the first type of utterance, wherein the system hit rate for the first type of utterance reflects how often the at least one speech-enabled application applied a first type interpretation to a received first type utterance; and
      
      calculating a system hit rate for the second type of utterance.
  - 15. The method of claim 1, further comprising calculating a system error rate for the first type of utterance, wherein the system error rate for the first type of utterance reflects how often the at least one speech-enabled application misapplies a first type interpretation to a received utterance of a type other than the first type of utterance.
  - 16. The method of claim 1, further comprising setting an utterance type-specific hit rate design threshold for each of a collection of expected utterance types, wherein the targeted tuning comprises exclusively tuning the speech-enabled application to utterance types having an actual utterance type specific hit rate that fails to reach a respective utterance type-specific hit rate design threshold.
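
The quantities recited in claims 1, 14, and 15 can be illustrated with a minimal sketch. It is not the patented implementation: the `Utterance` record, the function names, the sample data, and the choice of denominators (utterances of the type for the hit rate, utterances of all other types for the error rate) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    assigned: str     # interpretation the deployed application applied
    appropriate: str  # interpretation judged correct on review (assumed label)


def hit_rate(utts, utype):
    """Share of utterances truly of `utype` that were assigned `utype`."""
    of_type = [u for u in utts if u.appropriate == utype]
    if not of_type:
        return None
    return sum(u.assigned == utype for u in of_type) / len(of_type)


def error_rate(utts, utype):
    """Share of utterances of other types that were wrongly assigned `utype`."""
    others = [u for u in utts if u.appropriate != utype]
    if not others:
        return None
    return sum(u.assigned == utype for u in others) / len(others)


def frequency(utts, utype):
    """Fraction of all utterances whose appropriate type is `utype`."""
    return sum(u.appropriate == utype for u in utts) / len(utts)


def elect_targeted_tuning(utts, first, second, freq_threshold):
    """Elect tuning for `second` when `first` is handled more accurately
    and `second` occurs more often than the frequency threshold."""
    hr_first, hr_second = hit_rate(utts, first), hit_rate(utts, second)
    if hr_first is None or hr_second is None:
        return False
    return hr_first > hr_second and frequency(utts, second) > freq_threshold


# Hypothetical sample: 5 pay_bill, 4 cancel_service (2 misrouted), 1 other.
SAMPLE = (
    [Utterance("pay_bill", "pay_bill")] * 5
    + [Utterance("cancel_service", "cancel_service")] * 2
    + [Utterance("pay_bill", "cancel_service")] * 2
    + [Utterance("other", "other")]
)

print(hit_rate(SAMPLE, "pay_bill"), hit_rate(SAMPLE, "cancel_service"))
print(elect_targeted_tuning(SAMPLE, "pay_bill", "cancel_service", 0.2))
```

With this sample, "cancel_service" has the lower hit rate and makes up 40% of the utterances, so a frequency threshold of 0.2 elects it for tuning while a threshold of 0.5 does not.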

17. A speech tuning system, comprising:
- a repository comprising a memory to store a sample of captured utterances from an implemented speech-enabled application and an assigned utterance type for each of the captured utterances;
  
  an accuracy engine communicatively coupled to the repository and operable to determine if an assigned utterance type for a given captured utterance represents an accurate interpretation of the given captured utterance;
  
  a targeting engine communicatively coupled to the accuracy engine and operable to determine a first accuracy level of the speech-enabled application in identifying a first type of utterance and a second accuracy level of the speech-enabled application in identifying a second type of utterance; and
  
  a tuning engine operable to feed the speech-enabled application with a collection of utterances having the first type when the first accuracy level is lower than the second accuracy level and when a frequency of occurrence of the first type of utterance in the sample of captured utterances is greater than a frequency threshold value.
  - 18. The system of claim 17, wherein the sample of captured utterances comprises the collection of utterances.
  - 19. The system of claim 17, further comprising a call center that comprises the implemented speech-enabled application.
  - 20. The system of claim 17, further comprising a computer readable medium, wherein a set of instructions embodying the accuracy engine and the tuning engine are stored on the computer readable medium.
  - 21. The system of claim 17, further comprising an automated call router that comprises the implemented speech-enabled application.
  - 22. The system of claim 17, further comprising a voice activated services platform that comprises the implemented speech-enabled application.
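
One way to picture how the components of claim 17 fit together is the sketch below. The class names mirror the claim language, but every attribute, method, and data value is hypothetical, and the reference labels used by the accuracy engine are assumed to come from some review step that the claim does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Captured:
    audio_id: str       # hypothetical handle to a stored recording
    assigned_type: str  # utterance type assigned by the deployed application


@dataclass
class Repository:
    sample: List[Captured] = field(default_factory=list)


class AccuracyEngine:
    def __init__(self, reference: Dict[str, str]):
        self.reference = reference  # audio_id -> reviewed type (assumed labels)

    def true_type(self, c: Captured) -> str:
        return self.reference[c.audio_id]

    def is_accurate(self, c: Captured) -> bool:
        return c.assigned_type == self.true_type(c)


class TargetingEngine:
    def __init__(self, repository: Repository, accuracy_engine: AccuracyEngine):
        self.repository = repository
        self.accuracy_engine = accuracy_engine

    def of_type(self, utype: str) -> List[Captured]:
        return [c for c in self.repository.sample
                if self.accuracy_engine.true_type(c) == utype]

    def accuracy_level(self, utype: str):
        items = self.of_type(utype)
        if not items:
            return None
        return sum(self.accuracy_engine.is_accurate(c) for c in items) / len(items)

    def frequency(self, utype: str) -> float:
        return len(self.of_type(utype)) / len(self.repository.sample)


class TuningEngine:
    def __init__(self, targeting: TargetingEngine,
                 feed: Callable[[List[Captured]], None], freq_threshold: float):
        self.targeting = targeting
        self.feed = feed  # stand-in for feeding the speech-enabled application
        self.freq_threshold = freq_threshold

    def maybe_tune(self, first: str, second: str) -> bool:
        """Feed `first`-type utterances when `first` is less accurate than
        `second` and frequent enough in the sample."""
        a1 = self.targeting.accuracy_level(first)
        a2 = self.targeting.accuracy_level(second)
        if a1 is None or a2 is None:
            return False
        if a1 < a2 and self.targeting.frequency(first) > self.freq_threshold:
            self.feed(self.targeting.of_type(first))
            return True
        return False


# Hypothetical data: reviewed types and the types the application assigned.
REFERENCE = {"a1": "pay_bill", "a2": "pay_bill", "a3": "pay_bill",
             "a4": "cancel_service", "a5": "cancel_service",
             "a6": "cancel_service", "a7": "cancel_service", "a8": "other"}
ASSIGNED = {"a1": "pay_bill", "a2": "pay_bill", "a3": "pay_bill",
            "a4": "cancel_service", "a5": "pay_bill",
            "a6": "other", "a7": "cancel_service", "a8": "other"}

repo = Repository([Captured(k, v) for k, v in ASSIGNED.items()])
targeting = TargetingEngine(repo, AccuracyEngine(REFERENCE))
fed: List[Captured] = []
tuner = TuningEngine(targeting, fed.extend, freq_threshold=0.25)
tuned = tuner.maybe_tune("cancel_service", "pay_bill")
```

Passing the feed step in as a callable keeps the sketch independent of any particular recognizer; here it simply collects the utterances that would be fed.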

23. A method of tuning a speech-enabled application comprising:
- deploying a speech-recognition module to apply an appropriate interpretation to a plurality of utterance types;
  
  accessing, from a database, information representing a collection of recorded utterances and assigned interpretation for each of the plurality of recorded utterances;
  
  comparing, by an accuracy engine, the assigned interpretation for each of the plurality of recorded utterances to an accurate interpretation for each of the plurality of utterances;
  
  determining, by the accuracy engine, a separate accuracy value for each of the plurality of utterance types; and
  
  feeding the speech-recognition module with a selection of utterances having a given type when the separate accuracy value for the given type is lower than an accuracy threshold value and when a frequency of occurrence of the given type of utterance in the plurality of recorded utterances is greater than a frequency threshold value.
  - 24. The method of claim 23, further comprising recording the collection of recorded utterances as discrete audio files.
  - 25. The method of claim 23, further comprising ensuring that the selection of utterances does not include a different utterance type if the separate accuracy value for the different utterance type is at or above the accuracy threshold value.
  - 26. The method of claim 25, further comprising determining a new accuracy value for the given type of utterance.
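
The selection rule of claims 23 and 25 applies two thresholds at once: a type is fed back only if its accuracy is below the accuracy threshold and its frequency is above the frequency threshold. The sketch below shows one possible reading. The `(assigned, accurate)` tuple layout, the function names, and the sample values are assumptions, and the re-measurement in claim 26 is represented only by calling the same accuracy function on a later batch.

```python
from collections import Counter, defaultdict


def accuracy_by_type(records):
    """records: list of (assigned, accurate) pairs. Returns {type: accuracy}."""
    totals, correct = defaultdict(int), defaultdict(int)
    for assigned, accurate in records:
        totals[accurate] += 1
        correct[accurate] += int(assigned == accurate)
    return {t: correct[t] / totals[t] for t in totals}


def frequency_by_type(records):
    """Fraction of the recorded utterances that belong to each accurate type."""
    counts = Counter(accurate for _, accurate in records)
    n = sum(counts.values())
    return {t: c / n for t, c in counts.items()}


def select_for_tuning(records, accuracy_threshold, freq_threshold):
    """Keep only utterances of types that are both under-performing and
    frequent; types at or above the accuracy threshold are excluded."""
    acc = accuracy_by_type(records)
    freq = frequency_by_type(records)
    targets = {t for t in acc
               if acc[t] < accuracy_threshold and freq[t] > freq_threshold}
    return [r for r in records if r[1] in targets]


# Hypothetical batch of ten recorded utterances.
BEFORE = (
    [("pay_bill", "pay_bill")] * 6
    + [("pay_bill", "cancel_service")] * 2
    + [("cancel_service", "cancel_service")]
    + [("pay_bill", "reconnect_service")]
)

selected = select_for_tuning(BEFORE, accuracy_threshold=0.8, freq_threshold=0.2)
print(selected)
```

In this batch "reconnect_service" is always misrecognized but accounts for only one utterance in ten, so the frequency threshold keeps it out of the selection, while "cancel_service" is both inaccurate and frequent enough to be selected.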

Current Assignee: Interactions, LLC
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Knott, Benjamin Anthony; Bushey, Robert R.; Martin, John Mills
Primary Examiner(s): Hudspeth, David R
Assistant Examiner(s): Albertalli, Brian Louis
Application Number: US10/917,233
Publication Number: US 20060036437A1
Time in Patent Office: 1,839 Days
Field of Search: 704/231
US Class Current: 704/244
CPC Class Codes:
G10L 15/063   Training
G10L 15/065   Adaptation
G10L 15/19   Grammatical context, e.g. d...
G10L 15/22   Procedures used during a sp...
