Automated speech recognition proxy system for natural language understanding

US 9,245,525 B2
Filed: 07/08/2013
Issued: 01/26/2016
Est. Priority Date: 01/05/2011
Status: Active Grant

Abstract

An interactive response system mixes human speech recognition (HSR) subsystems with automated speech recognition (ASR) subsystems to facilitate overall capability of voice user interfaces. The system permits imperfect ASR subsystems to nonetheless relieve burden on HSR subsystems. An ASR proxy is used to implement an interactive voice response (IVR) system, and the proxy dynamically determines how many ASR and HSR subsystems are to perform recognition for any particular utterance, based on factors such as confidence thresholds of the ASRs and availability of human resources for HSRs.
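The routing decision described in the abstract can be illustrated with a minimal sketch. Everything named here (`RoutingInputs`, `plan_recognizers`, the two thresholds, and the specific counts returned) is a hypothetical illustration, not something the patent specifies; the sketch only assumes that a first ASR pass has reported a confidence score and that the number of idle human recognizers is known.

```python
from dataclasses import dataclass


@dataclass
class RoutingInputs:
    """Hypothetical inputs; none of these names come from the patent."""
    asr_confidence: float    # confidence reported by a first ASR pass, 0.0 to 1.0
    accept_threshold: float  # at or above this, the ASR result is used alone
    reject_threshold: float  # below this, the ASR result is not relied upon
    idle_human_agents: int   # human recognizers currently available


def plan_recognizers(inputs: RoutingInputs) -> dict:
    """Return how many ASR and HSR recognizers to consult for one utterance."""
    if inputs.asr_confidence >= inputs.accept_threshold:
        # ASR is trusted, so no human effort is spent on this utterance.
        return {"asr": 1, "hsr": 0}
    if inputs.idle_human_agents == 0:
        # No humans are free: assumed fallback is a second ASR pass.
        return {"asr": 2, "hsr": 0}
    if inputs.asr_confidence >= inputs.reject_threshold:
        # Middle band: keep the ASR result and have one human check it.
        return {"asr": 1, "hsr": 1}
    # Very low confidence: rely on up to two humans instead of ASR.
    return {"asr": 0, "hsr": min(2, inputs.idle_human_agents)}
```

With an accept threshold of 0.9 and a reject threshold of 0.5, a confidence of 0.7 yields one ASR and one HSR recognizer when humans are idle, and two ASR passes when none are.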

20 Claims

1. A computer-implemented system for processing an interaction, the interaction including an utterance requiring recognition before being usable for further computer-implemented processing, the system comprising:
- an application configured to provide the utterance, the utterance received from a device of a customer over a computer network;
- a recognition decision engine configured to receive the utterance for recognition, the recognition decision engine using parameters provided by the application to dynamically select one or more recognizers from:
  - automated speech recognition (ASR) subsystems, and
  - a second type of recognizer subsystems, different from the ASR subsystems, and communicating over a computer network with devices located at locations remote from the computer-implemented system; and
- a results decision engine coupled with the one or more recognizers and configured to provide a recognition result.
- Dependent claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The system of claim 1, further comprising a system status subsystem operably connected to the recognition decision engine, the recognition decision engine taking as input system load information from the system status subsystem for use in said dynamically selecting.
  - 3. The system of claim 1, wherein a subset of the one or more recognizers is configured to provide a confidence metric to the recognition decision engine, the recognition decision engine using the confidence metric in said dynamically selecting.
  - 4. The system of claim 3, wherein the confidence metric includes a threshold, the threshold varying based on resource availability.
  - 5. The system of claim 1, wherein the recognition decision engine is configured to favor selection of the automated speech recognition subsystems relative to the second type of recognizer subsystems based on recognition cost factors.
  - 6. The system of claim 1, wherein the recognition decision engine is configured to favor selection of the automated speech recognition subsystems relative to the second type of recognizer subsystems based on human resource availability factors.
  - 7. The system of claim 1, wherein the results decision engine is configured to update confidence thresholds associated with a first one of the recognizer subsystems responsive to agreement of results between the first one of the recognizer subsystems and a second one of the recognizer subsystems.
  - 8. The system of claim 1, wherein the recognition decision engine is configured to initially choose a first one of the automated speech recognition subsystems and, responsive to initial results provided by the first one of the automated speech recognition subsystems, make a subsequent selection of a second one of the recognizer subsystems, the subsequent selection being made before processing of the utterance is completed by the first one of the automated speech recognition subsystems.
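Claims 4 and 7 describe a confidence threshold that varies with resource availability and is updated when two recognizers agree. A minimal sketch of both ideas follows. The function names, the linear interpolation, the step size, and the direction of the adjustment (lowering the threshold on agreement, raising it on disagreement) are all assumptions made for illustration; the claims state only that the threshold varies and is updated responsive to agreement.

```python
def availability_adjusted_threshold(
    idle_agents: int,
    total_agents: int,
    low: float = 0.6,
    high: float = 0.9,
) -> float:
    """Interpolate the ASR acceptance threshold by human availability.

    Assumption: when many humans are idle the threshold is high (more
    utterances go to humans); when few are idle it drops toward `low`.
    """
    if total_agents <= 0:
        return low
    availability = min(max(idle_agents / total_agents, 0.0), 1.0)
    return low + (high - low) * availability


def update_threshold_on_agreement(
    threshold: float,
    first_result: str,
    second_result: str,
    step: float = 0.01,
    floor: float = 0.5,
    ceiling: float = 0.99,
) -> float:
    """Nudge the first recognizer's threshold after comparing two results.

    Assumption: agreement means the first recognizer can be trusted a
    little more, so its threshold is lowered; disagreement raises it.
    """
    agreed = first_result.strip().lower() == second_result.strip().lower()
    adjusted = threshold - step if agreed else threshold + step
    return min(max(adjusted, floor), ceiling)
```

Clamping to a floor and ceiling keeps a long run of agreements or disagreements from pushing the threshold to a value where one recognizer type is never selected.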

9. A computer-implemented method performed by a computer system for processing an interaction, the interaction including an utterance requiring recognition before being usable for further computer-implemented processing, the computer-implemented method comprising:
- receiving data representing an utterance from a computer application, the utterance received from a device of a customer over a computer network;
- dynamically selecting, using parameters provided by the application, one or more recognizers from:
  - an automated speech recognizer (ASR), and
  - a second type of recognizer, different from the automated speech recognizer, and communicating over a computer network with devices located at locations remote from the computer system; and
- providing a recognition result responsive to results of processing by the one or more recognizers.
- Dependent claims (10, 11, 12, 13, 14, 15, 16)
  - 10. The computer-implemented method of claim 9, wherein said dynamically selecting is responsive to a system load metric.
  - 11. The computer-implemented method of claim 9, wherein said dynamically selecting is responsive to a confidence metric.
  - 12. The computer-implemented method of claim 11, wherein the confidence metric includes a threshold, the threshold varying based on resource availability.
  - 13. The computer-implemented method of claim 9, wherein said dynamically selecting favors selection of the automated speech recognizer relative to the second type of recognizer based on recognition cost factors.
  - 14. The computer-implemented method of claim 9, wherein said dynamically selecting favors selection of the automated speech recognizer relative to the second type of recognizer based on human resource availability factors.
  - 15. The computer-implemented method of claim 9, further comprising updating confidence thresholds associated with a first one of the recognizers responsive to agreement of results between the first one of the recognizers and a second one of the recognizers.
  - 16. The computer-implemented method of claim 9, further comprising initially choosing a first one of the automated speech recognizers and, responsive to initial results provided by the first one of the automated speech recognizers, making a subsequent selection of a second one of the recognizers, the subsequent selection being made before processing of the utterance is completed by the first one of the automated speech recognizers.
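Claim 16 (like claim 8) describes selecting a second recognizer in response to initial results from a first ASR, before that ASR has finished processing the utterance. One way to picture this is a loop over partial results, sketched below. The recognizer labels, the `(text, confidence)` partial-result shape, and the `escalate_below` cutoff are hypothetical; the claim does not say what form the initial results take or what triggers the subsequent selection.

```python
from typing import Iterable, List, Optional, Tuple


def select_with_early_escalation(
    partial_results: Iterable[Tuple[str, float]],
    escalate_below: float = 0.4,
) -> Tuple[List[str], Optional[int]]:
    """Consume (partial_text, confidence) pairs from a first ASR pass.

    Returns the recognizers selected and the index of the partial result
    that triggered the second selection (None if it never happened).
    """
    selected = ["first_asr"]       # initial choice
    escalated_at: Optional[int] = None
    for index, (_text, confidence) in enumerate(partial_results):
        if escalated_at is None and confidence < escalate_below:
            # The second recognizer is chosen here, while the first ASR
            # is still producing partial results for the same utterance.
            selected.append("second_recognizer")
            escalated_at = index
    return selected, escalated_at
```

For example, the partial results `[("I want", 0.8), ("I want to pay", 0.3), ("I want to pay my bill", 0.7)]` trigger the second selection at index 1, one step before the first ASR delivers its final result.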

17. A non-transitory computer-readable storage medium storing executable computer program code for processing an interaction, the interaction including an utterance requiring recognition before being usable for further computer-implemented processing, the computer program code comprising instructions for:
- receiving data representing an utterance from a computer application, the utterance received from a device of a customer over a computer network;
- dynamically selecting, using parameters provided by the application, one or more recognizers from:
  - an automated speech recognizer (ASR), and
  - a second type of recognizer, different from the automated speech recognizer, and communicating over a computer network with devices located at locations remote from the computer system; and
- providing a recognition result responsive to results of processing by the one or more recognizers.
- Dependent claims (18, 19, 20)
  - 18. The non-transitory computer-readable storage medium of claim 17, wherein said dynamically selecting is responsive to a system load metric.
  - 19. The non-transitory computer-readable storage medium of claim 17, wherein said dynamically selecting is responsive to a confidence metric.
  - 20. The non-transitory computer-readable storage medium of claim 19, wherein the confidence metric includes a threshold, the threshold varying based on resource availability.

Current Assignee: Interactions, LLC
Original Assignee: Interactions, LLC
Inventors: Yeracaris, Yoryos; Carus, Alwin B; Lapshina, Larissa
Primary Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US13/936,440
Publication Number: US 20140288932A1
Time in Patent Office: 932 Days
Field of Search: 704/270.1
US Class Current: 1/1
CPC Class Codes:
- G10L 15/197   Probabilistic grammars, e.g...
- G10L 15/22   Procedures used during a sp...
- G10L 15/285   Memory allocation or algori...
- G10L 15/30   Distributed recognition, e....
- G10L 15/32   Multiple recognisers used i...
- G10L 17/10   Multimodal systems, i.e. ba...
- G10L 25/39   using genetic algorithms
