Automated speech recognition proxy system for natural language understanding

US 9,741,347 B2
Filed: 12/03/2015
Issued: 08/22/2017
Est. Priority Date: 01/05/2011
Status: Active Grant

Abstract

An interactive response system mixes human speech recognition (HSR) subsystems with automated speech recognition (ASR) subsystems to facilitate overall capability of voice user interfaces. The system permits imperfect ASR subsystems to nonetheless relieve burden on HSR subsystems. An ASR proxy is used to implement an interactive voice response (IVR) system, and the proxy dynamically determines how many ASR and HSR subsystems are to perform recognition for any particular utterance, based on factors such as confidence thresholds of the ASRs and availability of human resources for HSRs. In some embodiments, the ASR proxy dynamically selects one or more recognizers based at least in part on the identified grammar and the time length of the utterance.
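
The selection step described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the grammar names, the 3-second cutoff, the `Utterance` type, and the `select_recognizers` function are all hypothetical, chosen only to show how an identified grammar, an utterance's time length, and human availability might jointly drive the choice of recognizers.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Utterance:
    grammar: str       # grammar the utterance is expected to conform to
    duration_s: float  # time length of the utterance, in seconds


# Assumption: grammars considered simple enough for ASR alone (hypothetical names).
ASR_FRIENDLY_GRAMMARS = {"yes_no", "digits"}
# Assumption: utterances longer than this are treated as harder for ASR.
MAX_ASR_ONLY_SECONDS = 3.0


def select_recognizers(utt: Utterance, human_agents_available: int) -> List[str]:
    """Pick recognizer types for one utterance (illustrative policy only)."""
    simple = (
        utt.grammar in ASR_FRIENDLY_GRAMMARS
        and utt.duration_s <= MAX_ASR_ONLY_SECONDS
    )
    if simple:
        return ["ASR"]
    # Harder utterance: add a human recognizer only if one is available.
    if human_agents_available > 0:
        return ["ASR", "HSR"]
    return ["ASR"]
```

Under these assumed rules, `select_recognizers(Utterance("yes_no", 1.2), 5)` selects only the ASR, while a long open-ended utterance is also routed to an HSR when at least one human agent is free.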

20 Claims

1. A computer-implemented system for processing an interaction, the interaction including an utterance requiring recognition before being usable for further computer-implemented processing, the system comprising:
- an application configured to provide the utterance, the utterance received from a device of a customer over a computer network;
  
  a recognition decision engine configured to:
  
  receive the utterance for recognition,
  
  identify a grammar to which the utterance is expected to conform,
  
  determine a time length of the utterance,
  
  dynamically select, based at least in part on the identified grammar and the time length of the utterance, one or more recognizers from:
  
  an automated speech recognizer, and
  
  a second type of recognizer, different from the automated speech recognizer, and communicating over a computer network with devices located at locations remote from the computer-implemented system; and
  
  a results decision engine coupled with the one or more recognizers and configured to provide a recognition result responsive to results of processing by the one or more recognizers.
  - 2. The system of claim 1, further comprising a system status subsystem operably connected to the recognition decision engine, the recognition decision engine taking as input system load information from the system status subsystem for use in the dynamically selecting.
  - 3. The system of claim 1, wherein a subset of the one or more recognizers is configured to provide a confidence metric to the recognition decision engine, the recognition decision engine using the confidence metric in the dynamically selecting.
  - 4. The system of claim 3, wherein the confidence metric includes a threshold, the threshold varying based on resource availability.
  - 5. The system of claim 1, wherein the recognition decision engine is configured to favor selection of the automated speech recognizers relative to the second type of recognizer subsystems based on recognition cost factors.
  - 6. The system of claim 1, wherein the recognition decision engine is configured to favor selection of the automated speech recognizers relative to the second type of recognizer based on human resource availability factors.
  - 7. The system of claim 1, wherein the results decision engine is configured to update confidence thresholds associated with a first one of the recognizers responsive to agreement of results between the first one of the recognizers and a second one of the recognizers.
  - 8. The system of claim 1, wherein the recognition decision engine is configured to initially choose a first one of the automated speech recognizers and, responsive to initial results provided by the first one of the automated speech recognizers, make a subsequent selection of a second one of the recognizers, the subsequent selection being made before processing of the utterance is completed by the first one of the automated speech recognizers.
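
Claims 4 and 7 refer to a confidence threshold that varies with resource availability and is updated when two recognizers agree. The claims do not state in which direction the threshold moves, so the sketch below makes two labeled assumptions: scarcer human resources lower the threshold (so ASR results are accepted more readily), and agreement between recognizers lowers it slightly while disagreement raises it. The function names, step size, and bounds are hypothetical.

```python
def effective_threshold(base: float, human_availability: float,
                        max_relief: float = 0.2) -> float:
    """Scale the ASR acceptance threshold by human availability.

    human_availability is a fraction in [0, 1]; 1.0 means fully available.
    Assumption: lower availability relaxes (lowers) the threshold.
    """
    availability = min(max(human_availability, 0.0), 1.0)
    return base - max_relief * (1.0 - availability)


def update_threshold(threshold: float, first_result: str, second_result: str,
                     step: float = 0.01, floor: float = 0.5,
                     ceiling: float = 0.99) -> float:
    """Adjust a recognizer's threshold after comparing two results.

    Assumption: agreement earns the first recognizer more trust (lower
    threshold); disagreement earns it less (higher threshold).
    """
    if first_result == second_result:
        return max(floor, threshold - step)
    return min(ceiling, threshold + step)
```

Clamping to a floor and ceiling is a design choice of this sketch, intended to keep repeated updates from driving the threshold to a value where every result, or no result, would be accepted.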

9. A computer-implemented method performed by a computer system for processing an interaction, the interaction including an utterance requiring recognition before being usable for further computer-implemented processing, the computer-implemented method comprising:
- receiving data representing an utterance from a computer application, the utterance received from a device of a customer over a computer network;
  
  identifying a grammar to which the utterance is expected to conform;
  
  determining a time length of the utterance;
  
  dynamically selecting, based at least in part on the identified grammar and the time length of the utterance, one or more recognizers from:
  
  an automated speech recognizer (ASR), and
  
  a second type of recognizer, different from the automated speech recognizer, and communicating over a computer network with devices located at locations remote from the computer system; and
  
  providing a recognition result responsive to results of processing by the one or more recognizers.
  - 10. The computer-implemented method of claim 9, wherein said dynamically selecting is responsive to a system load metric.
  - 11. The computer-implemented method of claim 9, wherein said dynamically selecting is responsive to a confidence metric.
  - 12. The computer-implemented method of claim 11, wherein the confidence metric includes a threshold, the threshold varying based on resource availability.
  - 13. The computer-implemented method of claim 9, wherein said dynamically selecting favors selection of the automated speech recognizer relative to the second type of recognizer based on recognition cost factors.
  - 14. The computer-implemented method of claim 9, wherein said dynamically selecting favors selection of the automated speech recognizer relative to the second type of recognizer based on human resource availability factors.
  - 15. The computer-implemented method of claim 9, further comprising updating confidence thresholds associated with a first one of the recognizers responsive to agreement of results between the first one of the recognizers and a second one of the recognizers.
  - 16. The computer-implemented method of claim 9, further comprising initially choosing a first one of the automated speech recognizers and, responsive to initial results provided by the first one of the automated speech recognizers, making a subsequent selection of a second one of the recognizers, the subsequent selection being made before processing of the utterance is completed by the first one of the automated speech recognizers.
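
The final step of the method, providing a recognition result responsive to the results of the selected recognizers, could be sketched as follows. This is one possible policy, not the patented one: it assumes an ASR result is accepted when its confidence meets a threshold and that any result from the second type of recognizer is used otherwise. `RecognizerResult` and `decide` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RecognizerResult:
    source: str                         # "ASR" or another recognizer type
    text: str                           # recognized content
    confidence: Optional[float] = None  # may be absent for non-ASR results


def decide(results: List[RecognizerResult],
           asr_threshold: float) -> Optional[str]:
    """Combine recognizer outputs into one result (illustrative policy)."""
    asr = [r for r in results
           if r.source == "ASR" and r.confidence is not None]
    others = [r for r in results if r.source != "ASR"]

    # Prefer the most confident ASR result if it clears the threshold.
    best_asr = max(asr, key=lambda r: r.confidence, default=None)
    if best_asr is not None and best_asr.confidence >= asr_threshold:
        return best_asr.text
    # Otherwise fall back to the second type of recognizer, if present.
    if others:
        return others[0].text
    return None  # no usable result; the caller may re-prompt or escalate
```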

17. A non-transitory computer-readable storage medium storing executable computer program code for processing an interaction, the interaction including an utterance requiring recognition before being usable for further computer-implemented processing, the computer program code comprising instructions for:
- receiving data representing an utterance from a device of a customer over a computer network;
  
  identifying a grammar to which the utterance is expected to conform;
  
  determining a time length of the utterance;
  
  dynamically selecting, based at least in part on the identified grammar and the time length of the utterance, one or more recognizers from a set of recognizers including:
  
  an automated speech recognizer (ASR), and
  
  a second type of recognizer, different from the automated speech recognizer, and communicating over a computer network with devices located at locations remote from the computer system; and
  
  providing a recognition result responsive to results of processing by the one or more recognizers.
  - 18. The non-transitory computer-readable storage medium of claim 17, wherein said dynamically selecting is responsive to a system load metric.
  - 19. The non-transitory computer-readable storage medium of claim 17, wherein said dynamically selecting is responsive to a confidence metric.
  - 20. The non-transitory computer-readable storage medium of claim 19, wherein the confidence metric includes a threshold, the threshold varying based on resource availability.

Current Assignee: Interactions, LLC
Original Assignee: Interactions, LLC
Inventors: Yeracaris, Yoryos; Carus, Alwin B; Lapshina, Larissa
Primary Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US14/958,833
Publication Number: US 20160086606A1
Time in Patent Office: 628 Days
Field of Search

704260
US Class Current
CPC Class Codes

G10L 15/197   Probabilistic grammars, e.g...

G10L 15/22   Procedures used during a sp...

G10L 15/285   Memory allocation or algori...

G10L 15/30   Distributed recognition, e....

G10L 15/32   Multiple recognisers used i...

G10L 17/10   Multimodal systems, i.e. ba...

G10L 25/39   using genetic algorithms
