METHOD AND SYSTEM FOR PROCESSING MULTIPLE SPEECH RECOGNITION RESULTS FROM A SINGLE UTTERANCE
First Claim
1. A system comprising:
a directed-dialog-processor server having a directed-dialog-processor application executing thereon;
a speech-recognition-engine server having a plurality of parallel-operable speech-recognition-engine applications executing thereon;
a context database;
a multiple-recognition-processor server in data communication with the directed-dialog-processor server, the speech-recognition-engine server, and the context database and having a multiple-recognition-processor application executing thereon; and
wherein the multiple-recognition-processor server is operable, via the multiple-recognition-processor application, to:
receive context information and a forwarded caller response from the directed-dialog-processor application;
select a set of parallel-operable speech-recognition-engine applications from the plurality of parallel-operable speech-recognition-engine applications;
combine the context information with additional context information from the context database to form modified context information;
forward to each speech-recognition-engine application in the selected set the modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response; and
receive from each speech-recognition-engine application in the selected set an n-best list comprising at least one confidence-score value and at least one word-score value.
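The five operations recited above (receive, select, combine, forward, receive n-best lists) can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Hypothesis`, `recognize`, `selector`, and the `dialog_state` lookup key are all hypothetical, each engine is modeled as a plain callable, and a thread pool stands in for the parallel-operable engine applications. The sketch also assumes that context supplied by the directed-dialog application takes precedence over database entries when keys collide, which the claim does not specify.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hypothesis:
    """One entry of an n-best list (hypothetical structure)."""
    text: str
    confidence: float          # confidence-score value for the whole phrase
    word_scores: List[float]   # one word-score value per word


# Assumption: an engine is any callable taking (context, audio) and
# returning an n-best list of hypotheses.
Engine = Callable[[dict, bytes], List[Hypothesis]]


def recognize(
    context: dict,
    caller_response: bytes,
    engines: Dict[str, Engine],
    context_db: Dict[str, dict],
    selector: Callable[[str, dict], bool],
) -> Dict[str, List[Hypothesis]]:
    # Select a set of engines from the available plurality.
    selected = {name: eng for name, eng in engines.items() if selector(name, context)}

    # Combine the received context with additional context from the database.
    # Assumption: entries are keyed by a "dialog_state" field, and the
    # dialog-supplied values win on key collisions.
    extra = context_db.get(context.get("dialog_state"), {})
    modified = {**extra, **context}

    # Forward the modified context and caller response to every selected
    # engine in parallel, then collect one n-best list per engine.
    with ThreadPoolExecutor(max_workers=max(1, len(selected))) as pool:
        futures = {
            name: pool.submit(eng, modified, caller_response)
            for name, eng in selected.items()
        }
        return {name: fut.result() for name, fut in futures.items()}
```

A caller would pass a dictionary of engine callables and a predicate such as `lambda name, ctx: name in ctx.get("allowed", [])`. The result maps each selected engine name to its n-best list, leaving any score adjustment to a later stage.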
Abstract
A method of and system for accurately determining a caller response by processing speech-recognition results and returning that result to a directed-dialog application for further interaction with the caller. Multiple speech-recognition engines are provided that process the caller response in parallel. Returned speech-recognition results comprising confidence-score values and word-score values from each of the speech-recognition engines may be modified based on context information provided by the directed-dialog application and grammars associated with each speech-recognition engine. An optional context database may be used to further reduce or add weight to confidence-score values and word-score values, remove phrases and/or words, and add phrases and/or words to the speech-recognition engine results. In situations where a predefined threshold-confidence-score value is not exceeded, a new dynamic grammar may be created. A set of n-best hypotheses of what the caller uttered is returned to the directed-dialog application.
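The score adjustment described in the abstract can be illustrated with a short sketch. It assumes a deliberately simple scheme that the abstract does not prescribe: each hypothesis is a `(text, confidence)` pair, context-database weights are multiplicative factors (`boosts`), removed phrases are a set (`blocked`), and duplicate phrases from different engines keep their highest adjusted score. The function name `rescore` and all parameters are hypothetical, and word-score values are omitted for brevity.

```python
from typing import Dict, List, Set, Tuple

NBest = List[Tuple[str, float]]  # (phrase, confidence-score value)


def rescore(
    nbest_by_engine: Dict[str, NBest],
    boosts: Dict[str, float],
    blocked: Set[str],
    threshold: float,
    n: int = 3,
) -> Tuple[NBest, bool]:
    merged: Dict[str, float] = {}
    for hypotheses in nbest_by_engine.values():
        for text, confidence in hypotheses:
            if text in blocked:
                continue  # phrase removed based on context
            # Reduce or add weight, then clamp to the range [0, 1].
            adjusted = min(1.0, max(0.0, confidence * boosts.get(text, 1.0)))
            # Keep the best adjusted score seen for this phrase.
            merged[text] = max(merged.get(text, 0.0), adjusted)

    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)[:n]

    # If the threshold is not exceeded, signal that a new dynamic
    # grammar may be worth creating.
    needs_new_grammar = not ranked or ranked[0][1] <= threshold
    return ranked, needs_new_grammar
```

The returned list is the n-best set handed back to the directed-dialog application. The boolean lets the caller decide whether to build a dynamic grammar and retry.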
78 Citations
27 Claims
1. A system comprising:
a directed-dialog-processor server having a directed-dialog-processor application executing thereon;
a speech-recognition-engine server having a plurality of parallel-operable speech-recognition-engine applications executing thereon;
a context database;
a multiple-recognition-processor server in data communication with the directed-dialog-processor server, the speech-recognition-engine server, and the context database and having a multiple-recognition-processor application executing thereon; and
wherein the multiple-recognition-processor server is operable, via the multiple-recognition-processor application, to:
receive context information and a forwarded caller response from the directed-dialog-processor application;
select a set of parallel-operable speech-recognition-engine applications from the plurality of parallel-operable speech-recognition-engine applications;
combine the context information with additional context information from the context database to form modified context information;
forward to each speech-recognition-engine application in the selected set the modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response; and
receive from each speech-recognition-engine application in the selected set an n-best list comprising at least one confidence-score value and at least one word-score value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method comprising:
(a) providing a processor;
(b) providing a memory interoperably coupled to the processor and having computer-readable processor instructions stored thereon;
(c) using the processor and the memory in combination with the computer-readable processor instructions to perform at least one of steps (d)-(h);
(d) receiving context information and a forwarded caller response from a directed-dialog-processor application executing on a directed-dialog-processor server;
(e) selecting a set of parallel-operable speech-recognition-engine applications from a plurality of parallel-operable speech-recognition-engine applications executing on a speech-recognition-engine server;
(f) combining the context information received in step (d) and additional context information present in a context database, thereby forming modified context information;
(g) forwarding modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response to each speech-recognition-engine application selected in step (e);
(h) receiving from each speech-recognition-engine application of the set of speech-recognition-engine applications an n-best list comprising at least one confidence-score value and at least one word-score value.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A computer-program product comprising a non-transitory computer-usable medium having computer-readable processor instructions embodied therein, the non-transitory computer-readable processor instructions adapted to be executed to implement a method comprising:
(a) providing a processor;
(b) providing a memory interoperably coupled to the processor and having computer-readable processor instructions stored thereon;
(c) using the processor and the memory in combination to perform at least one of steps (d)-(h);
(d) receiving context information and a forwarded caller response from a directed-dialog-processor application executing on a directed-dialog-processor server;
(e) selecting a set of parallel-operable speech-recognition-engine applications from a plurality of parallel-operable speech-recognition-engine applications executing on a speech-recognition-engine server;
(f) combining the context information received in step (d) and additional context information present in a context database, thereby forming modified context information;
(g) forwarding modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response to each speech-recognition-engine application selected in step (e);
(h) receiving from each speech-recognition-engine application of the set of speech-recognition-engine applications an n-best list comprising at least one confidence-score value and at least one word-score value.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification