Method and system for processing parallel context dependent speech recognition results from a single utterance utilizing a context database
First Claim
1. A system comprising:
a directed-dialog-processor server having a directed-dialog-processor application executing thereon;
a speech-recognition-engine server having a plurality of parallel-operable speech-recognition-engine applications executing thereon;
wherein the plurality of parallel-operable speech-recognition-engine applications each provide a different speech-recognition capability;
a context database;
a multiple-recognition-processor server in data communication with the directed-dialog-processor server, the speech-recognition-engine server, and the context database and having a multiple-recognition-processor application executing thereon; and
wherein the multiple-recognition-processor server is operable, via the multiple-recognition-processor application, to:
receive context information and a forwarded caller response from the directed-dialog-processor application;
select, using the context information, a set of parallel-operable speech-recognition-engine applications from the plurality of parallel-operable speech-recognition-engine applications;
combine the context information with additional context information from the context database to form modified context information;
forward to each speech-recognition-engine application in the selected set the modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response;
receive from each speech-recognition-engine application in the selected set an n-best list comprising at least one confidence-score value and at least one word-score value;
wherein the at least one confidence-score value and the at least one word-score value in each n-best list are modified by a weight-multiplier value based on the context information provided by the directed-dialog-processor application, thereby creating a modified n-best list;
wherein each modified n-best list is combined into a single, sorted combined n-best list; and
wherein the at least one confidence-score value and the at least one word-score value of the sorted combined n-best list are modified by determining presence of phrases and words of the sorted combined n-best list in the context database.
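The last three limitations of the claim (weighting each n-best list, merging into one sorted list, and re-scoring by presence in the context database) can be illustrated with a minimal Python sketch. The claim does not say how the weight multiplier is chosen or how presence in the context database changes a score, so the `Hypothesis` type, the function names, the single `boost` factor, and the one-score-per-word layout are all assumptions made for illustration only.

```python
from dataclasses import dataclass, replace
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """One n-best entry (hypothetical layout: one word score per word)."""
    phrase: str
    confidence: float
    word_scores: Tuple[float, ...]


def apply_weight(nbest: List[Hypothesis], weight: float) -> List[Hypothesis]:
    """Scale confidence and word scores by a context-derived weight multiplier."""
    return [
        replace(h,
                confidence=h.confidence * weight,
                word_scores=tuple(s * weight for s in h.word_scores))
        for h in nbest
    ]


def combine_sorted(nbest_lists: List[List[Hypothesis]]) -> List[Hypothesis]:
    """Merge the modified n-best lists into one list, best confidence first."""
    merged = [h for nbest in nbest_lists for h in nbest]
    return sorted(merged, key=lambda h: h.confidence, reverse=True)


def rescore_with_context(combined: List[Hypothesis],
                         known_phrases: Set[str],
                         known_words: Set[str],
                         boost: float = 1.2) -> List[Hypothesis]:
    """Raise scores of phrases and words found in the context database.

    Assumption: presence is rewarded by a fixed multiplier; the source only
    says scores are modified by determining presence.
    """
    rescored = []
    for h in combined:
        conf = h.confidence * boost if h.phrase in known_phrases else h.confidence
        scores = tuple(
            s * boost if w in known_words else s
            for w, s in zip(h.phrase.split(), h.word_scores)
        )
        rescored.append(replace(h, confidence=conf, word_scores=scores))
    return sorted(rescored, key=lambda h: h.confidence, reverse=True)


# Two hypothetical engines returning one hypothesis each.
engine_a = [Hypothesis("pay my bill", 0.80, (0.9, 0.8, 0.7))]
engine_b = [Hypothesis("play my bill", 0.85, (0.6, 0.8, 0.7))]

combined = combine_sorted([apply_weight(engine_a, 1.0),
                           apply_weight(engine_b, 0.5)])
final = rescore_with_context(combined, {"pay my bill"}, {"pay", "bill"})
```

In this example the second engine reports the higher raw confidence, but its weight multiplier of 0.5 places its hypothesis below the first engine's in the combined list, and the context lookup then widens the gap.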
Abstract
A method of and system for accurately determining a caller response by processing speech-recognition results and returning that result to a directed-dialog application for further interaction with the caller. Multiple speech-recognition engines are provided that process the caller response in parallel. Returned speech-recognition results comprising confidence-score values and word-score values from each of the speech-recognition engines may be modified based on context information provided by the directed-dialog application and grammars associated with each speech-recognition engine. A context database is used to further reduce or add weight to confidence-score values and word-score values, remove phrases and/or words, and add phrases and/or words to the speech-recognition engine results. In situations where a predefined threshold-confidence-score value is not exceeded, a new dynamic grammar may be created. A set of n-best hypotheses of what the caller uttered is returned to the directed-dialog application.
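The context-database adjustments and the threshold check described in the abstract might look roughly like the following Python sketch. The abstract does not describe the database schema or how a dynamic grammar is built, so the per-phrase multiplier table, the blocked-phrase set, the list of added phrases, and the function names are hypothetical; the sketch only decides whether the threshold is exceeded and does not attempt to create a grammar.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical n-best entry: (phrase, confidence-score value).
NBest = List[Tuple[str, float]]


def adjust_with_context(nbest: NBest,
                        multipliers: Dict[str, float],
                        blocked: Set[str],
                        extra: NBest) -> NBest:
    """Reduce or add weight, remove phrases, and add phrases.

    multipliers: assumed per-phrase weights (>1 adds weight, <1 reduces it).
    blocked:     assumed set of phrases the context database rules out.
    extra:       assumed phrases the context database contributes.
    """
    adjusted = [(p, c * multipliers.get(p, 1.0))
                for p, c in nbest if p not in blocked]
    present = {p for p, _ in adjusted}
    adjusted += [(p, c) for p, c in extra if p not in present]
    return sorted(adjusted, key=lambda pc: pc[1], reverse=True)


def exceeds_threshold(nbest: NBest, threshold: float) -> bool:
    """True when the best hypothesis beats the predefined threshold."""
    return bool(nbest) and max(c for _, c in nbest) > threshold


raw = [("checking", 0.6), ("checkin", 0.55), ("savings", 0.3)]
result = adjust_with_context(
    raw,
    multipliers={"checking": 1.5, "savings": 0.5},
    blocked={"checkin"},
    extra=[("chequing", 0.4)],
)
# When this is False, the abstract says a new dynamic grammar may be created.
confident = exceeds_threshold(result, 0.8)
```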
18 Claims
1. A system comprising the elements recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6.
7. A method comprising:
(a) providing a processor;
(b) providing a memory interoperably coupled to the processor and having computer-readable processor instructions stored thereon;
(c) using the processor and the memory in combination with the computer-readable processor instructions to perform at least one of steps (d)-(i);
(d) receiving context information and a forwarded caller response from a directed-dialog-processor application executing on a directed-dialog-processor server;
(e) selecting, using the context information, a set of parallel-operable speech-recognition-engine applications from a plurality of parallel-operable speech-recognition-engine applications executing on a speech-recognition-engine server;
wherein the plurality of parallel-operable speech-recognition-engine applications each provide a different speech-recognition capability;
(f) combining the context information received in step (d) and additional context information present in a context database, thereby forming modified context information;
(g) forwarding modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response to each speech-recognition-engine application selected in step (e);
(h) receiving from each speech-recognition-engine application of the set of parallel-operable speech-recognition-engine applications an n-best list comprising at least one confidence-score value and at least one word-score value;
(i) responsive to step (h), modifying the at least one confidence-score value and the at least one word-score value in each n-best list by a weight-multiplier value based on the context information provided by the directed-dialog-processor application, thereby creating a modified n-best list;
wherein each modified n-best list is combined into a single sorted combined n-best list; and
wherein the at least one confidence-score value and the at least one word-score value of the sorted combined n-best list are modified by determining presence of phrases and words of the sorted combined n-best list in the context database.
Dependent claims: 8, 9, 10, 11, 12.
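Steps (d) through (h) of the method describe a select, enrich, and fan-out flow, which can be sketched in Python as follows. The claim does not say how context information identifies engines or how the context database is keyed, so the `capabilities` and `dialog_state` keys, the callable engine interface, and the function names are assumptions, and a thread pool stands in for whatever transport a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

NBest = List[Tuple[str, float]]          # (phrase, confidence-score value)
Engine = Callable[[bytes, dict], NBest]  # hypothetical engine interface


def select_engines(engines: Dict[str, Engine], context: dict) -> Dict[str, Engine]:
    """Step (e): pick engines whose capability the context asks for (assumed key)."""
    wanted = context.get("capabilities", [])
    return {name: engines[name] for name in wanted if name in engines}


def build_modified_context(context: dict, context_db: Dict[str, dict]) -> dict:
    """Step (f): merge dialog context with extra context from the database."""
    extra = context_db.get(context.get("dialog_state", ""), {})
    return {**context, **extra}


def recognize_in_parallel(selected: Dict[str, Engine],
                          caller_response: bytes,
                          modified_context: dict) -> Dict[str, NBest]:
    """Steps (g)-(h): forward the request to every selected engine at once."""
    if not selected:
        return {}
    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        futures = {name: pool.submit(engine, caller_response, modified_context)
                   for name, engine in selected.items()}
        return {name: future.result() for name, future in futures.items()}


# Stub engines standing in for real recognizers.
engines = {
    "digits": lambda audio, ctx: [("1 2 3", 0.9)],
    "names": lambda audio, ctx: [("john", 0.7)],
    "dates": lambda audio, ctx: [("may first", 0.5)],
}
context = {"capabilities": ["digits", "names"], "dialog_state": "account_number"}
context_db = {"account_number": {"expected_length": 3}}

modified = build_modified_context(context, context_db)
results = recognize_in_parallel(select_engines(engines, context), b"", modified)
```

Each entry in `results` is one engine's n-best list, ready for the weighting and merging described in step (i).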
13. A computer-program product comprising a non-transitory computer-usable medium having computer-readable processor instructions embodied therein, the computer-readable processor instructions adapted to be executed to implement a method comprising:
(a) providing a processor;
(b) providing a memory interoperably coupled to the processor and having computer-readable processor instructions stored thereon;
(c) using the processor and the memory in combination to perform at least one of steps (d)-(i);
(d) receiving context information and a forwarded caller response from a directed-dialog-processor application executing on a directed-dialog-processor server;
(e) selecting, using the context information, a set of parallel-operable speech-recognition-engine applications from a plurality of parallel-operable speech-recognition-engine applications executing on a speech-recognition-engine server;
wherein the plurality of parallel-operable speech-recognition-engine applications each provide a different speech-recognition capability;
(f) combining the context information received in step (d) and additional context information present in a context database, thereby forming modified context information;
(g) forwarding modified context information, the forwarded caller response, and a request to perform speech recognition of the forwarded caller response to each speech-recognition-engine application selected in step (e);
(h) receiving from each speech-recognition-engine application of the set of parallel-operable speech-recognition-engine applications an n-best list comprising at least one confidence-score value and at least one word-score value;
(i) responsive to step (h), modifying the at least one confidence-score value and the at least one word-score value in each n-best list by a weight-multiplier value based on the context information provided by the directed-dialog-processor application, thereby creating a modified n-best list;
wherein each modified n-best list is combined into a single sorted combined n-best list; and
wherein the at least one confidence-score value and the at least one word-score value of the sorted combined n-best list are modified by determining presence of phrases and words of the sorted combined n-best list in the context database.
Dependent claims: 14, 15, 16, 17, 18.
Specification