Intelligent user adaptation in dialog systems
Abstract
In a process for operating a speech dialog system that adapts itself to the speech quality of different speakers, the speech recognizer estimates the probability of a correct recognition of the user response or expression by consulting a confidence gage, by means of which the words or phrases potentially contained in the speech response or expression are assigned a confidence value. One particularly preferred solution of the inventive task is that, for speakers who are difficult for the speech dialog system to understand, the system accepts in certain cases repetitions of the same user response which, by themselves, would not be acceptable. A further advantageous solution is that the confidence threshold is selected depending upon the current dialog step. The speech dialog system thereby adapts itself to the system user depending upon the current dialog stage, so that responses which fit without problem into the current dialog flow are accepted more rapidly, even for speakers who are difficult to understand. As an alternative, at least in those cases in which a correct recognition has not been concluded, the response is stored at least temporarily in a storage medium. The system behavior thereby adapts itself dynamically to the system user by observing the user's speech comprehensibility, so that user responses are accepted which lie below the confidence threshold value otherwise to be observed.
10 Claims
-
1. A process for operating a speech dialog system, that adapts to the speech quality of different speakers,
in which the responses of a system user are supplied via a speech interface to a speech recognizer associated with the speech dialog system,
whereupon the speech recognizer estimates the likelihood of a correct recognition of the user response, in that, for estimation, it consults a confidence gage, via which the words or phrases potentially contained in the speech response are assigned a confidence value,
and in that a conclusion is reached as to the correctness of the recognition of those words or, as the case may be, those phrases, which are associated with the greatest confidence values, when these confidence values exceed a predetermined confidence threshold value,
and wherein a subsequent sequence of the speech dialog is adapted to the system user depending upon whether or not a conclusion had been reached that the recognition was correct,
wherein at least in the case, in which no conclusion had been made as to a correct recognition, the potentially recognized words or, as the case may be, phrases are stored temporarily in a storage medium,
wherein when the speech recognizer, during subsequent recognition processes, again does not come to a conclusion of a correct recognition, then at least the most recent words or, as the case may be, phrases stored in the storage medium are compared with the new words or phrases potentially recognized by the speech recognizer,
and wherein the speech recognizer then makes a conclusion as to the correct recognition of a word or, as the case may be, phrase, if in the framework of the comparison these words or, as the case may be, these phrases, are identified both in the stored words or, as the case may be, phrases, as well as in the new potentially recognized words or, as the case may be, phrases.
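For illustration only, the repetition rule of claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the class name `RepetitionAcceptor`, the default threshold of 0.7, and the representation of recognizer output as a mapping from candidate phrase to confidence value are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RepetitionAcceptor:
    """Accepts a phrase above the threshold, or one repeated across rejected turns."""

    threshold: float = 0.7  # hypothetical confidence threshold value
    stored: set[str] = field(default_factory=set)  # phrases of the last rejected turn

    def process(self, hypotheses: dict[str, float]) -> Optional[str]:
        """Return the accepted phrase, or None if no conclusion is reached.

        `hypotheses` maps each potentially recognized phrase to its confidence.
        """
        if not hypotheses:
            return None

        # Normal path: the best phrase exceeds the confidence threshold.
        best = max(hypotheses, key=hypotheses.get)
        if hypotheses[best] > self.threshold:
            self.stored.clear()
            return best

        # No conclusion: compare the new phrases with the stored ones.
        repeated = self.stored.intersection(hypotheses)
        if repeated:
            self.stored.clear()
            # If several phrases repeat, prefer the one with the highest confidence now.
            return max(repeated, key=hypotheses.get)

        # Still no conclusion: remember this turn's phrases for the next comparison.
        self.stored = set(hypotheses)
        return None
```

In this sketch a speaker whose responses never reach the threshold can still be understood by repeating the response, because a phrase that appears in two consecutive rejected turns is accepted.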
-
3. A process for operating a speech dialog system, that adapts to the speech quality of different speakers,
in which the responses of a system user are supplied via a speech interface to a speech recognizer associated with the speech dialog system,
whereupon the speech recognizer estimates the likelihood of a correct recognition of the user response, in that, for estimation, it consults a confidence gage, via which the words or phrases potentially contained in the speech response are assigned a confidence value,
and in that a conclusion is reached as to the correctness of the recognition of those words or, as the case may be, those phrases, which are associated with the greatest confidence values, when these confidence values exceed a predetermined confidence threshold value,
and wherein a subsequent sequence of the speech dialog is adapted to the system user depending upon whether or not a conclusion had been reached that the recognition was correct,
wherein the confidence threshold value is selected depending upon the actual current dialog step,
wherein then, if the user response lies upon the projected path through the dialog, the normal confidence threshold value is lowered, so that the speech recognizer makes a conclusion as to a recognized word or, as the case may be, phrase, if this obtains a lower confidence value than was conventionally previously necessary.
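The dialog-step-dependent threshold of claim 3 could be sketched as below. This is a hedged illustration rather than the claimed implementation: the function names, the two threshold values, and the idea of modeling the "projected path through the dialog" as a set of expected responses for the current step are assumptions made for the example.

```python
def effective_threshold(phrase: str, expected: set[str],
                        normal: float = 0.7, lowered: float = 0.5) -> float:
    """Return the threshold that applies to `phrase` at the current dialog step.

    `expected` holds the responses lying on the projected path through the
    dialog (hypothetical representation); these get the lowered threshold.
    """
    return lowered if phrase in expected else normal


def accept(phrase: str, confidence: float, expected: set[str],
           normal: float = 0.7, lowered: float = 0.5) -> bool:
    """Conclude a correct recognition if the confidence exceeds the applicable threshold."""
    return confidence > effective_threshold(phrase, expected, normal, lowered)
```

With these assumed values, a "yes" recognized at confidence 0.6 in a yes/no confirmation step would be accepted, while an unexpected "cancel" at the same confidence would not.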
-
4. A process for operating a speech dialog system, that adapts to the speech quality of different speakers,
in which the responses of a system user are supplied via a speech interface to a speech recognizer associated with the speech dialog system,
whereupon the speech recognizer estimates the likelihood of a correct recognition of the user response, in that, for estimation, it consults a confidence gage, via which the words or phrases potentially contained in the speech response are assigned a confidence value,
and in that a conclusion is reached as to the correctness of the recognition of those words or, as the case may be, those phrases, which are associated with the greatest confidence values, when these confidence values exceed a predetermined confidence threshold value,
and wherein a subsequent sequence of the speech dialog is adapted to the system user depending upon whether or not a conclusion had been reached that the recognition was correct,
wherein at least in those cases, in which a conclusion has not been made as to a correct recognition, the word or phrase is at least temporarily stored in a storage medium,
and wherein the confidence threshold is lowered, if the responses of the system user, for which a correct recognition has not been concluded or determined, exceed a predetermined proportion relative to the total number of responses,
or wherein the confidence threshold value is raised, if the responses of a system user, for which correct recognition has been concluded, always lie significantly above the confidence threshold value.
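The threshold adaptation of claim 4 might look roughly like the following sketch. All names and numeric values are hypothetical: the claim does not specify a step size, what counts as "significantly above", or how many responses must be observed, so `step`, `margin`, and `min_turns` are assumptions, as is clearing the history after each adjustment.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveThreshold:
    """Raises or lowers the confidence threshold based on a user's past responses."""

    threshold: float = 0.7         # current confidence threshold (assumed start value)
    step: float = 0.05             # assumed size of each adjustment
    max_reject_ratio: float = 0.5  # the "predetermined proportion" of rejected responses
    margin: float = 0.2            # assumed meaning of "significantly above"
    min_turns: int = 4             # assumed minimum history before adapting
    accepted: list[float] = field(default_factory=list)
    rejected: list[float] = field(default_factory=list)

    def observe(self, confidence: float) -> bool:
        """Record one response; return True if it counts as correctly recognized."""
        ok = confidence > self.threshold
        (self.accepted if ok else self.rejected).append(confidence)
        self._adapt()
        return ok

    def _adapt(self) -> None:
        total = len(self.accepted) + len(self.rejected)
        if total < self.min_turns:
            return
        if len(self.rejected) / total > self.max_reject_ratio:
            # Too many rejected responses: lower the threshold.
            self.threshold = max(0.0, self.threshold - self.step)
        elif self.accepted and min(self.accepted) > self.threshold + self.margin:
            # Every accepted response lies well above the threshold: raise it.
            self.threshold = min(1.0, self.threshold + self.step)
        else:
            return
        # Start a fresh observation window after each adjustment.
        self.accepted.clear()
        self.rejected.clear()
```

Clearing the history after an adjustment is a design choice of this sketch so that one window of responses triggers at most one change; a sliding window would be an equally plausible reading.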
Specification