Turn-taking confidence
4 Assignments
0 Petitions
Abstract
A method for managing interactive dialog between a machine and a user. In one embodiment, an interaction between the machine and the user is managed by determining at least one likelihood value which is dependent upon a possible speech onset of the user. In another embodiment, the likelihood value can be dependent on a model of a desire of the user for specific items, a model of an attention of the user to specific items, or a model of turn-taking cues. The values can be used to determine a mode confidence value that is used by the system to determine the nature of prompts provided to the user.
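The abstract's core idea can be sketched in code: several likelihood values are combined into a single mode confidence value that decides how the next prompt is phrased. The following is a minimal, hypothetical sketch; the function names, weighting rule, threshold, and prompt wording are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch: combine per-cue likelihood values (speech-onset
# timing, desire/attention models, turn-taking cues) into one mode
# confidence score, then pick a prompt style from it.

def mode_confidence(likelihoods, weights=None):
    """Weighted average of likelihood values, each assumed in [0, 1]."""
    if weights is None:
        weights = [1.0] * len(likelihoods)
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, likelihoods)) / total

def choose_prompt(confidence, threshold=0.5):
    """High confidence -> open speech prompt; low -> directed DTMF prompt."""
    if confidence >= threshold:
        return "Please say the name of the department you want."
    return "Press 1 for sales, 2 for support."
```

With a high combined confidence the system keeps asking open speech questions; as the cues degrade, the same information is solicited through keypad input instead.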
59 Citations
12 Claims
1. A method for managing interactive voice response dialog between a machine comprising automatic speech recognition and a user, said method comprising the steps of:
setting a mode confidence level parameter to a first value prior to a first input from said user, wherein said first input is of a speech input mode;
selecting one of a plurality of audio prompts comprising speech to annunciate to the user from the machine based on said first value of said mode confidence level parameter, wherein said one of said plurality of audio prompts solicits said first input comprising a first semantic response from said user in said speech input mode;
annunciating said one of said plurality of audio prompts to said user;
receiving said first input from said user;
determining a first speech recognition confidence level based on said first input;
setting said mode confidence level parameter to a second value based on said first speech recognition confidence level, said second value indicating a lower level of confidence of recognition relative to said first value of said mode confidence level parameter;
selecting another one of said plurality of audio prompts based on said mode confidence level parameter; and
annunciating to the user from the machine said another one of said plurality of audio prompts comprising speech based on said second value of said mode confidence level parameter, wherein said another one of said plurality of audio prompts solicits said first semantic response from said user in a DTMF input mode.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
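The flow of claim 1 can be illustrated with a short sketch: the dialog starts optimistic about speech input, and a low recognition confidence on the first utterance lowers the mode confidence so the same question is re-asked as a DTMF prompt. All names, the threshold, and the `min` update rule below are assumptions for illustration only.

```python
# Illustrative sketch of the claim-1 fallback: speech first, then DTMF
# for the same semantic response when recognition confidence is low.

SPEECH_PROMPT = "Say the extension you want."        # solicits speech input
DTMF_PROMPT = "Enter the extension on your keypad."  # same question, DTMF

def next_prompt(mode_conf, asr_conf, low_threshold=0.4):
    """Return (new mode confidence, next prompt) after one user turn."""
    # A poor recognition result drags the mode confidence below its
    # initial value (assumed update rule).
    new_conf = min(mode_conf, asr_conf)
    if new_conf < low_threshold:
        return new_conf, DTMF_PROMPT
    return new_conf, SPEECH_PROMPT
```

A confidently recognized utterance leaves the dialog in speech mode; a poorly recognized one switches the very next prompt to keypad entry.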
9. A method for managing interactive voice response dialog between a machine comprising automatic speech recognition and a user, said method comprising the steps of:
setting a mode confidence level parameter at a first value prior to a first input from said user, wherein said first input is of a DTMF input mode;
selecting one of a plurality of audio prompts comprising speech to annunciate to the user from the machine based on said first value of said mode confidence level parameter, wherein said one of said plurality of audio prompts solicits said first input comprising a first semantic response from said user in said DTMF input mode;
annunciating said one of said plurality of audio prompts to said user;
receiving said first input from said user;
determining a first speech recognition confidence level based on said first input;
setting said mode confidence level parameter to a second value based on said first speech recognition confidence level, said second value indicating a higher level of confidence of recognition relative to said first value of said mode confidence level parameter;
selecting another one of said plurality of audio prompts based on said mode confidence level parameter; and
annunciating to the user from the machine said another one of said plurality of audio prompts comprising speech based on said second value of said mode confidence level parameter, wherein said another one of said plurality of audio prompts solicits a different semantic response from said user in a speech input mode.
- View Dependent Claims (10)
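Claim 9 runs in the opposite direction: the dialog begins in DTMF mode, and a well-recognized keypad entry raises the mode confidence so the next question (for a different semantic item) can be asked as an open speech prompt. The sketch below uses assumed names, an assumed `max` update rule, and an assumed threshold.

```python
# Illustrative sketch of the claim-9 escalation: a clean DTMF turn
# raises mode confidence enough to switch the next prompt to speech.

def escalate_mode(mode_conf, input_conf, high_threshold=0.7):
    """Return (new mode confidence, input mode for the next prompt)."""
    # A reliably recognized turn raises the mode confidence above its
    # initial value (assumed update rule).
    new_conf = max(mode_conf, input_conf)
    next_mode = "speech" if new_conf >= high_threshold else "dtmf"
    return new_conf, next_mode
```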
11. A method for managing interactive voice response dialog between a machine comprising automatic speech recognition and a user, said method comprising the steps of:
setting a mode confidence level parameter at a first value prior to a first input from said user, wherein said first input is of a speech input mode;
selecting one of a plurality of audio prompts comprising a plurality of speech segments to annunciate to the user from the machine, wherein said one of said plurality of audio prompts solicits said first input comprising a first semantic response from said user in said speech input mode;
annunciating at least one of the plurality of speech segments to said user;
receiving said first input from said user, wherein said first input comprises speech input;
determining a first speech recognition confidence value based on said first input;
determining an onset time of said speech relative to the at least one of the plurality of speech segments for determining a turn confidence value;
determining a speech duration of said first input, said speech duration used for determining a speech duration confidence value;
using said first speech recognition confidence value, said turn confidence value, and said speech duration confidence value for setting said mode confidence level parameter at a second value;
selecting another one of said plurality of audio prompts based on said mode confidence level parameter; and
annunciating to the user from the machine said another one of said plurality of audio prompts, wherein said another one of said plurality of audio prompts solicits said first semantic response from said user in a DTMF input mode.
- View Dependent Claims (12)
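Claim 11 combines three signals: the recognizer's confidence, a turn confidence derived from when speech onset occurred relative to the prompt's speech segments, and a confidence derived from the utterance's duration. The sketch below shows one way those three values could set the mode confidence parameter; the thresholds, the penalty values, and the simple averaging rule are assumptions for illustration, not the patent's method.

```python
# Assumed three-signal combination for claim 11: ASR confidence,
# onset-timing (turn) confidence, and duration confidence averaged
# into a single mode confidence value.

def turn_confidence(onset_ms, prompt_end_ms):
    """Speech starting after the prompt ends is a cleaner turn-taking
    cue than a barge-in, so score it higher (assumed values)."""
    return 1.0 if onset_ms >= prompt_end_ms else 0.3

def duration_confidence(duration_ms, min_ms=200, max_ms=5000):
    """Very short or very long utterances are often noise or babble."""
    return 1.0 if min_ms <= duration_ms <= max_ms else 0.2

def set_mode_confidence(asr_conf, onset_ms, prompt_end_ms, duration_ms):
    """Average the three confidence values into the mode confidence."""
    return (asr_conf
            + turn_confidence(onset_ms, prompt_end_ms)
            + duration_confidence(duration_ms)) / 3.0
```

Under this sketch, an utterance that starts after the prompt finishes and has a plausible duration keeps the mode confidence high even when recognition confidence is middling; a barge-in of odd length pulls it down toward the DTMF fallback.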
Specification