System and method for machine-mediated human-human conversation
First Claim
1. A method comprising:
generating a conversation context model based on user utterances and facial recognition data, wherein the conversation context model comprises a model of a speech dialog occurring between a speech dialog system and a speaker;
continuously comparing the speech dialog to the conversation context model, to yield a context similarity score;
modifying the context similarity score based on a head orientation of the speaker, to yield a modified context similarity score;
when the modified context similarity score is above a threshold, incorporating a current user utterance into the conversation context model for use in the speech dialog; and
when the modified context similarity score is one of equaling the threshold and below the threshold, suppressing the current user utterance such that the current user utterance is not incorporated into the conversation context model and the speech dialog produces speech as though the current user utterance is not in the conversation context model.
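The two threshold branches of the claim can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `ConversationContext` class, the `modify_score` and `process_utterance` functions, the use of head yaw in degrees as the head-orientation signal, the linear falloff, and the threshold value of 0.5. The claim does not say how head orientation changes the score, only that it does.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationContext:
    """Hypothetical stand-in for the conversation context model."""
    utterances: list = field(default_factory=list)


def modify_score(score: float, head_yaw_degrees: float, max_yaw: float = 90.0) -> float:
    """Scale the similarity score down as the speaker turns away.

    Assumption: a yaw of 0 means the speaker faces the system, and the
    linear falloff is illustrative only; the claim does not specify it.
    """
    facing = max(0.0, 1.0 - abs(head_yaw_degrees) / max_yaw)
    return score * facing


def process_utterance(context: ConversationContext, utterance: str,
                      score: float, head_yaw_degrees: float,
                      threshold: float = 0.5) -> bool:
    """Incorporate the utterance only when the modified score is above the threshold.

    A score equal to or below the threshold suppresses the utterance, so the
    context (and any dialog driven by it) is left exactly as it was.
    """
    modified = modify_score(score, head_yaw_degrees)
    if modified > threshold:
        context.utterances.append(utterance)
        return True
    return False


if __name__ == "__main__":
    ctx = ConversationContext()
    # Speaker faces the system: the utterance is incorporated.
    print(process_utterance(ctx, "book a flight to Boston", 0.9, 0.0))
    # Same raw score, but the speaker has turned 60 degrees away: suppressed.
    print(process_utterance(ctx, "did you feed the dog", 0.9, 60.0))
    print(ctx.utterances)
```

Note that the comparison is strict: a modified score exactly equal to the threshold falls into the suppression branch, matching the claim's "one of equaling the threshold and below the threshold" wording.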
Abstract
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing speech. A system configured to practice the method monitors user utterances to generate a conversation context. Then the system receives a current user utterance independent of non-natural language input intended to trigger speech processing. The system compares the current user utterance to the conversation context to generate a context similarity score, and if the context similarity score is above a threshold, incorporates the current user utterance into the conversation context. If the context similarity score is below the threshold, the system discards the current user utterance. The system can compare the current user utterance to the conversation context based on an n-gram distribution, a perplexity score, and a perplexity threshold. Alternately, the system can use a task model to compare the current user utterance to the conversation context.
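One way to picture the n-gram and perplexity comparison mentioned in the abstract is the sketch below. It is an assumption-laden illustration, not the disclosed implementation: it uses a unigram model (the simplest n-gram case) with add-one smoothing, and the names `UnigramContext`, `perplexity`, and `is_in_context` as well as the threshold value are hypothetical. The abstract does not state how the n-gram distribution, perplexity score, and perplexity threshold are combined; here a lower perplexity is simply treated as higher similarity to the context.

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase whitespace tokenization; a deliberate simplification."""
    return text.lower().split()


class UnigramContext:
    """Hypothetical conversation context backed by a unigram distribution."""

    def __init__(self, utterances):
        self.counts = Counter(tok for u in utterances for tok in tokenize(u))
        self.total = sum(self.counts.values())
        # One extra vocabulary slot reserves probability mass for unseen words.
        self.vocab_size = len(self.counts) + 1

    def prob(self, token: str) -> float:
        """Add-one (Laplace) smoothed unigram probability."""
        return (self.counts[token] + 1) / (self.total + self.vocab_size)

    def perplexity(self, utterance: str) -> float:
        """Perplexity of the utterance under the context distribution."""
        tokens = tokenize(utterance)
        if not tokens:
            return float("inf")
        log_prob = sum(math.log(self.prob(t)) for t in tokens)
        return math.exp(-log_prob / len(tokens))


def is_in_context(model: UnigramContext, utterance: str,
                  perplexity_threshold: float) -> bool:
    """Treat an utterance as in-context when its perplexity is below the threshold."""
    return model.perplexity(utterance) < perplexity_threshold


if __name__ == "__main__":
    model = UnigramContext(["book a flight to boston",
                            "a flight to boston on monday"])
    for text in ("flight to boston", "feed the dog"):
        print(text, round(model.perplexity(text), 2),
              is_in_context(model, text, perplexity_threshold=10.0))
```

An utterance that reuses words already seen in the dialog scores a lower perplexity than an unrelated aside, so it would be kept while the aside would be discarded under this assumed threshold.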
20 Claims
1. A method comprising:
generating a conversation context model based on user utterances and facial recognition data, wherein the conversation context model comprises a model of a speech dialog occurring between a speech dialog system and a speaker;
continuously comparing the speech dialog to the conversation context model, to yield a context similarity score;
modifying the context similarity score based on a head orientation of the speaker, to yield a modified context similarity score;
when the modified context similarity score is above a threshold, incorporating a current user utterance into the conversation context model for use in the speech dialog; and
when the modified context similarity score is one of equaling the threshold and below the threshold, suppressing the current user utterance such that the current user utterance is not incorporated into the conversation context model and the speech dialog produces speech as though the current user utterance is not in the conversation context model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
generating a conversation context model based on user utterances and facial recognition data, wherein the conversation context model comprises a model of a speech dialog occurring between a speech dialog system and a speaker;
continuously comparing the speech dialog to the conversation context model, to yield a context similarity score;
modifying the context similarity score based on a head orientation of the speaker, to yield a modified context similarity score;
when the modified context similarity score is above a threshold, incorporating a current user utterance into the conversation context model for use in the speech dialog; and
when the modified context similarity score is one of equaling the threshold and below the threshold, suppressing the current user utterance such that the current user utterance is not incorporated into the conversation context model and the speech dialog produces speech as though the current user utterance is not in the conversation context model.
Dependent claims: 14, 15, 16, 17, 18, 19
20. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
generating a conversation context model based on user utterances and facial recognition data, wherein the conversation context model comprises a model of a speech dialog occurring between a speech dialog system and a speaker;
continuously comparing the speech dialog to the conversation context model, to yield a context similarity score;
modifying the context similarity score based on a head orientation of the speaker, to yield a modified context similarity score;
when the modified context similarity score is above a threshold, incorporating a current user utterance into the conversation context model for use in the speech dialog; and
when the modified context similarity score is one of equaling the threshold and below the threshold, suppressing the current user utterance such that the current user utterance is not incorporated into the conversation context model and the speech dialog produces speech as though the current user utterance is not in the conversation context model.
Specification