SPEECH COMMUNICATION SYSTEM AND METHOD, AND ROBOT APPARATUS
Abstract
This invention provides a speech communication system and method, and a robot apparatus, capable of significantly improving entertainment value. A speech communication system having a function for conversing with a conversation partner is provided with: a speech recognition means for recognizing the speech of the conversation partner; a conversation control means for controlling the conversation with the conversation partner based on the recognition result of the speech recognition means; an image recognition means for recognizing the face of the conversation partner; and a tracking control means for tracking the presence of the conversation partner based on one or both of the recognition result of the image recognition means and the recognition result of the speech recognition means. The conversation control means controls the conversation so that it continues depending on the tracking performed by the tracking control means.
28 Citations
20 Claims
1. A speech communication system enabling a conversation with a conversation partner, said system comprising:
a generation unit configured to generate a plurality of auditory communications according to a predetermined rule;
a speech recognition unit, in an apparatus, configured to recognize a speech content of the conversation partner;
an estimation control unit configured to estimate intentions of the conversation partner from the speech content recognized by the speech recognition unit;
a conversation control unit configured to dynamically select one of the plurality of auditory communications based on the estimation by the estimation control unit;
an audio output unit configured to output the one of the plurality of auditory communications selected by the conversation control unit;
an image recognition unit, in the apparatus, configured to recognize a face of the conversation partner;
a touch sensing unit, in the apparatus, configured to recognize a touch input by the conversation partner;
a tracking control unit configured to determine whether or not to continue the conversation based on a recognition result from the image recognition unit or the touch sensing unit; and
a network interface configured to communicate with an external network.
Dependent claims: 2, 3, 4, 5
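The selection path recited in claim 1 (generate candidate communications, recognize speech, estimate the partner's intention, select one candidate, output it) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the rule table `RULES`, the keyword table `KEYWORDS`, and every function name are hypothetical, the intent estimator is a simple keyword match, and the speech recognizer and audio output are stood in for by plain text strings.

```python
import re
from dataclasses import dataclass

# Hypothetical "predetermined rule": each intent label maps to one utterance.
RULES = {
    "greeting": "Hello! Nice to see you.",
    "farewell": "Goodbye, talk to you later.",
    "unknown": "Could you say that again?",
}

# Hypothetical keyword sets standing in for a real intent estimator.
KEYWORDS = {
    "greeting": {"hello", "hi"},
    "farewell": {"bye", "goodbye"},
}


@dataclass
class Utterance:
    intent: str
    text: str


def generate_candidates(rules):
    """Generation unit: build the plurality of auditory communications."""
    return [Utterance(intent, text) for intent, text in rules.items()]


def estimate_intent(recognized_text):
    """Estimation control unit: guess the intent from recognized speech."""
    tokens = set(re.findall(r"[a-z]+", recognized_text.lower()))
    for intent, keys in KEYWORDS.items():
        if tokens & keys:
            return intent
    return "unknown"


def select_response(candidates, intent):
    """Conversation control unit: pick the candidate matching the intent."""
    for candidate in candidates:
        if candidate.intent == intent:
            return candidate
    # Fall back to the "unknown" utterance if no candidate matches.
    return Utterance("unknown", RULES["unknown"])


def respond(recognized_text):
    """Run the whole path; the returned string stands in for audio output."""
    candidates = generate_candidates(RULES)
    intent = estimate_intent(recognized_text)
    return select_response(candidates, intent).text
```

For example, `respond("Hi there")` selects the greeting utterance, while input with no known keyword falls back to the "unknown" utterance. A real system would replace the keyword match with whatever estimation method the specification describes.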
6. A speech communication apparatus enabling a conversation with a conversation partner, comprising:
a speech recognition unit configured to recognize a speech content of the conversation partner;
an audio output unit configured to output auditory communications;
an image recognition unit configured to recognize a face of the conversation partner;
a touch sensing unit configured to recognize a touch input by the conversation partner;
a tracking control unit configured to determine whether or not to continue the conversation based on a recognition result from the image recognition unit or the touch sensing unit; and
a network interface configured to communicate with an external network.
Dependent claims: 7, 8, 9, 10, 11
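The tracking control unit's decision in claim 6 can be illustrated with a small sketch. The claim states only that the decision is based on a recognition result from the image recognition unit or the touch sensing unit; the timeout-based policy, the `grace_seconds` parameter, and the class and method names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackingControl:
    """Decides whether to continue the conversation (hypothetical policy)."""

    grace_seconds: float = 5.0          # assumed tolerance, not from the claim
    last_seen: Optional[float] = None   # time of the last positive recognition

    def update(self, now: float, face_recognized: bool, touched: bool) -> bool:
        """Return True if the conversation should continue at time `now`."""
        # Either recognition result counts as evidence the partner is present.
        if face_recognized or touched:
            self.last_seen = now
            return True
        # The partner has never been detected: nothing to continue.
        if self.last_seen is None:
            return False
        # No evidence right now: continue only within the grace period.
        return (now - self.last_seen) <= self.grace_seconds
```

With `grace_seconds=5.0`, a face seen at time 1.0 keeps the conversation alive at time 4.0 but not at time 7.0, and a later touch input resumes it. The grace period is one possible way to tolerate brief recognition dropouts; other policies would satisfy the same claim language.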
12. A speech communication method enabling a conversation with a conversation partner, said method comprising:
generating a plurality of auditory communications according to a predetermined rule;
recognizing, using a speech recognition unit in an apparatus, a speech content of the conversation partner;
estimating intentions of the conversation partner from the speech content recognized by the recognizing of the speech recognition unit;
dynamically selecting one of the plurality of auditory communications based on the estimation by the estimating;
outputting the one of the plurality of auditory communications selected by the dynamically selecting;
recognizing, using an image recognition unit in the apparatus, a face of the conversation partner;
recognizing, using a touch sensing unit in the apparatus, a touch input by the conversation partner; and
determining whether or not to continue the conversation based on a recognition result from recognizing by the image recognition unit or the touch sensing unit.
Dependent claims: 13, 14, 15, 16
17. A non-transitory computer readable medium having stored thereon a program that when executed by a computing device causes the computing device to implement a speech communication method enabling a conversation with a conversation partner, said method comprising:
generating a plurality of auditory communications according to a predetermined rule;
recognizing, using a speech recognition unit in an apparatus, a speech content of the conversation partner;
estimating intentions of the conversation partner from the speech content recognized by the recognizing by the speech recognition unit;
dynamically selecting one of the plurality of auditory communications based on the estimation by the estimating;
outputting the one of the plurality of auditory communications selected by the dynamically selecting;
recognizing, using an image recognition unit in the apparatus, a face of the conversation partner;
recognizing, using a touch sensing unit in the apparatus, a touch input by the conversation partner; and
determining whether or not to continue the conversation based on a recognition result from recognizing by the image recognition unit or the touch sensing unit.
Dependent claims: 18, 19, 20
Specification