Information processing device and method for determining whether a state of collected sound data is suitable for speech recognition
First Claim
1. An information processing device, comprising:
- circuitry configured to:
acquire an image of a user;
control a display device to display an object on a display screen;
determine an arrival direction of user voice with respect to a microphone based on analysis of the image of the user, wherein the microphone is configured to collect sound data;
control a movement of the object on the display screen based on the arrival direction;
acquire the collected sound data from the arrival direction based on a direction of the movement of the object on the display screen;
determine utterance of an expression based on the collected sound data, wherein the expression indicates one of a beginning of a sentence included in the collected sound data or an end of the sentence included in the collected sound data;
determine a state of the collected sound data based on the determination of utterance of the expression, wherein the state is one of a first state that indicates that the collected sound data is suitable for speech recognition or a second state that indicates that the collected sound data is unsuitable for the speech recognition;
control an output device to output the state of the collected sound data; and
control at least one parameter of the object based on the state of the collected sound data.
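As an illustrative sketch only, the state-determination elements of claim 1 (determining utterance of a sentence-beginning or sentence-ending expression, then classifying the collected sound data as suitable or unsuitable for speech recognition) could be modeled as follows. Every name here, including the expression lists, `SoundState`, and `determine_state`, is hypothetical and not drawn from the patent:

```python
from enum import Enum

class SoundState(Enum):
    SUITABLE = "suitable for speech recognition"      # the claimed "first state"
    UNSUITABLE = "unsuitable for speech recognition"  # the claimed "second state"

# Hypothetical expressions marking the beginning or end of a sentence;
# the patent does not enumerate specific expressions.
BEGIN_EXPRESSIONS = ("well", "so", "okay")
END_EXPRESSIONS = ("thanks", "that is all")

def determine_state(collected_text: str) -> SoundState:
    """Classify collected sound data (here, its transcript) as the first
    state when it contains an expression indicating the beginning or the
    end of a sentence, and as the second state otherwise."""
    text = collected_text.lower().strip()
    if text.startswith(BEGIN_EXPRESSIONS) or text.endswith(END_EXPRESSIONS):
        return SoundState.SUITABLE
    return SoundState.UNSUITABLE
```

In this sketch, a sentence-boundary expression is taken as evidence that the user produced a complete utterance, which is why its presence maps to the "suitable" state.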
Abstract
Provided is an information processing device including: a collected sound data acquisition portion that acquires collected sound data; and an output controller that causes an output portion to output at least whether or not a state of the collected sound data is suitable for speech recognition.
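The two portions named in the abstract, a collected sound data acquisition portion and an output controller, might be sketched as below. The class and parameter names (`OutputController`, `output_portion`, `report`) are hypothetical illustrations, not the patent's implementation:

```python
class OutputController:
    """Hypothetical output controller: causes an output portion to
    output at least whether or not the state of the collected sound
    data is suitable for speech recognition."""

    def __init__(self, output_portion):
        # output_portion: any callable that presents a message to the user
        self.output_portion = output_portion

    def report(self, suitable: bool) -> str:
        message = ("collected sound is suitable for speech recognition"
                   if suitable else
                   "collected sound is unsuitable for speech recognition")
        self.output_portion(message)
        return message

# Usage: collect output messages in a list standing in for a display or speaker.
lines = []
controller = OutputController(lines.append)
state_message = controller.report(True)
```

Decoupling the controller from the concrete output portion (display, speaker, LED) mirrors the abstract's separation between deciding what to output and the portion that outputs it.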
21 Claims
1. An information processing device, as set forth in full under “First Claim” above.
Dependent claims 2–19 depend from claim 1.
20. A method, comprising:
acquiring an image of a user;
controlling a display device to display an object on a display screen;
determining an arrival direction of user voice with respect to a microphone, based on analysis of the image of the user, wherein the microphone is configured to collect sound data;
controlling a movement of the object on the display screen based on the arrival direction;
acquiring the collected sound data from the arrival direction based on a direction of the movement of the object on the display screen;
determining utterance of an expression based on the collected sound data, wherein the expression indicates one of a beginning of a sentence included in the collected sound data or an end of the sentence included in the collected sound data;
determining a state of the collected sound data based on the determination of utterance of the expression, wherein the state is one of a first state that indicates that the collected sound data is suitable for speech recognition or a second state that indicates that the collected sound data is unsuitable for the speech recognition;
controlling an output device to output the state of the collected sound data; and
controlling at least one parameter of the object based on the collected sound data.
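The claimed step of controlling a movement of the object based on the arrival direction could be sketched as below. The mapping from direction to screen position, the -90° to +90° range, and the easing gain are all assumptions for illustration, not details recited in the claims:

```python
def move_object(position: float, arrival_direction_deg: float,
                screen_width: int = 1920, gain: float = 0.1) -> float:
    """Ease the on-screen object toward a target x position derived
    from the arrival direction of the user voice (assumed to lie in
    -90..+90 degrees relative to the microphone's forward axis)."""
    # Map -90 deg -> left edge (0), +90 deg -> right edge (screen_width).
    target = (arrival_direction_deg + 90.0) / 180.0 * screen_width
    # Move a fraction of the remaining distance each update, so the
    # object's direction of movement tracks the voice arrival direction.
    return position + gain * (target - position)
```

Because the sound data is then acquired "from the arrival direction based on a direction of the movement of the object," the object's motion in a sketch like this doubles as user-visible feedback on where the device is listening.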
21. A non-transitory computer-readable medium having stored thereon, computer-executable instructions, which when executed by a computer, cause the computer to execute operations, the operations comprising:
acquiring an image of a user;
controlling a display device to display an object on a display screen;
determining an arrival direction of user voice with respect to a microphone, based on analysis of the image of the user, wherein the microphone is configured to collect sound data;
controlling a movement of the object on the display screen based on the arrival direction;
acquiring the collected sound data from the arrival direction based on a direction of the movement of the object on the display screen;
determining utterance of an expression based on the collected sound data, wherein the expression indicates one of a beginning of a sentence included in the collected sound data or an end of the sentence included in the collected sound data;
determining a state of the collected sound data based on the determination of utterance of the expression, wherein the state is one of a first state that indicates that the collected sound data is suitable for speech recognition or a second state that indicates that the collected sound data is unsuitable for the speech recognition;
controlling an output device to output the state of the collected sound data; and
controlling at least one parameter of the object based on the collected sound data.
Specification