MULTIMODAL REMOTE CONTROL
First Claim
1. A remote control method, comprising:
detecting an audio input including speech content from a user;
detecting a motion input representative of a gesture performed by the user;
performing speech-to-text conversion on the audio input to generate a speech command;
processing the motion input to generate a gesture command;
synchronizing the speech command and the gesture command to generate a multimodal command; and
executing the multimodal command at a processor.
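The steps of claim 1 can be sketched in code. The claim does not say how the speech command and the gesture command are synchronized, so this minimal sketch assumes one plausible reading: the two commands are paired only when their timestamps fall within a short window. All names here (`SpeechCommand`, `GestureCommand`, `synchronize`, `max_skew`) are hypothetical, and the speech-to-text and gesture-recognition steps are assumed to have already produced their outputs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpeechCommand:
    text: str         # assumed output of a speech-to-text step
    timestamp: float  # capture time in seconds


@dataclass(frozen=True)
class GestureCommand:
    name: str         # assumed output of a gesture-recognition step
    timestamp: float  # capture time in seconds


@dataclass(frozen=True)
class MultimodalCommand:
    speech: str
    gesture: str


def synchronize(
    speech: SpeechCommand,
    gesture: GestureCommand,
    max_skew: float = 1.5,
) -> Optional[MultimodalCommand]:
    """Pair the two commands if they occurred close together in time.

    The time-window rule and the 1.5 s default are assumptions made for
    illustration; the claim text does not define synchronization.
    """
    if abs(speech.timestamp - gesture.timestamp) > max_skew:
        return None
    return MultimodalCommand(speech.text.strip().lower(), gesture.name)


def execute(command: MultimodalCommand) -> str:
    """Stand-in for execution at a processor: returns a description."""
    return f"{command.speech}:{command.gesture}"
```

Under these assumptions, a spoken "Volume" at t = 10.0 s and an upward swipe at t = 10.4 s are merged into one multimodal command, while the same swipe at t = 14.0 s is left unpaired.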
Abstract
A method and system for operating a remotely controlled device may use multimodal remote control commands that include a gesture command and a speech command. The gesture command may be interpreted from a gesture performed by a user, while the speech command may be interpreted from speech utterances made by the user. The gesture and speech utterances may be simultaneously received by the remotely controlled device in response to displaying a user interface configured to receive multimodal commands.
20 Claims
1. A remote control method, comprising:
detecting an audio input including speech content from a user;
detecting a motion input representative of a gesture performed by the user;
performing speech-to-text conversion on the audio input to generate a speech command;
processing the motion input to generate a gesture command;
synchronizing the speech command and the gesture command to generate a multimodal command; and
executing the multimodal command at a processor.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A remotely controlled device for processing multimodal remote control commands, comprising:
a processor configured to access memory media;
an infrared receiver; and
a microphone;
wherein the memory media include instructions executable by the processor to:
capture a speech utterance from a user via the microphone;
capture a gesture performed by the user via the infrared receiver;
identify a speech command from the speech utterance;
identify a gesture command from the gesture; and
combine the speech command and the gesture command into a multimodal command.
Dependent claims: 9, 10, 11, 12, 13
14. Computer-readable memory media, including instructions executable by a processor to:
capture, via an audio input device, a speech utterance from a user;
capture, via a motion detection device, a gesture performed by the user; and
identify a multimodal command based on a combination of the speech utterance and the gesture.
Dependent claims: 15, 16, 17, 18, 19, 20
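Claim 14 identifies a multimodal command from the combination of utterance and gesture rather than from either input alone. One way to picture this, purely as an illustration, is a lookup keyed on the pair. The table contents, the gesture labels, and the function name `identify_multimodal_command` are hypothetical and do not come from the patent text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping from (spoken word, gesture label) to a device action.
# The same gesture resolves to different actions depending on the utterance.
COMMAND_TABLE: Dict[Tuple[str, str], str] = {
    ("volume", "swipe_up"): "VOLUME_UP",
    ("volume", "swipe_down"): "VOLUME_DOWN",
    ("channel", "swipe_up"): "CHANNEL_UP",
    ("channel", "swipe_down"): "CHANNEL_DOWN",
}


def identify_multimodal_command(utterance: str, gesture: str) -> Optional[str]:
    """Return the action for this utterance/gesture pair, or None if unknown."""
    key = (utterance.strip().lower(), gesture)
    return COMMAND_TABLE.get(key)
```

In this sketch the utterance selects what is being controlled and the gesture selects the direction, so an unrecognized pairing simply yields no command.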
Specification