Method and apparatus for recognizing speech by lip reading
Abstract
A dictation device includes: an audio input device configured to receive a voice utterance including a plurality of words; a video input device configured to receive video of lip motion during the voice utterance; a memory portion; a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and a transceiver for sending the first data packets to a server end device and receiving second data packets including combined dictation based upon the audio stream and the video stream from the server end device. In the combined dictation, first dictation generated based upon the audio stream has been corrected by second dictation generated based upon the video stream.
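The correction step described in the abstract can be pictured with a short sketch. The abstract does not say which words are corrected or how the two dictations are aligned, so the per-word `confidence` score, the `threshold` value, and the one-to-one word alignment below are illustrative assumptions, not the patented method.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # hypothetical recognizer score in [0, 1]


def combine_dictation(audio_words, video_words, threshold=0.5):
    """Return the audio dictation with low-confidence words replaced.

    Assumption: both dictations contain the same number of words, so the
    i-th audio word lines up with the i-th video word.
    """
    if len(audio_words) != len(video_words):
        raise ValueError("sketch assumes one-to-one word alignment")
    combined = []
    for audio_word, video_word in zip(audio_words, video_words):
        # Keep the audio word unless its score falls below the threshold.
        if audio_word.confidence < threshold:
            combined.append(video_word.text)
        else:
            combined.append(audio_word.text)
    return combined


# First dictation (audio) misrecognizes the last word; the second
# dictation (lip motion) supplies the replacement.
audio = [Word("open", 0.9), Word("the", 0.8), Word("widow", 0.3)]
video = [Word("open", 0.6), Word("the", 0.5), Word("window", 0.7)]
combined = combine_dictation(audio, video)
```

Under these assumptions, only "widow" is replaced, because it is the only audio word scored below the threshold.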
16 Claims
1. A vehicle component control device comprising:
an audio input device configured to receive a voice utterance including a plurality of words;
a video input device configured to receive video of lip motion of a user;
a memory portion;
a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and
a transceiver for sending the first data packets to a remote apparatus and receiving second data packets including machine code for controlling a vehicle component,
wherein the machine code is assigned from configured dictation generated based upon the audio stream and the video stream from the remote apparatus,
wherein in the configured dictation, at least one word in first dictation generated based upon the audio stream which has a predetermined characteristic has been corrected by a feature signal parameter sequence based upon the video stream,
wherein the first data packets further include geographical data and the remote apparatus determines a location associated with the vehicle component control device based on the geographical data,
wherein the machine code is assigned from the first dictation or the configured dictation based upon the location associated with the vehicle component control device.
Dependent claims: 2, 3
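As an illustration of the first data packets recited in claim 1, the sketch below bundles an audio stream, a video stream, and geographical data into one serialized payload. The claim does not specify a wire format, so the `FirstDataPacket` class, its field names, and the JSON plus base64 encoding are all hypothetical choices made for this example.

```python
import base64
import json
from dataclasses import dataclass


@dataclass
class FirstDataPacket:
    audio: bytes      # audio stream representative of the voice utterance
    video: bytes      # video stream representative of the lip motion
    latitude: float   # geographical data (hypothetical representation)
    longitude: float

    def encode(self) -> bytes:
        """Serialize the packet to UTF-8 JSON with base64-encoded streams."""
        payload = {
            "audio": base64.b64encode(self.audio).decode("ascii"),
            "video": base64.b64encode(self.video).decode("ascii"),
            "geo": {"lat": self.latitude, "lon": self.longitude},
        }
        return json.dumps(payload).encode("utf-8")

    @classmethod
    def decode(cls, raw: bytes) -> "FirstDataPacket":
        """Rebuild a packet from bytes produced by encode()."""
        payload = json.loads(raw.decode("utf-8"))
        return cls(
            audio=base64.b64decode(payload["audio"]),
            video=base64.b64decode(payload["video"]),
            latitude=payload["geo"]["lat"],
            longitude=payload["geo"]["lon"],
        )


packet = FirstDataPacket(b"\x00\x01", b"\x02\x03", 35.0, 139.0)
restored = FirstDataPacket.decode(packet.encode())
```

A remote apparatus receiving such a payload could read the `geo` field to determine the location before assigning machine code.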
4. A vehicle component control device comprising:
an audio input device that receives an audio signal representing a voice utterance;
a video input device that receives a video signal representative of movement of a user;
a controller configured according to instructions stored in a memory, the controller configured to:
generate first dictation based on the audio signal;
generate a feature signal parameter sequence based on the video signal;
generate configured dictation based on the first dictation and the feature signal parameter sequence;
determine a location associated with the vehicle component control device based on geographical data associated with the vehicle component control device; and
assign machine code for controlling a vehicle component based on the first dictation or the configured dictation based upon the location associated with the vehicle component control device.
Dependent claims: 5, 6, 7, 8, 9, 10
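The last two controller steps of claim 4 (determine a location, then assign machine code from either the first dictation or the configured dictation) can be sketched as follows. The claim does not state how the location drives the choice or what the machine code looks like, so the `LIP_CORRECTION_REGIONS` set, the `COMMAND_TABLE` mapping, and the region labels are hypothetical placeholders.

```python
# Hypothetical mapping from recognized phrases to machine codes.
COMMAND_TABLE = {"open window": 0x01, "close window": 0x02}

# Hypothetical set of regions in which the lip-corrected dictation is used.
LIP_CORRECTION_REGIONS = {"highway"}


def select_dictation(first, configured, region):
    """Pick the configured dictation only in the listed regions."""
    return configured if region in LIP_CORRECTION_REGIONS else first


def assign_machine_code(first, configured, region):
    """Return the machine code for the selected dictation, or None."""
    phrase = select_dictation(first, configured, region)
    return COMMAND_TABLE.get(phrase)  # None when no command matches


# Same utterance, two locations: only the highway case uses the
# lip-corrected text and therefore matches a command.
code_highway = assign_machine_code("open widow", "open window", "highway")
code_garage = assign_machine_code("open widow", "open window", "garage")
```

In this sketch an unmatched phrase yields `None`; how the claimed device handles unrecognized commands is not stated in the claim.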
11. A vehicle component control device comprising:
an audio input device configured to receive a voice utterance including a plurality of words;
a video input device configured to receive video of lip motion of a user;
a controller configured to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and
a transceiver for sending the first data packets to a remote apparatus and receiving second data packets including machine code for controlling a vehicle component, the machine code assigned from configured dictation generated based upon the audio stream and the video stream from the remote apparatus,
wherein in the configured dictation, at least one word in first dictation generated based upon the audio stream which has a predetermined characteristic has been corrected by a feature signal parameter sequence based upon the video stream,
wherein the vehicle component control device is configured to receive global positioning system (GPS) data to determine a location of the vehicle component control device,
wherein the machine code is assigned from the first dictation or the configured dictation based upon the location associated with the vehicle component control device.
Dependent claims: 12, 13, 14, 15, 16
Specification