Audio visual speech recognition
First Claim
1. An apparatus for producing an output indicating at least some of a sequence of spoken phonemes from a human speaker comprising:
means for detecting sounds and converting said sounds into an electrical signal;
means for analyzing said signal to detect said phonemes to produce an electrical acoustic output signal indicating for each of at least some of said detected phonemes one group of a plurality of phoneme groups including the detected phoneme, each of said phoneme groups including at least one phoneme;
means for optically scanning the face of said speaker and producing an electrical lipshape signal representing the visual manifestation for at least some of said spoken phonemes indicating one of a plurality of lipshapes, each lipshape being associated with at least one phoneme; and
means for receiving and correlating said lipshape signal and said acoustic output signal to produce said output.
Abstract
A method and apparatus for indicating at least some of a sequence of spoken phonemes. Detected sounds are analyzed to determine the group of phonemes to which each phoneme belongs, the lipshape is optically detected, and the respective signals are correlated by a computer to indicate the detected phoneme.
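The correlation step described in the abstract can be illustrated with a minimal sketch. It assumes the correlation amounts to intersecting the acoustically detected phoneme group with the set of phonemes associated with the observed lipshape; the patent text shown here does not confirm that this is the actual mechanism. The group names, lipshape names, phoneme assignments, and the functions `correlate` and `decode` are all hypothetical and chosen only for illustration.

```python
# Hypothetical phoneme groups an acoustic analyzer might report.
# These groupings are illustrative assumptions, not taken from the patent.
ACOUSTIC_GROUPS = {
    "nasal": {"m", "n"},
    "voiced_stop": {"b", "d", "g"},
    "unvoiced_stop": {"p", "t", "k"},
}

# Hypothetical lipshapes and the phonemes each one may accompany.
# Also illustrative assumptions.
LIPSHAPE_PHONEMES = {
    "closed": {"m", "b", "p"},
    "open": {"n", "d", "t", "g", "k"},
}


def correlate(acoustic_group, lipshape):
    """Return the phonemes consistent with both the acoustic group and the lipshape."""
    return ACOUSTIC_GROUPS[acoustic_group] & LIPSHAPE_PHONEMES[lipshape]


def decode(observations):
    """Map (acoustic_group, lipshape) pairs to phonemes.

    A phoneme is emitted only when the two signals leave exactly one
    candidate; otherwise None marks that position as unresolved.
    """
    decoded = []
    for acoustic_group, lipshape in observations:
        candidates = correlate(acoustic_group, lipshape)
        decoded.append(next(iter(candidates)) if len(candidates) == 1 else None)
    return decoded


if __name__ == "__main__":
    print(decode([("nasal", "closed"), ("unvoiced_stop", "closed")]))
```

In this sketch neither signal alone identifies the phoneme: the acoustic group "nasal" leaves two candidates and the lipshape "closed" leaves three, while their intersection leaves one. Positions where the intersection is empty or still ambiguous are left unresolved, which is consistent with the claims' wording "at least some of" the spoken phonemes.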
5 Claims
1. An apparatus for producing an output indicating at least some of a sequence of spoken phonemes from a human speaker comprising:
means for detecting sounds and converting said sounds into an electrical signal;
means for analyzing said signal to detect said phonemes to produce an electrical acoustic output signal indicating for each of at least some of said detected phonemes one group of a plurality of phoneme groups including the detected phoneme, each of said phoneme groups including at least one phoneme;
means for optically scanning the face of said speaker and producing an electrical lipshape signal representing the visual manifestation for at least some of said spoken phonemes indicating one of a plurality of lipshapes, each lipshape being associated with at least one phoneme; and
means for receiving and correlating said lipshape signal and said acoustic output signal to produce said output.
Dependent claims: 2, 3, 4
5. A method of producing an output indicating at least some of a sequence of spoken phonemes from a human speaker comprising the steps of:
detecting sounds and converting said sounds into an electrical signal;
analyzing said signal to detect said phonemes to produce an electrical acoustic output signal indicating for each of at least some of said detected phonemes one group of a plurality of phoneme groups including the detected phoneme, each of said phoneme groups including at least one phoneme;
optically scanning the face of said speaker and producing an electrical lipshape signal representing the visual manifestation for at least some of said spoken phonemes indicating one of a plurality of lipshapes, each lipshape being associated with at least one phoneme; and
correlating said lipshape signal and said acoustic output signal to produce said output.
Specification