Film language
Abstract
This invention comprises analyzing a speaker's speech in an audio visual recording to convert it into triphones and/or phonemes and then using a time coded phoneme stream to identify corresponding visual facial motions, to create single frame snapshots or multi-frame clips of facial motion corresponding to speech phoneme utterance states and transformations, which are stored in a database, and which are subsequently used to animate the original speaker's face, synchronized to a new voice track that has been converted into a time-coded, image frame-indexed phoneme stream.
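The abstract describes a pipeline: convert the original track into a time-coded phoneme stream, store frames of matching facial motion in a database keyed by utterance state, then replay those frames in sync with the dub track's phoneme stream. A minimal sketch in Python, assuming a toy phoneme-to-viseme table and a naive first-match frame-selection policy (both hypothetical illustrations, not the patent's actual tables or method):

```python
from dataclasses import dataclass

# Hypothetical grouping of phonemes into viseme classes; a real system
# would use a full phoneme inventory and a richer viseme taxonomy.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "ae": "open",
    "iy": "spread", "s": "spread",
    "uw": "rounded", "ow": "rounded",
}

@dataclass
class PhonemeEvent:
    """One entry of a time-coded phoneme stream."""
    phoneme: str   # e.g. "aa"
    start: float   # seconds into the audio track
    end: float

def build_frame_database(original_stream, fps=24):
    """Map each viseme class to frame indices of the original recording
    where the original speaker uttered a phoneme of that class."""
    db = {}
    for ev in original_stream:
        viseme = PHONEME_TO_VISEME.get(ev.phoneme)
        if viseme is None:
            continue
        # Index the frame at the midpoint of the phoneme utterance.
        mid_frame = int(((ev.start + ev.end) / 2) * fps)
        db.setdefault(viseme, []).append(mid_frame)
    return db

def animate_from_dub(dub_stream, frame_db, fps=24):
    """For each dub-track phoneme, pick a stored frame of the original
    speaker's face with the matching viseme, time-coded to the dub.
    Returns (dub_frame, source_frame) pairs."""
    timeline = []
    for ev in dub_stream:
        viseme = PHONEME_TO_VISEME.get(ev.phoneme)
        frames = frame_db.get(viseme)
        if frames:
            timeline.append((int(ev.start * fps), frames[0]))
    return timeline
```

For example, if the original speaker utters "p" then "aa", a dub phoneme "b" (which shares the bilabial viseme with "p") retrieves the original speaker's bilabial frame at the dub's own time code.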
35 Claims
- 1. A method for modifying an audio visual recording originally produced with an original audio track of an original speaker, using a second audio dub track of a second speaker, to produce a new audio visual recording with synchronized audio to facial expressive speech of the second audio dub track spoken by the original speaker, comprising analyzing the original audio track to convert it into phonemes as a time-coded phoneme stream to identify corresponding visual facial motions of the original speaker to create frames of facial motion corresponding to speech phoneme utterance states and transformations, storing these frames in a database, analyzing the second audio dub track to convert it to phonemes as a time-coded phoneme stream, using the second audio dub track time-coded phoneme stream to animate the original speaker's face, synchronized to the second audio dub track to create natural continuous facial speech expression by the original speaker of the second dub audio track.
- 17. A method for modifying an audio visual recording originally produced with an original audio track of an original screen actor, using a second audio dub track of a second screen actor, to produce a new audio visual recording with synchronized audio to facial expressive speech of the second audio dub track spoken by the original screen actor, comprising analyzing the original audio track to convert it into phonemes as a time-coded phoneme stream to identify corresponding visual facial motions of the original speaker to create frames of facial motion corresponding to speech phoneme utterance states and transformations, storing these frames in a database, analyzing the second audio dub track to convert it to phonemes as a time-coded phoneme stream, using the second audio dub track time-coded phoneme stream to animate the original screen actor's face, synchronized to the second audio dub track to create natural continuous facial speech expression by the original screen actor of the second dub audio track.
- 32. A method for modifying an audio visual recording originally produced with an original audio track of an original screen actor, using a second audio dub track of a second screen actor, to produce a new audio visual recording with synchronized audio to facial expressive speech of the second audio dub track spoken by the original screen actor, comprising analyzing the original audio track to convert it into phonemes as a time-coded phoneme stream, identifying corresponding visemes of the original screen actor, using radar to measure a set of facial reference points corresponding to speech phoneme utterance states and transformations, storing the data obtained in a database, analyzing the second audio dub track to convert it to phonemes as a time-coded phoneme stream, identifying corresponding visemes of the second screen actor, using radar to measure a set of facial reference points corresponding to speech phoneme utterance states and transformations, storing the data obtained in a database, using the second audio dub track time-coded phoneme stream and the actors' visemes to animate the original screen actor's face, synchronized to the second audio dub track to create natural continuous facial speech expression by the original screen actor of the second dub audio track.
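Claim 32 adds radar measurement of facial reference points stored per utterance state, with transformations between states. A minimal sketch of the storage and transformation steps, assuming a hypothetical four-point reference set and simple linear blending between stored states (the claim specifies neither the point set nor the blending method; the radar measurement itself is represented here only as stored (x, y) data):

```python
# Hypothetical set of measured facial reference points; claim 32 leaves
# the actual point set unspecified.
REFERENCE_POINTS = ("upper_lip", "lower_lip", "left_corner", "right_corner")

def store_measurement(db, viseme, points):
    """Record one measurement of the reference points for a viseme
    utterance state; `points` maps point name -> (x, y)."""
    assert set(points) == set(REFERENCE_POINTS)
    db[viseme] = points

def interpolate_states(a, b, t):
    """Linear blend between two stored utterance states, giving an
    in-between mouth shape for a phoneme transformation (0 <= t <= 1)."""
    return {
        name: tuple((1 - t) * ax + t * bx for ax, bx in zip(a[name], b[name]))
        for name in REFERENCE_POINTS
    }
```

For example, blending a "closed" and an "open" state at t=0.5 yields a half-open mouth shape, which could drive the intermediate animation frames between two dub-track phonemes.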
Specification