Animating speech of an avatar representing a participant in a mobile communication
First Claim
1. A method of animating speech of an avatar representing a participant in a mobile communication, the method comprising:
selecting, by a computer, from data storage, one or more images to represent the participant;
selecting, by the computer, from data storage, a generic animation template for the participant, the generic animation template having a mouth and at least one emotive feature, the mouth characterized by a mouth position;
fitting, by the computer, the one or more images with the generic animation template;
texture wrapping, by the computer, the one or more images over the generic animation template;
displaying, by the computer, the one or more images texture wrapped over the generic animation template;
receiving, by the computer, an audio speech signal derived from the mobile communication of the participant;
identifying, by the computer, from the audio speech signal, a series of phonemes and one or more points of voice inflection greater than a predetermined threshold, each phoneme in the series of phonemes representing a portion of the audio speech signal;
for each phoneme in the series of phonemes:
identifying, by the computer, a new mouth position for the mouth of the generic animation template;
altering, by the computer, the mouth position of the mouth of the generic animation template to the new mouth position;
texture wrapping, by the computer, a portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template;
displaying, by the computer, the texture wrapped portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template; and
playing, by the computer, synchronously with the displayed texture wrapped portion of the one or more images, the portion of the audio speech signal represented by the phoneme; and
for each point of voice inflection of the one or more points of inflection greater than the predetermined threshold, triggering, by the computer, a motion key-frame caption that alters display of the at least one emotive feature synchronously with playing, by the computer, a portion of the audio speech signal including the point of voice inflection greater than the predetermined threshold.
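The final limitation above (triggering a key-frame for each point of voice inflection greater than a predetermined threshold) can be illustrated with a minimal sketch. The claim does not say how inflection is measured, so this example assumes a per-frame pitch track in hertz and treats a frame-to-frame pitch change above the threshold as a point of inflection. The names `KeyFrameTrigger`, `find_inflection_points`, and `schedule_emotive_keyframes`, the feature name `"eyebrows"`, and the default values are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class KeyFrameTrigger:
    time_s: float     # playback time at which the emotive feature should change
    feature: str      # hypothetical emotive feature name (assumption)
    intensity: float  # amount by which the inflection exceeded the threshold


def find_inflection_points(pitch_hz, frame_s, threshold_hz):
    """Return (time_s, delta_hz) for each frame-to-frame pitch change above the threshold.

    Assumption: inflection is approximated by the absolute pitch difference
    between consecutive analysis frames.
    """
    points = []
    for i in range(1, len(pitch_hz)):
        delta = abs(pitch_hz[i] - pitch_hz[i - 1])
        if delta > threshold_hz:
            points.append((i * frame_s, delta))
    return points


def schedule_emotive_keyframes(pitch_hz, frame_s=0.02, threshold_hz=40.0,
                               feature="eyebrows"):
    """Build one key-frame trigger per inflection point.

    Each trigger carries the audio time of the inflection, so a player can
    alter the emotive feature while that portion of the audio is playing.
    """
    return [
        KeyFrameTrigger(time_s=t, feature=feature, intensity=delta - threshold_hz)
        for t, delta in find_inflection_points(pitch_hz, frame_s, threshold_hz)
    ]


if __name__ == "__main__":
    for trigger in schedule_emotive_keyframes([120, 122, 180, 178, 120]):
        print(trigger)
```

Keying each trigger to the audio timestamp, rather than to a frame index of the animation, is one way to keep the emotive change synchronous with playback, as the claim requires.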
Abstract
Animating speech of an avatar representing a participant in a mobile communication including selecting one or more images; selecting a generic animation template; fitting the one or more images with the generic animation template; texture wrapping the one or more images over the generic animation template; and displaying the one or more images texture wrapped over the generic animation template. Receiving an audio speech signal; identifying a series of phonemes; and for each phoneme: identifying a new mouth position for the mouth of the generic animation template; altering the mouth position to the new mouth position; texture wrapping a portion of the one or more images corresponding to the altered mouth position; displaying the texture wrapped portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template; and playing the portion of the audio speech signal represented by the phoneme.
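The per-phoneme loop summarized in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the phoneme symbols, the `MOUTH_POSITIONS` table, and the `Phoneme` and `Frame` types are hypothetical, and the texture wrapping and display steps are reduced to recording which mouth position should be shown for which span of audio.

```python
from dataclasses import dataclass


@dataclass
class Phoneme:
    symbol: str     # phoneme label produced by some speech recognizer (assumption)
    start_s: float  # start of the audio portion this phoneme represents
    end_s: float    # end of that audio portion


@dataclass
class Frame:
    mouth_position: str  # mouth position to apply to the animation template
    audio_start_s: float
    audio_end_s: float


# Hypothetical phoneme-to-mouth-position table; the patent does not list one.
MOUTH_POSITIONS = {
    "AA": "open",
    "IY": "wide",
    "UW": "rounded",
    "M": "closed",
    "F": "lip_to_teeth",
}
DEFAULT_MOUTH_POSITION = "neutral"


def build_animation_frames(phonemes):
    """For each phoneme, identify a new mouth position and pair it with its audio span.

    A renderer would then alter the template's mouth, re-wrap the affected
    portion of the image, display it, and play the paired audio span.
    """
    frames = []
    for phoneme in phonemes:
        new_position = MOUTH_POSITIONS.get(phoneme.symbol, DEFAULT_MOUTH_POSITION)
        frames.append(Frame(new_position, phoneme.start_s, phoneme.end_s))
    return frames


if __name__ == "__main__":
    speech = [Phoneme("M", 0.0, 0.1), Phoneme("AA", 0.1, 0.3), Phoneme("ZZ", 0.3, 0.4)]
    for frame in build_animation_frames(speech):
        print(frame)
```

Pairing each mouth position with the start and end time of its phoneme gives the player what it needs to show the re-wrapped mouth region while the matching portion of audio plays.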
111 Citations
18 Claims
1. A method of animating speech of an avatar representing a participant in a mobile communication, the method comprising:
selecting, by a computer, from data storage, one or more images to represent the participant;
selecting, by the computer, from data storage, a generic animation template for the participant, the generic animation template having a mouth and at least one emotive feature, the mouth characterized by a mouth position;
fitting, by the computer, the one or more images with the generic animation template;
texture wrapping, by the computer, the one or more images over the generic animation template;
displaying, by the computer, the one or more images texture wrapped over the generic animation template;
receiving, by the computer, an audio speech signal derived from the mobile communication of the participant;
identifying, by the computer, from the audio speech signal, a series of phonemes and one or more points of voice inflection greater than a predetermined threshold, each phoneme in the series of phonemes representing a portion of the audio speech signal;
for each phoneme in the series of phonemes:
identifying, by the computer, a new mouth position for the mouth of the generic animation template;
altering, by the computer, the mouth position of the mouth of the generic animation template to the new mouth position;
texture wrapping, by the computer, a portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template;
displaying, by the computer, the texture wrapped portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template; and
playing, by the computer, synchronously with the displayed texture wrapped portion of the one or more images, the portion of the audio speech signal represented by the phoneme; and
for each point of voice inflection of the one or more points of inflection greater than the predetermined threshold, triggering, by the computer, a motion key-frame caption that alters display of the at least one emotive feature synchronously with playing, by the computer, a portion of the audio speech signal including the point of voice inflection greater than the predetermined threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 11, 12, 13, 14
8. A method of animating speech of an avatar representing a participant in a mobile communication, the method comprising:
selecting, by a computer, from data storage, one or more images to represent the participant;
selecting, by the computer, from data storage, a generic animation template for the participant, the generic animation template having a mouth, the mouth characterized by a mouth position;
fitting, by the computer, the one or more images with the generic animation template;
texture wrapping, by the computer, the one or more images over the generic animation template;
displaying, by the computer, the one or more images texture wrapped over the generic animation template; and
receiving, by the computer, an audio speech signal derived from the mobile communication of the participant;
identifying, by the computer, a vocal pattern from a particular portion of the audio speech signal;
determining, by the computer, whether the vocal pattern matches a predetermined vocal pattern;
identifying, by the computer, from the audio speech signal, a series of phonemes, each phoneme in the series of phonemes representing a portion of the audio speech signal;
for each phoneme in the series of phonemes:
identifying, by the computer, a new mouth position for the mouth of the generic animation template;
altering, by the computer, the mouth position of the mouth of the generic animation template to the new mouth position;
texture wrapping, by the computer, a portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template;
displaying, by the computer, the texture wrapped portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template; and
playing, by the computer, synchronously with the displayed texture wrapped portion of the one or more images, the portion of the audio speech signal represented by the phoneme; and
if the vocal pattern of the particular portion of the audio speech signal matches the predetermined vocal pattern, displaying, by the computer, an indication of the predetermined vocal pattern synchronously with playing, by the computer, the particular portion of the audio speech signal.
Dependent claims: 9, 10
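The vocal pattern limitations of claim 8 (identify a pattern from a portion of the signal, test it against a predetermined pattern, and display an indication while that portion plays) might be sketched as below. The claim leaves the form of a vocal pattern open, so this example assumes a pattern is a pair of mean pitch and mean energy and that a match means both values fall within a tolerance. `VocalPattern`, `Indication`, the tolerances, and the label are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class VocalPattern:
    mean_pitch_hz: float  # assumed feature: average pitch of the portion
    mean_energy: float    # assumed feature: average normalized energy (0..1)


@dataclass
class Indication:
    label: str      # text or icon name to display (assumption)
    start_s: float  # display start, aligned with the audio portion
    end_s: float    # display end, aligned with the audio portion


def extract_pattern(pitch_hz: Sequence[float], energy: Sequence[float]) -> VocalPattern:
    """Identify a vocal pattern from one portion of the audio speech signal."""
    return VocalPattern(sum(pitch_hz) / len(pitch_hz), sum(energy) / len(energy))


def matches(observed: VocalPattern, predetermined: VocalPattern,
            pitch_tol_hz: float = 25.0, energy_tol: float = 0.15) -> bool:
    """Decide whether the observed pattern matches the predetermined one."""
    return (abs(observed.mean_pitch_hz - predetermined.mean_pitch_hz) <= pitch_tol_hz
            and abs(observed.mean_energy - predetermined.mean_energy) <= energy_tol)


def indication_for_portion(pitch_hz, energy, start_s, end_s,
                           predetermined: VocalPattern, label: str) -> Optional[Indication]:
    """Return an indication timed to the portion if its pattern matches, else None."""
    observed = extract_pattern(pitch_hz, energy)
    if matches(observed, predetermined):
        return Indication(label, start_s, end_s)
    return None


if __name__ == "__main__":
    raised_voice = VocalPattern(mean_pitch_hz=240.0, mean_energy=0.8)
    print(indication_for_portion([230, 250, 245], [0.7, 0.9, 0.8], 1.0, 1.6,
                                 raised_voice, "raised voice"))
```

Returning the indication with the same start and end times as the analyzed portion lets a player show it for exactly the span in which that portion of audio is heard.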
15. A system for animating speech of an avatar representing a participant in a mobile communication, the system configured to display the avatar on a display screen of a mobile communications device, the system comprising:
one or more processors, one or more computer-readable memories, and one or more computer-readable tangible storage devices;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to select, from data storage, one or more images to represent the participant;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to select, from data storage, a generic animation template for the participant, the generic animation template having a mouth and at least one emotive feature, the mouth characterized by a mouth position;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to fit the one or more images with the generic animation template;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to texture wrap the one or more images over the generic animation template;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to display the one or more images texture wrapped over the generic animation template;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to receive an audio speech signal derived from the mobile communication of the participant;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to identify, from the audio speech signal, a series of phonemes and one or more points of voice inflection greater than a predetermined threshold, each phoneme in the series of phonemes representing a portion of the audio speech signal;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to, for each phoneme in the series of phonemes:
identify a new mouth position for the mouth of the generic animation template;
alter the mouth position of the mouth of the generic animation template to the new mouth position;
texture wrap a portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template;
display the texture wrapped portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template; and
play synchronously with the displayed texture wrapped portion of the one or more images, the portion of the audio speech signal represented by the phoneme; and
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to, for each point of voice inflection of the one or more points of inflection greater than the predetermined threshold, trigger a motion key-frame caption that alters display of at least one emotive feature synchronously with playing a portion of the audio speech signal including the point of voice inflection greater than the predetermined threshold.
16. A computer program product for animating speech of an avatar representing a participant in a mobile communication, the computer program product comprising:
one or more computer-readable tangible storage devices;
program instructions, stored on at least one of the one or more storage devices, to select, from data storage, one or more images to represent the participant;
program instructions, stored on at least one of the one or more storage devices, to select, from data storage, a generic animation template for the participant, the generic animation template having a mouth and at least one emotive feature, the mouth characterized by a mouth position;
program instructions, stored on at least one of the one or more storage devices, to fit the one or more images with the generic animation template;
program instructions, stored on at least one of the one or more storage devices, to texture wrap the one or more images over the generic animation template;
program instructions, stored on at least one of the one or more storage devices, to display the one or more images texture wrapped over the generic animation template;
program instructions, stored on at least one of the one or more storage devices, to receive an audio speech signal derived from the mobile communication of the participant;
program instructions, stored on at least one of the one or more storage devices, to identify from the audio speech signal, a series of phonemes and one or more points of voice inflection greater than a predetermined threshold, each phoneme in the series of phonemes representing a portion of the audio speech signal;
program instructions, stored on at least one of the one or more storage devices, to, for each phoneme in the series of phonemes:
identify a new mouth position for the mouth of the generic animation template;
alter the mouth position of the mouth of the generic animation template to the new mouth position;
texture wrap a portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template;
display the texture wrapped portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template; and
play synchronously with the displayed texture wrapped portion of the one or more images, the portion of the audio speech signal represented by the phoneme; and
program instructions, stored on at least one of the one or more storage devices, to, for each point of voice inflection of the one or more points of inflection greater than the predetermined threshold, trigger a motion key-frame caption that alters display of at least one emotive feature synchronously with playing a portion of the audio speech signal including the point of voice inflection greater than the predetermined threshold.
17. A system for animating speech of an avatar representing a participant in a mobile communication, the system configured to display the avatar on a display screen of a mobile communications device, the system comprising:
one or more processors, one or more computer-readable memories, and one or more computer-readable tangible storage devices;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to select, from data storage, one or more images to represent the participant;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to select, from data storage, a generic animation template for the participant, the generic animation template having a mouth, the mouth characterized by a mouth position;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to fit the one or more images with the generic animation template;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to texture wrap the one or more images over the generic animation template;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to display the one or more images texture wrapped over the generic animation template;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to receive an audio speech signal derived from the mobile communication of the participant;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to identify a vocal pattern from a particular portion of the audio speech signal;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to determine whether the vocal pattern matches a predetermined vocal pattern;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to identify, from the audio speech signal, a series of phonemes, each phoneme in the series of phonemes representing a portion of the audio speech signal;
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to, for each phoneme in the series of phonemes:
identify a new mouth position for the mouth of the generic animation template;
alter the mouth position of the mouth of the generic animation template to the new mouth position;
texture wrap a portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template;
display the texture wrapped portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template; and
play synchronously with the displayed texture wrapped portion of the one or more images, the portion of the audio speech signal represented by the phoneme; and
program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to, if the vocal pattern of the particular portion of the audio speech signal matches the predetermined vocal pattern, display an indication of the predetermined vocal pattern synchronously with playing the particular portion of the audio speech signal.
18. A computer program product for animating speech of an avatar representing a participant in a mobile communication, the computer program product comprising:
one or more computer-readable tangible storage devices;
program instructions, stored on at least one of the one or more storage devices, to select, from data storage, one or more images to represent the participant;
program instructions, stored on at least one of the one or more storage devices, to select, from data storage, a generic animation template for the participant, the generic animation template having a mouth, the mouth characterized by a mouth position;
program instructions, stored on at least one of the one or more storage devices, to fit the one or more images with the generic animation template;
program instructions, stored on at least one of the one or more storage devices, to texture wrap the one or more images over the generic animation template;
program instructions, stored on at least one of the one or more storage devices, to display the one or more images texture wrapped over the generic animation template;
program instructions, stored on at least one of the one or more storage devices, to receive an audio speech signal derived from the mobile communication of the participant;
program instructions, stored on at least one of the one or more storage devices, to identify a vocal pattern from a particular portion of the audio speech signal;
program instructions, stored on at least one of the one or more storage devices, to determine whether the vocal pattern matches a predetermined vocal pattern;
program instructions, stored on at least one of the one or more storage devices, to identify from the audio speech signal, a series of phonemes, each phoneme in the series of phonemes representing a portion of the audio speech signal;
program instructions, stored on at least one of the one or more storage devices, to, for each phoneme in the series of phonemes:
identify a new mouth position for the mouth of the generic animation template;
alter the mouth position of the mouth of the generic animation template to the new mouth position;
texture wrap a portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template;
display the texture wrapped portion of the one or more images corresponding to the altered mouth position of the mouth of the generic animation template; and
play synchronously with the displayed texture wrapped portion of the one or more images, the portion of the audio speech signal represented by the phoneme; and
program instructions, stored on at least one of the one or more storage devices, to, if the vocal pattern of the particular portion of the audio speech signal matches the predetermined vocal pattern, display an indication of the predetermined vocal pattern synchronously with playing the particular portion of the audio speech signal.
Specification