Systems and Methods of Rendering a Textual Animation
Abstract
Systems and methods of rendering a textual animation are provided. The methods include receiving an audio sample of an audio signal that is being rendered by a media rendering source. The methods also include receiving one or more descriptors for the audio signal based on at least one of a semantic vector, an audio vector, and an emotion vector. Based on the one or more descriptors, a client device may render the textual transcriptions of vocal elements of the audio signal in an animated manner. The client device may further render the textual transcriptions of the vocal elements of the audio signal to be substantially in synchrony to the audio signal being rendered by the media rendering source. In addition, the client device may further receive an identification of a song corresponding to the audio sample, and may render lyrics of the song in an animated manner.
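As a rough illustration of how a client might turn received descriptors into animation parameters, the following minimal sketch models each descriptor as a track of values over time. The scalar encoding, the names `TimedVector`, `Descriptors`, and `animation_style`, and the specific mapping to font scale and color are all assumptions made for illustration; the text above does not specify how the vectors are encoded or how they drive the animation.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TimedVector:
    """Hypothetical descriptor track: values sampled at increasing times (seconds)."""
    times: Sequence[float]
    values: Sequence[float]

    def at(self, t: float) -> float:
        """Return the most recent value at or before t (first value if t is earlier)."""
        i = bisect_right(self.times, t) - 1
        return self.values[max(i, 0)]


@dataclass(frozen=True)
class Descriptors:
    # Assumed scalar stand-ins; the real encoding is not given in the source.
    audio_energy: TimedVector      # stands in for the "audio vector"
    emotion_valence: TimedVector   # stands in for the "emotion vector"


def animation_style(d: Descriptors, t: float) -> dict:
    """Pick illustrative animation parameters for text displayed at time t."""
    energy = d.audio_energy.at(t)
    valence = d.emotion_valence.at(t)
    return {
        "font_scale": 1.0 + 0.5 * energy,             # more energy -> larger text
        "color": "warm" if valence >= 0 else "cool",  # mood -> palette
    }
```

With tracks that change at the 10-second mark, `animation_style(d, 5.0)` and `animation_style(d, 12.0)` would return different styles, which is the sense in which the descriptors vary "as a function of time" over the length of the signal.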
33 Claims
1. A method of rendering a textual animation, comprising:
- receiving an audio sample of an audio signal comprising at least one of audio elements and vocal elements, the audio signal being rendered by a media rendering source;
- sending the audio sample to a server;
- in response to sending the audio sample to the server, receiving one or more descriptors for the audio signal based on at least one of a semantic vector, an audio vector, and an emotion vector, wherein the semantic vector indicates a semantic content of corresponding textual transcriptions of vocal elements of the audio signal as a function of time with respect to a length of the audio signal, wherein the audio vector indicates an audio content of audio elements of the audio signal as a function of time with respect to a length of the audio signal, and wherein the emotion vector indicates an emotional content of audio elements of the audio signal as a function of time with respect to a length of the audio signal; and
- based on the one or more descriptors, a client device rendering the textual transcriptions of vocal elements of the audio signal in an animated manner.
Dependent claims: 2-17.
18. A method comprising:
- receiving an audio sample;
- determining an identification of a song corresponding to the audio sample, the song comprising at least one of audio elements and vocal elements;
- retrieving one or more descriptors for the song based on at least one of a semantic vector, an audio vector, and an emotion vector, wherein the semantic vector indicates a semantic content of corresponding textual transcriptions of the vocal elements as a function of time with respect to a length of the song, wherein the audio vector indicates an audio content of the audio elements as a function of time with respect to a length of the song, and wherein the emotion vector indicates an emotional content of the audio elements as a function of time with respect to a length of the song; and
- sending to a client device the one or more descriptors.
Dependent claims: 19-29.
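The server-side flow of claim 18 (identify the song, retrieve its descriptors, send them to the client) can be sketched as below. This is only an outline under stated assumptions: the function name `handle_sample`, the pluggable `identify` callable, and the dictionary-based `descriptor_store` are hypothetical, and the claim does not say how identification or storage is actually performed.

```python
from typing import Callable, Dict, Optional


def handle_sample(
    sample: bytes,
    identify: Callable[[bytes], Optional[str]],
    descriptor_store: Dict[str, dict],
) -> Optional[dict]:
    """Return the payload for the client, or None if nothing can be sent.

    `identify` is an assumed placeholder for whatever content-identification
    step maps an audio sample to a song identifier.
    """
    song_id = identify(sample)
    if song_id is None:
        return None  # the sample did not match any known song

    descriptors = descriptor_store.get(song_id)
    if descriptors is None:
        return None  # song recognized, but no descriptors are stored for it

    # Payload that would be sent to the client device.
    return {"song_id": song_id, "descriptors": descriptors}
```

Passing the identification step in as a callable keeps the sketch neutral about the recognition technique, which the claim leaves open.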
30. A non-transitory computer readable storage medium having stored therein instructions executable by a computing device to cause the computing device to perform functions of:
- receiving an audio sample of an audio signal comprising at least one of audio elements and vocal elements, the audio signal being rendered by a media rendering source;
- sending the audio sample to a server;
- in response to sending the audio sample to the server, receiving one or more descriptors for the audio signal based on at least one of a semantic vector, an audio vector, and an emotion vector, wherein the semantic vector indicates a semantic content of corresponding textual transcriptions of vocal elements of the audio signal as a function of time with respect to a length of the audio signal, wherein the audio vector indicates an audio content of the audio elements of the audio signal as a function of time with respect to a length of the audio signal, and wherein the emotion vector indicates an emotional content of the audio elements of the audio signal as a function of time with respect to a length of the audio signal; and
- based on the set of descriptors, rendering the textual transcriptions of vocal elements of the audio signal in an animated manner.
31. A method of rendering a textual animation, comprising:
- receiving an audio sample of an audio signal comprising at least one of audio elements and vocal elements, the audio signal being rendered by a media rendering source;
- determining an identification of a song corresponding to the audio sample and lyrics corresponding to the vocal elements;
- receiving a set of descriptors for the song based on at least one of a semantic vector, an audio vector, and an emotion vector, wherein the semantic vector indicates a semantic content of the lyrics as a function of time with respect to a length of the song, wherein the audio vector indicates an audio content of audio elements of the song as a function of time with respect to a length of the song, and wherein the emotion vector indicates an emotional content of audio elements of the song as a function of time with respect to a length of the song;
- receiving a time offset indicating a time position in the audio signal corresponding to a beginning time of the audio sample;
- determining a real-time offset using a real-time timestamp, a beginning time of the audio sample, and the time offset, wherein the real-time timestamp indicates a present time; and
- based on the set of descriptors, a client device rendering the lyrics in an animated manner at a time corresponding to the real-time offset and substantially in synchrony to the audio signal being rendered by the media rendering source.
Dependent claims: 32, 33.
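Claim 31 names the three inputs to the real-time offset but does not recite a formula. One plausible reading, assumed here, is that the current position in the song equals the time offset plus the time elapsed since the sample began, with playback at normal speed. The function names `real_time_offset` and `current_lyric` and the `(start_time, text)` lyric format are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def real_time_offset(now: float, sample_start: float, time_offset: float) -> float:
    """Estimate the current position in the song, in seconds.

    Assumed formula (not recited in the claim):
        position = time_offset + (now - sample_start)
    where `now` and `sample_start` are timestamps on the same clock and
    `time_offset` is where in the song the sample began.
    """
    return time_offset + (now - sample_start)


def current_lyric(lines: List[Tuple[float, str]], position: float) -> Optional[str]:
    """Return the lyric line active at `position`, given lines sorted by start time."""
    starts = [start for start, _ in lines]
    i = bisect_right(starts, position) - 1
    return lines[i][1] if i >= 0 else None
```

For example, if the sample began at clock time 100.0 s at a position 42.0 s into the song, then at clock time 105.0 s the estimated position is 47.0 s, and the client would display whichever lyric line starts at or before that point.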
Specification