GENERATION OF TIMED TEXT USING SPEECH-TO-TEXT TECHNOLOGY, AND APPLICATIONS THEREOF
First Claim
1. A method, comprising:
generating intermediate timed text data based on audio of a video, the intermediate timed text data comprising timing data to enable a text transcription to be generated when playing the video;
determining the text transcription of the audio based on the intermediate timed text data in response to a request to play the video; and
transmitting the text transcription of the audio to a client computing device for display along with the video.
Abstract
Embodiments relate to generation of timed text in web video. In an embodiment, a computer-implemented method generates timed text for online video. In the method, a request to play a timed text track of a video incorporated into a web video service is received from a client computing device. Prior to receipt of the request, audio of the video is processed to determine intermediate timed text data. The intermediate timed text data lacks a complete text transcription of the audio, but includes data to enable the complete text transcription to be generated when playing the video. In response to receipt of the request, a text transcription of the audio is determined using the intermediate data with an automated speech-to-text algorithm. Finally, the text transcription of the audio is sent to the client computing device for display along with the video.
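The two-phase flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the source says only that the intermediate data includes timing data and lacks a complete transcription, so representing it as a list of speech segment boundaries is an assumption. The names `Segment`, `IntermediateTimedText`, `generate_intermediate`, `transcribe_on_request`, and `fake_recognizer` are hypothetical, and the recognizer is a stand-in for whatever automated speech-to-text algorithm a real service would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Segment:
    """One stretch of speech, in seconds from the start of the video."""
    start: float
    end: float


@dataclass
class IntermediateTimedText:
    """Timing data only; deliberately holds no transcript text (assumed layout)."""
    video_id: str
    segments: List[Segment]


# Hypothetical recognizer signature: (video_id, segment) -> recognized text.
Recognizer = Callable[[str, Segment], str]


def generate_intermediate(
    video_id: str, speech_intervals: List[Tuple[float, float]]
) -> IntermediateTimedText:
    """Phase 1, run before any play request: keep ordered, non-empty intervals."""
    segments = [Segment(s, e) for s, e in sorted(speech_intervals) if e > s]
    return IntermediateTimedText(video_id, segments)


def transcribe_on_request(
    data: IntermediateTimedText, recognize: Recognizer
) -> List[Tuple[float, float, str]]:
    """Phase 2, run when a client asks to play the timed text track."""
    return [(seg.start, seg.end, recognize(data.video_id, seg)) for seg in data.segments]


def fake_recognizer(video_id: str, seg: Segment) -> str:
    """Placeholder for a real speech-to-text call; returns a label, not speech."""
    return f"[speech {seg.start:.1f}-{seg.end:.1f}s]"


if __name__ == "__main__":
    # Phase 1: precompute and store the intermediate data.
    store = {"v1": generate_intermediate("v1", [(3.0, 5.5), (0.0, 2.5), (6.0, 6.0)])}
    # Phase 2: a play request arrives; build the cues to send to the client.
    for start, end, text in transcribe_on_request(store["v1"], fake_recognizer):
        print(start, end, text)
```

In this sketch the cheap timing pass is done once per video, while the text is produced only when a play request arrives, which mirrors the ordering the abstract describes.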
20 Claims
1. A method, comprising:
generating intermediate timed text data based on audio of a video, the intermediate timed text data comprising timing data to enable a text transcription to be generated when playing the video;
determining the text transcription of the audio based on the intermediate timed text data in response to a request to play the video; and
transmitting the text transcription of the audio to a client computing device for display along with the video.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system, comprising:
a memory to store a video;
a processor to:
generate intermediate timed text data based on audio of the video, the intermediate timed text data comprising timing data to enable a text transcription to be generated when playing the video;
determine the text transcription of the audio based on the intermediate timed text data in response to a request to play the video; and
transmit the text transcription of the audio to a client computing device for display along with the video.
Dependent claims: 10, 11, 12, 13, 14
15. A non-transitory computer readable storage medium having instructions that, when executed by a processing device, cause the processing device to perform operations comprising:
generating intermediate timed text data based on audio of a video, the intermediate timed text data comprising timing data to enable a text transcription to be generated when playing the video;
determining the text transcription of the audio based on the intermediate timed text data in response to a request to play the video; and
transmitting the text transcription of the audio to a client computing device for display along with the video.
Dependent claims: 16, 17, 18, 19, 20
Specification