Method and apparatus for annotating video content with metadata generated using speech recognition technology
Abstract
A method and apparatus are provided for annotating video content with metadata generated using speech recognition technology. The method begins by rendering video content on a display device. A segment of speech is received from a user such that the speech segment annotates a portion of the video content currently being rendered. The speech segment is converted to a text-segment and the text-segment is associated with the rendered portion of the video content. The text-segment is stored in a selectively retrievable manner so that it is associated with the rendered portion of the video content.
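The steps summarized in the abstract (receive speech, convert it to text, associate the text with the portion being rendered, store it retrievably) can be illustrated with a minimal sketch. It is not taken from the patent: the names `Annotation`, `AnnotationStore`, and `annotate` are hypothetical, the recognizer is a caller-supplied placeholder rather than a real speech recognition engine, and representing a "portion" as a start/end time in seconds is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Annotation:
    video_id: str
    start_s: float  # start of the annotated portion, in seconds (assumed representation)
    end_s: float    # end of the annotated portion, in seconds
    text: str       # text-segment produced from the speech segment


class AnnotationStore:
    """In-memory stand-in for storing text-segments in a selectively retrievable manner."""

    def __init__(self) -> None:
        self._by_video: Dict[str, List[Annotation]] = {}

    def add(self, ann: Annotation) -> None:
        self._by_video.setdefault(ann.video_id, []).append(ann)

    def at(self, video_id: str, position_s: float) -> List[Annotation]:
        """Return annotations whose portion covers the given playback position."""
        return [a for a in self._by_video.get(video_id, [])
                if a.start_s <= position_s <= a.end_s]

    def search(self, term: str) -> List[Annotation]:
        """Return annotations whose text contains the term (case-insensitive)."""
        needle = term.lower()
        return [a for anns in self._by_video.values() for a in anns
                if needle in a.text.lower()]


def annotate(store: AnnotationStore,
             recognize: Callable[[bytes], str],
             audio: bytes,
             video_id: str,
             start_s: float,
             end_s: float) -> Annotation:
    """Convert a speech segment to text, associate it with a portion, and store it."""
    text = recognize(audio)                        # speech segment -> text-segment
    ann = Annotation(video_id, start_s, end_s, text)  # associate with the portion
    store.add(ann)                                 # store retrievably
    return ann
```

With a recognizer plugged in, a stored text-segment could then be looked up either by playback position (`store.at("match-01", 65.0)`) or by keyword (`store.search("goal")`), which is one possible reading of "selectively retrievable".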
38 Citations
22 Claims
1. A method for annotation of video content in a device communicatively coupled to a network, the method comprising:
receiving, in the device, a captured speech segment comprising speech from a user of a second device, wherein the captured speech segment annotates a portion of the video content streamed to the second device for being played to the user contemporaneously with the speech from the user;
converting the captured speech segment to a text-segment;
associating the text-segment with the portion of the video content contemporaneously played to the user; and
storing in a selectively retrievable manner the text-segment so that the text-segment is associated with the portion of the video content.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
2. An apparatus for annotation of a video content, the apparatus comprising:
a memory; and
a processor communicatively coupled to the memory and to a network interface, the processor configured to be communicatively coupled via the network interface to a network;
the processor further configured to receive, via the network interface, a captured speech segment comprising speech from a user of a second device coupled to the network, wherein the captured speech segment annotates a portion of the video content streamed to the second device for being played to the user contemporaneously with the speech from the user;
the processor further configured to convert the captured speech segment to a text-segment, to associate the text-segment with the portion of the video content contemporaneously played to the user; and
to store in a selectively retrievable manner the text-segment so that the text-segment is associated with the portion of the video content.
15. A method for annotation of video content in a device communicatively coupled to a network, the method comprising:
receiving, in the device, a text-segment of recognized speech comprising recognized speech from a user of a second device coupled to the network, wherein the text-segment annotates a portion of the video content streamed to the second device for being played to the user contemporaneously with the speech from the user;
associating the text-segment with the portion of the video content; and
storing in a selectively retrievable manner the text-segment so that it is associated with the portion of the video content.
Dependent claims: 16, 17, 18
19. A method for annotation of video content in a device communicatively coupled to a network, the method comprising:
receiving, in the device, a text-segment of recognized speech comprising recognized speech from a user of a second device, wherein the text-segment annotates a portion of the video content streamed to the second device for being played to the user contemporaneously with the speech from the user;
receiving, in the device, metadata comprising a timestamp for associating the text-segment with the portion of the video content; and
storing in a selectively retrievable manner the text-segment so that it is associated with the portion of the video content.
Dependent claims: 20, 21, 22
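Claim 19 differs from claim 1 in that the device receives already-recognized text together with timestamp metadata, rather than raw speech. The sketch below illustrates one way that could look; it is not part of the claims. The JSON message shape (`video_id`, `timestamp_s`, `text`), the fixed 10-second portion length, the SQLite table layout, and all function names are assumptions made for illustration only.

```python
import json
import sqlite3
from typing import List

SEGMENT_SECONDS = 10  # assumed fixed portion length; the claim does not specify one


def open_store() -> sqlite3.Connection:
    """Create an in-memory table holding text-segments keyed by video portion."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE annotations ("
        "video_id TEXT NOT NULL, segment INTEGER NOT NULL, "
        "timestamp_s REAL NOT NULL, text TEXT NOT NULL)"
    )
    return conn


def handle_message(conn: sqlite3.Connection, raw: str) -> int:
    """Store a received text-segment and return the portion index it maps to."""
    msg = json.loads(raw)  # hypothetical message sent by the second device
    timestamp_s = float(msg["timestamp_s"])
    segment = int(timestamp_s // SEGMENT_SECONDS)  # timestamp -> portion of the video
    conn.execute(
        "INSERT INTO annotations VALUES (?, ?, ?, ?)",
        (msg["video_id"], segment, timestamp_s, msg["text"]),
    )
    conn.commit()
    return segment


def texts_for_segment(conn: sqlite3.Connection, video_id: str, segment: int) -> List[str]:
    """Retrieve the text-segments associated with one portion, in playback order."""
    rows = conn.execute(
        "SELECT text FROM annotations "
        "WHERE video_id = ? AND segment = ? ORDER BY timestamp_s",
        (video_id, segment),
    ).fetchall()
    return [row[0] for row in rows]
```

Under these assumptions, the timestamp alone is enough to place the text-segment against the video, so the receiving device never needs the audio itself.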
Specification