Recording audio metadata for stored images
First Claim
1. A method of processing audio signals including speech signals, the audio signals and image data being recorded in a media file, comprising:
a. automatically extracting the speech signals from the audio signals from the media file and converting the speech signals to textual metadata wherein the textual metadata are keywords recognized from a pre-determined vocabulary;
b. automatically analyzing the textual metadata using natural language processing algorithms to identify people's names, place names, or object names and adding the identified names to the textual metadata;
c. using the updated textual metadata to compute a commentary value metric wherein the commentary value metric is a measure of the amount of viewer commentary associated with the media file;
d. automatically semantically analyzing the image data from the media file to identify a person, place, object or activity to produce a visual display of selected portions of the image data, and prompt the user to provide additional textual metadata associated with each of the selected portions of the image data; and
e. associating the updated textual metadata automatically obtained from the speech signals in the media file, the additional textual metadata provided by the user during the display of the selected portions of the image data from the media file, and the commentary value metric with the media file.
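As a rough illustration of steps a, b, c, and e, the following is a minimal sketch and not the patented implementation. It assumes speech recognition has already produced a transcript, substitutes a simple name-list lookup for the natural language processing algorithms, and takes the commentary value metric to be a count of recognized metadata items. The names `VOCABULARY`, `KNOWN_NAMES`, `MediaFile`, and `annotate` are hypothetical. The image analysis and user prompt of step d are not modeled; the user's text is simply passed in.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pre-determined vocabulary and name lists (illustrative only).
VOCABULARY = {"birthday", "beach", "cake", "sunset"}
KNOWN_NAMES = {
    "person": {"alice", "bob"},
    "place": {"paris", "rochester"},
    "object": {"cake", "bicycle"},
}


@dataclass
class MediaFile:
    """Stand-in for a media file that can carry associated metadata."""
    path: str
    metadata: dict = field(default_factory=dict)


def tokenize(transcript: str) -> list:
    """Lowercase the transcript and split it into word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def extract_keywords(tokens: list) -> list:
    """Step a (assumed): keep only words found in the fixed vocabulary."""
    return sorted(set(tokens) & VOCABULARY)


def identify_names(tokens: list) -> dict:
    """Step b (assumed): a name-list lookup standing in for NLP analysis."""
    return {kind: sorted(set(tokens) & names) for kind, names in KNOWN_NAMES.items()}


def commentary_value(keywords: list, names: dict) -> int:
    """Step c (assumed): count metadata items as the amount of commentary."""
    return len(keywords) + sum(len(found) for found in names.values())


def annotate(media: MediaFile, transcript: str, user_text: dict) -> MediaFile:
    """Step e: associate all metadata and the metric with the media file."""
    tokens = tokenize(transcript)
    keywords = extract_keywords(tokens)
    names = identify_names(tokens)
    media.metadata = {
        "keywords": keywords,
        "names": names,
        "user_text": dict(user_text),  # text the user supplied for selected image portions
        "commentary_value": commentary_value(keywords, names),
    }
    return media


media = annotate(
    MediaFile("party.mov"),
    "Alice cut the cake at the beach in Paris!",
    {"region_1": "Alice's 30th birthday"},
)
print(media.metadata)
```

Other definitions of the metric, such as the duration of detected speech or the total word count, would fit the claim language equally well; the count used here is only one plausible reading.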
5 Assignments
0 Petitions
Abstract
A method of processing audio signals recorded during display of image data from a media file on a display device to produce semantic understanding data and associating such data with the original media file, includes: separating a desired audio signal from the aggregate mixture of audio signals; analyzing the separated signal for purposes of gaining semantic understanding; and associating the semantic information obtained from the audio signals recorded during image display with the original media file.
50 Citations
15 Claims
Specification