System and method for inserting a description of images into audio recordings
First Claim
1. A method of inserting a description of an image into an audio recording, comprising:
interpreting, using a computer device, an image, including:
applying a template to the image to extract file content;
constructing a logical structure from the image;
populating the logical structure with the extracted file content; and
producing a word description of the image including at least one image keyword from the interpreted image;
parsing, using the computer device, an audio recording into a plurality of audio clips;
producing a transcription of each audio clip;
extracting a plurality of noun phrases from the transcription;
calculating an importance value for each noun phrase and identifying at least one audio keyword based upon the importance value;
calculating, using the computer device, a similarity distance between the at least one image keyword and the at least one audio keyword of each audio clip; and
selecting, using the computer device, the audio clip transcription having a shortest similarity distance to the at least one image keyword as a location to insert the word description of the image.
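The keyword and selection steps of the claim can be illustrated with a short sketch. The claim does not say how the importance value or the similarity distance is computed, so the choices below are assumptions made only for illustration: importance is a TF-IDF-style score across clips, and the similarity distance is the Jaccard distance between word sets. Noun-phrase extraction is assumed to have already been done on each transcription, and the names `audio_keywords`, `similarity_distance` and `select_clip` are hypothetical.

```python
import math
from collections import Counter


def audio_keywords(clip_phrases, top_n=2):
    """Return the top_n noun phrases of each clip, ranked by an assumed
    TF-IDF-style importance value."""
    n_clips = len(clip_phrases)
    # Number of clips in which each phrase occurs at least once.
    clip_freq = Counter(p for phrases in clip_phrases for p in set(phrases))
    keywords = []
    for phrases in clip_phrases:
        counts = Counter(phrases)
        importance = {
            p: (c / len(phrases)) * (math.log(n_clips / clip_freq[p]) + 1.0)
            for p, c in counts.items()
        }
        # Highest importance first; ties are broken alphabetically.
        ranked = sorted(importance, key=lambda p: (-importance[p], p))
        keywords.append(ranked[:top_n])
    return keywords


def similarity_distance(image_keywords, clip_keywords):
    """Assumed metric: Jaccard distance between word sets (0.0 = identical)."""
    a = {w for k in image_keywords for w in k.lower().split()}
    b = {w for k in clip_keywords for w in k.lower().split()}
    if not a or not b:
        return 1.0
    return 1.0 - len(a & b) / len(a | b)


def select_clip(image_keywords, clip_phrases, top_n=2):
    """Return the index of the clip with the shortest distance, plus all distances."""
    keywords = audio_keywords(clip_phrases, top_n)
    distances = [similarity_distance(image_keywords, k) for k in keywords]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances


# Illustrative data: noun phrases already extracted from three clip transcriptions.
clip_phrases = [
    ["quarterly budget", "budget", "finance team"],
    ["sales chart", "regional sales", "sales chart"],
    ["next meeting", "action items"],
]
image_keywords = ["sales chart", "bar chart"]

best, distances = select_clip(image_keywords, clip_phrases)
print(best, distances)  # clip index 1 has the shortest distance
```

A semantic measure (for example, one based on a thesaurus or word embeddings) could be substituted for the Jaccard distance without changing the selection logic.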
Abstract
There is disclosed a system and method for interpreting and describing graphic images. In an embodiment, the method of inserting a description of an image into an audio recording includes: interpreting an image and producing a word description of the image including at least one image keyword; parsing an audio recording into a plurality of audio clips, and producing a transcription of each audio clip, each audio clip transcription including at least one audio keyword; calculating a similarity distance between the at least one image keyword and the at least one audio keyword of each audio clip; and selecting the audio clip transcription having a shortest similarity distance to the at least one image keyword as a location to insert the word description of the image. The word description of the image can then be appended to the selected audio clip to produce an augmented audio recording including the interpreted word description of the image.
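The final step described in the abstract, appending the word description to the selected clip, can be sketched at the transcript level. This is a minimal illustration under stated assumptions: clips are represented by their transcriptions only, the `Clip` class and `augment` function are hypothetical names, and converting the description to speech and splicing actual audio are outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    transcript: str
    is_description: bool = False  # True for an inserted image description


def augment(clips, selected_index, description):
    """Return a new clip list with the description placed right after the selected clip."""
    if not 0 <= selected_index < len(clips):
        raise IndexError("selected_index is outside the clip list")
    augmented = list(clips)  # leave the original recording untouched
    augmented.insert(selected_index + 1, Clip(description, is_description=True))
    return augmented


recording = [
    Clip("Welcome to the quarterly review."),
    Clip("Regional sales rose in every quarter."),
    Clip("Next, the action items."),
]
augmented = augment(recording, 1, "The chart shows revenue by quarter.")
for clip in augmented:
    print(("[image] " if clip.is_description else "") + clip.transcript)
```

Returning a new list rather than modifying the input keeps the original recording available for comparison with the augmented one.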
21 Claims
1. A method of inserting a description of an image into an audio recording, comprising:
interpreting, using a computer device, an image, including:
applying a template to the image to extract file content;
constructing a logical structure from the image;
populating the logical structure with the extracted file content; and
producing a word description of the image including at least one image keyword from the interpreted image;
parsing, using the computer device, an audio recording into a plurality of audio clips;
producing a transcription of each audio clip;
extracting a plurality of noun phrases from the transcription;
calculating an importance value for each noun phrase and identifying at least one audio keyword based upon the importance value;
calculating, using the computer device, a similarity distance between the at least one image keyword and the at least one audio keyword of each audio clip; and
selecting, using the computer device, the audio clip transcription having a shortest similarity distance to the at least one image keyword as a location to insert the word description of the image.
Dependent claims: 2, 3, 4, 5, 6, 7
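The interpreting step (apply a template, construct a logical structure, populate it, produce a word description) can be sketched as follows. The claim does not define the template format or the logical structure, so everything here is an assumption for illustration: the image is taken to be already reduced to named text regions (for example by OCR, not shown), the template is a hypothetical mapping from region names to structure fields, and `ChartStructure`, `BAR_CHART_TEMPLATE`, `apply_template`, `populate` and `describe` are invented names.

```python
from dataclasses import dataclass, field


@dataclass
class ChartStructure:
    """Assumed logical structure for a chart image."""
    title: str = ""
    x_axis: str = ""
    y_axis: str = ""
    series: list = field(default_factory=list)


# Hypothetical template: region name in the image -> field of the structure.
BAR_CHART_TEMPLATE = {
    "header": "title",
    "bottom_label": "x_axis",
    "left_label": "y_axis",
    "legend": "series",
}


def apply_template(image_regions, template):
    """Extract the content of every region the template knows about."""
    return {template[name]: text
            for name, text in image_regions.items() if name in template}


def populate(structure, content):
    """Fill the logical structure with the extracted content."""
    for field_name, value in content.items():
        if field_name == "series":
            value = [s.strip() for s in value.split(",")]
        setattr(structure, field_name, value)
    return structure


def describe(structure):
    """Produce a word description and the image keywords drawn from it."""
    text = (f"Chart titled '{structure.title}' showing {structure.y_axis} "
            f"by {structure.x_axis} for {', '.join(structure.series)}.")
    keywords = [k.lower() for k in
                (structure.title, structure.y_axis, structure.x_axis) if k]
    return text, keywords


# Illustrative input: text regions assumed to have been located in the image.
regions = {
    "header": "Regional Sales",
    "bottom_label": "Quarter",
    "left_label": "Revenue",
    "legend": "East, West",
    "watermark": "draft",  # not in the template, so it is ignored
}
structure = populate(ChartStructure(), apply_template(regions, BAR_CHART_TEMPLATE))
description, image_keywords = describe(structure)
print(description)
print(image_keywords)
```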
8. A system for inserting a description of an image into an audio recording, comprising:
a computer hardware device, including:
an interpreting system for interpreting an image, including:
applying a template to the image to extract file content;
constructing a logical structure from the image;
populating the logical structure with the extracted file content; and
producing a word description of the image including at least one image keyword from the interpreted image;
a parsing system for parsing an audio recording into a plurality of audio clips for producing a transcription of each audio clip and for extracting a plurality of noun phrases from the transcription;
a calculating system for calculating an importance value for each noun phrase and identifying at least one audio keyword based upon the importance value; and
calculating a similarity distance between the at least one image keyword and the at least one audio keyword of each audio clip; and
a selecting system for selecting the audio clip transcription having a shortest similarity distance to the at least one image keyword as a location to insert the word description of the image.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A program product stored on a non-transitory computer readable medium, which when executed, inserts a description of an image into an audio recording, the non-transitory computer readable medium comprising program code for:
interpreting an image, including:
applying a template to the image to extract file content;
constructing a logical structure from the image;
populating the logical structure with the extracted file content; and
producing a word description of the image including at least one image keyword from the interpreted image;
parsing an audio recording into a plurality of audio clips;
producing a transcription of each audio clip;
extracting a plurality of noun phrases from the transcription;
calculating an importance value for each noun phrase and identifying at least one audio keyword based upon the importance value;
calculating a similarity distance between the at least one image keyword and the at least one audio keyword of each audio clip; and
selecting the audio clip transcription having a shortest similarity distance to the at least one image keyword as a location to insert the word description of the image.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification