AUDIO ANALYSIS LEARNING USING VIDEO DATA
Abstract
Audio analysis learning is performed using video data. Video data is obtained, on a first computing device, wherein the video data includes images of one or more people. Audio data is obtained, on a second computing device, which corresponds to the video data. A face is identified within the video data. A first voice, from the audio data, is associated with the face within the video data. The face within the video data is analyzed for cognitive content. Audio features are extracted corresponding to the cognitive content of the video data. The audio data is segmented to correspond to an analyzed cognitive state. An audio classifier is learned, on a third computing device, based on the analyzing of the face within the video data. Further audio data is analyzed using the audio classifier.
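The learning step described above can be illustrated with a minimal sketch. It assumes the face analysis has already produced one cognitive-state label per audio segment and that each segment has been reduced to a numeric feature vector. The label names, the two-element feature vectors, and the nearest-centroid classifier are all hypothetical choices made for illustration; the source does not specify a feature set or classifier type.

```python
from collections import defaultdict
from math import dist


def learn_audio_classifier(segments):
    """Learn one centroid per face-derived label.

    segments: iterable of (audio_features, face_label) pairs, where
    face_label is the cognitive state inferred from the face (assumed input).
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in segments:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        # Accumulate per-label feature sums so they can be averaged below.
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify_audio(centroids, features):
    """Label further audio by its nearest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training data: audio features paired with labels that
# would come from analyzing the associated face in the video.
training_segments = [
    ([0.9, 0.2], "engaged"),
    ([0.8, 0.3], "engaged"),
    ([0.1, 0.7], "bored"),
    ([0.2, 0.9], "bored"),
]

centroids = learn_audio_classifier(training_segments)

# Further audio data is analyzed with no video input at all.
prediction = classify_audio(centroids, [0.85, 0.25])
```

The point of the sketch is the data flow: the video supplies the training labels, while the learned classifier consumes audio features only, so it can later be applied to audio that has no accompanying video.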
29 Claims
1. A computer-implemented method for audio analysis comprising:
obtaining video data, on a first computing device, wherein the video data includes images of one or more people;
obtaining audio data, on a second computing device, corresponding to the video data;
identifying a face within the video data;
associating a first voice, from the audio data, with the face within the video data;
analyzing the face within the video data for cognitive content;
learning an audio classifier, on a third computing device, based on the analyzing of the face within the video data; and
analyzing further audio data using the audio classifier.
Dependent claims: 2-22, 26, 27.
23-25. (canceled)
28. A computer program product embodied in a non-transitory computer readable medium for audio analysis, the computer program product comprising code which causes one or more processors to perform operations of:
obtaining video data wherein the video data includes images of one or more people;
obtaining audio data corresponding to the video data;
identifying a face within the video data;
associating a first voice, from the audio data, with the face within the video data;
analyzing the face within the video data for cognitive content;
learning an audio classifier based on the analyzing of the face within the video data; and
analyzing further audio data using the audio classifier.
29. A computer system for audio analysis comprising:
a memory which stores instructions;
one or more processors attached to the memory wherein the one or more processors, when executing the instructions which are stored, are configured to:
obtain video data, on a first computing device, wherein the video data includes images of one or more people;
obtain audio data, on a second computing device, corresponding to the video data;
identify a face within the video data;
associate a first voice, from the audio data, with the face within the video data;
analyze the face within the video data for cognitive content;
learn an audio classifier, on a third computing device, based on the analyzing of the face within the video data; and
analyze further audio data using the audio classifier.
Specification