Exploiting multi-modal affect and semantics to assess the persuasiveness of a video
Abstract
Technologies for detecting persuasive multimedia content are disclosed. Affective and semantic concepts are extracted from the audio-visual content, along with the sentiment of associated comments, and the resulting features are analyzed and compared with a persuasiveness model.
Claims (20)
1. A method for determining the persuasiveness of a multimedia item, the method comprising, with a computing system comprising one or more computing devices:

using a visual feature extraction module, extracting a plurality of visual concept features from at least a portion of the multimedia item using automated machine learning techniques;

using an affective feature extraction module of the visual feature extraction module, automatically analyzing the extracted visual concept features to identify sentiments of the extracted visual concept features using a first, trained neural network;

using a semantic feature extraction module of the visual feature extraction module, automatically analyzing the extracted visual concept features to identify semantic concepts of the extracted visual concept features using a second, different, trained neural network;

using an audio feature extraction module, extracting an audio concept feature from at least a portion of the multimedia item using automated machine learning techniques;

identifying a text item associated with the multimedia item and extracting text from at least a portion of the text item using a comment extraction module; and

using a video persuasiveness prediction module: receiving the semantic concepts, sentiments, and visual concept features from the visual feature extraction module, the audio concept feature from the audio feature extraction module, and the text from the comment extraction module; comparing the semantic concepts, sentiments, and visual concept features, the audio concept feature, and the extracted text to semantic concepts, sentiments, visual concept features, audio concept features, and text in a persuasiveness model having respective measures of audience impact; and generating a measure of audience impact for the multimedia item based on the comparison, wherein the measure of audience impact is used to determine the persuasiveness of the multimedia item.

(Dependent claims 2-6 omitted here.)
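The flow recited in claim 1 can be illustrated with a minimal, self-contained sketch. All function names here are hypothetical, and the two trained neural networks, the audio extractor, and the comment analyzer are reduced to trivial stubs; only the claimed sequence, extracting multimodal features and comparing them against a persuasiveness model whose entries carry known audience-impact measures, is shown.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the claimed modules. Each extractor returns a
# simple numeric feature so the end-to-end comparison step can run without
# any ML dependencies; a real system would use trained networks here.

def visual_concept_features(frames):
    # toy "visual concept": mean intensity of each frame
    return [sum(f) / len(f) for f in frames]

def affective_features(concepts):        # first trained network (stub)
    return [c / 255.0 for c in concepts]

def semantic_features(concepts):         # second, different network (stub)
    return [1.0 if c > 128 else 0.0 for c in concepts]

def audio_concept_feature(samples):      # mean signal energy
    return sum(abs(s) for s in samples) / len(samples)

def comment_sentiment(text):             # crude lexicon-based sentiment
    positive = {"great", "love", "inspiring"}
    words = text.lower().split()
    return sum(w in positive for w in words) / max(len(words), 1)

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den if den else 0.0

@dataclass
class ModelEntry:
    features: list   # reference feature vector in the persuasiveness model
    impact: float    # its known measure of audience impact

def predict_impact(features, model):
    # similarity-weighted average of the model entries' impact measures
    sims = [(cosine(features, e.features), e.impact) for e in model]
    total = sum(s for s, _ in sims)
    return sum(s * i for s, i in sims) / total if total else 0.0

frames = [[200, 50, 100], [30, 220, 90]]
concepts = visual_concept_features(frames)
features = (affective_features(concepts) + semantic_features(concepts)
            + [audio_concept_feature([0.2, -0.4, 0.1])]
            + [comment_sentiment("Great message, love it")])
model = [ModelEntry([1.0] * len(features), 0.9),
         ModelEntry([0.0] * len(features), 0.1)]
score = predict_impact(features, model)
```

The similarity-weighted comparison is only one plausible reading of "comparing ... to a persuasiveness model"; the claim itself does not fix a particular comparison function.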
7. A multimodal data analyzer to determine the persuasiveness of a multimedia item, comprising:

a processor to execute program instructions; and

a memory in communication with the processor, the memory having stored therein at least one of programs and instructions executable by the processor to configure the multimodal data analyzer to implement:

a visual feature extraction module to extract a plurality of visual concept features from at least a portion of the multimedia item using automated machine learning techniques, wherein the visual feature extraction module includes: an affective feature extraction module to receive the multimedia item from an input and to automatically analyze the extracted visual concept features to identify sentiments of the extracted visual concept features using a first, trained neural network; and a semantic feature extraction module to receive the multimedia item from an input and to automatically analyze the extracted visual concept features to identify semantic concepts of the extracted visual concept features using a second, different, trained neural network;

an audio feature extraction module to receive the multimedia item from an input and to extract an audio concept feature from at least a portion of the multimedia item using automated machine learning techniques;

a comment extraction module to identify a text item associated with the multimedia item and to extract text from at least a portion of the text item; and

a video persuasiveness prediction module to receive the semantic concepts, sentiments, and visual concept features from the visual feature extraction module, the audio concept feature from the audio feature extraction module, and the text from the comment extraction module; to compare the semantic concepts, sentiments, and visual concept features, the audio concept feature, and the extracted text to semantic concepts, sentiments, visual concept features, audio concept features, and text in a persuasiveness model having respective measures of audience impact; and to generate a measure of audience impact for the multimedia item based on the comparison, wherein the measure of audience impact is used to determine the persuasiveness of the multimedia item.

(Dependent claims 8-12 omitted here.)
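Claim 7 recites the same pipeline as an apparatus whose modules are program instructions held in memory. That composition can be sketched as a class that wires the modules together, with each module an injected callable; the class and parameter names are hypothetical, not the patented implementation.

```python
class MultimodalDataAnalyzer:
    """Toy composition of the modules recited in claim 7. Each module is an
    injected callable standing in for the claimed program instructions and
    trained networks."""

    def __init__(self, visual, affective, semantic, audio, comments, predict):
        self.visual = visual        # visual feature extraction module
        self.affective = affective  # affective sub-module (first network)
        self.semantic = semantic    # semantic sub-module (second network)
        self.audio = audio          # audio feature extraction module
        self.comments = comments    # comment extraction module
        self.predict = predict      # video persuasiveness prediction module

    def measure_impact(self, multimedia_item, text_item):
        concepts = self.visual(multimedia_item)
        inputs = {
            "sentiments": self.affective(concepts),
            "semantics": self.semantic(concepts),
            "visual": concepts,
            "audio": self.audio(multimedia_item),
            "text": self.comments(text_item),
        }
        return self.predict(inputs)

# Wiring with trivial stubs, just to show the data flow end to end.
analyzer = MultimodalDataAnalyzer(
    visual=lambda item: [len(item)],
    affective=lambda c: [x / 10 for x in c],
    semantic=lambda c: [x * 2 for x in c],
    audio=lambda item: 0.5,
    comments=lambda t: t.strip(),
    predict=lambda inputs: inputs["sentiments"][0],
)
impact = analyzer.measure_impact("abcde", " nice video ")
```

Injecting the modules as callables mirrors the claim's structure, in which the analyzer is configured by instructions stored in memory rather than by fixed internal logic.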
13. A method for building a model of audience impact of a video, with a computing system comprising one or more computing devices, the method comprising:

accessing a plurality of multimedia items and text items associated with the multimedia items;

using a visual feature extraction module, extracting visual features from the multimedia items using automated machine learning techniques;

using an affective feature extraction module of the visual feature extraction module, automatically analyzing the extracted visual features to identify sentiments of the extracted visual features using a first, trained neural network;

using a semantic feature extraction module of the visual feature extraction module, automatically analyzing the extracted visual features to identify semantic concepts of the extracted visual features using a second, different, trained neural network;

using an audio feature extraction module, extracting an audio concept feature from at least a portion of the multimedia items using automated machine learning techniques;

extracting text from the text items using a comment extraction module;

using a video persuasiveness prediction module, annotating the identified semantic concepts and sentiments, the extracted audio features, visual features, and text items with a measure of audience impact based on the semantic analysis or the affective analysis of the visual features, an affective analysis of the audio features, and a sentiment analysis of the extracted text;

classifying each of the multimedia items based on a combination of the annotations; and

storing the classifications in the audience impact model.

(Dependent claims 14-16 omitted here.)
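The model-building loop of claim 13 (annotate each item with an impact measure, classify it, store the result) can be sketched as follows. The `annotate` and `classify` callables are hypothetical stand-ins for the claimed semantic, affective, and sentiment analyses.

```python
def build_audience_impact_model(items, annotate, classify):
    """items: iterable of (features, comments) pairs. `annotate` supplies a
    measure of audience impact per item; `classify` buckets that measure.
    Returns the stored audience impact model as a list of entries."""
    model = []
    for features, comments in items:
        impact = annotate(features, comments)
        model.append({"features": features,
                      "impact": impact,
                      "label": classify(impact)})
    return model

# Toy annotation: impact is the fraction of positive comments (1 = positive).
annotate = lambda feats, comments: sum(comments) / len(comments)
classify = lambda impact: "persuasive" if impact >= 0.5 else "not persuasive"

model = build_audience_impact_model(
    [([0.1, 0.2], [1, 1, 0]), ([0.3, 0.4], [0, 0, 1])],
    annotate, classify)
```

The stored entries pair each item's features with its annotated impact and class label, which is the shape a comparison step like the one in claim 1 would consume.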
17. A video classifier device for building a model of audience impact of a video, comprising:

a visual feature extraction module to extract visual features from a plurality of accessed multimedia items using automated machine learning techniques, wherein the visual feature extraction module includes: an affective feature extraction module to receive the multimedia items from an input and to automatically analyze the extracted visual concept features to identify sentiments of the extracted visual concept features using a first, trained neural network; and a semantic feature extraction module to receive the multimedia items from an input and to automatically analyze the extracted visual concept features to identify semantic concepts of the extracted visual concept features using a second, different, trained neural network;

an audio feature extraction module to receive the multimedia items from an input and to extract an audio concept feature from at least a portion of the multimedia items using automated machine learning techniques;

a comment extraction module to extract text from accessed text items; and

a video persuasiveness prediction module to annotate the identified semantic concepts and sentiments, the extracted audio features, visual features, and text items with a measure of audience impact based on the semantic analysis or the affective analysis of the visual features, an affective analysis of the audio features, and a sentiment analysis of the extracted text; to classify each of the multimedia items based on a combination of the annotations; and to store the classifications in an audience impact model.

(Dependent claims 18-20 omitted here.)