EXPLOITING MULTI-MODAL AFFECT AND SEMANTICS TO ASSESS THE PERSUASIVENESS OF A VIDEO
Abstract
Disclosed are technologies for detecting persuasive multimedia content using affective and semantic concepts extracted from the audio-visual content, as well as the sentiment of associated comments. The multimedia content is analyzed and compared with a persuasiveness model.
Claims (20)
1. A method for determining the persuasiveness of a multimedia item, the method comprising, with a computing system comprising one or more computing devices:

extracting a plurality of features from at least a portion of the multimedia item, the extracted features comprising a visual feature or an audio feature;

identifying a text item associated with the multimedia item;

extracting text from at least a portion of the text item;

analyzing the extracted features and the extracted text using a video persuasiveness model; and

generating a persuasiveness indication for the multimedia item based on the analysis using the video persuasiveness model.
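The prediction pipeline of claim 1 can be illustrated with a minimal sketch. All class and function names, the feature dictionaries, and the scoring scheme below are illustrative assumptions, not the patent's actual implementation; the real video persuasiveness model would be a trained classifier rather than the toy stand-in shown here.

```python
# Hypothetical sketch of the claim-1 pipeline: extract features,
# identify/extract associated text, analyze with a model, and
# generate a persuasiveness indication. Names are illustrative.
from dataclasses import dataclass, field

@dataclass
class MultimediaItem:
    visual_features: dict                          # e.g. {"smile_intensity": 0.8}
    audio_features: dict                           # e.g. {"arousal": 0.6}
    comments: list = field(default_factory=list)   # associated text items

def extract_text(item: MultimediaItem) -> str:
    """Extract text from the text items associated with the multimedia item."""
    return " ".join(item.comments)

def persuasiveness_model(visual: dict, audio: dict, text: str) -> float:
    """Toy stand-in for the trained persuasiveness model: averages the
    affective feature values and adds a crude comment-sentiment cue."""
    affect = list(visual.values()) + list(audio.values())
    affect_score = sum(affect) / len(affect) if affect else 0.0
    sentiment = 1.0 if "great" in text.lower() else 0.0
    return 0.5 * affect_score + 0.5 * sentiment

def persuasiveness_indication(item: MultimediaItem) -> str:
    """Generate the persuasiveness indication for the multimedia item."""
    score = persuasiveness_model(
        item.visual_features, item.audio_features, extract_text(item))
    return "persuasive" if score >= 0.5 else "not persuasive"

item = MultimediaItem({"smile_intensity": 0.8}, {"arousal": 0.6},
                      comments=["Great speech, very moving"])
print(persuasiveness_indication(item))  # -> persuasive
```

The claimed apparatus (claim 7) performs the same steps as stored instructions; only the form of the claim differs.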
7. A multimodal data analyzer comprising instructions embodied in one or more non-transitory machine accessible storage media, the multimodal data analyzer configured to cause a computing system comprising one or more computing devices to:

extract a plurality of features from at least a portion of a multimedia item, the extracted features comprising a visual feature or an audio feature;

identify a text item associated with the multimedia item;

extract text from at least a portion of the text item;

analyze the extracted features and the extracted text using a video persuasiveness model; and

generate a persuasiveness indication for the multimedia item based on the analysis using the video persuasiveness model.
13. A method for building a model of audience impact of a video, with a computing system comprising one or more computing devices, the method comprising:

accessing a plurality of multimedia items and text items associated with the multimedia items;

extracting audio and visual features from the multimedia items;

extracting text from the text items;

annotating the extracted audio features, visual features, and text items with an indicator of audience impact based on a semantic analysis or an affective analysis of the visual features, an affective analysis of the audio features, and a sentiment analysis of the extracted text;

classifying each of the multimedia items based on a combination of the annotations; and

storing the classifications in the audience impact model.
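The model-building method of claim 13 can likewise be sketched. The annotation thresholds, keyword-based sentiment check, and classification rule below are placeholder assumptions; in the disclosed system each analysis would be performed by trained affective, semantic, and sentiment analyzers.

```python
# Hypothetical sketch of claim 13: annotate extracted features with
# audience-impact indicators, classify each item from the combined
# annotations, and store the classifications in the model.
def annotate(visual: dict, audio: dict, comment_text: str) -> dict:
    """Stubbed analyses: visual affect, audio affect, and comment sentiment
    each yield a binary audience-impact indicator."""
    return {
        "visual_affect": 1 if visual.get("smile_intensity", 0) > 0.5 else 0,
        "audio_affect": 1 if audio.get("arousal", 0) > 0.5 else 0,
        "text_sentiment": 1 if "love" in comment_text.lower() else 0,
    }

def classify(annotations: dict) -> str:
    """Combine the annotations into a single audience-impact class."""
    return "high_impact" if sum(annotations.values()) >= 2 else "low_impact"

def build_audience_impact_model(items) -> dict:
    """items: iterable of (item_id, visual_features, audio_features, comment_text).
    Returns the stored audience-impact model as a mapping of classifications."""
    model = {}
    for item_id, visual, audio, text in items:
        model[item_id] = classify(annotate(visual, audio, text))
    return model

model = build_audience_impact_model([
    ("vid1", {"smile_intensity": 0.9}, {"arousal": 0.7}, "I love this"),
    ("vid2", {"smile_intensity": 0.1}, {"arousal": 0.2}, "boring"),
])
print(model)  # -> {'vid1': 'high_impact', 'vid2': 'low_impact'}
```

Claim 17 recites the same steps as a video classifier device embodied in stored instructions.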
17. A video classifier device comprising instructions embodied in one or more non-transitory machine accessible storage media, the video classifier device configured to cause a computing system comprising one or more computing devices to:

access a plurality of multimedia items and text items associated with the multimedia items;

extract audio and visual features from the multimedia items;

extract text from the text items;

annotate the extracted audio features, visual features, and text items with an indicator of audience impact based on a semantic analysis of the visual features, an affective analysis of the visual features, an affective analysis of the audio features, and a sentiment analysis of the extracted text;

classify each of the multimedia items based on a combination of the annotations; and

store the classifications in an audience impact model.