DETERMINING INTENT FROM MULTIMODAL CONTENT EMBEDDED IN A COMMON GEOMETRIC SPACE
Abstract
Inferring the intent of multimodal content in a common geometric space, in order to improve recognition of the content's influential impact, includes mapping the multimodal content into the common geometric space by embedding a multimodal feature vector that represents a first modality and a second modality of the content, and inferring the intent of the multimodal content so mapped, such that connections between items of multimodal content improve recognition of their influential impact.
Claims (20)
1. A method of creating a semantic embedding space for multimodal content for determining intent of content, the method comprising:
for each of a plurality of content of the multimodal content, creating a respective first modality feature vector representative of content of the multimodal content having a first modality using a first machine learning model;
for each of a plurality of content of the multimodal content, creating a respective second modality feature vector representative of content of the multimodal content having a second modality using a second machine learning model;
for each of a plurality of first modality feature vector and second modality feature vector multimodal content pairs, forming a combined multimodal feature vector from the first modality feature vector and the second modality feature vector;
for at least one first modality feature vector and second modality feature vector multimodal content pair, assigning at least one taxonomy class of intent; and
semantically embedding the respective combined multimodal feature vectors in a common geometric space, wherein embedded combined multimodal feature vectors having related intent are closer together in the common geometric space than unrelated multimodal feature vectors.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
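The pipeline of claim 1 can be sketched in code. This is a minimal illustration only, not the patented implementation: the encoders below are fixed random projections standing in for trained per-modality machine learning models (e.g. an image network and a text network), and all dimensions are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in encoders. In the claimed method these are trained machine
# learning models, one per modality; here they are random projections
# used purely to show the data flow.
W_img = rng.standard_normal((512, 128))   # first-modality encoder (hypothetical dims)
W_txt = rng.standard_normal((300, 128))   # second-modality encoder (hypothetical dims)
W_embed = rng.standard_normal((256, 64))  # projection into the common geometric space

def embed_pair(image_feats, text_feats):
    """Map one first-/second-modality content pair into the common space."""
    v_img = image_feats @ W_img                # first modality feature vector
    v_txt = text_feats @ W_txt                 # second modality feature vector
    combined = np.concatenate([v_img, v_txt])  # combined multimodal feature vector
    z = combined @ W_embed                     # semantic embedding
    return z / np.linalg.norm(z)               # unit norm -> cosine geometry

# Two content items; after training, pairs with related intent
# would land closer together in this space than unrelated pairs.
z1 = embed_pair(rng.standard_normal(512), rng.standard_normal(300))
z2 = embed_pair(rng.standard_normal(512), rng.standard_normal(300))
similarity = float(z1 @ z2)  # cosine similarity in the common space
```

With unit-normalized embeddings, "closer together" in the claim corresponds to higher cosine similarity; a training objective (not shown) would pull same-intent pairs together.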
13. A method of creating a semantic embedding space for multimodal content for determining intent of content, the method comprising:
for each of a plurality of content of the multimodal content, creating a respective first modality feature vector representative of content of the multimodal content having a first modality using a first machine learning model;
for each of a plurality of content of the multimodal content, creating a respective second modality feature vector representative of content of the multimodal content having a second modality using a second machine learning model;
for each of a plurality of first modality feature vector and second modality feature vector multimodal content pairs, forming a combined multimodal feature vector from the first modality feature vector and the second modality feature vector;
for at least one first modality feature vector and second modality feature vector multimodal content pair, assigning at least one taxonomy class of intent;
projecting the combined multimodal feature vector into the common geometric space; and
inferring an intent of the multimodal content represented by the combined multimodal feature vector based on the projection of the multimodal feature vector in the common geometric space and a classifier.
Dependent claims: 14, 15, 16.
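Claim 13 adds an inference step: a classifier over the projected vector. The claim does not specify the classifier, so the sketch below assumes a simple nearest-centroid classifier, with purely hypothetical intent class labels and centroids standing in for the taxonomy classes learned from labeled, embedded pairs.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical taxonomy classes of intent; the real taxonomy and the
# centroids (which would be derived from labeled training embeddings)
# are illustrative assumptions, not taken from the patent.
INTENT_CLASSES = ["advocative", "promotive", "informative"]
centroids = rng.standard_normal((3, 64))
centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)

def infer_intent(z):
    """Classify a combined multimodal vector projected into the common space."""
    z = z / np.linalg.norm(z)       # place the projection on the unit sphere
    scores = centroids @ z          # cosine similarity to each intent class
    return INTENT_CLASSES[int(np.argmax(scores))]

z = rng.standard_normal(64)         # a projected combined multimodal vector
predicted = infer_intent(z)
```

Any classifier operating on the projected vector (k-NN, logistic regression, an MLP head) would fit the claim language equally well; nearest-centroid is used here only because it makes the geometric reading of "closer together means related intent" explicit.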
17. A non-transitory computer-readable medium having stored thereon at least one program, the at least one program including instructions which, when executed by a processor, cause the processor to perform a method of creating a semantic embedding space for multimodal content for determining intent of content, comprising:
for each of a plurality of content of the multimodal content, creating a respective first modality feature vector representative of content of the multimodal content having a first modality using a first machine learning model;
for each of a plurality of content of the multimodal content, creating a respective second modality feature vector representative of content of the multimodal content having a second modality using a second machine learning model;
for each of a plurality of first modality feature vector and second modality feature vector multimodal content pairs, forming a combined multimodal feature vector from the first modality feature vector and the second modality feature vector;
for at least one first modality feature vector and second modality feature vector multimodal content pair, assigning at least one taxonomy class of intent; and
semantically embedding the respective combined multimodal feature vectors in a common geometric space, wherein embedded combined multimodal feature vectors having related intent are closer together in the common geometric space than unrelated multimodal feature vectors.
Dependent claims: 18, 19, 20.