Emotion recognition in video conferencing

US 10,235,562 B2
Filed: 11/17/2017
Issued: 03/19/2019
Est. Priority Date: 03/18/2015
Status: Active Grant

Abstract

Methods and systems for videoconferencing include recognition of emotions related to one videoconference participant such as a customer. This ultimately enables another videoconference participant, such as a service provider or supervisor, to handle angry, annoyed, or distressed customers. One example method includes the steps of receiving a video that includes a sequence of images, detecting at least one object of interest (e.g., a face), locating feature reference points of the at least one object of interest, aligning a virtual face mesh to the at least one object of interest based on the feature reference points, finding over the sequence of images at least one deformation of the virtual face mesh that reflects face mimics, determining that the at least one deformation refers to a facial emotion selected from a plurality of reference facial emotions, and generating a communication bearing data associated with the facial emotion.
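
As an illustration of what a "deformation between feature reference points" could look like numerically, the sketch below measures how the distance between pairs of landmarks changes from a neutral frame to a later frame. It is a minimal sketch under stated assumptions, not the patented method: the landmark indices, the normalisation by inter-eye distance, and the names `pair_distance` and `deformations` are all hypothetical and do not come from the patent text.

```python
import math

Point = tuple[float, float]


def pair_distance(points: list[Point], i: int, j: int) -> float:
    """Euclidean distance between reference points i and j."""
    (x1, y1), (x2, y2) = points[i], points[j]
    return math.hypot(x2 - x1, y2 - y1)


def deformations(neutral, current, pairs, scale_pair):
    """Change in scale-normalised distance for each tracked pair of points.

    Dividing by the distance of `scale_pair` (assumed here to be the two
    eye centres) makes the result independent of how large the face
    appears in the image.
    """
    scale_neutral = pair_distance(neutral, *scale_pair)
    scale_current = pair_distance(current, *scale_pair)
    return [
        pair_distance(current, i, j) / scale_current
        - pair_distance(neutral, i, j) / scale_neutral
        for i, j in pairs
    ]


# Hypothetical landmarks: 0, 1 = eye centres; 2, 3 = mouth corners.
neutral_frame = [(0.0, 0.0), (4.0, 0.0), (1.0, -4.0), (3.0, -4.0)]
later_frame = [(0.0, 0.0), (4.0, 0.0), (0.5, -4.0), (3.5, -4.0)]

# Positive value: the mouth corners moved apart relative to the neutral frame.
mouth_widening = deformations(
    neutral_frame, later_frame, pairs=[(2, 3)], scale_pair=(0, 1)
)
```

A vector of such values, one per tracked pair, is the kind of input a later comparison against reference facial parameters could consume.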

20 Claims

1. A computer-implemented method for video conferencing, the method comprising:
- receiving a video including a sequence of images corresponding to a videoconference between first and second users;
- detecting at least one object of interest in one or more of the images;
- locating feature reference points of the at least one object of interest;
- determining that at least one deformation between two or more of the feature reference points refers to a facial emotion selected from a plurality of reference facial emotions;
- determining that the facial emotion is a negative facial emotion; and
- in response to determining that the facial emotion is the negative facial emotion, generating a communication for transmission to a non-participant of the videoconference between the first and second users, the communication bearing data associated with the negative facial emotion.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, further comprising:
    - aligning a virtual face mesh to the at least one object of interest based at least in part on the feature reference points.
  - 3. The method of claim 2, wherein determining at least one deformation further comprises:
    - finding at least one facial deformation of the virtual face mesh associated with at least one facial mimic.
  - 4. The method of claim 1, wherein determining that the at least one deformation refers to a facial emotion further comprises:
    - comparing the deformation between two or more feature reference points to reference facial parameters of the plurality of facial emotions; and
    - selecting the facial emotion based on the comparison of the deformation between two or more feature reference points to the reference facial parameters.
  - 5. The method of claim 1, wherein the first user is a service provider and the non-participant is a third party, further comprising:
    - establishing a videoconference between the service provider and the second user associated with the negative facial emotion; and
    - transmitting the communication over a communications network to the third party.
  - 6. The method of claim 1, wherein the first user is a service provider and the non-participant is a third party, further comprising:
    - establishing a videoconference between the service provider, the second user, and the third party, the videoconference including the third party being established responsive to determining the negative facial emotion.
  - 7. The method of claim 1, further comprising:
    - detecting one or more gestures; and
    - determining the one or more gestures are associated with the negative facial emotion.
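
The comparison and selection steps of claim 4, followed by the negative-emotion check of claim 1, can be sketched as a nearest-match lookup. This is only an illustrative sketch: the claims do not state a distance metric, so Euclidean distance is an assumption, and the reference values, the emotion names in `REFERENCE_PARAMETERS`, and the function names are hypothetical. The set of negative emotions borrows the "angry, annoyed, or distressed" wording of the abstract.

```python
import math

# Hypothetical reference facial parameters: one deformation vector per
# reference emotion. Real values would come from a trained or calibrated model.
REFERENCE_PARAMETERS = {
    "neutral": [0.0, 0.0],
    "happy": [0.25, 0.05],
    "angry": [-0.05, -0.20],
    "distressed": [-0.15, 0.15],
}

# Assumed grouping, following the emotions named in the abstract.
NEGATIVE_EMOTIONS = {"angry", "annoyed", "distressed"}


def select_emotion(deformation, references=REFERENCE_PARAMETERS):
    """Pick the reference emotion whose parameters lie closest to the input."""
    return min(
        references,
        key=lambda name: math.dist(deformation, references[name]),
    )


def is_negative(emotion):
    """True when the selected emotion belongs to the negative group."""
    return emotion in NEGATIVE_EMOTIONS


observed = [-0.04, -0.18]  # hypothetical measured deformation vector
emotion = select_emotion(observed)
needs_escalation = is_negative(emotion)
```

Keeping selection and the negative check as separate functions mirrors the two distinct "determining" steps in the independent claim.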

8. A system, comprising:
- one or more processors; and
- a non-transitory processor-readable medium coupled to the one or more processors, the non-transitory processor-readable medium comprising processor-executable instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising:
  - receiving a video including a sequence of images corresponding to a videoconference between first and second users;
  - detecting at least one object of interest in one or more of the images;
  - locating feature reference points of the at least one object of interest;
  - determining that at least one deformation between two or more of the feature reference points refers to a facial emotion selected from a plurality of reference facial emotions;
  - determining that the facial emotion is a negative facial emotion; and
  - in response to determining that the facial emotion is the negative facial emotion, generating a communication for transmission to a non-participant of the videoconference between the first and second users, the communication bearing data associated with the negative facial emotion.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The system of claim 8, wherein the operations further comprise:
    - aligning a virtual face mesh to the at least one object of interest based at least in part on the feature reference points.
  - 10. The system of claim 9, wherein determining at least one deformation further comprises:
    - finding at least one facial deformation of the virtual face mesh associated with at least one facial mimic.
  - 11. The system of claim 8, wherein determining that the at least one deformation refers to a facial emotion further comprises:
    - comparing the deformation between two or more feature reference points to reference facial parameters of the plurality of facial emotions; and
    - selecting the facial emotion based on the comparison of the deformation between two or more feature reference points to the reference facial parameters.
  - 12. The system of claim 8, wherein the first user is a service provider and the non-participant is a third party, and wherein the operations further comprise:
    - establishing a videoconference between the service provider and the second user associated with the negative facial emotion; and
    - transmitting the communication over a communications network to the third party.
  - 13. The system of claim 8, wherein the first user is a service provider and the non-participant is a third party, and wherein the operations further comprise:
    - establishing a videoconference between the service provider, the second user, and the third party, the videoconference including the third party being established responsive to determining the negative facial emotion.
  - 14. The system of claim 8, wherein the operations further comprise:
    - detecting one or more gestures; and
    - determining the one or more gestures are associated with the negative facial emotion.
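
The final operation, generating a communication for a non-participant that bears data about the negative emotion, might look like the sketch below. The patent text here does not define a message format, so the JSON layout, the `EmotionAlert` fields, and the `build_alert` function are hypothetical; the only behaviour taken from the claims is that a message is produced in response to a negative emotion and is addressed to someone outside the videoconference.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

# Assumed grouping, following the emotions named in the abstract.
NEGATIVE_EMOTIONS = {"angry", "annoyed", "distressed"}


@dataclass
class EmotionAlert:
    conference_id: str
    recipient: str     # non-participant, e.g. a supervisor
    subject_user: str  # participant who showed the emotion
    emotion: str
    frame_index: int   # image in the sequence where it was detected


def build_alert(conference_id, participants, recipient,
                subject_user, emotion, frame_index) -> Optional[str]:
    """Return a JSON message for the recipient, or None if no alert is due."""
    if emotion not in NEGATIVE_EMOTIONS:
        return None  # only negative emotions trigger a communication
    if recipient in participants:
        raise ValueError("recipient must not be a videoconference participant")
    alert = EmotionAlert(conference_id, recipient, subject_user,
                         emotion, frame_index)
    return json.dumps(asdict(alert), sort_keys=True)


message = build_alert(
    conference_id="conf-001",
    participants={"agent", "customer"},
    recipient="supervisor",
    subject_user="customer",
    emotion="angry",
    frame_index=42,
)
```

How the message is transmitted, or whether the recipient is then added to the call as in claim 13, is left out of this sketch.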

15. A non-transitory processor-readable medium comprising processor-executable instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising:
- receiving a video including a sequence of images corresponding to a videoconference between first and second users;
- detecting at least one object of interest in one or more of the images;
- locating feature reference points of the at least one object of interest;
- determining that at least one deformation between two or more of the feature reference points refers to a facial emotion selected from a plurality of reference facial emotions;
- determining that the facial emotion is a negative facial emotion; and
- in response to determining that the facial emotion is the negative facial emotion, generating a communication for transmission to a non-participant of the videoconference between the first and second users, the communication bearing data associated with the negative facial emotion.
- Dependent claims (16, 17, 18, 19, 20):
  - 16. The non-transitory processor-readable medium of claim 15, wherein the operations further comprise:
    - aligning a virtual face mesh to the at least one object of interest based at least in part on the feature reference points.
  - 17. The non-transitory processor-readable medium of claim 16, wherein determining at least one deformation further comprises:
    - finding at least one facial deformation of the virtual face mesh associated with at least one facial mimic.
  - 18. The non-transitory processor-readable medium of claim 15, wherein determining that the at least one deformation refers to a facial emotion further comprises:
    - comparing the deformation between two or more feature reference points to reference facial parameters of the plurality of facial emotions; and
    - selecting the facial emotion based on the comparison of the deformation between two or more feature reference points to the reference facial parameters.
  - 19. The non-transitory processor-readable medium of claim 15, wherein the first user is a service provider and the non-participant is a third party, and wherein the operations further comprise:
    - establishing a videoconference between the service provider and the second user associated with the negative facial emotion;
    - transmitting the communication over a communications network to the third party; and
    - establishing a videoconference between the service provider, the second user, and the third party, the videoconference including the third party being established responsive to determining the negative facial emotion.
  - 20. The non-transitory processor-readable medium of claim 19, wherein the operations further comprise:
    - detecting one or more gestures; and
    - determining the one or more gestures are associated with the negative facial emotion.

Current Assignee: Snap, Inc.
Original Assignee: Snap, Inc.
Inventors: Shaburov, Victor; Monastyrshyn, Yurii
Primary Examiner(s): Akhavannik, Hadi
Application Number: US15/816,776
Publication Number: US 20180075292A1
Time in Patent Office: 487 Days
Field of Search: None
US Class Current
CPC Class Codes

G06Q 30/0281   Customer communication at a...

G06T 2207/10016   Video; Image sequence

G06T 2207/30201   Face

G06T 7/337   involving reference images ...

G06T 7/344   involving models

G06V 10/7553   based on shape, e.g. active...

G06V 20/64   Three-dimensional objects

G06V 40/165   using facial parts and geom...

G06V 40/167   using comparisons between t...

G06V 40/171   Local features and componen...

G06V 40/176   Dynamic expression

G10L 25/57   for processing of video sig...

G10L 25/63   for estimating an emotional...

H04N 7/147   Communication arrangements,...

H04N 7/15   Conference systems
