Gesture based annotations
First Claim
1. A system, comprising:
a 360-degree camera configured to capture images within a 360-degree view;
at least one processor; and
storage comprising a set of instructions executable by the at least one processor to:
receive an audio recording containing speech of a first participant of a meeting;
receive, from the 360-degree camera, a video comprising the 360-degree view that includes the first participant;
generate a three-dimensional coordinate mapping identifying where the first participant and a second participant are located in the 360-degree view of the video;
convert the audio recording of the first participant to digital text;
identify a gesture of the first participant directed at the second participant in the 360-degree view;
determine, based on the three-dimensional coordinate mapping, that the first participant is directing the gesture toward the second participant; and
store an annotation for the digital text corresponding to the gesture being directed to the second participant.
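The claim's determining step amounts to checking whether a pointing ray, reconstructed from the three-dimensional coordinate mapping, aims at the second participant's location. A minimal sketch of that vector-angle test, assuming illustrative elbow/wrist joint positions and a 20-degree tolerance (neither is specified by the claim):

```python
import math

def is_directed_at(elbow, wrist, target, max_angle_deg=20.0):
    """Return True if the pointing ray (elbow -> wrist) aims at `target`.

    All arguments are (x, y, z) positions from the 3D coordinate mapping.
    The 20-degree tolerance is an illustrative assumption.
    """
    point_dir = tuple(w - e for w, e in zip(wrist, elbow))
    to_target = tuple(t - w for t, w in zip(target, wrist))
    dot = sum(a * b for a, b in zip(point_dir, to_target))
    norm = math.dist(wrist, elbow) * math.dist(target, wrist)
    if norm == 0:
        return False
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    return angle <= max_angle_deg

# First participant points along +x; second participant sits near (2, 0, 0).
print(is_directed_at((0, 0, 0), (0.5, 0, 0), (2.0, 0.1, 0.0)))  # True
```

Any joint pair defining the pointing direction (elbow-to-wrist, head-to-wrist, etc.) could be substituted; the claim leaves that choice open.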
Abstract
In an embodiment, a device to convert conversations from a meeting to text and annotate the text is disclosed, comprising: a microphone; a camera; a processor; and a storage comprising a set of instructions; wherein the set of instructions causes the processor to: receive, from the microphone, an audio recording containing speech of a participant of a meeting; receive, from the camera, a video of the participant; identify the participant; convert the speech of the participant to digital text; develop a skeletal map of the participant; recognize a gesture of the participant from the skeletal map; detect and identify a target of the gesture; and, based on the target and the gesture, determine an annotation for the digital text corresponding to a point of time of the gesture.
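The abstract's skeletal-map step can be illustrated with a simple geometric heuristic: an arm counts as a pointing gesture when it is nearly fully extended. The joint names and the 0.95 extension ratio below are illustrative assumptions, not taken from the disclosure:

```python
import math

def recognize_gesture(skeleton):
    """Classify a frame's skeletal map as 'pointing' or 'idle'.

    `skeleton` maps joint names ('shoulder', 'elbow', 'wrist') to (x, y, z).
    An arm counts as pointing when it is nearly fully extended: the
    shoulder-to-wrist distance approaches the summed bone lengths.
    The joint names and the 0.95 ratio are illustrative assumptions.
    """
    s, e, w = skeleton["shoulder"], skeleton["elbow"], skeleton["wrist"]
    bone_len = math.dist(s, e) + math.dist(e, w)
    if bone_len == 0:
        return "idle"
    extension = math.dist(s, w) / bone_len
    return "pointing" if extension > 0.95 else "idle"

extended = {"shoulder": (0, 0, 0), "elbow": (0.3, 0, 0), "wrist": (0.6, 0, 0)}
bent = {"shoulder": (0, 0, 0), "elbow": (0.3, 0, 0), "wrist": (0.3, 0.3, 0)}
print(recognize_gesture(extended), recognize_gesture(bent))  # pointing idle
```

A production system would run a pose estimator per frame and smooth over time; this sketch only shows the per-frame geometric test.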
18 Citations
20 Claims
1. A system, comprising:
a 360-degree camera configured to capture images within a 360-degree view;
at least one processor; and
storage comprising a set of instructions executable by the at least one processor to:
receive an audio recording containing speech of a first participant of a meeting;
receive, from the 360-degree camera, a video comprising the 360-degree view that includes the first participant;
generate a three-dimensional coordinate mapping identifying where the first participant and a second participant are located in the 360-degree view of the video;
convert the audio recording of the first participant to digital text;
identify a gesture of the first participant directed at the second participant in the 360-degree view;
determine, based on the three-dimensional coordinate mapping, that the first participant is directing the gesture toward the second participant; and
store an annotation for the digital text corresponding to the gesture being directed to the second participant.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A device, comprising:
at least one microphone;
at least one 360-degree camera;
at least one processor; and
at least one storage comprising a set of instructions;
wherein the set of instructions causes a processor to:
receive, from the at least one microphone, speech of a first participant of a meeting;
receive, from the 360-degree camera, a video of the first participant;
generate a three-dimensional coordinate mapping identifying where the first participant and a second participant are located in the 360-degree view of the video;
convert the speech of the first participant to a digital text;
recognize at least one gesture of the first participant;
determine, based on the three-dimensional coordinate mapping, that the first participant is directing the at least one gesture toward the second participant; and
store an annotation for the digital text corresponding to the at least one gesture being directed to the second participant.
Dependent claim: 17
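The storing limitation, attaching an annotation to the digital text, can be sketched with a plain list of timed transcript segments standing in for the stored digital text; the segment schema below is an illustrative assumption:

```python
def annotate(transcript, gesture_time, note):
    """Attach `note` to the transcript segment active at `gesture_time`.

    `transcript` is a list of dicts with 'start', 'end', and 'text' keys,
    a minimal stand-in for the claimed digital-text storage.
    Returns the annotated segment, or None if no segment covers the time.
    """
    for segment in transcript:
        if segment["start"] <= gesture_time < segment["end"]:
            segment.setdefault("annotations", []).append(note)
            return segment
    return None

transcript = [
    {"start": 0.0, "end": 4.2, "text": "Let's hand this off."},
    {"start": 4.2, "end": 7.0, "text": "Alice, can you take it?"},
]
annotate(transcript, 5.1, "speaker gestured toward second participant")
print(transcript[1]["annotations"])  # ['speaker gestured toward second participant']
```

Keying the annotation to the segment's time span ties it to the moment of the gesture, matching the method claim's "point of time" language.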
18. A method, comprising:
receiving an audio recording containing speech of a first participant of a meeting;
receiving a 360-degree view of video comprising the first participant;
generating a three-dimensional coordinate mapping identifying where the first participant and a second participant are located in the 360-degree view of the video;
converting the audio recording of the first participant to digital text;
identifying a suspected gesture of the first participant directed at the second participant in the 360-degree view of the video;
determining, based on the three-dimensional coordinate mapping, that the first participant is directing the gesture toward the second participant; and
storing an annotation for the digital text corresponding to a point of time of the at least one gesture and indicating that the at least one gesture is directed from the first participant to the second participant.
Dependent claims: 19, 20
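Generating the three-dimensional coordinate mapping first requires locating each participant within the 360-degree frame. Assuming the camera emits a standard equirectangular projection, which the claims do not specify, a pixel position converts to a viewing direction as follows:

```python
def pixel_to_bearing(x, y, width, height):
    """Map an equirectangular 360-degree frame pixel to (azimuth, elevation)
    in degrees: azimuth in [-180, 180), elevation in [-90, 90].

    Assumes a standard equirectangular projection; the claims leave the
    360-degree camera's projection unspecified.
    """
    azimuth = (x / width) * 360.0 - 180.0
    elevation = 90.0 - (y / height) * 180.0
    return azimuth, elevation

# The frame center looks straight ahead at the horizon.
print(pixel_to_bearing(1920, 960, 3840, 1920))  # (0.0, 0.0)
```

Combining each participant's bearing with an estimated distance (from stereo, depth sensing, or apparent size) yields the 3D coordinates the determining step operates on.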
Specification