Gesture based annotations
First Claim
1. A system, comprising:
a 360-degree camera configured to capture images within a 360-degree view;
at least one processor; and
storage comprising a set of instructions executable by the at least one processor to:
receive an audio recording containing speech of a first participant of a meeting;
receive, from the 360-degree camera, a video comprising the 360-degree view that includes the first participant;
generate a three-dimensional coordinate mapping identifying where the first participant and a second participant are located in the 360-degree view of the video;
convert the audio recording of the first participant to digital text;
identify a gesture of the first participant directed at the second participant in the 360-degree view;
determine, based on the three-dimensional coordinate mapping, that the first participant is directing the gesture toward the second participant; and
store an annotation for the digital text corresponding to the gesture being directed to the second participant.
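The claim's determining step amounts to checking whether a pointing ray, reconstructed from the three-dimensional coordinate mapping, aims at the second participant's location. A minimal sketch of that vector-angle test, assuming illustrative elbow/wrist joint positions and a 20-degree tolerance (neither is specified by the claim):

```python
import math

def is_directed_at(elbow, wrist, target, max_angle_deg=20.0):
    """Return True if the pointing ray (elbow -> wrist) aims at `target`.

    All arguments are (x, y, z) positions from the 3D coordinate mapping.
    The 20-degree tolerance is an illustrative assumption.
    """
    point_dir = tuple(w - e for w, e in zip(wrist, elbow))
    to_target = tuple(t - w for t, w in zip(target, wrist))
    dot = sum(a * b for a, b in zip(point_dir, to_target))
    norm = math.dist(wrist, elbow) * math.dist(target, wrist)
    if norm == 0:
        return False
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    return angle <= max_angle_deg

# First participant points along +x; second participant sits near (2, 0, 0).
print(is_directed_at((0, 0, 0), (0.5, 0, 0), (2.0, 0.1, 0.0)))  # True
```

Any joint pair defining the pointing direction (elbow-to-wrist, head-to-wrist, etc.) could be substituted; the claim leaves that choice open.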
Abstract
In an embodiment, a device to convert conversations from a meeting to text and annotate the text is disclosed, comprising: a microphone; a camera; a processor; and a storage comprising a set of instructions; wherein the set of instructions causes the processor to: receive, from the microphone, an audio recording containing speech of a participant of a meeting; receive, from the camera, a video of the participant; identify the participant; convert the speech of the participant to digital text; develop a skeletal map of the participant; recognize a gesture of the participant from the skeletal map; detect and identify a target of the gesture; and, based on the target and the gesture, determine an annotation for the digital text corresponding to a point of time of the gesture.
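The abstract's skeletal-map step can be illustrated with a simple geometric heuristic: an arm counts as a pointing gesture when it is nearly fully extended. The joint names and the 0.95 extension ratio below are illustrative assumptions, not taken from the disclosure:

```python
import math

def recognize_gesture(skeleton):
    """Classify a frame's skeletal map as 'pointing' or 'idle'.

    `skeleton` maps joint names ('shoulder', 'elbow', 'wrist') to (x, y, z).
    An arm counts as pointing when it is nearly fully extended: the
    shoulder-to-wrist distance approaches the summed bone lengths.
    The joint names and the 0.95 ratio are illustrative assumptions.
    """
    s, e, w = skeleton["shoulder"], skeleton["elbow"], skeleton["wrist"]
    bone_len = math.dist(s, e) + math.dist(e, w)
    if bone_len == 0:
        return "idle"
    extension = math.dist(s, w) / bone_len
    return "pointing" if extension > 0.95 else "idle"

extended = {"shoulder": (0, 0, 0), "elbow": (0.3, 0, 0), "wrist": (0.6, 0, 0)}
bent = {"shoulder": (0, 0, 0), "elbow": (0.3, 0, 0), "wrist": (0.3, 0.3, 0)}
print(recognize_gesture(extended), recognize_gesture(bent))  # pointing idle
```

A production system would run a pose estimator per frame and smooth over time; this sketch only shows the per-frame geometric test.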
18 Citations
20 Claims
1. A system, comprising:
a 360-degree camera configured to capture images within a 360-degree view;
at least one processor; and
storage comprising a set of instructions executable by the at least one processor to:
receive an audio recording containing speech of a first participant of a meeting;
receive, from the 360-degree camera, a video comprising the 360-degree view that includes the first participant;
generate a three-dimensional coordinate mapping identifying where the first participant and a second participant are located in the 360-degree view of the video;
convert the audio recording of the first participant to digital text;
identify a gesture of the first participant directed at the second participant in the 360-degree view;
determine, based on the three-dimensional coordinate mapping, that the first participant is directing the gesture toward the second participant; and
store an annotation for the digital text corresponding to the gesture being directed to the second participant.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A device, comprising:
at least one microphone;
at least one 360-degree camera;
at least one processor; and
at least one storage comprising a set of instructions;
wherein the set of instructions causes a processor to:
receive, from the at least one microphone, speech of a first participant of a meeting;
receive, from the 360-degree camera, a video of the first participant;
generate a three-dimensional coordinate mapping identifying where the first participant and a second participant are located in the 360-degree view of the video;
convert the speech of the first participant to a digital text;
recognize at least one gesture of the first participant;
determine, based on the three-dimensional coordinate mapping, that the first participant is directing the at least one gesture toward the second participant; and
store an annotation for the digital text corresponding to the at least one gesture being directed to the second participant.
Dependent claim: 17
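The storing limitation, attaching an annotation to the digital text, can be sketched with a plain list of timed transcript segments standing in for the stored digital text; the segment schema below is an illustrative assumption:

```python
def annotate(transcript, gesture_time, note):
    """Attach `note` to the transcript segment active at `gesture_time`.

    `transcript` is a list of dicts with 'start', 'end', and 'text' keys,
    a minimal stand-in for the claimed digital-text storage.
    Returns the annotated segment, or None if no segment covers the time.
    """
    for segment in transcript:
        if segment["start"] <= gesture_time < segment["end"]:
            segment.setdefault("annotations", []).append(note)
            return segment
    return None

transcript = [
    {"start": 0.0, "end": 4.2, "text": "Let's hand this off."},
    {"start": 4.2, "end": 7.0, "text": "Alice, can you take it?"},
]
annotate(transcript, 5.1, "speaker gestured toward second participant")
print(transcript[1]["annotations"])  # ['speaker gestured toward second participant']
```

Keying the annotation to the segment's time span ties it to the moment of the gesture, matching the method claim's "point of time" language.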
18. A method, comprising:
receiving an audio recording containing speech of a first participant of a meeting;
receiving a 360-degree view of video comprising the first participant;
generating a three-dimensional coordinate mapping identifying where the first participant and a second participant are located in the 360-degree view of the video;
converting the audio recording of the first participant to digital text;
identifying a suspected gesture of the first participant directed at the second participant in the 360-degree view of the video;
determining, based on the three-dimensional coordinate mapping, that the first participant is directing the gesture toward the second participant; and
storing an annotation for the digital text corresponding to a point of time of the at least one gesture and indicating that the at least one gesture is directed from the first participant to the second participant.
Dependent claims: 19, 20
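Generating the three-dimensional coordinate mapping first requires locating each participant within the 360-degree frame. Assuming the camera emits a standard equirectangular projection, which the claims do not specify, a pixel position converts to a viewing direction as follows:

```python
def pixel_to_bearing(x, y, width, height):
    """Map an equirectangular 360-degree frame pixel to (azimuth, elevation)
    in degrees: azimuth in [-180, 180), elevation in [-90, 90].

    Assumes a standard equirectangular projection; the claims leave the
    360-degree camera's projection unspecified.
    """
    azimuth = (x / width) * 360.0 - 180.0
    elevation = 90.0 - (y / height) * 180.0
    return azimuth, elevation

# The frame center looks straight ahead at the horizon.
print(pixel_to_bearing(1920, 960, 3840, 1920))  # (0.0, 0.0)
```

Combining each participant's bearing with an estimated distance (from stereo, depth sensing, or apparent size) yields the 3D coordinates the determining step operates on.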
Specification