Immersive videoconference method and system
Abstract
An immersive videoconference method wherein multiple participants (21, 22, 23, 24) in different locations (11, 12, 13) remotely interact with each other through a telecommunication network architecture (8, 31, 38), wherein the method comprises, at the location (11, 12, 13) of a given participant (21, 22, 23, 24): capturing video images of the participant by a pair of video cameras (4A, 4B); detecting, tracking and determining size and position related parameters of the participant in the video images; generating a single elementary video stream related to the participant; associating a room identifier to the elementary video stream, the room identifier being uniquely associated to the given participant; sending the elementary video stream, the size and position related parameters and the room identifier (41A, 42A, 43A) to a centralized entity (30); and repeating the above steps for each participant (21, 22, 23, 24) at each different location (11, 12, 13); wherein the method further comprises, at the centralized entity (30): creating a virtual room (70) by combining the elementary video streams (41A, 42A, 43A) for all the participants; staging the elementary video streams of all the participants in said virtual room and computing a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants; and generating, for each participant, a single composite video stream (41B, 42B, 43B) of the virtual room (70) that displays the 2D video of the other participants sized and positioned as if the participants (21, 22, 23, 24) were in the same virtual room (70) based on the scene specification and a combination of the elementary video streams of the other participants.
10 Claims
1. An immersive videoconference method allowing multiple participants in different locations to remotely interact with each other through a telecommunication network architecture, wherein the method comprises at the location of a given participant:
capturing video images of the participant by a pair of video cameras;
detecting, tracking and determining size and position related parameters of the participant in the video images;
generating a single elementary video stream related to the participant;
associating a room identifier to the elementary video stream, the room identifier being uniquely associated to the given participant;
sending the elementary video stream, the size and position related parameters and the room identifier to a centralized entity;
repeating the above for each participant at each different location;
wherein the method further comprises at the centralized entity:
creating a virtual room by combining the elementary video streams for all the participants;
staging the elementary video streams of all the participants in said virtual room and computing a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants; and
generating, for each participant, a single composite video stream of the virtual room that displays the 2D video of the other participants sized and positioned as if the participants were in the same virtual room based on the scene specification and a combination of the elementary video streams of the other participants;
wherein detecting and tracking the participant in the video images comprises detecting and tracking a body of the participant without a background from the video images based on a histogram of oriented gradients (HOG) algorithm for the purpose of human detection, and wherein results of said HOG algorithm are further filtered by a depth mapping matrix computed from a pair of video signals of the participant obtained from the pair of video cameras.
Dependent claims: 2, 3, 4, 5, 6.
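The final limitation of claim 1 (HOG detections filtered by a depth mapping matrix) can be illustrated with a minimal sketch. The claim does not state how the filtering is performed, so everything below is an assumption made for illustration: the function names, the `(x, y, w, h)` box format, the median-depth test, and the `max_depth_m` threshold are all hypothetical, and the detection boxes are assumed to lie fully inside the depth map.

```python
from statistics import median

def filter_detections_by_depth(detections, depth_map, max_depth_m=2.5):
    """Keep candidate person boxes whose median depth is near the cameras.

    detections: list of (x, y, w, h) boxes, e.g. output of a HOG detector.
    depth_map:  2D list of per-pixel depths in metres (hypothetical format).
    max_depth_m: assumed cut-off; anything farther is treated as background.
    """
    kept = []
    for (x, y, w, h) in detections:
        samples = [depth_map[r][c]
                   for r in range(y, y + h)
                   for c in range(x, x + w)]
        if samples and median(samples) <= max_depth_m:
            kept.append((x, y, w, h))
    return kept

def foreground_mask(box, depth_map, max_depth_m=2.5):
    """Return a boolean mask: True where a pixel is inside the box and near."""
    x, y, w, h = box
    rows, cols = len(depth_map), len(depth_map[0])
    return [[(y <= r < y + h) and (x <= c < x + w)
             and depth_map[r][c] <= max_depth_m
             for c in range(cols)]
            for r in range(rows)]

# Toy 4x6 depth map: left half is a near subject, right half a far wall.
depth_map = [[1.0, 1.0, 1.0, 5.0, 5.0, 5.0] for _ in range(4)]
candidates = [(0, 0, 3, 4), (3, 0, 3, 4)]
kept = filter_detections_by_depth(candidates, depth_map)
mask = foreground_mask(kept[0], depth_map)
```

In this toy input the second candidate box covers only far pixels and is discarded, while the mask for the surviving box marks the near pixels that would be kept as the body without a background.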
7. An immersive videoconference method allowing multiple participants in different locations to remotely interact with each other through a telecommunication network architecture, wherein the method comprises at the location of a given participant:
capturing video images of the participant by a pair of video cameras;
detecting, tracking and determining size and position related parameters of the participant in the video images;
generating a single elementary video stream related to the participant;
associating a room identifier to the elementary video stream, the room identifier being uniquely associated to the given participant;
sending the elementary video stream, the size and position related parameters and the room identifier to a centralized entity;
repeating the above for each participant at each different location;
wherein the method further comprises at the centralized entity:
creating a virtual room by combining the elementary video streams for all the participants;
staging the elementary video streams of all the participants in said virtual room and computing a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants; and
generating, for each participant, a single composite video stream of the virtual room that displays the 2D video of the other participants sized and positioned as if the participants were in the same virtual room based on the scene specification and a combination of the elementary video streams of the other participants;
wherein the scene specification comprises z-indexes of the elementary video streams describing whether an elementary video stream related to one participant is in front of or behind other elementary video streams related to the other participants in the virtual room, a 2D position of each video describing the position of each participant relative to a given point of view in the virtual room, and a zoom scale describing the proximity of one participant relative to another one.
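The three components of the scene specification named in claim 7 (z-index, 2D position, zoom scale) can be pictured as one record per elementary video stream. The sketch below is only an illustration under stated assumptions: the class and field names, the use of an estimated camera distance for depth ordering, the even left-to-right slot layout, and the height-normalising zoom are all hypothetical and are not taken from the claims. Body heights are assumed to be positive.

```python
from dataclasses import dataclass

@dataclass
class ParticipantParams:
    room_id: str          # room identifier uniquely tied to the participant
    body_height_px: int   # assumed size parameter: apparent body height
    distance_m: float     # assumed position parameter: distance to cameras

@dataclass
class StreamPlacement:
    room_id: str
    z_index: int          # higher value is drawn in front
    position: tuple       # (x, y) in the composite frame
    zoom: float           # scale factor applied to the elementary stream

def scene_specification(viewer_id, params, frame_w=1280, frame_h=720,
                        target_height_px=480):
    """Build the placements a given viewer sees: everyone except the viewer."""
    others = [p for p in params if p.room_id != viewer_id]
    # Assumption: farther participants get lower z-indexes (drawn behind).
    by_depth = sorted(others, key=lambda p: p.distance_m, reverse=True)
    z_of = {p.room_id: i for i, p in enumerate(by_depth)}
    slot_w = frame_w / max(len(others), 1)
    spec = []
    for i, p in enumerate(others):
        zoom = target_height_px / p.body_height_px  # normalise apparent size
        x = int(i * slot_w + slot_w / 2)            # horizontal slot centre
        y = frame_h - target_height_px              # top edge, feet on bottom
        spec.append(StreamPlacement(p.room_id, z_of[p.room_id], (x, y), zoom))
    return spec

params = [
    ParticipantParams("A", 240, 1.0),
    ParticipantParams("B", 480, 2.0),
    ParticipantParams("C", 960, 1.5),
]
spec_for_a = scene_specification("A", params)
```

One specification is computed per room identifier, so calling the function once per participant yields the per-viewer layouts from which the composite streams would be generated.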
8. An immersive videoconference system wherein multiple participants in different locations remotely interact with each other through a telecommunication network architecture, the immersive videoconference system comprising:
a pair of video cameras, at the location of each participant, arranged to capture video images of the participant;
a pretreatment module, at the location of each participant, comprising a depth map generator coupled to a tracker arranged to detect and track the participant in the video images, a body position calculator arranged to determine size and position related parameters of the participant in the video images, a video streamer arranged to generate a single elementary video stream related to the participant, and a room identifier requestor arranged to associate a room identifier to the elementary video stream; and
a virtual place building module, at a centralized location, comprising a staging director arranged to create a virtual room by combining the elementary video streams for all the participants, stage the elementary video streams of all the participants in said virtual room and compute a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants, and a video mixer arranged to generate, for each participant, a single composite video stream of the virtual room that displays the 2D video of the other participants sized and positioned as if the participants were in the same virtual room based on the scene specification and a combination of the elementary video streams of the other participants;
wherein the tracker is arranged to detect and track a body of the participant without a background from the video images based on a histogram of oriented gradients (HOG) algorithm for the purpose of human detection, and wherein results of said HOG algorithm are further filtered by a depth mapping matrix computed from a pair of video signals of the participant obtained from the pair of video cameras.
Dependent claims: 9.
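Claim 8 recites a depth map generator fed by the pair of cameras but does not say how depth is computed. One common approach for a rectified stereo pair, shown here purely as an assumed illustration, converts per-pixel disparity d to depth with Z = f * B / d, where f is the focal length in pixels and B the camera baseline. The function name, the input format, and the numeric values are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Convert a 2D list of disparities (pixels) to depths (metres).

    Uses the rectified-stereo relation Z = f * B / d. Pixels with no
    valid match (disparity <= 0) are reported as infinitely far.
    """
    return [[focal_px * baseline_m / d if d > 0 else float("inf")
             for d in row]
            for row in disparity_px]

# Hypothetical rig: 800 px focal length, 0.25 m between the two cameras.
disparity = [[100.0, 50.0],
             [0.0, 200.0]]
depth = depth_from_disparity(disparity, focal_px=800.0, baseline_m=0.25)
```

A matrix produced this way is the kind of depth mapping matrix that could then be used to filter the tracker's HOG detections.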
10. An immersive videoconference system wherein multiple participants in different locations remotely interact with each other through a telecommunication network architecture, the immersive videoconference system comprising:
a pair of video cameras, at the location of each participant, arranged to capture video images of the participant;
a pretreatment module, at the location of each participant, comprising a depth map generator coupled to a tracker arranged to detect and track the participant in the video images, a body position calculator arranged to determine size and position related parameters of the participant in the video images, a video streamer arranged to generate a single elementary video stream related to the participant, and a room identifier requestor arranged to associate a room identifier to the elementary video stream; and
a virtual place building module, at a centralized location, comprising a staging director arranged to create a virtual room by combining the elementary video streams for all the participants, stage the elementary video streams of all the participants in said virtual room and compute a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants, and a video mixer arranged to generate, for each participant, a single composite video stream of the virtual room that displays the 2D video of the other participants sized and positioned as if the participants were in the same virtual room based on the scene specification and a combination of the elementary video streams of the other participants;
wherein the scene specification comprises z-indexes of the elementary video streams describing whether an elementary video stream related to one participant is in front of or behind other elementary video streams related to the other participants in the virtual room, a 2D position of each video describing the position of each participant relative to a given point of view in the virtual room, and a zoom scale describing the proximity of one participant relative to another one.
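How a video mixer might apply the z-index, 2D position and zoom scale of claim 10 can be sketched with a back-to-front (painter's) composition on a toy character grid. This is an assumed illustration only: the layer dictionary keys, the use of `None` for background-removed pixels, and nearest-neighbour scaling are hypothetical choices, and zoom values are assumed to be positive.

```python
def composite(layers, width, height, background="."):
    """Draw layers back to front; None pixels are transparent background."""
    canvas = [[background] * width for _ in range(height)]
    for layer in sorted(layers, key=lambda l: l["z_index"]):
        src = layer["pixels"]
        zoom = layer["zoom"]
        x0, y0 = layer["position"]
        out_h = int(len(src) * zoom)
        out_w = int(len(src[0]) * zoom)
        for r in range(out_h):
            for c in range(out_w):
                # Nearest-neighbour lookup into the source stream frame.
                sr = min(int(r / zoom), len(src) - 1)
                sc = min(int(c / zoom), len(src[0]) - 1)
                px = src[sr][sc]
                y, x = y0 + r, x0 + c
                if px is not None and 0 <= y < height and 0 <= x < width:
                    canvas[y][x] = px
    return canvas

layers = [
    {"pixels": [["A", "A"], [None, "A"]], "position": (0, 0),
     "z_index": 0, "zoom": 1.0},
    {"pixels": [["B"]], "position": (1, 0), "z_index": 1, "zoom": 2.0},
]
frame = composite(layers, width=4, height=2)
rows = ["".join(row) for row in frame]
```

Because stream B has the higher z-index, it overwrites the overlapping part of stream A, which corresponds to one participant appearing in front of another in the virtual room.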
Specification