Immersive videoconference method and system

US 9,432,625 B2
Filed: 09/03/2013
Issued: 08/30/2016
Est. Priority Date: 09/28/2012
Status: Active Grant

Abstract

An immersive videoconference method wherein multiple participants (21, 22, 23, 24) in different locations (11, 12, 13) remotely interact with each other through a telecommunication network architecture (8, 31, 38), wherein the method comprises at the location (11, 12, 13) of a given participant (21, 22, 23, 24):
- capturing video images of the participant by a pair of video cameras (4A, 4B);
- detecting, tracking and determining size and position related parameters of the participant in the video images;
- generating a single elementary video stream related to the participant;
- associating a room identifier to the elementary video stream, the room identifier being uniquely associated to the given participant;
- sending the elementary video stream, the size and position related parameters and the room identifier (41A, 42A, 43A) to a centralized entity (30);
- repeating the above steps for each participant (21, 22, 23, 24) at the different location (11, 12, 13);

wherein the method further comprises at the centralized entity (30):
- creating a virtual room (70) by combining the elementary video streams (41A, 42A, 43A) for all the participants;
- staging the elementary video streams of all the participants in said virtual room and computing a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants; and
- generating, for each participant, a single composite video stream (41B, 42B, 43B) of the virtual room (70) that displays the 2D video of the other participants sized and positioned as if the participants (21, 22, 23, 24) were in the same virtual room (70) based on the scene specification and a combination of the elementary video streams of the other participants.

10 Claims

1. An immersive videoconference method allowing multiple participants in different locations to remotely interact with each other through a telecommunication network architecture, wherein the method comprises at the location of a given participant:
- capturing video images of the participant by a pair of video cameras;
  
  detecting, tracking and determining size and position related parameters of the participant in the video images;
  
  generating a single elementary video stream related to the participant;
  
  associating a room identifier to the elementary video stream, the room identifier being uniquely associated to the given participant;
  
  sending the elementary video stream, the size and position related parameters and the room identifier to a centralized entity;
  
  repeating the above for each participant at each different location;
  
  wherein the method further comprises at the centralized entity:
  
  creating a virtual room by combining the elementary video streams for all the participants;
  
  staging the elementary video streams of all the participants in said virtual room and computing a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants; and
  
  generating, for each participant, a single composite video stream of the virtual room that displays the 2D video of the other participants sized and positioned as if the participants were in the same virtual room based on the scene specification and a combination of the elementary video streams of the other participants;
  
  wherein detecting and tracking the participant in the video images comprises detecting and tracking a body of the participant without a background from the video images based on a histogram of oriented gradients (HOG) human detection algorithm, and wherein results of said HOG algorithm are further filtered by a depth mapping matrix computed from a pair of video signals of the participant obtained from the pair of video cameras.
- Dependent claims (2, 3, 4, 5, 6)
  - 2. The immersive videoconference method of claim 1, wherein the depth mapping matrix is computed based on a pinhole camera model.
  - 3. The immersive videoconference method according to claim 1, wherein detecting and tracking the participant in the video images comprises determining a 3D position of the participant relatively to a position of one of the video cameras based on a binary mask image and the depth mapping matrix.
  - 4. The immersive videoconference method according to claim 1, wherein generating the elementary video stream comprises encoding images of the elementary video stream with a textured mask, the elementary video stream being a Red Green Blue and Alpha video stream with alpha being the level of transparency.
  - 5. The immersive videoconference method according to claim 1, wherein generating one composite video stream for the participant comprises translating, zooming and superimposing the elementary video streams received from the other participants based on the scene specification.
  - 6. The immersive videoconference method according to claim 1, wherein the method further comprises only publishing and displaying said single composite video stream to an appropriate participant based on the corresponding unique room identifier.
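Claims 1 to 3 combine a person detector with a depth matrix derived from the stereo pair. The following is a minimal sketch of that idea, not the patented implementation. It assumes the HOG detector has already produced bounding boxes as `(x, y, w, h)` tuples, that a per-pixel disparity map is available, and that the standard rectified-stereo pinhole relation Z = f * B / d applies. The function names, the median-depth acceptance rule and the `near_m`/`far_m` thresholds are hypothetical choices made for illustration.

```python
from statistics import median

def depth_from_disparity(disparity, focal_px, baseline_m):
    """Pinhole stereo relation Z = f * B / d, applied per pixel.
    Pixels without a stereo match (disparity <= 0) are marked None."""
    return [[focal_px * baseline_m / d if d > 0 else None for d in row]
            for row in disparity]

def filter_detections(boxes, depth, near_m, far_m):
    """Keep detector boxes whose median depth lies in [near_m, far_m].
    Assumed rule: a real participant stands within a known distance range."""
    kept = []
    for (x, y, w, h) in boxes:
        values = [depth[r][c]
                  for r in range(y, y + h)
                  for c in range(x, x + w)
                  if depth[r][c] is not None]
        if values and near_m <= median(values) <= far_m:
            kept.append((x, y, w, h))
    return kept

def position_3d(mask, depth, focal_px, cx, cy):
    """Back-project the centroid of a binary mask into camera coordinates:
    X = (u - cx) * Z / f, Y = (v - cy) * Z / f, with Z the median mask depth."""
    pixels = [(r, c)
              for r, row in enumerate(mask)
              for c, on in enumerate(row)
              if on and depth[r][c] is not None]
    if not pixels:
        return None
    z = median(depth[r][c] for r, c in pixels)
    u = sum(c for _, c in pixels) / len(pixels)  # mean column
    v = sum(r for r, _ in pixels) / len(pixels)  # mean row
    return ((u - cx) * z / focal_px, (v - cy) * z / focal_px, z)
```

Using the median rather than the mean makes the acceptance test less sensitive to background pixels that fall inside a bounding box, which is one reason a depth filter can reject false positives from a purely appearance-based detector.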

7. An immersive videoconference method allowing multiple participants in different locations to remotely interact with each other through a telecommunication network architecture, wherein the method comprises at the location of a given participant:
- capturing video images of the participant by a pair of video cameras;
  
  detecting, tracking and determining size and position related parameters of the participant in the video images;
  
  generating a single elementary video stream related to the participant;
  
  associating a room identifier to the elementary video stream, the room identifier being uniquely associated to the given participant;
  
  sending the elementary video stream, the size and position related parameters and the room identifier to a centralized entity;
  
  repeating the above for each participant at each different location;
  
  wherein the method further comprises at the centralized entity:
  
  creating a virtual room by combining the elementary video streams for all the participants;
  
  staging the elementary video streams of all the participants in said virtual room and computing a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants; and
  
  generating, for each participant, a single composite video stream of the virtual room that displays the 2D video of the other participants sized and positioned as if the participants were in the same virtual room based on the scene specification and a combination of the elementary video streams of the other participants;
  
  wherein the scene specification comprises z-indexes of the elementary video streams describing whether an elementary video stream related to one participant is in front or behind other elementary video streams related to the other participants in the virtual room, a 2D position of each video describing the positions of each participant relatively to a given point of view in the virtual room, and a zoom scale describing the proximity of one participant relatively to another one.
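The scene specification of claim 7 (z-index, 2D position, zoom scale) and the translate, zoom and superimpose steps of claim 5 can be illustrated with a small compositor. This is a sketch under stated assumptions: frames are rows of `(r, g, b, a)` tuples with alpha in 0..255, a larger z-index means "in front", zooming uses nearest-neighbour sampling, and the `Placement` type and function names are hypothetical rather than taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class Placement:
    z_index: int   # assumed convention: larger values are drawn in front
    x: int         # column of the top-left corner in the composite
    y: int         # row of the top-left corner in the composite
    zoom: float    # scale factor applied to the elementary frame

def zoom_frame(frame, zoom):
    """Nearest-neighbour resize of a frame given as rows of RGBA tuples."""
    src_h, src_w = len(frame), len(frame[0])
    dst_h = max(1, round(src_h * zoom))
    dst_w = max(1, round(src_w * zoom))
    return [[frame[min(src_h - 1, int(r / zoom))][min(src_w - 1, int(c / zoom))]
             for c in range(dst_w)]
            for r in range(dst_h)]

def composite(width, height, layers, background=(0, 0, 0)):
    """Superimpose (frame, Placement) pairs back to front with alpha blending.
    Returns the composite as rows of (r, g, b) tuples."""
    canvas = [[background for _ in range(width)] for _ in range(height)]
    for frame, place in sorted(layers, key=lambda item: item[1].z_index):
        for r, row in enumerate(zoom_frame(frame, place.zoom)):
            for c, (red, green, blue, alpha) in enumerate(row):
                yy, xx = place.y + r, place.x + c
                if not (0 <= yy < height and 0 <= xx < width):
                    continue  # translated outside the virtual room view
                a = alpha / 255
                canvas[yy][xx] = tuple(
                    round(a * src + (1 - a) * dst)
                    for src, dst in zip((red, green, blue), canvas[yy][xx]))
    return canvas
```

To build the composite for one participant, the caller would pass only the elementary frames of the other participants, since the claims describe each composite stream as showing "the 2D video of the other participants".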

8. An immersive videoconference system wherein multiple participants in different locations remotely interact with each other through a telecommunication network architecture, the immersive videoconference system comprising:
- a pair of video cameras, at the location of each participant, arranged to capture video images of the participant;
  
  a pretreatment module, at the location of each participant, comprising a depth map generator coupled to a tracker arranged to detect and track the participant in the video images, a body position calculator arranged to determine size and position related parameters of the participant in the video images, a video streamer arranged to generate a single elementary video stream related to the participant, and a room identifier requestor arranged to associate a room identifier to the elementary video stream; and
  
  a virtual place building module, at a centralized location, comprising a staging director arranged to create a virtual room by combining the elementary video streams for all the participants, stage the elementary video streams of all the participants in said virtual room and compute a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants, and a video mixer arranged to generate, for each participant, a single composite video stream of the virtual room that displays the 2D video of the other participants sized and positioned as if the participants were in the same virtual room based on the scene specification and a combination of the elementary video streams of the other participants;
  
  wherein the tracker is arranged to detect and track a body of the participant without a background from the video images based on a histogram of oriented gradients (HOG) human detection algorithm, and wherein results of said HOG algorithm are further filtered by a depth mapping matrix computed from a pair of video signals of the participant obtained from the pair of video cameras.
- Dependent claims (9)
  - 9. The immersive videoconference system of claim 8, wherein the virtual place building module further comprises a video server arranged to publish the composite video streams of the participants, each video stream being associated with a room identifier uniquely associated to the given participant.

10. An immersive videoconference system wherein multiple participants in different locations remotely interact with each other through a telecommunication network architecture, the immersive videoconference system comprising:
- a pair of video cameras, at the location of each participant, arranged to capture video images of the participant;
  
  a pretreatment module, at the location of each participant, comprising a depth map generator coupled to a tracker arranged to detect and track the participant in the video images, a body position calculator arranged to determine size and position related parameters of the participant in the video images, a video streamer arranged to generate a single elementary video stream related to the participant, and a room identifier requestor arranged to associate a room identifier to the elementary video stream; and
  
  a virtual place building module, at a centralized location, comprising a staging director arranged to create a virtual room by combining the elementary video streams for all the participants, stage the elementary video streams of all the participants in said virtual room and compute a scene specification associated to the room identifier of each participant based on the size and position related parameters of all the participants, and a video mixer arranged to generate, for each participant, a single composite video stream of the virtual room that displays the 2D video of the other participants sized and positioned as if the participants were in the same virtual room based on the scene specification and a combination of the elementary video streams of the other participants;
  
  wherein the scene specification comprises z-indexes of the elementary video streams describing whether an elementary video stream related to one participant is in front or behind other elementary video streams related to the other participants in the virtual room, a 2D position of each video describing the positions of each participant relatively to a given point of view in the virtual room, and a zoom scale describing the proximity of one participant relatively to another one.
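The claims state that the staging director computes one scene specification per room identifier from the size and position related parameters of all participants, but they do not give a formula. The sketch below shows one plausible mapping, purely as an assumption: participants closer to their cameras receive a higher z-index and a larger zoom, and the lateral offset maps linearly to a horizontal pixel position. The parameter keys (`x_m`, `depth_m`), the defaults and the function name are hypothetical.

```python
def scene_specification(params, viewer_id, reference_depth_m=2.0,
                        pixels_per_m=200, canvas_width=1280):
    """Build the scene specification seen by one viewer.

    params maps room identifier -> {"x_m": lateral offset in metres,
                                    "depth_m": distance to the camera in metres}.
    The viewer's own stream is excluded, matching the claim wording that each
    composite shows the other participants.
    """
    others = {rid: p for rid, p in params.items() if rid != viewer_id}
    # Farthest first, so that the nearest participant gets the highest z-index.
    ordered = sorted(others, key=lambda rid: others[rid]["depth_m"], reverse=True)
    spec = {}
    for z_index, rid in enumerate(ordered):
        p = others[rid]
        spec[rid] = {
            "z_index": z_index,
            "x": round(canvas_width / 2 + p["x_m"] * pixels_per_m),
            "zoom": reference_depth_m / p["depth_m"],
        }
    return spec
```

Computing a separate specification per viewer keeps the data flow aligned with the claims, where both the scene specification and the composite stream are tied to a participant's room identifier.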

Current Assignee: RPX Corporation
Original Assignee: Alcatel-Lucent SA (Nokia Corporation)
Inventors: Delegue, Gerard; Bouche, Nicolas
Primary Examiner(s): Woo, Stella L
Application Number: US14/431,101
Publication Number: US 20150244987A1
Time in Patent Office: 1,092 Days
Field of Search: 348/14.01-14.16, 715/756-757
US Class Current: 1/1
CPC Class Codes:
- H04M 3/567: Multimedia conference systems
- H04N 13/239: using two 2D image sensors ...
- H04N 2213/003: Aspects relating to the "2D...
- H04N 7/157: defining a virtual conferen...
