Method for generating an immersive video of a plurality of persons

US 9,729,825 B2
Filed: 06/20/2014
Issued: 08/08/2017
Est. Priority Date: 07/09/2013
Status: Active Grant

Abstract

A method for generating an immersive video of a plurality of persons in a computer server, the method comprising: receiving a plurality of video streams on a plurality of video channels from a plurality of client devices, each video stream including a silhouette of a person; extracting the person's silhouette from each incoming video stream to generate a filtered video stream representing only the silhouette of the person; and generating a video stream of a virtual scene comprising a plurality of the silhouettes extracted from the plurality of filtered video streams; the method further comprising: receiving silhouette data carried on a metadata channel from a client device in addition to the video stream, the silhouette data representing a position of a face of the person within a frame of the video stream; analyzing the silhouette data; and performing the silhouette extraction as a function of the silhouette data analyzed.

15 Claims

1. A method for generating an immersive video of a plurality of persons in a computer server, the method comprising:
- receiving a plurality of video streams on a plurality of video channels from a plurality of client devices, each video stream including a silhouette of a person;
  
  analyzing the video stream received from a client device to detect a silhouette metadata channel carrying silhouette data, the silhouette data being selected from the group consisting of coarse silhouette information representing coordinates of a characteristic point of the face of the person and fine silhouette information representing a mask corresponding to the shape and location of the person within a frame of the video stream;
  
  analyzing the silhouette data when a silhouette metadata channel has been detected;
  
  in the case that a silhouette metadata channel is detected, extracting the person's silhouette as a function of the silhouette data and generating a filtered video stream representing only the silhouette of the person;
  
  in the case that a silhouette metadata channel has not been detected, generating in the computer server silhouette data from the video stream received and extracting the person's silhouette from the incoming video stream as a function of the silhouette data generated; and
  
  generating a video stream of a virtual scene comprising a plurality of the silhouettes extracted from the plurality of filtered video streams.
- - 2. The method according to claim 1, further including:
    - listening for CPU allocation requests from the client devices, wherein a CPU allocation request defines a type of the silhouette data to be sent by the client device; and
      
      determining an available CPU capacity of the computer server in order to reply to a requesting first client device according to the CPU load.
  - 3. The method according to claim 2, further including accepting the allocation request from the first client device, allocating CPU capacity to the first client device and registering an ID of the first client device.
  - 4. The method according to claim 3 further including:
    - sending a message to the first client device to ask if the first client device has CPU capacity to generate fine silhouette information;
      
      unregistering said registered first client device in response to receiving a positive reply from the first client device;
      
      allocating CPU capacity freed from the first client device to perform the silhouette extraction for a requesting second client device; and
      
      registering the requesting second client device.
  - 5. The method according to claim 3 further including:
    - sending a message to the first client device to ask if the first client device has CPU capacity to generate coarse silhouette information;
      
      discontinuing the extraction of coarse information for the first client device in response to receiving a positive reply from the first client device;
      
      allocating CPU capacity freed from the first client device to perform the silhouette extraction for a requesting second client device; and
      
      registering the requesting second client device.
  - 6. The method according to claim 1, wherein the coarse silhouette information comprises coordinates of a face center, a height and a width of the face.
  - 7. The method according to claim 1, wherein the received silhouette data comprises coarse silhouette information, the method further comprising transmitting the coarse silhouette information to a silhouette extraction processing block which computes the filtered video stream of the silhouette as a function of the coarse silhouette information and the incoming video stream.
  - 8. The method according to claim 7, wherein the silhouette extraction processing block applies a prediction algorithm to perform the silhouette extraction according to coarse silhouette information.
  - 9. The method according to claim 1, wherein the received silhouette data comprises the fine silhouette information representing the mask, the method further including comparing the fine silhouette information to a threshold to detect the mask and applying the mask to the corresponding frame of the incoming video stream to obtain the filtered video stream representing only the silhouette of the person.
  - 10. The method according to claim 9, wherein the fine silhouette information is a picture composed from pixels of a first color and/or pixels of a second color, the method further including:
    - comparing a value for each pixel in the picture to a mean value; and
      
      attributing a standard value to each pixel as a function of a result of the comparison, wherein the standard value is a first standard value attributed if the pixel value is lower than the mean value and a second standard value attributed if the pixel value is higher than the mean value.
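
The mask handling recited in claims 9 and 10 can be illustrated with a minimal sketch. It assumes an 8-bit grayscale mask picture held as nested lists, standard values of 0 and 255, and that the higher standard value marks the person; none of these choices is stated in the claims. The function names `binarize_mask` and `apply_mask` are hypothetical, and pixels exactly equal to the mean (a case the claim does not address) are treated here as "higher".

```python
from statistics import mean


def binarize_mask(picture, low=0, high=255):
    """Compare each pixel to the picture mean and assign a standard value.

    Pixels below the mean get `low`; all others get `high`.
    Treating pixels equal to the mean as `high` is an assumption.
    """
    flat = [p for row in picture for p in row]
    m = mean(flat)
    return [[low if p < m else high for p in row] for row in picture]


def apply_mask(frame, mask, keep=255, background=(0, 0, 0)):
    """Keep frame pixels where the mask equals `keep`; blank the rest."""
    return [
        [px if mk == keep else background for px, mk in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, mask)
    ]


if __name__ == "__main__":
    # A noisy 2x2 mask, e.g. after lossy compression of a two-color picture.
    noisy_mask = [[10, 240], [250, 5]]
    clean_mask = binarize_mask(noisy_mask)
    frame = [[(1, 1, 1), (2, 2, 2)], [(3, 3, 3), (4, 4, 4)]]
    print(clean_mask)
    print(apply_mask(frame, clean_mask))
```

Thresholding against the mean restores a two-valued mask even when transmission has blurred the two colors, which is presumably why claim 10 recites the comparison before the mask is applied.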

11. A method of generating a video stream in a client device including:
- capturing a video stream from a capture device;
  
  performing a silhouette extraction process from the captured video stream to generate silhouette data;
  
  encoding the silhouette data generated by the silhouette extraction process; and
  
  sending the video stream in a four-channel video format including three channels of raw video data using a color space for the video stream and one metadata channel for the silhouette data generated by the silhouette extraction process.
- - 12. The method according to claim 11, wherein the metadata channel is selected from the group consisting of an alpha channel, an RTP extension channel and an RTP channel.
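
As a rough illustration of the four-channel format in claims 11 and 12, the sketch below interleaves three color planes with a fourth plane carrying the silhouette mask, in the manner of an alpha channel. The YUV color space, 8-bit samples, full-resolution planes and interleaved byte layout are all assumptions made for the example, and `pack_yuva` and `unpack_yuva` are hypothetical names; the claims do not prescribe a wire format.

```python
def pack_yuva(y, u, v, alpha):
    """Interleave three color planes and one mask plane, 4 bytes per pixel."""
    if not (len(y) == len(u) == len(v) == len(alpha)):
        raise ValueError("all four planes must have the same length")
    out = bytearray()
    for pixel in zip(y, u, v, alpha):
        out.extend(pixel)
    return bytes(out)


def unpack_yuva(buf):
    """Split an interleaved buffer back into its four planes."""
    if len(buf) % 4:
        raise ValueError("buffer length must be a multiple of 4")
    return tuple(bytes(buf[i::4]) for i in range(4))


if __name__ == "__main__":
    y, u, v = bytes([16, 32]), bytes([128, 128]), bytes([120, 130])
    mask = bytes([255, 0])  # first pixel belongs to the person, second does not
    packed = pack_yuva(y, u, v, mask)
    print(packed.hex())
    print(unpack_yuva(packed) == (y, u, v, mask))
```

A receiver that ignores the fourth plane still gets a usable picture, while a server that understands it can skip its own silhouette extraction.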

13. A video processing server including:
- a video decoder block able to receive video streams from a plurality of client devices and to generate respective decoded video streams;
  
  a client processing detection block able to detect a metadata channel in an incoming video stream;
  
  a first silhouette extraction sub-processing block able to perform a silhouette extraction process in a respective decoded video stream to generate coarse silhouette information;
  
  a second silhouette extraction sub-processing block able to perform a silhouette extraction process to generate fine silhouette information as a function of coarse silhouette information and the respective decoded video stream; and
  
  an immersive rendering block able to use fine silhouette data to extract silhouette video streams of respective persons from respective video streams and mix a plurality of silhouette video streams to generate a virtual scene;
  
  wherein the client processing detection block is able to transmit to the video decoder block instructions to send the decoded video stream:
  
  to the first silhouette extraction sub-processing block when no metadata channel is detected or when the metadata channel does not contain any silhouette data;
  
  to the second silhouette extraction sub-processing block when a metadata channel is detected and contains coarse silhouette information, the coarse silhouette information representing coordinates of a location of a person within a frame of the video stream; and
  
  to the immersive rendering block when a metadata channel is detected and contains fine silhouette information representing a mask corresponding to the shape and location of the person within a frame of the video stream;
  
  the video processing server further including:
  
  a video encoder block able to encode and send a video stream comprising the virtual scene to a client device.
- - 14. The video processing server according to claim 13 further including a resource manager module including:
    - a listener module adapted to listen for CPU requests for extraction;
      
      a CPU capacity determination module able to determine a CPU capacity available in the video processing server;
      
      a communication module able to send messages to registered client devices for which the video processing server performs silhouette extraction; and
      
      a memory able to register a client device.
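
The resource manager of claim 14, together with the allocation steps of claims 2 to 5, might be sketched as follows. The abstract "capacity units", the class name `ResourceManager`, its method names and the callback standing in for the message exchange with the client are all hypothetical; the claims say only that requests are accepted according to CPU load and that freed capacity is reallocated to a second client.

```python
class ResourceManager:
    """Tracks which clients the server performs silhouette extraction for."""

    def __init__(self, capacity):
        self.capacity = capacity   # total CPU capacity, in arbitrary units
        self.registered = {}       # client id -> units allocated

    @property
    def available(self):
        """CPU capacity not yet allocated to any registered client."""
        return self.capacity - sum(self.registered.values())

    def request(self, client_id, units):
        """Accept and register the client if enough capacity is free."""
        if units <= self.available:
            self.registered[client_id] = units
            return True
        return False

    def offload(self, first_id, can_self_extract, second_id, units):
        """Ask the first client to do its own extraction; on a positive
        reply, unregister it and give the freed capacity to the second."""
        if first_id not in self.registered or not can_self_extract(first_id):
            return False
        del self.registered[first_id]
        return self.request(second_id, units)


if __name__ == "__main__":
    manager = ResourceManager(capacity=10)
    print(manager.request("client-a", 8))   # accepted
    print(manager.request("client-b", 5))   # refused: only 2 units free
    # client-a replies that it can generate the silhouette data itself.
    print(manager.offload("client-a", lambda cid: True, "client-b", 5))
    print(manager.registered)
```

In this sketch the type of silhouette data named in the allocation request is reduced to a single cost figure; a fuller model would presumably charge more for fine extraction than for coarse.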

15. A method for generating an immersive video in a computer server, the method comprising:
- receiving a video stream from a client device of an associated user;
  
  determining whether the video stream includes fine silhouette information representing a mask corresponding to the silhouette of the user suitable for identifying pixels in the video stream to be used in generation of the immersive video, coarse silhouette information representing face detection of the user suitable for aiding the generation of fine silhouette information, or no silhouette information describing a silhouette of the user;
  
  under the condition that the video stream includes fine silhouette information;
  
  using the received fine silhouette information to select pixels in the video stream to be used in generating the immersive video;
  
  under the condition that the video stream includes coarse silhouette information;
  
  processing the video stream guided by the coarse silhouette information to generate fine silhouette information;
  
  under the condition that the video stream includes no information describing a silhouette of the user;
  
  processing the video stream to generate fine silhouette information;
  
  using the generated fine silhouette information to select pixels in the video stream to be used in generation of the immersive video; and
  
  generating a video stream of a virtual scene including representations based on the selected pixels.
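
The three-way branching in claims 13 and 15 can be summarized in a short sketch: the less silhouette information a client supplies, the more processing stages the server runs. The metadata dictionary with `"mask"` and `"face"` keys, the stage names and both function names are hypothetical, chosen only to make the routing concrete.

```python
from enum import Enum


class SilhouetteInfo(Enum):
    NONE = "none"      # no metadata channel, or channel without silhouette data
    COARSE = "coarse"  # face position only
    FINE = "fine"      # full mask


def classify(metadata):
    """Decide which kind of silhouette data, if any, came with the stream."""
    if not metadata:
        return SilhouetteInfo.NONE
    if "mask" in metadata:
        return SilhouetteInfo.FINE
    if "face" in metadata:
        return SilhouetteInfo.COARSE
    return SilhouetteInfo.NONE


def processing_path(info):
    """List the server-side stages a decoded stream passes through."""
    if info is SilhouetteInfo.FINE:
        return ["immersive_rendering"]
    if info is SilhouetteInfo.COARSE:
        return ["fine_extraction", "immersive_rendering"]
    return ["coarse_extraction", "fine_extraction", "immersive_rendering"]


if __name__ == "__main__":
    streams = [None, {"face": (320, 180, 90, 120)}, {"mask": b"\xff\x00"}]
    for metadata in streams:
        print(processing_path(classify(metadata)))
```

The coarse example above follows claim 6 in carrying a face center plus a width and height, though the tuple ordering is an assumption.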

Current Assignee
RPX Corporation
Original Assignee
Alcatel-Lucent SA (Nokia Corporation)
Inventors
Fadili, Moulay; Abou-Chakra, Rabih
Primary Examiner(s)
Gauthier, Gerald

Application Number

US14/903,638
Publication Number

US 20160150187A1
Time in Patent Office

1,145 Days
Field of Search

345/419, 348/14.01, 348/14.02, 348/14.03, 348/14.07, 348/14.08, 348/14.09, 348/14.1, 348/14.12, 348/239, 348/571, 348/14.13, 348/152, 348/222.1, 370/261, 381/92, 463/17, 704/270, 705/3, 709/204, 715/719, 382/103, 382/115, 382/118, 382/190, 382/218, 396/18, 707/706
US Class Current
CPC Class Codes

H04N 7/147   Communication arrangements,...

H04N 7/15   Conference systems

H04N 7/157   defining a virtual conferen...
