Systems and methods for real-time virtual-reality immersive multimedia communications
First Claim
1. A method, comprising:
accepting a plurality of audio and video streams from a plurality of video conference endpoints, wherein each of the video conference endpoints is associated with one of a plurality of participants to a video conference, and wherein the video conference endpoints are of different types;
automatically determining the required type of conversion between the plurality of endpoints based on video conference service provider-specific combinations of video encoding format, audio encoding format, video encoding profile, video encoding level, communication protocol, video resolution, screen ratio, bitrate for an audio stream, bitrate for a video stream, encryption standard, and acoustic consideration of the respective video conference feeds of each of the plurality of video conference endpoints;
for each of the participants to the video conference, (i) converting and composing, in real-time, the plurality of audio and video streams into a composite audio and video stream which is compatible with all of the different video conference endpoints, and (ii) rendering the composite audio and video stream at each of the different video conference endpoints;
for each of the composite audio and video streams, (i) building a composite metadata field from a metadata field associated with each of the video streams and (ii) utilizing information from the composite metadata field to transcode and process the composite audio and video stream;
enabling the video conference in real-time among the participants; and
supporting real-time human translator-free multimedia communications during the video conference by simultaneously translating, in real-time, the audio streams between different languages into a preferred language of each participant to the video conference and simultaneously providing the translations to each participant in only the preferred language of the respective participant so as to allow the participants to speak to each other in their respective preferred languages during the video conference.
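The "automatically determining the required type of conversion" step can be illustrated as a field-by-field comparison of endpoint capability profiles: any attribute on which two endpoints disagree is a conversion the infrastructure must perform. This is only a minimal sketch; the class name `EndpointProfile`, its field names, and the sample values are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass, fields

# Hypothetical endpoint capability profile; fields mirror the parameter
# categories listed in the claim (codec, profile, protocol, resolution,
# screen ratio, bitrates, encryption), with invented names.
@dataclass(frozen=True)
class EndpointProfile:
    video_codec: str
    audio_codec: str
    video_profile: str
    video_level: str
    protocol: str
    resolution: str
    screen_ratio: str
    audio_bitrate_kbps: int
    video_bitrate_kbps: int
    encryption: str

def required_conversions(src: EndpointProfile, dst: EndpointProfile) -> list[str]:
    """List every attribute that differs between two endpoints and
    therefore needs transcoding, re-scaling, or protocol translation."""
    return [f.name for f in fields(EndpointProfile)
            if getattr(src, f.name) != getattr(dst, f.name)]

# Two illustrative endpoints of different types: a room system and a browser.
room = EndpointProfile("H.264", "AAC", "High", "4.1", "SIP",
                       "1920x1080", "16:9", 128, 2000, "AES-128")
browser = EndpointProfile("VP8", "Opus", "High", "4.1", "WebRTC",
                          "1280x720", "16:9", 64, 1000, "DTLS-SRTP")

print(required_conversions(room, browser))
```

Endpoints with identical profiles yield an empty list, i.e. no conversion is required for that pair.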
Abstract
A new approach is proposed that contemplates systems and methods to support the operation of a Virtual Media Room or Virtual Meeting Room (VMR), wherein each VMR can accept from a plurality of participants at different geographic locations a variety of video conferencing feeds of audio and video streams from video conference endpoints. The approach further utilizes virtual-reality and augmented-reality techniques to transform the video and audio streams from the participants in various customizable ways to achieve a rich set of user experiences. A globally distributed infrastructure supports the sharing of the event among the participants at geographically distributed locations through a plurality of MCUs (Multipoint Control Units), each configured to process the plurality of audio and video streams from the plurality of video conference endpoints in real time.
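The globally distributed MCU infrastructure described in the abstract implies assigning each participant's feed to a nearby MCU. A minimal sketch of that assignment, assuming hypothetical MCU site names and coordinates and a crude planar distance rather than a true great-circle computation:

```python
import math

# Hypothetical (lat, lon) coordinates for globally distributed MCUs;
# the site names and positions are illustrative only.
MCUS = {
    "us-west": (37.4, -122.1),
    "eu-central": (50.1, 8.7),
    "ap-south": (19.1, 72.9),
}

def nearest_mcu(lat: float, lon: float) -> str:
    """Pick the MCU site closest to a participant's location."""
    def dist(site: str) -> float:
        slat, slon = MCUS[site]
        # crude planar approximation; a real system would use
        # great-circle distance or measured network latency
        return math.hypot(slat - lat, slon - lon)
    return min(MCUS, key=dist)

print(nearest_mcu(48.9, 2.4))  # participant near Paris
```

A production system would more likely route on measured round-trip time than on geography, but the shape of the decision is the same.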
6 Claims
1. A method, comprising:
accepting a plurality of audio and video streams from a plurality of video conference endpoints, wherein each of the video conference endpoints is associated with one of a plurality of participants to a video conference, and wherein the video conference endpoints are of different types;
automatically determining the required type of conversion between the plurality of endpoints based on video conference service provider-specific combinations of video encoding format, audio encoding format, video encoding profile, video encoding level, communication protocol, video resolution, screen ratio, bitrate for an audio stream, bitrate for a video stream, encryption standard, and acoustic consideration of the respective video conference feeds of each of the plurality of video conference endpoints;
for each of the participants to the video conference, (i) converting and composing, in real-time, the plurality of audio and video streams into a composite audio and video stream which is compatible with all of the different video conference endpoints, and (ii) rendering the composite audio and video stream at each of the different video conference endpoints;
for each of the composite audio and video streams, (i) building a composite metadata field from a metadata field associated with each of the video streams and (ii) utilizing information from the composite metadata field to transcode and process the composite audio and video stream;
enabling the video conference in real-time among the participants; and
supporting real-time human translator-free multimedia communications during the video conference by simultaneously translating, in real-time, the audio streams between different languages into a preferred language of each participant to the video conference and simultaneously providing the translations to each participant in only the preferred language of the respective participant so as to allow the participants to speak to each other in their respective preferred languages during the video conference. - View Dependent Claims (2, 3, 4)
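The translation limitation above, where each participant receives audio only in their own preferred language, can be sketched as a simple per-participant fan-out. Here `translate` is a hypothetical stand-in for a real speech-recognition and machine-translation pipeline, and all names are illustrative:

```python
# Stand-in for a real speech/MT pipeline; tags the text so the routing
# behavior is visible in the output.
def translate(text: str, src: str, dst: str) -> str:
    return text if src == dst else f"[{src}->{dst}] {text}"

def fan_out(speaker: str, text: str, prefs: dict[str, str]) -> dict[str, str]:
    """Deliver one utterance to every other participant, each in
    that participant's preferred language only."""
    src = prefs[speaker]
    return {p: translate(text, src, lang)
            for p, lang in prefs.items() if p != speaker}

prefs = {"alice": "en", "bjorn": "sv", "chie": "ja"}
print(fan_out("alice", "hello", prefs))
```

Note the speaker is excluded from the fan-out, matching the claim's requirement that translations go to the other participants in their respective preferred languages.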
5. A non-transitory machine-readable storage medium comprising software instructions that, when executed by a processor, cause the processor to:
accept a plurality of audio and video streams from a plurality of video conference endpoints, wherein each of the video conference endpoints is associated with one of a plurality of participants to a video conference, and wherein the video conference endpoints are of different types;
automatically determine the required type of conversion between the plurality of endpoints based on video conference service provider-specific combinations of video encoding format, audio encoding format, video encoding profile, video encoding level, communication protocol, video resolution, screen ratio, bitrate for an audio stream, bitrate for a video stream, encryption standard, and acoustic consideration of the respective video conference feeds of each of the plurality of video conference endpoints;
for each of the participants to the video conference, (i) convert and compose, in real-time, the plurality of audio and video streams into a composite audio and video stream which is compatible with all of the different video conference endpoints, and (ii) render the composite audio and video stream at each of the different video conference endpoints;
for each of the composite audio and video streams, (i) build a composite metadata field from a metadata field associated with each of the video streams and (ii) utilize information from the composite metadata field to transcode and process the composite audio and video stream;
enable the video conference in real-time among the participants; and
support real-time human translator-free multimedia communications during the video conference by simultaneously translating, in real-time, the audio streams between different languages into a preferred language of each participant to the video conference and simultaneously providing the translations to each participant in only the preferred language of the respective participant so as to allow the participants to speak to each other in their respective preferred languages during the video conference.
- View Dependent Claims (6)
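The composite-metadata clause of the claims, building one metadata field from the metadata of each constituent stream and using it to drive transcoding, can be sketched as a merge over per-stream records. The dictionary keys and the lowest-common-resolution policy below are invented for illustration and are not specified by the patent:

```python
def composite_metadata(streams: list[dict]) -> dict:
    """Merge per-stream metadata into one composite record; a downstream
    transcoder can read it to pick parameters every endpoint supports."""
    return {
        "sources": [s["id"] for s in streams],
        # choose the lowest common resolution so no endpoint must upscale
        "target_height": min(s["height"] for s in streams),
        "codecs_seen": sorted({s["codec"] for s in streams}),
    }

streams = [
    {"id": "ep1", "height": 1080, "codec": "H.264"},
    {"id": "ep2", "height": 720, "codec": "VP8"},
]
print(composite_metadata(streams))
```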
Specification