Systems and methods for real-time virtual-reality immersive multimedia communications
First Claim
1. A method, comprising:
accepting a plurality of audio and video streams from a plurality of video conference endpoints, wherein each of the video conference endpoints is associated with one of a plurality of participants to a video conference, and wherein the video conference endpoints are of different types;
automatically determining the required type of conversion between the plurality of endpoints based on video conference service provider-specific combinations of video encoding format, audio encoding format, video encoding profile, video encoding level, communication protocol, video resolution, screen ratio, bitrate for an audio stream, bitrate for a video stream, encryption standard, and acoustic consideration of the respective video conference feeds of each of the plurality of video conference endpoints;
for each of the participants to the video conference, (i) converting and composing, in real-time, the plurality of audio and video streams into a composite audio and video stream which is compatible with all of the different video conference endpoints, and (ii) rendering the composite audio and video stream at each of the different video conference endpoints;
for each of the composite audio and video streams, (i) building a composite metadata field from a metadata field associated with each of the video streams and (ii) utilizing information from the composite metadata field to transcode and process the composite audio and video stream;
enabling the video conference in real-time among the participants; and
supporting real-time human translator-free multimedia communications during the video conference by simultaneously translating, in real-time, the audio streams between different languages into a preferred language of each participant to the video conference and simultaneously providing the translations to each participant in only the preferred language of the respective participant so as to allow the participants to speak to each other in their respective preferred languages during the video conference.
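The "automatically determining the required type of conversion" step can be illustrated as a field-by-field comparison of endpoint capability profiles: any attribute on which two endpoints disagree is a conversion the infrastructure must perform. This is only a minimal sketch; the class name `EndpointProfile`, its field names, and the sample values are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass, fields

# Hypothetical endpoint capability profile; fields mirror the parameter
# categories listed in the claim (codec, profile, protocol, resolution,
# screen ratio, bitrates, encryption), with invented names.
@dataclass(frozen=True)
class EndpointProfile:
    video_codec: str
    audio_codec: str
    video_profile: str
    video_level: str
    protocol: str
    resolution: str
    screen_ratio: str
    audio_bitrate_kbps: int
    video_bitrate_kbps: int
    encryption: str

def required_conversions(src: EndpointProfile, dst: EndpointProfile) -> list[str]:
    """List every attribute that differs between two endpoints and
    therefore needs transcoding, re-scaling, or protocol translation."""
    return [f.name for f in fields(EndpointProfile)
            if getattr(src, f.name) != getattr(dst, f.name)]

# Two illustrative endpoints of different types: a room system and a browser.
room = EndpointProfile("H.264", "AAC", "High", "4.1", "SIP",
                       "1920x1080", "16:9", 128, 2000, "AES-128")
browser = EndpointProfile("VP8", "Opus", "High", "4.1", "WebRTC",
                          "1280x720", "16:9", 64, 1000, "DTLS-SRTP")

print(required_conversions(room, browser))
```

Endpoints with identical profiles yield an empty list, i.e. no conversion is required for that pair.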
Abstract
A new approach is proposed that contemplates systems and methods to support the operation of a Virtual Media Room or Virtual Meeting Room (VMR), wherein each VMR can accept from a plurality of participants at different geographic locations a variety of video conferencing feeds of audio and video streams from video conference endpoints. The approach further utilizes virtual-reality and augmented-reality techniques to transform the video and audio streams from the participants in various customizable ways to achieve a rich set of user experiences. A globally distributed infrastructure supports the sharing of the event among the participants at geographically distributed locations through a plurality of MCUs (Multipoint Control Units), each configured to process the plurality of audio and video streams from the plurality of video conference endpoints in real time.
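The globally distributed MCU infrastructure described in the abstract implies assigning each participant's feed to a nearby MCU. A minimal sketch of that assignment, assuming hypothetical MCU site names and coordinates and a crude planar distance rather than a true great-circle computation:

```python
import math

# Hypothetical (lat, lon) coordinates for globally distributed MCUs;
# the site names and positions are illustrative only.
MCUS = {
    "us-west": (37.4, -122.1),
    "eu-central": (50.1, 8.7),
    "ap-south": (19.1, 72.9),
}

def nearest_mcu(lat: float, lon: float) -> str:
    """Pick the MCU site closest to a participant's location."""
    def dist(site: str) -> float:
        slat, slon = MCUS[site]
        # crude planar approximation; a real system would use
        # great-circle distance or measured network latency
        return math.hypot(slat - lat, slon - lon)
    return min(MCUS, key=dist)

print(nearest_mcu(48.9, 2.4))  # participant near Paris
```

A production system would more likely route on measured round-trip time than on geography, but the shape of the decision is the same.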
6 Claims
1. A method, comprising:
accepting a plurality of audio and video streams from a plurality of video conference endpoints, wherein each of the video conference endpoints is associated with one of a plurality of participants to a video conference, and wherein the video conference endpoints are of different types;
automatically determining the required type of conversion between the plurality of endpoints based on video conference service provider-specific combinations of video encoding format, audio encoding format, video encoding profile, video encoding level, communication protocol, video resolution, screen ratio, bitrate for an audio stream, bitrate for a video stream, encryption standard, and acoustic consideration of the respective video conference feeds of each of the plurality of video conference endpoints;
for each of the participants to the video conference, (i) converting and composing, in real-time, the plurality of audio and video streams into a composite audio and video stream which is compatible with all of the different video conference endpoints, and (ii) rendering the composite audio and video stream at each of the different video conference endpoints;
for each of the composite audio and video streams, (i) building a composite metadata field from a metadata field associated with each of the video streams and (ii) utilizing information from the composite metadata field to transcode and process the composite audio and video stream;
enabling the video conference in real-time among the participants; and
supporting real-time human translator-free multimedia communications during the video conference by simultaneously translating, in real-time, the audio streams between different languages into a preferred language of each participant to the video conference and simultaneously providing the translations to each participant in only the preferred language of the respective participant so as to allow the participants to speak to each other in their respective preferred languages during the video conference. - View Dependent Claims (2, 3, 4)
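The translation limitation above, where each participant receives audio only in their own preferred language, can be sketched as a simple per-participant fan-out. Here `translate` is a hypothetical stand-in for a real speech-recognition and machine-translation pipeline, and all names are illustrative:

```python
# Stand-in for a real speech/MT pipeline; tags the text so the routing
# behavior is visible in the output.
def translate(text: str, src: str, dst: str) -> str:
    return text if src == dst else f"[{src}->{dst}] {text}"

def fan_out(speaker: str, text: str, prefs: dict[str, str]) -> dict[str, str]:
    """Deliver one utterance to every other participant, each in
    that participant's preferred language only."""
    src = prefs[speaker]
    return {p: translate(text, src, lang)
            for p, lang in prefs.items() if p != speaker}

prefs = {"alice": "en", "bjorn": "sv", "chie": "ja"}
print(fan_out("alice", "hello", prefs))
```

Note the speaker is excluded from the fan-out, matching the claim's requirement that translations go to the other participants in their respective preferred languages.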
5. A non-transitory machine-readable storage medium comprising software instructions that, when executed by a processor, cause the processor to:
accept a plurality of audio and video streams from a plurality of video conference endpoints, wherein each of the video conference endpoints is associated with one of a plurality of participants to a video conference, and wherein the video conference endpoints are of different types;
automatically determine the required type of conversion between the plurality of endpoints based on video conference service provider-specific combinations of video encoding format, audio encoding format, video encoding profile, video encoding level, communication protocol, video resolution, screen ratio, bitrate for an audio stream, bitrate for a video stream, encryption standard, and acoustic consideration of the respective video conference feeds of each of the plurality of video conference endpoints;
for each of the participants to the video conference, (i) convert and compose, in real-time, the plurality of audio and video streams into a composite audio and video stream which is compatible with all of the different video conference endpoints, and (ii) render the composite audio and video stream at each of the different video conference endpoints;
for each of the composite audio and video streams, (i) build a composite metadata field from a metadata field associated with each of the video streams and (ii) utilize information from the composite metadata field to transcode and process the composite audio and video stream;
enable the video conference in real-time among the participants; and
support real-time human translator-free multimedia communications during the video conference by simultaneously translating, in real-time, the audio streams between different languages into a preferred language of each participant to the video conference and simultaneously providing the translations to each participant in only the preferred language of the respective participant so as to allow the participants to speak to each other in their respective preferred languages during the video conference.
- View Dependent Claims (6)
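The composite-metadata clause of the claims, building one metadata field from the metadata of each constituent stream and using it to drive transcoding, can be sketched as a merge over per-stream records. The dictionary keys and the lowest-common-resolution policy below are invented for illustration and are not specified by the patent:

```python
def composite_metadata(streams: list[dict]) -> dict:
    """Merge per-stream metadata into one composite record; a downstream
    transcoder can read it to pick parameters every endpoint supports."""
    return {
        "sources": [s["id"] for s in streams],
        # choose the lowest common resolution so no endpoint must upscale
        "target_height": min(s["height"] for s in streams),
        "codecs_seen": sorted({s["codec"] for s in streams}),
    }

streams = [
    {"id": "ep1", "height": 1080, "codec": "H.264"},
    {"id": "ep2", "height": 720, "codec": "VP8"},
]
print(composite_metadata(streams))
```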
Specification