Systems and methods for error resilient scheme for low latency H.264 video coding
Abstract
A new approach is proposed that contemplates systems and methods to support error resilient coding of H.264 compatible video streams for low latency/delay multimedia communication applications by utilizing and integrating a plurality of error resilient H.264 encoding/decoding schemes in an efficient manner. These error resilient H.264 encoding/decoding schemes can be used to offer better quality video even when picture frames in the video stream are lost over the network. The approach can recover from such loss, and can do so faster than other techniques, without requiring additional data/frames to be sent over the network to achieve the same level of recovery.
21 Claims
1. A system, comprising:
a virtual meeting room (VMR) engine capable of converting and composing in real time a plurality of video conference feeds from a plurality of participants to a composite video and audio stream compatible with each of a plurality of video conference endpoints, wherein the plurality of video conference endpoints are of different types, and
a media processing node to support the VMR engine having a video encoder, wherein the video encoder, in operation,
encodes and organizes a plurality of picture frames of a video stream at a plurality of temporal layers in a hierarchical P-structure, wherein the organization includes varying a number of layers of the hierarchical P-structure based on a frame rate of the video stream in order to ensure an identical structure length for different video streams of respective different frame rates;
records one or more encoded reference frames of the video stream in a display picture buffer (DPB) associated with the video encoder, wherein each of the reference frames has been encoded by the video encoder;
transmits the plurality of encoded picture frames of the video stream over a network to a video decoder, wherein the video decoder is at one of the plurality of video conference endpoints;
in response to the video decoder providing a negative feedback on one or more frames lost en route from the video encoder to the video decoder, selects one of the reference frames in the DPB that is earlier in time than the one or more lost frames; and
transmits the selected reference frame to the video decoder; and
the video decoder, which in operation,
receives the video stream transmitted over the network;
transmits the negative feedback on the one or more lost frames to the video encoder through a back channel mechanism to trigger the selection of the reference frames; and
recovers the one or more lost frames of the plurality of encoded picture frames during decoding of the video stream using a combination of
1) decoding the picture frames of a lower temporal layer in the hierarchical P-structure than a temporal layer of the one or more lost frames and
2) using the selected reference frame as a restarting point for continued decoding of the video stream.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
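For illustration only, the layer-count element of claim 1 can be sketched as follows. The sketch assumes a dyadic hierarchy in which one structure spans 2^(L-1) frames for L temporal layers, and it reads "structure length" as wall-clock duration; neither assumption is stated in the claim. The function names and the 7.5 fps base-layer rate are hypothetical.

```python
import math

# Assumption (not from the claim): the base layer always runs at this rate,
# so every structure lasts 1 / BASE_LAYER_FPS seconds regardless of frame rate.
BASE_LAYER_FPS = 7.5

def layers_for_frame_rate(fps: float) -> int:
    """Pick the number of temporal layers so the structure duration is constant."""
    return max(1, 1 + round(math.log2(fps / BASE_LAYER_FPS)))

def structure_length_frames(num_layers: int) -> int:
    """Frames per structure in a dyadic hierarchy."""
    return 2 ** (num_layers - 1)

def temporal_layer(frame_index: int, num_layers: int) -> int:
    """Temporal layer of a frame; layer 0 is the base layer."""
    pos = frame_index % structure_length_frames(num_layers)
    if pos == 0:
        return 0
    trailing_zeros = (pos & -pos).bit_length() - 1
    return num_layers - 1 - trailing_zeros

def structure_duration_s(fps: float) -> float:
    """Wall-clock duration of one structure at the given frame rate."""
    return structure_length_frames(layers_for_frame_rate(fps)) / fps
```

Under these assumptions a 15 fps stream gets two layers, a 30 fps stream three, and a 60 fps stream four, and each structure lasts the same 2/15 of a second.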
12. A method, comprising:
encoding and organizing, by a video encoder, a plurality of picture frames of a video stream at a plurality of temporal layers in a hierarchical P-structure, wherein the organizing includes varying a number of layers of the hierarchical P-structure based on a frame rate of the video stream in order to ensure an identical structure length for different video streams of respective different frame rates;
recording, by the video encoder, one or more encoded reference frames of the video stream in a display picture buffer (DPB);
transmitting, by the video encoder, the plurality of encoded picture frames of the video stream over a network to a video decoder, wherein the video decoder is at one of a plurality of video conference endpoints, and the plurality of video conference endpoints are of different types;
accepting, by the video decoder, the video stream transmitted over the network, wherein one or more frames of the plurality of encoded picture frames of the video stream are lost en route from the video encoder to the video decoder;
transmitting, by the video decoder, negative feedback on the one or more lost frames to the video encoder through a back channel mechanism to trigger a selection of the reference frames;
in response to the video decoder providing the negative feedback on the one or more lost frames, selecting, by the video encoder, a reference frame in the DPB that is earlier in time than the one or more lost frames;
transmitting, by the video encoder, the selected reference frame over the network to the video decoder; and
recovering the one or more lost frames of the plurality of encoded picture frames during decoding of the video stream using a combination of
1) decoding the encoded picture frames of a lower temporal layer in the hierarchical P-structure than a temporal layer of the one or more lost frames and
2) using the selected reference frame as a restarting point for continued decoding of the video stream.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21
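For illustration only, the selecting and recovering steps of claim 12 might look like the sketch below. The claim requires only that the selected reference frame be earlier in time than the lost frames; choosing the most recent such frame is an assumption made here. The `RefFrame` type and both function names are hypothetical, and the sketch models frame bookkeeping only, not H.264 bitstream handling.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

@dataclass(frozen=True)
class RefFrame:
    frame_num: int        # position of the frame in the stream
    temporal_layer: int   # 0 is the base layer

def select_recovery_reference(dpb: Iterable[RefFrame],
                              lost_frame_nums: Iterable[int]) -> Optional[RefFrame]:
    """Encoder side: pick a buffered reference older than every lost frame.

    Assumption: the most recent qualifying frame is chosen. Expects at least
    one lost frame number; returns None if the buffer holds no older frame.
    """
    first_lost = min(lost_frame_nums)
    candidates = [f for f in dpb if f.frame_num < first_lost]
    return max(candidates, key=lambda f: f.frame_num, default=None)

def still_decodable(frames: Iterable[RefFrame], lost_layer: int) -> List[RefFrame]:
    """Decoder side: frames in layers below the lost frame's layer stay usable."""
    return [f for f in frames if f.temporal_layer < lost_layer]
```

With buffered frames 0, 2, 4 and 6 and negative feedback reporting frames 5 and 6 as lost, this sketch selects frame 4 as the restarting point.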
Specification