Signaling reference frame distances
Abstract
Techniques and tools for signaling reference frame distances are described. For example, a video encoder signals a code for a reference frame distance for a current field-coded interlaced video frame. The code indicates a count of frames (e.g., bi-directionally predicted frames) between the current frame and a preceding reference frame. The code may be a variable length code signaled in the frame header for the current frame. The encoder may selectively signal the use of a default value for reference frame distances rather than signal a reference frame distance per frame. A video decoder performs corresponding parsing and decoding.
19 Claims
1. A computer-implemented method for transforming encoded video information using a video decoder, the method comprising:
- receiving, at the video decoder, encoded video information in a bitstream;
- with the video decoder, parsing, from the encoded video information, a first code, wherein the first code is signaled at entry-point level for plural pictures, wherein the first code indicates whether reference frame distances for the plural pictures are signaled at picture level in the bitstream or have a default value and are not signaled in the bitstream, and wherein reference frame distance indicates a count of frames between a current video frame and a preceding reference frame; and
- with the video decoder, for each picture of the plural pictures:
  - parsing, from the encoded video information, a second code, wherein the second code indicates a frame coding mode of the picture, and wherein the picture is a current field-coded interlaced video frame;
  - parsing, from the encoded video information, a third code, wherein the third code indicates a field picture type of the picture;
  - if the first code indicates that reference frame distances are signaled and if the third code indicates a field picture type of I/I, I/P, P/I, or P/P, then parsing, from the encoded video information, a fourth code for a reference frame distance for the current field-coded interlaced video frame, wherein the fourth code is a variable length code, and wherein the fourth code uses a 2-bit codeword to represent reference frame distance values of zero, one, and two, and uses an N-bit unary codeword to represent reference frame distance values from three, where N=3, to sixteen, where N=16;
  - if the first code indicates that reference frame distances have a default value and are not signaled in the bitstream, then using the default value for the reference frame distance for the current field-coded interlaced video frame, wherein the default value is zero; and
  - decoding the current field-coded interlaced video frame, including, for a given motion vector of a current block or macroblock of the current field-coded interlaced video frame:
    - computing a motion vector predictor using motion vector values of plural neighbor blocks or macroblocks in the current field-coded interlaced video frame, including scaling at least one of the motion vector values of the plural neighbor blocks or macroblocks according to a scaling factor that varies depending on the reference frame distance for the current field-coded interlaced video frame; and
    - reconstructing the given motion vector of the current block or macroblock using the motion vector predictor and motion vector differential information.
Dependent claims: 2, 3, 4, 5, 6
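The variable length code recited in claim 1 for the reference frame distance can be illustrated with a small decoder. This is a minimal sketch, not the patented implementation: the claim fixes only the codeword lengths (2 bits for values zero to two, N bits for values three to sixteen), so the exact bit patterns used here (`00`, `01`, `10`, and N-1 ones followed by a zero) are an assumption, and the function name `decode_refdist` is hypothetical.

```python
from typing import Iterator


def decode_refdist(bits: Iterator[int]) -> int:
    """Decode one reference frame distance from an iterator of 0/1 ints.

    Assumed bit patterns (the claim only fixes the codeword lengths):
      00 -> 0, 01 -> 1, 10 -> 2, and for N in 3..16: (N-1) ones then a zero.
    """
    prefix = (next(bits) << 1) | next(bits)
    if prefix != 0b11:
        return prefix  # 2-bit codeword for values 0, 1, 2

    # Unary part: each further 1 bit adds one to the value; a 0 bit terminates.
    value = 3
    while next(bits) == 1:
        value += 1
        if value > 16:
            raise ValueError("codeword longer than 16 bits")
    return value


if __name__ == "__main__":
    print(decode_refdist(iter([0, 1])))        # 2-bit codeword
    print(decode_refdist(iter([1, 1, 1, 0])))  # 4-bit unary codeword
```

Under these assumed patterns, the value three consumes exactly three bits and the value sixteen consumes exactly sixteen bits, which matches the lengths stated in the claim.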
7. A computer-implemented method for transforming encoded video information using a video decoder, the method comprising:
- receiving, at the video decoder, encoded video information in a bitstream;
- with the video decoder, parsing, from the encoded video information, a first syntax element that indicates whether reference frame distances for plural frames are signaled in the bitstream for individual frames of the plural frames or have a default value and are not signaled in the bitstream, wherein the first syntax element is signaled at entry-point level, as part of an entry point header, for an entry point segment that includes the plural frames, and wherein reference frame distance indicates a number of frames between a current frame and a reference frame; and
- with the video decoder, for each of the plural frames,
  - parsing, from the encoded video information, a second syntax element, wherein the second syntax element indicates a field picture type per frame;
  - if reference frame distances are signaled, and if and only if the second syntax element indicates a field picture type of I/I, I/P, P/I, or P/P, then parsing, from the encoded video information, a third syntax element per frame that indicates a reference frame distance for the frame, wherein the third syntax element is a variable length code, and wherein the third syntax element uses a 2-bit codeword to represent reference frame distance values of zero, one, and two, and uses an N-bit unary codeword to represent reference frame distance values from three, where N=3, to sixteen, where N=16, otherwise, if reference frame distances have the default value and are not signaled in the bitstream, then using the default value for the reference frame distance for the frame, wherein the default value is zero; and
  - decoding the frame, including, for a given motion vector of a current block or macroblock of the frame:
    - computing a motion vector predictor using motion vector values of plural neighbor blocks or macroblocks in the frame, including scaling at least one of the motion vector values of the plural neighbor blocks or macroblocks according to a scaling factor that varies depending on the reference frame distance for the frame; and
    - reconstructing the given motion vector of the current block or macroblock using the motion vector predictor and motion vector differential information.
Dependent claims: 8, 9, 10, 11, 12
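The conditional parsing in claim 7 amounts to a two-level decision: an entry-point flag selects between signaled and default distances, and the field picture type decides whether a per-frame code is present. The sketch below shows that control flow only. The names `reference_frame_distance` and `parse_distance_code` are hypothetical, and returning `None` for field picture types outside the listed four is an assumption, since the claim does not say what value applies in that case.

```python
from typing import Callable, Optional

# Field picture types for which a per-frame distance code is parsed (from the claim).
TYPES_WITH_DISTANCE = {"I/I", "I/P", "P/I", "P/P"}
DEFAULT_DISTANCE = 0  # default value stated in the claim


def reference_frame_distance(
    distances_signaled: bool,
    field_picture_type: str,
    parse_distance_code: Callable[[], int],
) -> Optional[int]:
    """Return the reference frame distance for one frame.

    distances_signaled: entry-point level flag (the first syntax element).
    parse_distance_code: callback that reads the variable length code
        from the bitstream; it is called only when the code is present.
    """
    if not distances_signaled:
        return DEFAULT_DISTANCE
    if field_picture_type in TYPES_WITH_DISTANCE:
        return parse_distance_code()
    # Assumption: no code is present for other types; the value to use
    # is left to the caller because the claim does not specify it.
    return None


if __name__ == "__main__":
    print(reference_frame_distance(False, "P/P", lambda: 5))
    print(reference_frame_distance(True, "I/P", lambda: 5))
```

Passing the parser as a callback keeps the sketch independent of any particular bitstream reader and makes it clear that no bits are consumed when the default value applies.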
13. A video decoder comprising:
- means for buffering a coded video bitstream;
- means for decoding a first code, wherein the first code is signaled at entry-point level for plural pictures, and wherein the first code indicates whether reference frame distances for the plural pictures are signaled at picture level in the coded video bitstream or have a default value and are not signaled in the coded video bitstream, and wherein reference frame distance indicates a count of bi-directionally predicted frames between a current video frame and a preceding reference frame; and
- means for, for each picture of the plural pictures:
  - decoding a second code, wherein the second code indicates a frame coding mode of the picture, wherein the picture is a current frame;
  - decoding a third code, wherein the third code indicates a field picture type of the picture;
  - if the first code indicates that reference frame distances are signaled and if the third code indicates a field picture type of I/I, I/P, P/I, or P/P, decoding a fourth code for a reference frame distance for the current frame, wherein the fourth code is a variable length code, and wherein the fourth code uses a 2-bit codeword to represent reference frame distance values of zero, one, and two, and uses an N-bit unary codeword to represent reference frame distance values from three, where N=3, to sixteen, where N=16; and
  - otherwise, if the first code indicates that reference frame distances have a default value and are not signaled in the bitstream, then using the default value for the reference frame distance for the current frame, wherein the default value is zero; and
  - decoding the current frame, including, for a given motion vector of a current block or macroblock of the current frame:
    - computing a motion vector predictor using motion vector values of plural neighbor blocks or macroblocks in the current frame, including scaling at least one of the motion vector values of the plural neighbor blocks or macroblocks according to a scaling factor that varies depending on the reference frame distance for the current frame; and
    - reconstructing the given motion vector of the current block or macroblock using the motion vector predictor and motion vector differential information.
Dependent claims: 14
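The final decoding steps recited in the decoder claims (scale neighbor motion vectors by a distance-dependent factor, form a predictor, add the differential) can be sketched as follows. Several parts are assumptions made only for illustration: the claims do not give the scaling values, so `scale_factor` is a placeholder formula rather than a codec table; they do not say how neighbors are combined, so a component-wise median of three neighbors is used; and the `needs_scaling` flag stands in for whatever rule selects the vectors to be scaled.

```python
from fractions import Fraction
from typing import List, Tuple

Neighbor = Tuple[int, int, bool]  # (mv_x, mv_y, needs_scaling), hypothetical layout


def scale_factor(ref_distance: int) -> Fraction:
    """Placeholder factor that varies with the distance; NOT a real codec table."""
    return Fraction(ref_distance + 1, ref_distance + 2)


def middle(values: List[int]) -> int:
    """Median of an odd-length list of integers."""
    return sorted(values)[len(values) // 2]


def predict_mv(neighbors: List[Neighbor], ref_distance: int) -> Tuple[int, int]:
    """Component-wise median predictor (assumed) over scaled neighbor vectors."""
    factor = scale_factor(ref_distance)
    xs, ys = [], []
    for mv_x, mv_y, needs_scaling in neighbors:
        if needs_scaling:
            mv_x, mv_y = round(mv_x * factor), round(mv_y * factor)
        xs.append(mv_x)
        ys.append(mv_y)
    return middle(xs), middle(ys)


def reconstruct_mv(predictor: Tuple[int, int], differential: Tuple[int, int]) -> Tuple[int, int]:
    """Motion vector = predictor + decoded differential."""
    return predictor[0] + differential[0], predictor[1] + differential[1]


if __name__ == "__main__":
    neighbors = [(4, 2, False), (8, -6, True), (6, 0, False)]
    for distance in (0, 2):
        predictor = predict_mv(neighbors, distance)
        print(distance, predictor, reconstruct_mv(predictor, (1, -1)))
```

With the same neighbors and differential, changing the reference frame distance changes the predictor and therefore the reconstructed vector, which is why the decoder needs the distance before it can decode motion vectors.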
15. A computer-implemented method for transforming video information using a video encoder, the method comprising:
- determining, at the video encoder, whether reference frame distances for plural pictures will be signaled or will have a default value, wherein the default value is zero;
- if reference frame distances will be signaled for the plural pictures then, with the video encoder, signaling, in a bitstream, a first code indicating that reference frame distances are signaled at picture level in the bitstream for the plural pictures, wherein the first code is signaled at entry-point level for the plural pictures, and wherein reference frame distance indicates a count of frames between a current video frame and a preceding reference frame; and
- with the video encoder, for each picture of the plural pictures:
  - signaling, in the bitstream, a second code, wherein the second code indicates a frame coding mode of the picture, and wherein the picture is a current field-coded interlaced video frame;
  - signaling, in the bitstream, a third code, wherein the third code indicates a field picture type of the picture;
  - if the first code indicates that reference frame distances are signaled and if the third code indicates a field picture type of I/I, I/P, P/I, or P/P, then signaling, in the bitstream, a fourth code for a reference frame distance for the current field-coded interlaced video frame, wherein the fourth code is a variable length code, and wherein the fourth code uses a 2-bit codeword to represent reference frame distance values of zero, one, and two, and uses an N-bit unary codeword to represent reference frame distance values from three, where N=3, to sixteen, where N=16;
  - otherwise, if the first code indicates that reference frame distances will have a default value and are not signaled in the bitstream, then skipping the signaling the fourth code; and
  - encoding motion vector information for the current field-coded interlaced video frame, including, for a given motion vector of a current block or macroblock of the current field-coded interlaced video frame:
    - computing a motion vector predictor using motion vector values of plural neighbor blocks or macroblocks in the current field-coded interlaced video frame, including scaling at least one of the motion vector values of the plural neighbor blocks or macroblocks according to a scaling factor that varies depending on the reference frame distance for the current field-coded interlaced video frame;
    - determining motion vector differential information using the motion vector predictor and the given motion vector of the current block or macroblock; and
    - encoding and signaling the motion vector differential information.
Dependent claims: 16, 17, 18, 19
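On the encoder side, claim 15 describes writing the distance code only when distances are signaled and the field picture type is one of the four listed, and skipping it otherwise. The sketch below emits the code as a string of `0`/`1` characters. As with any illustration of this code, the bit patterns are an assumption consistent with the stated codeword lengths, and the names `encode_refdist` and `distance_bits` are hypothetical.

```python
TYPES_WITH_DISTANCE = {"I/I", "I/P", "P/I", "P/P"}


def encode_refdist(value: int) -> str:
    """Return the assumed codeword for a reference frame distance of 0..16.

    Values 0..2 use a 2-bit codeword; a value N in 3..16 uses N bits,
    assumed here to be (N-1) ones followed by a terminating zero.
    """
    if not 0 <= value <= 16:
        raise ValueError("reference frame distance must be in 0..16")
    if value <= 2:
        return format(value, "02b")
    return "1" * (value - 1) + "0"


def distance_bits(distances_signaled: bool, field_picture_type: str, value: int) -> str:
    """Bits written for the distance code in one frame header ("" when skipped)."""
    if distances_signaled and field_picture_type in TYPES_WITH_DISTANCE:
        return encode_refdist(value)
    return ""  # code skipped: nothing is written for this frame


if __name__ == "__main__":
    for v in (0, 2, 3, 5):
        print(v, encode_refdist(v))
    print(repr(distance_bits(False, "P/P", 5)))
```

When the entry-point flag selects the default value, no distance bits are spent per frame, which is the saving the selective signaling is meant to provide.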
Specification