Signaling reference frame distances

US 8,085,844 B2
Filed: 11/15/2004
Issued: 12/27/2011
Est. Priority Date: 09/07/2003
Status: Active Grant

Abstract

Techniques and tools for signaling reference frame distances are described. For example, a video encoder signals a code for a reference frame distance for a current field-coded interlaced video frame. The code indicates a count of frames (e.g., bi-directionally predicted frames) between the current frame and a preceding reference frame. The code may be a variable length code signaled in the frame header for the current frame. The encoder may selectively signal the use of a default value for reference frame distances rather than signal a reference frame distance per frame. A video decoder performs corresponding parsing and decoding.

148 Citations

19 Claims

1. A computer-implemented method for transforming encoded video information using a video decoder, the method comprising:
- receiving, at the video decoder, encoded video information in a bitstream;
  
  with the video decoder, parsing, from the encoded video information, a first code, wherein the first code is signaled at entry-point level for plural pictures, wherein the first code indicates whether reference frame distances for the plural pictures are signaled at picture level in the bitstream or have a default value and are not signaled in the bitstream, and wherein reference frame distance indicates a count of frames between a current video frame and a preceding reference frame; and
  
  with the video decoder, for each picture of the plural pictures:
  
  parsing, from the encoded video information, a second code, wherein the second code indicates a frame coding mode of the picture, and wherein the picture is a current field-coded interlaced video frame;
  
  parsing, from the encoded video information, a third code, wherein the third code indicates a field picture type of the picture;
  
  if the first code indicates that reference frame distances are signaled and if the third code indicates a field picture type of I/I, I/P, P/I, or P/P, then parsing, from the encoded video information, a fourth code for a reference frame distance for the current field-coded interlaced video frame, wherein the fourth code is a variable length code, and wherein the fourth code uses a 2-bit codeword to represent reference frame distance values of zero, one, and two, and uses an N-bit unary codeword to represent reference frame distance values from three, where N=3, to sixteen, where N=16;
  
  if the first code indicates that reference frame distances have a default value and are not signaled in the bitstream, then using the default value for the reference frame distance for the current field-coded interlaced video frame, wherein the default value is zero; and
  
  decoding the current field-coded interlaced video frame, including, for a given motion vector of a current block or macroblock of the current field-coded interlaced video frame:
  
  computing a motion vector predictor using motion vector values of plural neighbor blocks or macroblocks in the current field-coded interlaced video frame, including scaling at least one of the motion vector values of the plural neighbor blocks or macroblocks according to a scaling factor that varies depending on the reference frame distance for the current field-coded interlaced video frame; and
  
  reconstructing the given motion vector of the current block or macroblock using the motion vector predictor and motion vector differential information.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1 wherein the count is a count of bi-directionally predicted field-coded interlaced video frames between the current field-coded interlaced video frame and the preceding reference frame.
  - 3. The method of claim 1 wherein the count is an arbitrary value selected so as to improve performance of operations that are based at least in part upon the count.
  - 4. The method of claim 1 wherein the fourth code is signaled in a frame header for the current field-coded interlaced video frame.
  - 5. The method of claim 1 wherein the current field-coded interlaced video frame includes at least one interlaced P-field.
  - 6. The method of claim 1 wherein the motion vector predictor is one of plural motion vector predictors computed for the given motion vector, the plural motion vector predictors including a same field motion vector predictor and an opposite field motion vector predictor, and wherein the decoding further includes, for the given motion vector, selecting one of the plural motion vector predictors to use in reconstructing the given motion vector.
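
The variable length code recited in claim 1 can be illustrated with a short decoding sketch. The claim fixes only the codeword lengths (2 bits for values zero through two, N bits for values three through sixteen); the exact bit patterns used below, and the name `decode_refdist`, are assumptions made for illustration and are not taken from the claim text.

```python
def decode_refdist(bits):
    """Decode one reference frame distance from an iterator of 0/1 bits.

    Assumed codeword table (the claim gives lengths, not bit patterns):
      0 -> 00, 1 -> 01, 2 -> 10,
      N in 3..16 -> (N - 1) one-bits followed by a single zero-bit (N bits total).
    """
    prefix = (next(bits) << 1) | next(bits)
    if prefix < 3:
        return prefix  # 2-bit codeword for the values 0, 1 and 2

    # prefix == 0b11: those two ones are the start of a unary codeword.
    value = 3
    while next(bits) == 1:
        value += 1
        if value > 16:
            raise ValueError("codeword longer than 16 bits")
    return value
```

Under these assumed patterns, the bit sequence `1 1 0` decodes to three and a run of fifteen ones followed by a zero decodes to sixteen, which matches the N=3 and N=16 endpoints in the claim.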

7. A computer-implemented method for transforming encoded video information using a video decoder, the method comprising:
- receiving, at the video decoder, encoded video information in a bitstream;
  
  with the video decoder, parsing, from the encoded video information, a first syntax element that indicates whether reference frame distances for plural frames are signaled in the bitstream for individual frames of the plural frames or have a default value and are not signaled in the bitstream, wherein the first syntax element is signaled at entry-point level, as part of an entry point header, for an entry point segment that includes the plural frames, and wherein reference frame distance indicates a number of frames between a current frame and a reference frame; and
  
  with the video decoder, for each of the plural frames:
  
  parsing, from the encoded video information, a second syntax element, wherein the second syntax element indicates a field picture type per frame;
  
  if reference frame distances are signaled, and if and only if the second syntax element indicates a field picture type of I/I, I/P, P/I, or P/P, then parsing, from the encoded video information, a third syntax element per frame that indicates a reference frame distance for the frame, wherein the third syntax element is a variable length code, and wherein the third syntax element uses a 2-bit codeword to represent reference frame distance values of zero, one, and two, and uses an N-bit unary codeword to represent reference frame distance values from three, where N=3, to sixteen, where N=16;
  
  otherwise, if reference frame distances have the default value and are not signaled in the bitstream, then using the default value for the reference frame distance for the frame, wherein the default value is zero; and
  
  decoding the frame, including, for a given motion vector of a current block or macroblock of the frame:
  
  computing a motion vector predictor using motion vector values of plural neighbor blocks or macroblocks in the frame, including scaling at least one of the motion vector values of the plural neighbor blocks or macroblocks according to a scaling factor that varies depending on the reference frame distance for the frame; and
  
  reconstructing the given motion vector of the current block or macroblock using the motion vector predictor and motion vector differential information.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The method of claim 7 wherein, for each of the plural frames, the reference frame distance for the frame indicates a count of bi-directionally predicted frames between the frame and a preceding reference frame.
  - 9. The method of claim 7 wherein, for each of the plural frames, the reference frame distance for the frame is an arbitrary value selected so as to improve performance of operations that are based at least in part upon the reference frame distance.
  - 10. The method of claim 7 wherein each of the plural frames is a field-coded interlaced video frame that is not bi-directionally predicted.
  - 11. The method of claim 7 wherein, if reference frame distances are signaled, the third syntax element per frame is part of a frame header for the frame.
  - 12. The method of claim 7 wherein the motion vector predictor is one of plural motion vector predictors computed for the given motion vector, the plural motion vector predictors including a same field motion vector predictor and an opposite field motion vector predictor, and wherein the decoding further includes, for the given motion vector, selecting one of the plural motion vector predictors to use in reconstructing the given motion vector.
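
The per-frame decision in claim 7 (parse a distance, or fall back to the default) can be sketched as a small function. The names `distances_signaled`, `field_picture_type`, and `read_code` are hypothetical stand-ins for the entry-point syntax element, the field picture type element, and a bitstream reader; the claim does not name them. The case where distances are signaled but the field picture type is not one of the four listed is not described in the claim, so the sketch returns `None` there rather than guess.

```python
# Field picture types for which a reference frame distance is parsed (from the claim).
TYPES_WITH_DISTANCE = {"I/I", "I/P", "P/I", "P/P"}

# Default used when distances are not signaled (from the claim).
DEFAULT_DISTANCE = 0


def reference_frame_distance(distances_signaled, field_picture_type, read_code):
    """Return the reference frame distance for one frame.

    distances_signaled: value of the entry-point level element (assumed boolean).
    field_picture_type: string such as "P/P" (assumed representation).
    read_code: zero-argument callable that parses the per-frame code (hypothetical).
    """
    if distances_signaled and field_picture_type in TYPES_WITH_DISTANCE:
        return read_code()
    if not distances_signaled:
        return DEFAULT_DISTANCE
    # Signaled, but a field picture type outside the listed four:
    # the claim does not say what happens, so nothing is parsed here.
    return None
```

Passing the reader as a callable keeps the sketch independent of any particular bitstream parser.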

13. A video decoder comprising:
- means for buffering a coded video bitstream;
  
  means for decoding a first code, wherein the first code is signaled at entry-point level for plural pictures, and wherein the first code indicates whether reference frame distances for the plural pictures are signaled at picture level in the coded video bitstream or have a default value and are not signaled in the coded video bitstream, and wherein reference frame distance indicates a count of bi-directionally predicted frames between a current video frame and a preceding reference frame; and
  
  means for, for each picture of the plural pictures:
  
  decoding a second code, wherein the second code indicates a frame coding mode of the picture, wherein the picture is a current frame;
  
  decoding a third code, wherein the third code indicates a field picture type of the picture;
  
  if the first code indicates that reference frame distances are signaled and if the third code indicates a field picture type of I/I, I/P, P/I, or P/P, decoding a fourth code for a reference frame distance for the current frame, wherein the fourth code is a variable length code, and wherein the fourth code uses a 2-bit codeword to represent reference frame distance values of zero, one, and two, and uses an N-bit unary codeword to represent reference frame distance values from three, where N=3, to sixteen, where N=16; and
  
  otherwise, if the first code indicates that reference frame distances have a default value and are not signaled in the bitstream, then using the default value for the reference frame distance for the current frame, wherein the default value is zero; and
  
  decoding the current frame, including, for a given motion vector of a current block or macroblock of the current frame:
  
  computing a motion vector predictor using motion vector values of plural neighbor blocks or macroblocks in the current frame, including scaling at least one of the motion vector values of the plural neighbor blocks or macroblocks according to a scaling factor that varies depending on the reference frame distance for the current frame; and
  
  reconstructing the given motion vector of the current block or macroblock using the motion vector predictor and motion vector differential information.
- Dependent claims (14):
  - 14. The decoder of claim 13 wherein the fourth code is signaled in a frame header for the current frame.
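
The motion vector steps at the end of claim 13 (scale neighbor motion vectors by a distance-dependent factor, form a predictor, add the differential) can be sketched as follows. The claim does not give the scaling factors or say how the neighbor values are combined, so the table `HYPOTHETICAL_SCALE`, the fixed-point format, the component-wise median over three neighbors, and the function names are all placeholder assumptions.

```python
from statistics import median

# Placeholder scale factors in 1/256 units, indexed by reference frame distance.
# These numbers are invented for illustration; they are not from the patent.
HYPOTHETICAL_SCALE = {0: 128, 1: 192, 2: 224, 3: 240}


def predict_mv(neighbors, needs_scaling, refdist):
    """Component-wise median predictor over three neighbor motion vectors.

    neighbors: list of three (x, y) integer motion vectors.
    needs_scaling: list of three booleans marking which neighbors to scale.
    refdist: reference frame distance for the current frame.
    """
    scale = HYPOTHETICAL_SCALE[min(refdist, 3)]
    xs, ys = [], []
    for (x, y), scaled in zip(neighbors, needs_scaling):
        if scaled:
            # Fixed-point multiply; >> 8 divides by 256 (floors for negatives).
            x, y = (x * scale) >> 8, (y * scale) >> 8
        xs.append(x)
        ys.append(y)
    return median(xs), median(ys)


def reconstruct_mv(predictor, differential):
    """Motion vector = predictor + decoded differential, per component."""
    return predictor[0] + differential[0], predictor[1] + differential[1]
```

For example, with neighbors `[(4, 8), (16, -8), (2, 2)]`, only the second neighbor scaled, and a distance of zero, the placeholder table halves the second vector to `(8, -4)`, the predictor is `(4, 2)`, and a differential of `(1, -1)` reconstructs `(5, 1)`.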

15. A computer-implemented method for transforming video information using a video encoder, the method comprising:
- determining, at the video encoder, whether reference frame distances for plural pictures will be signaled or will have a default value, wherein the default value is zero;
  
  if reference frame distances will be signaled for the plural pictures then, with the video encoder, signaling, in a bitstream, a first code indicating that reference frame distances are signaled at picture level in the bitstream for the plural pictures, wherein the first code is signaled at entry-point level for the plural pictures, and wherein reference frame distance indicates a count of frames between a current video frame and a preceding reference frame; and
  
  with the video encoder, for each picture of the plural pictures:
  
  signaling, in the bitstream, a second code, wherein the second code indicates a frame coding mode of the picture, and wherein the picture is a current field-coded interlaced video frame;
  
  signaling, in the bitstream, a third code, wherein the third code indicates a field picture type of the picture;
  
  if the first code indicates that reference frame distances are signaled and if the third code indicates a field picture type of I/I, I/P, P/I, or P/P, then signaling, in the bitstream, a fourth code for a reference frame distance for the current field-coded interlaced video frame, wherein the fourth code is a variable length code, and wherein the fourth code uses a 2-bit codeword to represent reference frame distance values of zero, one, and two, and uses an N-bit unary codeword to represent reference frame distance values from three, where N=3, to sixteen, where N=16;
  
  otherwise, if the first code indicates that reference frame distances will have a default value and are not signaled in the bitstream, then skipping the signaling of the fourth code; and
  
  encoding motion vector information for the current field-coded interlaced video frame, including, for a given motion vector of a current block or macroblock of the current field-coded interlaced video frame:
  
  computing a motion vector predictor using motion vector values of plural neighbor blocks or macroblocks in the current field-coded interlaced video frame, including scaling at least one of the motion vector values of the plural neighbor blocks or macroblocks according to a scaling factor that varies depending on the reference frame distance for the current field-coded interlaced video frame;
  
  determining motion vector differential information using the motion vector predictor and the given motion vector of the current block or macroblock; and
  
  encoding and signaling the motion vector differential information.
- Dependent claims (16, 17, 18, 19):
  - 16. The method of claim 15 wherein the count is a count of bi-directionally predicted field-coded interlaced video frames between the current field-coded interlaced video frame and the preceding reference frame.
  - 17. The method of claim 15 wherein the count is an arbitrary value selected so as to improve performance of operations that are based at least in part upon the count.
  - 18. The method of claim 15 wherein the fourth code is signaled in a frame header for the current field-coded interlaced video frame.
  - 19. The method of claim 15 wherein the current field-coded interlaced video frame includes at least one interlaced P-field.
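
On the encoder side, claim 15 signals the same variable length code and sends a motion vector differential instead of the motion vector itself. A minimal sketch of both steps follows. As before, the claim specifies codeword lengths only, so the bit patterns are an assumption, and `encode_refdist` and `mv_differential` are hypothetical names.

```python
def encode_refdist(value):
    """Return the codeword for a reference frame distance as a bit string.

    Assumed patterns (the claim gives lengths only):
      0, 1, 2 -> 2-bit binary ("00", "01", "10");
      N in 3..16 -> (N - 1) ones followed by a zero, N bits in total.
    """
    if not 0 <= value <= 16:
        raise ValueError("reference frame distance must be in 0..16")
    if value < 3:
        return format(value, "02b")
    return "1" * (value - 1) + "0"


def mv_differential(mv, predictor):
    """Differential = actual motion vector - predictor, per component."""
    return mv[0] - predictor[0], mv[1] - predictor[1]
```

With these assumed patterns no codeword is a prefix of another, so a decoder can read the codes back without a length field.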

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Holcomb, Thomas W.; Mukerjee, Kunal; Lin, Chih-Lung
Primary Examiner(s): Czekaj, David
Assistant Examiner(s): Werner, David N
Application Number: US10/990,236
Publication Number: US 20050111547A1
Time in Patent Office: 2,598 Days
Field of Search: 375/240.12, 375/240.14, 375/240.16, 375/240.13, 375/240.23
US Class Current: 375/240.14
CPC Class Codes:
H04N 19/112   according to a given displa...
H04N 19/172   the region being a picture,...
H04N 19/517   by encoding
H04N 19/573   Motion compensation with mu...
H04N 19/70   characterised by syntax asp...
