Advanced bi-directional predictive coding of video frames
Abstract
Techniques and tools for coding/decoding of video images, and in particular, B-frames, are described. In one aspect, a video encoder/decoder determines a fraction for a current image in a sequence. The fraction represents an estimated temporal distance position for the current image relative to an interval between reference images for the current image. The video encoder/decoder processes the fraction along with a motion vector for a first reference image, resulting in a representation of motion (e.g., constant or variable velocity motion) in the current image. Other aspects are also described, including intra B-frames, forward and backward buffers for motion vector prediction, bitplane encoding of direct mode prediction information, multiple motion vector resolutions/interpolation filters for B-frames, proactive dropping of B-frames, and signaling of dropped predicted frames.
73 Claims
1. In a computer system, a method of processing images in a sequence of video images, the method comprising:
determining a fraction for a current image in the sequence, wherein the fraction represents an estimated temporal distance position for the current image relative to an interval between a first reference image for the current image and a second reference image for the current image; and
processing the fraction along with a motion vector for the first reference image, wherein the motion vector represents motion in the first reference image relative to a second reference image for the current image, and wherein the processing the fraction along with the motion vector results in a representation of motion in the current image relative to the first reference image.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. In a computer system, a method of processing images in a sequence of video images, the method comprising:
determining a fraction for a region of a current image in the sequence, wherein the fraction represents an estimated temporal distance position for the current image relative to an interval between a first reference image for the current image and a second reference image for the current image; and
processing the fraction along with a motion vector for the first reference image, wherein the motion vector represents motion in the first reference image relative to the second reference image, and wherein the processing the fraction along with the motion vector results in a representation of motion in the region of the current image.
Dependent claims: 14, 15, 16, 17
18. In a computer system, a method of encoding images in a sequence of video images, the method comprising:
determining a fraction for a current image in the sequence, wherein the current image has a previous reference image and a future reference image, and wherein the fraction represents a temporal position for the current image relative to its reference images;
selecting direct mode prediction for a current macroblock in the current image;
finding a motion vector for a co-located macroblock in the future reference image;
scaling the motion vector for the co-located macroblock using the fraction.
Dependent claims: 19, 20, 21, 22, 23
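The scaling step recited in claim 18 can be illustrated with a minimal sketch. The claim only states that the co-located macroblock's motion vector is scaled using the fraction; the function name `scale_direct_mode_mv`, the sign convention for the backward vector, and the rounding to whole units are assumptions made here for illustration, not details taken from the claims.

```python
from fractions import Fraction


def scale_direct_mode_mv(colocated_mv, fraction):
    """Derive forward and backward vectors for a direct mode macroblock.

    colocated_mv: (x, y) motion vector of the co-located macroblock in the
        future reference image, in whatever units the codec uses.
    fraction: temporal position of the current image between its previous
        and future reference images, strictly between 0 and 1.

    Assumed convention (not stated in the claims): the forward vector is
    fraction * mv and the backward vector is (fraction - 1) * mv.
    """
    fraction = Fraction(fraction)
    if not 0 < fraction < 1:
        raise ValueError("fraction must lie strictly between 0 and 1")
    mv_x, mv_y = colocated_mv
    forward = (round(fraction * mv_x), round(fraction * mv_y))
    backward = (round((fraction - 1) * mv_x), round((fraction - 1) * mv_y))
    return forward, backward
```

For a current image one quarter of the way through the interval, `scale_direct_mode_mv((8, -4), Fraction(1, 4))` splits the co-located vector into a short forward part and a longer backward part of opposite sign.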
24. In a computer system, a method of processing images in a sequence of video images, the method comprising:
determining a temporal position of a current image in the sequence, wherein the current image has plural references, wherein the temporal position is between a first reference image based on at least one reference for the current image and a second reference image for the current image, and wherein the temporal position is determined independent of time stamps; and
processing the current image based on the temporal position of the current image and a motion vector for the first at least one reference image, wherein the motion vector represents motion in the first reference image relative to the second reference image, and wherein the processing results in a representation of motion in the current image.
Dependent claims: 25, 26, 27
28. In a computer system, a method of encoding a current image in a sequence of video images, the current image having at least two reference images in the sequence, the method comprising:
analyzing the at least two reference images along with the current image to determine whether the current image is to be predictively encoded based on the at least two reference images;
based on the analyzing, encoding the current image independently from the at least two reference images; and
assigning an image type to the current image, wherein the image type indicates that the current image is encoded independently from the at least two reference images.
Dependent claims: 29, 30, 31, 32, 33
34. In a video decoder, a method of decoding a current image in an encoded video image sequence, the current image having at least two reference images in the video image sequence, wherein the decoding yields a decoded video stream, the method comprising:
receiving an image type for the current image, wherein the image type indicates that the current image is encoded independently from the at least two reference images, and analyzing bit rate constraints for the decoding; and
determining whether to omit the current image from the decoded video stream based on the analyzing and the image type for the current image.
Dependent claims: 35
36. In a computer system, a computer-implemented method of processing video images in a video image sequence, the method comprising:
processing a bit plane for a bi-directionally predicted video image, wherein the bit plane comprises binary information signifying whether macroblocks in the bi-directionally predicted video image are encoded using direct mode prediction or non-direct mode prediction.
Dependent claims: 37, 38, 39, 40, 41, 42, 43
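As a rough illustration of the bit plane in claim 36, the sketch below stores one bit per macroblock (1 for direct mode, 0 for non-direct mode) in row-major order. The raw, uncompressed byte packing and the function names `pack_direct_mode_bitplane` and `unpack_direct_mode_bitplane` are assumptions for this example; the claim does not specify how the bit plane is laid out or compressed.

```python
def pack_direct_mode_bitplane(direct_flags):
    """Pack per-macroblock direct mode flags into bytes, MSB first.

    direct_flags: list of rows, each a list of booleans, True meaning the
        macroblock uses direct mode prediction.
    Returns (width, height, packed_bytes).
    """
    height = len(direct_flags)
    width = len(direct_flags[0]) if height else 0
    if any(len(row) != width for row in direct_flags):
        raise ValueError("all macroblock rows must have the same length")
    bits = [bool(flag) for row in direct_flags for flag in row]
    packed = bytearray((len(bits) + 7) // 8)
    for index, bit in enumerate(bits):
        if bit:
            packed[index // 8] |= 0x80 >> (index % 8)
    return width, height, bytes(packed)


def unpack_direct_mode_bitplane(width, height, packed):
    """Recover the per-macroblock flags from the packed bit plane."""
    flags = []
    for row in range(height):
        row_flags = []
        for col in range(width):
            index = row * width + col
            row_flags.append(bool(packed[index // 8] & (0x80 >> (index % 8))))
        flags.append(row_flags)
    return flags
```

A decoder following this layout would read the plane once per image and then look up each macroblock's mode by position rather than parsing a mode flag per macroblock.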
44. In a computer system, a method of processing images in a sequence of video images, the method comprising:
determining a value representing a forward motion vector component for a macroblock in the current image;
determining a value representing a backward motion vector component for the macroblock in the current image;
adding the value representing the forward motion vector to a forward buffer;
adding the value representing the backward motion vector to a backward buffer; and
predicting motion vectors for other macroblocks in the current image using values in the forward buffer and values in the backward buffer.
Dependent claims: 45, 46, 47
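The forward and backward buffers of claim 44 might be organized as in the following sketch, which keeps one motion vector per macroblock in each buffer. The predictor used here (a component-wise median of the left, top, and top-right neighbours, with out-of-frame neighbours treated as zero vectors) is an assumption borrowed from common codec practice; the claim itself only says that values in both buffers are used for prediction. The class name `MotionVectorBuffers` is hypothetical.

```python
from statistics import median


class MotionVectorBuffers:
    """Separate forward and backward motion vector stores for one image."""

    def __init__(self, mb_cols, mb_rows):
        self.forward = [[(0, 0)] * mb_cols for _ in range(mb_rows)]
        self.backward = [[(0, 0)] * mb_cols for _ in range(mb_rows)]

    def store(self, row, col, forward_mv, backward_mv):
        """Add a macroblock's forward and backward vectors to the buffers."""
        self.forward[row][col] = forward_mv
        self.backward[row][col] = backward_mv

    def predict(self, row, col):
        """Return (forward_predictor, backward_predictor) for a macroblock."""
        return (self._median(self.forward, row, col),
                self._median(self.backward, row, col))

    @staticmethod
    def _median(buffer, row, col):
        # Assumed neighbourhood: left, top, top-right; missing ones count as (0, 0).
        neighbours = []
        for r, c in ((row, col - 1), (row - 1, col), (row - 1, col + 1)):
            inside = 0 <= r < len(buffer) and 0 <= c < len(buffer[0])
            neighbours.append(buffer[r][c] if inside else (0, 0))
        return (median(mv[0] for mv in neighbours),
                median(mv[1] for mv in neighbours))
```

Keeping the two directions in separate buffers means a forward predictor is never contaminated by backward vectors of neighbouring macroblocks, and vice versa.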
48. In a computer system, a method of processing an image in a sequence of video images, the method comprising:
for a direct mode predicted macroblock in the image;
determining a non-zero value representing a forward motion vector component for the direct mode predicted macroblock;
determining a non-zero value representing a backward motion vector component for the direct mode predicted macroblock;
adding the non-zero values to one or more buffers;
wherein values in the one or more buffers are used to predict motion vectors for other macroblocks in the image.
49. In a computer system, a method of estimating motion for a bi-directionally predicted image in a sequence of video images, wherein the bi-directionally predicted image comprises macroblocks, and wherein the bi-directionally predicted image has a first reference image and a second reference image, the method comprising:
selecting a motion vector resolution for the bi-directionally predicted image from among plural motion vector resolutions;
selecting an interpolation filter for the bi-directionally predicted image from among plural interpolation filters; and
encoding the bi-directionally predicted image using the selected motion vector resolution and the selected interpolation filter.
Dependent claims: 50, 51, 52, 53, 54
55. In a computer system, a method of predicting motion for a bi-directionally predicted image in a sequence of video images, wherein the bi-directionally predicted image comprises macroblocks, and wherein the bi-directionally predicted image has a first reference image and a second reference image, the method comprising:
selecting a motion vector resolution for the bi-directionally predicted image from among plural motion vector resolutions, wherein the plural motion vector resolutions include a half-pixel resolution and a quarter pixel resolution;
selecting an interpolation filter for the bi-directionally predicted image from among plural interpolation filters, wherein the plural interpolation filters include a bicubic interpolation filter and a bilinear interpolation filter; and
encoding the bi-directionally predicted image using the selected motion vector resolution and the selected interpolation filter.
56. In a computer system, a method of predicting motion for a bi-directionally predicted image in a sequence of video images, wherein the bi-directionally predicted image comprises macroblocks, and wherein the bi-directionally predicted image has a first reference image and a second reference image, the method comprising:
selecting a motion vector mode for the bi-directionally predicted image from a set of plural motion vector modes, wherein the set of plural motion vector modes includes;
a one motion vector, quarter-pixel resolution, bicubic interpolation filter mode;
a one motion vector, half-pixel resolution, bicubic interpolation filter mode; and
a one motion vector, half-pixel resolution, bilinear interpolation filter mode; and
encoding the bi-directionally predicted image using the selected motion vector mode.
Dependent claims: 57
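The three motion vector modes listed in claim 56 can be modelled as a small enumeration. The selection rule shown here (pick the mode with the lowest estimated cost) is an assumption for illustration only; the claim does not say how the mode is chosen. The names `MvMode` and `select_mv_mode` are hypothetical.

```python
from enum import Enum


class MvMode(Enum):
    """The three one-motion-vector modes recited in claim 56."""
    ONE_MV_QUARTER_PEL_BICUBIC = ("quarter", "bicubic")
    ONE_MV_HALF_PEL_BICUBIC = ("half", "bicubic")
    ONE_MV_HALF_PEL_BILINEAR = ("half", "bilinear")

    @property
    def resolution(self):
        return self.value[0]

    @property
    def interpolation_filter(self):
        return self.value[1]


def select_mv_mode(cost_by_mode):
    """Pick the mode with the lowest cost (assumed selection criterion).

    cost_by_mode: dict mapping MvMode to a caller-supplied cost estimate,
        for example a rate-distortion score from a trial encode.
    """
    if not cost_by_mode:
        raise ValueError("at least one candidate mode is required")
    return min(cost_by_mode, key=cost_by_mode.get)
```

Because each mode fixes both the resolution and the filter, selecting one value configures the whole image's motion compensation in a single decision.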
58. In a computer system, a method of processing images in a video image sequence to yield a processed video image sequence, wherein the processing is performed at a constrained bit rate, the method comprising:
monitoring bits used during the processing;
based on the monitoring, determining whether to omit a current image having two reference images from the processed video image sequence, wherein the current image has a number of bits required to process the current image;
wherein, at the time of the determining, a number of bits available for use in the processing is greater than or equal to the number of bits required to process the current image.
Dependent claims: 59, 60, 61, 62, 63, 64, 65
66. In a computer system, a method of processing images in a video image sequence to yield a processed video image sequence, wherein the images in the video image sequence comprise images having two reference images, wherein the processing is performed at a constrained bit rate, and wherein images in the video image sequence are operable to be omitted from the processed video image sequence based on the constrained bit rate, the method comprising:
determining whether to omit a current image having two reference images from the processed video image sequence, wherein the current image has a number of bits required to process the current image, and wherein the determining comprises;
if more than half of n images processed prior to the current image were omitted from the processed video image sequence, then omitting the current image from the processed video sequence if the number of bits required to process the current image is greater than the average bits per image used to process the n images processed prior to the current image; and
if half or less than half of the n images processed prior to the current image were omitted from the processed video image sequence, then omitting the current image from the processed video sequence if the number of bits required to process the current image is greater than the twice the average bits per image used to process the n images processed prior to the current image.
Dependent claims: 67
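The two-branch omission test in claim 66 reduces to a threshold comparison, sketched below. The function name `should_omit_image` and the way the history is passed in (one bit count and one omitted flag per prior image) are assumptions for this example; in particular, how bits are counted for images that were themselves omitted is left to the caller.

```python
def should_omit_image(bits_required, prior_bits, prior_omitted):
    """Decide whether to omit the current two-reference image.

    bits_required: bits needed to process the current image.
    prior_bits: bits used for each of the n previously processed images.
    prior_omitted: matching list of booleans, True if that image was omitted.
    """
    n = len(prior_bits)
    if n == 0 or n != len(prior_omitted):
        raise ValueError("history lists must be non-empty and equal in length")
    average_bits = sum(prior_bits) / n
    omitted_count = sum(1 for omitted in prior_omitted if omitted)
    if 2 * omitted_count > n:
        # More than half of the prior images were omitted: compare to the average.
        return bits_required > average_bits
    # Half or fewer were omitted: compare to twice the average.
    return bits_required > 2 * average_bits
```

With a history averaging 100 bits per image, a 150-bit image is omitted only when more than half of the recent images were already omitted; otherwise the threshold is 200 bits.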
68. A method of processing a video image sequence, wherein the processing yields an encoded video image sequence in a bit stream having plural bit stream levels, wherein the plural bit stream levels include a frame level, and wherein the video image sequence comprises predicted images, the method comprising:
omitting a predicted image in the video image sequence from the encoded video image sequence; and
representing the omitted predicted image with a frame-level indicator in the bit stream;
wherein the frame-level indicator is operable to indicate the omitted predicted image to a video decoder.
Dependent claims: 69, 70, 71, 72, 73
Specification