ENCODING BLOCKS IN VIDEO FRAMES CONTAINING TEXT USING HISTOGRAMS OF GRADIENTS
First Claim
1. An apparatus, comprising:
a block processing pipeline configured to process blocks of pixels from video frames;
wherein the block processing pipeline comprises a block input component;
wherein, for each of a plurality of blocks of pixels from a video frame, the block input component is configured to:
receive input data representing the block of pixels;
compute gradient values for the block of pixels in two or more directions;
compute one or more histograms representing statistics derived from the gradient values for the block of pixels;
determine a likelihood that the block of pixels represents a portion of the video frame that contains text, wherein to determine the likelihood that the block of pixels represents a portion of the video frame that contains text, the block input component is configured to determine a presence or absence of a dominant gradient direction in the block of pixels, dependent on the one or more computed histograms; and
determine one or more parameter values for encoding the block of pixels, dependent on the likelihood that the block of pixels represents a portion of the video frame that contains text.
Abstract
A block input component of a video encoding pipeline may, for a block of pixels in a video frame, compute gradients in multiple directions, and may accumulate counts of the computed gradients in one or more histograms. The block input component may analyze the histogram(s) to compute block-level statistics and determine whether a dominant gradient direction exists in the block, indicating the likelihood that it represents an image containing text. If text is likely, various encoding parameter values may be selected to improve the quality of encoding for the block (e.g., by lowering a quantization parameter value). The computed statistics or selected encoding parameter values may be passed to other stages of the pipeline, and used to bias or control selection of a prediction mode, an encoding mode, or a motion vector. Frame-level or slice-level parameter values may be generated from gradient histograms of multiple blocks.
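The abstract's analysis step can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the four directions, the symmetric-difference gradient, the `min_magnitude` noise threshold, the `dominance_ratio` and `min_count` cutoffs, and the function names `gradient_histogram` and `has_dominant_direction`.

```python
# Illustrative sketch only: directions, thresholds, and names are assumptions.

def gradient_histogram(block, min_magnitude=16):
    """Count, per direction, the interior pixels whose strongest gradient
    lies in that direction.

    `block` is a list of equal-length rows of luma values (0-255).
    Bins: 0 = horizontal, 1 = vertical, 2 and 3 = the two diagonals.
    """
    height, width = len(block), len(block[0])
    hist = [0, 0, 0, 0]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            grads = [
                abs(block[y][x + 1] - block[y][x - 1]),          # horizontal
                abs(block[y + 1][x] - block[y - 1][x]),          # vertical
                abs(block[y + 1][x + 1] - block[y - 1][x - 1]),  # diagonal 1
                abs(block[y + 1][x - 1] - block[y - 1][x + 1]),  # diagonal 2
            ]
            strongest = max(grads)
            if strongest >= min_magnitude:  # skip flat or low-contrast pixels
                # list.index returns the first match, so ties favor the
                # axis-aligned directions over the diagonals.
                hist[grads.index(strongest)] += 1
    return hist


def has_dominant_direction(hist, dominance_ratio=0.6, min_count=8):
    """Return True when one direction holds most of the counted gradients."""
    total = sum(hist)
    if total < min_count:  # too few strong gradients to decide
        return False
    return max(hist) / total >= dominance_ratio
```

For an 8x8 block of two-pixel-wide vertical stripes, all 36 interior pixels fall in the horizontal bin, so the sketch reports a dominant direction; a flat block produces an empty histogram and reports none.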
38 Citations
20 Claims
1. An apparatus (independent claim, set out in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method, comprising:
inputting data representing a block of pixels from a video frame to a video encoding pipeline comprising a plurality of stages, each stage configured to perform at least one operation on blocks of pixels passing through the pipeline; and
performing, by one or more stages of the pipeline:
computing gradient values for the block of pixels in two or more directions;
computing one or more histograms representing statistics derived from the gradient values for the block of pixels;
determining that the block of pixels represents a portion of the video frame that is likely to contain text, wherein said determining comprises determining that there is a dominant gradient direction in the block of pixels, dependent on the one or more computed histograms;
in response to said determining that the block of pixels represents a portion of the video frame that is likely to contain text, determining a quantization parameter value for use in encoding the block of pixels in the video encoding pipeline; and
making the quantization parameter value available to one or more operations of the video encoding pipeline.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. A device, comprising:
a memory; and
an apparatus configured to process video frames and to store the processed video frames as frame data to the memory;
wherein the apparatus is configured to:
receive input data representing a block of pixels from a video frame;
compute gradient values for the block of pixels in two or more directions;
compute one or more histograms representing statistics derived from the gradient values for the block of pixels;
store data representing the one or more histograms in a data structure in the memory;
determine a classification parameter value for the block of pixels, wherein the classification parameter value indicates a likelihood that the block of pixels represents a portion of the video frame that contains text, wherein to determine the classification parameter value, the apparatus is configured to determine a presence or absence of a dominant gradient direction in the block of pixels, dependent on the one or more computed histograms;
store the classification parameter value in the data structure in the memory; and
perform an encoding operation for the block of pixels, dependent on the stored data representing the one or more histograms or the stored classification parameter.
Dependent claim: 20
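Claims 8 and 19 describe storing per-block histogram data and a classification parameter, then using them to pick a quantization parameter. The abstract adds that frame-level values may be derived from the histograms of multiple blocks. The Python sketch below shows one way these pieces could fit together. It is not the patented implementation: the `BlockStats` layout, the likelihood formula in `classify`, the 0.6 threshold, the QP offset of -4, and all function names are hypothetical. The 0 to 51 clamp assumes the quantization parameter range used by H.264 for 8-bit video.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Per-block record kept in memory for later stages (hypothetical layout)."""
    histogram: list          # gradient counts per direction
    text_likelihood: float   # classification parameter in [0.0, 1.0]


def classify(histogram):
    """Assumed likelihood measure: share of counts in the most common direction."""
    total = sum(histogram)
    return max(histogram) / total if total else 0.0


def record_block(table, block_index, histogram):
    """Store the histogram and its classification parameter for one block."""
    table[block_index] = BlockStats(list(histogram), classify(histogram))


def select_qp(stats, base_qp, threshold=0.6, text_qp_offset=-4,
              qp_min=0, qp_max=51):
    """Lower the quantization parameter for blocks likely to contain text."""
    likely_text = stats.text_likelihood >= threshold
    qp = (base_qp + text_qp_offset) if likely_text else base_qp
    return max(qp_min, min(qp_max, qp))  # clamp to the assumed legal range


def text_block_fraction(table, threshold=0.6):
    """Frame-level statistic: fraction of recorded blocks classified as text."""
    if not table:
        return 0.0
    likely = sum(1 for s in table.values() if s.text_likelihood >= threshold)
    return likely / len(table)
```

With a base QP of 30, a block whose histogram is `[36, 0, 0, 0]` is encoded at QP 26 in this sketch, while a block with the evenly spread histogram `[9, 9, 9, 9]` keeps QP 30. Keeping the records in a table keyed by block index lets a later stage read the classification without recomputing gradients.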
Specification