High-performance block-matching VLSI architecture with low memory bandwidth for power-efficient multimedia devices
Abstract
A high-performance block-matching VLSI architecture with low memory bandwidth for power-efficient multimedia devices is disclosed. The architecture uses several current blocks with the same spatial address in different current frames to search for the best matched blocks in the search window of the reference frame, based on the best matching algorithm (BMA), to implement motion estimation in video coding. Because several current blocks share one search window, the architecture greatly increases data reuse, accelerates motion estimation, and reduces data bandwidth and power consumption.
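The data-reuse idea in the abstract can be illustrated with a minimal software sketch. It assumes a sum-of-absolute-differences (SAD) cost and an exhaustive search, neither of which the abstract specifies; the function names and toy data are hypothetical. The returned offsets are positions inside the loaded window, not motion vectors relative to the block address.

```python
def sad(block, window, dy, dx):
    """Sum of absolute differences between block and the window patch at (dy, dx)."""
    n = len(block)
    return sum(
        abs(block[y][x] - window[dy + y][dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(block, window):
    """Return (best_offset, best_sad) over every candidate position in the window."""
    n = len(block)
    best = None
    for dy in range(len(window) - n + 1):
        for dx in range(len(window[0]) - n + 1):
            cost = sad(block, window, dy, dx)
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best


def match_shared_window(current_blocks, window):
    """Match several co-located current blocks against ONE loaded search window."""
    return [full_search(block, window) for block in current_blocks]


# Toy data (assumed): a 4x4 search window and two 2x2 current blocks that sit
# at the same spatial address in two different frames.
window = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
blocks = [
    [[5, 6], [9, 10]],     # hypothetical block from frame t
    [[10, 11], [14, 15]],  # hypothetical block from frame t - T
]
results = match_shared_window(blocks, window)
print(results)
```

The window is fetched once and then matched against every current block, which is the source of the bandwidth saving the abstract describes.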
12 Citations
10 Claims
1. A high-performance block-matching Very-Large-Scale Integration (VLSI) architecture, for executing a motion estimation of a coding operation with low memory bandwidth, for a power-efficient multimedia device, the high-performance block-matching VLSI architecture comprising:
an external memory, for saving data of a search window of a reference frame;
a motion estimation hardware processor, for finding out a plurality of corresponding best matched blocks and a plurality of corresponding motion vectors of a plurality of current blocks of different time frames of a same address from the search window according to a best matching algorithm (BMA); and
a data bus, coupled to the external memory and the motion estimation processor for transmitting data,
wherein the motion estimation processor comprises an internal memory, a memory processing block, an address selection processing block, a predicting search path processing block, a BMA processing block, and a motion estimation result processing block,
wherein the memory processing block controls a data access operation between the internal memory and the external memory at least by loading at the same time data of a plurality of different current blocks at the same address comprising blocks of time frame t − n*T, and the search window is in the reference frame, wherein n are all positive integers between and including 0 and m, t is the time of a current block, and T is the interval between two frames;
the address selection processing block selects a current block address in a current frame;
the predicting search path processing block executes a prediction of a search path regarding the plurality of current blocks according to the current block address selected by the address selection processing block, so as to predict the search path corresponding to the plurality of current blocks in the search window, wherein motion vectors, adaptive search ranges, and search paths of a plurality of adjacent blocks of the current block are obtained, a predicted motion vector and a predicted adaptive search range of the current block are predicted, and the predicted search path in the search window is predicted according to the predicted motion vector, the predicted adaptive search range, and a current search pattern of the current block and the search path of the adjacent blocks;
the BMA processing block loads only data designated by the predicted search path of the search window of the reference frame t − m*T − T from the external memory to the internal memory, and finds out the best matched blocks and the motion vectors by the BMA, according to the search path predicted by the predicting search path processing block, wherein data designated by the predicted search path is less than data of the search window, m is a positive integer greater than zero and is the maximum number of frames used for motion estimation minus 1; and
the motion estimation result processing block recording the motion vectors of the plurality of current blocks and the best matched blocks.
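As a rough illustration of the frame indexing and search-path prediction recited in claim 1, the sketch below computes the current-frame times t − n*T and the reference-frame time t − m*T − T, then derives a reduced candidate set from neighboring motion vectors. The claim does not say how the prediction is computed; the median predictor, the range rule, and the square candidate pattern are all assumptions, as are the function names. The "current search pattern" and the neighbors' search paths are left out for brevity.

```python
from statistics import median


def frame_times(t, T, m):
    """Times of the m + 1 current frames and of the single reference frame."""
    current = [t - n * T for n in range(m + 1)]
    reference = t - m * T - T
    return current, reference


def predict_mv(neighbor_mvs):
    """Component-wise median of neighbor motion vectors (assumed predictor)."""
    return (
        int(median(mv[0] for mv in neighbor_mvs)),
        int(median(mv[1] for mv in neighbor_mvs)),
    )


def predict_range(neighbor_mvs, predicted, max_range):
    """Largest neighbor deviation from the prediction plus one, clamped (assumed rule)."""
    spread = max(
        max(abs(mv[0] - predicted[0]), abs(mv[1] - predicted[1]))
        for mv in neighbor_mvs
    )
    return min(spread + 1, max_range)


def predict_search_path(predicted, search_range, max_range):
    """Candidate displacements around the prediction, kept inside the full window."""
    path = []
    for dy in range(predicted[0] - search_range, predicted[0] + search_range + 1):
        for dx in range(predicted[1] - search_range, predicted[1] + search_range + 1):
            if abs(dy) <= max_range and abs(dx) <= max_range:
                path.append((dy, dx))
    return path


current, reference = frame_times(t=10, T=1, m=2)
neighbors = [(1, 2), (1, 3), (2, 2)]  # hypothetical neighbor motion vectors
mv = predict_mv(neighbors)
rng = predict_range(neighbors, mv, max_range=8)
path = predict_search_path(mv, rng, max_range=8)
full_window = (2 * 8 + 1) ** 2  # candidate count of an exhaustive +/-8 search
print(current, reference, mv, rng, len(path), full_window)
```

With these toy values the predicted path covers 25 candidates against 289 for the full window, which matches the claim's requirement that the data designated by the path be less than the data of the search window.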
4. A high-performance block-matching method for a Very-Large-Scale Integration (VLSI) architecture, for executing a motion estimation of a coding operation with low memory bandwidth, for a power-efficient multimedia device, the high-performance block-matching method comprising:
performing by one or more processors:
step A: starting the motion estimation, wherein data to be searched comprises data of a search window of a reference frame and data of a plurality of current blocks of different consecutive time frames of a same block address are saved in an external memory, then entering step B, wherein the plurality of current blocks comprise a block of each time frame t − n*T, and the search window of the reference frame is at time t − m*T − T, wherein n are all positive integers between and including 0 and m, t is the time of a current block at time t, T is the interval between two frames, and m is a positive integer greater than zero and is the maximum number of frames used for motion estimation minus 1;
step B: selecting an address of the current blocks, then entering step C;
step C: loading the data of one current block corresponding to the address of the current block to an internal memory, then entering step D;
step D: finding out a predicted search path, then entering step E, wherein the step D further comprises: obtaining motion vectors, adaptive search ranges, and search paths of a plurality of adjacent blocks of the current block; predicting to obtain a predicted motion vector and a predicted adaptive search range of the current block; and predicting to obtain the predicted search path in the search window, according to the predicted motion vector, the predicted adaptive search range, and a current search pattern of the current block and the search path of the adjacent blocks;
step E: loading only data designated by the predicted search path in the search window from the external memory to the internal memory, wherein only data designated by the predicted search path is less than data of the search window, then entering step F;
step F: executing a best matching algorithm (BMA) matching operation according to a BMA to find out a best matched block according to the data designated by the predicted search path, then entering step G;
step G: determining if the BMA matching operation has been executed to all of the current blocks having the same block address, then entering step I, else entering step H;
step H: loading another current block having the address, and returning back to step D;
step I: completing the motion estimation of the current blocks having the address, according to the BMA matching operation result of the same address, then entering step J;
step J: determining whether BMA matching operations of current blocks of all current block addresses have been completed, then entering step L, else entering step K;
step K: selecting another current block address, and returning back to step C;
step L: generating a motion estimation result, then entering step M; and
step M: completing the motion estimation.
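The control flow of steps A to M can be sketched as two nested loops: an outer loop over block addresses (steps B, J, K) and an inner loop over the co-located current blocks of different frames (steps C, G, H). This is only an illustrative software model, not the claimed hardware. The three callables, the 1-D toy search window, and the "left neighbor" prediction rule are hypothetical stand-ins for behavior the claim leaves open.

```python
def run_motion_estimation(addresses, frames, predict_path, load_path_data, bma):
    """Walk the claimed steps A-M; the three callables are hypothetical stand-ins."""
    result = {}                                   # step A: start
    for address in addresses:                     # steps B, J, K: iterate addresses
        per_address = []
        for frame in frames:                      # steps C, G, H: co-located blocks
            block = frame[address]                # load one current block
            path = predict_path(address, result)  # step D: predicted search path
            data = load_path_data(path)           # step E: load only the path data
            per_address.append(bma(block, data))  # step F: best match on that data
        result[address] = per_address             # step I: address finished
    return result                                 # steps L, M: final result


reference = {-2: 10, -1: 20, 0: 30, 1: 40, 2: 50}  # toy 1-D search window (assumed)
loads = []  # number of samples fetched by each step E


def predict_path(address, result):
    """Search near the left neighbor's displacement, or everywhere if none exists."""
    neighbor = result.get(address - 1)
    if neighbor is None:
        return sorted(reference)
    center = neighbor[0][0]  # displacement found for the neighbor's first block
    return [d for d in (center - 1, center, center + 1) if d in reference]


def load_path_data(path):
    """Fetch only the samples named by the path and record how many were read."""
    loads.append(len(path))
    return {d: reference[d] for d in path}


def bma(block, data):
    """Pick the displacement whose sample is closest to the block value."""
    best = min(data, key=lambda d: abs(data[d] - block))
    return best, abs(data[best] - block)


frames = [{0: 41, 1: 49}, {0: 52, 1: 38}]  # toy frames t and t - T
result = run_motion_estimation([0, 1], frames, predict_path, load_path_data, bma)
print(result, loads)
```

In this toy run the first address has no neighbor and reads the whole window, while the second address reads only three samples per block, mirroring the step E requirement that the loaded data be less than the full search window.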
Specification