Adaptive Motion Estimation Architecture for H.264/AVC Video Codec

This study contributes to the domain of application-specific adaptive hardware architectures through a design approach that addresses the processing element array, interconnect structure, and memory interface concurrently. As summarized below, our architectural design choices push the limits of on-chip data reuse and avoid redundant computations, both of which are essential to meeting the high-throughput, small-area, and low-power demands of the consumer market.

Motion estimation (ME) is a key component in the H.264/AVC standard. Full Search (FS) based ME achieves optimal peak signal-to-noise ratio (PSNR) and is the most widely adopted algorithm for developing hardware motion estimators. In this study, we first design a variable block size motion estimation (VBSME) engine based on hybrid-grained processing elements (PEs) and a 2D programmable interconnect structure, which is adaptive to all block size configurations of H.264. PEs operate in a bit-serial manner using MSB-first arithmetic for early termination to reduce the amount of computation, and the 2D architecture enables on-chip data reuse between neighboring PEs in a bit-by-bit pipelined fashion. Our design reduces the gate count by 7x compared to its ASIC counterpart, operates at a comparable frequency while sustaining 30 and 60 frames per second (fps), and outperforms bit-parallel and bit-serial architectures in terms of throughput and performance per gate.

Numerous fast search algorithms (diamond, hexagon, three-step, etc.) have been developed to reduce the computational burden and the excessive number of memory transactions required by FS, at the cost of compression quality. We improve our VBSME engine and introduce the first adaptive ME architecture that gives the end user the flexibility of choosing between high-quality video service in a power-rich state (FS mode) and extended video service (fast search mode).
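The MSB-first early-termination idea mentioned above can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the dissertation's PE design: the function names (`sad`, `msb_first_less_than`, `full_search`), the 16-bit cost width, and the list-of-lists frame layout are all hypothetical choices made for illustration. It shows that when two sum-of-absolute-differences (SAD) costs are compared starting from the most significant bit, the comparison is decided at the first differing bit, so the remaining bits need not be examined.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). The caller must keep the displaced block
    inside the frame (Python would silently wrap negative indices)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def msb_first_less_than(candidate, best, width=16):
    """Compare two unsigned `width`-bit values starting from the MSB.

    Returns (candidate < best, number of bit positions examined).
    The result is known as soon as the first differing bit is seen.
    """
    for examined, bit in enumerate(range(width - 1, -1, -1), start=1):
        c = (candidate >> bit) & 1
        b = (best >> bit) & 1
        if c != b:
            return c < b, examined
    return False, width  # values are equal, every bit was needed


def full_search(cur, ref, bx, by, n=8, search_range=4, width=16):
    """Exhaustive search over all displacements in [-search_range, search_range].

    Returns (best motion vector, best SAD, total bits examined by the
    MSB-first comparisons). `width` must be large enough for the largest
    possible SAD (16 bits suffice for 8-bit pixels up to 16 x 16 blocks).
    """
    best_cost, best_mv, bits_examined = None, None, 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None:
                best_cost, best_mv = cost, (dx, dy)
                continue
            smaller, examined = msb_first_less_than(cost, best_cost, width)
            bits_examined += examined
            if smaller:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost, bits_examined
```

In this sketch the savings come only from the comparison step; a bit-serial hardware PE could in principle stop producing further bits of a candidate once it is known to lose, which is the motivation the abstract gives for MSB-first arithmetic.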
We resolve the irregular indexing scheme challenge of three-step search (3SS) by introducing an on-chip buffer structure with a memory interface that is adaptive to the data access patterns of both the FS and 3SS methods. The architecture sustains real-time encoding of CIF format (352x288) video at 30 fps with an operating frequency as low as 17.6 MHz and consumes 1.98 mW in 45 nm technology, outperforming all other FS and 3SS architectures.
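To make the FS versus 3SS trade-off concrete, the sketch below implements the classic three-step search (step sizes 4, 2, 1, covering a ±7 window) next to an exhaustive search, and computes the per-macroblock cycle budget implied by the CIF, 30 fps, 17.6 MHz figures quoted above. It is a software illustration under assumptions, not the dissertation's architecture: all function names, the 8x8 default block size, and the frame layout are hypothetical, and the 16x16 macroblock size is the standard H.264 value rather than something stated in the abstract.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy); the caller keeps all accesses in-frame."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n=8, search_range=7):
    """Evaluate every displacement in the window; returns (mv, cost, count)."""
    candidates = [
        (sad(cur, ref, bx, by, dx, dy, n), (dx, dy))
        for dy in range(-search_range, search_range + 1)
        for dx in range(-search_range, search_range + 1)
    ]
    cost, mv = min(candidates)
    return mv, cost, len(candidates)


def three_step_search(cur, ref, bx, by, n=8, first_step=4):
    """Classic 3SS: test the 8 neighbors at the current step size, move the
    center to the best point, halve the step, repeat until the step is 0.
    Returns (mv, cost, number of candidates evaluated)."""
    cx, cy = 0, 0
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    evaluated = 1
    step = first_step
    while step >= 1:
        best_dx, best_dy = cx, cy
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue  # center already evaluated
                cost = sad(cur, ref, bx, by, cx + ox, cy + oy, n)
                evaluated += 1
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, cx + ox, cy + oy
        cx, cy = best_dx, best_dy
        step //= 2
    return (cx, cy), best_cost, evaluated


def cycles_per_macroblock(width, height, fps, clock_hz, mb_size=16):
    """Clock cycles available per macroblock for real-time encoding."""
    macroblocks = (width // mb_size) * (height // mb_size)
    return clock_hz / (macroblocks * fps)
```

For a ±7 window, 3SS evaluates 25 candidates against 225 for the exhaustive search, and its access pattern jumps between non-adjacent positions, which is the irregular indexing the on-chip buffer is said to address. With the quoted figures, `cycles_per_macroblock(352, 288, 30, 17.6e6)` gives a budget of roughly 1,481 cycles per macroblock (396 macroblocks per frame at 30 fps).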

Identifier oai:union.ndltd.org:arizona.edu/oai:arizona.openrepository.com:10150/145460
Date January 2011
Creators Song, Yang
Contributors Akoglu, Ali; Akoglu, Ali; Wang, Janet; Hariri, Salim
Publisher The University of Arizona.
Source Sets University of Arizona
Language English
Detected Language English
Type Electronic Dissertation, text
Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
