Modeling the motion that occurs between frames of a video sequence is a key component of video coding applications. Typically, the motion between frames cannot be represented by a single model, so a quad-tree structure is employed in which smaller, variable-size regions or blocks are allowed to take on separate motion models. Quad-tree structures, however, suffer from two fundamental forms of redundancy. First, quad-trees exhibit structural redundancy due to their inability to exploit the dependence between neighboring leaf nodes with different parents. Second, the quad-tree structure itself can capture only horizontal and vertical edge discontinuities at dyadically related locations; this means that general discontinuities in the motion field, such as those caused by boundaries of moving objects, become difficult and expensive to model. In our work, we address the issue of structural redundancy by introducing leaf merging. We describe how the intuitively appealing leaf merging step can be incorporated into quad-tree motion representations for a range of motion modeling contexts. In particular, we consider the impact of rate-distortion (R-D) optimized merging for two motion coding schemes: spatially predictive coding, as used by H.264, and hierarchical coding. Our experimental results demonstrate that the merging step can provide significant gains in R-D performance for both the hierarchical and spatial prediction schemes. Hierarchical coding has the advantage that it offers scalable access to the motion information; however, due to the redundancy it introduces, hierarchical coding has not traditionally been pursued. Our work shows that much of this redundancy can be mitigated with the introduction of merging. To enable scalable decoding, we employ a merging scheme which ensures that the dependencies introduced via merging can be hierarchically decoded.
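The idea of R-D optimized leaf merging can be illustrated with a minimal sketch. It is not the scheme developed in the thesis: it assumes the standard Lagrangian cost J = D + λR, uses a toy distortion proxy (area-weighted squared deviation of each leaf's own motion vector from the region's mean vector), charges a fixed hypothetical rate per region, and ignores the bits needed to signal the merges. All names (`Leaf`, `adjacent`, `region_cost`, `merge_leaves`, `MV_BITS`) are illustrative.

```python
from dataclasses import dataclass
from itertools import combinations

MV_BITS = 16  # hypothetical fixed rate (bits) to code one motion vector


@dataclass(frozen=True)
class Leaf:
    x: int      # top-left corner of the square block
    y: int
    size: int   # side length of the block
    mv: tuple   # (dx, dy): best motion vector for this block on its own


def adjacent(a, b):
    """True if two square blocks share a boundary segment of nonzero length."""
    overlap_x = min(a.x + a.size, b.x + b.size) - max(a.x, b.x)
    overlap_y = min(a.y + a.size, b.y + b.size) - max(a.y, b.y)
    touch_x = a.x + a.size == b.x or b.x + b.size == a.x
    touch_y = a.y + a.size == b.y or b.y + b.size == a.y
    return (touch_x and overlap_y > 0) or (touch_y and overlap_x > 0)


def region_cost(leaves, lam):
    """Toy Lagrangian cost J = D + lam * R for a region sharing one vector."""
    area = sum(l.size ** 2 for l in leaves)
    mean_dx = sum(l.size ** 2 * l.mv[0] for l in leaves) / area
    mean_dy = sum(l.size ** 2 * l.mv[1] for l in leaves) / area
    # Assumed distortion proxy: area-weighted squared motion mismatch.
    distortion = sum(
        l.size ** 2 * ((l.mv[0] - mean_dx) ** 2 + (l.mv[1] - mean_dy) ** 2)
        for l in leaves
    )
    return distortion + lam * MV_BITS


def merge_leaves(leaves, lam):
    """Greedily merge adjacent regions while a merge lowers the total cost."""
    regions = [[l] for l in leaves]
    while True:
        best = None
        for i, j in combinations(range(len(regions)), 2):
            if not any(adjacent(a, b) for a in regions[i] for b in regions[j]):
                continue
            gain = (region_cost(regions[i], lam) + region_cost(regions[j], lam)
                    - region_cost(regions[i] + regions[j], lam))
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, i, j)
        if best is None:
            return regions
        _, i, j = best
        regions[i] = regions[i] + regions[j]
        del regions[j]


# An undivided 8x8 quadrant next to a quadrant split into four 4x4 leaves.
leaves = [
    Leaf(0, 0, 8, (1, 0)),
    Leaf(8, 0, 4, (1, 0)), Leaf(12, 0, 4, (5, 5)),
    Leaf(8, 4, 4, (1, 0)), Leaf(12, 4, 4, (5, 5)),
]
regions = merge_leaves(leaves, lam=1.0)
print(len(regions), "regions after merging")
```

In this toy input the 8x8 leaf and the two 4x4 leaves with the same motion vector sit under different parents, so a plain quad-tree would code that vector three times; the merge step groups them into one region and leaves the two remaining 4x4 leaves as a second region.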
Theoretical investigations confirm the inherent advantages of leaf merging for quad-tree motion models. To enable quad-tree structures to better model motion discontinuity boundaries, we introduce geometry information to the quad-tree representation. We choose to model motion and geometry using separate quad-tree structures, thereby enabling each attribute to be refined separately. We extend the leaf merging paradigm to incorporate the dual tree structure, allowing regions to be formed that have both motion and geometry attributes, subject to rate-distortion optimization considerations. We employ hierarchical coding for the motion and geometry information and ensure that the merging process retains the property of resolution scalability. Experimental results show that the R-D performance of the merged dual tree representation is significantly better than that of conventional motion modeling schemes. Theoretical investigations show that if both motion and boundary geometry can be perfectly modeled, then the merged dual tree representation is able to achieve optimal R-D performance. We also explore the resolution scalability of merged quad-tree representations, considering a modified Lagrangian cost function that takes into account the possibility of scalable decoding. Experimental results reveal that the new cost objective can considerably improve scalability performance without significant loss in overall efficiency and with competitive performance at all resolutions.
Identifier | oai:union.ndltd.org:ADTP/272624 |
Date | January 2009 |
Creators | Mathew, Reji Kuruvilla, Electrical Engineering & Telecommunications, Faculty of Engineering, UNSW |
Publisher | Awarded by: University of New South Wales. Electrical Engineering & Telecommunications |
Source Sets | Australasian Digital Theses Program |
Language | English |
Detected Language | English |
Rights | Copyright Mathew Reji Kuruvilla, http://unsworks.unsw.edu.au/copyright |