31 |
Image and Video Coding/Transcoding: A Rate Distortion Approach / Yu, Xiang, January 2008
Due to the lossy nature of image/video compression and the expensive bandwidth and computation resources in a multimedia system, one of the key design issues for image and video coding/transcoding is to optimize the trade-off among distortion, rate, and/or complexity. This thesis studies the application of rate distortion (RD) optimization approaches to image and video coding/transcoding, both to explore the best RD performance of a video codec compatible with the newest video coding standard H.264 and to design computationally efficient down-sampling algorithms with high visual fidelity in the discrete cosine transform (DCT) domain.
RD optimization for video coding in this thesis considers two objectives: to achieve the best encoding efficiency in terms of minimizing the actual RD cost, and to maintain decoding compatibility with the newest video coding standard H.264. By the actual RD cost, we mean a cost based on the final reconstruction error and the entire coding rate. Specifically, an operational RD method is proposed based on a soft decision quantization (SDQ) mechanism, which has its roots in a fundamental RD theoretic study on fixed-slope lossy data compression. Using SDQ instead of hard decision quantization, we establish a general framework in which motion prediction, quantization, and entropy coding in a hybrid video coding scheme such as H.264 are jointly designed to minimize the actual RD cost on a frame basis. The proposed framework can be applied to optimize any hybrid video coding scheme, provided that specific algorithms are designed for the coding syntaxes of a given standard codec, so as to maintain compatibility with the standard.
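The difference between hard and soft decision quantization can be illustrated with a minimal per-coefficient sketch. It is not the graph-based algorithm of the thesis, which works with the actual CAVLC or CABAC rate over whole blocks. The rate model `toy_rate_bits`, the candidate set, and all function names are assumptions made for illustration only. Hard decision quantization picks the nearest reconstruction level, while the soft decision picks the level that minimizes the Lagrangian cost D + λR.

```python
import math


def toy_rate_bits(level):
    # Hypothetical rate model (NOT CAVLC or CABAC): larger magnitudes cost more bits.
    return 1.0 + 2.0 * math.log2(1 + abs(level))


def rd_cost(coeff, level, step, lam):
    # Lagrangian cost: squared reconstruction error plus lambda times the rate.
    distortion = (coeff - level * step) ** 2
    return distortion + lam * toy_rate_bits(level)


def hard_decision(coeff, step):
    # Hard decision quantization: nearest reconstruction level, rate is ignored.
    return int(round(coeff / step))


def soft_decision(coeff, step, lam):
    # Soft decision quantization (toy version): test a few levels around the
    # hard decision, plus zero, and keep the one with the lowest RD cost.
    hdq = hard_decision(coeff, step)
    candidates = sorted({hdq - 1, hdq, hdq + 1, 0})
    return min(candidates, key=lambda q: rd_cost(coeff, q, step, lam))


if __name__ == "__main__":
    for c in (0.4, 1.1, 3.7, -9.9):
        print(c, hard_decision(c, 2.0), soft_decision(c, 2.0, 1.5))
```

Because the hard decision is always among the candidates, the soft decision never has a higher cost under the same rate model. With λ = 0 the two coincide.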
Corresponding to the baseline profile syntaxes and the main profile syntaxes of H.264, respectively, we have proposed three RD algorithms: a graph-based algorithm for SDQ given motion prediction and quantization step sizes, an algorithm for residual coding optimization given motion prediction, and an iterative overall algorithm for jointly optimizing motion prediction, quantization, and entropy coding. Each algorithm is embedded in the next, in the order listed. Among the three algorithms, the SDQ design is the core, and it is developed for a given entropy coding method. Specifically, two SDQ algorithms have been developed based on the context adaptive variable length coding (CAVLC) in H.264 baseline profile and the context adaptive binary arithmetic coding (CABAC) in H.264 main profile, respectively.
Experimental results for the H.264 baseline codec optimization show that for a set of typical testing sequences, the proposed RD method for H.264 baseline coding achieves a better trade-off between rate and distortion, i.e., 12% rate reduction on average at the same distortion (ranging from 30 dB to 38 dB by PSNR) when compared with the RD optimization method implemented in H.264 baseline reference codec. Experimental results for optimizing H.264 main profile coding with CABAC show 10% rate reduction over a main profile reference codec using CABAC, which also suggests 20% rate reduction over the RD optimization method implemented in H.264 baseline reference codec, leading to our claim of having developed the best codec in terms of RD performance, while maintaining the compatibility with H.264.
By investigating the trade-off between distortion and complexity, we have also proposed a design framework for image/video transcoding with spatial resolution reduction, i.e., to down-sample compressed images/video with an arbitrary ratio in the DCT domain. First, we derive a set of DCT-domain down-sampling methods, which can be represented by a linear transform with double-sided matrix multiplication (LTDS) in the DCT domain. Then, for a pre-selected pixel-domain down-sampling method, we formulate an optimization problem for finding an LTDS that approximates the given pixel-domain method and achieves the best trade-off between visual quality and computational complexity. The problem is then solved by modeling an LTDS with a multi-layer perceptron network and using a structural learning with forgetting algorithm for training the network. Finally, by selecting a pixel-domain reference method with the popular Butterworth lowpass filtering and cubic B-spline interpolation, the proposed framework discovers an LTDS with better visual quality and lower computational complexity when compared with state-of-the-art methods in the literature.
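A small sketch may help show what a linear transform with double-sided matrix multiplication looks like. It does not reproduce the learned LTDS of the thesis. It assumes a simple separable averaging filter as the pixel-domain method and an orthonormal DCT-II, and all function names are hypothetical. Under those assumptions, pixel-domain down-sampling L·x·R maps exactly to D·X·W in the DCT domain, with D = T_m·L·T_nᵀ and W = T_n·R·T_mᵀ.

```python
import math


def matmul(a, b):
    # Plain list-of-lists matrix product.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct_matrix(n):
    # Orthonormal DCT-II matrix T, so that X = T x T^T and T^T T = I.
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def averaging_matrix(m, n):
    # Assumed pixel-domain method: average groups of n // m neighbouring samples.
    r = n // m
    return [[1.0 / r if j // r == i else 0.0 for j in range(n)] for i in range(m)]


def ltds_pair(m, n):
    # Left and right DCT-domain matrices equivalent to the averaging filter.
    t_n, t_m = dct_matrix(n), dct_matrix(m)
    left_pixel = averaging_matrix(m, n)
    right_pixel = transpose(left_pixel)
    d = matmul(matmul(t_m, left_pixel), transpose(t_n))
    w = matmul(matmul(t_n, right_pixel), transpose(t_m))
    return d, w


def downsample_dct(x_dct, d, w):
    # Double-sided multiplication: Y = D X W, entirely in the DCT domain.
    return matmul(matmul(d, x_dct), w)
```

In this toy case D and W are dense and exact. The framework described above instead searches for matrices that approximate a chosen pixel-domain method while keeping the multiplication cheap.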
|
33 |
Fast Mode Decision Mechanism for Coding Efficiency Improvement in H.264/AVC and SVC / Chou, Bo-Yin, 04 August 2009
To speed up the encoding process of H.264/AVC and Scalable Video Coding (SVC), this thesis proposes two fast mode decision algorithms: the Temporal and Spatial Correlation-based Merging and Splitting (TSCMS) algorithm, applied to H.264/AVC, and a Coded Block Pattern (CBP)-based algorithm, applied to SVC. In TSCMS, Temporal Correlation (TC) is used to predict the Motion Vectors (MVs) of the 8×8 blocks in each macroblock. In addition, a merging and splitting procedure is adopted to predict the motion vectors of other blocks. Afterwards, spatial correlation is used to merge 16×16 blocks instead of the conventional merge scheme. In the CBP-based fast mode decision algorithm, the CBP value is the syntax element in each Macroblock (MB) header that indicates whether an MB contains residual information. The proposed algorithm excludes invalid modes from the mode prediction of the current MB in the Enhancement Layer (EL) by using the CBP values and MB modes of adjacent MBs in the EL and the MB modes of the co-located Base Layer (BL) MB. Experimental results show that the proposed algorithms reduce computation significantly, with negligible PSNR degradation and bit-rate increase, when compared to JM 12.3, JSVM 9.12, and other existing methods.
|
34 |
Object based video coding / Shamim, Md. Ahsan, January 2000
Thesis (M.Eng.)--Memorial University of Newfoundland, 2001. / Bibliography: leaves 108-112.
|
35 |
Extensive operators in lattices of partitions for digital video analysis / Gatica Perez, Daniel. January 2001
Thesis (Ph. D.)--University of Washington, 2001. / Vita. Includes bibliographical references (p. 169-184).
|
36 |
Fast multi-frame and multi-block selection for H.264 video coding standard / Chang, Andy. January 2003
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2003. / Includes bibliographical references (leaves 57-58). Also available in electronic version. Access restricted to campus users.
|
37 |
Efficient intra prediction algorithm in H.264 / Meng, Bojun. January 2003
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2003. / Includes bibliographical references (leaves 66-68). Also available in electronic version. Access restricted to campus users.
|
38 |
Computational complexity reduction in the spatial scalable video coding encoder / Luo, Enming. January 2009
Includes bibliographical references (p. 65-67).
|
39 |
Cross-layer perceptual optimization for wireless video transmission / Abdel Khalek, Amin Nazih, 21 January 2014
Bandwidth-intensive video streaming applications occupy an overwhelming fraction of bandwidth-limited wireless network traffic. Compressed video data are highly structured and the psycho-visual perception of distortions and losses closely depends on that structure. This dissertation exploits the inherent video data structure to develop perceptually-optimized transmission paradigms at different protocol layers that improve video quality of experience, introduce error resilience, and enable supporting more video users.
First, we consider the problem of network-wide perceptual quality optimization whereby different video users with (possibly different) real-time delay constraints are sharing wireless channel resources. Due to the inherently stochastic nature of wireless fading channels, we provide statistical delay guarantees using the theory of effective capacity. We derive the resource allocation policy that maximizes the sum video quality and show that the optimal operating point per user is such that the rate-distortion slope is the inverse of the supported video source rate per unit bandwidth, termed source spectral efficiency. We further propose a scheduling policy that maximizes the number of scheduled users that meet their QoS requirement.
Next, we develop user-level perceptual quality optimization techniques for non-scalable video streams. For non-scalable videos, we estimate packet loss visibility through a generalized linear model and use it for prioritized packet delivery. We solve the problem of mapping video packets to MIMO subchannels and adapting per-stream rates to maximize the total perceptual value of successfully delivered packets per unit time. We show that the solution jointly yields gains in video quality and latency. Optimized packet-stream mapping enables transmission of more relevant packets over more reliable streams, while unequal modulation opportunistically increases the transmission rate on the stronger streams to enable low-latency delivery of high-priority packets.
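The idea of sending more relevant packets over more reliable streams can be sketched as a simple sorted matching. This is a toy illustration under strong assumptions: one packet per stream, a known per-stream delivery probability, and no rate adaptation. It is not the optimization solved in the dissertation. The names `packet_values` and `stream_success_probs` are hypothetical.

```python
def map_packets_to_streams(packet_values, stream_success_probs):
    # Rank packets by perceptual value and streams by reliability (both
    # descending), then pair them rank by rank. Returns {packet_index: stream_index}.
    packet_order = sorted(range(len(packet_values)),
                          key=lambda p: packet_values[p], reverse=True)
    stream_order = sorted(range(len(stream_success_probs)),
                          key=lambda s: stream_success_probs[s], reverse=True)
    return dict(zip(packet_order, stream_order))


def expected_value(mapping, packet_values, stream_success_probs):
    # Expected perceptual value of the packets that are delivered successfully.
    return sum(packet_values[p] * stream_success_probs[s]
               for p, s in mapping.items())


if __name__ == "__main__":
    values = [0.9, 0.1, 0.5]    # assumed loss-visibility scores per packet
    probs = [0.6, 0.99, 0.8]    # assumed delivery probability per stream
    mapping = map_packets_to_streams(values, probs)
    print(mapping, expected_value(mapping, values, probs))
```

For this one-to-one setting, pairing in sorted order maximizes the expected delivered value, which follows from the rearrangement inequality.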
Finally, we develop user-level perceptual quality optimization techniques for scalable video streams. We propose online learning of the mapping between packet losses and quality degradation using nonparametric regression. This quality-loss mapping is subsequently used to provide unequal error protection for different video layers with perceptual quality guarantees. Channel-aware scalable codec adaptation and buffer management policies simultaneously ensure continuous high-quality playback. Across the various contributions, analytic results as well as video transmission simulations demonstrate the value of perceptual optimization in improving video quality and capacity. / text
|
40 |
Improved processing techniques for picture sequence coding / 蔡固庭, Choi, Koo-ting. January 1998
published_or_final_version / Electrical Engineering / Master / Master of Philosophy
|