
Visually lossless coding for the HEVC standard: efficient perceptual quantisation contributions for HEVC

In the context of video compression, visually lossless coding refers to a form of perceptual compression with two objectives: i) to lossy code a raw video sequence at the lowest possible bitrate; ii) to ensure that the compressed sequence is perceptually identical to the raw video data. Because it can achieve bitrate reductions that are otherwise unattainable, the research and development of visually lossless coding techniques (e.g., perceptual quantisation methods) is considered important in contemporary video compression research, particularly for the High Efficiency Video Coding (HEVC) standard. The default quantisation techniques in HEVC, namely Uniform Reconstruction Quantisation (URQ) and Rate Distortion Optimised Quantisation (RDOQ), are not perceptually optimised. Neither URQ nor RDOQ takes into account the Modulation Transfer Function (MTF)-based visual masking properties of the Human Visual System (HVS), e.g., luma and chroma spatial masking. Moreover, URQ and RDOQ do not intrinsically distinguish luma data from chroma data. Both of these shortcomings can lead to coding inefficiency (i.e., bits are wasted on perceptually irrelevant data). It is therefore desirable to develop visually lossless coding (perceptual quantisation) techniques for HEVC. For example, by taking chrominance masking into account, perceptual quantisation techniques can be designed to discard, to a very high degree, chroma-based psychovisual redundancies in the chroma channels of raw YCbCr video data. To this end, four novel perceptual quantisation contributions are proposed in this thesis. In Chapter 3, a novel transform coefficient-level perceptual quantisation method is proposed. In HEVC, each frequency sub-band in the Discrete Cosine Transform (DCT) frequency domain carries a different level of perceptual importance to the HVS.
In terms of perceptual importance, the DC coefficient (very low frequency) is the most important transform coefficient, whereas the AC coefficients farthest from the DC coefficient (very high frequency AC coefficients) are the least perceptually relevant. The proposed technique therefore quantises AC coefficients based on their Euclidean distance from the DC coefficient. In Chapter 4, two novel perceptual quantisation methods are proposed, both based on HVS visual masking in the spatial domain. The first technique operates at the Coding Unit (CU) level and the second at the Coding Block (CB) level. Both exploit the fact that the HVS can tolerate high levels of distortion in high variance (busy) regions of compressed luma and chroma data. The CU-level method adjusts the Quantisation Parameter (QP) of a 2N×2N CU based on cross colour channel variance computations. The CB-level technique separately adjusts the QP of the Y, Cb and Cr CBs in a CU based on separate variance computations in each colour channel. In Chapter 5, a novel CB-level luma and chroma perceptual quantisation technique, based on a Just Noticeable Distortion (JND) model, is proposed for HEVC. Its objective is to attain visually lossless coding at extremely low bitrates by exploiting HVS-related luminance adaptation and chrominance adaptation, which in turn facilitates JND perceptual quantisation based on luminance spatial masking and chrominance spatial masking. The proposed technique applies high levels of perceptual quantisation to luma and chroma data by separately adjusting the Quantisation Step Sizes (QSteps) at the level of the Y CB, the Cb CB and the Cr CB in a CU. To the best of the author's knowledge, this is the first JND-based perceptual quantisation technique that is compatible with high bit depth YCbCr data irrespective of its chroma sampling ratio.
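The distance-based idea underpinning the Chapter 3 technique can be sketched as follows. This is an illustrative toy in Python, not the quantiser proposed in the thesis: the linear weighting, the `strength` parameter and the base step size of 16 are hypothetical choices made purely for demonstration.

```python
import math

def distance_weighted_qsteps(block_size=8, base_qstep=16.0, strength=0.5):
    """Illustrative per-coefficient quantisation step sizes (QSteps).

    AC coefficients farther (in Euclidean distance) from the DC
    coefficient at position (0, 0) receive larger step sizes, i.e.
    coarser quantisation, reflecting their lower perceptual
    importance.  The linear weighting and `strength` are assumptions.
    """
    max_dist = math.hypot(block_size - 1, block_size - 1)
    return [
        [base_qstep * (1.0 + strength * math.hypot(r, c) / max_dist)
         for c in range(block_size)]
        for r in range(block_size)
    ]

def quantise_block(coeffs, qsteps):
    """Scalar quantisation of a block of DCT coefficients."""
    return [
        [round(coeff / step) for coeff, step in zip(crow, qrow)]
        for crow, qrow in zip(coeffs, qsteps)
    ]

qsteps = distance_weighted_qsteps()
# The DC coefficient keeps the base step size, while the
# highest-frequency AC coefficient at (7, 7) receives the largest
# step and is therefore quantised most coarsely.
```

In this toy formulation the DC coefficient is left at the base step size and only the AC coefficients are coarsened, which mirrors the ordering of perceptual importance described above.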
The novel techniques proposed in this thesis are evaluated thoroughly. The experimental methodology consists of an exhaustive subjective visual quality assessment in addition to an extensive objective visual quality evaluation. The subjective evaluation is based on the International Telecommunication Union (ITU) standardised assessment ITU-T Rec. P.910. In these tests, several participants undertake a considerable number of subjective visual inspections (e.g., spatiotemporal analyses of the compressed sequences versus the raw video data) to ascertain the efficacy of the proposed contributions. The objective evaluation quantifies the mathematical reconstruction quality of the video data compressed by the proposed techniques, employing the Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR) visual quality metrics.
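For reference, PSNR, one of the two objective metrics used, has a standard closed form: 10·log10(MAX²/MSE). A minimal Python sketch of the standard definition (not code from the thesis) for 8-bit pixel data:

```python
import math

def psnr(ref, rec, max_val=255.0):
    """Peak Signal-to-Noise Ratio (dB) between a reference frame and
    its reconstruction, each given as a flat sequence of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0.0:
        return float("inf")  # mathematically lossless reconstruction
    return 10.0 * math.log10(max_val ** 2 / mse)

# A reconstruction that differs by 1 at every pixel has MSE = 1,
# giving PSNR = 10 * log10(255^2), roughly 48.13 dB.
print(round(psnr([100, 120, 130, 140], [101, 119, 131, 139]), 2))
```

SSIM, by contrast, compares local luminance, contrast and structure statistics over sliding windows, so in practice it is computed with an established library implementation rather than a few lines of arithmetic.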

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:752470
Date: January 2017
Creators: Prangnell, Lee
Publisher: University of Warwick
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://wrap.warwick.ac.uk/106761/
