Return to search

On the Rate-Distortion-Perception Tradeoff for Lossy Compression

Deep generative models when utilized in lossy image compression tasks can reconstruct realistic looking outputs even at extremely low bit-rates, while traditional compression methods often exhibit noticeable artifacts under similar conditions. As a result, there has been a substantial surge of interest in both the information theoretic aspects and the practical architectures of deep learning based image compression. This thesis makes contributions to the emerging framework of rate-distortion-perception theory. The main results are summarized as follows:
1. We investigate the tradeoff among rate, distortion, and perception for binary sources. The distortion considered here is the Hamming distortion and the perception quality is measured by the total variation distance. We first derive a closed-form expression for the rate-distortion-perception tradeoff in the one-shot setting. This is followed by a complete characterization of the achievable distortion-perception region for a general representation. We then consider the universal setting in which the encoder is one-size-fits-all, and derive upper and lower bounds on the minimum rate penalty. Finally, we study successive refinement for both point-wise and set-wise versions of perception-constrained lossy compression. A necessary and sufficient condition for point-wise successive refinement and a sufficient condition for the successive refinability of universal representations are provided.
2. Next, we characterize the expression for the rate-distortion-perception function of vector Gaussian sources, which extends the result in the scalar counterpart, and show that in the high-perceptual-quality regime, each component of the reconstruction (including high-frequency components) is strictly correlated with that of the source, which is in contrast to the traditional water-filling solution. This result is obtained by optimizing over all possible encoder-decoder pairs subject to the distortion and perception constraints. We then consider the notion of universal representation where the encoder is fixed and the decoder is adapted to achieve different distortion-perception pairs. We characterize the achievable distortion-perception region for a fixed representation and demonstrate that the corresponding distortion-perception tradeoff is approximately optimal.
Our findings significantly enrich the nascent rate-distortion-perception theory, establishing a solid foundation for the field of learned image compression. / None / Doctor of Philosophy (PhD)

Identiferoai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/28976
Date January 2023
CreatorsQian, Jingjing
ContributorsChen, Jun, Electrical and Computer Engineering
Source SetsMcMaster University
LanguageEnglish
Detected LanguageEnglish
TypeThesis

Page generated in 0.0063 seconds