
Advancing Learned Lossy Image Compression through Knowledge Distillation and Contextual Clustering

<p dir="ltr">In recent decades, the rapid growth of internet traffic, particularly driven by high-definition images/videos has highlighted the critical need for effective image compression to reduce bit rates and enable efficient data transmission. Learned lossy image compression (LIC), which uses end-to-end deep neural networks, has emerged as a highly promising method, even outperforming traditional methods such as the intra-coding of the versatile video coding (VVC) standard. This thesis contributes to the field of LIC in two ways. First, we present a theoretical bound-guided knowledge distillation technique, which utilizes estimated bound information rate-distortion (R-D) functions to guide the training of LIC models. Implemented with a modified hierarchical variational autoencoder (VAE), this method demonstrates superior rate-distortion performance with reduced computational complexity. Next, we introduce a token mixer neural architecture, referred to as <i>contextual clustering</i>, which serves as an alternative to conventional convolutional layers or self-attention mechanisms in transformer architectures. Contextual clustering groups pixels based on their cosine similarity and uses linear layers to aggregate features within each cluster. By integrating with current LIC methods, we not only improve coding performance but also reduce computational load. </p>

DOI: 10.25394/pgs.27320958.v1
Identifier: oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/27320958
Date: 29 October 2024
Creators: Yichi Zhang (19960344)
Source Sets: Purdue University
Detected Language: English
Type: Text, Thesis
Rights: CC BY 4.0
Relation: https://figshare.com/articles/thesis/Advancing_Learned_Lossy_Image_Compression_through_Knowledge_Distillation_and_Contextual_Clustering/27320958
