  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Codage d'images avec et sans pertes à basse complexité et basé contenu / Lossy and lossless image coding with low complexity and based on the content

Liu, Yi 18 March 2015 (has links)
This doctoral research project aims to design an improved version of the still-image codec LAR (Locally Adaptive Resolution), in terms of both compression performance and complexity. Several image compression standards have been proposed and used in multimedia applications, but research continues toward higher coding quality and/or lower processing complexity. JPEG was standardized twenty years ago and is still the most widely used compression format today. Although JPEG 2000 offers better coding efficiency, its use remains limited because its computational cost is higher than that of JPEG. In 2008, the JPEG Committee issued a call for proposals named AIC (Advanced Image Coding), which aimed to standardize technologies going beyond the existing JPEG standards. The LAR codec was proposed as one response to this call. The LAR framework seeks to combine compression efficiency with a content-based representation, and it supports both lossy and lossless coding under the same structure. However, at the beginning of this study, the LAR codec did not implement rate-distortion optimization (RDO), which was detrimental to LAR during the AIC evaluation step. This work therefore first characterizes the impact of the main codec parameters on compression efficiency, and then constructs RDO models that configure the LAR parameters to achieve optimal or near-optimal coding efficiency. Based on these RDO models, a "quality constraint" method is introduced to encode an image at a given target MSE/PSNR. The accuracy of the proposed technique, estimated by the ratio between the error variance and the setpoint, is about 10%. In addition, subjective quality measurement is taken into consideration, and the RDO models are applied locally in the image rather than globally. The perceptual quality is visibly improved, with a significant gain measured by the objective quality metric SSIM (structural similarity). Aiming at a low-complexity and efficient image codec, a new coding scheme is also proposed for the lossless mode of the LAR framework. In this context, all coding steps are modified to obtain a better final compression ratio, and a new classification module is introduced to decrease the entropy of the prediction errors. Experiments show that this lossless codec achieves a compression ratio equivalent to that of JPEG 2000 while saving, on average, 76% of the encoding and decoding time.
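The "quality constraint" idea in the abstract above (encoding at a given target MSE/PSNR) can be illustrated with a minimal sketch. It is not the LAR codec or its RDO models: a plain uniform quantizer stands in for the encoder, and the names `psnr_from_mse`, `mse_from_psnr`, `quantize`, and `step_for_target_mse` are hypothetical. The sketch assumes 8-bit samples (peak value 255) and searches by bisection for a quantizer step whose measured MSE does not exceed the target.

```python
import math

def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit peak assumed)."""
    return 10.0 * math.log10(peak * peak / mse)

def mse_from_psnr(psnr_db, peak=255.0):
    """Inverse of psnr_from_mse: the MSE that corresponds to a PSNR target."""
    return peak * peak / (10.0 ** (psnr_db / 10.0))

def quantize(samples, step):
    """Uniform scalar quantizer, used here only as a stand-in for a real codec."""
    return [step * round(x / step) for x in samples]

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def step_for_target_mse(samples, target_mse, lo=1e-3, hi=256.0, iters=40):
    """Bisection on the quantizer step.

    Invariant: the step `lo` always yields an MSE at or below the target,
    so the returned step never exceeds the requested distortion
    (as long as the initial `lo` already meets the target).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_squared_error(samples, quantize(samples, mid)) > target_mse:
            hi = mid
        else:
            lo = mid
    return lo
```

A caller would convert a PSNR target to an MSE target with `mse_from_psnr` and pass it to `step_for_target_mse`. A real codec would replace the bisection with a model that predicts distortion from its parameters, which is what the abstract describes.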
32

Rate-distortion based video coding with adaptive mean-removed vector quantization

Hamzaoui, Raouf, Saupe, Dietmar, Wagner, Marcel 01 February 2019 (has links)
In this paper we improve the rate-distortion performance of a previously proposed video coder based on frame replenishment and adaptive mean-removed vector quantization. This is realized by determining for each block of a given frame the optimal encoding mode in the rate-distortion sense. The algorithm is a new contribution to very low bit rate video coding with adaptive vector quantization suitable for videophone applications. Experimental results comparing the two coders for several test sequences at different bit rates are provided.
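As a rough illustration of choosing an encoding mode per block "in the rate-distortion sense", the sketch below compares two modes by their Lagrangian cost J = D + λR. It is an assumption-laden toy, not the coder from the paper: the two modes (copy the co-located block of the previous frame, or code the block with mean-removed vector quantization), the bit counts, and the names `mrvq_encode` and `choose_mode` are all hypothetical.

```python
import math

def sqdist(a, b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mrvq_encode(block, codebook):
    """Mean-removed VQ: code the block mean separately, then pick the
    codeword closest to the mean-removed residual."""
    mean = round(sum(block) / len(block))  # mean sent as an integer (assumption)
    residual = [x - mean for x in block]
    index = min(range(len(codebook)), key=lambda k: sqdist(residual, codebook[k]))
    recon = [mean + c for c in codebook[index]]
    return mean, index, recon

def choose_mode(block, prev_block, codebook, lam, mean_bits=8):
    """Return ('skip' or 'vq', cost) minimizing J = D + lam * R.

    Rates are hypothetical: 1 flag bit for 'skip'; 1 flag bit plus
    mean_bits plus a fixed-length codebook index for 'vq'.
    """
    index_bits = math.ceil(math.log2(len(codebook)))
    j_skip = sqdist(block, prev_block) + lam * 1
    _, _, recon = mrvq_encode(block, codebook)
    j_vq = sqdist(block, recon) + lam * (1 + mean_bits + index_bits)
    return ("skip", j_skip) if j_skip <= j_vq else ("vq", j_vq)
```

With a larger λ the decision leans toward the cheap "skip" mode, and with a smaller λ it leans toward the more accurate but costlier VQ mode.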
33

Matching Pursuit and Residual Vector Quantization: Applications in Image Coding

Ebrahimi-Moghadam, Abbas 09 1900 (has links)
In this thesis, novel progressive scalable region-of-interest (ROI) image coding schemes with a rate-distortion-complexity trade-off, based on residual vector quantization (RVQ) and matching pursuit (MP), are developed. RVQ and MP provide the encoder with multi-resolution signal analysis tools, which are useful for the rate-distortion trade-off and can be used to render a selected region of an image with a specific quality. An image quality refinement strategy is presented in this thesis, which improves the quality of the ROI in a progressive manner. The reconstructed image can mimic foveated images in the perceptual image coding context. The systems are unbalanced in the sense that the decoders have lower computational requirements than the encoders. The methods also provide an interactive way of refining information for the image regions to which the receiver gives higher priority. The receiver is free to select multiple regions of interest, and to change his/her mind and choose alternative regions in the middle of signal transmission. The proposed RVQ- and MP-based image coding methods raise several issues and reveal some capabilities in image coding and communication. In RVQ-based image coding, the effects of dictionary size, number of RVQ stages, and image block size on the reconstructed image quality, the resulting bit rate, and the computational complexity are investigated. The progressive nature of the resulting bit-stream makes RVQ- and MP-based image coding methods suitable platforms for unequal error protection. Researchers have paid considerable attention to joint source-channel (JSC) coding in recent years. In this popular framework, JSC decoding that exploits the residual redundancy of a source coder's output bit-stream is an interesting bandwidth-efficient approach for signal reconstruction. In this thesis, we also address the JSC decoding and error concealment problem for matching-pursuit-coded images transmitted over a noisy memoryless channel. 
The problem is solved on a minimum mean squared error (MMSE) estimation foundation, and a suboptimal solution is devised that yields high-quality error concealment with different levels of computational complexity. The proposed decoding and error concealment solution takes advantage of the residual redundancy that exists in neighboring image blocks as well as in neighboring MP analysis stages to improve the quality of the images with no increase in the required bandwidth. The effects of different parameters, such as MP dictionary size and number of analysis stages, on the performance of the proposed soft decoding method have also been investigated. / Thesis / Doctor of Philosophy (PhD)
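The progressive, stage-by-stage refinement attributed to matching pursuit above can be seen in a minimal generic sketch. It is the textbook greedy algorithm, not the thesis's coder: it assumes unit-norm dictionary atoms, does no quantization or entropy coding, and the function name `matching_pursuit` is chosen here for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(signal, dictionary, n_stages):
    """Greedy matching pursuit with unit-norm atoms (assumption).

    At each stage, pick the atom most correlated with the current residual,
    record (atom index, coefficient), and subtract its contribution.
    Returns the list of picks and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_stages):
        k = max(range(len(dictionary)),
                key=lambda i: abs(dot(residual, dictionary[i])))
        coeff = dot(residual, dictionary[k])
        residual = [r - coeff * a for r, a in zip(residual, dictionary[k])]
        picks.append((k, coeff))
    return picks, residual

def reconstruct(picks, dictionary, length):
    """Rebuild the approximation from the first picks (progressive decoding)."""
    out = [0.0] * length
    for k, coeff in picks:
        out = [o + coeff * a for o, a in zip(out, dictionary[k])]
    return out
```

Decoding only a prefix of `picks` gives a coarser approximation, and each further stage reduces the residual energy. That property is what makes the bit-stream progressive and lets a decoder spend more stages on a region of interest.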
34

Optimal source coding with signal transfer function constraints

Derpich, Milan January 2009 (has links)
Research Doctorate - Doctor of Philosophy (PhD) / This thesis presents results on optimal coding and decoding of discrete-time stochastic signals, in the sense of minimizing a distortion metric subject to a constraint on the bit-rate and on the signal transfer function from source to reconstruction. The first (preliminary) contribution of this thesis is the introduction of a new distortion metric that extends the mean squared error (MSE) criterion. We give this extension the name Weighted-Correlation MSE (WCMSE), and use it as the distortion metric throughout the thesis. The WCMSE is a weighted sum of two components of the MSE: the variance of the error component uncorrelated to the source, on the one hand, and the remainder of the MSE, on the other. The WCMSE can take account of signal transfer function constraints by assigning a larger weight to deviations from a target signal transfer function than to source-uncorrelated distortion. Within this framework, the second contribution is the solution of a family of feedback quantizer design problems for wide-sense stationary sources using an additive noise model for quantization errors. These associated problems consist of finding the frequency response of the filters deployed around a scalar quantizer that minimize the WCMSE for a fixed quantizer signal-to-(granular)-noise ratio (SNR). This general structure, which incorporates pre-, post-, and feedback filters, includes as special cases well-known source coding schemes such as pulse-code modulation (PCM), differential pulse-code modulation (DPCM), sigma-delta converters, and noise-shaping coders. The optimal frequency response of each of the filters in this architecture is found for each possible subset of the remaining filters being given and fixed. These results are then applied to oversampled feedback quantization. 
In particular, it is shown that, within the linear model used, and for a fixed quantizer SNR, the MSE decays exponentially with the oversampling ratio, provided optimal filters are used at each oversampling ratio. If a subtractively dithered quantizer is utilized, then the noise model is exact, and the SNR constraint can be directly related to the bit-rate if entropy coding is used, regardless of the number of quantization levels. On the other hand, in the case of fixed-rate quantization, the SNR is related to the number of quantization levels, and hence to the bit-rate, when overload errors are negligible. It is shown that, for sources with unbounded support, the latter condition is violated for sufficiently large oversampling ratios. By deriving an upper bound on the contribution of overload errors to the total WCMSE, a lower bound for the decay rate of the WCMSE as a function of the oversampling ratio is found for fixed-rate quantization of sources with finite or infinite support. The third main contribution of the thesis is the introduction of the rate-distortion function (RDF) when WCMSE is the distortion metric, denoted by WCMSE-RDF. We provide a complete characterization for Gaussian sources. The resulting WCMSE-RDF yields, as special cases, Shannon's RDF, as well as the recently introduced RDF for source-uncorrelated distortions (RDF-SUD). For cases where only source-uncorrelated distortion is allowed, the RDF-SUD is extended to include the possibility of linear time-invariant feedback between reconstructed signal and coder input. It is also shown that feedback quantization schemes can achieve a bit-rate only 0.254 bits/sample above this RDF by using the same filters that minimize the reconstruction MSE for a quantizer-SNR constraint. The fourth main contribution of this thesis is to provide a set of conditions under which knowledge of a realization of the RDF can be used directly to solve encoder-decoder design optimization problems. 
This result has direct implications in the design of subband coders with feedback, as well as in the design of encoder-decoder pairs for applications such as networked control. As the fifth main contribution of this thesis, the RDF-SUD is utilized to show that, for Gaussian stationary sources with memory and MSE distortion criterion, an upper bound on the information-theoretic causal RDF can be obtained by means of an iterative numerical procedure, at all rates. This bound is tighter than 0.5 bits/sample. Moreover, if there exists a realization of the causal RDF in which the reconstruction error is jointly stationary with the source, then the bound obtained coincides with the causal RDF. The iterative procedure proposed here to obtain $R_c^{it}(D)$ also yields a characterization of the filters in a scalar feedback quantizer having an operational rate that exceeds the bound by less than 0.254 bits/sample. This constitutes an upper bound on the optimal performance theoretically attainable by any causal source coder for stationary Gaussian sources under the MSE distortion criterion.
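The abstract does not say where the figure of 0.254 bits/sample comes from. One plausible reading, stated here as an assumption, is the well-known space-filling loss of an entropy-coded uniform scalar quantizer relative to an ideal Gaussian test channel at the same noise variance, namely ½·log₂(2πe/12). The short check below only evaluates that constant; the function names are illustrative.

```python
import math

def space_filling_loss_bits():
    """Rate penalty (bits/sample) of uniformly distributed quantization noise
    relative to Gaussian noise of equal variance: 0.5 * log2(2*pi*e / 12)."""
    return 0.5 * math.log2(2.0 * math.pi * math.e / 12.0)

def space_filling_loss_db():
    """The same gap expressed as a distortion ratio in dB: 10*log10(pi*e/6)."""
    return 10.0 * math.log10(math.pi * math.e / 6.0)

if __name__ == "__main__":
    print(f"{space_filling_loss_bits():.4f} bits/sample")
    print(f"{space_filling_loss_db():.3f} dB")
```

The value rounds to about 0.2546 bits/sample (roughly 1.53 dB), which is consistent with the 0.254 quoted in the abstract.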
35

Transform Coefficient Thresholding and Lagrangian Optimization for H.264 Video Coding / Transformkoefficient-tröskling och Lagrangeoptimering för H.264 Videokodning

Carlsson, Pontus January 2004 (has links)
H.264, also known as MPEG-4 Part 10: Advanced Video Coding, is the latest MPEG standard for video coding. It provides approximately 50% bit rate savings for equivalent perceptual quality compared to any previous standard. In the same fashion as previous MPEG standards, only the bitstream syntax and the decoder are specified. Hence, coding performance is determined not only by the standard itself but also by the implementation of the encoder. In this report we propose two methods for improving the coding performance while remaining fully compliant with the standard. After transformation and quantization, the transform coefficients are usually entropy coded and embedded in the bitstream. However, it might be beneficial to discard some of them if the number of saved bits is sufficiently large. This is usually referred to as coefficient thresholding and is investigated in the scope of H.264 in this report. Lagrangian optimization for video compression has proven to yield substantial improvements in perceived quality, and the H.264 Reference Software has been designed around this concept. When performing Lagrangian optimization, lambda is a crucial parameter that determines the tradeoff between rate and distortion. We propose a new method to select lambda and the quantization parameter for non-reference frames in H.264. The two methods are shown to achieve significant improvements. When combined, they reduce the bitrate by around 12% while preserving the video quality in terms of average PSNR. To aid development of H.264, a software tool has been created to visualize the coding process and present statistics. This tool is capable of displaying information such as bit distribution, motion vectors, predicted pictures, and motion-compensated block sizes.
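Coefficient thresholding under a Lagrangian cost can be sketched as follows. For each coefficient, the encoder compares the cost J = D + λR of keeping the quantized level with the cost of forcing it to zero. This is a simplified illustration, not the method of the report: the rate model `bits_for_level` is hypothetical and is not the CAVLC or CABAC cost of H.264, and each coefficient is treated independently.

```python
def bits_for_level(level):
    """Hypothetical rate model (NOT the H.264 entropy coder):
    1 bit for a zero, and a cost growing with magnitude otherwise."""
    return 1 if level == 0 else 2 * abs(level).bit_length() + 1

def threshold_coefficients(coeffs, step, lam):
    """Quantize each coefficient with a uniform step, then zero it out
    when doing so lowers the Lagrangian cost J = D + lam * R."""
    levels = []
    for c in coeffs:
        level = int(round(c / step))
        j_keep = (c - level * step) ** 2 + lam * bits_for_level(level)
        j_zero = c ** 2 + lam * bits_for_level(0)
        levels.append(level if j_keep <= j_zero else 0)
    return levels
```

With `lam = 0` the function reduces to plain quantization. As `lam` grows, small isolated levels are dropped first, because the bits they cost outweigh the distortion they remove.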
37

Rate Distortion Theory for Causal Video Coding: Characterization, Computation Algorithm, Comparison, and Code Design

Zheng, Lin January 2012 (has links)
Due to the sheer volume of data involved, video coding is an important application of lossy source coding, and has received wide industrial interest and support, as evidenced by the development and success of a series of video coding standards. All MPEG-series and H-series video coding standards proposed so far are based upon a paradigm called predictive video coding, in which the video source frames $X_i$, $i = 1, 2, \ldots, N$, are encoded frame by frame, and the encoder and decoder for each frame $X_i$ enlist help only from the previous encoded frames $S_j$, $j = 1, 2, \ldots, i-1$. In this thesis, we look beyond all existing and proposed video coding standards and introduce a new coding paradigm called causal video coding, in which the encoder for each frame $X_i$ can use all previous original frames $X_j$, $j = 1, 2, \ldots, i-1$, and all previous encoded frames $S_j$, while the corresponding decoder can use only the previous encoded frames. We consider all studies, comparisons, and designs on causal video coding from an information-theoretic point of view. Let $R_c^*(D_1, \ldots, D_N)$ (respectively, $R_p^*(D_1, \ldots, D_N)$) denote the minimum total rate required to achieve a given distortion level $D_1, \ldots, D_N > 0$ in causal video coding (respectively, predictive video coding). A novel computation approach is proposed to analytically characterize, numerically compute, and compare the minimum total rate of causal video coding $R_c^*(D_1, \ldots, D_N)$ required to achieve a given distortion (quality) level $D_1, \ldots, D_N > 0$. Specifically, we first show that for jointly stationary and ergodic sources $X_1, \ldots, X_N$, $R_c^*(D_1, \ldots, D_N)$ is equal to the infimum of the $n$-th order total rate distortion function $R_{c,n}(D_1, \ldots, D_N)$ over all $n$, where $R_{c,n}(D_1, \ldots, D_N)$ itself is given by the minimum of an information quantity over a set of auxiliary random variables. 
We then present an iterative algorithm for computing $R_{c,n}(D_1, \ldots, D_N)$ and demonstrate the convergence of the algorithm to the global minimum. The global convergence of the algorithm further enables us not only to establish a single-letter characterization of $R_c^*(D_1, \ldots, D_N)$ in a novel way when the $N$ sources are an independent and identically distributed (IID) vector source, but also to demonstrate a somewhat surprising result (dubbed the more and less coding theorem): under some conditions on source frames and distortion, the more frames need to be encoded and transmitted, the less data has to be actually sent after encoding. With the help of the algorithm, it is also shown by example that $R_c^*(D_1, \ldots, D_N)$ is in general much smaller than the total rate offered by the traditional greedy coding method, by which each frame is encoded in a locally optimal manner based on all information available to the encoder of the frame. As a by-product, an extended Markov lemma is established for correlated ergodic sources. From an information-theoretic point of view, it is interesting to compare causal video coding and predictive video coding, upon which all existing video coding standards proposed so far are based. In this thesis, by fixing $N = 3$, we first derive a single-letter characterization of $R_p^*(D_1, D_2, D_3)$ for an IID vector source $(X_1, X_2, X_3)$ where $X_1$ and $X_2$ are independent, and then demonstrate the existence of such $X_1, X_2, X_3$ for which $R_p^*(D_1, D_2, D_3) > R_c^*(D_1, D_2, D_3)$ under some conditions on source frames and distortion. This result makes causal video coding an attractive framework for future video coding systems and standards. The design of causal video coding is also considered in the thesis from an information-theoretic perspective by modeling each frame as a stationary information source. 
We first put forth a concept called causal scalar quantization, and then propose an algorithm for designing optimum fixed-rate causal scalar quantizers for causal video coding to minimize the total distortion among all sources. Simulation results show that, in comparison with fixed-rate predictive scalar quantization, fixed-rate causal scalar quantization offers a quality improvement (distortion reduction) of up to 16%.
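The thesis's iterative algorithm for the causal rate-distortion function is not reproduced in the abstract. As a loosely related illustration only, the sketch below implements the classical Blahut-Arimoto iteration for the ordinary single-source rate-distortion function, which also alternates between two minimizations. The function name and the slope parameter `beta` are choices made here, not taken from the thesis.

```python
import math

def blahut_arimoto(p_x, dist, beta, iters=200):
    """Classical Blahut-Arimoto iteration for one point on R(D).

    p_x  : source distribution over symbols x
    dist : dist[x][y] is the distortion of reproducing x as y
    beta : positive slope parameter (larger beta -> lower distortion)
    Returns (rate in bits, expected distortion).
    """
    n, m = len(p_x), len(dist[0])
    q = [1.0 / m] * m  # output marginal, initialized uniform
    cond = []
    for _ in range(iters):
        # Step 1: best test channel for the current output marginal.
        cond = []
        for x in range(n):
            w = [q[y] * math.exp(-beta * dist[x][y]) for y in range(m)]
            z = sum(w)
            cond.append([v / z for v in w])
        # Step 2: output marginal induced by that test channel.
        q = [sum(p_x[x] * cond[x][y] for x in range(n)) for y in range(m)]
    distortion = sum(p_x[x] * cond[x][y] * dist[x][y]
                     for x in range(n) for y in range(m))
    rate = sum(p_x[x] * cond[x][y] * math.log2(cond[x][y] / q[y])
               for x in range(n) for y in range(m) if cond[x][y] > 0)
    return rate, distortion
```

For a uniform binary source with Hamming distortion, the returned pair lies on the known curve R(D) = 1 − H_b(D), which gives a simple sanity check of the iteration.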
40

On Non-Convex Splitting Methods For Markovian Information Theoretic Representation Learning

Teng Hui Huang (12463926) 27 April 2022 (has links)
In this work, we study a class of Markovian information-theoretic optimization problems motivated by the recent interest in using mutual information as a performance metric, which has shown evident success in representation learning, feature extraction, and clustering problems. In particular, we focus on the information bottleneck (IB) and privacy funnel (PF) methods and their recent multi-view, multi-source generalizations, which have gained attention because performance improves significantly with multi-view, multi-source data. Nonetheless, the generalized problems challenge existing IB and PF solvers in terms of complexity and their ability to handle large-scale data. To address this, we study both the IB and PF under a unified framework and propose solving it through splitting methods, which include renowned algorithms such as the alternating direction method of multipliers (ADMM), Peaceman-Rachford splitting (PRS), and Douglas-Rachford splitting (DRS) as special cases. Our convergence analysis and the locally linear rate of convergence results give rise to new splitting-method-based IB and PF solvers that can be easily generalized to multi-view IB and multi-source PF. We implement the proposed methods with gradient descent and empirically evaluate the new solvers on both synthetic and real-world datasets. Our numerical results demonstrate improved performance over the state-of-the-art approach with a significant reduction in complexity. Furthermore, we consider the practical scenario in which there is a distribution mismatch between the training and testing data-generating processes under a known bounded divergence constraint. In analyzing the generalization error, we develop new techniques inspired by the input-output mutual information approach and tighten the existing generalization error bounds.
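For readers unfamiliar with the information bottleneck, the quantity such solvers minimize can be evaluated directly for small discrete problems. The sketch below assumes the common formulation: a Markov chain Y → X → Z and the Lagrangian I(Z;X) − β·I(Z;Y). The sign and weighting convention used in the thesis may differ. It computes the objective only and implements none of the splitting methods; the function names are illustrative.

```python
import math

def mutual_information(joint):
    """Mutual information in bits of a joint distribution given as a 2-D list."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (p_a[i] * p_b[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

def ib_lagrangian(p_xy, encoder, beta):
    """Evaluate I(Z;X) - beta * I(Z;Y) for a stochastic encoder p(z|x).

    p_xy[x][y]    : joint distribution of the observation X and target Y
    encoder[x][z] : p(z|x), each row summing to 1
    Assumes the Markov chain Y -> X -> Z.
    """
    n_x, n_y, n_z = len(p_xy), len(p_xy[0]), len(encoder[0])
    p_x = [sum(row) for row in p_xy]
    p_xz = [[p_x[x] * encoder[x][z] for z in range(n_z)] for x in range(n_x)]
    p_zy = [[sum(encoder[x][z] * p_xy[x][y] for x in range(n_x))
             for y in range(n_y)] for z in range(n_z)]
    i_zx = mutual_information(p_xz)
    i_zy = mutual_information(p_zy)
    return i_zx - beta * i_zy, i_zx, i_zy
```

An identity encoder keeps all of X, so I(Z;X) = H(X) and I(Z;Y) = I(X;Y). A constant encoder drives both terms to zero. IB solvers search between these two extremes.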
