  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Improved Rate Control for Low-Delay Communications in H.264/AVC Video Coding Standard

Wu, Sheng-Wang 17 August 2004
In real-time, two-way video communication, minimizing the end-to-end delay of transmitted video data is very important. Because the delay caused by bits accumulating in the encoder buffer must be very small, an improved rate control is needed that encodes the video with high quality while keeping buffer fullness low. One way to reduce buffer fullness is to skip frames during encoding, but frame skipping produces undesirable motion discontinuity in the encoded video sequence. In this thesis, we study the impact of the low-delay constraint on H.264 rate control and propose improvements. A drawback of H.264 rate control is that it does not handle the frame-skipping mechanism well. To address this, we control the quantization parameter of each I-frame to avoid buffer overflow and frame skipping. Since encoding the I-frame with different quantization parameters yields different rate and distortion for a group of pictures (GOP), we use Lagrangian optimization to find the tradeoff between rate and distortion for a GOP. Using estimation models of the rate and distortion of a GOP, we calculate the Lagrangian cost for each possible I-frame quantization parameter and choose the quantization parameter with the minimum Lagrangian cost. Simulation results show that the proposed rate control encodes the video sequence with fewer skipped frames and higher PSNR than H.264 rate control under the low-delay constraint.
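The selection step described in this abstract (evaluate a Lagrangian cost J = D + λR for every candidate I-frame quantization parameter and keep the minimum) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's method: the rate and distortion models `toy_rate` and `toy_distortion`, the optional `max_bits` budget, and all constants are hypothetical stand-ins for the GOP estimation models the abstract mentions. The only facts taken from the standard are the QP range 0 to 51 and the quantizer step size doubling every 6 QP.

```python
def select_iframe_qp(rate_model, distortion_model, lam,
                     qp_range=range(0, 52), max_bits=None):
    """Return (qp, cost) minimizing J = D + lam * R over qp_range.

    rate_model / distortion_model are caller-supplied estimates for one GOP
    (hypothetical here). max_bits is an optional, hypothetical bit budget;
    candidates whose estimated rate exceeds it are skipped.
    """
    best_qp, best_cost = None, float("inf")
    for qp in qp_range:
        rate = rate_model(qp)
        if max_bits is not None and rate > max_bits:
            continue  # estimated rate exceeds the budget
        cost = distortion_model(qp) + lam * rate
        if cost < best_cost:
            best_qp, best_cost = qp, cost
    return best_qp, best_cost


# Toy models (assumptions): the step size doubles every 6 QP, so the rate
# is taken as inversely proportional to the step size and the distortion
# as proportional to the step size squared.
def toy_rate(qp):
    return 100000.0 * 2.0 ** (-qp / 6.0)


def toy_distortion(qp):
    return 0.1 * 2.0 ** (qp / 3.0)


if __name__ == "__main__":
    for lam in (0.01, 1.0):
        qp, cost = select_iframe_qp(toy_rate, toy_distortion, lam)
        print(f"lambda={lam}: QP={qp}, cost={cost:.1f}")
```

A larger λ weights rate more heavily, so the search settles on a coarser quantizer (higher QP).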
2

Transform Coefficient Thresholding and Lagrangian Optimization for H.264 Video Coding / Transformkoefficient-tröskling och Lagrangeoptimering för H.264 Videokodning

Carlsson, Pontus January 2004
H.264, also known as MPEG-4 Part 10: Advanced Video Coding, is the latest MPEG standard for video coding. It provides approximately 50% bit rate savings for equivalent perceptual quality compared to any previous standard. As with previous MPEG standards, only the bitstream syntax and the decoder are specified. Hence, coding performance is determined not only by the standard itself but also by the implementation of the encoder. In this report we propose two methods for improving coding performance while remaining fully compliant with the standard.
After transformation and quantization, the transform coefficients are usually entropy coded and embedded in the bitstream. However, it can be beneficial to discard some of them if the number of bits saved is sufficiently large. This is usually referred to as coefficient thresholding and is investigated in the scope of H.264 in this report.
Lagrangian optimization for video compression has been shown to yield substantial improvements in perceived quality, and the H.264 Reference Software has been designed around this concept. When performing Lagrangian optimization, lambda is a crucial parameter that determines the tradeoff between rate and distortion. We propose a new method to select lambda and the quantization parameter for non-reference frames in H.264.
The two methods are shown to achieve significant improvements. When combined, they reduce the bitrate by around 12% while preserving video quality in terms of average PSNR.
To aid development of H.264, a software tool has been created to visualize the coding process and present statistics. This tool can display information such as bit distribution, motion vectors, predicted pictures and motion-compensated block sizes.
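The coefficient thresholding idea in this abstract can be illustrated with a per-coefficient rate-distortion test: a quantized coefficient is dropped when the distortion added by zeroing it is smaller than lambda times the bits saved. The sketch below is a simplified assumption, not the method proposed in the report. The uniform quantizer, the `level_bits` cost (a rough Exp-Golomb-style length plus a sign bit) and the independent treatment of each coefficient are all hypothetical; real H.264 entropy coding (CAVLC or CABAC) is context dependent, and zeroing a coefficient also changes the cost of its neighbours.

```python
def level_bits(level):
    """Hypothetical bit cost of one quantized level (0 costs nothing)."""
    return 2 * abs(level).bit_length()


def threshold_coefficients(coeffs, step, lam):
    """Quantize coeffs with a uniform step, then zero the levels whose
    removal lowers the Lagrangian cost J = D + lam * R."""
    levels = []
    for c in coeffs:
        level = int(round(c / step))          # uniform quantizer (assumption)
        if level != 0:
            kept_error = (c - level * step) ** 2
            dropped_error = c ** 2            # reconstruction becomes 0
            added_distortion = dropped_error - kept_error
            if added_distortion < lam * level_bits(level):
                level = 0                     # cheaper to discard
        levels.append(level)
    return levels


if __name__ == "__main__":
    block = [40.0, 6.0, -5.5, 0.4]
    print(threshold_coefficients(block, step=10.0, lam=0.0))
    print(threshold_coefficients(block, step=10.0, lam=15.0))
```

With `lam=0.0` no coefficient is discarded; as lambda grows, small isolated levels are removed first because they save bits at little cost in distortion.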
4

Constrained optimization for machine learning : algorithms and applications

Gallego-Posada, Jose 06 1900
The widespread deployment of increasingly capable machine learning models has resulted in mounting pressure to enhance the robustness, safety and fairness of such models, often arising from regulatory and ethical considerations. Further, the implementation of artificial intelligence solutions in real-world applications is limited by their current inability to guarantee compliance with industry standards and governmental regulations. Current standard pipelines for developing machine learning models embrace a “build now, fix later” mentality, retrofitting safety measures as afterthoughts. This continuous accumulation of technical debt hinders the progress of the field in the long term. Constrained optimization offers a conceptual framework accompanied by algorithmic tools for reliably enforcing complex properties on machine learning models. This thesis calls for a paradigm shift in which constraints constitute an integral part of the model development process, aiming to produce machine learning models that are inherently secure by design.
This thesis provides a holistic perspective on the use of constrained optimization in deep learning tasks. We explore i) the need for constrained formulations, ii) the advantages afforded by the constrained optimization standpoint and iii) the algorithmic challenges arising in the solution of such problems. We present several case studies illustrating the application of constrained optimization techniques to popular machine learning problems.
In Contribution I, we advocate for the use of constrained formulations in machine learning. We argue that it is preferable to handle interpretable regularizers via explicit constraints rather than additive penalties, especially when dealing with non-convex models. We consider the training of sparse models with L0-regularization and demonstrate that i) it is possible to find feasible, well-performing solutions to large-scale problems with non-convex constraints; and that ii) the constrained approach can avoid the costly trial-and-error tuning inherent to penalty-based techniques.
Contribution II expands on the previous contribution by imposing explicit constraints on the compression rate achieved by Implicit Neural Representations, a class of models that aim to efficiently store data (such as an image) within a neural network’s parameters. In this work we concentrate on the interplay between the model size, its representational capacity and the required training time. Rather than restricting the model size to a fixed budget (that complies with the required compression rate), we train an overparametrized, sparse model with compression-rate constraints. This allows us to exploit the power of larger models to achieve better reconstructions, faster, without having to commit to their undesirable compression rate.
Contribution III showcases the advantages of constrained formulations in a realistic model sparsity application with non-differentiable fairness-related constraints. The performance of pruned neural networks degrades unevenly across data sub-groups, thus requiring the use of mitigation techniques. We propose a formulation that imposes constraints on changes in the model accuracy in each sub-group, in contrast to prior work, which considers constraints based on surrogate metrics (such as the sub-group loss). We address the non-differentiability and stochasticity challenges posed by our proposed constraints, and demonstrate that our method scales reliably to optimization problems involving large models and hundreds of sub-groups.
In Contribution IV, we focus on the dynamics of gradient-based Lagrangian optimization, a popular technique for solving the non-convex constrained problems arising in deep learning. The adversarial nature of the min-max Lagrangian game makes it prone to oscillatory or unstable behaviors. Based on ideas from the PID control literature, we propose an algorithm for updating the Lagrange multipliers which yields robust, stable training dynamics. This contribution lays the groundwork for practitioners to adopt and implement constrained approaches confidently in diverse real-world applications.
In Contribution V, we provide an overview of Cooper: a library for Lagrangian-based constrained optimization in PyTorch. This open-source library implements all the core contributions presented in the preceding chapters and integrates seamlessly with the PyTorch framework. We developed Cooper with the goal of making constrained optimization techniques readily available to machine learning researchers and practitioners.
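To make the idea of controller-style multiplier updates concrete, here is a minimal sketch on a one-dimensional toy problem: minimize f(x) = (x - 2)^2 subject to g(x) = x - 1 <= 0, whose solution is x = 1 with multiplier 2. It uses a generic proportional-integral rule and is neither the algorithm proposed in the thesis nor the Cooper API. The function name, the gains `kp` and `ki`, the step size and the toy problem itself are all assumptions made for illustration.

```python
def solve_toy_problem(kp=1.0, ki=0.1, lr=0.1, steps=3000):
    """Gradient descent on x with a PI-style Lagrange multiplier update.

    Problem (assumed for illustration):
        minimize (x - 2)^2  subject to  x - 1 <= 0
    """
    x, integral, lam = 0.0, 0.0, 0.0
    for _ in range(steps):
        violation = x - 1.0                 # g(x); feasible when <= 0
        integral += violation               # integral term: accumulated violation
        # Proportional + integral terms, projected onto lam >= 0.
        lam = max(0.0, kp * violation + ki * integral)
        grad_x = 2.0 * (x - 2.0) + lam      # d/dx [f(x) + lam * g(x)]
        x -= lr * grad_x                    # primal descent step
    return x, lam


if __name__ == "__main__":
    x, lam = solve_toy_problem()
    print(f"x = {x:.4f}, multiplier = {lam:.4f}")
```

Setting `kp=0.0` recovers plain gradient ascent on the multiplier with step size `ki`; the proportional term reacts to the current violation and, on this toy problem, damps the approach to the constraint boundary.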
