Global ETD Search

111	Improving sampling, optimization and feature extraction in Boltzmann machines Desjardins, Guillaume 12 1900 (has links) L’apprentissage supervisé de réseaux hiérarchiques à grande échelle connaît présentement un succès fulgurant. Malgré cette effervescence, l’apprentissage non-supervisé représente toujours, selon plusieurs chercheurs, un élément clé de l’Intelligence Artificielle, où les agents doivent apprendre à partir d’un nombre potentiellement limité de données. Cette thèse s’inscrit dans cette pensée et aborde divers sujets de recherche liés au problème d’estimation de densité par l’entremise des machines de Boltzmann (BM), modèles graphiques probabilistes au coeur de l’apprentissage profond. Nos contributions touchent les domaines de l’échantillonnage, l’estimation de fonctions de partition, l’optimisation ainsi que l’apprentissage de représentations invariantes. Cette thèse débute par l’exposition d’un nouvel algorithme d'échantillonnage adaptatif, qui ajuste (de façon automatique) la température des chaînes de Markov sous simulation, afin de maintenir une vitesse de convergence élevée tout au long de l’apprentissage. Lorsqu’utilisé dans le contexte de l’apprentissage par maximum de vraisemblance stochastique (SML), notre algorithme engendre une robustesse accrue face à la sélection du taux d’apprentissage, ainsi qu’une meilleure vitesse de convergence. Nos résultats sont présentés dans le domaine des BMs, mais la méthode est générale et applicable à l’apprentissage de tout modèle probabiliste exploitant l’échantillonnage par chaînes de Markov. Tandis que le gradient du maximum de vraisemblance peut être approximé par échantillonnage, l’évaluation de la log-vraisemblance nécessite un estimé de la fonction de partition. 
Contrairement aux approches traditionnelles qui considèrent un modèle donné comme une boîte noire, nous proposons plutôt d’exploiter la dynamique de l’apprentissage en estimant les changements successifs de log-partition encourus à chaque mise à jour des paramètres. Le problème d’estimation est reformulé comme un problème d’inférence similaire au filtre de Kalman, mais sur un graphe bi-dimensionnel, où les dimensions correspondent aux axes du temps et au paramètre de température. Sur le thème de l’optimisation, nous présentons également un algorithme permettant d’appliquer, de manière efficace, le gradient naturel à des machines de Boltzmann comportant des milliers d’unités. Jusqu’à présent, son adoption était limitée par son haut coût computationnel ainsi que sa demande en mémoire. Notre algorithme, Metric-Free Natural Gradient (MFNG), permet d’éviter le calcul explicite de la matrice d’information de Fisher (et son inverse) en exploitant un solveur linéaire combiné à un produit matrice-vecteur efficace. L’algorithme est prometteur: en termes de nombre d’évaluations de fonctions, MFNG converge plus rapidement que SML. Son implémentation demeure malheureusement inefficace en temps de calcul. Ces travaux explorent également les mécanismes sous-jacents à l’apprentissage de représentations invariantes. À cette fin, nous utilisons la famille de machines de Boltzmann restreintes “spike & slab” (ssRBM), que nous modifions afin de pouvoir modéliser des distributions binaires et parcimonieuses. Les variables latentes binaires de la ssRBM peuvent être rendues invariantes à un sous-espace vectoriel, en associant à chacune d’elles un vecteur de variables latentes continues (dénommées “slabs”). Ceci se traduit par une invariance accrue au niveau de la représentation et un meilleur taux de classification lorsque peu de données étiquetées sont disponibles. 
Nous terminons cette thèse sur un sujet ambitieux: l’apprentissage de représentations pouvant séparer les facteurs de variations présents dans le signal d’entrée. Nous proposons une solution à base de ssRBM bilinéaire (avec deux groupes de facteurs latents) et formulons le problème comme l’un de “pooling” dans des sous-espaces vectoriels complémentaires. / Despite the current widescale success of deep learning in training large scale hierarchical models through supervised learning, unsupervised learning promises to play a crucial role towards solving general Artificial Intelligence, where agents are expected to learn with little to no supervision. The work presented in this thesis tackles the problem of unsupervised feature learning and density estimation, using a model family at the heart of the deep learning phenomenon: the Boltzmann Machine (BM). We present contributions in the areas of sampling, partition function estimation, optimization and the more general topic of invariant feature learning. With regards to sampling, we present a novel adaptive parallel tempering method which dynamically adjusts the temperatures under simulation to maintain good mixing in the presence of complex multi-modal distributions. When used in the context of stochastic maximum likelihood (SML) training, the improved ergodicity of our sampler translates to increased robustness to learning rates and faster per epoch convergence. Though our application is limited to BM, our method is general and is applicable to sampling from arbitrary probabilistic models using Markov Chain Monte Carlo (MCMC) techniques. While SML gradients can be estimated via sampling, computing data likelihoods requires an estimate of the partition function. Contrary to previous approaches which consider the model as a black box, we provide an efficient algorithm which instead tracks the change in the log partition function incurred by successive parameter updates. 
Our algorithm frames this estimation problem as one of filtering performed over a 2D lattice, with one dimension representing time and the other temperature. On the topic of optimization, our thesis presents a novel algorithm for applying the natural gradient to large scale Boltzmann Machines. Up until now, its application had been constrained by the computational and memory requirements of computing the Fisher Information Matrix (FIM), which is square in the number of parameters. The Metric-Free Natural Gradient algorithm (MFNG) avoids computing the FIM altogether by combining a linear solver with an efficient matrix-vector operation. The method shows promise in that the resulting updates yield faster per-epoch convergence, despite being slower in terms of wall clock time. Finally, we explore how invariant features can be learnt through modifications to the BM energy function. We study the problem in the context of the spike & slab Restricted Boltzmann Machine (ssRBM), which we extend to handle both binary and sparse input distributions. By associating each spike with several slab variables, latent variables can be made invariant to a rich, high dimensional subspace resulting in increased invariance in the learnt representation. When using the expected model posterior as input to a classifier, increased invariance translates to improved classification accuracy in the low-label data regime. We conclude by showing a connection between invariance and the more powerful concept of disentangling factors of variation. While invariance can be achieved by pooling over subspaces, disentangling can be achieved by learning multiple complementary views of the same subspace. In particular, we show how this can be achieved using third-order BMs featuring multiplicative interactions between pairs of random variables. 
Réseaux de neurones Apprentissage profond Apprentissage non-supervisé Apprentissage de représentations Machines de Boltzmann Échantillonnage Gradient naturel Modèles bilinéaires Fonction de partition Neural networks Deep learning Unsupervised learning Feature learning Boltzmann machines Markov chain Monte Carlo Parallel tempering Natural gradient Bilinear models Partition function
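The MFNG idea summarized in the abstract above (a linear solver combined with a matrix-vector product, so the Fisher Information Matrix is never formed) can be illustrated with a minimal sketch. The abstract does not say which linear solver the thesis uses; conjugate gradient is an assumption here, and `fisher_vector_product` is a hypothetical toy stand-in for an efficient product with the FIM, not the author's implementation.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(matvec: Callable[[Vector], Vector], b: Vector,
                       tol: float = 1e-10, max_iter: int = 100) -> Vector:
    """Solve A x = b for a symmetric positive-definite A, using only products A v."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # current search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def fisher_vector_product(v: Vector) -> Vector:
    # Hypothetical toy stand-in for F v with F = [[4, 1], [1, 3]].
    # The solver only ever sees this function, never the matrix itself.
    return [4.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 3.0 * v[1]]


gradient = [1.0, 2.0]
# Natural-gradient direction: the solution of F x = gradient.
natural_gradient = conjugate_gradient(fisher_vector_product, gradient)
```

Because the solver touches the matrix only through `matvec`, memory use stays linear in the number of parameters, which is the constraint the abstract identifies for large models.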
112	Contributions à la génération aléatoire pour des classes d'automates finis / Contributions to uniform random generation for finite automata classes Joly, Jean-Luc 23 March 2016 (has links) Le concept d’automate, central en théorie des langages, est l’outil d’appréhension naturel et efficace de nombreux problèmes concrets. L’usage intensif des automates finis dans un cadre algorithmique s’illustre par de nombreux travaux de recherche. La correction et l’évaluation sont les deux questions fondamentales de l’algorithmique. Une méthode classique d’évaluation s’appuie sur la génération aléatoire contrôlée d’instances d’entrée. Les travaux décrits dans cette thèse s’inscrivent dans ce cadre et plus particulièrement dans le domaine de la génération aléatoire uniforme d’automates finis. L’exposé qui suit propose d’abord la construction d’un générateur aléatoire d’automates à pile déterministes, real time. Cette construction s’appuie sur la méthode symbolique. Des résultats théoriques et une étude expérimentale sont exposés. Un générateur aléatoire d’automates non-déterministes illustre ensuite la souplesse d’utilisation de la méthode de Monte-Carlo par Chaînes de Markov (MCMC) ainsi que la mise en œuvre de l’algorithme de Metropolis-Hastings pour l’échantillonnage à isomorphisme près. Un résultat sur le temps de mélange est donné dans le cadre général. L’échantillonnage par méthode MCMC pose le problème de l’évaluation du temps de mélange dans la chaîne. En s’inspirant de travaux antérieurs pour construire un générateur d’automates partiellement ordonnés, on montre comment différents outils statistiques permettent de s’attaquer à ce problème. 
/ The concept of the automaton, central to language theory, is a natural and efficient tool for tackling many practical problems. The intensive use of finite automata in an algorithmic framework is illustrated by numerous research works. Correctness and performance evaluation are the two fundamental issues of algorithmics. A classic method to evaluate an algorithm is based on the controlled random generation of inputs. The work described in this thesis lies within this context, and more specifically in the field of the uniform random generation of finite automata. The presentation first proposes the design of a random generator of deterministic, real-time pushdown automata. This design builds on the symbolic method. Theoretical results and an experimental study are given. A random generator of non-deterministic automata then illustrates the flexibility of Markov Chain Monte Carlo (MCMC) methods as well as the implementation of the Metropolis-Hastings algorithm to sample up to isomorphism. A result about the mixing time in the general framework is given. MCMC sampling methods raise the problem of evaluating the mixing time of the chain. Drawing on earlier work to design a random generator of partially ordered automata, this work shows how various statistical tools can form a basis to address this issue. Génération aléatoire uniforme Automates finis non déterministes Algorithme de Metropolis-Hastings Automates partiellement ordonnés Test d'autocorrélation Test de Gelman-Rubin Test du khi-deux Uniform random generation Non deterministic finite automata Markov chain Monte-Carlo methods Metropolis-Hastings algorithm Partially ordered automata Autocorrelation test Gelman-Rubin test Chi square test 629.8
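As a rough illustration of the Metropolis-Hastings sampling mentioned in the abstract above, here is a minimal sketch on a small finite state space. It does not sample automata or handle isomorphism classes as the thesis does; the cyclic state space, the `weights` table and the symmetric ±1 proposal are assumptions chosen only to show the accept/reject step.

```python
import random
from collections import Counter
from typing import Callable, List


def metropolis_hastings(weight: Callable[[int], float], n_states: int,
                        n_steps: int, seed: int = 0) -> List[int]:
    """Sample states 0..n_states-1 with probability proportional to weight(state)."""
    rng = random.Random(seed)
    state = 0
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal: step to a neighbour on a cycle.
        proposal = (state + rng.choice((-1, 1))) % n_states
        # With a symmetric proposal the acceptance ratio reduces to a weight ratio.
        if rng.random() < min(1.0, weight(proposal) / weight(state)):
            state = proposal
        samples.append(state)
    return samples


# Hypothetical unnormalized target weights; the sampler never needs their sum.
weights = [1.0, 2.0, 3.0, 4.0, 2.0]
samples = metropolis_hastings(lambda s: weights[s], len(weights), 200_000)
counts = Counter(samples)
frequencies = [counts[s] / len(samples) for s in range(len(weights))]
```

The empirical `frequencies` should approach `weights[s] / sum(weights)` as the chain runs longer. How long that takes is the mixing-time question the abstract raises, which in practice is probed with diagnostics such as the autocorrelation and Gelman-Rubin tests listed in the keywords.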
113	Bayesian fusion of multi-band images : A powerful tool for super-resolution / Fusion Bayésienne des multi-bandes Images : Un outil puissant pour la Super-résolution Wei, Qi 24 September 2015 (has links) L’imagerie hyperspectrale (HS) consiste à acquérir une même scène dans plusieurs centaines de bandes spectrales contiguës (dimensions d'un cube de données), ce qui a conduit à trois types d'applications pertinentes, telles que la détection de cibles, la classification et le démélange spectral. Cependant, tandis que les capteurs hyperspectraux fournissent une information spectrale abondante, leur résolution spatiale est généralement plus limitée. Ainsi, la fusion d’une image HS avec d'autres images à haute résolution de la même scène, telles que les images multispectrales (MS) ou panchromatiques (PAN) est un problème intéressant. Le problème de fusionner une image HS de haute résolution spectrale mais de résolution spatiale limitée avec une image auxiliaire de haute résolution spatiale mais de résolution spectrale plus limitée (parfois qualifiée de fusion multi-résolution) a été exploré depuis de nombreuses années. D'un point de vue applicatif, ce problème est également important et est motivé par certains projets, comme par exemple le projet japonais HISUI, qui vise à fusionner des images MS et HS recalées acquises pour la même scène avec les mêmes conditions. Les techniques de fusion bayésienne permettent une interprétation intuitive du processus de fusion via la définition de la loi a posteriori de l’image à estimer (qui est de hautes résolutions spatiale et spectrale). Puisque le problème de fusion est généralement mal posé, l’inférence bayésienne offre un moyen pratique pour régulariser le problème en définissant une loi a priori adaptée à la scène d'intérêt. Les différents chapitres de cette thèse sont résumés ci-dessous. 
L’introduction présente le modèle général de fusion et les hypothèses statistiques utilisées pour les images multi-bandes observées, c’est-à-dire les images HS, MS ou PAN. Les images observées sont des versions dégradées de l'image de référence (à hautes résolutions spatiale et spectrale) qui résultent par exemple d’un flou spatial et spectral et/ou d’un sous-échantillonnage liés aux caractéristiques des capteurs. Les propriétés statistiques des mesures sont alors obtenues directement à partir d’un modèle linéaire traduisant ces dégradations et des propriétés statistiques du bruit. Le chapitre 1 s’intéresse à une technique de fusion bayésienne pour les images multi-bandes de télédétection, à savoir pour les images HS, MS et PAN. Tout d'abord, le problème de fusion est formulé dans un cadre d'estimation bayésienne. Une loi a priori Gaussienne exploitant la géométrie du problème est définie et un algorithme d’estimation Bayésienne permettant d’estimer l’image de référence est étudié. Pour obtenir des estimateurs Bayésiens liés à la distribution a posteriori résultante, deux algorithmes basés sur l’échantillonnage de Monte Carlo et sur une stratégie d’optimisation ont été développés. Le chapitre 2 propose une approche variationnelle pour la fusion d’images HS et MS. Le problème de fusion est formulé comme un problème inverse dont la solution est l'image d’intérêt qui est supposée vivre dans un espace de dimension réduite. Un terme de régularisation imposant des contraintes de parcimonie est défini avec soin. Ce terme traduit le fait que les patches de l'image cible sont bien représentés par une combinaison linéaire d’atomes appartenant à un dictionnaire approprié. Les atomes de ce dictionnaire et le support des coefficients des décompositions des patches sur ces atomes sont appris à l’aide de l’image de haute résolution spatiale. 
Puis, conditionnellement à ces dictionnaires et à ces supports, le problème de fusion est résolu à l’aide d’un algorithme d’optimisation alternée (utilisant l’algorithme ADMM) qui estime de manière itérative l’image d’intérêt et les coefficients de décomposition. / Hyperspectral (HS) imaging, which consists of acquiring the same scene in several hundreds of contiguous spectral bands (a three-dimensional data cube), has opened a new range of relevant applications, such as target detection [MS02], classification [C.-03] and spectral unmixing [BDPD+12]. However, while HS sensors provide abundant spectral information, their spatial resolution is generally more limited. Thus, fusing the HS image with other highly resolved images of the same scene, such as multispectral (MS) or panchromatic (PAN) images, is an interesting problem. The problem of fusing a high spectral and low spatial resolution image with an auxiliary image of higher spatial but lower spectral resolution, also known as multi-resolution image fusion, has been explored for many years [AMV+11]. From an application point of view, this problem is also important, as motivated by recent national programs, e.g., the Japanese next-generation space-borne hyperspectral image suite (HISUI), which fuses co-registered MS and HS images acquired over the same scene under the same conditions [YI13]. Bayesian fusion allows for an intuitive interpretation of the fusion process via the posterior distribution. Since the fusion problem is usually ill-posed, the Bayesian methodology offers a convenient way to regularize the problem by defining an appropriate prior distribution for the scene of interest. The aim of this thesis is to study new multi-band image fusion algorithms to enhance the resolution of hyperspectral images. In the first chapter, a hierarchical Bayesian framework is proposed for multi-band image fusion by incorporating a forward model, statistical assumptions and a Gaussian prior for the target image to be restored. 
To derive Bayesian estimators associated with the resulting posterior distribution, two algorithms based on Monte Carlo sampling and on an optimization strategy have been developed. In the second chapter, a sparse regularization using dictionaries learned from the observed images is introduced as an alternative to the naive Gaussian prior proposed in Chapter 1, in order to regularize the ill-posed problem. Identifying the supports jointly with the dictionaries circumvents the difficulty inherent to sparse coding. To minimize the target function, an alternating optimization algorithm has been designed, which accelerates the fusion process considerably compared with the simulation-based method. In the third chapter, by exploiting intrinsic properties of the blurring and downsampling matrices, a much more efficient fusion method is proposed thanks to a closed-form solution for the Sylvester matrix equation associated with maximizing the likelihood. The proposed solution can be embedded into an alternating direction method of multipliers or a block coordinate descent method to incorporate different priors or hyper-priors for the fusion problem, allowing for Bayesian estimators. In the last chapter, a joint multi-band image fusion and unmixing scheme is proposed by combining the widely accepted linear spectral mixture model and the forward model. The joint fusion and unmixing problem is solved in an alternating optimization framework, mainly consisting of solving a Sylvester equation and projecting onto a simplex resulting from the non-negativity and sum-to-one constraints. Simulations conducted on synthetic and semi-synthetic images illustrate the advantages of the developed Bayesian estimators, both qualitatively and quantitatively. 
Imagerie hyperspectrale Fusion d'images Démélange spectral Problèmes inverses Inférence Bayésienne Optimisation Représentation parcimonieuse Equation de Sylvester Hyperspectral image Image fusion Spectral unmixing Inverse problems Bayesian inference Markov Chain Monte Carlo methods Optimization Sparse representation Sylvester equation
114	Eléments de théorie du risque en finance et assurance / Elements of risk theory in finance and insurance Mostoufi, Mina 17 December 2015 (has links) Cette thèse traite de la théorie du risque en finance et en assurance. La mise en pratique du concept de comonotonie, la dépendance du risque au sens fort, est décrite pour identifier l’optimum de Pareto et les allocations individuellement rationnelles Pareto optimales, la tarification des options et la quantification des risques. De plus, il est démontré que l’aversion au risque monotone à gauche, un raffinement pertinent de l’aversion forte au risque, caractérise tout décideur à la Yaari, pour qui, l’assurance avec franchise est optimale. Le concept de comonotonie est introduit et discuté dans le chapitre 1. Dans le cas de risques multiples, on adopte l’idée qu’une forme naturelle pour les compagnies d’assurance de partager les risques est la Pareto optimalité risque par risque. De plus, l’optimum de Pareto et les allocations individuelles Pareto optimales sont caractérisées. Le chapitre 2 étudie l’application du concept de comonotonie dans la tarification des options et la quantification des risques. Une nouvelle variable de contrôle de la méthode de Monte Carlo est introduite et appliquée aux “basket options”, aux options asiatiques et à la TVaR. Finalement dans le chapitre 3, l’aversion au risque au sens fort est raffinée par l’introduction de l’aversion au risque monotone à gauche qui caractérise l’optimalité de l’assurance avec franchise dans le modèle de Yaari. De plus, il est montré que le calcul de la franchise s’effectue aisément. / This thesis deals with risk theory in finance and insurance. The application of the comonotonicity concept, the strongest form of risk dependence, is described for identifying Pareto optima and individually rational Pareto optimal allocations, for option pricing and for the quantification of risk. 
Furthermore, it is shown that left-monotone risk aversion, a meaningful refinement of strong risk aversion, characterizes Yaari’s decision makers for whom deductible insurance is optimal. The concept of comonotonicity is introduced and discussed in Chapter 1. In the case of multiple risks, the thesis adopts the idea that a natural way for insurance companies to share risks optimally is risk-by-risk Pareto optimality. Moreover, the Pareto optimal and individually Pareto optimal allocations are characterized. Chapter 2 investigates the application of the comonotonicity concept in option pricing and the quantification of risk. A novel control variate Monte Carlo method is introduced and its application is explained for basket options, Asian options and TVaR. Finally, in Chapter 3, strong risk aversion is refined by introducing left-monotone risk aversion, which characterizes the optimality of deductible insurance within Yaari’s model. More importantly, it is shown that the computation of the deductible is tractable. Partage de risque multivarié Comonotonicité Optimum de Pareto individuellement Variable de contrôle Méthode de Monte-Carlo Modèle de Yaari Left monotone risk aversion de Jewitt Optimalité du contrat de franchise Multivariate risk sharing Comonotonicity Individually Rational Pareto optima Control variate Monte Carlo method Yaari’s model Left-monotone risk aversion Optimal insurance contract 510 338.5
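The control variate technique named in the abstract above can be sketched on a toy integrand. This is the generic textbook method, not the comonotonic control variate introduced in the thesis. The target E[exp(U)] with U uniform on (0, 1), and the use of U itself (known mean 0.5) as the control, are assumptions made purely for illustration.

```python
import math
import random
import statistics
from typing import Tuple


def control_variate_estimate(n_samples: int,
                             seed: int = 0) -> Tuple[float, float, float, float]:
    """Estimate E[exp(U)], U ~ Uniform(0, 1), with and without a control variate.

    Returns (plain estimate, controlled estimate,
             plain sample variance, controlled sample variance).
    """
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n_samples)]   # control variate, E[U] = 0.5
    y = [math.exp(x) for x in u]                   # samples of the target quantity
    mean_u, mean_y = statistics.fmean(u), statistics.fmean(y)
    cov = sum((a - mean_u) * (b - mean_y) for a, b in zip(u, y)) / (n_samples - 1)
    beta = cov / statistics.variance(u)            # estimated optimal coefficient
    # Subtract the control's known-mean deviation from each sample.
    adjusted = [b - beta * (a - 0.5) for a, b in zip(u, y)]
    return (mean_y, statistics.fmean(adjusted),
            statistics.variance(y), statistics.variance(adjusted))


plain, controlled, var_plain, var_controlled = control_variate_estimate(10_000)
```

Both estimators target the same value, e - 1, but the adjusted samples have a much smaller variance because exp(U) and U are strongly correlated. The same principle motivates choosing a control that closely tracks a basket or Asian option payoff.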
115	Modèle d'agrégation des avis des experts, en fiabilité d'équipements Handi, Youssef January 2021 (has links) (PDF) No description available. MCMC algorithm Metropolis-Hastings algorithm Expert opinions Likelihood computation DIC computation Markov chains Densities of simulated values Stationary distribution HQ data Ergodicity Equipment reliability Mixture method Averaging method Monte Carlo method Simulation method Bayesian model Aggregation model Combination model Modelling p-value Simulated quantiles Validation of results
