Global ETD Search

301	Non-convex Bayesian Learning via Stochastic Gradient Markov Chain Monte Carlo Wei Deng (11804435) 18 December 2021 The rise of artificial intelligence (AI) hinges on the efficient training of modern deep neural networks (DNNs) for non-convex optimization and uncertainty quantification, which boils down to a non-convex Bayesian learning problem. A standard tool for this problem is Langevin Monte Carlo, which approximates the posterior distribution with theoretical guarantees. However, non-convex Bayesian learning in real big data applications can be arbitrarily slow and often fails to capture the uncertainty or informative modes within a limited time. As a result, advanced techniques are still required. In this thesis, we start with the replica exchange Langevin Monte Carlo (also known as parallel tempering), which is a Markov jump process that proposes appropriate swaps between exploration and exploitation to achieve accelerations. However, the naive extension of swaps to big data problems leads to a large bias, so bias-corrected swaps are required. Such a mechanism leads to few effective swaps and insignificant accelerations. To alleviate this issue, we first propose a control variates method to reduce the variance of noisy energy estimators and show a potential to accelerate the exponential convergence. We also present the population-chain replica exchange and propose a generalized deterministic even-odd scheme to track the non-reversibility and obtain an optimal round trip rate. Further approximations based on stochastic gradient descent yield a user-friendly method for large-scale uncertainty approximation tasks without much tuning cost.
In the second part of the thesis, we study scalable dynamic importance sampling algorithms based on stochastic approximation. Traditional dynamic importance sampling algorithms have achieved successes in bioinformatics and statistical physics; however, the lack of scalability has greatly limited their extension to big data applications. To handle this scalability issue, we resolve the vanishing gradient problem and propose two dynamic importance sampling algorithms based on stochastic gradient Langevin dynamics. Theoretically, we establish the stability condition for the underlying ordinary differential equation (ODE) system and guarantee the asymptotic convergence of the latent variable to the desired fixed point. Interestingly, such a result still holds given non-convex energy landscapes. In addition, we propose a pleasingly parallel version of such algorithms with interacting latent variables. We show that the interacting algorithm can be theoretically more efficient than the single-chain alternative with an equivalent computational budget. Statistics Stochastic Analysis and Modelling Monte Carlo Algorithm Artificial intelligence Importance sampling Computer vision Langevin Dynamics Variance reduction techniques Wang-Landau algorithm Interacting particles Hamiltonian Monte Carlo Log-Sobolev inequality Metropolis Hasting Deep neural network Stochastic variance-reduced gradient Wasserstein distance Convolutional neural network Deterministic even odd scheme Non-reversibility Stochastic approximation Monte Carlo Stochastic differential equation Stochastic gradient descent Parallel tempering Stochastic approximation Replica exchange Stochastic gradient Langevin dynamics Markov Chain Monte Carlo
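The swap mechanism described in entry 301 can be illustrated with a minimal sketch. It assumes a one-dimensional double-well energy, exact (not mini-batch) energies, and the standard parallel-tempering acceptance rule; the function names, step size, and temperatures are hypothetical, and the bias correction and variance reduction developed in the thesis are deliberately left out.

```python
import math
import random

def energy(x):
    """Illustrative double-well energy with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

def grad_energy(x):
    """Derivative of the double-well energy."""
    return 4.0 * x * (x * x - 1.0)

def replica_exchange_langevin(n_steps=5000, step=1e-2,
                              tau_low=0.1, tau_high=1.0, seed=0):
    """Run two Langevin chains at different temperatures and swap them.

    The low-temperature chain exploits, the high-temperature chain explores.
    Energies are exact here, so no bias correction is applied (assumption).
    """
    rng = random.Random(seed)
    x_low, x_high = -1.0, 1.0
    samples, n_swaps = [], 0
    for _ in range(n_steps):
        # Euler discretisation of Langevin dynamics at each temperature.
        x_low += (-step * grad_energy(x_low)
                  + math.sqrt(2.0 * step * tau_low) * rng.gauss(0.0, 1.0))
        x_high += (-step * grad_energy(x_high)
                   + math.sqrt(2.0 * step * tau_high) * rng.gauss(0.0, 1.0))
        # Standard swap rule: accept with probability
        # min(1, exp((1/tau_low - 1/tau_high) * (U(x_low) - U(x_high)))).
        log_ratio = ((1.0 / tau_low - 1.0 / tau_high)
                     * (energy(x_low) - energy(x_high)))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x_low, x_high = x_high, x_low
            n_swaps += 1
        samples.append(x_low)
    return samples, n_swaps
```

Calling `replica_exchange_langevin()` returns the low-temperature samples and the number of accepted swaps. With noisy mini-batch energies the same rule would be biased, which is the problem the abstract addresses.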
302	Parallel distributed-memory particle methods for acquisition-rate segmentation and uncertainty quantifications of large fluorescence microscopy images Afshar, Yaser 17 October 2016 Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rates. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. Another issue is the information loss during image acquisition due to limitations of the optical imaging systems. Analysis of the acquired images may, therefore, find multiple solutions (or no solution) due to imaging noise, blurring, and other uncertainties introduced during image acquisition. In this thesis, we address the computational processing time and memory issues by developing a distributed parallel algorithm for segmentation of large fluorescence-microscopy images. The method is based on the versatile Discrete Region Competition (Cardinale et al., 2012) algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collective solving of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10^10 pixels) but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data inspection and interactive experiments. Second, we estimate the segmentation uncertainty on large images that do not fit the main memory of a single computer. 
We therefore develop a distributed parallel algorithm for efficient Markov chain Monte Carlo Discrete Region Sampling (Cardinale, 2013). The parallel algorithm provides a measure of segmentation uncertainty in a statistically unbiased way. It approximates the posterior probability densities over the high-dimensional space of segmentations around the previously found segmentation. / Modern fluorescence microscopy, such as light-sheet microscopy, allows the acquisition of high-resolution three-dimensional images. This leads to a bottleneck in the processing and analysis of the acquired images, because the acquisition rate exceeds the data processing rate. In addition, these images can be so large that they exceed the memory capacity of a single computer. A further issue is the information loss during image acquisition that results from limitations of the optical imaging system. Image noise, blur, and other measurement uncertainties can cause analysis algorithms to find several solutions, or none, for image processing tasks. In this work, we develop a distributed parallel algorithm for the segmentation of memory-intensive fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm (Cardinale et al., 2012), which has already proven useful in other applications for the segmentation of microscopy images. The method presented here divides the input image into smaller sub-images, which are distributed across the memory of several computers. The global segmentation problem is coordinated through network communication. This allows the segmentation of very large images; we demonstrate the algorithm on images with up to 10^10 pixels. 
In addition, the segmentation speed is increased and thus becomes comparable to the acquisition rate of the microscope. This is a basic prerequisite for the intelligent microscopes of the future, and it allows online viewing of the acquired data as well as interactive experiments. We determine the uncertainty of the segmentation algorithm when applied to images whose size exceeds the memory of a single computer. To this end, we develop a distributed parallel algorithm for efficient Markov chain Monte Carlo Discrete Region Sampling (Cardinale, 2013). This algorithm quantifies the segmentation uncertainty in a statistically unbiased way. To do so, it approximates the posterior probability density over the high-dimensional space of segmentations in the neighborhood of the previously found segmentation. info:eu-repo/classification/ddc/004 ddc:004
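The decomposition of an image into sub-images mentioned in entry 302 might look roughly like the sketch below. It only computes tile bounds for a 2-D image. The overlap ("halo") around each tile is an assumption borrowed from common domain-decomposition practice and is not confirmed by the abstract, and the names `split_axis` and `decompose` are hypothetical.

```python
def split_axis(length, parts):
    """Split range(length) into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(length, parts)
    bounds, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def decompose(shape, grid, halo=1):
    """Return one dict per tile with its core bounds and halo-padded bounds.

    Bounds are (row_start, row_stop, col_start, col_stop), stop exclusive.
    Core regions partition the image; padded regions overlap neighbours
    by `halo` pixels and are clipped at the image border.
    """
    tiles = []
    for r0, r1 in split_axis(shape[0], grid[0]):
        for c0, c1 in split_axis(shape[1], grid[1]):
            padded = (max(r0 - halo, 0), min(r1 + halo, shape[0]),
                      max(c0 - halo, 0), min(c1 + halo, shape[1]))
            tiles.append({"core": (r0, r1, c0, c1), "padded": padded})
    return tiles
```

Each tile could then be assigned to one process, with the halo values exchanged over the network after every update. How the thesis actually organises that communication is not stated in the abstract.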
303	Bayesian modelling of integrated data and its application to seabird populations Reynolds, Toby J. January 2010 Integrated data analyses are becoming increasingly popular in studies of wild animal populations where two or more separate sources of data contain information about common parameters. Here we develop an integrated population model using abundance and demographic data from a study of common guillemots (Uria aalge) on the Isle of May, southeast Scotland. A state-space model for the count data is supplemented by three demographic time series (productivity and two mark-recapture-recovery (MRR)), enabling the estimation of prebreeder emigration rate, a parameter for which there are no direct observational data and which is unidentifiable in the separate analysis of MRR data. A Bayesian approach using MCMC provides a flexible and powerful analysis framework. This model is extended to provide predictions of future population trajectories. Adopting random effects models for the survival and productivity parameters, we implement the MCMC algorithm to obtain a posterior sample of the underlying process means and variances (and population sizes) within the study period. Given this sample, we predict future demographic parameters, which in turn allows us to predict future population sizes and obtain the corresponding posterior distribution. Under the assumption that recent, unfavourable conditions persist in the future, we obtain a posterior probability of 70% that there is a population decline of >25% over a 10-year period. Lastly, using MRR data we test for spatial, temporal and age-related correlations in guillemot survival among three widely separated Scottish colonies that have varying overlap in nonbreeding distribution. We show that survival is highly correlated over time for colonies/age classes sharing wintering areas, and essentially uncorrelated for those with separate wintering areas. 
These results strongly suggest that one or more aspects of winter environment are responsible for spatiotemporal variation in survival of British guillemots, and provide insight into the factors driving multi-population dynamics of the species. 591.7
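The posterior probability of decline reported in entry 303 is, in general terms, the fraction of predictive trajectories that fall below a threshold. The sketch below shows that calculation under loud assumptions: a single log-normal annual growth rate per posterior draw stands in for the thesis's full demographic model, and all function names and parameter values are hypothetical.

```python
import math
import random

def project_population(n0, mean_log_growth, sd_log_growth, years, rng):
    """Project one trajectory with log-normal annual growth (stand-in model)."""
    trajectory = [n0]
    for _ in range(years):
        growth = math.exp(rng.gauss(mean_log_growth, sd_log_growth))
        trajectory.append(trajectory[-1] * growth)
    return trajectory

def predictive_trajectories(posterior_draws, n0, years, seed=0):
    """Simulate one future trajectory per posterior draw (mean, sd) of log growth."""
    rng = random.Random(seed)
    return [project_population(n0, mu, sd, years, rng)
            for mu, sd in posterior_draws]

def prob_decline(trajectories, threshold=0.25):
    """Fraction of trajectories whose final size is more than `threshold` below the start."""
    declined = sum(1 for t in trajectories if t[-1] < (1.0 - threshold) * t[0])
    return declined / len(trajectories)
```

In practice `posterior_draws` would come from the MCMC output, so that parameter uncertainty and process noise are both reflected in the resulting probability.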
304	Multiple sequence analysis in the presence of alignment uncertainty Herman, Joseph L. January 2014 Sequence alignment is one of the most intensely studied problems in bioinformatics, and is an important step in a wide range of analyses. An issue that has gained much attention in recent years is the fact that downstream analyses are often highly sensitive to the specific choice of alignment. One way to address this is to jointly sample alignments along with other parameters of interest. In order to extend the range of applicability of this approach, the first chapter of this thesis introduces a probabilistic evolutionary model for protein structures on a phylogenetic tree; since protein structures typically diverge much more slowly than sequences, this allows for more reliable detection of remote homologies, improving the accuracy of the resulting alignments and trees, and reducing sensitivity of the results to the choice of dataset. In order to carry out inference under such a model, a number of new Markov chain Monte Carlo approaches are developed, allowing for more efficient convergence and mixing on the high-dimensional parameter space. The second part of the thesis presents a directed acyclic graph (DAG)-based approach for representing a collection of sampled alignments. This DAG representation allows the initial collection of samples to be used to generate a larger set of alignments under the same approximate distribution, enabling posterior alignment probabilities to be estimated reliably from a reasonable number of samples. If desired, summary alignments can then be generated as maximum-weight paths through the DAG, under various types of loss or scoring functions. The acyclic nature of the graph also permits various other types of algorithms to be easily adapted to operate on the entire set of alignments in the DAG. 
In the final part of this work, methodology is introduced for alignment-DAG-based sequence annotation using hidden Markov models, and RNA secondary structure prediction using stochastic context-free grammars. Results on test datasets indicate that the additional information contained within the DAG allows for improved predictions, resulting in substantial gains over simply analysing a set of alignments one by one. 572.8
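Entry 304 mentions summary alignments obtained as maximum-weight paths through a DAG. The generic dynamic-programming version of that idea is sketched below. The graph encoding (a dict of weighted successor lists), the node labels, and the weights are all hypothetical, and nothing here reflects how the thesis actually builds its alignment DAG or its scoring functions.

```python
def max_weight_path(edges, source, sink):
    """Return (weight, path) of the heaviest source-to-sink path in a DAG.

    `edges` maps a node to a list of (successor, weight) pairs.
    """
    # Depth-first search gives a reverse topological order of reachable nodes.
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for succ, _ in edges.get(node, []):
            visit(succ)
        order.append(node)

    visit(source)
    order.reverse()

    # Relax edges in topological order, remembering the best predecessor.
    best, prev = {source: 0.0}, {}
    for node in order:
        for succ, weight in edges.get(node, []):
            candidate = best[node] + weight
            if succ not in best or candidate > best[succ]:
                best[succ] = candidate
                prev[succ] = node

    if sink not in best:
        raise ValueError("sink is not reachable from source")
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return best[sink], path[::-1]
```

For an alignment DAG the nodes would stand for alignment columns and the weights for something like estimated posterior column probabilities. That mapping is an assumption made for illustration.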
305	Estimation of State Space Models and Stochastic Volatility Miller Lira, Shirley 09 1900 My thesis consists of three chapters related to the estimation of state-space models and stochastic volatility models. In the first article, we develop a computationally efficient state smoothing procedure for a linear Gaussian state-space model. We show how to exploit the particular structure of state-space models to draw the latent states efficiently. We analyze the computational efficiency of methods based on the Kalman filter, the Cholesky factor algorithm, and our new method using operation counts and computational experiments. We show that in many important cases our method is more efficient. The gains are particularly large when the dimension of the observed variables is large or when repeated draws of the states are needed for the same parameter values. As an application, we consider a multivariate Poisson model with time-varying intensities, which is used to analyze transaction count data from financial markets. In the second chapter, we propose a new technique for analyzing multivariate stochastic volatility models. The proposed method is based on efficiently drawing the volatility from its conditional density given the parameters and the data. Our methodology applies to models with several types of cross-sectional dependence. We can model time-varying conditional correlation matrices by incorporating factors in the returns equation, where the factors are independent stochastic volatility processes. 
We can incorporate copulas to allow conditional dependence of the returns given the volatility, permitting different Student's t marginal distributions with specific degrees of freedom to capture the heterogeneity of the returns. We draw the volatility as a block in the time dimension and one at a time in the cross-sectional dimension. We apply the method introduced by McCausland (2012) to obtain a good approximation of the conditional posterior distribution of the volatility of one return given the volatilities of the other returns, the parameters, and the dynamic correlations. The model is evaluated using real data for ten exchange rates. We report results for univariate stochastic volatility models and two multivariate models. In the third chapter, we evaluate the information contributed by variations of realized volatility to the estimation and forecasting of volatility when prices are measured with and without error. We use stochastic volatility models. We take the viewpoint of an investor for whom volatility is an unknown latent variable and realized volatility is a sample quantity that contains information about it. We use Bayesian Markov chain Monte Carlo methods to estimate the models, which allow the formulation not only of the posterior densities of volatility but also of the predictive densities of future volatility. We compare the volatility forecasts and the hit rates of forecasts that do and do not use the information contained in realized volatility. This approach differs from existing ones in the empirical literature in that the latter are most often limited to documenting the ability of realized volatility to forecast itself. 
We present empirical applications using daily returns of indices and exchange rates. The competing models are applied to the second half of 2008, a notable period in the recent financial crisis. / My thesis consists of three chapters related to the estimation of state space models and stochastic volatility models. In the first chapter we develop a computationally efficient procedure for state smoothing in Gaussian linear state space models. We show how to exploit the special structure of state-space models to draw latent states efficiently. We analyze the computational efficiency of Kalman-filter-based methods, the Cholesky Factor Algorithm, and our new method using counts of operations and computational experiments. We show that for many important cases, our method is most efficient. Gains are particularly large for cases where the dimension of observed variables is large or where one makes repeated draws of states for the same parameter values. We apply our method to a multivariate Poisson model with time-varying intensities, which we use to analyze financial market transaction count data. In the second chapter, we propose a new technique for the analysis of multivariate stochastic volatility models, based on efficient draws of volatility from its conditional posterior distribution. It applies to models with several kinds of cross-sectional dependence. Full VAR coefficient and covariance matrices give cross-sectional volatility dependence. Mean factor structure allows conditional correlations, given states, to vary in time. The conditional return distribution features Student's t marginals, with asset-specific degrees of freedom, and copulas describing cross-sectional dependence. We draw volatility as a block in the time dimension and one-at-a-time in the cross-section. 
Following McCausland (2012), we use close approximations of the conditional posterior distributions of volatility blocks as Metropolis-Hastings proposal distributions. We illustrate using daily return data for ten currencies. We report results for univariate stochastic volatility models and two multivariate models. In the third chapter, we evaluate the information contributed by (variations of) realized volatility to the estimation and forecasting of volatility when prices are measured with and without error using a stochastic volatility model. We consider the viewpoint of an investor for whom volatility is an unknown latent variable and realized volatility is a sample quantity which contains information about it. We use Bayesian Markov Chain Monte Carlo (MCMC) methods to estimate the models, which allow the formulation of the posterior densities of in-sample volatilities, and the predictive densities of future volatilities. We then compare the volatility forecasts and hit rates from predictions that use and do not use the information contained in realized volatility. This approach is in contrast with most of the empirical realized volatility literature, which most often documents the ability of realized volatility to forecast itself. Our empirical applications use daily index returns and foreign exchange during the 2008-2009 financial crisis. Modèles espace-état Volatilité stochastique Volatilité réalisée Compte de données Données haute fréquence State-space models Markov chain Monte Carlo Importance sampling Stochastic volatility Realized Volatility Count data High frequency financial data
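Entry 305 compares its smoothing method against Kalman-filter-based approaches. As background only, here is a minimal scalar Kalman filter for a linear Gaussian state-space model. It shows the baseline recursion, not the thesis's new method, and the model form, parameter names, and default values are assumptions chosen for illustration.

```python
def kalman_filter(observations, phi=1.0, q=0.1, r=1.0, m0=0.0, p0=1.0):
    """Filter the scalar model x_t = phi * x_{t-1} + w_t,  y_t = x_t + v_t.

    w_t has variance q, v_t has variance r, and (m0, p0) are the mean and
    variance of the state before the first observation.
    Returns the filtered means and variances.
    """
    mean, var = m0, p0
    means, variances = [], []
    for y in observations:
        # Prediction step: push the state estimate through the dynamics.
        mean_pred = phi * mean
        var_pred = phi * phi * var + q
        # Update step: weight the innovation by the Kalman gain.
        gain = var_pred / (var_pred + r)
        mean = mean_pred + gain * (y - mean_pred)
        var = (1.0 - gain) * var_pred
        means.append(mean)
        variances.append(var)
    return means, variances
```

Drawing the whole latent state path, as the abstract describes, would additionally need a backward simulation pass or a precision-based alternative, which this sketch does not include.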
306	Développement de modèles prédictifs de la toxicocinétique de substances organiques Peyret, Thomas 02 1900 Physiologically based pharmacokinetic (PBPK) models make it possible to simulate the internal dose of chemical substances on the basis of species-specific and substance-specific parameters. Existing quantitative structure-property relationship (QSPR) models make it possible to estimate the chemical-specific parameters (partition coefficients (PCs) and metabolic constants), but their applicability is limited by their lack of consideration of the variability of their input parameters and by their restricted application domain (i.e., substances containing CH3, CH2, CH, C, C=C, H, Cl, F, Br, a benzene ring, and H on the benzene ring). The objective of this study is to develop new knowledge and tools to broaden the application domain of QSPR-PBPK models for predicting the toxicokinetics of inhaled organic substances in humans. First, a unified mechanistic algorithm was developed from existing models to predict the PCs of 142 drugs and environmental pollutants at the macro (tissue and blood) and micro (cell and biological fluids) levels from tissue and blood composition and physicochemical properties. The resulting algorithm was applied to predict the tissue:blood, tissue:plasma, and tissue:air PCs of rat muscle (n = 174), liver (n = 139), and adipose tissue (n = 141) for acidic, basic, and neutral drugs as well as for ketones, acetate esters, ethers, alcohols, and aliphatic and aromatic hydrocarbons. A quantitative property-property relationship (QPPR) model was developed for the in vivo intrinsic clearance (CLint) (calculated as the ratio of Vmax (μmol/h/kg rat weight) to Km (μM)) of CYP2E1 substrates (n = 26) as a function of the n-octanol:water PC, the blood:water PC, and the ionization potential. 
The QPPR predictions, represented by the lower and upper limits of the 95% confidence interval of the mean, were then integrated into a human PBPK model. Subsequently, the PC algorithm and the QPPR for CLint were integrated with QSPR models for the hemoglobin:water and oil:air PCs to simulate the inhalation pharmacokinetics and cellular dosimetry of volatile organic compounds (VOCs) (benzene, 1,2-dichloroethane, dichloromethane, m-xylene, toluene, styrene, 1,1,1-trichloroethane, and 1,2,4-trimethylbenzene) with a PBPK model in the rat. Finally, the variability of the tissue and blood composition parameters of the algorithm for rat tissue:air and human blood:air PCs was characterized by Markov chain Monte Carlo (MCMC) simulations. The resulting distributions were used to conduct Monte Carlo simulations to predict tissue:blood and blood:air PCs. The PC distributions, together with those of the physiological parameters and of cytochrome P450 CYP2E1 content, were incorporated into a PBPK model to characterize the variability of the blood toxicokinetics of four VOCs (benzene, chloroform, styrene, and trichloroethylene) by Monte Carlo simulation. Overall, the quantitative approaches implemented for the PCs and CLint in this study allowed the use of generic molecular descriptors rather than specific molecular fragments to predict the pharmacokinetics of organic substances in humans. The present study has, for the first time, characterized the variability of the biological parameters of the PC algorithms to extend the ability of PBPK models to predict population distributions of internal doses of organic substances before testing in animals or humans. 
/ Physiologically based pharmacokinetic (PBPK) models simulate the internal dose metrics of chemicals based on species-specific and chemical-specific parameters. The existing quantitative structure-property relationships (QSPRs) allow estimation of the chemical-specific parameters (partition coefficients (PCs) and metabolic constants), but their applicability is limited by their lack of consideration of variability in input parameters and their restricted application domain (i.e., substances containing CH3, CH2, CH, C, C=C, H, Cl, F, Br, benzene ring and H in benzene ring). The objective of this study was to develop new knowledge and tools to increase the applicability domain of QSPR-PBPK models for predicting the inhalation toxicokinetics of organic compounds in humans. First, a unified mechanistic algorithm was developed from existing models to predict macro (tissue and blood) and micro (cell and biological fluid) level PCs of 142 drugs and environmental pollutants on the basis of tissue and blood composition along with physicochemical properties. The resulting algorithm was applied to compute the tissue:blood, tissue:plasma and tissue:air PCs in rat muscle (n = 174), liver (n = 139) and adipose tissue (n = 141) for acidic, neutral, zwitterionic and basic drugs as well as ketones, acetate esters, alcohols, ethers, aliphatic and aromatic hydrocarbons. Then, a quantitative property-property relationship (QPPR) model was developed for the in vivo rat intrinsic clearance (CLint) (calculated as the ratio of the in vivo Vmax (μmol/h/kg bw rat) to the Km (μM)) of CYP2E1 substrates (n = 26) as a function of n-octanol:water PC, blood:water PC, and ionization potential. The predictions of the QPPR as lower and upper bounds of the 95% mean confidence intervals were then integrated within a human PBPK model. 
Subsequently, the PC algorithm and QPPR for CLint were integrated along with a QSPR model for the hemoglobin:water and oil:air PCs to simulate the inhalation pharmacokinetics and cellular dosimetry of volatile organic compounds (VOCs) (benzene, 1,2-dichloroethane, dichloromethane, m-xylene, toluene, styrene, 1,1,1-trichloroethane and 1,2,4 trimethylbenzene) using a PBPK model for rats. Finally, the variability in the tissue and blood composition parameters of the PC algorithm for rat tissue:air and human blood:air PCs was characterized by performing Markov chain Monte Carlo (MCMC) simulations. The resulting distributions were used for conducting Monte Carlo simulations to predict tissue:blood and blood:air PCs for VOCs. The distributions of PCs, along with distributions of physiological parameters and CYP2E1 content, were then incorporated within a PBPK model, to characterize the human variability of the blood toxicokinetics of four VOCs (benzene, chloroform, styrene and trichloroethylene) using Monte Carlo simulations. Overall, the quantitative approaches for PCs and CLint implemented in this study allow the use of generic molecular descriptors rather than specific molecular fragments to predict the pharmacokinetics of organic substances in humans. In this process, the current study has, for the first time, characterized the variability of the biological input parameters of the PC algorithms to expand the ability of PBPK models to predict the population distributions of the internal dose metrics of organic substances prior to testing in animals or humans. Toxicocinétique Simulation Monte Carlo Monte Carlo par chaîne de Markov Coefficient de partage Métabolisme Analyse d’incertitude Dosimétrie cellulaire Toxicokinetics Monte Carlo simulation Markov chain Monte Carlo Partition coefficient Metabolism Uncertainty analysis Cellular dosimetry
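Two quantities from entry 306 can be made concrete with a small sketch. CLint is the ratio Vmax/Km as defined in the abstract. The tissue:blood PC is computed here as tissue:air divided by blood:air, which is the usual relationship for volatile chemicals but is not spelled out in the abstract. The log-normal input distributions, the function names, and all numerical values are hypothetical placeholders, not data from the thesis.

```python
import math
import random

def intrinsic_clearance(vmax, km):
    """CLint as Vmax / Km; with Vmax in umol/h/kg and Km in uM this is L/h/kg."""
    return vmax / km

def tissue_blood_pc(tissue_air, blood_air):
    """Tissue:blood PC from tissue:air and blood:air PCs (assumed relationship)."""
    return tissue_air / blood_air

def monte_carlo_tissue_blood(n, median_ta, gsd_ta, median_ba, gsd_ba, seed=0):
    """Propagate log-normal variability in both PCs to the tissue:blood PC.

    `median_*` and `gsd_*` are the median and geometric standard deviation
    (gsd > 1) of each input; the draws are returned sorted.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        ta = rng.lognormvariate(math.log(median_ta), math.log(gsd_ta))
        ba = rng.lognormvariate(math.log(median_ba), math.log(gsd_ba))
        draws.append(tissue_blood_pc(ta, ba))
    return sorted(draws)
```

Percentiles of the sorted draws give a simple summary of population variability. In the thesis the input distributions come from MCMC rather than from fixed log-normal assumptions.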
307	Improving sampling, optimization and feature extraction in Boltzmann machines Desjardins, Guillaume 12 1900 Supervised learning of large-scale hierarchical networks is currently enjoying tremendous success. Despite this enthusiasm, unsupervised learning still represents, according to many researchers, a key element of Artificial Intelligence, where agents must learn from a potentially limited amount of data. This thesis follows this line of thought and addresses various research topics related to the density estimation problem through Boltzmann machines (BMs), probabilistic graphical models at the heart of deep learning. Our contributions touch on the areas of sampling, partition function estimation, optimization, and the learning of invariant representations. This thesis begins by presenting a new adaptive sampling algorithm, which automatically adjusts the temperature of the Markov chains under simulation in order to maintain a high convergence speed throughout learning. When used in the context of stochastic maximum likelihood (SML) learning, our algorithm yields increased robustness to the choice of learning rate as well as faster convergence. Our results are presented in the domain of BMs, but the method is general and applicable to learning any probabilistic model that relies on Markov chain sampling. While the maximum likelihood gradient can be approximated by sampling, evaluating the log-likelihood requires an estimate of the partition function. 
Contrary to traditional approaches that treat a given model as a black box, we instead propose to exploit the dynamics of learning by estimating the successive changes in the log partition function incurred at each parameter update. The estimation problem is reformulated as an inference problem similar to the Kalman filter, but on a two-dimensional graph, where the dimensions correspond to the time axis and the temperature parameter. On the topic of optimization, we also present an algorithm that makes it possible to apply the natural gradient efficiently to Boltzmann machines with thousands of units. Until now, its adoption was limited by its high computational cost and its memory requirements. Our algorithm, Metric-Free Natural Gradient (MFNG), avoids the explicit computation of the Fisher information matrix (and its inverse) by exploiting a linear solver combined with an efficient matrix-vector product. The algorithm is promising: in terms of the number of function evaluations, MFNG converges faster than SML. Unfortunately, its implementation remains inefficient in terms of computation time. This work also explores the mechanisms underlying the learning of invariant representations. To this end, we use the family of “spike & slab” restricted Boltzmann machines (ssRBM), which we modify so that they can model binary and sparse distributions. The binary latent variables of the ssRBM can be made invariant to a vector subspace by associating with each of them a vector of continuous latent variables (called “slabs”). This results in increased invariance at the representation level and a better classification rate when few labeled data are available. 
We end this thesis on an ambitious topic: learning representations that can separate the factors of variation present in the input signal. We propose a solution based on a bilinear ssRBM (with two groups of latent factors) and formulate the problem as one of “pooling” in complementary vector subspaces. / Despite the current widescale success of deep learning in training large scale hierarchical models through supervised learning, unsupervised learning promises to play a crucial role towards solving general Artificial Intelligence, where agents are expected to learn with little to no supervision. The work presented in this thesis tackles the problem of unsupervised feature learning and density estimation, using a model family at the heart of the deep learning phenomenon: the Boltzmann Machine (BM). We present contributions in the areas of sampling, partition function estimation, optimization and the more general topic of invariant feature learning. With regard to sampling, we present a novel adaptive parallel tempering method which dynamically adjusts the temperatures under simulation to maintain good mixing in the presence of complex multi-modal distributions. When used in the context of stochastic maximum likelihood (SML) training, the improved ergodicity of our sampler translates to increased robustness to learning rates and faster per epoch convergence. Though our application is limited to BM, our method is general and is applicable to sampling from arbitrary probabilistic models using Markov Chain Monte Carlo (MCMC) techniques. While SML gradients can be estimated via sampling, computing data likelihoods requires an estimate of the partition function. Contrary to previous approaches which consider the model as a black box, we provide an efficient algorithm which instead tracks the change in the log partition function incurred by successive parameter updates. 
Our algorithm frames this estimation problem as one of filtering performed over a 2D lattice, with one dimension representing time and the other temperature. On the topic of optimization, our thesis presents a novel algorithm for applying the natural gradient to large scale Boltzmann Machines. Up until now, its application had been constrained by the computational and memory requirements of computing the Fisher Information Matrix (FIM), which is square in the number of parameters. The Metric-Free Natural Gradient algorithm (MFNG) avoids computing the FIM altogether by combining a linear solver with an efficient matrix-vector operation. The method shows promise in that the resulting updates yield faster per-epoch convergence, despite being slower in terms of wall clock time. Finally, we explore how invariant features can be learnt through modifications to the BM energy function. We study the problem in the context of the spike & slab Restricted Boltzmann Machine (ssRBM), which we extend to handle both binary and sparse input distributions. By associating each spike with several slab variables, latent variables can be made invariant to a rich, high dimensional subspace resulting in increased invariance in the learnt representation. When using the expected model posterior as input to a classifier, increased invariance translates to improved classification accuracy in the low-label data regime. We conclude by showing a connection between invariance and the more powerful concept of disentangling factors of variation. While invariance can be achieved by pooling over subspaces, disentangling can be achieved by learning multiple complementary views of the same subspace. In particular, we show how this can be achieved using third-order BMs featuring multiplicative interactions between pairs of random variables. 
Réseaux de neurones Apprentissage profond Apprentissage non-supervisé Apprentissage de représentations Machines de Boltzmann Échantillonnage Gradient naturel Modèles bilinéaires Fonction de partition Neural networks Deep learning Unsupervised learning Feature learning Boltzmann machines Markov chain Monte Carlo Parallel tempering Natural gradient Bilinear models Partition function
308	Perfektní simulace ve stochastické geometrii / Perfect simulation in stochastic geometry Sadil, Antonín January 2010 (has links) Perfect simulation methods convert suitable Markov chain Monte Carlo (MCMC) algorithms into algorithms that return exact draws from the target distribution, instead of approximations based on long-time convergence to equilibrium. In recent years, many perfect simulation algorithms have been developed. This work provides a unified exposition of some perfect simulation algorithms with applications to spatial point processes, especially the Strauss process and the area-interaction process. The described algorithms and their properties are compared theoretically and in a simulation study.
309	Modèle bayésien non paramétrique pour la segmentation jointe d'un ensemble d'images avec des classes partagées / Bayesian nonparametric model for joint segmentation of a set of images with shared classes Sodjo, Jessica 18 September 2018 (has links) Ce travail porte sur la segmentation jointe d’un ensemble d’images dans un cadre bayésien. Le modèle proposé combine le processus de Dirichlet hiérarchique (HDP) et le champ de Potts. Ainsi, pour un groupe d’images, chacune est divisée en régions homogènes et les régions similaires entre images sont regroupées en classes. D’une part, grâce au HDP, il n’est pas nécessaire de définir a priori le nombre de régions par image et le nombre de classes, communes ou non. D’autre part, le champ de Potts assure une homogénéité spatiale. Les lois a priori et a posteriori en découlant sont complexes, rendant impossible le calcul analytique d’estimateurs. Un algorithme de Gibbs est alors proposé pour générer des échantillons de la loi a posteriori. De plus, un algorithme de Swendsen-Wang généralisé est développé pour une meilleure exploration de la loi a posteriori. Enfin, un algorithme de Monte Carlo séquentiel a été défini pour l’estimation des hyperparamètres du modèle. Ces méthodes ont été évaluées sur des images-test et sur des images naturelles. Le choix de la meilleure partition se fait par minimisation d’un critère indépendant de la numérotation. Les performances de l’algorithme sont évaluées via des métriques connues en statistiques mais peu utilisées en segmentation d’image. / This work concerns the joint segmentation of a set of images in a Bayesian framework. The proposed model combines the hierarchical Dirichlet process (HDP) and the Potts random field. Hence, for a set of images, each image is divided into homogeneous regions, and similar regions across images are grouped into classes. 
On the one hand, thanks to the HDP, it is not necessary to define a priori the number of regions per image and the number of classes, common or not. On the other hand, the Potts field ensures spatial consistency. The resulting prior and posterior distributions are complex, making it impossible to compute estimators analytically. A Gibbs algorithm is then proposed to generate samples from the posterior distribution. Moreover, a generalized Swendsen-Wang algorithm is developed for better exploration of the posterior distribution. Finally, a sequential Monte Carlo sampler is defined for estimating the hyperparameters of the model. These methods have been evaluated on toy examples and natural images. The best partition is chosen by minimizing a criterion that is independent of the labeling. Performance is assessed with metrics that are well known in statistics but rarely used in image segmentation. Inférence bayésienne Monte Carlo séquentiel Bayésien non paramétrique Processus de Dirichlet hiérarchique Champ de Potts Algorithme de Swendsen-Wang Segmentation Image Bayesian inference Markov chain Monte Carlo Sequential Monte Carlo Non parametric Bayesian Hierarchical Dirichlet process Potts field Swendsen-Wang algorithm Segmentation Image
310	Approche stochastique de l'analyse du « residual moveout » pour la quantification de l'incertitude dans l'imagerie sismique / A stochastic approach to uncertainty quantification in residual moveout analysis Tamatoro, Johng-Ay 09 April 2014 (has links) Le principal objectif de l'imagerie sismique pétrolière telle qu'elle est réalisée de nos jours est de fournir une image représentative des quelques premiers kilomètres du sous-sol. Cette image permettra la localisation des structures géologiques formant les réservoirs où sont piégées les ressources en hydrocarbures. Pour pouvoir caractériser ces réservoirs et permettre la production des hydrocarbures, le géophysicien utilise la migration-profondeur, qui est un outil d'imagerie sismique servant à convertir des données-temps enregistrées lors des campagnes d'acquisition sismique en des images-profondeur qui seront exploitées par l'ingénieur-réservoir avec l'aide de l'interprète sismique et du géologue. Lors de la migration-profondeur, les évènements sismiques (réflecteurs,…) sont replacés à leurs positions spatiales correctes. Une migration-profondeur pertinente requiert une évaluation précise du modèle de vitesse. La précision du modèle de vitesse utilisé pour une migration est jugée au travers de l'alignement horizontal des évènements présents sur les Common Image Gather (CIG). Les évènements non horizontaux (Residual Move Out) présents sur les CIG sont dus au ratio du modèle de vitesse de migration par la vitesse effective du milieu. L'analyse du Residual Move Out (RMO) a pour but d'évaluer ce ratio pour juger de la pertinence du modèle de vitesse et permettre sa mise à jour. Les CIG qui servent de données pour l'analyse du RMO sont solutions de problèmes inverses mal posés, et sont corrompues par du bruit. Une analyse de l'incertitude s'avère nécessaire pour améliorer l'évaluation des résultats obtenus. Le manque d'outils d'analyse de l'incertitude dans l'analyse du RMO en fait sa faiblesse. 
L'analyse et la quantification de l'incertitude pourraient aider à la prise de décisions qui auront des impacts socio-économiques importants. Ce travail de thèse a pour but de contribuer à l'analyse et à la quantification de l'incertitude dans l'analyse des paramètres calculés pendant le traitement des données sismiques et particulièrement dans l'analyse du RMO. Pour atteindre ces objectifs plusieurs étapes ont été nécessaires. Elles sont entre autres : - l’appropriation des différents concepts géophysiques nécessaires à la compréhension du problème (organisation des données de sismique réflexion, outils mathématiques et méthodologiques utilisés) ; - la présentation des méthodes et outils pour l'analyse classique du RMO ; - l'interprétation statistique de l’analyse classique ; - la proposition d’une approche stochastique. Cette approche stochastique consiste en un modèle statistique hiérarchique dont les paramètres sont : - la variance traduisant le niveau de bruit dans les données, estimée par une méthode basée sur les ondelettes ; - une fonction qui traduit la cohérence des amplitudes le long des évènements, estimée par des méthodes de lissage de données ; - le ratio, qui est considéré comme une variable aléatoire et non comme un paramètre fixe inconnu comme c'est le cas dans l'approche classique de l'analyse du RMO. Il est estimé par des méthodes de simulations de Monte Carlo par Chaîne de Markov. L'approche proposée dans cette thèse permet d'obtenir autant de cartes de valeurs du paramètre qu'on le désire par le biais des quantiles. La méthodologie proposée est validée par l'application à des données synthétiques et à des données réelles. Une étude de sensibilité de l'estimation du paramètre a été réalisée. L'utilisation de l'incertitude de ce paramètre pour quantifier l'incertitude des positions spatiales des réflecteurs est présentée dans ce travail de thèse. 
/ The main goal of seismic imaging for oil exploration and production, as it is done nowadays, is to provide an image of the first kilometers of the subsurface to allow the localization and accurate estimation of hydrocarbon resources. The reservoirs where these hydrocarbons are trapped are structures with more or less complex geology. To characterize these reservoirs and allow the production of hydrocarbons, the geophysicist uses depth migration, a seismic imaging tool that converts time data recorded during seismic surveys into depth images, which are then exploited by the reservoir engineer with the help of the seismic interpreter and the geologist. During depth migration, seismic events (reflectors, diffractions, faults …) are moved to their correct locations in space. Relevant depth migration requires accurate knowledge of vertical and horizontal seismic velocity variations (the velocity model). Usually, the so-called Common-Image-Gathers (CIGs) serve as a tool to verify the correctness of the velocity model. Often the CIGs are computed in the surface offset (distance between shot point and receiver) domain, and their flatness serves as a criterion of the velocity model's correctness. Residual moveout (RMO) of the events on CIGs, which is due to the ratio of the migration velocity model to the effective velocity model, indicates that the velocity model is incorrect and is used for updating it. The post-stacked images forming the CIGs, which are used as data for the RMO analysis, are the results of an inverse problem and are corrupted by noise. An uncertainty analysis is necessary to improve the evaluation of the results. Dealing with the uncertainty is a major issue, and doing so is expected to help in decisions that have important social and commercial implications. 
The goal of this thesis is to contribute to uncertainty analysis and quantification in the analysis of various parameters computed during seismic processing, particularly in RMO analysis. To reach these goals, several stages were necessary. We began by appropriating the various geophysical concepts necessary for understanding: - the organization of the seismic data; - the various processing steps; - the various mathematical and methodological tools that are used (chapters 2 and 3). In chapter 4, we present different tools used for conventional RMO analysis. In the fifth chapter, we give a statistical interpretation of conventional RMO analysis and propose a stochastic approach to this analysis. This approach consists of a hierarchical statistical model whose parameters are: - the variance, which expresses the noise level in the data; - a functional parameter, which expresses the coherency of the amplitudes along events; - the ratio, which is assumed to be a random variable and not an unknown fixed parameter as in the conventional approach. Fitting the model to the data with smoothing methods, combined with wavelet-based estimation of the noise variance, makes it possible to compute the posterior distribution of the ratio given the data by empirical Bayes methods. An estimate of the ratio is obtained from Markov Chain Monte Carlo simulations of its posterior distribution. The various quantiles of these simulations provide different estimates of the ratio. The proposed methodology is validated in the sixth chapter by applying it to synthetic data and real data. A sensitivity analysis of the estimation of the parameter was carried out. The use of the uncertainty of this parameter to quantify the uncertainty of the spatial positions of reflectors is presented in this thesis. 
Approche Bayésienne Approche stochastique Loi de probabilité Loi a posteriori Monte-Carlo par chaînes de Markov Metropolis - Hastings Analyse du « Residual moveout » Analyse de l’incertitude Semblance Données sismiques AVO Bayesian approach Stochastic approach Markov chain Monte Carlo Metropolis - Hastings Probability density function A posteriori distribution Residual Moveout Analysis Uncertainty analysis Semblance Seismic data AVO Common Image Gathers
