1

Adaptive Evolutionary Monte Carlo for Heuristic Optimization: With Applications to Sensor Placement Problems

Ren, Yuan 14 January 2010
This dissertation presents an algorithm to solve optimization problems with "black-box" objective functions, i.e., functions that can only be evaluated by running a computer program. Such optimization problems often arise in engineering applications, for example in sensor placement design. Due to the complexity of engineering systems, the objective functions usually have multiple local optima and depend on a large number of decision variables. These difficulties make many existing methods less effective. The proposed algorithm, called adaptive evolutionary Monte Carlo (AEMC), combines sampling-based and metamodel-based search methods, incorporating the strengths of both while compensating for the limitations of each. Specifically, AEMC combines a tree-based predictive model with an evolutionary Monte Carlo sampling procedure for the purpose of heuristic optimization. AEMC is able to escape local optima thanks to its random sampling component, and it improves the quality of solutions quickly by using information learned from the tree-based model. AEMC is also an adaptive Markov chain Monte Carlo (MCMC) algorithm, and is in fact the first adaptive MCMC algorithm that simulates multiple Markov chains in parallel. The ergodicity of the AEMC algorithm is studied: it is proven that the distribution of samples obtained by AEMC converges asymptotically to the "target" distribution determined by the objective function. This means that AEMC has a larger probability of collecting samples from regions containing the global optimum than from other regions, which implies that AEMC will reach the global optimum given enough run time. AEMC falls into the category of heuristic optimization algorithms and is applicable to problems that can be solved by other heuristic methods, such as genetic algorithms. The advantages of AEMC are demonstrated by applying it to a sensor placement problem in a manufacturing process, as well as to a suite of standard test functions. AEMC is shown to improve optimization effectiveness and efficiency compared with several alternative strategies, including genetic algorithms, Markov chain Monte Carlo algorithms, and metamodel-based methods. The effectiveness of AEMC for sampling purposes is also demonstrated by applying it to a Gaussian mixture distribution.
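
To make the combination concrete, here is a hedged, minimal sketch of an AEMC-style loop in Python. It is not the dissertation's reference implementation: the tree-based metamodel is stood in for by a simple axis-aligned box around the best historical samples (a full version would fit a CART model and propose from its low-cost leaves, with the corresponding proposal-density correction in the acceptance ratio), and all function names and tuning constants are illustrative assumptions.

```python
import numpy as np

def aemc_sketch(f, dim, n_chains=5, n_iters=2000, temp=1.0, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-5.0, 5.0, size=(n_chains, dim))   # parallel chains
    vals = np.array([f(x) for x in pop])
    hist_x, hist_v = [pop.copy()], [vals.copy()]
    step = 0.5
    for t in range(1, n_iters + 1):
        if t % 10 == 0:
            # "Data-mining" move: learn a promising region from the history
            # (a stand-in for AEMC's tree-based model) and propose inside it.
            X, V = np.vstack(hist_x), np.concatenate(hist_v)
            elite = X[np.argsort(V)[: max(10, len(V) // 10)]]
            lo, hi = elite.min(axis=0), elite.max(axis=0)
            proposals = rng.uniform(lo, hi, size=(n_chains, dim))
        else:
            # Mutation move: random-walk Metropolis on each chain.
            proposals = pop + step * rng.standard_normal((n_chains, dim))
        for i in range(n_chains):
            fp = f(proposals[i])
            # Plain Metropolis acceptance for the Boltzmann target exp(-f/temp);
            # a full AEMC implementation would also include the proposal-density
            # correction for the data-mining move.
            if np.log(rng.random()) < (vals[i] - fp) / temp:
                pop[i], vals[i] = proposals[i], fp
        hist_x.append(pop.copy()); hist_v.append(vals.copy())
    flat_v = np.concatenate(hist_v)
    return np.vstack(hist_x)[np.argmin(flat_v)]

# Example: a multimodal test function (Rastrigin).
def rastrigin(x):
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

print(aemc_sketch(rastrigin, dim=5))
```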
2

Adaptive Stochastic Gradient Markov Chain Monte Carlo Methods for Dynamic Learning and Network Embedding

Tianning Dong (14559992) 06 February 2023
Latent variable models are widely used in modern data science for both static and dynamic data. This thesis focuses on large-scale latent variable models formulated for time series data and static network data. The former refers to the state space model for dynamic systems, which models the evolution of latent state variables and the relationship between the latent state variables and observations. The latter refers to a network decoder model, which maps a large network into a low-dimensional space of latent embedding vectors. Both problems can be solved by adaptive stochastic gradient Markov chain Monte Carlo (MCMC), which allows us to simulate the latent variables and estimate the model parameters simultaneously, thus facilitating downstream statistical inference from the data.

For the state space model, the challenge lies in inference for high-dimensional, large-scale, long-series data. Existing algorithms, such as the particle filter and sequential importance samplers, do not scale well to the dimension of the system and the sample size of the dataset, and often suffer from sample degeneracy on long series. To address these issues, the thesis proposes the stochastic approximation Langevinized ensemble Kalman filter (SA-LEnKF) for jointly estimating the states and unknown parameters of the dynamic system, where the parameters are estimated on the fly based on the state variables simulated by the LEnKF under the framework of stochastic approximation MCMC. Under mild conditions, we prove its consistency in parameter estimation and ergodicity in state variable simulation. The proposed algorithm can be used for uncertainty quantification in long-series, large-scale, high-dimensional dynamic systems. Numerical results on simulated datasets and large real-world datasets indicate its superiority over existing algorithms and its great potential for the statistical analysis of complex dynamic systems encountered in modern data science.

For the network embedding problem, an appropriate embedding dimension is hard to determine under the theoretical framework of existing methods, where the embedding dimension is often treated as a tunable hyperparameter or a choice of common practice. The thesis proposes a novel network embedding method with a built-in mechanism for embedding dimension selection. The basic idea is to treat the embedding vectors as latent inputs to a deep neural network (DNN) model. An adaptive stochastic gradient MCMC algorithm then simulates the embedding vectors and estimates the parameters of the DNN model simultaneously. By the theory of sparse deep learning, the embedding dimension can be determined by imposing an appropriate sparsity penalty on the DNN model. Experiments on real-world networks show that the method performs dimension selection while preserving network structure.
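
The following is a minimal sketch of the adaptive stochastic gradient MCMC pattern the abstract describes: latent variables are simulated with a stochastic gradient Langevin (SGLD) step while the model parameter is estimated on the fly with a decaying Robbins-Monro step. The toy random-effects model and all names below are illustrative assumptions, not the thesis's SA-LEnKF code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: y_i = z_i + eps_i,  z_i ~ N(theta, 1),  eps_i ~ N(0, 1).
theta_true, n = 3.0, 2000
z_star = theta_true + rng.standard_normal(n)
y = z_star + rng.standard_normal(n)

z = np.zeros(n)       # latent variables, one per observation
theta = 0.0           # unknown parameter, estimated on the fly
eps = 0.05            # SGLD step size for the latent variables
for t in range(1, 3001):
    idx = rng.choice(n, size=100, replace=False)      # mini-batch of latents
    # Simulation step: SGLD update of the mini-batch of latent variables.
    grad_z = (y[idx] - z[idx]) + (theta - z[idx])     # d/dz log p(z, y | theta)
    z[idx] += 0.5 * eps * grad_z + np.sqrt(eps) * rng.standard_normal(100)
    # Estimation step: Robbins-Monro update of theta given the current latents.
    gamma = 10.0 / (100 + t)                          # decaying gain sequence
    theta += gamma * np.mean(z[idx] - theta)          # d/dtheta log p(z | theta)

print("theta estimate:", theta, "(marginal MLE is about", y.mean(), ")")
```

The fixed point of the estimation step is theta = E[z], which for this toy model coincides with the marginal maximum-likelihood estimate, the sample mean of y.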
3

Slice Sampling with Multivariate Steps

Thompson, Madeleine 11 January 2012
Markov chain Monte Carlo (MCMC) allows statisticians to sample from a wide variety of multidimensional probability distributions. Unfortunately, MCMC is often difficult to use when components of the target distribution are highly correlated or have disparate variances. This thesis presents three results that address this problem. First, it demonstrates a means for graphical comparison of MCMC methods, which allows researchers to compare the behavior of a variety of samplers on a variety of distributions. Second, it presents a collection of new slice-sampling MCMC methods. These methods either adapt globally or use the adaptive crumb framework for sampling with multivariate steps. They perform well with minimal tuning on distributions where popular methods do not. Methods in the first group learn an approximation to the covariance of the target distribution and use its eigendecomposition to take non-axis-aligned steps. Methods in the second group use the gradients at rejected proposals to approximate the local shape of the target distribution so that subsequent proposals move more efficiently through the state space. Finally, the thesis explores how slice sampling with multivariate steps scales with dimension, resulting in a formula for optimally choosing scale tuning parameters. It shows that the scaling of untransformed methods can sometimes be improved by alternating their steps with radial steps based on those of the polar slice sampler.
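
A hedged sketch of the idea behind the first group of methods: estimate the target's covariance from past samples and slice-sample along its eigenvectors rather than along the coordinate axes. This is a simplified illustration, not the thesis's exact algorithms (for instance, the crumb-based methods are not reproduced), and the simple periodic adaptation below ignores the diminishing-adaptation conditions a rigorous version would need.

```python
import numpy as np

def slice_1d(logp, x, direction, w=1.0, rng=None):
    """One univariate slice-sampling update of x along a direction vector,
    using stepping-out and shrinkage (Neal, 2003)."""
    rng = rng or np.random.default_rng()
    logy = logp(x) + np.log(rng.random())           # log-height of the slice
    lo = -w * rng.random()
    hi = lo + w
    while logp(x + lo * direction) > logy: lo -= w  # step out to the left
    while logp(x + hi * direction) > logy: hi += w  # step out to the right
    while True:                                     # shrink until accepted
        u = rng.uniform(lo, hi)
        x_new = x + u * direction
        if logp(x_new) > logy:
            return x_new
        if u < 0: lo = u
        else:     hi = u

def eigen_slice_sampler(logp, x0, n_samples=3000, rng=None):
    rng = rng or np.random.default_rng(0)
    x, d = x0.copy(), len(x0)
    samples = []
    dirs, scales = np.eye(d), np.ones(d)            # warm-up: axis-aligned
    for t in range(n_samples):
        if t >= 100 and t % 100 == 0:
            # Re-estimate the covariance and take non-axis-aligned steps
            # along its eigenvectors, scaled by the eigenvalue square roots.
            cov = np.cov(np.array(samples[-1000:]).T)
            evals, dirs = np.linalg.eigh(cov)
            scales = np.sqrt(np.maximum(evals, 1e-8))
        i = rng.integers(d)
        x = slice_1d(logp, x, dirs[:, i], w=scales[i], rng=rng)
        samples.append(x)
    return np.array(samples)

# Example: a strongly correlated 2-D Gaussian.
Sigma_inv = np.linalg.inv(np.array([[1.0, 0.99], [0.99, 1.0]]))
logp = lambda x: -0.5 * x @ Sigma_inv @ x
draws = eigen_slice_sampler(logp, np.zeros(2))
print("sample covariance:\n", np.cov(draws.T))
```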
4

Bayesian Modeling and Adaptive Monte Carlo with Geophysics Applications

Wang, Jianyu January 2013 (has links)
The first part of the thesis focuses on the development of Bayesian modeling motivated by geophysics applications. In Chapter 2, we model the frequency of pyroclastic flows collected from the Soufriere Hills volcano. Multiple change points within the dataset reveal several limitations of existing methods in the literature. We propose a Bayesian hierarchical model (BBH) that introduces an extra level of hierarchy with hyperparameters, adds a penalty term to constrain close consecutive rates, and uses a mixture prior distribution to better match certain circumstances observed in practice. We end the chapter with a description of the prediction procedure, which is the biggest advantage of the BBH in comparison with other existing methods. In Chapter 3, we develop new statistical techniques to model and relate three complex processes and datasets: the extrusion of magma into the lava dome, the growth of the dome as measured by its height, and rockfalls as an indication of the dome's instability. First, we study the dynamic negative binomial branching process and use it to model the rockfalls. Moreover, a generalized regression model is proposed to regress daily rockfall counts on the extrusion rate and dome height. Furthermore, we solve an inverse problem from the regression model and predict the extrusion rate based on rockfalls and dome height.

The other focus of the thesis is adaptive Markov chain Monte Carlo (MCMC) methods. In Chapter 4, we improve upon the Wang-Landau (WL) algorithm. The WL algorithm is an adaptive sampling scheme that modifies the target distribution to enable the chain to visit low-density regions of the state space. However, the approach relies heavily on a partition of the state space that is left to the user to specify. As a result, implementing and using the algorithm is time-consuming and far from automatic. We propose an automatic, adaptive partitioning scheme that continually refines the initial partition as needed during sampling. We show that this overcomes the limitations of the user-specified input partition, making the algorithm significantly more automatic and user-friendly while making its performance dramatically more reliable and robust. In Chapter 5, we consider the convergence and autocorrelation aspects of MCMC. We propose an Exploration/Exploitation (XX) approach to constructing adaptive MCMC algorithms, which combines adaptation schemes of distinct types. The exploration piece uses adaptation strategies aimed at exploring new regions of the target distribution, improving the rate of convergence to equilibrium. The exploitation piece uses an adaptation component that decreases autocorrelation when sampling among regions already discovered. We demonstrate that the combined XX algorithm significantly outperforms either original algorithm on difficult multimodal sampling problems.
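
For context, here is a compact sketch of the basic Wang-Landau scheme that Chapter 4 builds on: the target is reweighted by adaptively learned log-weights on a fixed energy partition, pushing the chain out of well-visited regions until the energy histogram is roughly flat. The fixed, user-specified grid below is exactly the limitation the thesis removes; the adaptive re-partitioning itself is not reproduced, and the target, bin edges, and schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(x):                       # bimodal target pi(x) ~ exp(-energy(x))
    return min((x - 4.0)**2, (x + 4.0)**2)

edges = np.linspace(0.0, 20.0, 21)   # fixed, user-specified energy partition
log_g = np.zeros(len(edges) - 1)     # adaptive log-weights, one per bin
log_f = 1.0                          # modification factor, reduced over time

def bin_of(e):
    return min(np.searchsorted(edges, e, side='right') - 1, len(log_g) - 1)

x = 0.0
visits = np.zeros_like(log_g)
for stage in range(15):
    for _ in range(20000):
        x_new = x + rng.normal(0.0, 1.0)
        i, j = bin_of(energy(x)), bin_of(energy(x_new))
        # Metropolis step for the reweighted target pi(x) / g_bin(x): bins
        # with large learned weights are penalized, pushing the chain out of
        # already well-explored energy levels.
        if np.log(rng.random()) < (energy(x) - energy(x_new)) + (log_g[i] - log_g[j]):
            x, i = x_new, j
        log_g[i] += log_f            # raise the weight of the occupied bin
        visits[i] += 1
    if visits.std() < 0.25 * visits.mean():   # histogram roughly flat?
        log_f *= 0.5                 # anneal the modification (f -> sqrt(f))
        visits[:] = 0.0

print("learned log-weights:", np.round(log_g - log_g.min(), 1))
```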
5

Towards adaptive learning and inference: applications to hyperparameter tuning and astroparticle physics

Bardenet, Rémi 19 November 2012

Inference and optimization algorithms usually have hyperparameters that must be tuned to achieve efficiency. We consider different approaches to automating the hyperparameter tuning step by learning the structure of the addressed problem online. The first half of this thesis is devoted to hyperparameter tuning in machine learning. After presenting and improving the generic sequential model-based optimization (SMBO) framework, we show that SMBO applies successfully to the task of tuning the numerous hyperparameters of deep belief networks. We then propose an algorithm that performs tuning across datasets, mimicking the memory that humans have of past experiments with the same algorithm on different datasets. The second half of this thesis deals with adaptive Markov chain Monte Carlo (MCMC) algorithms: sampling-based algorithms that explore complex probability distributions while tuning their internal parameters on the fly. We start by describing the Pierre Auger observatory, a large-scale particle physics experiment dedicated to the observation of atmospheric showers triggered by cosmic rays. The models involved in the analysis of Auger data motivated our study of adaptive MCMC. We derive the first part of the Auger generative model and introduce a procedure for inferring shower parameters that requires only this bottom part. Our model inherently suffers from label switching, a common difficulty in MCMC inference that makes marginal inference useless because of redundant modes in the target distribution. After reviewing existing solutions to label switching, we propose AMOR, the first adaptive MCMC algorithm with online relabeling. We give both an empirical and a theoretical study of AMOR, unveiling interesting links between relabeling algorithms and vector quantization.
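
As a hedged illustration of the SMBO framework mentioned above: fit a Gaussian process surrogate to the (hyperparameter, loss) pairs seen so far, then evaluate next wherever expected improvement is highest. The kernel, candidate grid, and toy objective below are assumptions for the sketch, not the thesis's improved SMBO variants.

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, ls=0.3):
    # Squared-exponential kernel on 1-D inputs.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def smbo(objective, bounds=(0.0, 1.0), n_init=3, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(*bounds, n_init)                 # initial random designs
    y = np.array([objective(x) for x in X])
    cand = np.linspace(*bounds, 500)                 # candidate grid
    for _ in range(n_iters):
        # GP posterior on the candidates (zero mean, small noise nugget).
        K = rbf(X, X) + 1e-6 * np.eye(len(X))
        Ks = rbf(cand, X)
        mu = Ks @ np.linalg.solve(K, y)
        v = np.linalg.solve(K, Ks.T)
        sd = np.sqrt(np.maximum(1.0 - np.einsum('ij,ji->i', Ks, v), 1e-12))
        # Expected improvement over the best loss so far (minimization).
        gamma = (y.min() - mu) / sd
        ei = sd * (gamma * norm.cdf(gamma) + norm.pdf(gamma))
        x_next = cand[np.argmax(ei)]
        X = np.append(X, x_next)
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)], y.min()

# Toy "hyperparameter": minimize a wiggly 1-D validation-loss stand-in.
loss = lambda lr: np.sin(12.0 * lr) * lr + 0.5 * lr ** 2
best_x, best_y = smbo(loss)
print("best hyperparameter:", best_x, "loss:", best_y)
```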
6

Thesis_deposit.pdf

Sehwan Kim (15348235) 25 April 2023
Adaptive MCMC is advantageous over traditional MCMC because it automatically adjusts its proposal distributions during sampling, improving sampling efficiency and speeding convergence to the target distribution, especially in complex or high-dimensional problems. However, the adaptive scheme must be designed and validated carefully to ensure the algorithm's validity and to prevent the introduction of bias. This dissertation focuses on the use of adaptive MCMC for deep learning, specifically addressing the mode collapse issue in Generative Adversarial Networks (GANs) and the implementation of fiducial inference, and on its application to causal inference in individual treatment effect (ITE) problems.

First, the GAN was recently introduced in the literature as a novel machine learning method for training generative models. However, GANs are very difficult to train because of mode collapse, i.e., a lack of diversity among the generated data. We identify the reason GANs suffer from this issue and lay out a new theoretical framework for GANs based on randomized decision rules under which the mode collapse issue can essentially be overcome. Under the new framework, the discriminator converges to a fixed point while the generator converges to a distribution at the Nash equilibrium.

Second, fiducial inference has generally been regarded as R.A. Fisher's big blunder, but the goal he initially set, namely making inference about the uncertainty of model parameters on the basis of observations, has been continually pursued by many statisticians. By leveraging advanced statistical computing techniques such as stochastic approximation Markov chain Monte Carlo, we develop a new statistical inference method, called extended fiducial inference, which achieves the original goal of fiducial inference.

Lastly, estimating the ITE is important for decision making in various fields, particularly in health research, where precision medicine is being investigated. The conditional average treatment effect (CATE) is often used for this purpose, but uncertainty quantification and an explanation of the variability of the predicted ITE are still needed for fair decision making. We discuss the use of extended fiducial inference to construct prediction intervals for the ITE and introduce a double neural network algorithm for efficient prediction and estimation of nonlinear ITEs.
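
A toy sketch of the randomized-decision-rule idea described above: the discriminator is trained toward a fixed point by plain gradient ascent, while the generator's weights are sampled with an SGLD-style update, so the final generator is a mixture over the collected weight snapshots. The linear generator, quadratic-feature logistic discriminator, and noise temperature are illustrative assumptions, not the dissertation's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(3.0, 1.0, 5000)            # real data: N(3, 1)
theta = np.array([0.0, 1.0])                 # generator: x = th0 + th1 * z
phi = np.zeros(3)                            # discriminator: sigmoid(phi . [1, x, x^2])

sig = lambda t: 1.0 / (1.0 + np.exp(-np.clip(t, -30, 30)))
feats = lambda x: np.stack([np.ones_like(x), x, x**2], axis=1)
pool = []

for t in range(10000):
    xr = rng.choice(real, 128)
    z = rng.standard_normal(128)
    fake = theta[0] + theta[1] * z
    # Discriminator: plain gradient ascent toward its fixed point.
    Dr, Df = sig(feats(xr) @ phi), sig(feats(fake) @ phi)
    grad_phi = feats(xr).T @ (1.0 - Dr) / 128 - feats(fake).T @ Df / 128
    phi += 0.05 * grad_phi
    # Generator: SGLD-style update -- gradient of the non-saturating
    # objective plus injected noise, so theta is sampled rather than optimized.
    dDdx = phi[1] + 2.0 * phi[2] * fake      # d(phi . feats(x))/dx
    g = (1.0 - Df) * dDdx                    # d log D(fake) / d fake
    grad_theta = np.array([g.mean(), (g * z).mean()])
    a = 2e-3                                 # step size; the noise scale below
    noise = np.sqrt(2 * a / 128) * rng.standard_normal(2)   # is an illustrative temperature choice
    theta += a * grad_theta + noise
    if t > 6000 and t % 20 == 0:
        pool.append(theta.copy())            # snapshots form the mixture generator

# Generate from the mixture of generators at (approximate) equilibrium.
ths = np.array(pool)
pick = ths[rng.integers(len(ths), size=2000)]
out = pick[:, 0] + pick[:, 1] * rng.standard_normal(2000)
print("generated mean/std:", out.mean(), out.std())
```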
