11

Advancements on the Interface of Computer Experiments and Survival Analysis

Wang, Yueyao 20 July 2022 (has links)
Design and analysis of computer experiments is an area focusing on efficient data collection (e.g., space-filling designs), surrogate modeling (e.g., Gaussian process models), and uncertainty quantification. Survival analysis focuses on modeling the period of time until a certain event happens. Data collection, prediction, and uncertainty quantification are also fundamental in survival models. In this dissertation, the proposed methods are motivated by a wide range of real-world applications, including high-performance computing (HPC) variability data, jet engine reliability data, Titan GPU lifetime data, and pine tree survival data. This dissertation explores interfaces between computer experiments and survival analysis through the above applications. Chapter 1 provides a general introduction to computer experiments and survival analysis. Chapter 2 focuses on the HPC variability management application. We investigate the applicability of space-filling designs and statistical surrogates in the HPC variability management setting, in terms of design efficiency, prediction accuracy, and scalability. A comprehensive comparison of the design strategies and predictive methods is conducted to study the combinations' performance in prediction accuracy. Chapter 3 focuses on the reliability prediction application. With the availability of multi-channel sensor data, a single degradation index is needed for compatibility with most existing models. We propose a flexible framework for multi-sensor data that models the nonlinear relationship between sensors and the degradation process. We also incorporate automatic variable selection to exclude sensors that have no effect on the underlying degradation process. Chapter 4 investigates inference approaches for spatial survival analysis under the Bayesian framework. The performance of Markov chain Monte Carlo (MCMC) approaches and variational inference is studied for two survival models, the cumulative exposure model and the proportional hazards (PH) model. The Titan GPU data and pine tree survival data are used to illustrate the capability of variational inference on spatial survival models. Chapter 5 provides some general conclusions. / Doctor of Philosophy / This dissertation focuses on three projects related to computer experiments and survival analysis. Design and analysis of computer experiments is an area focusing on efficient data collection, building predictive models, and uncertainty quantification. Survival analysis focuses on modeling the period of time until a certain event happens. Data collection, prediction, and uncertainty quantification are also fundamental in survival models. Thus, this dissertation aims to explore interfaces between computer experiments and survival analysis with real-world applications. High-performance computing (HPC) systems aggregate a large number of computers to achieve high computing speed. The first project investigates the applicability of space-filling designs and statistical predictive models in the HPC variability management setting, in terms of design efficiency, prediction accuracy, and scalability. A comprehensive comparison of the design strategies and predictive methods is conducted to study the combinations' performance in prediction accuracy. The second project focuses on building a degradation index that describes a product's underlying degradation process. With the availability of multi-channel sensor data, a single degradation index is needed for compatibility with most existing models. We propose a flexible framework for multi-sensor data that models the nonlinear relationship between sensors and the degradation process. We also incorporate automatic variable selection to exclude sensors that have no effect on the underlying degradation process. Spatial survival data arise when survival data are collected over a spatial region. The third project studies inference approaches for spatial survival analysis under the Bayesian framework. The performance of the commonly used Markov chain Monte Carlo (MCMC) approach and of the approximate inference approach, variational inference, is studied for two survival models. The Titan GPU data and pine tree survival data are used to illustrate the capability of variational inference on spatial survival models.
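As an illustration of the design-plus-surrogate pipeline described for the HPC variability study, the following is a minimal numpy sketch, not taken from the dissertation: a random Latin hypercube design (a simple space-filling design) with a Gaussian process surrogate whose predictive standard deviation quantifies uncertainty. The response function, kernel settings, and design size are arbitrary stand-ins.

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """n points in [0, 1]^d with one point per stratum in every coordinate."""
    samples = np.empty((n, d))
    for j in range(d):
        samples[:, j] = (rng.permutation(n) + rng.random(n)) / n
    return samples

def rbf_kernel(A, B, length=0.2, var=1.0):
    """Squared-exponential covariance between two point sets."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * sq / length**2)

def gp_predict(X, y, Xnew, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf_kernel(Xnew, X)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(rbf_kernel(Xnew, Xnew)) - (v**2).sum(axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

rng = np.random.default_rng(0)
X = latin_hypercube(30, 2, rng)            # space-filling design over two input factors
y = np.sin(6 * X[:, 0]) + X[:, 1] ** 2     # stand-in for an HPC variability response
Xnew = rng.random((5, 2))                  # new configurations to predict
mu, sd = gp_predict(X, y, Xnew)
print(np.c_[mu, sd])                       # predictions with uncertainty
```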
12

Bayesian Modeling of Complex High-Dimensional Data

Huo, Shuning 07 December 2020 (has links)
With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional complex data in different forms, such as medical images and genomics measurements. However, acquisition of more data does not automatically lead to better knowledge discovery. One needs efficient and reliable analytical tools to extract useful information from complex datasets. The main objective of this dissertation is to develop innovative Bayesian methodologies to enable effective and efficient knowledge discovery from complex high-dimensional data. It contains two parts: the development of computationally efficient functional mixed models and the modeling of data heterogeneity via Dirichlet Diffusion Trees. The first part focuses on tackling the computational bottleneck in Bayesian functional mixed models. We propose a computational framework called the variational functional mixed model (VFMM). This new method facilitates efficient data compression and high-performance computing in basis space. We also propose a new multiple testing procedure in basis space, which can be used to detect significant local regions. The effectiveness of the proposed model is demonstrated through two datasets, a mass spectrometry dataset in a cancer study and a neuroimaging dataset in an Alzheimer's disease study. The second part is about modeling data heterogeneity using Dirichlet Diffusion Trees. We propose a Bayesian latent tree model that incorporates covariates of subjects to characterize the heterogeneity and uncover the latent tree structure underlying the data. This innovative model may reveal the hierarchical evolution process through branch structures and estimate systematic differences between groups of samples. We demonstrate the effectiveness of the model through a simulation study and real brain tumor data. / Doctor of Philosophy / With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional data in different forms, such as engineering signals, medical images, and genomics measurements. However, acquisition of such data does not automatically lead to efficient knowledge discovery. The main objective of this dissertation is to develop novel Bayesian methods to extract useful knowledge from complex high-dimensional data. It has two parts: the development of an ultra-fast functional mixed model and the modeling of data heterogeneity via Dirichlet Diffusion Trees. The first part focuses on developing approximate Bayesian methods in functional mixed models to estimate parameters and detect significant regions. Two datasets demonstrate the effectiveness of the proposed method: a mass spectrometry dataset in a cancer study and a neuroimaging dataset in an Alzheimer's disease study. The second part focuses on modeling data heterogeneity via Dirichlet Diffusion Trees. The method helps uncover the underlying hierarchical tree structures and estimate systematic differences between groups of samples. We demonstrate the effectiveness of the method through brain tumor imaging data.
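The abstract's central computational idea, working in a compressed basis space rather than on raw functional observations, can be illustrated with a small sketch. This is not the VFMM itself; it only shows, under assumed settings, how a set of curves can be represented by a handful of orthonormal cosine-basis coefficients, on which downstream mixed-model computations would then operate.

```python
import numpy as np

def cosine_basis(n_points, n_basis):
    """Orthonormal cosine (DCT-II) basis on a regular grid, a stand-in for wavelets or splines."""
    t = (np.arange(n_points) + 0.5) / n_points
    B = np.cos(np.pi * np.outer(t, np.arange(n_basis)))
    B[:, 0] /= np.sqrt(2.0)
    return B * np.sqrt(2.0 / n_points)

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 256)
curves = np.sin(2 * np.pi * grid) + 0.1 * rng.standard_normal((50, 256))  # 50 noisy functional observations

B = cosine_basis(256, 20)
coefs = curves @ B          # each 256-point curve compressed to 20 basis coefficients
recon = coefs @ B.T         # mixed-model fitting and testing would run on `coefs`, not on raw curves
print(np.mean((curves - recon) ** 2))   # small reconstruction error despite 12x compression
```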
13

Bayesian Neural Networks for Financial Asset Forecasting / Bayesianska neurala nätverk för prediktion av finansiella tillgångar

Back, Alexander, Keith, William January 2019 (has links)
Neural networks are powerful tools for modelling complex non-linear mappings, but they often suffer from overfitting and provide no measures of uncertainty in their predictions. Bayesian techniques are proposed as a remedy to these problems, as these both regularize and provide an inherent measure of uncertainty from their posterior predictive distributions. By quantifying predictive uncertainty, we attempt to improve a systematic trading strategy by scaling positions with uncertainty. Exact Bayesian inference is often impossible, and approximate techniques must be used. For this task, this thesis compares dropout, variational inference and Markov chain Monte Carlo. We find that dropout and variational inference provide powerful regularization techniques, but their predictive uncertainties cannot improve a systematic trading strategy. Markov chain Monte Carlo provides powerful regularization as well as promising estimates of predictive uncertainty that are able to improve a systematic trading strategy. However, Markov chain Monte Carlo suffers from an extreme computational cost in the high-dimensional setting of neural networks.
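To make the dropout-based uncertainty idea concrete, here is a small illustrative numpy sketch, not the thesis's model: dropout is kept active at prediction time, the spread of repeated stochastic forward passes serves as the predictive uncertainty, and a toy position-sizing rule scales the trade down as uncertainty grows. The network weights, dropout rate, and scaling constant are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
W1, b1 = rng.standard_normal((8, 16)) * 0.3, np.zeros(16)   # toy, untrained weights
W2, b2 = rng.standard_normal((16, 1)) * 0.3, np.zeros(1)

def mc_dropout_predict(x, n_samples=200, p_drop=0.5):
    """Keep dropout active at test time; the spread over passes is the uncertainty estimate."""
    preds = []
    for _ in range(n_samples):
        h = np.maximum(x @ W1 + b1, 0.0)
        mask = rng.random(h.shape) > p_drop          # random dropout mask per forward pass
        h = h * mask / (1.0 - p_drop)
        preds.append((h @ W2 + b2).item())
    return np.mean(preds), np.std(preds)

x = rng.standard_normal((1, 8))                      # one vector of lagged returns / features
mu, sigma = mc_dropout_predict(x)
position = np.sign(mu) / (1.0 + 5.0 * sigma)         # scale the position down when uncertainty is high
print(mu, sigma, position)
```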
14

Deep generative models for natural language processing

Miao, Yishu January 2017 (has links)
Deep generative models are essential to Natural Language Processing (NLP) due to their outstanding ability to use unlabelled data, to incorporate abundant linguistic features, and to learn interpretable dependencies among data. As the structure becomes deeper and more complex, having an effective and efficient inference method becomes increasingly important. In this thesis, neural variational inference is applied to carry out inference for deep generative models. While traditional variational methods derive an analytic approximation for the intractable distributions over latent variables, here we construct an inference network conditioned on the discrete text input to provide the variational distribution. Powerful neural networks are able to approximate complicated non-linear distributions and open up possibilities for more interesting and complicated generative models. Therefore, we develop the potential of neural variational inference and apply it to a variety of models for NLP with continuous or discrete latent variables. This thesis is divided into three parts. Part I introduces a generic variational inference framework for generative and conditional models of text. For continuous or discrete latent variables, we apply a continuous reparameterisation trick or the REINFORCE algorithm to build low-variance gradient estimators. To further explore Bayesian non-parametrics in deep neural networks, we propose a family of neural networks that parameterise categorical distributions with continuous latent variables. Using the stick-breaking construction, an unbounded categorical distribution is incorporated into our deep generative models, which can be optimised by stochastic gradient back-propagation with a continuous reparameterisation. Part II explores continuous latent variable models for NLP. Chapter 3 discusses the Neural Variational Document Model (NVDM): an unsupervised generative model of text which aims to extract a continuous semantic latent variable for each document. In Chapter 4, the neural topic models modify the neural document models by parameterising categorical distributions with continuous latent variables, where the topics are explicitly modelled by discrete latent variables. The models are further extended to neural unbounded topic models with the help of the stick-breaking construction, and a truncation-free variational inference method is proposed based on a Recurrent Stick-breaking construction (RSB). Chapter 5 describes the Neural Answer Selection Model (NASM) for learning a latent stochastic attention mechanism to model the semantics of question-answer pairs and predict their relatedness. Part III discusses discrete latent variable models. Chapter 6 introduces latent sentence compression models. The Auto-encoding Sentence Compression Model (ASC), as a discrete variational auto-encoder, generates a sentence by a sequence of discrete latent variables representing explicit words. The Forced Attention Sentence Compression Model (FSC) incorporates a combined pointer network biased towards the usage of words from the source sentence, which significantly improves the performance when jointly trained with the ASC model in a semi-supervised learning fashion. Chapter 7 describes the Latent Intention Dialogue Models (LIDM) that employ a discrete latent variable to learn underlying dialogue intentions.
Additionally, the latent intentions can be interpreted as actions guiding the generation of machine responses, which could be further refined autonomously by reinforcement learning. Finally, Chapter 8 summarizes our findings and directions for future work.
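The continuous reparameterisation trick and the resulting single-sample ELBO estimator that Part I builds on can be sketched in a few lines of numpy. This is an illustrative toy with made-up dimensions and random stand-in parameters (inference-network outputs mu and log_var, a decoder matrix W, and toy word counts), not code from the thesis; it only shows the z = mu + sigma * eps sampling step and the closed-form Gaussian KL term.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Stand-ins for learned quantities: inference-network outputs for one bag-of-words document.
mu, log_var = rng.standard_normal(16), -1.0 + 0.1 * rng.standard_normal(16)
W = rng.standard_normal((16, 2000)) * 0.05      # decoder: latent vector -> vocabulary logits
doc_counts = rng.poisson(0.01, size=2000)       # toy word counts for one document

def elbo_one_sample():
    """Single-sample ELBO with the continuous reparameterisation trick (NVDM-style)."""
    eps = rng.standard_normal(16)
    z = mu + np.exp(0.5 * log_var) * eps         # z = mu + sigma * eps, differentiable in (mu, sigma)
    log_lik = np.dot(doc_counts, np.log(softmax(z @ W) + 1e-12))
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)   # KL(q || N(0, I)) in closed form
    return log_lik - kl

print(np.mean([elbo_one_sample() for _ in range(10)]))
```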
15

On improving variational inference with low-variance multi-sample estimators

Dhekane, Eeshan Gunesh 08 1900 (has links)
Advances in variational inference (VI), such as variational autoencoders (Kingma and Welling (2013); Rezende et al. (2014)) along with their numerous modifications, have proven highly successful for learning latent representations of data. Importance-weighted variational inference (IWVI) by Burda et al. (2015) improves on variational inference by using multiple i.i.d. samples to obtain tighter variational lower bounds. Recent works like hierarchical importance-weighted autoencoders (HIWVI) by Huang et al. (2019) and joint distribution modeling by Klys et al. (2018) demonstrate the idea of modeling a joint distribution over samples to further improve over IWVI by making it sample-efficient. The underlying idea in this thesis is to connect the statistical properties of the estimators to the tightness of the variational bounds. Towards this, we first demonstrate an upper bound on the variational gap in terms of the variance of the estimators under certain conditions. We prove that the variational gap can be made to vanish at the rate of O(1/n) for a large family of VI approaches. 
Based on these results, we propose the approach of Conditional-IWVI (CIWVI), which explicitly models the sequential and conditional sampling of latent variables to perform importance-weighted variational inference, and a related approach of Antithetic-IWVI (AIWVI) by Klys et al. (2018). Our experiments on the benchmarking datasets MNIST (LeCun et al. (2010)) and OMNIGLOT (Lake et al. (2015)) demonstrate that our approaches perform either competitively with or better than the baselines IWVI and HIWVI as the number of samples increases. Further, we demonstrate that the results are in accordance with the theoretical properties we proved. In conclusion, our work provides a perspective on the rate of improvement in VI with the number of samples used and the utility of modeling the joint distribution over latent representations for sample efficiency in VI.
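The tightening of the importance-weighted bound with the number of samples can be checked numerically on a toy conjugate model where the exact log marginal likelihood is available. The sketch below is illustrative only: the model, the deliberately imperfect variational distribution, and the sample sizes are assumptions rather than the thesis's experiments, but the printed gap shrinking roughly like 1/n mirrors the rate discussed above.

```python
import numpy as np

rng = np.random.default_rng(4)
x = 1.3
# Toy model: z ~ N(0, 1), x | z ~ N(z, 1); hence the exact log p(x) = log N(x; 0, 2).
true_log_px = -0.5 * (np.log(2 * np.pi * 2.0) + x**2 / 2.0)

def log_norm(v, m, s):
    return -0.5 * (np.log(2 * np.pi * s**2) + (v - m) ** 2 / s**2)

q_mu, q_sd = x / 2 + 0.2, 0.9      # a deliberately imperfect variational posterior q(z | x)

def iw_bound(n, reps=2000):
    """Importance-weighted bound L_n = E[log (1/n) sum_i w_i], estimated over `reps` replications."""
    z = rng.normal(q_mu, q_sd, size=(reps, n))
    log_w = log_norm(z, 0.0, 1.0) + log_norm(x, z, 1.0) - log_norm(z, q_mu, q_sd)
    m = log_w.max(axis=1, keepdims=True)                       # stabilised log-sum-exp
    return np.mean(m.squeeze() + np.log(np.mean(np.exp(log_w - m), axis=1)))

for n in (1, 5, 50, 500):
    print(n, true_log_px - iw_bound(n))    # variational gap shrinks roughly like 1/n
```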
16

Computational Gene Expression Deconvolution

Otto, Dominik 23 August 2021 (has links)
Technologies such as micro-expression arrays and high-throughput sequencing assays have accelerated research of genetic transcription in biological cells. Furthermore, many links between the gene expression levels and the phenotypic characteristics of cells have been discovered. Our current understanding of transcriptomics as an intermediate regulatory layer between genomics and proteomics raises hope that we will soon be able to decipher many more cellular mechanisms through the exploration of gene transcription. However, although large amounts of expression data are measured, only limited information can be extracted. One general problem is the large set of considered genomic features. Expression levels are often analyzed individually because of limited computational resources and unknown statistical dependencies among the features. This leads to multiple testing issues or can lead to overfitting models, commonly referred to as the “curse of dimensionality.” Another problem can arise from ignorance of measurement uncertainty. In particular, approaches that consider statistical significance can suffer from underestimating uncertainty for weakly expressed genes and consequently require subjective manual measures to produce consistent results (e.g., domain-specific gene filters). In this thesis, we lay out a theoretical foundation for a Bayesian interpretation of gene expression data based on subtle assumptions. Expression measurements are related to latent information (e.g., the transcriptome composition), which we formulate as a probability distribution that represents the uncertainty over the composition of the original sample. Instead of analyzing univariate gene expression levels, we use the multivariate transcriptome composition space. To realize computational feasibility, we develop a scalable dimensional reduction that aims to produce the best approximation that can be used with the computational resources available. To enable the deconvolution of gene expression, we describe subtissue-specific probability distributions of expression profiles. 
We demonstrate the suitability of our approach with two deconvolution applications: first, we infer the composition of immune cells, and second, we reconstruct tumor-specific expression patterns from bulk RNA-seq data of prostate tumor tissue samples. Contents (chapter level): 1 Introduction; 2 Notation and Abbreviations; 3 Methods; 4 Prior and Conditional Probabilities; 5 Feasibility via Dimensional Reduction; 6 Data for Example Application; 7 Bayesian Models in Example Applications; 8 Results of Example Applications; 9 Discussion; A Appendix; B Proofs; Bibliography.
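The "convolution assumption" underlying such deconvolution, bulk expression as a proportion-weighted mixture of cell-type-specific profiles, can be illustrated with a deliberately simple point-estimate baseline. The sketch below uses non-negative least squares on simulated data; the signature matrix, noise level, and true proportions are invented for illustration, and this is not the Bayesian model developed in the thesis.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(5)
n_genes, n_cell_types = 500, 4
signatures = rng.gamma(2.0, 1.0, size=(n_genes, n_cell_types))   # reference expression profiles per cell type
true_props = np.array([0.5, 0.3, 0.15, 0.05])

# The convolution assumption: observed bulk expression is a mixture of cell-type profiles plus noise.
bulk = signatures @ true_props + 0.05 * rng.standard_normal(n_genes)

coef, _ = nnls(signatures, bulk)          # non-negative mixture coefficients
est_props = coef / coef.sum()             # renormalise onto the simplex
print(np.round(est_props, 3), true_props)
```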
17

Variational Inference for Data-driven Stochastic Programming

Prateek Jaiswal (11210091) 30 July 2021 (has links)
Stochastic programs are standard models for decision-making under uncertainty and have been extensively studied in the operations research literature. In general, stochastic programming involves minimizing an expected cost function, where the expectation is with respect to fully specified stochastic models that quantify the aleatoric or 'inherent' uncertainty in the decision-making problem. In practice, however, the stochastic models are unknown but can be estimated from data, introducing an additional epistemic uncertainty into the decision-making problem. The Bayesian framework provides a coherent way to quantify the epistemic uncertainty through the posterior distribution by combining prior beliefs of the decision-makers with the observed data. Bayesian methods have been used for data-driven decision-making in various applications such as inventory management, portfolio design, machine learning, optimal scheduling, and staffing.

Bayesian methods are challenging to implement, mainly because the posterior is computationally intractable, necessitating the computation of approximate posteriors. Broadly speaking, there are two families of methods in the literature for approximate posterior inference. The first comprises sampling-based methods such as Markov chain Monte Carlo. Sampling-based methods are theoretically well understood, but they suffer from issues such as high variance and poor scalability to high-dimensional problems, and they have complex diagnostics. Consequently, we propose to use optimization-based methods, collectively known as variational inference (VI), that use information projections to compute an approximation to the posterior. Empirical studies have shown that VI methods are computationally faster and easily scalable to higher-dimensional problems and large datasets. However, the theoretical guarantees of these methods are not well understood. Moreover, VI methods are empirically and theoretically less explored in the decision-theoretic setting.

In this thesis, we first propose a novel VI framework for risk-sensitive data-driven decision-making, which we call risk-sensitive variational Bayes (RSVB). In RSVB, we jointly compute a risk-sensitive approximation to the 'true' posterior and the optimal decision by solving a minimax optimization problem. The RSVB framework includes the naive approach of first computing a VI approximation to the true posterior and then using it in place of the true posterior for decision-making. We show that the RSVB approximate posterior and the corresponding optimal value and decision rules are asymptotically consistent, and we also compute their rates of convergence. We illustrate our theoretical findings in both parametric and nonparametric settings with the help of three examples: the single- and multi-product newsvendor models and Gaussian process classification. Second, we present the Bayesian joint chance-constrained stochastic program (BJCCP) for modeling decision-making problems with epistemically uncertain constraints. We discover that using VI methods for posterior approximation can ensure the convexity of the feasible set in (BJCCP), unlike sampling-based methods, and we thus propose a VI approximation for (BJCCP). We also show that the optimal value computed using the VI approximation of (BJCCP) is statistically consistent. Moreover, we derive the rate of convergence of the optimal value and compute the rate at which a VI approximate solution of (BJCCP) is feasible under the true constraints. We demonstrate the utility of our approach on an optimal staffing problem for an M/M/c queue. Finally, this thesis also contributes to the growing literature on understanding the statistical performance of VI methods. In particular, we establish the frequentist consistency of an approximate posterior computed using a well-known VI method that approximates the posterior distribution by minimizing the Renyi divergence from the 'true' posterior.
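As a concrete, simplified instance of the data-driven newsvendor setting mentioned above, the following sketch uses a conjugate Gamma-Poisson posterior in place of a variational approximation (the posterior here is exact and merely stands in for the RSVB/VI approximation) and picks the order quantity at the classic critical fractile of the posterior-predictive demand distribution. The prices, prior hyperparameters, and demand data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
demand_data = rng.poisson(20, size=15)           # small demand sample: source of epistemic uncertainty

# Conjugate Gamma(a0, b0) prior on a Poisson demand rate; exact posterior stands in for a VI approximation.
a0, b0 = 2.0, 0.1
a_post, b_post = a0 + demand_data.sum(), b0 + len(demand_data)

price, cost = 5.0, 3.0
critical_ratio = (price - cost) / price          # newsvendor fractile with zero salvage value

# Posterior-predictive demand draws: rate uncertainty is integrated into the ordering decision.
rates = rng.gamma(a_post, 1.0 / b_post, size=20000)
demand_pred = rng.poisson(rates)
order_qty = np.quantile(demand_pred, critical_ratio)
print(order_qty)
```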
18

Bayesian Sparse Regression with Application to Data-driven Understanding of Climate

Das, Debasish January 2015 (has links)
Sparse regressions based on constraining the L1-norm of the coefficients became popular because of their ability to handle high-dimensional data, unlike regular regressions, which suffer from overfitting and model-identifiability issues, especially when the sample size is small. They are often the method of choice in many fields of science and engineering for simultaneously selecting covariates and fitting parsimonious linear models that are better generalizable and easily interpretable. However, significant challenges may be posed by the need to accommodate extremes and other domain constraints such as dynamical relations among variables, spatial and temporal constraints, the need to provide uncertainty estimates, and feature correlations, among others. We adopted a hierarchical Bayesian version of the sparse regression framework and exploited its inherent flexibility to accommodate these constraints. We applied sparse regression to the feature selection problem of statistical downscaling of climate variables, with particular focus on their extremes. This is important for many impact studies where climate change information is required at a spatial scale much finer than that provided by global or regional climate models. Characterizing the dependence of extremes on covariates can help in the identification of plausible causal drivers and inform extremes downscaling. We propose a general-purpose sparse Bayesian framework for covariate discovery that accommodates the non-Gaussian distribution of extremes within a hierarchical Bayesian sparse regression model. We obtain posteriors over regression coefficients, which indicate the dependence of extremes on the corresponding covariates and provide uncertainty estimates, using a variational Bayes approximation. The method is applied to selecting informative atmospheric covariates at multiple spatial scales, as well as indices of large-scale circulation and global warming, related to the frequency of precipitation extremes over the continental United States. Our results confirm dependence relations that may be expected from known precipitation physics and generate novel insights that can inform physical understanding. We plan to extend our model to discover covariates for extreme intensity in the future. We further extend our framework to handle the dynamic relationships among climate variables using a nonparametric Bayesian mixture of sparse regression models based on the Dirichlet Process (DP). The extended model can achieve simultaneous clustering and discovery of covariates within each cluster. Moreover, a priori knowledge about associations between pairs of data points is incorporated into the model through must-link constraints on a Markov Random Field (MRF) prior. A scalable and efficient variational Bayes approach is developed to infer posteriors over regression coefficients and cluster variables. / Computer and Information Science
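For readers unfamiliar with how a Bayesian sparse regression prunes covariates, the sketch below runs the classic automatic relevance determination (ARD) updates of sparse Bayesian learning on simulated data. This is a simplified stand-in, not the thesis's hierarchical model for extremes: per-coefficient precisions grow without bound for irrelevant covariates, leaving a sparse set of active ones with posterior uncertainty captured in the remaining covariance.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 200, 30
X = rng.standard_normal((n, p))
w_true = np.zeros(p)
w_true[[2, 7, 11]] = [1.5, -2.0, 0.8]                 # only three covariates actually matter
y = X @ w_true + 0.3 * rng.standard_normal(n)

# Sparse Bayesian learning (ARD): per-coefficient precisions alpha prune irrelevant covariates.
alpha, beta = np.ones(p), 1.0
for _ in range(200):
    Sigma = np.linalg.inv(beta * X.T @ X + np.diag(alpha))   # posterior covariance of coefficients
    m = beta * Sigma @ X.T @ y                               # posterior mean of coefficients
    gamma = 1.0 - alpha * np.diag(Sigma)                     # effective degrees of freedom per coefficient
    alpha = gamma / (m**2 + 1e-12)
    beta = (n - gamma.sum()) / np.sum((y - X @ m) ** 2)      # noise precision update

active = alpha < 1e3
print(np.where(active)[0])              # surviving covariates (should recover 2, 7, 11)
print(np.round(m[active], 2))           # their posterior means
```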
19

Improved training of generative models

Goyal, Anirudh 11 1900 (has links)
No description available.
20

Inference on Markov random fields : methods and applications

Lienart, Thibaut January 2017 (has links)
This thesis considers the problem of performing inference on undirected graphical models with continuous state spaces. These models represent conditional independence structures that can appear in the context of Bayesian machine learning. In the thesis, we focus on computational methods and applications. The aim of the thesis is to demonstrate that the factorisation structure corresponding to the conditional independence structure present in high-dimensional models can be exploited to decrease the computational complexity of inference algorithms. First, we consider the smoothing problem on Hidden Markov Models (HMMs) and discuss novel algorithms that have sub-quadratic computational complexity in the number of particles used. We show they perform on par with existing state-of-the-art algorithms that have quadratic complexity. Further, a novel class of rejection-free samplers for graphical models known as the Local Bouncy Particle Sampler (LBPS) is explored and applied to a very large instance of the Probabilistic Matrix Factorisation (PMF) problem. We show the method performs slightly better than Hamiltonian Monte Carlo (HMC) methods. It is also the first practical application of the method to a statistical model with hundreds of thousands of dimensions. In the second part of the thesis, we consider approximate Bayesian inference methods and in particular the Expectation Propagation (EP) algorithm. We show it can be applied as the backbone of a novel distributed Bayesian inference mechanism. Further, we discuss novel variants of the EP algorithm and show that a specific type of update mechanism, analogous to the mirror descent algorithm, outperforms all existing variants and is robust to Monte Carlo noise. Lastly, we show that EP can be used to help the Particle Belief Propagation (PBP) algorithm form cheap and adaptive proposals and significantly outperform classical PBP.
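To ground the smoothing discussion, here is a small numpy sketch of a standard baseline on a toy linear-Gaussian HMM: a bootstrap particle filter followed by forward-filtering backward-smoothing (FFBSm) marginal weights. The pairwise transition-density sums make this baseline O(N^2) in the number of particles, which is the cost that the sub-quadratic algorithms referenced above aim to avoid. Model parameters and sizes are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(8)
T, N = 50, 300
phi, q_sd, r_sd = 0.9, 0.5, 0.7

# Simulate a toy linear-Gaussian state-space model.
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + q_sd * rng.standard_normal()
y = x + r_sd * rng.standard_normal(T)

def log_norm(v, m, s):
    return -0.5 * (np.log(2 * np.pi * s**2) + (v - m) ** 2 / s**2)

# Bootstrap particle filter: resample, propagate, reweight.
parts = np.zeros((T, N))
logw = np.zeros((T, N))
parts[0] = rng.standard_normal(N)
logw[0] = log_norm(y[0], parts[0], r_sd)
for t in range(1, T):
    w = np.exp(logw[t - 1] - logw[t - 1].max())
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)
    parts[t] = phi * parts[t - 1][idx] + q_sd * rng.standard_normal(N)
    logw[t] = log_norm(y[t], parts[t], r_sd)

# FFBSm backward pass: the N x N transition sums are the quadratic bottleneck.
smooth_w = np.exp(logw[-1] - logw[-1].max())
smooth_w /= smooth_w.sum()
means = np.zeros(T)
means[-1] = smooth_w @ parts[-1]
for t in range(T - 2, -1, -1):
    wt = np.exp(logw[t] - logw[t].max())
    wt /= wt.sum()
    trans = np.exp(log_norm(parts[t + 1][:, None], phi * parts[t][None, :], q_sd))  # trans[j, i] = f(x_{t+1}^j | x_t^i)
    denom = trans @ wt                                  # per-particle normalising constants
    smooth_w = wt * ((trans / denom[:, None]).T @ smooth_w)
    means[t] = smooth_w @ parts[t]

print(np.round(means[:5], 2))   # smoothed posterior means
print(np.round(x[:5], 2))       # versus the simulated latent states
```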
