11

Kernel Averaged Predictors for Space and Space-Time Processes

Heaton, Matthew January 2011 (has links)
In many spatio-temporal applications a vector of covariates is measured alongside a spatio-temporal response. In such cases, the purpose of the statistical model is to quantify the change, in expectation or otherwise, in the response due to a change in the predictors while adequately accounting for the spatio-temporal structure of the response, the predictors, or both. The most common approach for building such a model is to confine the relationship between the response and the predictors to a single spatio-temporal coordinate. For spatio-temporal problems, however, the relationship between the response and predictors may not be so confined. For example, spatial models are often used to quantify the effect of pollution exposure on mortality. Yet, an unknown lag exists between time of exposure to pollutants and mortality. Furthermore, due to mobility and atmospheric movement, a spatial lag between pollution concentration and mortality may also exist (e.g. subjects may live in the suburbs where pollution levels are low but work in the city where pollution levels are high).

The contribution of this thesis is to propose a hierarchical modeling framework which captures complex spatio-temporal relationships between responses and covariates. Specifically, the models proposed here use kernels to capture spatial and/or temporal lagged effects. Several forms of kernels are proposed with varying degrees of complexity. In each case, however, the kernels are assumed to be parametric with parameters that are easily interpretable and estimable from the data. Full distributional results are given for the Gaussian setting along with consequences of model misspecification. The methods are shown to be effective in understanding the complex relationship between responses and covariates through various simulated examples and analyses of physical data sets.
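As a purely illustrative sketch of the kernel-averaging idea described in this abstract (not the hierarchical model developed in the thesis), a covariate can be averaged over spatial and temporal lags with a parametric kernel whose range parameters would, in practice, be estimated from the data. The function names and values below are hypothetical.

```python
import numpy as np

def kernel_averaged_covariate(x, coords, times, s0, t0, rho_s, rho_t):
    """Kernel-weighted average of covariate x 'felt' at site s0 and time t0.

    x      : (n,) covariate values (e.g. pollution concentrations)
    coords : (n, 2) spatial coordinates of the measurements
    times  : (n,) measurement times
    rho_s, rho_t : spatial and temporal range parameters (fixed here for
                   illustration; in a hierarchical model they are estimated)
    """
    d_s = np.linalg.norm(coords - s0, axis=1)          # spatial lags
    d_t = np.abs(times - t0)                           # temporal lags
    w = np.exp(-d_s / rho_s) * np.exp(-d_t / rho_t)    # separable exponential kernel
    return np.sum(w * x) / np.sum(w)

# Hypothetical usage: exposure at a location "now", borrowing strength from
# nearby sites and recent times; the averaged covariate would then enter the
# mean of the spatio-temporal response model.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))
times = rng.uniform(0, 30, size=50)
x = rng.gamma(2.0, 5.0, size=50)
print(kernel_averaged_covariate(x, coords, times,
                                s0=np.array([5.0, 5.0]), t0=30.0,
                                rho_s=2.0, rho_t=7.0))
```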
12

Fast methods for identifying high dimensional systems using observations

Plumlee, Matthew 08 June 2015 (has links)
This thesis proposes new analysis tools for simulation models in the presence of data. To achieve a representation close to reality, simulation models are typically endowed with a set of inputs, termed parameters, that represent controllable, stochastic, or unknown components of the system. Because these models often rely on computationally expensive procedures, running them for complex systems demands a nontrivial amount of time, money, and energy, even on modern supercomputers. Existing statistical frameworks avoid repeated evaluations of deterministic models through an emulator, constructed by conducting an experiment on the code. In high dimensional scenarios, however, the traditional framework for emulator-based analysis can fail due to the computational burden of inference. This thesis proposes a new class of experiments for which inference from half a million observations is possible in seconds, versus the days required by the traditional technique. In a case study presented in this thesis, the parameter of interest is a function rather than a scalar or a set of scalars, so the problem lies in the high dimensional regime. This work develops a new modeling strategy to study the functional parameter nonparametrically using Bayesian inference. Stochastic simulations are also investigated. I describe the development of emulators through a framework termed quantile kriging, which allows for non-parametric representations of the stochastic behavior of the output, whereas previous work has focused on normally distributed outputs. Furthermore, this work studies the asymptotic properties of the methodology, yielding practical insights: under certain regularity conditions, an experiment with the appropriate ratio of replications to distinct input settings achieves an optimal rate of convergence. Additionally, the method provides a basic tool for the study of defect patterns, and a case study is explored.
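As a rough sketch of the quantile-kriging idea mentioned in this abstract (under simplifying assumptions, not the thesis's implementation), replicated stochastic simulator output at each design point can be summarised by empirical quantiles, with a kriging interpolator fitted to each quantile level; all inputs and settings below are hypothetical.

```python
import numpy as np

def sq_exp_kernel(a, b, ell=0.3, sigma2=1.0):
    """Squared-exponential covariance between 1-D input vectors a and b."""
    d = a[:, None] - b[None, :]
    return sigma2 * np.exp(-0.5 * (d / ell) ** 2)

def krige(x_design, y, x_new, nugget=1e-6):
    """Simple (zero-mean) kriging predictor of y at the new inputs x_new."""
    K = sq_exp_kernel(x_design, x_design) + nugget * np.eye(len(x_design))
    k_star = sq_exp_kernel(x_new, x_design)
    return k_star @ np.linalg.solve(K, y)

# Hypothetical stochastic simulator: noisy response whose spread varies with x.
rng = np.random.default_rng(1)
x_design = np.linspace(0, 1, 10)            # distinct input settings
reps = 50                                    # replications per input
sims = np.sin(2 * np.pi * x_design)[:, None] + \
       (0.1 + 0.3 * x_design)[:, None] * rng.standard_normal((10, reps))

# Quantile-kriging-style summary: interpolate each empirical quantile surface.
x_new = np.linspace(0, 1, 101)
preds = {q: krige(x_design, np.quantile(sims, q, axis=1), x_new)
         for q in (0.1, 0.5, 0.9)}
print("predicted 90th percentile at x = 0.5:", round(preds[0.9][50], 3))
```

The replication/design-point trade-off discussed above governs how accurately each empirical quantile, and hence each fitted surface, can be estimated for a fixed simulation budget.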
13

Constrained relative entropy minimization with applications to multitask learning

Koyejo, Oluwasanmi Oluseye 15 July 2013 (has links)
This dissertation addresses probabilistic inference via relative entropy minimization subject to expectation constraints. A canonical representation of the solution is determined without the requirement for convexity of the constraint set, and is given by members of an exponential family. The use of conjugate priors for relative entropy minimization is proposed, and a class of conjugate prior distributions is introduced. An alternative representation of the solution is provided as members of the prior family when the prior distribution is conjugate. It is shown that the solutions can be found by direct optimization with respect to members of such parametric families. Constrained Bayesian inference is recovered as a special case with a specific choice of constraints induced by observed data. The framework is applied to the development of novel probabilistic models for multitask learning subject to constraints determined by domain expertise. First, a model is developed for multitask learning that jointly learns a low rank weight matrix and the prior covariance structure between different tasks. The multitask learning approach is extended to a class of nonparametric statistical models for transposable data, incorporating side information such as graphs that describe inter-row and inter-column similarity. The resulting model combines a matrix-variate Gaussian process prior with inference subject to nuclear norm expectation constraints. In addition, a novel nonparametric model is proposed for multitask bipartite ranking. The proposed model combines a hierarchical matrix-variate Gaussian process prior with inference subject to ordering constraints and nuclear norm constraints, and is applied to disease gene prioritization. In many of these applications, the solution is found to be unique. Experimental results show substantial performance improvements as compared to strong baseline models.
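For context, the canonical exponential-family representation referred to in this abstract is the standard form of the relative entropy minimizer under expectation constraints; a sketch in generic notation (not necessarily the dissertation's) is:

```latex
\[
\min_{q}\; \mathrm{KL}\!\left(q \,\middle\|\, p_0\right)
\quad \text{subject to} \quad
\mathbb{E}_{q}\!\left[\phi_j(x)\right] \le b_j, \qquad j = 1,\dots,m,
\]
whose solution, when it exists, lies in the exponential family generated by the
prior $p_0$ and the constraint functions,
\[
q^{*}(x) \;\propto\; p_0(x)\,
\exp\!\Big(-\sum_{j=1}^{m} \lambda_j\, \phi_j(x)\Big),
\qquad \lambda_j \ge 0,
\]
with the multipliers $\lambda_j$ determined by the constraints (and unrestricted
in sign for equality constraints). Constrained Bayesian inference corresponds to
constraints built from the observed data.
```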
14

Laser-Based 3D Mapping and Navigation in Planetary Worksite Environments

Tong, Chi Hay 14 January 2014 (has links)
For robotic deployments in planetary worksite environments, map construction and navigation are essential for tasks such as base construction, scientific investigation, and in-situ resource utilization. However, operation in a planetary environment imposes sensing restrictions, as well as challenges due to the terrain. In this thesis, we develop enabling technologies for autonomous mapping and navigation by employing a panning laser rangefinder as our primary sensor on a rover platform. The mapping task is addressed as a three-dimensional Simultaneous Localization and Mapping (3D SLAM) problem. During operation, long-range 360 degree scans are obtained at infrequent stops. These scans are aligned using a combination of sparse features and odometry measurements in a batch alignment framework, resulting in accurate maps of planetary worksite terrain. For navigation, the panning laser rangefinder is configured to perform short, continuous sweeps while the rover is in motion. An appearance-based approach is taken, where laser intensity images are used to compute Visual Odometry (VO) estimates. We overcome the motion distortion issues by formulating the estimation problem in continuous time. This is facilitated by the introduction of Gaussian Process Gauss-Newton (GPGN), a novel algorithm for nonparametric, continuous-time, nonlinear, batch state estimation. Extensive experimental validation is provided for both mapping and navigation components using data gathered at multiple planetary analogue test sites.
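As a small, hedged illustration of the continuous-time idea only (not of GPGN itself, which additionally handles nonlinear measurement models in a batch Gauss-Newton framework), a Gaussian process over time can interpolate a trajectory component between discrete pose estimates, so that each laser return gathered during motion can be compensated using the pose at its own timestamp; the numbers below are hypothetical.

```python
import numpy as np

def gp_interpolate(t_train, y_train, t_query, ell=0.5, sigma2=1.0, noise=1e-4):
    """GP interpolation of a scalar trajectory component over time."""
    def k(a, b):
        return sigma2 * np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)
    K = k(t_train, t_train) + noise * np.eye(len(t_train))
    alpha = np.linalg.solve(K, y_train)
    return k(t_query, t_train) @ alpha

# Hypothetical discrete pose estimates (x-position of the rover over time).
t_est = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x_est = np.array([0.0, 0.9, 2.1, 2.8, 4.2])

# Timestamps of individual laser returns collected while the sensor sweeps.
t_meas = np.linspace(0.0, 4.0, 9)
x_at_meas = gp_interpolate(t_est, x_est, t_meas)
print(np.round(x_at_meas, 2))   # pose estimate used to de-skew each return
```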
15

Valid estimation and prediction inference in analysis of a computer model

Nagy, Béla 11 1900 (has links)
Computer models or simulators are becoming increasingly common in many fields of science and engineering, powered by the phenomenal growth in computer hardware over the past decades. Many of these simulators implement a particular mathematical model as a deterministic computer code, meaning that running the simulator again with the same input gives the same output. Running the code often involves computationally expensive tasks, such as numerically solving complex systems of partial differential equations. When simulator runs take too long, their usefulness may be limited. To overcome time or budget constraints by making the most of limited computational resources, a statistical methodology has been proposed, known as the "Design and Analysis of Computer Experiments". The main idea is to run the expensive simulator at only a relatively few, carefully chosen design points in the input space and, based on the outputs, construct an emulator (statistical model) that can emulate (predict) the output at new, untried locations at a fraction of the cost. This approach is useful provided that we can measure how much the predictions of the cheap emulator deviate from the real response surface of the original computer model. One way to quantify emulator error is to construct pointwise prediction bands designed to envelop the response surface, with the assertion that the true response (simulator output) is enclosed by these envelopes with a certain probability. Of course, to make such probabilistic statements, one needs to introduce some kind of randomness. A common strategy, used here, is to model the computer code as a random function, also known as a Gaussian stochastic process. We concern ourselves with smooth response surfaces and use the Gaussian covariance function, which is ideal when the response function is infinitely differentiable. In this thesis, we propose Fast Bayesian Inference (FBI), which is computationally efficient and can be implemented as a black box. Simulation results show that it achieves remarkably accurate prediction uncertainty assessments, in terms of matching the coverage probabilities of the prediction bands, and that the associated reparameterizations can also help parameter uncertainty assessments.
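For reference, the Gaussian covariance function mentioned in this abstract and the Gaussian-process predictive quantities from which pointwise prediction bands are usually built can be sketched in generic notation (the FBI procedure additionally accounts for parameter uncertainty, which this sketch ignores):

```latex
\[
\operatorname{Cov}\!\big(Y(\mathbf{x}),\, Y(\mathbf{x}')\big)
  \;=\; \sigma^2 \exp\!\Big(-\sum_{k=1}^{d} \theta_k\, (x_k - x'_k)^2\Big),
\]
and, conditioning a zero-mean process on the simulator outputs $\mathbf{y}$ at the
design points (covariance matrix $K$, cross-covariance vector $\mathbf{k}_\ast$ to a
new input $\mathbf{x}_\ast$),
\[
\hat{y}(\mathbf{x}_\ast) = \mathbf{k}_\ast^{\top} K^{-1}\mathbf{y},
\qquad
s^2(\mathbf{x}_\ast) = \sigma^2 - \mathbf{k}_\ast^{\top} K^{-1}\mathbf{k}_\ast,
\]
so that an approximate pointwise band takes the form
$\hat{y}(\mathbf{x}_\ast) \pm z_{1-\alpha/2}\, s(\mathbf{x}_\ast)$.
```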
17

Inference for Continuous Stochastic Processes Using Gaussian Process Regression

Fang, Yizhou January 2014 (has links)
Gaussian process regression (GPR) is a long-standing technique for statistical interpolation between observed data points. Having originally been applied to spatial analysis in the 1950s, GPR offers highly nonlinear predictions whose uncertainty adjusts to the degree of extrapolation, while requiring only a few model parameters to be fit. GPR has thus gained considerable popularity in statistical applications such as machine learning and nonparametric density estimation. In this thesis, we explore the potential for GPR to improve the efficiency of parametric inference for continuous-time stochastic processes. For almost all such processes, the likelihood function based on discrete observations cannot be written in closed form. However, it can be very well approximated if the inter-observation time is small. Therefore, a popular strategy for parametric inference is to introduce missing data between actual observations. In a Bayesian context, samples from the posterior distribution of the parameters and missing data are then typically obtained using Markov chain Monte Carlo (MCMC) methods, which can be computationally very expensive. Here, we consider the possibility of using GPR to impute the marginal distribution of the missing data directly. These imputations could then be leveraged to produce independent draws from the joint posterior by importance sampling, for a significant gain in computational efficiency. In order to illustrate the methodology, three continuous processes are examined. The first one is based on a neural excitation model with a non-standard periodic component. The second and third are popular financial models often used for option pricing. While preliminary inferential results are quite promising, we point out several improvements to the methodology which remain to be explored.
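The computational gain described here rests on the standard (self-normalized) importance-sampling identity; in generic notation, with observations $y$, missing data $x_{\mathrm{mis}}$ and parameters $\theta$, a GP-based proposal $q$ would yield independent weighted draws:

```latex
\[
\big(x_{\mathrm{mis}}^{(i)}, \theta^{(i)}\big) \overset{\text{iid}}{\sim} q,
\qquad
w^{(i)} \;\propto\;
\frac{p\big(y, x_{\mathrm{mis}}^{(i)} \mid \theta^{(i)}\big)\, p\big(\theta^{(i)}\big)}
     {q\big(x_{\mathrm{mis}}^{(i)}, \theta^{(i)}\big)},
\]
with posterior expectations estimated as
$\widehat{\mathbb{E}}\big[g(\theta) \mid y\big]
 = \sum_i \bar{w}^{(i)}\, g\big(\theta^{(i)}\big)$,
where $\bar{w}^{(i)} = w^{(i)} / \sum_j w^{(j)}$ and the GP regression supplies the
proposal for $x_{\mathrm{mis}}$.
```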
19

An exploration of building design and optimisation methods using Kriging meta-modelling

Wood, Michael James January 2016 (has links)
This thesis investigates the application of Kriging meta-modelling techniques in the field of building design and optimisation. Two key factors motivated this research. The first is the need for building designers to have tools that allow low energy buildings to be designed in a fast and efficient manner. The second is the need for optimisation tools that account, or help account, for the wide variety of uses a building might have; so-called Robust Optimisation (RO). The thesis therefore begins with an analysis of Kriging meta-modelling applied to simple building problems. I then use this simple building model to determine the effect of the updated UK Test Reference Years (TRYs) on energy consumption, and examine Kriging-based optimisation techniques for a single objective. I then revisit the single-building meta-model to examine the effect of uncertainty on a neighbourhood of buildings and compare the results to the output of a brute-force analysis with a full building simulator. The results show that Kriging emulation is an effective tool for creating a meta-model of a building, and the analysis of the effect of TRYs shows that UK buildings are likely to use less heating in the future but are likely to overheat more. In the final two chapters I use the techniques developed to create a robust building optimisation algorithm and to use Kriging to improve the optimisation efficiency of the well-known NSGA-II algorithm. I show that the Kriging-based robust optimiser finds more robust solutions than traditional global optimisation, and that Kriging techniques can be used to augment NSGA-II so that it finds more diverse solutions to some types of multi-objective optimisation problems and converges significantly faster on some problems. The results show that Kriging has significant potential in this field, and I identify many areas for future research. Although further work is required to verify the results for a wider variety of building applications, the initial results are promising.
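As a hedged illustration of how a Kriging meta-model typically enters such an optimisation loop (the thesis's exact infill criteria and robust formulation are not reproduced here), the surrogate's predictive mean and standard deviation at candidate designs can be turned into an expected-improvement score; the energy figures below are invented.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sd, f_best):
    """Expected improvement (for minimisation) given the Kriging surrogate's
    predictive mean `mu` and standard deviation `sd` at candidate designs."""
    sd = np.maximum(sd, 1e-12)              # guard against zero predictive variance
    z = (f_best - mu) / sd
    return (f_best - mu) * norm.cdf(z) + sd * norm.pdf(z)

# Hypothetical surrogate predictions of annual energy use (kWh/m^2) for three
# candidate building designs, and the best value simulated so far.
mu = np.array([92.0, 88.5, 95.0])
sd = np.array([4.0, 6.0, 1.0])
f_best = 90.0
ei = expected_improvement(mu, sd, f_best)
print("next design to run in the full simulator:", int(np.argmax(ei)))
```

The selected design would then be evaluated with the full building simulator and the Kriging model refitted, which is the usual surrogate-assisted loop that population-based algorithms such as NSGA-II can also exploit.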
20

Modelling and analysis of oscillations in gene expression through neural development

Phillips, Nick January 2016 (has links)
The timing of differentiation underlies the development of any organ system. In neural development, the expression of the transcription factor Hes1 has been shown to be oscillatory in neural progenitors, but at a low steady state in differentiated neurons. This change in the dynamics of expression marks the timing of differentiation. We previously constructed a mathematical model, using deterministic delay differential equations, to test the experimental hypothesis that the topology of the miR-9/Hes1 network, and specifically the accumulation of the micro-RNA miR-9, could terminate Hes1 oscillations and account for the timing of neuronal differentiation. However, biochemical reactions are the result of random encounters between discrete numbers of molecules, and some of these molecules may be present at low numbers. The finite number of molecules interacting within the system leads to inherent randomness, known as intrinsic stochasticity. The stochastic model predicts that low molecule numbers cause the time to differentiation to be distributed across cells, which is in agreement with recent experimental evidence and is considered important for generating cell type diversity. For the exact same model, fewer reacting molecules cause a decrease in the average time to differentiation, showing that the number of molecules can systematically change the timing of differentiation. Oscillations are important for a wide range of biological processes, but current methods for discovering oscillatory genes have primarily been designed for measurements performed on a population of cells. We introduce a new approach for analysing biological time series data designed for cases where the underlying dynamics of gene expression is inherently noisy at the single cell level. Our analysis method combines mechanistic stochastic modelling with the powerful methods of Bayesian nonparametric regression, and can distinguish oscillatory expression in single cell data from random fluctuations of nonoscillatory gene expression, despite peak-to-peak variability in the period and amplitude of single cell oscillations. Models of gene expression commonly involve delayed biological processes, but the combination of stochasticity, delay and nonlinearity leads to emergent dynamics that are not understood at a theoretical level. We develop a theory to explain these effects and apply it to a simple model of gene regulation. The new theory can account for the long time-scale dynamics and nonlinear character of the system that emerge when the number of interacting molecules becomes low. Both the absolute length of the delay time and the uncertainty in it are shown to be crucial in controlling the magnitude of the nonlinear effects.
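To illustrate intrinsic stochasticity at low molecule numbers (a minimal birth-death sketch, not the delayed miR-9/Hes1 network studied in the thesis), a Gillespie stochastic simulation of constitutive production and first-order degradation can be written as follows; all rate values are hypothetical.

```python
import numpy as np

def gillespie_birth_death(k_prod=10.0, k_deg=0.1, n0=0, t_end=100.0, seed=0):
    """Gillespie SSA for production at rate k_prod and degradation at rate k_deg * n."""
    rng = np.random.default_rng(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while t < t_end:
        a_prod, a_deg = k_prod, k_deg * n
        a_total = a_prod + a_deg
        t += rng.exponential(1.0 / a_total)    # waiting time to the next reaction
        if rng.uniform() < a_prod / a_total:   # choose which reaction fires
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return np.array(times), np.array(counts)

# Lower production rates give lower copy numbers and relatively larger
# fluctuations about the mean k_prod / k_deg -- the essence of intrinsic noise.
t, n = gillespie_birth_death(k_prod=2.0, k_deg=0.1, t_end=200.0)
print("mean copy number:", round(float(n.mean()), 2))
```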
