41 
A comparison of polynomial chaos and Gaussian process emulation for uncertainty quantification in computer experiments
Owen, Nathan Edward, January 2017
Computer simulation of real-world phenomena is now ubiquitous in science, because experimentation in the field can be expensive, time-consuming, or impossible in practice. Examples include climate science, where future climate is examined under global warming scenarios, and cosmology, where the evolution of galaxies is studied from the beginning of the universe to present day. Combining complex mathematical models and numerical procedures to solve them in a computer program, these simulators are computationally expensive, in that they can take months to complete a single run. The practice of using a simulator to understand reality raises some interesting scientific questions, and there are many sources of uncertainty to consider, for example the discrepancy between the simulator and the real-world process. The field of uncertainty quantification is concerned with the characterisation and reduction of all uncertainties present in computational and real-world problems. A key bottleneck in any uncertainty quantification analysis is the cost of evaluating the simulator. The solution is to replace the expensive simulator with a surrogate model, which is computationally faster to run, and can be used in subsequent analyses. Polynomial chaos and Gaussian process emulation are surrogate models developed independently in the engineering and statistics communities respectively over the last 25 years. Despite tackling similar problems in the field, there has been little interaction and collaboration between the two communities. This thesis provides a critical comparison of the two methods for a range of criteria and examples, from simple test functions to simulators used in industry. Particular focus is on the approximation accuracy of the surrogates under changes in the size and type of the experimental design.
It is concluded that one method does not unanimously outperform the other, but advantages can be gained in some cases, such that the preferred method depends on the modelling goals of the practitioner. This is the first direct comparison of polynomial chaos and Gaussian process emulation in the literature. This thesis also proposes a novel methodology called probabilistic polynomial chaos, which is a hybrid of polynomial chaos and Gaussian process emulation. The approach draws inspiration from an emerging field in scientific computation known as probabilistic numerics, which treats classical numerical methods as statistical inference problems. In particular, a probabilistic integration technique called Bayesian quadrature, which employs Gaussian process emulators, is applied to a traditional form of polynomial chaos. The result is a probabilistic version of polynomial chaos, providing uncertainty information where the simulator has not yet been run.

42 
Inference and decision making in large weakly dependent graphical models
Turner, Lisa, January 2017
This thesis considers the problem of searching a large set of items, such as emails, for a small subset which are relevant to a given query. This can be implemented in a sequential manner – whereby knowledge from items that have already been screened is used to assist in the selection of subsequent items to screen. Often the items being searched have an underlying network structure. Using the network structure and a modelling assumption that relevant items and participants are likely to cluster together can greatly increase the rate of screening relevant items. However, inference in this type of model is computationally expensive. In the first part of this thesis, we show that Bayes linear methods provide a natural approach to modelling this data. We develop a new optimisation problem for Bernoulli random variables, called constrained Bayes linear, which has additional constraints incorporated into the Bayes linear optimisation problem. For nonlinear relationships between the latent variable and observations, Bayes linear will give a poor approximation. We propose a novel sequential Monte Carlo method for sequential inference on the network, which better copes with nonlinear relationships. We give a method for simulating the random variables based upon the Bayes linear methodology. Finally, we look at the effect the ordering of the random variables has on the joint probability distribution of binary random variables, when they are simulated using this proposed Bayes linear method.

43 
Graphical model selection for Gaussian conditional random fields in the presence of latent variables: theory and application to genetics
Frot, Benjamin, January 2016
The task of performing graphical model selection arises in many applications in science and engineering. The field of application of interest in this thesis relates to the needs of datasets that include genetic and multivariate phenotypic data. There are several factors that make this problem particularly challenging: some of the relevant variables might not be observed, high-dimensionality might cause identifiability issues and, finally, it might be preferable to learn the model over a subset of the collection while conditioning on the rest of the variables, e.g. genetic variants. We suggest addressing these problems by learning a conditional Gaussian graphical model, while accounting for latent variables. Building on recent advances in this field, we decompose the parameters of a conditional Markov random field into the sum of a sparse and a low-rank matrix. We derive convergence bounds for this novel estimator, show that it is well-behaved in the high-dimensional regime and describe algorithms that can be used when the number of variables is in the thousands. Through simulations, we illustrate the conditions required for identifiability and show that this approach is consistent in a wider range of settings. In order to show the practical implications of our work, we apply our method to two real datasets and devise a metric that makes use of an independent source of information to assess the biological relevance of the estimates. In our first application, we use the proposed approach to model the levels of 39 metabolic traits conditional on hundreds of genetic variants, in two independent cohorts. We find our results to be better replicated across cohorts than the ones obtained with other methods. In our second application, we look at a high-dimensional gene expression dataset. We find that our method is capable of retrieving as many biologically relevant gene-gene interactions as other methods while retrieving fewer irrelevant interactions.

44 
On asymptotic stability of stochastic differential equations with delay in infinite dimensional spaces
Wang, C., January 2017
In most stochastic dynamical systems which describe processes in engineering, physics and economics, stochastic components and random noise are often involved. Stochastic effects of these models are often used to capture the uncertainty about the operating systems. Motivated by the development of analysis and theory of stochastic processes, as well as the studies of natural sciences, the theory of stochastic differential equations in infinite dimensional spaces has evolved gradually into a branch of modern analysis. In the analysis of such systems, we want to investigate their stability. This thesis is mainly concerned with the stability properties of stochastic differential equations in infinite dimensional spaces, mainly in Hilbert spaces. Chapter 1 is an overview of the studies. In Chapter 2, we recall basic notations, definitions and preliminaries, especially those on stochastic integration and stochastic differential equations in infinite dimensional spaces. In this way, such notions as Q-Wiener processes, stochastic integrals and mild solutions will be reviewed. We also introduce the concepts of several types of stability. In Chapter 3, we are mainly concerned with the moment exponential stability of neutral impulsive stochastic delay partial differential equations with Poisson jumps. By employing the fixed point theorem, the pth moment exponential stability of mild solutions to the system is obtained. In Chapter 4, we first recall an impulsive-integral inequality by considering impulsive effects in stochastic systems. Then we define an attracting set and study the exponential stability of mild solutions to impulsive neutral stochastic delay partial differential equations with Poisson jumps by employing the impulsive-integral inequality. Chapter 5 investigates pth moment exponential stability and almost sure asymptotic stability of mild solutions to stochastic delay integro-differential equations.
Finally, in Chapter 6, we study the exponential stability of neutral impulsive stochastic delay partial differential equations driven by a fractional Brownian motion.

45 
Markovian rough paths
Ogrodnik, Marcel Bogdan, January 2016
The accumulated local p-variation functional, originally presented by Cass et al. (2013), arises naturally in the theory of rough paths in estimates both for solutions to rough differential equations (RDEs), and for the higher-order terms of the signature (or Lyons lift). In stochastic examples, it has been observed that the tails of the accumulated local p-variation functional typically decay much faster than the tails of classical p-variation. This observation has been decisive, e.g. for problems involving Malliavin calculus for Gaussian rough paths as illustrated in the work by Cass et al. (2015). All of the examples treated so far have been in this Gaussian setting, which contains a great deal of additional structure. In this paper we work in the context of Markov processes on a locally compact Polish space E, which are associated to a class of Dirichlet forms. In this general framework, we first prove a better-than-exponential tail estimate for the accumulated local p-variation functional derived from the intrinsic metric of this Dirichlet form. By then specialising to a class of Dirichlet forms on the step-⌊p⌋ free nilpotent group, which are subelliptic in the sense of Fefferman-Phong, we derive a better-than-exponential tail estimate for a class of Markovian rough paths. This class includes, but also goes beyond, the examples studied by Friz and Victoir (2008). We comment on the significance of these estimates to recent results, including the results of Hao (2014) and Chevyrev and Lyons (2015).

46 
A variational and numerical study of aggregation-diffusion gradient flows
Patacchini, Francesco Saverio, January 2017
This thesis is dedicated to the variational and numerical study of a particular class of continuity equations called aggregation-diffusion equations. They model the evolution of a continuum body whose total mass is conserved in time, undergoing up to three distinct phenomena: diffusion, confinement and aggregation. Diffusion describes the motion of the body’s particles from crowded regions of space to sparser ones; confinement results from an external potential field independent of the mass distribution of the body; and aggregation describes the nonlocal particle interaction within the body. Due to this wide range of effects, aggregation-diffusion equations are encountered in a large variety of applications coming from, among many others, porous medium flows, granular flows, crystallisation, biological swarming, bacterial chemotaxis, stellar collapse, and economics. An aggregation-diffusion equation has the very interesting and rich mathematical property of being the gradient flow for some energy functional on the space of probability measures, which formally means that any solution evolves so as to decrease this energy every time as much as possible. In this thesis we exploit this gradient-flow structure of aggregation-diffusion equations in order to derive properties of solutions and approximate them by discrete particles. We focus on two main aspects of aggregation-diffusion gradient flows: the variational analysis of the pure aggregation equation, i.e., the study of minimisers of the energy when only nonlocal aggregation effects are present; and the particle approximation of solutions, especially when only diffusive effects are taken into account. Regarding the former aspect, we prove that minimisers exist, enjoy some regularity, are supported on sets of specific dimensionality, and can be approximated by finitely supported discrete minimisers.
Regarding the latter aspect, we illustrate theoretically and numerically that diffusion can be interpreted at the discrete level by a deterministic motion of particles preserving a gradient-flow structure.

47 
Theory and applications of stochastic processes
Waugh, W. A. O'N., January 1955
No description available.

48 
Parameterising and evaluating Markov models for ion-channels
Epstein, M. J., January 2015
Ligand-gated ion-channels are proteins that are embedded in the membranes of cells. They play many crucial physiological roles, enabling fast cell-to-cell communication between neuronal cells and communication between the nervous and musculoskeletal systems. A key feature of their physiology is their ability to form a pore, through which a small quantal current in the form of ions can flow. Such channels are statistically modelled as Markov processes. However, given the limitations of experimental data, establishing model parameter identifiability and robust model comparison remain important statistical issues. This thesis considers the use of likelihood and Bayesian methods to investigate the parameterisation and selection of mechanistic Markov models for ion-channel gating. The biological and statistical background is placed in context, particularly with regard to statistical issues of limited time resolution of electrophysiological recordings which significantly complicates the model likelihood. A canonical ion-channel model for the acetylcholine receptor is described, which is used to assess the use of profile likelihoods to answer questions concerning model parameter identifiability. MCMC techniques are then introduced to sample from model posterior distributions in order to examine candidate models within the Bayesian paradigm. Even simple models exhibit complex posterior distributions. This motivates a thorough assessment of sophisticated MCMC samplers to perform Bayesian inference in such models. A principled method using preconditioned or adaptive MCMC algorithms is found to provide an effective sampling strategy. Model parameterisation and predictive uncertainty in model posterior outputs are then assessed using both real and synthetic data. Model discrimination based on visual inspection of posterior predictive output is not always conclusive. A parallel tempering sampling strategy is successfully implemented to estimate Bayes factors for candidate models.
This quantitative technique can discriminate between competing models that otherwise produce visually similar predictive output for the gating dynamics of the acetylcholine receptor.

49 
Characterisation of disordered structures
Butler, Paul, January 2017
In this thesis I will look at how large, complex structures can be interpreted and evaluated using an information theoretic approach. The work specifically investigates techniques to understand disordered materials. It explains a novel framework using statistical methods to investigate structural information of very large data sets. This framework facilitates understanding of complex structures through the quantification of information and disorder. Large-scale structures including granular media and amorphous atomic systems can also be processed. The need to deal with larger complex structures has been driven by new methods used to characterise amorphous materials, such as atomic scale tomography. In addition, computers are allowing for the creation of larger and larger data sets for researchers to analyse, requiring new techniques for storing and understanding information. As it has become possible to analyse large complex systems there has been a corresponding increase in attempts to scientifically understand these systems. New, man-made, complex systems have emerged such as the stock market and online networks. This has boosted interest in their interpretation, with the hopes they can be more easily manipulated or controlled. Crystallography has been applied to great effect in biology, having been used to discover the structure of DNA and develop new drugs (UNESCO, 2013). However, it only describes crystal structure, which can be a drawback as a large majority of matter is amorphous. As such it is hoped that interpreting and understanding disorder may lead to similar breakthroughs in disordered materials. Entropic measures such as the mutual information and Kullback-Leibler divergence are used to investigate the nature of structural information and its impact on the system. I examine how this information propagates in a system, and how it could quantify the amount of organisation in a system that is structurally disordered.
The methodology introduced in this thesis extracts useful information from large data sets to allow for a quantification of disorder. The calculated entropy for amorphous packings is generally less than 1 bit, with mutual information between 0 and 0.1 bits. The results verify direct correlation between mutual information and the correlation coefficient using various techniques. The mutual information shows most information is obtained where sphere density is highest, following a similar trend to that of the radial distribution function, and generally increasing for higher packing fractions. Evidence of the Random Close Packed (RCP) and Random Loose Packed (RLP) limits in two dimensions is shown, as well as evidence of both phases in time-lapsed 3D packings. The Kullback-Leibler divergence is also explored as a relative measure of disorder. This is achieved by calculating redundant information in packings so that areas of low and high order can be shown. Results present colour maps displaying relative information in random disk packings from which motifs can be identified. For higher packing fractions distinct borders form for areas of low and high information, particularly where crystallisation has occurred. Again, these results show an increase in information for more densely packed structures, as expected, with a Kullback-Leibler divergence of between 0 and 1 bits. Finally, I introduce the concept of self-referential order, which provides a way to quantify structural organisation in non-crystalline materials by referencing part of the system in a similar way to a unit cell. This allows a step forward in understanding and characterising disorder, helping to develop a framework to encode amorphous structures in an efficient way. These results show increasing information for higher packing fractions as well as further evidence of RLP and RCP limits around packing fractions of 0.54 and 0.64 respectively.
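As a rough illustration of the two entropic measures named in this abstract, the sketch below computes mutual information and Kullback-Leibler divergence in bits for small discrete distributions. It is not the method used in the thesis: the function names and the toy probability tables are assumptions chosen purely for illustration.

```python
import math


def mutual_information_bits(joint):
    """Mutual information I(X;Y) in bits from a joint probability table.

    joint[i][j] is P(X=i, Y=j); the table is assumed to sum to 1.
    """
    px = [sum(row) for row in joint]        # marginal of X
    py = [sum(col) for col in zip(*joint)]  # marginal of Y
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:  # zero-probability cells contribute nothing
                total += pxy * math.log2(pxy / (px[i] * py[j]))
    return total


def kl_divergence_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumes q[k] > 0 wherever p[k] > 0.
    """
    return sum(pk * math.log2(pk / qk) for pk, qk in zip(p, q) if pk > 0.0)


# Hypothetical toy tables, not data from the thesis.
independent = [[0.25, 0.25], [0.25, 0.25]]  # X and Y unrelated
identical = [[0.5, 0.0], [0.0, 0.5]]        # Y always equals X
```

Under these assumptions, the independent table carries no shared information, while the identical table carries one full bit, which gives a sense of scale for values such as "between 0 and 0.1 bits".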

50 
Modelling share prices as a random walk on a Markov chain
Samci Karadeniz, Rukiye, January 2017
In the financial area, a simple but also realistic means of modelling real data is very important. Several approaches are considered to model and analyse the data presented herein. We start by considering a random walk on an additive functional of a discrete time Markov chain perturbed by Gaussian noise as a model for the data. Working with a continuous time model is more convenient for option prices. Therefore, we consider the renowned (and open) embedding problem for Markov chains: not every discrete time Markov chain has an underlying continuous time Markov chain. One of the main goals of this research is to analyse whether the discrete time model permits extension or embedding to the continuous time model. In addition, the volatility of share price data is estimated and analysed by the same procedure as for share price processes. This part of the research is an extensive case study on the embedding problem for financial data and its volatility. Another approach to modelling share price data is to consider a random walk on the lamplighter group. Specifically, we model data as a Markov chain with a hidden random walk on the lamplighter group Z3 and on the tensor product of groups Z2 ⊗ Z2. The lamplighter group has a specific structure where the hidden information is actually explicit. We assume that the positions of the lamplighters are known, but we do not know the status of the lamps. This is referred to as a hidden random walk on the lamplighter group. A biased random walk is constructed to fit the data. Monte Carlo simulations are used to find the best fit for smallest trace norm difference of the transition matrices for the tensor product of the original transition matrices from the (appropriately split) data. In addition, splitting data is a key method for both our first and second models. The tensor product structure comes from the split of the data. This requires us to deal with the missing data.
We apply a variety of statistical techniques such as the expectation-maximization algorithm and a machine learning algorithm (C4.5). In this work we also analyse the quantum data and compute option prices for the binomial model via quantum data.
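The embedding question raised in this abstract has a closed-form answer in the two-state case, which the sketch below uses as an illustration. It is not the procedure followed in the thesis, and the function name and example probabilities are hypothetical. For a transition matrix P = [[1-a, a], [b, 1-b]], a continuous time generator Q with exp(Q) = P exists exactly when a + b < 1.

```python
import math


def two_state_generator(a, b):
    """Return a generator Q with exp(Q) = P for P = [[1-a, a], [b, 1-b]].

    a and b are assumed to be probabilities in [0, 1].
    Returns None when the chain is not embeddable (a + b >= 1).
    """
    s = a + b
    if s >= 1.0:
        return None  # det(P) = 1 - a - b <= 0, so no generator exists
    if s == 0.0:
        return [[0.0, 0.0], [0.0, 0.0]]  # P is the identity matrix
    # exp(c * M) = I + M * (1 - exp(-c * s)) / s for M = [[-a, a], [b, -b]],
    # so matching P = I + M requires c = -log(1 - s) / s.
    rate = -math.log(1.0 - s) / s
    return [[-rate * a, rate * a], [rate * b, -rate * b]]


# Hypothetical example values, not estimates from share price data.
embeddable = two_state_generator(0.2, 0.3)
not_embeddable = two_state_generator(0.7, 0.6)
```

For chains with more than two states no such simple criterion is available, which is one reason the general problem remains open.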
