251
Monte Carlo integration in discrete undirected probabilistic models / Hamze, Firas, 05 1900
This thesis contains the author's work in and contributions to the field of Monte Carlo sampling for undirected graphical models, a class of statistical models commonly used in machine learning, computer vision, and spatial statistics; the aim is to use the methodology and the resultant samples to estimate integrals of functions of the variables in the model. Over the course of the study, three different but related methods were proposed and have appeared as research papers. The thesis consists of an introductory chapter discussing the models considered, the problems involved, and a general outline of Monte Carlo methods. The three subsequent chapters contain versions of the published work.

The second chapter, which has appeared in (Hamze and de Freitas 2004), presents new MCMC algorithms for computing the posterior distributions and expectations of the unknown variables in undirected graphical models with regular structure. For demonstration purposes, we focus on Markov Random Fields (MRFs). By partitioning the MRFs into non-overlapping trees, it is possible to compute the posterior distribution of a particular tree exactly by conditioning on the remaining trees. These exact solutions allow us to construct efficient blocked and Rao-Blackwellised MCMC algorithms. We show empirically that tree sampling is considerably more efficient than other partitioned sampling schemes and the naive Gibbs sampler, even in cases where loopy belief propagation fails to converge. We prove that tree sampling exhibits lower variance than the naive Gibbs sampler and other naive partitioning schemes using the theoretical measure of maximal correlation. We also construct new information-theoretic tools for comparing different MCMC schemes and show that, under these, tree sampling is more efficient.

Although the work discussed in Chapter 2 exhibited promise on the class of graphs to which it was suited, there are many cases where limiting the topology is quite a handicap. The work in Chapter 3 explored an alternative methodology for approximating functions of variables representable as undirected graphical models of arbitrary connectivity with pairwise potentials, as well as for estimating the notoriously difficult partition function of the graph. The algorithm, published in (Hamze and de Freitas 2005), fits into the framework of sequential Monte Carlo methods rather than the more widely used MCMC, and relies on constructing a sequence of intermediate distributions that get closer to the desired one. While the idea of using "tempered" proposals is known, we construct a novel sequence of target distributions where, rather than lowering a global temperature parameter, we sequentially couple individual pairs of variables that are, initially, sampled exactly from a spanning tree of the variables. We present experimental results on inference and estimation of the partition function for sparse and densely connected graphs.

The final contribution of this thesis, presented in Chapter 4 and also in (Hamze and de Freitas 2007), emerged from empirical observations made while trying to optimize the sequence of edges to add to a graph so as to guide the population of samples to the high-probability regions of the model. Most important among these observations was that while several heuristic approaches, discussed in Chapter 1, certainly yielded improvements over edge sequences consisting of random choices, strategies based on forcing the particles to take large, biased random walks in the state space resulted in more efficient exploration, particularly at low temperatures. This motivated a new Monte Carlo approach to treating complex discrete distributions. The algorithm is motivated by the N-Fold Way, an ingenious event-driven MCMC sampler that avoids rejection moves at any specific state. The N-Fold Way can, however, get "trapped" in cycles. We surmount this problem by modifying the sampling process to yield biased state-space paths of randomly chosen length. This alteration does introduce bias, but the bias is subsequently corrected with a carefully engineered importance sampler.
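The column-blocked special case of the tree-sampling idea is easy to sketch. The following Python fragment is illustrative only, not code from the thesis: it samples each column of a binary Ising-style MRF exactly, conditioned on its neighbouring columns, by forward filtering and backward sampling on the resulting chain. The grid size, field, and coupling strength are arbitrary choices.

```python
import numpy as np

def sample_chain(u, J, rng):
    """Exact sample of a binary chain with per-site log-potentials u[t, s]
    (state s in {0, 1} mapped to spins {-1, +1}) and Ising coupling J,
    via forward filtering / backward sampling."""
    T = u.shape[0]
    pair = np.array([[J, -J], [-J, J]])      # log pairwise potential, J*s_prev*s
    alpha = np.zeros((T, 2))
    alpha[0] = u[0]
    for t in range(1, T):
        m = alpha[t - 1][:, None] + pair     # m[s_prev, s], log domain
        alpha[t] = u[t] + np.logaddexp(m[0], m[1])
    x = np.zeros(T, dtype=int)
    p = np.exp(alpha[-1] - alpha[-1].max()); p /= p.sum()
    x[-1] = rng.choice(2, p=p)
    for t in range(T - 2, -1, -1):
        logp = alpha[t] + pair[:, x[t + 1]]
        p = np.exp(logp - logp.max()); p /= p.sum()
        x[t] = rng.choice(2, p=p)
    return x

def tree_gibbs_sweep(x, h, J, rng):
    """One sweep of column-blocked Gibbs on a grid Ising model: each column
    is a tree (here a chain), sampled exactly given its neighbour columns."""
    R, C = x.shape
    for c in range(C):
        spins = 2.0 * x - 1.0
        f = h[:, c].copy()                   # effective field on column c
        if c > 0:
            f += J * spins[:, c - 1]
        if c < C - 1:
            f += J * spins[:, c + 1]
        u = np.stack([-f, f], axis=1)        # log unary potential per state
        x[:, c] = sample_chain(u, J, rng)
    return x

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(16, 16))
for _ in range(100):
    tree_gibbs_sweep(x, np.zeros((16, 16)), 0.4, rng)
print("final magnetisation:", (2.0 * x - 1.0).mean())
```

Because each column is drawn exactly from its full conditional, the sweep mixes much faster than single-site Gibbs at the same per-iteration cost scaling.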
252
Quantifying Urban and Agricultural Nonpoint Source Total Phosphorus Fluxes Using Distributed Watershed Models and Bayesian Inference / Wellen, Christopher Charles, 14 January 2014
Despite decades of research, the water quality of many lakes is impaired by excess total phosphorus loading. Four studies were undertaken using watershed models to understand the temporal and spatial variability of diffuse urban and agricultural total phosphorus pollution to Hamilton Harbour, Ontario, Canada. In the first study, a novel Bayesian framework was introduced to apply Spatially Referenced Regressions on Watershed Attributes (SPARROW) to catchments with few long-term load monitoring sites but many sporadic monitoring sites. The results included reasonable estimates of whole-basin total phosphorus load and recommendations to optimize future monitoring. In the second study, the static SPARROW model was extended to allow annual time-series estimates of watershed loads and the attendant source-sink processes. Results suggest that total phosphorus loads and source areas vary significantly at annual timescales. Further, the total phosphorus export rate of agricultural areas was estimated to be nearly twice that of urban areas. The third study presents a novel Bayesian framework that postulates that the watershed response to precipitation occurs in distinct states, which in turn are characterized by different model parameterizations. This framework is applied to Soil and Water Assessment Tool (SWAT) models of an urban creek (Redhill Creek) and an agricultural creek (Grindstone Creek) near Hamilton. The results suggest that during the limnological growing season (May to September), urban areas are responsible for the bulk of overland flow in both creeks: between 90% and 98% of all surface runoff in Redhill Creek, and between 95% and 99% in Grindstone Creek. In the fourth study, suspended sediment is used as a surrogate for total phosphorus. Despite disagreements regarding sediment source apportionment between three model applications, Bayesian model averaging allows an unambiguous identification of urban land uses as the main source of suspended sediments during the growing season. Taken together, these results suggest that multiple models must be used to arrive at a comprehensive understanding of total phosphorus loading. Further, while urban land uses may not be the primary source of sediment (and total phosphorus) loading annually, their source strength is increased relative to agricultural land uses during the growing season.
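The flavour of the second study's finding (agricultural export rates roughly twice urban ones) can be caricatured with a minimal Bayesian export-coefficient model. The sketch below is hypothetical: the subcatchment areas and loads are invented, and a plain random-walk Metropolis sampler stands in for the full SPARROW machinery.

```python
import numpy as np

rng = np.random.default_rng(42)

# Invented data: areas (km^2) of [urban, agricultural] land per monitored
# subcatchment, and observed annual total phosphorus loads (kg/yr).
areas = np.array([[12.0, 3.0], [2.0, 18.0], [7.0, 9.0], [1.0, 25.0]])
loads = np.array([370.0, 750.0, 510.0, 1000.0])

def log_post(log_beta, log_sigma):
    """Log-posterior for log export rates (kg/km^2/yr), lognormal
    observation error, and weak normal priors."""
    beta = np.exp(log_beta)
    mu = np.log(areas @ beta)                # predicted log-load
    sigma = np.exp(log_sigma)
    loglik = (-0.5 * np.sum(((np.log(loads) - mu) / sigma) ** 2)
              - loads.size * np.log(sigma))
    logprior = (-0.5 * np.sum((log_beta - 3.0) ** 2 / 4.0)
                - 0.5 * log_sigma ** 2)
    return loglik + logprior

# Random-walk Metropolis over (log_beta_urban, log_beta_agri, log_sigma).
theta = np.array([3.0, 3.0, -1.0])
lp = log_post(theta[:2], theta[2])
samples = []
for it in range(20000):
    prop = theta + 0.1 * rng.standard_normal(3)
    lp_prop = log_post(prop[:2], prop[2])
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    if it >= 5000:
        samples.append(np.exp(theta[:2]))
samples = np.array(samples)
print("posterior mean export rates [urban, agri]:", samples.mean(axis=0))
```

With these made-up loads, the posterior recovers an agricultural rate near double the urban one; the real studies add spatial referencing, attenuation terms, and state-dependent parameterizations on top of this basic structure.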
253
An application of Bayesian analysis in determining appropriate sample sizes for use in US Army operational tests / Cordova, Robert Lee, 08 1900
No description available.
254
A comparison of classical and Bayesian statistical analysis in operational testing / Coyle, Philip Vincent, 08 1900
No description available.
255
An application of Bayesian statistical methods in the determination of sample size for operational testing in the U.S. Army / Baker, Robert Michael, 08 1900
No description available.
256
Bayesian Methods for On-Line Gross Error Detection and Compensation / Gonzalez, Ruben, Unknown Date
No description available.
257
Bayesian optimal design for changepoint problems / Atherton, Juli, January 2007
We consider optimal design for changepoint problems, with particular attention paid to situations where the only possible change is in the mean. Optimal design for changepoint problems has previously been addressed only in an unpublished doctoral thesis and in a single journal article, which was in a frequentist setting. The simplest situation we consider is that of a stochastic process that may undergo a change at an unknown instant in some interval. The experimenter can take n measurements and is faced with one or more of the following optimal design problems: Where should these n observations be taken in order to best test for a change somewhere in the interval? Where should the observations be taken in order to best test for a change in a specified subinterval? Assuming that a change will take place, where should the observations be taken so that one may best estimate the before-change mean as well as the after-change mean? We take a Bayesian approach, with a risk based on squared error loss as a design criterion function for estimation, and a risk based on generalized 0-1 loss for testing. We also use the Spezzaferri design criterion function for model discrimination as an alternative criterion function for testing. By insisting that all observations be at least a minimum distance apart, in order to ensure rough independence, we find the optimal design for all three problems. We ascertain the optimal designs by writing the design criterion functions as functions of the design measure, rather than of the designs themselves. We then use the geometric form of the design measure space and the concavity of the criterion function to find the optimal design measure. There is a straightforward correspondence between the set of design measures and the set of designs. Our approach is similar in spirit, although rather different in detail, to that introduced by Kiefer. In addition, we consider design for estimation of the changepoint itself, and optimal designs for the multipath changepoint problem. We demonstrate why the former problem most likely has a prior-dependent solution, while the latter problems, in their most general settings, are complicated by the lack of concavity of the design criterion function. / In this dissertation we consider optimal Bayesian experimental designs for changepoint problems with a change in mean. A single-path changepoint problem with a change in mean arises when a sequence of data is collected along a temporal axis (or its equivalent) and the mean of the data changes value. This change, if it occurs, takes place at a point on the axis unknown to the experimenter; this point is called the "changepoint". The fact that the position of the changepoint is unknown makes testing and inference difficult in single-path changepoint situations.
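A toy numerical version of the third design question (placing observations to best estimate the before- and after-change means) illustrates the minimum-spacing constraint and the prior on the changepoint. All numbers below are illustrative assumptions, and the criterion is a simplified surrogate, not the thesis's: it averages, over the prior on the changepoint tau, the posterior variances of the two means that would result were tau revealed.

```python
import numpy as np
from itertools import combinations

# Toy setup: n observations in [0, 1]; mean changes from mu1 to mu2 at an
# unknown changepoint tau with a uniform prior on a grid; N(0, v) priors on
# the means; observation noise variance s2.
n, v, s2 = 4, 4.0, 1.0
tau_grid = np.linspace(0.05, 0.95, 19)       # prior support for tau
candidates = np.round(np.linspace(0.0, 1.0, 21), 2)  # allowed sites
min_gap = 0.1                                # enforce rough independence

def risk(design):
    """Surrogate Bayes risk: prior-averaged posterior variances of the
    before-change and after-change means, given the observation counts
    n1(tau), n2(tau) that a design induces."""
    total = 0.0
    for tau in tau_grid:
        n1 = np.sum(design < tau)
        n2 = design.size - n1
        total += 1.0 / (1.0 / v + n1 / s2) + 1.0 / (1.0 / v + n2 / s2)
    return total / tau_grid.size

best = min(
    (d for d in combinations(candidates, n)
     if all(b - a >= min_gap for a, b in zip(d, d[1:]))),
    key=lambda d: risk(np.array(d)),
)
print("optimal design:", np.round(best, 2), " risk:", risk(np.array(best)))
```

Even this crude criterion reproduces the qualitative behaviour one expects: the optimal sites spread so that, wherever the change falls, both segments retain enough observations to pin down their means.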
258
Bayesian framework for multiple acoustic source tracking / Zhong, Xionghu, January 2010
Acoustic source (speaker) tracking in the room environment plays an important role in many speech and audio applications such as multimedia, hearing aids, and hands-free speech communication and teleconferencing systems; the position information can be fed into a higher processing stage for high-quality speech acquisition, enhancement of a specific speech signal in the presence of other competing talkers, or keeping a camera focused on the speaker in a video-conferencing scenario. Most existing systems focus on the single-source tracking problem, which assumes that one and only one source is active all the time and that the state to be estimated is simply the source position. However, in practical scenarios multiple speakers may be simultaneously active, and the tracking algorithm should be able to localise each individual source and estimate the number of sources. This thesis contains three contributions towards solutions to multiple acoustic source tracking in a moderately noisy and reverberant environment.

The first contribution of this thesis is a time-difference of arrival (TDOA) estimation approach for multiple sources. Although the phase transform (PHAT) weighted generalised cross-correlation (GCC) method has been employed to extract the TDOAs of multiple sources, it is primarily used in single-source scenarios, and its performance for multiple-TDOA estimation has not been comprehensively studied. The proposed approach combines the degenerate unmixing estimation technique (DUET) and the GCC method. Since the speech mixtures are assumed window-disjoint orthogonal (WDO) in the time-frequency domain, the spectrograms can be separated by employing DUET, and the GCC method can then be applied to the spectrogram of each individual source. Probabilities of detection and false alarm are also proposed to evaluate the TDOA estimation performance under a series of experimental parameters.

Next, considering that multiple acoustic sources may appear nonconcurrently, an extended Kalman particle filter (EKPF) is developed for a special multiple acoustic source tracking problem, namely "nonconcurrent multiple acoustic tracking (NMAT)". The extended Kalman filter (EKF) is used to approximate the optimum weights, and the subsequent particle filtering (PF) naturally takes the previous position estimates as well as the current TDOA measurements into account. The proposed approach is thus able to lock onto a sharp change of the source position quickly and to avoid the tracking lag of the general sequential importance resampling (SIR) PF.

Finally, these investigations are extended into an approach to track an unknown and time-varying number of acoustic sources. The DUET-GCC method is used to obtain the TDOA measurements for multiple sources, and a random finite set (RFS) based Rao-Blackwellised PF is employed and modified to track the sources. Each particle has an RFS form encapsulating the states of all sources and is capable of addressing source dynamics: source survival, new source appearance, and source deactivation. A data association variable is defined to describe the source dynamics and their relation to the measurements. The Rao-Blackwellisation step is used to decompose the state: the source positions are marginalised out by using an EKF, and only the data association variable needs to be handled by a PF. The performance of all the proposed approaches is studied extensively under different noisy and reverberant environments, and compares favourably with existing tracking techniques.
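The PHAT-weighted GCC step at the core of the first contribution is compact enough to sketch. The following is a generic single-source GCC-PHAT TDOA estimator in Python, an assumed textbook form rather than the thesis's DUET-GCC combination; the sampling rate, delay, and noise level in the usage example are arbitrary.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """PHAT-weighted generalised cross-correlation between two microphone
    signals; returns the estimated delay of x2 relative to x1 (seconds)
    and the correlation function over candidate lags."""
    n = len(x1) + len(x2)
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12           # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n)
    max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    tau = (np.argmax(np.abs(cc)) - max_shift) / fs
    return tau, cc

# Toy usage: a delayed, noisy copy of a broadband signal.
fs = 16000
rng = np.random.default_rng(0)
s = rng.standard_normal(fs)
delay = 23                                   # samples
x1 = s + 0.05 * rng.standard_normal(fs)
x2 = np.concatenate((np.zeros(delay), s[:-delay])) + 0.05 * rng.standard_normal(fs)
tau, _ = gcc_phat(x1, x2, fs, max_tau=0.01)
print("estimated TDOA: %.5f s (true %.5f s)" % (tau, delay / fs))
```

The PHAT normalisation whitens the cross-spectrum so that the correlation peak depends on phase alone, which is what makes the method robust to moderate reverberation; applying it per separated spectrogram, as the thesis proposes, extends it to the multi-source case.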
259
Modelling severe asthma variation / Newby, Christopher James, January 2013
Asthma is a heterogeneous disease that is mostly managed successfully using bronchodilators and anti-inflammatory drugs. Around 10%-15% of asthmatics, however, have difficult or severe asthma which is less responsive to treatment. Asthma, and in particular severe asthma, is now thought of as a description of symptoms that may contain sub-groups with potentially different pathologies, which could be useful for targeting different drugs at different sub-groups. However, little statistical work has been carried out to determine these sub-phenotypes. Studies have been carried out to partition severe asthma variables into a number of sub-groups, but the algorithms used in these studies are not based on statistical inference, and it is difficult to select the best-fitting number of sub-groups using such methods. It is also unclear whether the clusters or sub-groups returned are actual sub-groups or reflect a larger non-normal distribution. In this thesis we developed a statistical model that combines factor analysis, a method used to obtain independent factors that describe processes by allowing for variation over variables, with infinite mixture modelling, a process that involves determining the most probable number of mixtures or clusters and thus allows for variation over individuals. The resulting model is a Dirichlet process normal mixture latent variable model (DPNMLVN), capable of determining the correct number of mixtures over each factor. The model was tested with simulations and used to analyse two severe asthma datasets and a cancer clinical trial. Sub-groups were found that reflect a high eosinophilic group and an average eosinophilic group, a late-onset, older, non-atopic group and a highly atopic, younger, early-onset group. In the clinical trial data, three distinct mixtures were found relating to existing biomarkers not used in the mixture analysis.
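The infinite-mixture side of such a model can be illustrated with a collapsed Gibbs sampler for a one-dimensional Dirichlet process mixture of normals, a much-reduced cousin of the DPNMLVN (no factor analysis, known within-cluster variance; all hyperparameters below are illustrative assumptions).

```python
import numpy as np

def norm_logpdf(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def dpm_gibbs(y, alpha=1.0, sigma2=1.0, tau2=4.0, iters=200, seed=0):
    """Collapsed Gibbs sampler for a Dirichlet-process mixture of normals
    with known within-cluster variance sigma2 and N(0, tau2) priors on the
    cluster means; returns the cluster labels from the final sweep."""
    rng = np.random.default_rng(seed)
    n = len(y)
    z = np.zeros(n, dtype=int)               # all points start in one cluster
    for _ in range(iters):
        for i in range(n):
            z[i] = -1                         # remove point i
            ks, counts = np.unique(z[z >= 0], return_counts=True)
            logp = []
            for k, c in zip(ks, counts):
                yk = y[z == k]
                # posterior predictive N(m, s2 + sigma2) for existing cluster k
                prec = 1.0 / tau2 + yk.size / sigma2
                m = (yk.sum() / sigma2) / prec
                logp.append(np.log(c) + norm_logpdf(y[i], m, 1.0 / prec + sigma2))
            # a brand-new cluster: predictive under the prior
            logp.append(np.log(alpha) + norm_logpdf(y[i], 0.0, tau2 + sigma2))
            logp = np.array(logp)
            p = np.exp(logp - logp.max()); p /= p.sum()
            choice = rng.choice(p.size, p=p)
            z[i] = ks[choice] if choice < ks.size else (ks.max() + 1 if ks.size else 0)
    return z

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(3.0, 1.0, 150)])
print("clusters found:", len(np.unique(dpm_gibbs(y))))
```

The number of occupied clusters is inferred rather than fixed in advance, which is exactly the property the thesis exploits to choose the number of sub-phenotypes per factor.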
260
Parallel Stochastic Estimation on Multicore Platforms / Rosén, Olov, January 2015
The main part of this thesis concerns parallelization of recursive Bayesian estimation methods, both linear and nonlinear. Recursive estimation deals with the problem of extracting information about parameters or states of a dynamical system, given noisy measurements of the system output, and plays a central role in signal processing, system identification, and automatic control. Solving the recursive Bayesian estimation problem is known to be computationally expensive, which often makes the methods infeasible in real-time applications and problems of large dimension. As the computational power of hardware is today increased by adding more processors on a single chip rather than by increasing the clock frequency and shrinking the logic circuits, parallelization is one of the most powerful ways of improving the execution time of an algorithm. The work in this thesis has found that several of the optimal filtering methods are suitable for parallel implementation in certain ranges of problem sizes. For many of the suggested parallelizations, a linear speedup in the number of cores has been achieved, providing up to an 8-times speedup on a dual quad-core computer. As the evolution of parallel computer architectures unfolds rapidly, many more processors on the same chip will soon become available. The developed methods do not, of course, scale infinitely, but they can certainly exploit and harness some of the computational power of the next generation of parallel platforms, allowing for optimal state estimation in real-time applications. / CoDeR-MP
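One pattern that makes such filters parallel-friendly is that particle propagation and weighting are independent across particles. The sketch below, a generic bootstrap particle filter on an invented scalar model rather than the thesis's implementation, farms those steps out to a process pool and keeps only the comparatively cheap resampling step serial.

```python
import numpy as np
from multiprocessing import Pool

def propagate_and_weight(args):
    """Propagate one block of particles through a toy scalar state-space
    model x_t = 0.9 x_{t-1} + noise and compute its log-weights for a
    measurement y = x + noise."""
    particles, y, q, r, seed = args
    rng = np.random.default_rng(seed)
    particles = 0.9 * particles + np.sqrt(q) * rng.standard_normal(particles.size)
    logw = -0.5 * (y - particles) ** 2 / r
    return particles, logw

def pf_step(particles, y, pool, step, q=0.1, r=0.5, blocks=4):
    """One bootstrap-PF step: the embarrassingly parallel propagation and
    weighting run on the pool; resampling stays serial."""
    chunks = np.array_split(particles, blocks)
    args = [(c, y, q, r, step * blocks + i) for i, c in enumerate(chunks)]
    out = pool.map(propagate_and_weight, args)
    particles = np.concatenate([p for p, _ in out])
    logw = np.concatenate([w for _, w in out])
    w = np.exp(logw - logw.max()); w /= w.sum()
    # systematic-style resampling with a fixed half-offset
    idx = np.searchsorted(np.cumsum(w), (np.arange(w.size) + 0.5) / w.size)
    return particles[np.minimum(idx, w.size - 1)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    particles = rng.standard_normal(100_000)
    with Pool(4) as pool:
        for t, y in enumerate([0.3, 0.5, 0.1, -0.2]):
            particles = pf_step(particles, y, pool, step=t)
    print("posterior mean estimate:", particles.mean())
```

Since the per-particle work dominates for large particle counts, this decomposition is the kind that admits the near-linear core scaling the abstract reports, up to the point where the serial resampling and inter-process communication begin to bite.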