  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
321

Predictive modelling and uncertainty quantification of UK forest growth

Lonsdale, Jack Henry January 2015
Forestry in the UK is dominated by coniferous plantations. Sitka spruce (Picea sitchensis) and Scots pine (Pinus sylvestris) are the most prevalent species and are mostly grown in single-age monoculture stands. The forest strategies for Scotland, England, and Wales all include efforts to achieve further afforestation, the aim being a multi-functional forest with a broad range of benefits. Because of the time scales involved in forestry, accurate forecasts of stand productivity, along with clearly defined uncertainties, are essential to forest managers. These can be provided by a range of approaches to modelling forest growth. In this project, model comparison, Bayesian calibration, and data assimilation methods were all used in an attempt to improve forecasts, and the understanding of their uncertainty, for the two most important conifers in UK forestry.

Three different forest growth models were compared in simulating growth of Scots pine: a yield table approach, the process-based 3PGN model, and a Stand Level Dynamic Growth (SLeDG) model. Predictions were compared graphically over the typical productivity range for Scots pine in the UK, and the strengths and weaknesses of each model were considered. All three produced similar growth trajectories. The greatest difference between models was in volume and biomass in unthinned stands, where the yield table predicted a much larger range than the other two models. Future advances in data availability and computing power should allow for greater use of process-based models; in the interim, more flexible dynamic growth models may be more useful than static yield tables for providing predictions which extend to non-standard management prescriptions and for estimating early growth and yield.

A Bayesian calibration of the SLeDG model was carried out for both Sitka spruce and Scots pine in the UK for the first time. Bayesian calibration allows both model structure and parameters to be assessed simultaneously in a probabilistic framework, providing a model with which forecasts and their uncertainty can be better understood and quantified using posterior probability distributions. Two different structures for including local productivity in the model were compared with a Bayesian model comparison, and a complete calibration of the more probable model structure was then completed. Example forecasts from the calibration were compatible with existing yield tables for both species. This method could be applied to other species or other model structures in the future.

Finally, data assimilation was investigated as a way of reducing forecast uncertainty. Data assimilation assumes that neither observations nor models provide a perfect description of a system, but that combining them may provide the best estimate. SLeDG model predictions and LiDAR measurements for sub-compartments within Queen Elizabeth Forest Park were combined with an Ensemble Kalman Filter. Uncertainty in all of the state variables was reduced following the second data assimilation. However, errors in stand delineation and estimated stand yield class may have made observational uncertainty greater, reducing the efficacy of the method for lowering overall uncertainty.
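To make the assimilation step above concrete, here is a minimal sketch of a generic Ensemble Kalman Filter analysis update. The state variables, observation operator, and all numbers are hypothetical stand-ins, not the configuration used in the thesis, which combined SLeDG forecasts with LiDAR observations.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_update(ensemble, obs, obs_op, obs_var):
    """One EnKF analysis step.

    ensemble : (n_members, n_state) forecast ensemble
    obs      : (n_obs,) observation vector (e.g. LiDAR-derived top height)
    obs_op   : (n_obs, n_state) linear observation operator H
    obs_var  : (n_obs,) observation error variances
    """
    n_members = ensemble.shape[0]
    R = np.diag(obs_var)
    # Sample covariance of the forecast ensemble
    P = np.cov(ensemble, rowvar=False)
    # Kalman gain K = P H^T (H P H^T + R)^-1
    K = P @ obs_op.T @ np.linalg.inv(obs_op @ P @ obs_op.T + R)
    # Perturbed observations keep the analysis-ensemble spread consistent
    perturbed = obs + rng.normal(0.0, np.sqrt(obs_var), size=(n_members, len(obs)))
    innovations = perturbed - ensemble @ obs_op.T
    return ensemble + innovations @ K.T

# Toy example: state = (top height m, basal area m^2/ha); only height is observed.
forecast = rng.normal([18.0, 30.0], [2.0, 4.0], size=(100, 2))
H = np.array([[1.0, 0.0]])
analysis = enkf_update(forecast, np.array([20.5]), H, np.array([0.5]))
print(forecast.std(axis=0), analysis.std(axis=0))  # spread shrinks after assimilation
```

Each assimilation shrinks the ensemble spread wherever the observations are informative, which is the reduction in state-variable uncertainty reported above.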
322

Statistical viability assessment of a photovoltaic system in the presence of data uncertainty

Clohessy, Chantelle May January 2017
This thesis investigates statistical techniques that can be used to improve estimates and methods in feasibility assessments of photovoltaic (PV) systems. The use of these techniques is illustrated in a case study of a 1 MW PV system proposed for the Nelson Mandela Metropolitan University South Campus in Port Elizabeth, South Africa. The results from the study provide strong support for the use of multivariate profile analysis and interval estimate plots for the assessment of solar resource data. A unique view of PV energy generation as a manufacturing process under statistical control is identified; this link between PV energy generation and process control is lacking in the literature and is exploited in this study. Variance component models are used to model power output and energy yield estimates of the proposed PV system. The variance components are simulated using Bayesian simulation techniques, and Bayesian tolerance intervals derived from them are used to determine what percentage of future power output and energy yield values fall within an interval with a certain probability. The estimated tolerance intervals were informative, providing expected power outputs and energy yields for a given month and specific season. The methods improve on current techniques used to assess the energy output of a system.
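As a rough illustration of how Bayesian tolerance intervals summarize future power output, the sketch below computes a one-sided (content, confidence) tolerance bound from posterior draws of a mean and variance. The draws here are simulated placeholders; in the thesis they would come from the variance component model's posterior.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Stand-in posterior draws of the monthly mean and total variance of
# power output (kW), e.g. from a variance-component MCMC run.
mu_draws = rng.normal(750.0, 10.0, size=20000)
var_draws = 40.0**2 * 20 / rng.chisquare(20, size=20000)

def lower_tolerance_bound(mu, var, content=0.90, confidence=0.95):
    """Bayesian lower (content, confidence) tolerance bound.

    For each posterior draw, the lower limit covering `content` of the
    future-output distribution is its (1 - content) quantile; taking the
    (1 - confidence) quantile of those limits gives a bound that holds
    with posterior probability `confidence`.
    """
    limits = mu + stats.norm.ppf(1.0 - content) * np.sqrt(var)
    return np.quantile(limits, 1.0 - confidence)

bound = lower_tolerance_bound(mu_draws, var_draws)
print(f"With 95% probability, 90% of future outputs exceed {bound:.1f} kW")
```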
323

Bayesian surrogates for functional response modeling and metamaterial rapid design

Guo, Xiao 01 January 2017
In many scientific and engineering research areas, Bayesian surrogate models are used to handle nonlinear data in regression and classification tasks. In this thesis, we consider a real-life problem, the functional response modeling and rapid design of metamaterials, for which we establish and test such models; to familiarize the reader with the subject, some fundamental electromagnetic physics is provided. Noting that dispersive data are usually in rational form, a two-stage modeling approach is proposed: in the first stage, a universal link function is formulated to rationally approximate the data with a few discrete parameters, namely poles and residues; these are then used to synthesize equivalent circuits, and surrogate models are applied to the circuit elements in the second stage. To provide a regression scheme, the classical Gaussian process (GP) is introduced, which proceeds by parameterizing a covariance function over continuous inputs and inferring hyperparameters from the training data. Two metamaterial prototypes illustrate the methodology of model building, and the results demonstrate the efficiency and precision of probabilistic predictions. One well-known problem with metamaterial functionality is its great variability in resonance identities, which leads to discrepancies in the approximation orders required to fit the data with rational functions. To give accurate predictions, both the approximation order and the circuit elements present must be inferred, by classification and regression respectively. An augmented Bayesian surrogate model, which integrates GP multiclass classification with Bayesian treed GP regression, is formulated to treat this unique physical phenomenon systematically, while keeping nonstationarity and computational complexity well under control. Finally, probabilistic assessment of the underlying uncertainties, one of the most advantageous properties of the Bayesian perspective, is discussed and demonstrated with detailed formulations and examples.
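A minimal sketch of the Gaussian process regression underlying the second-stage surrogate, assuming a squared-exponential covariance with fixed hyperparameters (in practice these would be inferred from the training data, as described above). The mapping from design geometry to a circuit-element value is a fabricated stand-in.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=0.3, signal_var=1.0):
    """Squared-exponential covariance between two sets of inputs."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X_train, y_train, X_test, noise_var=1e-4):
    """GP posterior mean and variance at test inputs."""
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = rbf_kernel(X_test, X_test).diagonal() - (v**2).sum(0)
    return mean, var

rng = np.random.default_rng(2)
geometry = rng.uniform(0, 1, size=(30, 2))  # e.g. normalized gap width, ring radius
element = np.sin(3 * geometry[:, 0]) + geometry[:, 1] ** 2  # stand-in circuit element
test = rng.uniform(0, 1, size=(5, 2))
mean, var = gp_predict(geometry, element, test)
print(mean, np.sqrt(np.clip(var, 0, None)))  # predictive mean and uncertainty
```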
324

Approximate inference of Bayesian networks through edge deletion

Thornton, Julie Ann January 1900
Master of Science / Department of Computing and Information Sciences / William Hsu / Bayesian networks are graphical models whose nodes represent random variables and whose edges represent conditional dependence between variables. Each node in a Bayesian network is equipped with a conditional probability function that expresses the likelihood that the node will take on different values given the values of its parents. A common task for a Bayesian network is to perform inference by computing the marginal probabilities of each possible value for each node. In this thesis, I introduce three new algorithms for approximate inference of Bayesian networks that use edge deletion techniques. The first reduces a network to its maximal weight spanning tree using the Kullback-Leibler information divergence as edge weights, and then runs Pearl’s algorithm on the resulting tree. Because Pearl’s algorithm can perform inference on a tree in linear time, as opposed to the exponential running time of all general exact inference algorithms, this reduction results in a tremendous speedup in inference. The second algorithm applies triangulation pre-processing rules that are guaranteed to be optimal if the original graph has a treewidth of four or less, and then deletes edges from the network and continues applying rules so that the resulting triangulated graph will have a maximum clique size of no more than five. The junction tree exact inference algorithm can then be run on the reduced triangulated graph. While the junction tree algorithm has an exponential worst-case running time in the size of the maximum clique in the triangulated graph, placing a bound on the clique size effectively places a polynomial time bound on the inference procedure. The third algorithm deletes edges from a triangulation of the original network until the maximum clique size in the triangulated graph is below a desired bound. Again, the junction tree algorithm can then be run on the resulting triangulated graph, and the bound on the maximum clique size will also polynomially bound the inference time. When tested for efficiency and accuracy on common Bayesian networks, these three algorithms perform up to 10,000 times faster than current exact and approximate techniques while achieving error values close to those of sampling techniques.
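The first algorithm's reduction step can be sketched as follows: weight each edge by the Kullback-Leibler divergence between the joint distribution of its endpoints and the product of their marginals (i.e., their mutual information), then keep the maximum-weight spanning tree. The pairwise joint tables below are hypothetical; a real implementation would estimate them from the network's conditional probability functions.

```python
import numpy as np
import networkx as nx

def mutual_information(joint):
    """KL divergence between a joint distribution and the product of its
    marginals, i.e. the mutual information of the two variables."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

# Hypothetical pairwise joint tables for three binary variables.
joints = {
    ("A", "B"): np.array([[0.40, 0.10], [0.10, 0.40]]),
    ("B", "C"): np.array([[0.30, 0.20], [0.20, 0.30]]),
    ("A", "C"): np.array([[0.25, 0.25], [0.25, 0.25]]),  # independent pair
}

G = nx.Graph()
for (u, v), joint in joints.items():
    G.add_edge(u, v, weight=mutual_information(joint))

tree = nx.maximum_spanning_tree(G)        # keep the most informative edges
print(sorted(tree.edges(data="weight")))  # the uninformative A-C edge is dropped
```

Pearl's message-passing algorithm can then run on the resulting tree in linear time, as the abstract describes.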
325

Evolution of HIV-1 subtype C gp120 envelope sequences in the female genital tract and blood plasma during acute and chronic infection

Ramdayal, Kavisha January 2014
Philosophiae Doctor - PhD / Heterosexual transmission of HIV-1 via the female genital tract is the leading route of HIV infection in sub-Saharan Africa. Viruses then traffic between the cervical compartment and blood, ensuring pervasive infection. Previous studies have, however, reported the existence of genetically diverse viral populations in various tissue types, each evolving under separate selective pressures within a single individual, though it is still unclear how compartmentalization dynamics change over acute and chronic infection in the absence of ARVs. To better characterize intra-host evolution and the movement of viruses between different anatomical tissue types, statistical and phylogenetic methods were used to reconstruct temporal dynamics between blood plasma and cervico-vaginal lavage (CVL) derived HIV-1 subtype C gp120 envelope sequences. A total of 206 cervical and 253 blood plasma sequences obtained from four treatment-naïve women enrolled in the CAPRISA Acute Infection study cohort in South Africa were evaluated for evidence of genotypic and phenotypic differences between viral populations from each tissue type up to 3.6 years post-infection. Evidence for tissue-specific differences in genetic diversity, V-loop length variation, codon-based selection, co-receptor usage, hypermutation, recombination and potential N-linked glycosylation (PNLG) site accumulation was investigated. Of the four participants studied, two, anonymously identified as CAP270 and CAP217, showed evidence of infection with a single HIV-1 variant, whereas CAP177 and CAP261 showed evidence of infection by more than one variant. As a result, genetic diversity, PNLG accumulation and the number of detectable recombination events along the gp120 env region were lowest in the former participants and highest in the latter. Overall, genetic diversity increased over the course of infection in all participants and correlated significantly with viral load measurements from the blood plasma in one of the four participants tested (CAP177). Employing a structured coalescent model approach, rates of viral migration between anatomical tissue types on time-measured genealogies were also estimated. No persistent evidence for the existence of separate viral populations in the cervix and blood plasma was found in any of the participants; instead, sequences generally clustered together by time point on Bayesian Maximum Clade Credibility (MCC) trees. Clades that were monophyletic by tissue type comprised mostly low-diversity or monotypic sequences from the same time point, consistent with bursts of viral replication. Tissue-specific monophyletic clades also generally contained few sequences and were interspersed among sequences from both tissue types. Tree- and distance-based statistical tests were employed to further evaluate the degree to which cervical and blood plasma viruses clustered together on Bayesian MCC trees, using the Slatkin-Maddison (S-M), Simmonds association index (AI), monophyletic clade (MC), Wright's measure of population subdivision (FST) and Hudson's nearest-neighbour (Snn) statistics, in the presence and absence of monotypic and low-diversity sequences. Statistical evidence for tissue-specific population structure disappeared or was greatly reduced in 3/5 of the longitudinal tree- and distance-based tests after the removal of monotypic and low-diversity sequences, except in CAP177 and CAP217.

Analysis of phenotypic differences between viral populations from the blood plasma and cervix revealed no consistent tissue-specific patterns in genetic diversity, codon-based selection, co-receptor usage, hypermutation, recombination, V-loop length variation or PNLG site accumulation during acute and chronic infection among the participants. There is therefore no evidence to support the existence of distinct viral populations within the blood plasma and cervical compartments longitudinally; however, slightly constrained populations may exist within the female genital tract at isolated time points, based on the statistical findings presented in this study.
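Of the statistics listed above, Hudson's nearest-neighbour statistic is simple enough to sketch directly: it asks how often a sequence's nearest neighbours come from its own compartment. The distance matrix and labels below are synthetic stand-ins for pairwise genetic distances between plasma and cervical sequences, not the study's data.

```python
import numpy as np

def snn_statistic(dist, labels):
    """Hudson's nearest-neighbour statistic S_nn.

    dist   : (n, n) pairwise genetic distance matrix
    labels : (n,) tissue labels, e.g. "plasma" / "cervical"
    Returns the mean fraction of each sequence's nearest neighbours that
    come from the same compartment; ~0.5 means no population structure.
    """
    n = len(labels)
    labels = np.asarray(labels)
    fractions = []
    for i in range(n):
        d = dist[i].copy()
        d[i] = np.inf                           # exclude self
        nearest = np.flatnonzero(d == d.min())  # ties share the credit
        fractions.append(np.mean(labels[nearest] == labels[i]))
    return float(np.mean(fractions))

# Toy check with synthetic distances: two tight same-tissue clusters.
rng = np.random.default_rng(3)
pts = np.concatenate([rng.normal(0, 1, (10, 5)), rng.normal(4, 1, (10, 5))])
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
labels = ["plasma"] * 10 + ["cervical"] * 10
print(snn_statistic(dist, labels))  # close to 1.0 => strong structure
```

A p-value is then typically obtained by permuting the tissue labels and recomputing the statistic, which is how removing monotypic sequences can change the verdict.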
326

HaMMLeT: An Infinite Hidden Markov Model with Local Transitions

Dawson, Colin Reimer January 2017
In classical mixture modeling, each data point is modeled as arising i.i.d. (typically) from a weighted sum of probability distributions. When data arise from different sources that may not give rise to the same mixture distribution, a hierarchical model can allow the source contexts (e.g., documents, sub-populations) to share components while assigning different weights across them (while perhaps coupling the weights to "borrow strength" across contexts). The Dirichlet Process (DP) Mixture Model (e.g., Rasmussen (2000)) is a Bayesian approach to mixture modeling which models the data as arising from a countably infinite number of components: the Dirichlet Process provides a prior on the mixture weights that guards against overfitting. The Hierarchical Dirichlet Process (HDP) Mixture Model (Teh et al., 2006) employs a separate DP Mixture Model for each context, but couples the weights across contexts. This coupling is critical to ensure that mixture components are reused across contexts. An important application of HDPs is to time series models, in particular Hidden Markov Models (HMMs), where the HDP can be used as a prior on a doubly infinite transition matrix for the latent Markov chain, giving rise to the HDP-HMM (first developed, as the "Infinite HMM", by Beal et al. (2001), and subsequently shown to be a case of an HDP by Teh et al. (2006)). There, the hierarchy is over rows of the transition matrix, and the distributions across rows are coupled through a top-level Dirichlet Process.

In the first part of the dissertation, I present a formal overview of Mixture Models and Hidden Markov Models. I then turn to a discussion of Dirichlet Processes and their various representations, as well as associated schemes for tackling the problem of doing approximate inference over an infinitely flexible model with finite computational resources. I then turn to the Hierarchical Dirichlet Process (HDP) and its application to an infinite state Hidden Markov Model, the HDP-HMM. These models have been widely adopted in Bayesian statistics and machine learning. However, a limitation of the vanilla HDP is that it offers no mechanism to model correlations between mixture components across contexts. This is limiting in many applications, including topic modeling, where we expect certain components to occur or not occur together. In the HMM setting, we might expect certain states to exhibit similar incoming and outgoing transition probabilities; that is, for certain rows and columns of the transition matrix to be correlated. In particular, we might expect pairs of states that are "similar" in some way to transition frequently to each other. The HDP-HMM offers no mechanism to model this similarity structure.

The central contribution of the dissertation is a novel generalization of the HDP-HMM which I call the Hierarchical Dirichlet Process Hidden Markov Model With Local Transitions (HDP-HMM-LT, or HaMMLeT for short), which allows for correlations between rows and columns of the transition matrix by assigning each state a location in a latent similarity space and promoting transitions between states that are near each other. I present a Gibbs sampling scheme for inference in this model, employing auxiliary variables to simplify the relevant conditional distributions, which have a natural interpretation after re-casting the discrete time Markov chain as a continuous time Markov Jump Process where holding times are integrated out, and where some jump attempts "fail".
I refer to this novel representation as the Markov Process With Failed Jumps. I test this model on several synthetic and real data sets, showing that for data where transitions between similar states are more common, the HaMMLeT model more effectively finds the latent time series structure underlying the observations.
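A heavily simplified sketch of the "local transitions" idea, assuming a truncated stick-breaking approximation: top-level state weights are rescaled by a Gaussian similarity kernel over latent state locations, so nearby states transition to each other more often. The full model additionally draws each row from a DP centred on the shared weights and performs Gibbs inference; this toy construction omits both.

```python
import numpy as np

rng = np.random.default_rng(4)

def stick_breaking(gamma, K):
    """Truncated GEM stick-breaking weights for the top-level DP."""
    betas = rng.beta(1.0, gamma, size=K)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

K = 8                                      # truncation level
beta = stick_breaking(gamma=2.0, K=K)      # shared top-level state weights
locs = rng.normal(0.0, 1.0, size=(K, 2))   # latent location of each state

# Local-transition kernel: transitions between nearby states are promoted.
sq_dists = ((locs[:, None, :] - locs[None, :, :]) ** 2).sum(-1)
similarity = np.exp(-0.5 * sq_dists)

trans = beta[None, :] * similarity         # HDP-like rows, scaled by proximity
trans /= trans.sum(axis=1, keepdims=True)

print(np.round(trans, 3))                  # row i: P(next state | current state i)
```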
327

Combining measurements with deterministic model outputs: predicting ground-level ozone

Liu, Zhong 05 1900
The main topic of this thesis is how to combine model outputs from deterministic models with measurements from monitoring stations for air pollutants or other meteorological variables. We consider two different approaches to this problem. The first uses the Bayesian Melding (BM) model proposed by Fuentes and Raftery (2005). We implement this model and conduct several simulation studies to examine its performance in different scenarios. We also apply the melding model to ozone data to show the importance of using it to calibrate the model outputs, that is, to adjust the model outputs for the prediction of measurements. Due to its Bayesian framework, the melding model can be extended to incorporate other components such as ensemble models and reversible jump MCMC for variable selection. However, the BM model is purely spatial, and in practice we generally have to deal with space-time datasets. This deficiency of the BM approach leads us to a second approach, an alternative to the BM model: a linear mixed model (unlike most linear mixed models, one whose random effects are spatially correlated) with temporally and spatially correlated residuals. We assume the spatial and temporal correlations are separable and use an AR process to model the temporal correlation. We also develop a multivariate version of this model. Both the melding model and the linear mixed model are Bayesian hierarchical models, which can better estimate the uncertainties of the estimates and predictions. / Science, Faculty of / Statistics, Department of / Graduate
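A small sketch of the separable space-time residual structure described above: an exponential spatial correlation across stations combined, via a Kronecker product, with an AR(1) temporal correlation. Station coordinates, the correlation range, the AR coefficient, and the residual variance are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

def exp_spatial_corr(coords, range_param):
    """Exponential spatial correlation between monitoring stations."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-d / range_param)

def ar1_temporal_corr(n_times, phi):
    """AR(1) correlation matrix: corr(t, s) = phi**|t - s|."""
    lags = np.abs(np.subtract.outer(np.arange(n_times), np.arange(n_times)))
    return phi**lags

stations = rng.uniform(0, 100, size=(6, 2))  # hypothetical station coordinates (km)
S = exp_spatial_corr(stations, range_param=30.0)
T = ar1_temporal_corr(n_times=24, phi=0.7)

# Separability: the full covariance is the Kronecker product of the two pieces.
sigma2 = 15.0                                # residual variance (ppb^2)
cov = sigma2 * np.kron(T, S)                 # (24*6) x (24*6)

resid = rng.multivariate_normal(np.zeros(cov.shape[0]), cov).reshape(24, 6)
print(resid.shape)                           # hourly ozone residuals at 6 stations
```

The Kronecker structure is what makes the separability assumption computationally attractive: the big covariance never has to be factorized directly, since its Cholesky and inverse also factor over the two pieces.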
328

Monte Carlo integration in discrete undirected probabilistic models

Hamze, Firas 05 1900
This thesis contains the author's work in and contributions to the field of Monte Carlo sampling for undirected graphical models, a class of statistical models commonly used in machine learning, computer vision, and spatial statistics; the aim is to be able to use the methodology and resultant samples to estimate integrals of functions of the variables in the model. Over the course of the study, three different but related methods were proposed and have appeared as research papers. The thesis consists of an introductory chapter discussing the models considered, the problems involved, and a general outline of Monte Carlo methods. The three subsequent chapters contain versions of the published work.

The second chapter, which has appeared in (Hamze and de Freitas 2004), presents new MCMC algorithms for computing the posterior distributions and expectations of the unknown variables in undirected graphical models with regular structure. For demonstration purposes, we focus on Markov Random Fields (MRFs). By partitioning the MRFs into non-overlapping trees, it is possible to compute the posterior distribution of a particular tree exactly by conditioning on the remaining tree. These exact solutions allow us to construct efficient blocked and Rao-Blackwellised MCMC algorithms. We show empirically that tree sampling is considerably more efficient than other partitioned sampling schemes and the naive Gibbs sampler, even in cases where loopy belief propagation fails to converge. We prove that tree sampling exhibits lower variance than the naive Gibbs sampler and other naive partitioning schemes using the theoretical measure of maximal correlation. We also construct new information theory tools for comparing different MCMC schemes and show that, under these, tree sampling is more efficient.

Although the work discussed in Chapter 2 exhibited promise on the class of graphs to which it was suited, there are many cases where limiting the topology is quite a handicap. The work in Chapter 3 was an exploration of an alternative methodology for approximating functions of variables representable as undirected graphical models of arbitrary connectivity with pairwise potentials, as well as for estimating the notoriously difficult partition function of the graph. The algorithm, published in (Hamze and de Freitas 2005), fits into the framework of sequential Monte Carlo methods rather than the more widely used MCMC, and relies on constructing a sequence of intermediate distributions that get closer to the desired one. While the idea of using "tempered" proposals is known, we construct a novel sequence of target distributions where, rather than dropping a global temperature parameter, we sequentially couple individual pairs of variables that are, initially, sampled exactly from a spanning tree of the variables. We present experimental results on inference and estimation of the partition function for sparse and densely connected graphs.

The final contribution of this thesis, presented in Chapter 4 and also in (Hamze and de Freitas 2007), emerged from empirical observations made while trying to optimize the sequence of edges to add to a graph so as to guide the population of samples to the high-probability regions of the model. Most important among these observations was that while several heuristic approaches, discussed in Chapter 1, certainly yielded improvements over edge sequences consisting of random choices, strategies based on forcing the particles to take large, biased random walks in the state-space resulted in a more efficient exploration, particularly at low temperatures. This motivated a new Monte Carlo approach to treating complex discrete distributions. The algorithm is motivated by the N-Fold Way, an ingenious event-driven MCMC sampler that avoids rejection moves at any specific state. The N-Fold Way can, however, get "trapped" in cycles. We surmount this problem by modifying the sampling process to result in biased state-space paths of randomly chosen length. This alteration does introduce bias, but the bias is subsequently corrected with a carefully engineered importance sampler. / Science, Faculty of / Computer Science, Department of / Graduate
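The key primitive in the tree-sampling scheme of Chapter 2 is drawing an exact sample from a tree-shaped block conditioned on the rest of the field. The sketch below does this for the simplest tree, a chain of Ising spins, by forward filtering and backward sampling; in a blocked Gibbs sweep, conditioning on neighbouring trees would simply add terms to the local fields h. The chain length, coupling, and fields are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(6)

def sample_ising_chain(h, J):
    """Draw an exact sample of a +-1 Ising chain with fields h and coupling J.

    The forward pass computes messages m_t(x_t) summing over x_1..x_{t-1};
    the backward pass then samples x_n, x_{n-1}, ..., x_1 exactly.
    """
    n = len(h)
    states = np.array([-1.0, 1.0])
    msgs = np.zeros((n, 2))
    msgs[0] = np.exp(h[0] * states)
    for t in range(1, n):
        pair = np.exp(J * states[:, None] * states[None, :])  # x_{t-1} vs x_t
        msgs[t] = np.exp(h[t] * states) * (msgs[t - 1] @ pair)
        msgs[t] /= msgs[t].sum()  # normalize to avoid overflow
    x = np.zeros(n)
    p = msgs[-1] / msgs[-1].sum()
    x[-1] = states[rng.choice(2, p=p)]
    for t in range(n - 2, -1, -1):
        p = msgs[t] * np.exp(J * states * x[t + 1])
        x[t] = states[rng.choice(2, p=p / p.sum())]
    return x

# Chain of 10 spins, ferromagnetic coupling, a field pulling spin 0 upward.
h = np.zeros(10)
h[0] = 2.0
print(sample_ising_chain(h, J=1.0))
```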
329

Tolerance intervals for variance component models using a Bayesian simulation procedure

Sarpong, Abeam Danso January 2013
The estimation of variance components serves as an integral part of the evaluation of variation, and is of interest and required in a variety of applications (Hugo, 2012). Estimation of the among-group variance components is often desired for quantifying the variability and effectively understanding these measurements (Van Der Rijst, 2006). The methodology for determining Bayesian tolerance intervals for the one-way random effects model was originally proposed by Wolfinger (1998), using both informative and non-informative prior distributions (Hugo, 2012); Wolfinger (1998) also provided relationships with frequentist methodologies. From a Bayesian point of view, it is important to investigate and compare the effect on coverage probabilities when negative variance components are either replaced by zero or completely disregarded from the simulation process. This research presents a simulation-based approach for determining Bayesian tolerance intervals in variance component models under these two treatments of negative variance components. The approach handles different kinds of tolerance intervals in a straightforward fashion: it uses a computer-generated sample (a Monte Carlo process) from the joint posterior distribution of the mean and variance parameters to construct samples from other relevant posterior distributions. This research makes use of only non-informative Jeffreys' prior distributions and uses three Bayesian simulation methods. Comparative results for the different tolerance intervals obtained when negative variance components are either replaced by zero or completely disregarded from the simulation process are investigated and discussed.
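A minimal sketch of the kind of comparison described above, using a Wolfinger (1998)-style simulation for the balanced one-way random effects model under Jeffreys priors. The mean squares and design sizes are made up; the point is the two treatments of negative between-group draws.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical one-way random effects summary: a groups, n observations each.
a, n = 10, 5
msb, mse = 30.0, 8.0   # between- and within-group mean squares
n_draws = 50000

# Simulate from the posterior: the sums of squares divided by chi-square
# draws give the error variance and the between-group total, respectively.
sigma_e2 = (a * (n - 1)) * mse / rng.chisquare(a * (n - 1), n_draws)
sigma_a2 = ((a - 1) * msb / rng.chisquare(a - 1, n_draws) - sigma_e2) / n

# Strategy 1: replace negative between-group draws by zero.
zeroed = np.where(sigma_a2 < 0, 0.0, sigma_a2)
# Strategy 2: completely disregard draws where the component is negative.
kept = sigma_a2[sigma_a2 >= 0]

print(f"negative draws: {np.mean(sigma_a2 < 0):.3f}")
print(f"posterior mean (zeroed):    {zeroed.mean():.2f}")
print(f"posterior mean (discarded): {kept.mean():.2f}")
```

Tolerance intervals built from the two cleaned samples then differ, which is exactly the coverage-probability comparison the thesis investigates.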
330

Bayesian methods in music modelling

Peeling, Paul Halliday January 2011
This thesis presents several hierarchical generative Bayesian models of musical signals designed to improve the accuracy of existing multiple pitch detection systems and other musical signal processing applications whilst remaining feasible for real-time computation. At the lowest level the signal is modelled as a set of overlapping sinusoidal basis functions. The parameters of these basis functions are built into a prior framework based on principles known from musical theory and the physics of musical instruments. The model of a musical note optionally includes phenomena such as frequency and amplitude modulations, damping, volume, timbre and inharmonicity. The occurrence of note onsets in a performance of a piece of music is controlled by an underlying tempo process and the alignment of the timings to the underlying score of the music. A variety of applications are presented for these models under differing inference constraints. Where full Bayesian inference is possible, reversible-jump Markov Chain Monte Carlo is employed to estimate the number of notes and partial frequency components in each frame of music. We also use approximate techniques such as model selection criteria and variational Bayes methods for inference in situations where computation time is limited or the amount of data to be processed is large. For the higher-level score parameters, greedy search and conditional modes algorithms are found to be sufficiently accurate. We emphasize the links between the models and inference algorithms developed in this thesis and those in existing and parallel work, and demonstrate the effects of making modifications to these models both theoretically and by means of experimental results.
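At the lowest level of the hierarchy described above, a note frame is a sum of sinusoidal partials. The sketch below renders such a frame with damping and stiff-string inharmonicity (f_k = k·f0·sqrt(1 + B·k²)); the amplitude law, damping rate, and inharmonicity coefficient are illustrative choices, not the thesis's priors.

```python
import numpy as np

rng = np.random.default_rng(8)

def note_frame(f0, n_partials, dur, sr=16000, inharmonicity=1e-4, damping=3.0):
    """Render one frame of a note as a sum of damped sinusoidal partials.

    Partial frequencies follow the stiff-string rule
    f_k = k * f0 * sqrt(1 + B * k^2); amplitudes fall off with partial
    number (a crude timbre model) and decay exponentially in time.
    """
    t = np.arange(int(dur * sr)) / sr
    frame = np.zeros_like(t)
    for k in range(1, n_partials + 1):
        f_k = k * f0 * np.sqrt(1.0 + inharmonicity * k**2)
        amp = (1.0 / k) * np.exp(-damping * t)        # timbre + damping
        phase = rng.uniform(0, 2 * np.pi)
        frame += amp * np.cos(2 * np.pi * f_k * t + phase)
    return frame + rng.normal(0, 0.01, size=t.shape)  # observation noise

x = note_frame(f0=440.0, n_partials=8, dur=0.25)
print(x.shape)
```

In the thesis's setting, inference runs this generative direction in reverse: reversible-jump MCMC proposes adding or removing notes and partials to explain an observed frame.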
