121

The Geometry of Data: Distance on Data Manifolds

Chu, Casey 01 January 2016 (has links)
The increasing importance of data in the modern world has created a need for new mathematical techniques to analyze this data. We explore and develop the use of geometry—specifically differential geometry—as a means for such analysis, in two parts. First, we provide a general framework to discover patterns contained in time series data using a geometric framework of assigning distance, clustering, and then forecasting. Second, we attempt to define a Riemannian metric on the space containing the data in order to introduce a notion of distance intrinsic to the data, providing a novel way to probe the data for insight.
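A minimal sketch of the distance-then-cluster step described above, using plain Euclidean distance between delay-embedded windows as a stand-in for the manifold-based distance developed in the thesis; the series, window length, and cluster count are illustrative assumptions, not taken from the work.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20, 400)) + 0.1 * rng.standard_normal(400)

window = 20
# Each row is one delay-embedded "point"; nearby rows trace out the data manifold.
points = np.array([series[i:i + window] for i in range(len(series) - window)])

# Pairwise distances (Euclidean here; the thesis develops manifold-aware metrics).
dists = pdist(points, metric="euclidean")
clusters = fcluster(linkage(dists, method="ward"), t=3, criterion="maxclust")
print(np.bincount(clusters)[1:])   # sizes of the discovered pattern groups
```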
122

Smart Classifiers and Bayesian Inference for Evaluating River Sensitivity to Natural and Human Disturbances: A Data Science Approach

Underwood, Kristen 01 January 2018 (has links)
Excessive rates of channel adjustment and riverine sediment export represent societal challenges; impacts include: degraded water quality and ecological integrity, erosion hazards to infrastructure, and compromised public safety. The nonlinear nature of sediment erosion and deposition within a watershed and the variable patterns in riverine sediment export over a defined timeframe of interest are governed by many interrelated factors, including geology, climate and hydrology, vegetation, and land use. Human disturbances to the landscape and river networks have further altered these patterns of water and sediment routing. An enhanced understanding of river sediment sources and dynamics is important for stakeholders, and will become more critical under a nonstationary climate, as sediment yields are expected to increase in regions of the world that will experience increased frequency, persistence, and intensity of storm events. Practical tools are needed to predict sediment erosion, transport and deposition and to characterize sediment sources within a reasonable measure of uncertainty. Water resource scientists and engineers use multidimensional data sets of varying types and quality to answer management-related questions, and the temporal and spatial resolution of these data are growing exponentially with the advent of automated samplers and in situ sensors (i.e., “big data”). Data-driven statistics and classifiers have great utility for representing system complexity and can often be more readily implemented in an adaptive management context than process-based models. Parametric statistics are often of limited efficacy when applied to data of varying quality, mixed types (continuous, ordinal, nominal), censored or sparse data, or when model residuals do not conform to Gaussian distributions. Data-driven machine-learning algorithms and Bayesian statistics have advantages over Frequentist approaches for data reduction and visualization; they allow for non-normal distribution of residuals and greater robustness to outliers. This research applied machine-learning classifiers and Bayesian statistical techniques to multidimensional data sets to characterize sediment source and flux at basin, catchment, and reach scales. These data-driven tools enabled better understanding of: (1) basin-scale spatial variability in concentration-discharge patterns of instream suspended sediment and nutrients; (2) catchment-scale sourcing of suspended sediments; and (3) reach-scale sediment process domains. The developed tools have broad management application and provide insights into landscape drivers of channel dynamics and riverine solute and sediment export.
123

Statistical Analysis and Modeling of PM2.5 Speciation Metals and Their Mixtures

Ibrahimou, Boubakari 10 November 2014 (has links)
Exposure to fine particulate matter (PM2.5) in the ambient air is associated with various health effects. There is increasing evidence implicating the central role played by specific chemical components of PM2.5, such as heavy metals. Given that humans are exposed to complex mixtures of environmental pollutants such as PM2.5, research efforts are intensifying to study the mixture composition and emission sources of ambient PM and the exposure-related health effects. Factor analysis, as well as source apportionment models, are statistical tools potentially useful for characterizing mixtures in PM2.5. However, classic factor analysis is designed to analyze samples of independent data. To handle (spatio-)temporally correlated PM2.5 data, a Bayesian approach is developed and, using source apportionment, a latent factor is converted to a mixture by utilizing loadings to compute mixture coefficients. Additionally, there have been intensified efforts in studying the metal composition and variation in ambient PM as well as its association with health outcomes. We use nonparametric smoothing methods to study the spatio-temporal patterns and variation of common PM metals and their mixtures. Lastly, the risk of low birth weight following exposure to metal mixtures during pregnancy is investigated.
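As an illustration of converting factor loadings into mixture coefficients, here is a sketch using classical (non-Bayesian, independence-assuming) factor analysis on simulated metal concentrations; the metal names, the two-source simulation, and the normalization of absolute loadings are assumptions for demonstration, not the dissertation's Bayesian model.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n_days, metals = 200, ["Fe", "Zn", "Pb", "Ni", "Cu", "Mn"]

# Two hypothetical sources mix into the observed daily concentrations plus noise.
sources = rng.gamma(2.0, 1.0, size=(n_days, 2))
profiles = rng.uniform(0, 1, size=(2, len(metals)))
X = sources @ profiles + 0.1 * rng.standard_normal((n_days, len(metals)))

fa = FactorAnalysis(n_components=2).fit(X)
loadings = np.abs(fa.components_)                     # factors x metals
mixture = loadings / loadings.sum(axis=1, keepdims=True)   # rows sum to one
for k, row in enumerate(mixture):
    print(f"factor {k} mixture coefficients:", dict(zip(metals, row.round(2))))
```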
124

Statistical Modeling and Prediction of HIV/AIDS Prognosis: Bayesian Analyses of Nonlinear Dynamic Mixtures

Lu, Xiaosun 10 July 2014 (has links)
Statistical analyses and modeling have contributed greatly to our understanding of the pathogenesis of HIV-1 infection; they also provide guidance for the treatment of AIDS patients and the evaluation of antiretroviral (ARV) therapies. Various statistical methods, nonlinear mixed-effects models in particular, have been applied to model CD4 and viral load trajectories. A common assumption in these methods is that all patients come from a homogeneous population following one mean trajectory. This assumption unfortunately obscures important characteristic differences between subgroups of patients whose responses to treatment and whose disease trajectories are biologically different. It may also lack robustness against population heterogeneity, resulting in misleading or biased inference. Finite mixture models, also known as latent class models, are commonly used to model non-predetermined heterogeneity in a population; they provide an empirical representation of heterogeneity by grouping the population into a finite number of latent classes and modeling the population through a mixture distribution. For each latent class, a finite mixture model allows individuals to vary around their own mean trajectory, instead of a common one shared by all classes. Furthermore, a mixture model has the ability to cluster and estimate class membership probabilities at both the population and individual levels. This important feature may help physicians to better understand a particular patient's disease progression and refine the therapeutic strategy in advance. In this research, we developed mixture dynamic models and related Bayesian inferences via Markov chain Monte Carlo (MCMC). One real data set from HIV/AIDS clinical management and another from a clinical trial were used to illustrate the proposed models and methods. This dissertation explored three topics. First, we modeled CD4 trajectories using a finite mixture model with four distinct components whose mean functions are based on the Michaelis-Menten function. Relevant covariates, both baseline and time-varying, were considered, and model comparison and selection were based on such criteria as the Deviance Information Criterion (DIC). The class membership model was allowed to depend on covariates for prediction. Second, we explored disease status prediction in HIV/AIDS using the latent class membership model. Third, we modeled viral load trajectories using a finite mixture model with three components whose mean functions are based on published HIV dynamic systems. Although this research is motivated by HIV/AIDS studies, the basic concepts and methods developed here have much broader applications in the management of other chronic diseases; they can also be applied to dynamic systems in other fields. Implementation of our methods using the publicly available WinBUGS package suggests that our approach can be made quite accessible to practicing statisticians and data analysts.
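The Michaelis-Menten-based mixture can be illustrated with a small sketch: two hypothetical response classes with Michaelis-Menten mean CD4 trajectories, and the posterior class-membership probability for one simulated patient under Gaussian noise. All parameter values are invented for illustration, and the full mixed-effects, MCMC-based machinery of the dissertation is omitted.

```python
import numpy as np

def michaelis_menten(t, vmax, km):
    # Mean trajectory of the form Vmax * t / (Km + t).
    return vmax * t / (km + t)

t = np.arange(1, 13)                        # months on treatment
classes = [dict(vmax=600, km=3.0),          # hypothetical "good responder" class
           dict(vmax=300, km=6.0)]          # hypothetical "poor responder" class
prior = np.array([0.5, 0.5])                # prior class membership probabilities
sigma = 60.0                                # assumed measurement noise (cells/mm^3)

rng = np.random.default_rng(2)
cd4 = michaelis_menten(t, **classes[0]) + sigma * rng.standard_normal(len(t))

# Gaussian log-likelihood of the observed series under each class mean.
loglik = np.array([
    -0.5 * np.sum((cd4 - michaelis_menten(t, **c)) ** 2) / sigma**2
    for c in classes
])
post = prior * np.exp(loglik - loglik.max())
post /= post.sum()
print("P(class | data):", post.round(3))    # should favour class 0
```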
125

Bayesian wavelet approaches for parameter estimation and change point detection in long memory processes

Ko, Kyungduk 01 November 2005 (has links)
The main goal of this research is to estimate the model parameters and to detect multiple change points in the long memory parameter of Gaussian ARFIMA(p, d, q) processes. Our approach is Bayesian and inference is done in the wavelet domain. Long memory processes have been widely used in many scientific fields such as economics, finance and computer science. Wavelets have a strong connection with these processes. The ability of wavelets to simultaneously localize a process in the time and scale domains results in representing many dense variance-covariance matrices of the process in a sparse form. A wavelet-based Bayesian estimation procedure for the parameters of a Gaussian ARFIMA(p, d, q) process is proposed. This entails calculating the exact variance-covariance matrix of the given ARFIMA(p, d, q) process and transforming it into the wavelet domain using the two-dimensional discrete wavelet transform (DWT2). The Metropolis algorithm is used for sampling the model parameters from their posterior distributions. Simulations with different values of the parameters and of the sample size are performed. A real data application to the U.S. GNP data is also reported. Detection and estimation of multiple change points in the long memory parameter is also investigated. Reversible jump MCMC is used for posterior inference. Performance is evaluated on simulated data and on the Nile River dataset.
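For intuition, the sketch below works directly in the time domain rather than the wavelet domain used in the thesis: it computes the exact ARFIMA(0, d, 0) autocovariance and takes a single random-walk Metropolis step for d under an assumed flat prior on (0, 0.5). The data, proposal step size, and prior are placeholders, not the thesis' setup.

```python
import numpy as np
from scipy.special import gammaln
from scipy.linalg import toeplitz

def arfima_acov(d, n, sigma2=1.0):
    # Exact ARFIMA(0, d, 0) autocovariances: gamma(0) = sigma2*Gamma(1-2d)/Gamma(1-d)^2,
    # then gamma(k) = gamma(k-1) * (k - 1 + d) / (k - d).
    g = np.empty(n)
    g[0] = sigma2 * np.exp(gammaln(1 - 2 * d) - 2 * gammaln(1 - d))
    for k in range(1, n):
        g[k] = g[k - 1] * (k - 1 + d) / (k - d)
    return g

def loglik(d, x):
    # Gaussian log-likelihood with the Toeplitz covariance matrix implied by d.
    S = toeplitz(arfima_acov(d, len(x)))
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (logdet + x @ np.linalg.solve(S, x))

rng = np.random.default_rng(3)
x = rng.standard_normal(200)          # placeholder data; a real series would show long memory
d_cur, ll_cur = 0.2, loglik(0.2, x)
d_prop = d_cur + 0.02 * rng.standard_normal()
if 0.0 < d_prop < 0.5 and np.log(rng.uniform()) < loglik(d_prop, x) - ll_cur:
    d_cur = d_prop                    # accept; a flat prior on (0, 0.5) is assumed
print("current draw of d:", round(d_cur, 3))
```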
126

Rationing & Bayesian expectations with application to the labour market

Förster, Hannah January 2006 (has links)
The first goal of the present work addresses the need for different rationing methods of the Global Change and Financial Transition (GFT) working group at the Potsdam Institute for Climate Impact Research (PIK): I provide a toolbox which contains a variety of rationing methods to be applied to micro-economic disequilibrium models of the lagom model family. This toolbox consists of well-known rationing methods and of rationing methods developed specifically for lagom. To ensure easy application, the toolbox is constructed in modular fashion. The second goal of the present work is to present a micro-economic labour market where heterogeneous labour suppliers experience consecutive job opportunities and need to decide whether to apply for employment. The labour suppliers are heterogeneous with respect to their qualifications and their beliefs about the application behaviour of their competitors. They learn simultaneously, in Bayesian fashion, about their individual perceived probability of obtaining employment conditional on application (PPE) by observing each other's application behaviour over a cycle of job opportunities. / In the present work I address two things. First, I develop a modelling toolbox that contains various rationing methods. These rationing methods are either known from the literature or were developed specifically for the lagom model family. Second, I show that rationing methods from the toolbox can be used to model a fictitious labour market on which job-seeking agents act who are heterogeneous with respect to their qualifications and their beliefs about the application behaviour of their competitors. They experience consecutive job offers and observe the application behaviour of their competitors in order to learn, in Bayesian fashion, about their individual probability of obtaining a position.
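Two of the ideas above can be sketched briefly: a simple proportional rationing rule (one of the well-known methods a toolbox like this might contain) and a Beta-Binomial update of an agent's perceived probability of employment. Both are generic illustrations, not code from the lagom toolbox.

```python
import numpy as np

def proportional_rationing(demands, capacity):
    """Scale individual demands so their total does not exceed capacity."""
    total = demands.sum()
    return demands if total <= capacity else demands * capacity / total

applications = np.array([1.0, 1.0, 1.0, 1.0, 1.0])   # five applicants, one unit each
jobs = 3.0
print(proportional_rationing(applications, jobs))     # each applicant is rationed to 0.6

# Bayesian learning of PPE: start from a Beta(1, 1) prior, observe hiring outcomes
# over a cycle of job opportunities, and update the posterior after each one.
alpha, beta = 1.0, 1.0
for hired in [0, 1, 0, 0, 1]:          # illustrative outcome sequence
    alpha += hired
    beta += 1 - hired
print("posterior mean PPE:", alpha / (alpha + beta))
```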
127

Phylogenetic Studies in the Lamiales with Special Focus on Scrophulariaceae and Stilbaceae

Kornhall, Per January 2004 (has links)
This thesis deals with plants from the flowering plant order Lamiales, and especially the two families Scrophulariaceae and Stilbaceae. Both families have their main geographical distribution in southern Africa. The thesis presents phylogenies of Scrophulariaceae s. lat. that can be used as a framework both for a future formal classification of the Scrophulariaceae and of allied taxa. A new circumscription of the tribe Manuleeae of Scrophulariaceae is presented including also genera earlier placed in the tribe Selagineae (sometimes recognised as a family of its own, Selaginaceae). Manuleeae now consists of the genera: Barthlottia, Chaenostoma, Chenopodiopsis, Dischisma, Glekia, Globulariopsis, Glumicalyx, Gosela, Hebenstretia, Jamesbrittenia, Limosella, Lyperia, Manulea, Manuleopsis, Melanospermum, Phyllopodium, Polycarena, Pseudoselago, Reyemeia, Selago, Strobilopsis, Sutera, Tetraselago, Trieenea and Zaluzianskya. The genera Sutera and Selago are given new circumscriptions; Sutera is divided into two genera, Sutera and Chaenostoma. Selago is circumscribed to contain also taxa that formerly have been placed in Microdon and Cromidon. A new circumscription and infrafamiliar classification of the family Stilbaceae is also presented. Stilbaceae will consist of the three tribes: Bowkerieae, consisting of the genera Anastrabe, Bowkeria and Ixianthes; Hallerieae, consisting of Charadrophila and Halleria; and Stilbeae, consisting of Nuxia and Stilbe. Furthermore, the genera Campylostachys, Euthystachys, Kogelbergia and Retzia are all included in the genus Stilbe. The results in the thesis are based on parsimony and Bayesian phylogenetic inferences of DNA sequence data. Further, morphological characters are analysed and compared to the molecular phylogenies.
128

Bayesian Modeling of Conditional Densities

Li, Feng January 2013 (has links)
This thesis develops models and associated Bayesian inference methods for flexible univariate and multivariate conditional density estimation. The models are flexible in the sense that they can capture widely differing shapes of the data. The estimation methods are specifically designed to achieve flexibility while still avoiding overfitting. The models are flexible both for a given covariate value and across covariate space. A key contribution of this thesis is that it provides general approaches to density estimation with highly efficient Markov chain Monte Carlo methods. The methods are illustrated on several challenging non-linear and non-normal datasets. In the first paper, a general model is proposed for flexibly estimating the density of a continuous response variable conditional on a possibly high-dimensional set of covariates. The model is a finite mixture of asymmetric Student-t densities with covariate-dependent mixture weights. The four parameters of the components, the mean, degrees of freedom, scale and skewness, are all modeled as functions of the covariates. The second paper explores how well a smooth mixture of symmetric components can capture skewed data. Simulations and applications on real data show that including covariate-dependent skewness in the components can lead to substantially improved performance on skewed data, often using a much smaller number of components. We also introduce smooth mixtures of gamma and log-normal components to model positively-valued response variables. In the third paper we propose a multivariate Gaussian surface regression model that combines both additive splines and interactive splines, and a highly efficient MCMC algorithm that updates all the multi-dimensional knot locations jointly. We use shrinkage priors to avoid overfitting, with different estimated shrinkage factors for the additive and surface parts of the model, and also different shrinkage parameters for the different response variables. In the last paper we present a general Bayesian approach for directly modeling dependencies between variables as functions of explanatory variables in a flexible copula context. In particular, the Joe-Clayton copula is extended to have covariate-dependent tail dependence and correlations. Posterior inference is carried out using a novel and efficient simulation method. The appendix of the thesis documents the computational implementation details. / At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 3: In press. Paper 4: Manuscript.
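A minimal sketch of the "smooth mixture" idea from the first two papers: a conditional density whose Student-t components and softmax mixture weights both depend on a covariate. The coefficient values and two-component structure are invented for illustration; the thesis estimates such models with MCMC rather than fixing parameters.

```python
import numpy as np
from scipy.stats import t as student_t

def smooth_mixture_pdf(y, x):
    # Covariate-dependent mixture weights via a multinomial-logit (softmax) link.
    logits = np.array([0.0, 1.5 * x - 0.5])
    w = np.exp(logits - logits.max())
    w /= w.sum()
    # Component parameters (here only the means) also move with the covariate.
    comps = [dict(df=5, loc=1.0 + 0.8 * x, scale=0.7),
             dict(df=8, loc=-1.0 + 0.2 * x, scale=1.2)]
    return sum(wk * student_t.pdf(y, **ck) for wk, ck in zip(w, comps))

# The conditional density of the same response value changes with the covariate.
print(round(smooth_mixture_pdf(y=0.5, x=1.0), 4))
print(round(smooth_mixture_pdf(y=0.5, x=-1.0), 4))
```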
129

Nonparametric Bayesian Context Learning for Buried Threat Detection

Ratto, Christopher Ralph January 2012 (has links)
This dissertation addresses the problem of detecting buried explosive threats (i.e., landmines and improvised explosive devices) with ground-penetrating radar (GPR) and hyperspectral imaging (HSI) across widely-varying environmental conditions. Automated detection of buried objects with GPR and HSI is particularly difficult due to the sensitivity of sensor phenomenology to variations in local environmental conditions. Past approaches have attempted to mitigate the effects of ambient factors by designing statistical detection and classification algorithms to be invariant to such conditions. These methods have generally taken the approach of extracting features that exploit the physics of a particular sensor to provide a low-dimensional representation of the raw data for characterizing targets from non-targets. A statistical classification rule is then usually applied to the features. However, it may be difficult for feature extraction techniques to adapt to the highly nonlinear effects of near-surface environmental conditions on sensor phenomenology, as well as to re-train the classifier for use under new conditions. Furthermore, the search for an invariant set of features ignores the possibility that one approach may yield the best performance under one set of terrain conditions (e.g., dry), and another might be better for another set of conditions (e.g., wet).

An alternative approach to improving detection performance is to consider exploiting differences in sensor behavior across environments rather than mitigating them, and to treat changes in the background data as a possible source of supplemental information for the task of classifying targets and non-targets. This approach is referred to as context-dependent learning.

Although past researchers have proposed context-based approaches to detection and decision fusion, the definition of context used in this work differs from those used in the past. In this work, context is motivated by the physical state of the world from which an observation is made, and not by properties of the observation itself. The proposed context-dependent learning technique therefore utilized additional features that characterize soil properties from the sensor background, and a variety of nonparametric models were proposed for clustering these features into individual contexts. The number of contexts was assumed to be unknown a priori and was learned via Bayesian inference using Dirichlet process priors.

The learned contextual information was then exploited by an ensemble of classifiers trained for classifying targets in each of the learned contexts. For GPR applications, the classifiers were trained to perform algorithm fusion. For HSI applications, the classifiers were trained to perform band selection. The detection performance of all proposed methods was evaluated on data from U.S. government test sites. Performance was compared to several algorithms from the recent literature, several of which have been deployed in fielded systems. Experimental results illustrate the potential for context-dependent learning to improve detection performance of GPR and HSI across varying environments.
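The Dirichlet-process idea behind the context learning can be illustrated with a Chinese restaurant process sketch, which shows how the number of contexts is left open rather than fixed a priori. The concentration parameter and observation count below are arbitrary assumptions, and the per-context classifiers (algorithm fusion for GPR, band selection for HSI) are only indicated in a comment.

```python
import numpy as np

rng = np.random.default_rng(4)
alpha = 1.0                      # concentration parameter of the Dirichlet process
assignments = []
counts = []                      # counts[k] = number of observations in context k

for n in range(50):              # 50 background-feature observations
    # Join an existing context with probability proportional to its size,
    # or open a new context with probability proportional to alpha.
    probs = np.array(counts + [alpha], dtype=float)
    probs /= probs.sum()
    k = rng.choice(len(probs), p=probs)
    if k == len(counts):
        counts.append(1)         # a brand-new context is created
    else:
        counts[k] += 1
    assignments.append(k)

print("contexts discovered:", len(counts), "sizes:", counts)
# In the dissertation, a separate classifier (algorithm fusion rule or band
# selection) would then be trained for each discovered context.
```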
130

Invariant Procedures for Model Checking, Checking for Prior-Data Conflict and Bayesian Inference

Jang, Gun Ho 13 August 2010 (has links)
We consider a statistical theory as being invariant when the results of two statisticians' independent data analyses, based upon the same statistical theory and using effectively the same statistical ingredients, are the same. We discuss three aspects of invariant statistical theories. First, both model checking and checking for prior-data conflict are assessments of a single null hypothesis without any specific alternative hypothesis. Hence, we conduct these assessments using a measure of surprise based on a discrepancy statistic. For the discrete case, it is natural to use the probability of obtaining a data point that is less probable than the observed data. For the continuous case, the natural analog of this is not invariant under equivalent choices of discrepancies. A new method is developed to obtain an invariant assessment. This approach also allows several discrepancies to be combined into one discrepancy via a single P-value. Second, Bayesians have developed many noninformative priors that are supposed to contain no information concerning the true parameter value. Many of these are data dependent or improper, which can lead to a variety of difficulties. Gelman (2006) introduced the notion of weak informativity as a compromise between informative and noninformative priors, without a precise definition. We give a precise definition of weak informativity using a measure of prior-data conflict that assesses whether or not a prior places its mass around the parameter values having relatively high likelihood. In particular, we say a prior Pi_2 is weakly informative relative to another prior Pi_1 whenever Pi_2 leads to fewer prior-data conflicts a priori than Pi_1. This leads to a precise quantitative measure of how much less informative a weakly informative prior is. Third, in Bayesian data analysis, highest posterior density inference is a commonly used method. This approach is not invariant to the choice of dominating measure or to reparametrizations. We explore properties of relative surprise inferences suggested by Evans (1997). Relative surprise inferences, which compare the belief changes from a priori to a posteriori, are invariant under reparametrizations. We mainly focus on the connection of relative surprise inferences to classical Bayesian decision theory as well as important optimalities.
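The discrete measure of surprise mentioned above has a direct computation; here is a sketch under an assumed Binomial(20, 0.5) null model, which is an illustrative choice rather than an example from the thesis.

```python
from scipy.stats import binom

# Probability of observing a data point that is less probable than the one
# actually observed, under the Binomial(20, 0.5) null model.
n, p, observed = 20, 0.5, 16
pmf_obs = binom.pmf(observed, n, p)
surprise = sum(binom.pmf(k, n, p) for k in range(n + 1)
               if binom.pmf(k, n, p) < pmf_obs)
print(round(surprise, 4))   # a small value signals surprise under the null
```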
