141

Lois a priori non-informatives et la modélisation par mélange / Non-informative priors and modelization by mixtures

Kamary, Kaniav 15 March 2016 (has links)
One of the major applications of statistics is the validation and comparison of probabilistic models in the light of data. This branch of statistics has developed since its formalization at the end of the 19th century by pioneers such as Gosset, Pearson, and Fisher. In the special case of the Bayesian approach, the solution to model comparison is the Bayes factor, a ratio of marginal likelihoods, whatever the models under evaluation. This solution is obtained by a mathematical argument based on a loss function. Despite the frequent use of the Bayes factor, and of its equivalent, the posterior probability of a model, by the Bayesian community, it is problematic in some situations. First, the Bayes factor depends strongly on the prior modeling, even with large datasets (or, equivalently, it lacks an absolute calibration). Since the selection of a prior distribution plays a vital role in Bayesian statistics, one difficulty with the traditional handling of Bayesian tests is the discontinuity in the use of improper priors, which are not justified in most testing situations. The first part of this thesis gives a general review of non-informative priors and their features, and demonstrates the overall stability of the resulting posterior distributions by reassessing the examples of [Seaman III 2012].

A second, independent problem is that the Bayes factor is difficult to compute except in the simplest cases (conjugate distributions). A branch of computational statistics has therefore developed to resolve this problem, with solutions borrowed from statistical physics, such as the path sampling method of [Gelman 1998], and from signal processing. The existing solutions are not universal, however, and a reassessment of these methods, followed by the development of alternatives, constitutes a part of this thesis. We therefore consider a novel paradigm for Bayesian hypothesis testing and model comparison, defining an alternative to the traditional construction of posterior probabilities that a given hypothesis is true or that the data originate from a specific model. The idea is to regard the models under comparison as components of a mixture model. Replacing the original testing problem with an estimation problem that focuses on the probability weight of a given model within the mixture, we analyze the sensitivity of the resulting posterior distribution of the weights to various prior modelings of those weights, and stress that a major appeal of this perspective is that generic improper priors are acceptable, while not putting convergence in jeopardy. MCMC methods such as the Metropolis-Hastings algorithm and the Gibbs sampler, together with empirical approximations of the posterior probability, are used for this purpose. From a computational viewpoint, another feature of this easily implemented alternative to the classical Bayesian solution is that the convergence rates of the posterior mean of the weight and of the corresponding posterior probability are quite similar.

In the last part of the thesis, we construct a reference Bayesian analysis of mixtures of Gaussian distributions by creating a new parameterization centered on the mean and variance of the mixture model itself. This enables us to develop a genuinely non-informative prior for Gaussian mixtures with an arbitrary number of components. We demonstrate that the posterior distribution associated with this prior is almost surely proper and provide MCMC implementations that exhibit the expected exchangeability between components. The analyses rely on MCMC methods such as the Metropolis-within-Gibbs algorithm, adaptive MCMC, and parallel tempering. This part of the thesis is complemented by a description of the R package Ultimixt, which implements a generic reference Bayesian analysis of unidimensional Gaussian mixtures obtained through the location-scale parameterization of the model. The package can be used to produce a Bayesian analysis of Gaussian mixtures with an arbitrary number of components, with no need to specify a prior distribution.
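As a rough illustration of the mixture-estimation test described in this abstract (not code from the thesis), the sketch below embeds two competing models, M1 = N(0, 1) and M2 = N(mu, 1), in the mixture alpha*M1 + (1 - alpha)*M2 and Gibbs-samples the weight alpha. The toy models, the Beta(0.5, 0.5) prior on alpha, and the vague proper N(0, 10^2) prior on mu (standing in where the thesis allows generic improper priors) are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy data, actually generated from M1 = N(0, 1).
x = rng.normal(0.0, 1.0, size=100)
n = x.size

# Mixture encompassing model: alpha * N(0, 1) + (1 - alpha) * N(mu, 1),
# with alpha ~ Beta(a0, a0); a0 = 0.5 is an illustrative choice.
a0 = 0.5
alpha, mu = 0.5, 1.0
keep = []

for it in range(5000):
    # 1. Allocate each observation to a component (z_i = True -> M2).
    p1 = alpha * stats.norm.pdf(x, 0.0, 1.0)
    p2 = (1.0 - alpha) * stats.norm.pdf(x, mu, 1.0)
    z = rng.random(n) < p2 / (p1 + p2)
    n2 = z.sum()

    # 2. alpha | z ~ Beta(a0 + n1, a0 + n2) by Beta-binomial conjugacy.
    alpha = rng.beta(a0 + n - n2, a0 + n2)

    # 3. mu | z, x is Gaussian under the vague N(0, 10^2) prior on mu
    #    (a proper stand-in that keeps this step defined when n2 == 0).
    prec = n2 + 1.0 / 100.0
    mu = rng.normal(x[z].sum() / prec, 1.0 / np.sqrt(prec))

    keep.append(alpha)

# A posterior mean of alpha near 1 favours M1; no Bayes factor is computed.
print("posterior mean of alpha:", np.mean(keep[1000:]))
```

Conjugacy makes both the alpha-step and the mu-step exact draws, which is one reason this alternative to the Bayes factor is easy to implement.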
142

DETERMINING MACROSCOPIC TRANSPORT PARAMETERS AND MICROBIOTA RESPONSE USING MACHINE LEARNING TECHNIQUES

Miad Boodaghidizaji (15339991) 27 April 2023 (has links)
Determining macroscopic properties such as diffusivity, concentration, and viscosity is of paramount importance to many engineering applications. The determination of macroscopic properties from experimental or numerical data is a challenging task because of the inverse nature of these problems. Data-analytic techniques, together with recent advances in machine learning and optimization, have enabled tackling problems that were once considered impossible to solve. In this thesis, we focus on using Bayesian and state-of-the-art machine learning techniques to solve four problems that involve calculation of macroscopic transport properties.

i) We developed a Bayesian approach to estimate the diffusion coefficient of rhodamine 6G in breast cancer spheroids. Determining the diffusivity of drugs in tumors is crucial to understanding drug resistance, particularly in breast cancer tumors. To this end, we invoked Bayesian inference to determine the light attenuation coefficient and the diffusion coefficient in breast cancer spheroids for Rhodamine 6G (R6G), a mock drug for the tyrosine kinase inhibitor Neratinib. We observed that the diffusion coefficient does not vary noticeably across a HER2+ breast cancer cell line as a function of transglutaminase 2 levels, even in the presence of fibroblast cells.

ii) We developed a multi-fidelity model to predict the rheological properties of a suspension of fibers using neural networks and Gaussian processes. Determining the rheological properties of fiber suspensions is indispensable to many industrial applications. To this end, multi-fidelity Gaussian processes and neural networks were used to predict the apparent viscosity. Results indicated that, with tuned hyperparameters, both the multi-fidelity Gaussian processes and the neural networks produce highly accurate predictions, with the neural networks performing marginally better.

iii) We developed machine learning models to analyze measles, mumps, rubella, and varicella (MMRV) vaccines using Raman and absorption spectra. Monitoring the concentration of viral particles is indispensable to producing vaccines or antiviral medications. To this end, we designed and optimized convolutional neural network and random forest models to map spectroscopic signals to concentration values. Results indicated that prediction accuracy is higher when the joint Raman-absorption signals are used for training, with the random forest model performing marginally better.

iv) We developed four machine learning models (random forest, support vector machine, artificial neural networks, and convolutional neural networks) to classify diseases using gut microbiota data. We distinguished between Parkinson's disease, Crohn's disease (CD), ulcerative colitis (UC), human immunodeficiency virus (HIV), and healthy control (HC) subjects in the presence and absence of fiber treatments. Our analysis demonstrated that machine learning can distinguish between healthy and non-healthy cases, in addition to predicting the four diseases, with very high accuracy.
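As a hedged sketch of the spectra-to-concentration mapping in item iii), the snippet below trains a random forest on synthetic stand-ins for joint Raman and absorption spectra. The peak shapes, noise levels, and dataset sizes are invented for illustration and are not the thesis's MMRV data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic stand-ins for joint Raman + absorption spectra (500 points each)
# whose peak heights scale with the underlying concentration.
n, k = 400, 500
conc = rng.uniform(0.1, 10.0, n)                  # "viral concentration"
grid = np.linspace(0.0, 1.0, k)
peak = np.exp(-((grid - 0.3) ** 2) / 0.002)       # fixed, invented peak shape
raman = conc[:, None] * peak + rng.normal(0.0, 0.05, (n, k))
absorb = conc[:, None] * peak[::-1] + rng.normal(0.0, 0.05, (n, k))
X = np.hstack([raman, absorb])                    # joint signal, as in iii)

X_tr, X_te, y_tr, y_te = train_test_split(X, conc, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("held-out R^2:", r2_score(y_te, rf.predict(X_te)))
```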
143

Branching Out with Mixtures: Phylogenetic Inference That’s Not Afraid of a Little Uncertainty / Förgreningar med mixturer: Fylogenetisk inferens som inte räds lite osäkerhet

Molén, Ricky January 2023 (has links)
Phylogeny, the study of evolutionary relationships among species and other taxa, plays a crucial role in understanding the history of life. Bayesian analysis using Markov chain Monte Carlo (MCMC) is a widely used approach for inferring phylogenetic trees, but it converges slowly, especially in higher dimensions. This thesis explores variational inference (VI), a methodology expected to improve the speed and accuracy of phylogenetic inference. VI models, however, are known to concentrate the density of the learned approximation in high-likelihood areas. This thesis evaluates the current state of variational Bayesian phylogenetic inference (VBPI) and proposes an extension that uses a mixture of components to improve VBPI's performance on complex datasets and multimodal latent spaces. Additionally, we cover the basics of phylogenetics to provide a comprehensive understanding of the field.
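As a minimal sketch of why a mixture variational family helps on multimodal posteriors (the motivation for the extension above), the snippet below compares Monte Carlo ELBO estimates for a single Gaussian and a two-component mixture on a bimodal 1-D target. The target and variational families are toy assumptions, not the VBPI tree model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# A bimodal 1-D "posterior", standing in for a multimodal tree posterior.
def log_p(z):
    return np.logaddexp(np.log(0.5) + stats.norm.logpdf(z, -2.0, 0.5),
                        np.log(0.5) + stats.norm.logpdf(z, 2.0, 0.5))

def elbo_single(mu, sigma, m=50_000):
    z = rng.normal(mu, sigma, m)
    return np.mean(log_p(z) - stats.norm.logpdf(z, mu, sigma))

def elbo_mixture(mus, sigmas, w, m=50_000):
    # For q = sum_k w_k q_k, E_q[f] = sum_k w_k E_{q_k}[f].
    def log_q(z):
        return np.logaddexp.reduce(
            [np.log(wk) + stats.norm.logpdf(z, mk, sk)
             for wk, mk, sk in zip(w, mus, sigmas)])
    total = 0.0
    for wk, mk, sk in zip(w, mus, sigmas):
        z = rng.normal(mk, sk, m)
        total += wk * np.mean(log_p(z) - log_q(z))
    return total

print("single Gaussian ELBO:", elbo_single(0.0, 2.0))
print("mixture ELBO        :", elbo_mixture([-2.0, 2.0], [0.5, 0.5], [0.5, 0.5]))
```

Because the two-component mixture can match both modes, its ELBO approaches 0 (KL near zero), while a single Gaussian must either straddle the modes or ignore one of them and pays for it in the bound.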
144

Science Based Human Reliability Analysis: Using Digital Nuclear Power Plant Simulators for Human Reliability Research

Shirley, Rachel B. 23 October 2017 (has links)
No description available.
145

Population connectivity: combining methods for estimating avian dispersal and migratory linkages

Ibarguen, Siri B. 30 March 2004 (has links)
No description available.
146

Distribution-based Approach to Take Advantage of Automatic Passenger Counter Data in Estimating Period Route-level Transit Passenger Origin-Destination Flows: Methodology Development, Numerical Analyses and Empirical Investigations

Ji, Yuxiong 21 March 2011 (has links)
No description available.
147

Bayesian Analysis of Temporal and Spatio-temporal Multivariate Environmental Data

El Khouly, Mohamed Ibrahim 09 May 2019 (has links)
High-dimensional space-time datasets are nowadays available in many areas of life, such as the economy, agriculture, health, and the environment. Meanwhile, it is challenging to reveal possible connections between climate change and extreme weather events such as hurricanes or tornadoes. In particular, the relationship between tornado occurrence and climate change has remained elusive. Moreover, modeling multivariate spatio-temporal data is computationally expensive. There is a great need for computationally feasible models that account for temporal, spatial, and inter-variable dependence. Our research addresses these areas in two ways. First, we investigate connections between changes in tornado risk and the increase in atmospheric instability over Oklahoma. Second, we propose two multiscale spatio-temporal models, one for multivariate Gaussian data and the other for matrix-variate Gaussian data. These frameworks are novel additions to the existing literature on Bayesian multiscale models. In addition, we have proposed parallelizable MCMC algorithms to sample from the posterior distributions of the model parameters with enhanced computational efficiency. / Doctor of Philosophy / Over 1000 tornadoes are reported every year in the United States, causing massive losses of life and property, according to the National Oceanic and Atmospheric Administration. It is therefore worthwhile to investigate possible connections between climate change and tornado occurrence. However, the relevant environmental datasets are massive and three- or four-dimensional (two- or three-dimensional space, plus time), and the relationship between tornado occurrence and climate change has remained elusive. Moreover, it is computationally expensive to analyze such high-dimensional space-time datasets. In part of our research, we have found a significant relationship between the occurrence of strong tornadoes over Oklahoma and meteorological variables, some of which have been affected by ozone depletion and greenhouse gas emissions. Additionally, we propose two Bayesian frameworks to analyze multivariate space-time datasets with fast and feasible computations. Finally, our analyses indicate distinct patterns of temperature at different atmospheric altitudes, with distinctive rates, over the United States.
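The abstract does not detail the proposed parallelizable MCMC algorithms, so the sketch below shows only the generic pattern of running independent Metropolis chains in parallel for a stand-in N(0, 1) posterior; the target, proposal scale, and chain count are assumptions for illustration, not the thesis's multiscale samplers.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def run_chain(seed, n_iter=5000):
    """One random-walk Metropolis chain for a stand-in N(0, 1) posterior."""
    rng = np.random.default_rng(seed)
    log_post = lambda th: -0.5 * th ** 2
    th, out = 0.0, np.empty(n_iter)
    for i in range(n_iter):
        prop = th + rng.normal(0.0, 1.0)
        if np.log(rng.random()) < log_post(prop) - log_post(th):
            th = prop
        out[i] = th
    return out

if __name__ == "__main__":
    # Independent chains are embarrassingly parallel; in a multiscale model,
    # conditionally independent blocks could be updated concurrently too.
    with ProcessPoolExecutor() as pool:
        chains = np.stack(list(pool.map(run_chain, range(4))))
    draws = chains[:, 2500:]                     # discard burn-in
    print("posterior mean:", draws.mean(), " sd:", draws.std())
```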
148

On Death in the Mesolithic : Or the Mortuary Practices of the Last Hunter-Gatherers of the South-Western Iberian Peninsula, 7th–6th Millennium BCE

Peyroteo Stjerna, Rita January 2016 (has links)
The history of death is entangled with the history of changing social values, meaning that a shift in attitudes to death will be consistent with changes in a society’s world view. Late Mesolithic shell middens in the Tagus and Sado valleys, Portugal, constitute some of the largest and earliest burial grounds known, arranged and maintained by people with a hunting, fishing, and foraging lifestyle, c 6000–5000 cal BCE. These sites have been interpreted in the light of economic and environmental processes as territorial claims to establish control over limited resources. This approach does not explain the significance of the frequent disposal of the dead in neighbouring burial grounds, and how these places were meaningful and socially recognized. The aim of this dissertation is to answer these questions through the detailed analysis of museum collections of human burials from these sites, excavated between the late nineteenth century and the 1960s. I examine the burial activity of the last hunter-gatherers of the south-western Iberian Peninsula from an archaeological perspective, and explain the burial phenomenon through the lens of historical and humanist approaches to death and hunter-gatherers, on the basis of theoretical concepts of social memory, place, mortuary ritual practice, and historical processes. Human burials are investigated in terms of time and practice based on the application of three methods: radiocarbon dating and Bayesian analysis to define the chronological framework of the burial activity at each site and valley; stable isotope analysis of carbon and nitrogen aimed at defining the burial populations by the identification of dietary choices; and archaeothanatology to reconstruct and define central practices in the treatment of the dead. This dissertation provides new perspectives on the role and relevance of the shell middens in the Tagus and Sado valleys. Hunter-gatherers frequenting these sites were bound by shared social practices, which included the formation and maintenance of burial grounds, as a primary means of history making. Death rituals played a central role in the life of these hunter-gatherers in developing a sense of community, as well as maintaining social ties in both life and death.
149

Modelos de regressão estáticos e dinâmicos para taxas ou proporções: uma abordagem bayesiana / Static and dynamic regression models for rates or proportions: a Bayesian approach

Correia, Leandro Tavares 01 June 2015 (has links)
This work studies data with responses restricted to a bounded interval, specifically [0,1], as is the case for rates and proportions. In many practical settings such data contain a non-negligible number of extreme values (0 and 1), and the usual models are not suitable for their analysis. For this situation we propose, within a Bayesian framework, zero-and-one-inflated beta (BIZU) regression models and doubly censored Tobit regression models adapted to this interval. Goodness-of-fit and diagnostic techniques are also discussed. We then analyze this data structure in a time-series context through the Bayesian approach of dynamic models. Forecasting and behavioral analyses of the time series are carried out using sequential Monte Carlo techniques known as particle filters. The particularities of, and the competition between, the two classes of models are also discussed.
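As a hedged illustration of the sequential Monte Carlo side of this abstract, the sketch below runs a bootstrap particle filter on a simulated dynamic model for proportions. The logit random-walk state, the plain Beta observation model (which omits the zero/one inflation handled by the BIZU models), and all constants are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Simulated dynamic model for proportions: latent random walk eta_t,
# mu_t = logistic(eta_t), y_t ~ Beta(mu_t * phi, (1 - mu_t) * phi).
T, phi, sig = 100, 30.0, 0.15
eta = np.cumsum(rng.normal(0.0, sig, T))
mu = 1.0 / (1.0 + np.exp(-eta))
y = rng.beta(mu * phi, (1.0 - mu) * phi)

# Bootstrap particle filter for the latent state.
N = 2000
parts = rng.normal(0.0, 1.0, N)
filt = np.empty(T)
for t in range(T):
    parts = parts + rng.normal(0.0, sig, N)          # propagate particles
    m = 1.0 / (1.0 + np.exp(-parts))
    logw = stats.beta.logpdf(y[t], m * phi, (1.0 - m) * phi)
    w = np.exp(logw - logw.max())                    # weight by likelihood
    w /= w.sum()
    filt[t] = np.sum(w * m)                          # filtered E[mu_t | y_1:t]
    parts = parts[rng.choice(N, N, p=w)]             # multinomial resampling

print("mean abs error of filtered proportions:", np.abs(filt - mu).mean())
```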
