1

On Estimating Topology and Divergence Times in Phylogenetics

Svennblad, Bodil, January 2008
This PhD thesis consists of an introduction and five papers dealing with statistical methods in phylogenetics.

A phylogenetic tree describes the evolutionary relationships among species under the assumptions that they share a common ancestor and that evolution takes place in a tree-like manner. Our aim is to reconstruct the evolutionary relationships from aligned DNA sequences.

In the first two papers we investigate two measures of confidence for likelihood-based methods: bootstrap frequencies with maximum likelihood (ML) and Bayesian posterior probabilities. We show that an earlier claimed approximate equivalence between them holds under certain conditions, but not in the current implementations of the two methods.

In the following two papers the divergence times of the internal nodes are considered. The ML estimate of the divergence time of the root improves if longer sequences are analyzed or if more taxa are added. We show that the gain in precision is faster with longer sequences than with more taxa. We also show that the algorithm of the software package PATHd8 may give biased estimates if the global molecular clock is violated, and we therefore suggest a change to the algorithm to obtain unbiased estimates.

The last paper deals with non-informative priors in the Bayesian approach to phylogenetics, a term that is not uniquely defined in the literature. We adopt the idea of data-translated likelihoods and derive the so-called Jeffreys' prior for branch lengths under the Jukes-Cantor model of evolution.
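As a hedged illustration of the ingredients of that last derivation (this sketch is ours, not material from the thesis): Jeffreys' prior is defined through the Fisher information, and in the simplest two-taxon case under the Jukes-Cantor (JC69) model, where the number of differing sites out of n follows a binomial distribution, it takes a closed form.

```latex
% Illustrative sketch, not text from the thesis. General definition:
\[
  \pi(\theta) \propto \sqrt{\det I(\theta)}, \qquad
  I(\theta) = -\mathbb{E}\!\left[ \frac{\partial^2 \log L(\theta)}{\partial \theta^2} \right].
\]
% Under JC69, a site differs across a branch of length t with probability
\[
  p(t) = \tfrac{3}{4}\left( 1 - e^{-4t/3} \right),
\]
% so, with a binomial likelihood for the number of differing sites,
% the Fisher information is n\,p'(t)^2 / \bigl(p(t)(1-p(t))\bigr), giving
\[
  \pi(t) \propto \frac{p'(t)}{\sqrt{p(t)\,\bigl(1 - p(t)\bigr)}}
        = \frac{e^{-4t/3}}{\sqrt{p(t)\,\bigl(1 - p(t)\bigr)}}.
\]
```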
2

A method to establish non-informative prior probabilities for risk-based decision analysis

Min, Namhong, 28 April 2014
In Bayesian decision analysis, uncertainty and risk are accounted for with probabilities for the possible states, or states of nature, that affect the outcome of a decision. Application of Bayes' theorem requires non-informative prior probabilities, which represent the probabilities of the states of nature for a decision maker under complete ignorance. These prior probabilities are then updated with any and all available information in assessing probabilities for making decisions. The conventional approach to the non-informative probability distribution is based on Bernoulli's principle of insufficient reason, which assigns a uniform distribution to uncertain states when a decision maker has no information about the states of nature. The principle of insufficient reason has three difficulties: it may inadvertently provide a biased starting point for decision making, it does not provide a consistent set of probabilities, and it violates reasonable axioms of decision theory.

The first objective of this study is to propose and describe a new method to establish non-informative prior probabilities for decision making under uncertainty. The proposed decision-based method focuses on decision outcomes, including preferences among decision alternatives and decision consequences.

The second objective is to evaluate the logic and rationality of the proposed decision-based method. The decision-based method overcomes the three weaknesses associated with the principle of insufficient reason and provides an unbiased starting point for decision making. It also produces consistent non-informative probabilities. Finally, the decision-based method satisfies the axioms of decision theory that characterize the case of no information (or complete ignorance).

The third and final objective is to demonstrate the application of the decision-based method to practical decision-making problems in engineering. Four major practical implications are illustrated and discussed with these examples. First, the method is practical: it is feasible in decisions with a large number of decision alternatives and states of nature, and it is applicable to both continuous and discrete random variables with finite and infinite ranges. Second, the method provides an objective way to establish non-informative prior probabilities that capture a highly nonlinear relationship between states of nature. Third, any available information can be included through Bayes' theorem by updating the non-informative probabilities, without the need to assume more than is actually contained in the information. Lastly, two different decision-making problems with the same states of nature may have different non-informative probabilities.
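A minimal sketch of the starting point the dissertation critiques, with hypothetical states of nature and made-up likelihoods (nothing here is taken from the dissertation): Bernoulli's principle of insufficient reason assigns a uniform prior over the states, which Bayes' theorem then updates once an observation arrives.

```python
import numpy as np

# Hypothetical states of nature for an engineering decision (invented).
states = ["stable", "marginal", "unstable"]

# Principle of insufficient reason: uniform probability over the states.
prior = np.full(len(states), 1.0 / len(states))

# Made-up likelihoods P(observation | state) for a single site observation.
likelihood = np.array([0.7, 0.2, 0.1])

# Bayes' theorem: posterior proportional to prior times likelihood.
posterior = prior * likelihood
posterior /= posterior.sum()

for s, p in zip(states, posterior):
    print(f"P({s} | observation) = {p:.3f}")
```

The decision-based method targets the uniform starting distribution in this sketch, which the dissertation argues can itself bias the decision; the Bayes update step is unchanged.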
3

Eliciting Expert Knowledge for Bayesian Logistic Regression in Species Habitat Modelling

Kynn, Mary, January 2005
This research aims to develop a process for eliciting expert knowledge and incorporating it as prior distributions for a Bayesian logistic regression model. The work was motivated by the need for less data-reliant methods of modelling species habitat distributions. A comprehensive review of research from both cognitive psychology and the statistical literature provided specific recommendations for the creation of an elicitation scheme. These were incorporated into the design of a Bayesian logistic regression model and an accompanying elicitation scheme. The model and scheme were then implemented as interactive, graphical software called ELICITOR, created within the BlackBox Component Pascal environment. This software was specifically written to be compatible with the existing Bayesian analysis software WinBUGS, acting as an add-on component. The model, elicitation scheme, and software were evaluated through five case studies of various fauna and flora species. For two of these there were sufficient data to compare expert and data-driven models. The case studies confirmed that expert knowledge can be quantified and formally incorporated into a logistic regression model. Finally, they provide a basis for a thorough discussion of extensions to the model, scheme, and software, and lead to recommendations for elicitation research.
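For concreteness, a minimal sketch of the kind of model being elicited, with invented data and invented "expert" prior numbers (ELICITOR itself is graphical software feeding WinBUGS, not this code): Bayesian logistic regression in which expert knowledge enters as normal priors on the coefficients, sampled here with a simple random-walk Metropolis step.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic presence/absence data with one habitat covariate (invented).
X = np.column_stack([np.ones(200), rng.normal(size=200)])
true_eta = 0.3 + 1.2 * X[:, 1]
y = (rng.uniform(size=200) < 1.0 / (1.0 + np.exp(-true_eta))).astype(float)

# Hypothetical expert-elicited priors on (intercept, slope), e.g. the expert
# believes the covariate has a positive effect of moderate size.
prior_mean = np.array([0.0, 1.0])
prior_sd = np.array([2.0, 0.5])

def log_post(beta):
    # Bernoulli log-likelihood with logit link, plus normal log-priors.
    eta = X @ beta
    loglik = np.sum(y * eta - np.log1p(np.exp(eta)))
    logprior = -0.5 * np.sum(((beta - prior_mean) / prior_sd) ** 2)
    return loglik + logprior

# Random-walk Metropolis over the coefficient vector.
beta, lp = np.zeros(2), log_post(np.zeros(2))
draws = []
for _ in range(10000):
    prop = beta + 0.1 * rng.normal(size=2)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        beta, lp = prop, lp_prop
    draws.append(beta.copy())

print("posterior means of (intercept, slope):", np.mean(draws[2000:], axis=0))
```

Tightening or relaxing `prior_sd` is where the elicited expert confidence would enter in practice.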
4

Bayesian Model Diagnostics and Reference Priors for Constrained Rate Models of Count Data

Sonksen, Michael David, 26 September 2011
No description available.
5

Non-informative priors and modelization by mixtures

Kamary, Kaniav, 15 March 2016
One of the major applications of statistics is the validation and comparison of probabilistic models in light of the data. This branch of statistics has been developed since its formalization at the end of the 19th century by pioneers like Gosset, Pearson and Fisher. In the special case of the Bayesian approach, the solution to model comparison is the Bayes factor, the ratio of marginal likelihoods, whatever the models under comparison. This solution is obtained by mathematical reasoning based on a loss function. Despite frequent use of the Bayes factor and of its equivalent, the posterior probability of models, by the Bayesian community, it is problematic in some cases. First, this rule is highly dependent on the prior modelling, even with large datasets, and since the selection of a prior density has a vital role in Bayesian statistics, one of the difficulties with the traditional handling of Bayesian tests is a discontinuity in the use of improper priors, which are not justified in most testing situations. The first part of this thesis gives a general review of non-informative priors and their features, and demonstrates the overall stability of posterior distributions by reassessing the examples of [Seaman III 2012].

Second, and independently, Bayes factors are difficult to calculate except in the simplest cases (conjugate distributions). A branch of computational statistics has therefore emerged to resolve this problem, with solutions borrowed from statistical physics, such as the path sampling method of [Gelman 1998], and from signal processing. The existing solutions are not, however, universal, and a reassessment of these methods, followed by the development of alternative methods, constitutes a part of the thesis. We therefore consider a novel paradigm for Bayesian testing of hypotheses and Bayesian model comparison, defining an alternative to the traditional construction of posterior probabilities that a given hypothesis is true or that the data originate from a specific model. The idea is to consider the models under comparison as components of a mixture model. By replacing the original testing problem with an estimation version that focuses on the probability weight of a given model within the mixture, we analyze the sensitivity of the resulting posterior distribution of the weights to various prior modellings of the weights, and stress that a major appeal of this perspective is that generic improper priors are acceptable while not putting convergence in jeopardy. MCMC methods such as the Metropolis-Hastings algorithm and the Gibbs sampler, together with empirical approximations of the probability, are used. From a computational viewpoint, another feature of this easily implemented alternative to the classical Bayesian solution is that the speeds of convergence of the posterior mean of the weight and of the corresponding posterior probability are quite similar.

In the last part of the thesis we construct a reference Bayesian analysis of mixtures of Gaussian distributions by creating a new parameterization centered on the mean and variance of those models. This enables us to develop a genuine non-informative prior for Gaussian mixtures with an arbitrary number of components. We demonstrate that the posterior distribution associated with this prior is almost surely proper, and provide MCMC implementations that exhibit the expected component exchangeability. The analyses are based on MCMC methods such as the Metropolis-within-Gibbs algorithm, adaptive MCMC, and the parallel tempering algorithm. This part of the thesis is followed by the description of an R package named Ultimixt, which implements a generic reference Bayesian analysis of unidimensional mixtures of Gaussian distributions obtained by a location-scale parameterization of the model. The package can be applied to produce a Bayesian analysis of Gaussian mixtures with an arbitrary number of components, with no need to specify the prior distribution.
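A minimal sketch of the mixture-based testing idea under illustrative choices (the two candidate models, the Beta(1/2, 1/2) prior on the weight, and all tuning constants are our assumptions, not the thesis's): the competing models become components of a mixture, and testing is recast as estimating the posterior distribution of the component weight.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.5, 1.0, size=100)  # synthetic data

def log_mix_lik(alpha, theta):
    # Mixture of the two candidate models: N(0,1) and N(theta,1).
    f1 = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
    f2 = np.exp(-0.5 * (x - theta)**2) / np.sqrt(2 * np.pi)
    return np.sum(np.log(alpha * f1 + (1.0 - alpha) * f2))

def log_post(alpha, theta):
    if not 0.0 < alpha < 1.0:
        return -np.inf
    # Illustrative priors: Beta(1/2, 1/2) on the weight, N(0, 10^2) on theta.
    log_prior = -0.5 * np.log(alpha * (1.0 - alpha)) - 0.5 * (theta / 10.0) ** 2
    return log_mix_lik(alpha, theta) + log_prior

# Random-walk Metropolis-Hastings over (alpha, theta).
alpha, theta = 0.5, 0.0
lp = log_post(alpha, theta)
weights = []
for _ in range(20000):
    a_prop = alpha + 0.05 * rng.normal()
    t_prop = theta + 0.10 * rng.normal()
    lp_prop = log_post(a_prop, t_prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        alpha, theta, lp = a_prop, t_prop, lp_prop
    weights.append(alpha)

print("posterior mean weight of the N(0,1) component:",
      np.mean(weights[5000:]))
```

The posterior distribution of the weight then plays the role that the posterior model probability plays in the classical solution; the abstract's remark on convergence speeds refers to comparing exactly these two summaries.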
6

Small population bias and sampling effects in stochastic mortality modelling

Chen, Liang, January 2017
Pension schemes are facing increasing difficulties in matching their liabilities with assets, mainly due to faster mortality improvements in their underlying populations, brought about by better living environments and medical treatments, and due to historically low interest rates. Given that most pension schemes are much smaller than the national population, modelling and forecasting longevity risk for small populations has become an urgent task for both industry practitioners and academic researchers. This thesis starts with a systematic analysis of the influence of population size on the uncertainty of mortality estimates and forecasts from a stochastic mortality model, based on a parametric bootstrap methodology with England and Wales males as the benchmark population. Population size has a significant effect on the uncertainty of mortality estimates and forecasts, and the volatilities of small populations are over-estimated by the maximum likelihood estimators. A Bayesian model is developed to improve the estimation of the volatilities and the prediction of mortality rates for small populations by employing information from a larger population through informative prior distributions. The new model is validated on simulated small-population death scenarios, and the Bayesian methodology yields smoothed estimates of the mortality rates. Moreover, a methodology is introduced that uses information from the large population to obtain unbiased volatility estimates under the given prior settings. Finally, an empirical study is carried out on a Scottish mortality dataset.
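A minimal sketch of the parametric bootstrap logic with made-up exposures and rates (not the thesis's stochastic mortality model): simulate Poisson death counts at several population sizes, re-estimate the central mortality rate by maximum likelihood, and watch the estimation uncertainty grow as the population shrinks.

```python
import numpy as np

rng = np.random.default_rng(1)
m_true = 0.01  # hypothetical "true" central mortality rate at some age

def bootstrap_sd(exposure, n_boot=2000):
    # Parametric bootstrap: D ~ Poisson(E * m); the MLE is m_hat = D / E.
    deaths = rng.poisson(exposure * m_true, size=n_boot)
    m_hat = deaths / exposure
    return m_hat.std()

for E in [1_000, 10_000, 100_000, 1_000_000]:
    print(f"exposure {E:>9,}: bootstrap s.d. of m_hat = {bootstrap_sd(E):.5f}")
```

For a Poisson death count the bootstrap standard deviation shrinks roughly as one over the square root of the exposure, which is the small-population effect the thesis quantifies and then mitigates by borrowing strength from a larger population.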
