
Rooting major cellular radiations using statistical phylogenetics

Phylogenetics focuses on learning about evolutionary relationships between species. These relationships can be represented by phylogenetic trees, where similar species are grouped together as sharing a recent common ancestor. The common ancestor of all the species of the tree is the root of the tree. The root is fundamental to the biological interpretation of the tree, providing a critical reference point for polarising ancestor-descendant relationships and determining the order in which key traits evolved along the tree (Embley and Martin, 2006). Despite its importance, most models of sequence evolution are unable to infer the root of a phylogenetic tree. They are based on homogeneous continuous time Markov processes (CTMPs) that are assumed to be stationary and time-reversible, with the mathematical consequence that the likelihood of a tree does not depend on where it is rooted. As a result, the root of the tree cannot be inferred as part of the analysis. Other methods which are generally used to root evolutionary trees can be problematic. For example, the outgroup rooting method is susceptible to a long-branch attraction artefact. Paralogue rooting requires pairs of paralogous genes which underwent an ancient gene duplication event to be present in all species being analysed, and the number of such genes is limited.

In this thesis we explore an alternative model-based approach, adopting a substitution model in which changing the root position changes the likelihood of the tree. We explore the effect of relaxing reversibility and stationarity assumptions and allowing the position of the root to be another unknown quantity in the model. We propose two hierarchical non-reversible models which are centred on a reversible model but perturbed to allow non-reversibility. The models differ in the degree of structure imposed on the perturbations. We also explore non-stationary models, and the combination of relaxing both the reversibility and the stationarity assumptions.

The analysis is performed in the Bayesian framework using Markov chain Monte Carlo methods. We illustrate the performance of the models in analyses of simulated datasets using two types of topological priors. We also investigate the effect of different topologies and branch lengths on the inference. Our results illustrate the usefulness of modelling non-reversibility and non-stationarity for root inference, and also demonstrate the sensitivity of the analysis to topological priors. We then apply the models to real biological datasets, the radiation of polyploid yeasts and the radiation of primates, for which there is a robust biological opinion about the root position. Finally we apply the models to an open question in biology: rooting the ribosomal tree of life.
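
The claim that a stationary, time-reversible model cannot locate the root, while a non-reversible one can, may be easier to see on a toy case. The sketch below is an illustration only and is not one of the models proposed in the thesis. It assumes a two-taxon tree, a four-state alphabet, and a hypothetical cyclic rate matrix (function names such as `cyclic_rate_matrix` are invented here) whose stationary distribution is uniform. When the forward and backward rates are equal the process is reversible, and the likelihood is the same wherever the root is placed along the branch; when they differ, the likelihood changes with the root position.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(Q, t, squarings=12, terms=12):
    """Approximate exp(Q * t) by scaling and squaring with a truncated Taylor series."""
    n = len(Q)
    scale = t / (2 ** squarings)
    A = [[Q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def cyclic_rate_matrix(forward, backward):
    """Hypothetical 4-state rate matrix: state i moves to i+1 at rate `forward`
    and to i-1 at rate `backward` (indices mod 4). Rows and columns both sum to
    zero, so the uniform distribution is stationary. The process is
    time-reversible only when forward == backward."""
    n = 4
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][(i + 1) % n] += forward
        Q[i][(i - 1) % n] += backward
        Q[i][i] = -(forward + backward)
    return Q


def two_taxon_likelihood(Q, pi, state_a, state_b, t1, t2):
    """Likelihood of observing state_a and state_b at two leaves joined to the
    root by branches of length t1 and t2, with root distribution pi."""
    P1 = mat_exp(Q, t1)
    P2 = mat_exp(Q, t2)
    return sum(pi[x] * P1[x][state_a] * P2[x][state_b] for x in range(len(pi)))


def likelihoods_over_root_positions(Q, pi, total=1.0,
                                    fractions=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Slide the root along a branch of fixed total length and record the
    likelihood of the site pattern (0, 1) at each position."""
    return [two_taxon_likelihood(Q, pi, 0, 1, f * total, (1.0 - f) * total)
            for f in fractions]


pi = [0.25] * 4  # uniform stationary distribution for both matrices
reversible = likelihoods_over_root_positions(cyclic_rate_matrix(0.6, 0.6), pi)
non_reversible = likelihoods_over_root_positions(cyclic_rate_matrix(1.0, 0.2), pi)

print("reversible:    ", reversible)
print("non-reversible:", non_reversible)
```

In the reversible case the printed values agree up to rounding, so the data carry no information about the root; in the non-reversible case they vary with the root position, which is the property a root-inference model relies on.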

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:719778
Date: January 2016
Creators: Cherlin, Svetlana
Publisher: University of Newcastle upon Tyne
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://hdl.handle.net/10443/3492
