1 |
Statistical analysis of natural selection in RNA virus populations
Bhatt, Samir, January 2010
A key goal of modern evolutionary biology is the identification of genes or genome regions that have been targeted by natural selection. Methods for detecting natural selection use the information sampled in contemporary gene sequences and test for deviation from the null hypothesis of neutrality. One such method is the McDonald-Kreitman test (MK test), which detects the molecular 'footprint' left by natural selection by considering the frequency of observed mutations within the sampled population. In this thesis I investigate the applicability of the MK test to viral populations and develop several new methods based on the original MK test. In chapter 2, I use a combination of simulation and methodological improvements to show that the MK test can have low error when applied to the analysis of RNA virus populations. Then, in chapter 3, I develop an extension of the MK test with the purpose of estimating rates of adaptive fixation for all genes of the human influenza A virus subtypes H1N1 and H3N2. My results are consistent with previous studies on selection in influenza virus populations, and provide a new perspective on the evolutionary dynamics of human influenza virus. In chapter 4, I develop a formal statistical framework, based on the MK test, for calculating the number of non-neutral sites at any frequency range in the site frequency spectrum. In this framework, I introduce a new method for reconstructing the site frequency spectrum that incorporates sampling error and allows for the inclusion of prior knowledge. Using this new framework I show that the majority of nucleotide sites in hepatitis C virus sequences sampled during chronic infection represent deleterious mutations. Finally, in chapter 5, I use the generalised framework introduced in chapter 4 to develop a statistic for evaluating the deleterious mutation load of a population. I apply this test to sequences that represent 96 RNA virus genes and show that my approach has comparable power to equivalent phylogenetic methods. In this thesis I have developed computationally efficient methods for the analysis of genetic data from virus populations. It is my hope that these methods will become useful given the explosion in sequence data that has accompanied recent improvements in sequencing technology.
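For readers unfamiliar with the MK test named in this abstract, the following is a minimal sketch of the classic 2x2 version only, not of the extensions developed in the thesis. It assumes the usual four counts (nonsynonymous and synonymous fixed differences, `dn` and `ds`; nonsynonymous and synonymous polymorphisms, `pn` and `ps`), and the function names and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def mk_test(dn, ds, pn, ps):
    """Classic McDonald-Kreitman summary for four site counts.

    Returns (neutrality index, alpha, Fisher exact p-value).
    Assumes dn, ds and ps are non-zero; no correction is applied.
    """
    ni = (pn / ps) / (dn / ds)   # neutrality index; 1 under neutrality
    alpha = 1 - ni               # crude estimate of the adaptive fraction
    p_value = fisher_exact_two_sided(dn, ds, pn, ps)
    return ni, alpha, p_value


# Hypothetical counts, chosen only for illustration.
ni, alpha, p_value = mk_test(dn=12, ds=20, pn=5, ps=40)
print(f"NI={ni:.3f} alpha={alpha:.3f} p={p_value:.4f}")
```

A neutrality index below 1 (alpha above 0) indicates an excess of nonsynonymous fixed differences, the pattern usually read as evidence of adaptive fixation.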
|
2 |
Extensions du modèle standard neutre pertinentes pour l'analyse de la diversité génétique / Extensions of the standard neutral model relevant for the analysis of genetic diversity
Lapierre, Marguerite, 25 September 2017
The general setting of this thesis is the analysis of the evolutionary forces that generate polymorphisms and divergence between genomes within a species. The theoretical framework used in most areas of molecular evolution is the neutral theory, formulated by Motoo Kimura in 1968. This model is characterized by the hypotheses of neutrality, constant population size and panmixia. First, we investigated how this theoretical framework is used in practice and what consequences these hypotheses have for the inferences and predictions made within it. To this end, we carried out two studies confronting existing demographic inference methods with data. The first study demonstrated that methods frequently used for bacterial demographic inference, based on a single reconstructed phylogenetic tree, are biased by selection, recombination and sampling bias. We then compared several demographic inference methods by applying them to an African human population, the Yoruba. This study showed the limits of an existing method and illustrates the issue of identifiability of demographic histories when inference is based on the site frequency spectrum. Finally, in a third study, we analyzed several genetic polymorphism datasets with an alternative reference model comprising multiple mergers and demography. We compared how well the current reference model and this alternative model explain the observed genetic diversity.
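To make the site frequency spectrum mentioned above concrete, here is a minimal sketch of how it can be tabulated from aligned sequences and compared with the textbook expectation under the standard neutral model (constant size, panmixia, neutrality). It assumes sequences are already coded as 0 (ancestral) and 1 (derived), so the spectrum is unfolded; the function names and the toy sample are hypothetical and are not taken from the thesis.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS of a 0/1 matrix (rows = sequences, columns = sites).

    Entry j-1 counts the sites whose derived allele is carried by
    exactly j of the n sequences, for j = 1 .. n-1.
    """
    n = len(haplotypes)
    sfs = [0] * (n - 1)
    for column in zip(*haplotypes):
        j = sum(column)
        if 0 < j < n:          # skip sites that are not polymorphic in the sample
            sfs[j - 1] += 1
    return sfs


def expected_sfs_neutral(n, theta):
    """Expected SFS under the standard neutral model: E[xi_j] = theta / j."""
    return [theta / j for j in range(1, n)]


# Toy sample of four sequences and five sites (illustrative only).
sample = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
]
print(site_frequency_spectrum(sample))
print(expected_sfs_neutral(n=4, theta=6.0))
```

Different demographic histories can produce very similar spectra, which is one way to picture the identifiability problem the abstract refers to.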
|
3 |
Post-glacial colonization, demographic history, and selection in <em>Arabidopsis lyrata</em>: genome-wide and candidate gene based approach
Mattila, T. (Tiina), 31 October 2017
Abstract
Demographic history and natural selection are central forces shaping the genetic diversity of populations. Knowledge of these forces improves our understanding of the processes that shape the genetic variability of populations. In this PhD thesis I investigated demographic history and selection in multiple populations of Arabidopsis lyrata, an outcrossing herbaceous plant species of the Brassicaceae family. Due to its wide distribution in the temperate and boreal regions, A. lyrata serves as a good model system for studying the population genetic consequences of the colonization of northern latitudes. The first aim of this study was to characterize the demographic and colonization history of the species using site frequency spectra estimated from whole-genome diversity data. Another aim was to detect genetic loci targeted by recent selective sweeps at the genome-wide scale as well as at candidate flowering time genes. Patterns of genome-wide selection at linked sites (linked selection) were also compared between populations of Capsella grandiflora and A. lyrata with contrasting demographic histories.
Evidence for strong effective population size decline in the past few hundred thousand years was detected in A. lyrata populations species-wide. This study also suggests recent Scandinavian colonization from an unknown refugium, distinct from the Central European source population. Selection analyses revealed loci targeted by positive selection in two Scandinavian lineages after the recent population split as well as selective sweeps in flowering time genes in the colonizing populations. In comparison with the studied C. grandiflora population, the Norwegian A. lyrata population had weaker purifying selection and no evidence for reduction of diversity around genes was found. This thesis offers novel information on species colonization history and its genome-wide effects, which is important for understanding the framework of local adaptation.
Finnish abstract (Tiivistelmä), translated
The demographic history of a population and natural selection are key shapers of its genetic variation. Studying these factors is important for understanding how organisms adapt. In this doctoral thesis I studied demographic history and selection in several populations of Arabidopsis lyrata, a perennial, outcrossing, herbaceous plant species of the Brassicaceae family. A. lyrata is an excellent model species for studying adaptation to northern environments, because its mutually isolated local populations are spread widely across the boreal and temperate climate zones. The aim of the study was to characterize the demographic history and colonization routes of the populations using allele frequency spectra estimated from genome-wide variation data. In addition, the genome-wide data and the sequences of genes controlling flowering time were used to identify signs of positive natural selection. Genome-wide linked selection was compared with a population of Capsella grandiflora, another outcrossing species of the Brassicaceae family, whose demographic history differs considerably from that of the studied A. lyrata populations.
The study found that the effective population size had decreased during the last few hundred thousand years in all of the studied A. lyrata populations. Examination of the colonization history showed that the Scandinavian populations of A. lyrata probably originate from a western refugium separate from the Central European refugium. Signs of positive natural selection acting during the colonization of Scandinavia were detected in several parts of the genome and especially in genes that measure photoperiod. This indicates the importance of adaptation to different photoperiods during the Scandinavian colonization. Compared with the studied C. grandiflora population, purifying selection was found to be weaker in A. lyrata, and no reduction of variation around genes was observed. This study provides new information on the colonization history of Scandinavia and its genome-wide effects. The knowledge produced can be used in understanding local adaptation.
|
4 |
Aspects of exchangeable coalescent processes
Pitters, Hermann-Helmut, January 2015
In mathematical population genetics a multiple merger <i>n</i>-coalescent process, or <i>Λ</i> <i>n</i>-coalescent process, {<i>Π<sup>n</sup>(t), t</i> ≥ 0} models the genealogical tree of a sample of size <i>n</i> (e.g. of DNA sequences) drawn from a large population of haploid individuals. We study various properties of <i>Λ</i> coalescents. The novelty of our approach is that we introduce the partition lattice as well as cumulants into the study of functionals of coalescent processes. We illustrate the success of this approach with several examples. Cumulants allow us to reveal how the tree height <i>T<sub>n</sub></i> and the total branch length <i>L<sub>n</sub></i> of the genealogical tree of Kingman’s <i>n</i>-coalescent, arguably the most celebrated coalescent process, are each related to the Riemann zeta function. Drawing on results from lattice theory, we give a spectral decomposition for the generator of both the Kingman and the Bolthausen-Sznitman <i>n</i>-coalescent, the latter of which emerges as a genealogy in models of populations undergoing selection. Taking mutations into account, let <i>M<sub>j</sub></i> count the number of mutations that are shared by <i>j</i> individuals in the sample. The random vector (<i>M<sub>1</sub></i>,...,<i>M<sub>n-1</sub></i>), known as the site frequency spectrum, can be measured from genetic data and is therefore an important statistic from the point of view of applications. Fu worked out the expected value, the variance and the covariance of the marginals of the site frequency spectrum. Using the partition lattice we derive a formula for the cumulants of arbitrary order of the marginals of the site frequency spectrum. Following another line of research, we provide a law of large numbers for a family of <i>Λ</i> coalescents. To be more specific, we show that the process {<i>#Π<sup>n</sup>(t), t</i> ≥ 0} recording the number <i>#Π<sup>n</sup>(t)</i> of individuals in the coalescent at time <i>t</i> converges, after a suitable rescaling, towards a deterministic limit as the sample size <i>n</i> grows without bound. In the statistical physics literature this limit is known as a hydrodynamic limit. Until now the hydrodynamic limit was known for Kingman’s coalescent, but not for other <i>Λ</i> coalescents. We work out the hydrodynamic limit for beta coalescents that come down from infinity, which is an important subclass of the <i>Λ</i> coalescents.
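As a hedged illustration of where the Riemann zeta function enters, the display below records the standard first two moments of the tree height and total branch length of Kingman's coalescent; it is not a reproduction of the cumulant formulas derived in the thesis. It assumes the usual time scaling, in which the waiting time while the sample has k ancestral lineages, written τ_k here (notation introduced only for this sketch), is exponential with rate k(k-1)/2, and the τ_k are independent.

```latex
% Assumed set-up: \tau_k \sim \mathrm{Exp}\bigl(\tbinom{k}{2}\bigr), independent for k = 2, \dots, n.
\begin{align*}
T_n &= \sum_{k=2}^{n} \tau_k, &
L_n &= \sum_{k=2}^{n} k\,\tau_k, \\
\mathbb{E}[T_n] &= \sum_{k=2}^{n} \frac{2}{k(k-1)} = 2\Bigl(1 - \frac{1}{n}\Bigr), &
\mathbb{E}[L_n] &= \sum_{k=2}^{n} \frac{2}{k-1} = 2\sum_{k=1}^{n-1} \frac{1}{k}, \\
\operatorname{Var}(T_n) &= \sum_{k=2}^{n} \frac{4}{k^2 (k-1)^2}
  \;\xrightarrow[n\to\infty]{}\; 8\,\zeta(2) - 12 = \frac{4\pi^2}{3} - 12, &
\operatorname{Var}(L_n) &= \sum_{k=1}^{n-1} \frac{4}{k^2}
  \;\xrightarrow[n\to\infty]{}\; 4\,\zeta(2) = \frac{2\pi^2}{3}.
\end{align*}
```

The variance, which is the second cumulant, already involves ζ(2); this gives a flavour of the relation the abstract describes for cumulants in general.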
|
5 |
Demography of Birch Populations across Scandinavia
Sendrowski, Janek, January 2022
Boreal forests are particularly vulnerable to climate change: they are experiencing a much more drastic increase in temperatures and have only a limited number of more northern refugia. The trees making up these vast and important ecosystems have already had to adapt to environmental pressures brought about by the repeated glaciations of past ice ages. Studying the patterns of adaptation of these trees can thus provide valuable insights into how to mitigate future damage. This thesis presents and analyses the population structure, demographic history and distribution of fitness effects (DFE) of the diploid Betula pendula and the tetraploid B. pubescens across Scandinavia. Birches, being widespread in boreal forests and of great economic importance, constitute superb model species. The analyses in this work confirm the expectations of postglacial population expansion and diploid-tetraploid introgression. They furthermore ascertain the presence of two genetic clusters and a remarkably similar DFE for the two species. This work also contributes a transparent, reproducible and reusable pipeline that facilitates running similar analyses for related species.
|