Statistical analysis of natural selection in RNA virus populations

A key goal of modern evolutionary biology is the identification of genes or genome regions that have been targeted by natural selection. Methods for detecting natural selection utilise the information sampled in contemporary gene sequences and test for deviation from the null hypothesis of neutrality. One such method is the McDonald-Kreitman test (MK test), which detects the molecular 'footprint' left by natural selection by considering the frequency of observed mutations within the sampled population. In this thesis I investigate the applicability of the MK test to viral populations and develop several new methods based on the original MK test. In chapter 2, I use a combination of simulation and methodological improvements to show that the MK test can have low error when applied to analysis of RNA virus populations. Then, in chapter 3, I develop an extension of the MK test with the purpose of estimating rates of adaptive fixation for all genes of the human influenza A virus subtypes H1N1 and H3N2. My results are consistent with previous studies on selection in influenza virus populations, and provide a new perspective on the evolutionary dynamics of human influenza virus. In chapter 4, I develop a formal statistical framework, based on the MK test, for calculating the number of non-neutral sites at any frequency range in the site frequency spectrum. In this framework, I introduce a new method for reconstructing the site frequency spectrum that incorporates sampling error and allows for the inclusion of prior knowledge. Using this new framework, I show that the majority of nucleotide sites in hepatitis C virus sequences sampled during chronic infection represent deleterious mutations. Finally, in chapter 5, I use the generalised framework introduced in chapter 4 to develop a statistic for evaluating the deleterious mutation load of a population. I apply this test to sequences that represent 96 RNA virus genes and show that my approach has comparable power to equivalent phylogenetic methods. In this thesis I have developed computationally efficient methods for the analysis of genetic data from virus populations. It is my hope that these methods will become useful given the explosion in sequence data that has accompanied recent improvements in sequencing technology.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:543055
Date: January 2010
Creators: Bhatt, Samir
Contributors: Pybus, Oliver
Publisher: University of Oxford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://ora.ox.ac.uk/objects/uuid:64341c38-f09e-48ed-84e8-7ab9f171a753
