Development of New Methods for Inferring and Evaluating Phylogenetic Trees

Inferring phylogeny is a difficult computational problem. Heuristics are necessary to minimize the time spent evaluating non-optimal trees. In paper I, we developed an approach for heuristic searching using a genetic algorithm. Genetic algorithms mimic the ability of natural selection to solve complex problems. The algorithm can reduce the time required for weighted maximum parsimony phylogenetic inference using protein sequences, especially for data sets involving a large number of taxa. Evaluating and comparing the ability of phylogenetic methods to infer the correct topology is complex. In paper II, we developed software that determines the minimum subtree prune and regraft (SPR) distance between binary trees to ease the process. The minimum SPR distance can be used to measure the incongruence between trees inferred using different methods. Given a known topology, the methods can be evaluated on their ability to infer the correct phylogeny from specific data. The minimum SPR software also reports the intermediate trees that separate two binary trees. In paper III, we developed software that, given a set of incongruent trees, determines the median SPR consensus tree, i.e. the tree that explains the trees with a minimum of SPR operations. We investigated the median SPR consensus tree and its possible interpretation as a species tree given a set of gene trees. We used a set of α-proteobacteria gene trees to test the ability of the algorithm to infer a species tree and compared it to previous studies. The results show that the algorithm can successfully reconstruct a species tree. Expressed sequence tag (EST) data is important in determining intron-exon boundaries, single nucleotide polymorphisms and the coding sequence of genes. In paper IV, we aligned ESTs to the genome to evaluate the quality of EST data. The results show that many ESTs are contaminated by vector sequences and low-quality regions.
The reliability of EST data is largely determined by the clustering of the ESTs and the association of the clusters with the correct portion of the genome. We investigated the performance of EST clustering using the genome as a template, compared to previously existing methods using pairwise alignments. The results show that using the genome as guidance improves the resulting EST clusters with respect to the extent to which ESTs originating from the same transcriptional unit are separated into disjoint clusters.

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-7501
Date	January 2007
Creators	Hill, Tobias
Publisher	Uppsala universitet, Institutionen för neurovetenskap, Uppsala : Acta Universitatis Upsaliensis
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Doctoral thesis, comprehensive summary, info:eu-repo/semantics/doctoralThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
Relation	Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine, 1651-6206 ; 225
