Thesis (MSc)--Stellenbosch University, 2011. / ENGLISH ABSTRACT: Phylogenetic analysis is the study of evolutionary relationships among organisms.
To this end, phylogenetic trees, or evolutionary trees, are used to
depict the evolutionary relationships between organisms as reconstructed from
DNA sequence data. The likelihood of a given tree is commonly calculated
for many purposes including inferring phylogenies, sampling from the space of
likely trees and inferring other parameters governing the evolutionary process.
This is done using Felsenstein’s algorithm, a widely implemented dynamic
programming approach that reduces the computational complexity from exponential
to linear in the number of taxa. However, with the advent of efficient
modern sequencing techniques the size of data sets are rapidly increasing beyond
current computational capability.
Parallel computing has been used successfully to address many similar
problems and is currently receiving attention in the realm of phylogenetic
analysis. Work has been done using data decomposition, where the likelihood
calculation is parallelised over DNA sequence sites. We propose an alternative
way of parallelising the likelihood calculation, which we call segmentation,
where the tree is broken down into subtrees and the likelihood of each subtree
is calculated concurrently over multiple processes. We introduce our proposed
system, which aims to drastically increase the size of trees that can be practically
used in phylogenetic analysis. Then, we evaluate the system on large
phylogenies which are constructed from both real and synthetic data, to show
that a larger decrease of run times are obtained when the system is used. / AFRIKAANSE OPSOMMING:Filogenetiese analise is die studie van evolusionêre verwantskappe tussen
organismes. Filogenetiese of evolusionêre bome word aangewend om die evolusionêre
verwantskappe, soos herwin vanuit DNS-kettings data, tussen organismes
uit te beeld. Die aanneemlikheid van ’n gegewe filogenie word oor die
algemeen bereken en aangewend vir menigte doeleindes, insluitende die afleiding
van filogenetiese bome, om te monster vanuit ’n versameling van sulke
moontlike bome en vir die afleiding van ander belangrike parameters in die evolusionêre
proses. Dit word vermag met behulp van Felsenstein se algoritme,
’n alombekende benaderingwyse wat gebruik maak van dinamiese programmering
om die berekeningskompleksiteit van eksponensieel na lineêr in die aantal
taxa, te herlei. Desnieteenstaande, het die koms van moderne, doeltreffender
orderingsmetodes groter datastelle tot gevolg wat vinnig besig is om bestaande
berekeningsvermoë te oorskry.
Parallelle berekeningsmetodes is reeds suksesvol toegepas om vele soortgelyke
probleme op te los, met groot belangstelling tans in die sfeer van filogenetiese
analise. Werk is al gedoen wat gebruik maak van data dekomposisie, waar
die aanneemlikheidsberekening oor die DNS basisse geparallelliseer word. Ons
stel ’n alternatiewe metode voor, wat ons segmentasie noem, om die aanneemlikheidsberekening
te parallelliseer, deur die filogenetiese boom op te breek in
sub-bome, en die aanneemlikheid van elke sub-boom gelyklopend te bereken
oor verskeie verwerkingseenhede. Ons stel ’n stelsel voor wat dit ten doel het
om ’n drastiese toename in die grootte van die bome wat gebruik kan word in
filogenetiese analise, teweeg te bring. Dan, word ons voorgestelde stelsel op
groot filogenetiese bome, wat vanaf werklike en sintetiese data gekonstrueer is,
evalueer. Dit toon aan dat ’n groter afname in looptyd verkry word wanneer
die stelsel in gebruik is.
Identifer | oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/17919 |
Date | 12 1900 |
Creators | Hayward, Peter |
Contributors | Scheffler, K., Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. |
Publisher | Stellenbosch : Stellenbosch University |
Source Sets | South African National ETD Portal |
Language | en_ZA |
Detected Language | English |
Type | Thesis |
Format | 88 p. : ill. |
Rights | Stellenbosch University |
Page generated in 0.0028 seconds