
Robust sequence alignment using evolutionary rates coupled with an amino acid substitution matrix

Selective pressures at the DNA level shape genes into profiles consisting of patterns of
rapidly evolving sites and sites withstanding change. These profiles remain detectable
even when protein sequences become extensively diverged. It has been hypothesised
that these patterns can be used as gene identifiers. A common task in molecular biology
is to infer functional, structural or evolutionary relationships by querying a database
using an algorithm. However, problems arise when sequence similarity is low.
The problem is that the algorithm produces numerous
false positives when highly conserved datasets are aligned. To increase the
sensitivity of the algorithm, the evolutionary rate based approach was reimplemented
and coupled with a conventional BLOSUM substitution matrix to produce a new implementation
called BLOSUM-FIRE. The two approaches are combined in a dynamic
scoring function, which uses the selective pressure to score aligned residues. Analysis
of the quality of the alignments produced revealed that the new implementation of the FIRE
algorithm performs as well as conventional algorithms. In addition, the Evolutionary
rate Database (EvoDB), a compilation of the evolutionary rate profiles of all the
members of the PFAM-A protein domain database, has been developed. The EvoDB
database can be queried using FIRE to infer protein domain functions. The utility
of this algorithm and database was tested by inferring the domain functions of the
Hepatitis B X protein. Results show that the BLOSUM-FIRE algorithm was able
to accurately identify the domain function of HBx as a trans-activation protein using
EvoDB. The biological relevance
of these results was not validated and requires further interrogation; however, these
proteins share vital roles in viral replication. This study demonstrates the utility
of an evolutionary rate based approach and shows that such an approach is
robust when coupled with an amino acid substitution matrix, yielding results comparable
to conventional algorithms. EvoDB is a catalogue of the evolutionary rate
profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and
annotated accession identifier data. The BLOSUM-FIRE software and user manual
including the EvoDB flat file database and release notes have been made freely available
at www.bioinf.wits.ac.za/software/fire. The BLOSUM-FIRE algorithm and
EvoDB database present a tier of information untapped by current databases and tools. / A dissertation submitted to the Faculty of Health Sciences, University of the
Witwatersrand, Johannesburg, in fulfilment of the requirements of the degree
of Master of Science (Medicine).
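
The idea of coupling a substitution matrix with evolutionary rates in one scoring function can be illustrated with a minimal Python sketch. This is not the BLOSUM-FIRE implementation: the function names, the toy identity/mismatch scores standing in for a BLOSUM lookup, the rate-similarity formula, the weighting and the gap penalty are all assumptions chosen for illustration only.

```python
from typing import Sequence


def toy_substitution(a: str, b: str) -> float:
    """Stand-in for a BLOSUM lookup (assumed values): +4 identity, -1 otherwise."""
    return 4.0 if a == b else -1.0


def rate_similarity(r1: float, r2: float) -> float:
    """Assumed similarity of two per-site rates: 1 when equal, smaller as they diverge."""
    return 1.0 / (1.0 + abs(r1 - r2))


def combined_score(a: str, b: str, r1: float, r2: float,
                   weight: float = 0.5, rate_scale: float = 4.0) -> float:
    """Blend the substitution score with the rate similarity.

    weight=1.0 uses only the substitution term; weight=0.0 uses only the rates.
    rate_scale (hypothetical) puts the rate term on the same scale as the matrix.
    """
    substitution = toy_substitution(a, b)
    rate_term = rate_scale * rate_similarity(r1, r2)
    return weight * substitution + (1.0 - weight) * rate_term


def align_score(seq1: str, rates1: Sequence[float],
                seq2: str, rates2: Sequence[float],
                gap: float = -2.0, weight: float = 0.5) -> float:
    """Global (Needleman-Wunsch) alignment score using the combined site score."""
    if len(seq1) != len(rates1) or len(seq2) != len(rates2):
        raise ValueError("each sequence needs exactly one rate per site")
    n, m = len(seq1), len(seq2)
    # prev[j] holds the best score for aligning seq1[:i-1] with seq2[:j].
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            diagonal = prev[j - 1] + combined_score(
                seq1[i - 1], seq2[j - 1], rates1[i - 1], rates2[j - 1], weight)
            curr[j] = max(diagonal, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[m]
```

In this sketch, two sites with different residues but similar rates can still score positively, which is one way a rate term might help when sequence similarity is low; how BLOSUM-FIRE actually weights the two signals is not stated in the abstract.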

Identifier oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:wits/oai:wiredspace.wits.ac.za:10539/18660
Date January 2014
Creators Ndhlovu, Andrew
Source Sets South African National ETD Portal
Language English
Detected Language English
Type Thesis
Format application/pdf
