Improving the quality of multiple sequence alignment

Multiple sequence alignment is an important bioinformatics problem, with applications
in diverse types of biological analysis, such as structure prediction, phylogenetic analysis,
and identification of critical sites. In recent years, newly developed methods have
substantially improved the quality of multiple sequence alignment, although constructing
accurate alignments remains difficult, especially for divergent sequences.
In this dissertation, we propose three new methods (PSAlign, ISPAlign, and NRAlign)
for further improving the quality of multiple sequence alignment.
In PSAlign, we propose an alternative formulation of multiple sequence alignment based
on the idea of finding a multiple alignment which preserves all the pairwise alignments
specified by edges of a given tree. In contrast with traditional NP-hard formulations, our
preserving alignment formulation can be solved in polynomial time without using a
heuristic, while still retaining very good performance when compared to traditional
heuristics.
In ISPAlign, we use additional hits from a database search of the input sequences and
propose several strategies that significantly improve alignment accuracy: constructing
profiles from the hits while performing profile alignment, including high-scoring hits
in the input sequences, using intermediate sequence search to link distant homologs,
and using secondary structure information.
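The preserving-alignment idea described for PSAlign can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm, and it does not optimize any objective. It only shows that pairwise alignments given on the edges of a tree can be merged into one multiple alignment in polynomial time, with every input pairwise alignment preserved. The function names (`merge_edge`, `align_along_tree`, `induced_pair`) and the input format (a list of edges ordered so that each new edge touches a sequence already placed) are assumptions made for illustration.

```python
def ungap(row):
    """Remove gap characters from an aligned row."""
    return row.replace("-", "")


def merge_edge(msa, anchor, pair_anchor, pair_new):
    """Add one new sequence to `msa` using its pairwise alignment to `anchor`.

    msa         : dict name -> gapped row, all rows of equal length
    anchor      : name of a sequence already present in msa
    pair_anchor : gapped anchor row of the pairwise alignment
    pair_new    : gapped row of the new sequence in the pairwise alignment
    Returns (updated msa rows, gapped row for the new sequence).
    """
    assert ungap(msa[anchor]) == ungap(pair_anchor), "anchor sequences differ"
    names = list(msa)
    cols = {n: [] for n in names}
    new_row = []
    i = j = 0  # i walks msa columns, j walks pairwise columns
    width = len(msa[anchor])
    while i < width or j < len(pair_anchor):
        if i < width and msa[anchor][i] == "-":
            # Column created by other sequences: the new sequence gets a gap.
            for n in names:
                cols[n].append(msa[n][i])
            new_row.append("-")
            i += 1
        elif j < len(pair_anchor) and pair_anchor[j] == "-":
            # Insertion in the new sequence: open a fresh column of gaps.
            for n in names:
                cols[n].append("-")
            new_row.append(pair_new[j])
            j += 1
        else:
            # Both walkers sit on the same anchor residue: copy the column.
            for n in names:
                cols[n].append(msa[n][i])
            new_row.append(pair_new[j])
            i += 1
            j += 1
    return {n: "".join(cols[n]) for n in names}, "".join(new_row)


def align_along_tree(edges):
    """edges: list of (u, v, row_u, row_v); each edge after the first must
    connect one placed sequence to one new sequence."""
    u, v, row_u, row_v = edges[0]
    msa = {u: row_u, v: row_v}
    for u, v, row_u, row_v in edges[1:]:
        if v in msa and u not in msa:
            u, v, row_u, row_v = v, u, row_v, row_u
        if u not in msa or v in msa:
            raise ValueError("edges must grow the tree one new sequence at a time")
        msa, new_row = merge_edge(msa, u, row_u, row_v)
        msa[v] = new_row
    return msa


def induced_pair(msa, x, y):
    """Pairwise alignment of x and y induced by msa (all-gap columns dropped)."""
    cols = [(a, b) for a, b in zip(msa[x], msa[y]) if (a, b) != ("-", "-")]
    return "".join(a for a, _ in cols), "".join(b for _, b in cols)


if __name__ == "__main__":
    # Hypothetical toy tree A - B - C with a pairwise alignment on each edge.
    edges = [("A", "B", "ACGT", "A-GT"), ("B", "C", "AG-T", "AGGT")]
    msa = align_along_tree(edges)
    for name, row in msa.items():
        print(name, row)
    print(induced_pair(msa, "B", "C"))
```

Where an insertion in the new sequence coincides with columns that already hold gaps in the anchor, several column orders would all preserve the tree alignments. This sketch picks one arbitrarily.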
In NRAlign, we observe that it is possible to further improve alignment accuracy by
taking into account the alignment of neighboring residues when aligning two residues,
thus making better use of horizontal information. By modifying existing multiple alignment
algorithms to make use of horizontal information, we show that this strategy is able to
consistently improve over existing algorithms on all the benchmarks that are commonly
used to measure alignment accuracy.
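One way to picture the use of horizontal information is to let the score for aligning residue i of one sequence with residue j of another depend also on the scores of the neighboring pairs (i-1, j-1) and (i+1, j+1). The Python sketch below assumes a precomputed pairwise score matrix and blends each entry with its diagonal neighbors. The abstract does not give the dissertation's actual formulation, so the function name `neighbor_smoothed` and the `window` and `weight` parameters are hypothetical.

```python
def neighbor_smoothed(scores, window=1, weight=0.5):
    """Blend each residue-pair score with the scores of neighboring pairs.

    scores : list of lists, scores[i][j] for residue i of x vs residue j of y
    window : how many diagonal neighbors to look at on each side (assumed)
    weight : relative weight of each neighbor versus the pair itself (assumed)
    Returns a new matrix of the same shape.
    """
    n = len(scores)
    m = len(scores[0]) if n else 0
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            total = scores[i][j]
            norm = 1.0
            for k in range(1, window + 1):
                for d in (-k, k):
                    ii, jj = i + d, j + d
                    # Only neighbors that fall inside the matrix contribute.
                    if 0 <= ii < n and 0 <= jj < m:
                        total += weight * scores[ii][jj]
                        norm += weight
            out[i][j] = total / norm
    return out


if __name__ == "__main__":
    # Toy matrix: a clean diagonal plus one isolated high score at (2, 0).
    scores = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [1.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    smoothed = neighbor_smoothed(scores)
    print(smoothed[1][1], smoothed[2][0])
```

In the toy matrix, a pair supported by well-aligned neighbors keeps its score, while the isolated pair at (2, 0) is reduced, which is the intuition behind scoring with neighboring residues.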

Identifier: oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/ETD-TAMU-3111
Date: 15 May 2009
Creators: Lu, Yue
Contributors: Sze, Sing-Hoi
Source Sets: Texas A and M University
Language: en_US
Detected Language: English
Type: Book, Thesis, Electronic Dissertation, text
Format: electronic, application/pdf, born digital