The multiple sequence alignment (MSA) is a fundamental technique of molecular biology. Biological sequences are aligned with each other vertically in order to show the similarities and differences among them. Due to its importance, many algorithms have been proposed. With dynamic programming, finding the optimal alignment for a pair of sequences can be done in O(n2) time, where n is the length of the two strings. Unfortunately, for the general optimization problem of aligning k sequences of length n , O(nk) time is required.
In this thesis, we shall first propose an efficient group alignment method to perform the alignment between two groups of sequences. Then we shall propose a clustering method to build the tree topology for merging. The clustering method is based on the concept that the two sequences having the longest distance should be split into two clusters. By our experiments, both the alignment quality and required time of our algorithm are better than those of NJ (neighbor joining) algorithm and Clustal W algorithm.
Identifer | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0823101-124209 |
Date | 23 August 2001 |
Creators | Huang, Kuen-Feng |
Contributors | D. J. Guan, Yue-Li Wang, Chang-Biau Yang, Sing-Ling Lee |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0823101-124209 |
Rights | not_available, Copyright information available at source archive |
Page generated in 0.0015 seconds