With the improvement of biological techniques, the amount of biosequence
data, such as DNA, RNA and protein sequences, is growing explosively.
It is almost impossible to handle such a huge amount of data purely by manpower.
Thus, great computing power is essential.
There are several ways to process biosequence data: finding identical biosequences,
searching for similar biosequences, or mining the signatures of biosequences.
All of these are based on the same problem, the biosequence alignment
problem.
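As a point of reference, the classical global alignment score can be computed with a dynamic program in the style of Needleman and Wunsch. The sketch below is illustrative only and is not taken from the dissertation: the function name and the scores for `match`, `mismatch` and `gap` are assumptions, and a real application would normally use a substitution matrix instead of two constants.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Assumed scoring: constant match/mismatch scores and a linear gap penalty.
    Only two rows of the dynamic programming table are kept in memory.
    """
    # Row 0: aligning the empty prefix of a with each prefix of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # a[i-1] against a gap
                            curr[j - 1] + gap))  # b[j-1] against a gap
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("AC", "AGC"))
```

The score alone does not say which alignment achieves it, and several alignments may share the same optimal score.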
In this dissertation, we study biosequence alignment problems with the aim of
improving the biological meaning of optimal or near-optimal alignments, since
biologists and computer scientists sometimes dispute
the biological meaning of the mathematically optimal alignment
obtained under a given scoring function.
We first study methods to refine the optimal alignment of two given
biosequences. Since the optimal alignment is usually not unique, we try to
extract the best one among the optimal alignments by defining additional
criteria that judge the quality of the alignments when the traditional
methods cannot decide which one is better.
Two algorithms are proposed for solving two newly defined biosequence
alignment problems, the smoothest optimal alignment problem and the most
conserved optimal alignment problem. Some other criteria are also discussed,
since most of them can be handled in a similar way.
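The idea of ranking co-optimal alignments by a secondary criterion can be sketched as follows. The abstract does not define "smoothest" or "most conserved", so the sketch uses a stand-in criterion, the number of gap blocks (maximal runs of `-`), purely as an assumption. The function names and scoring values are also hypothetical. Enumerating every optimal alignment is exponential in the worst case, so this is only suitable for short sequences.

```python
import re


def optimal_alignments(a, b, match=1, mismatch=-1, gap=-1):
    """Return (optimal score, list of all optimal global alignments)."""
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + sub,
                          S[i - 1][j] + gap,
                          S[i][j - 1] + gap)

    def walk(i, j):
        # Follow every predecessor that attains S[i][j], back to (0, 0).
        if i == 0 and j == 0:
            yield "", ""
            return
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if S[i][j] == S[i - 1][j - 1] + sub:
                for x, y in walk(i - 1, j - 1):
                    yield x + a[i - 1], y + b[j - 1]
        if i > 0 and S[i][j] == S[i - 1][j] + gap:
            for x, y in walk(i - 1, j):
                yield x + a[i - 1], y + "-"
        if j > 0 and S[i][j] == S[i][j - 1] + gap:
            for x, y in walk(i, j - 1):
                yield x + "-", y + b[j - 1]

    return S[n][m], list(walk(n, m))


def gap_blocks(x, y):
    """Assumed tie-breaker: count maximal runs of '-' in both rows."""
    return len(re.findall(r"-+", x)) + len(re.findall(r"-+", y))


if __name__ == "__main__":
    score, alignments = optimal_alignments("AAGAA", "AAA")
    best = min(alignments, key=lambda pair: gap_blocks(*pair))
    print(score, len(alignments))
    print(best[0])
    print(best[1])
```

For `"AAGAA"` and `"AAA"` under these assumed scores, several alignments share the optimal score but differ in how the two gaps are arranged, which is exactly the situation where a secondary criterion is needed.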
We then observe that the most biologically meaningful alignment may not
be the optimal one, since there is no perfect scoring matrix. We therefore
look for candidates among the near-optimal alignments: we present a tracing
marking function to obtain all near-optimal alignments and use the
"most conserved" criterion to filter them. This is named the
near-optimal block alignment (NBA) problem.
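One common way to locate near-optimal alignments, shown here only as a rough sketch and not as the dissertation's tracing marking function, is to combine a forward and a backward dynamic programming table. A cell `(i, j)` lies on some alignment whose score is within `delta` of the optimum exactly when the best prefix score plus the best suffix score through that cell reaches `opt - delta`. The function name, the parameter `delta` and the scoring values are assumptions.

```python
def near_optimal_cells(a, b, delta, match=1, mismatch=-1, gap=-1):
    """Return (optimal score, set of DP cells on some near-optimal alignment).

    A cell (i, j) is reported when the best global alignment forced through
    it scores at least (optimal score - delta).
    """
    n, m = len(a), len(b)

    def sub(x, y):
        return match if x == y else mismatch

    # F[i][j]: best score for aligning a[:i] with b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)

    # B[i][j]: best score for aligning a[i:] with b[j:].
    B = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        B[i][m] = (n - i) * gap
    for j in range(m - 1, -1, -1):
        B[n][j] = (m - j) * gap
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            B[i][j] = max(B[i + 1][j + 1] + sub(a[i], b[j]),
                          B[i + 1][j] + gap,
                          B[i][j + 1] + gap)

    opt = F[n][m]
    cells = {(i, j)
             for i in range(n + 1)
             for j in range(m + 1)
             if F[i][j] + B[i][j] >= opt - delta}
    return opt, cells


if __name__ == "__main__":
    opt, cells = near_optimal_cells("AC", "AC", delta=0)
    print(opt, sorted(cells))
```

With `delta=0` only cells on optimal alignments are marked. Increasing `delta` widens the band of marked cells, and a secondary criterion can then be applied to the alignments that stay inside it.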
Finally, since existing scoring matrices are known to be
imperfect, we consider how to choose the winner
when multiple scoring matrices are applied. We define some
reasonable schemes to decide the winning alignment.
In this dissertation, we propose and discuss algorithms for near-optimal
alignment problems on biosequences.
In the future, we would like to conduct experiments to support
or reject these concepts.
Identifier | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0826108-182735 |
Date | 26 August 2008 |
Creators | Tseng, Kuo-Tsung |
Contributors | Chung-Nan Lee, Yaw-Ling Lin, Biing-Feng Wang, M. S. Chang, Chang-Biau Yang, Bang-Ye Wu, S. C. Tai |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0826108-182735 |
Rights | unrestricted, Copyright information available at source archive |