Protein fold recognition is an important problem in biophysics.
The primary structure of a protein is believed to carry information useful for determining its three-dimensional (3D) structure.
Given a target protein (a sequence of amino acids), the
protein fold recognition problem is to decide which fold group
in a protein structure database the target protein belongs to.
Since there are more than two candidate fold groups, it
is a multi-class classification problem.
Recently, many researchers have addressed this problem with
popular machine learning tools such as neural networks (NN) and support
vector machines (SVM). In this thesis, we use an SVM to solve this
problem. Our strategy is to identify effective features that
can guide the classification.
We build a feature preference table to
help us find effective feature combinations quickly.
We take 27 well-known fold groups
in SCOP (Structural Classification of Proteins) as our data set. Our
experimental results show that our method achieves an overall prediction
accuracy of 61.4%, which is better than that of the previous method (56.5%).
With the same feature combinations, our prediction accuracy is also
higher than the previous results. These results show that our method
is indeed effective for the fold recognition problem.
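
The abstract does not describe how the feature preference table is constructed, so the following Python sketch does not attempt to reproduce it. It only illustrates, under stated assumptions, the two quantities the abstract refers to: overall prediction accuracy over fold-group labels, and a search over feature combinations. The search here is a plain exhaustive enumeration, which is presumably the slower baseline that a preference table is meant to shortcut. The function names, the `evaluate` callback, and the feature names are all hypothetical and are not taken from the thesis.

```python
from __future__ import annotations

from itertools import combinations
from typing import Callable, Sequence


def overall_accuracy(true_folds: Sequence[str], predicted_folds: Sequence[str]) -> float:
    """Fraction of proteins whose predicted fold group equals the true one."""
    if len(true_folds) != len(predicted_folds):
        raise ValueError("label lists must have the same length")
    if not true_folds:
        raise ValueError("at least one protein is required")
    correct = sum(t == p for t, p in zip(true_folds, predicted_folds))
    return correct / len(true_folds)


def best_feature_combination(
    feature_names: Sequence[str],
    evaluate: Callable[[tuple[str, ...]], float],
    max_size: int,
) -> tuple[tuple[str, ...], float]:
    """Score every combination of 1..max_size features and keep the best.

    `evaluate` is a hypothetical callback: in practice it would train a
    multi-class classifier (for example an SVM) on the chosen features
    and return its accuracy on held-out proteins.
    """
    best_combo: tuple[str, ...] = ()
    best_score = float("-inf")
    for size in range(1, max_size + 1):
        for combo in combinations(feature_names, size):
            score = evaluate(combo)
            # Strict comparison keeps the first (smallest) combination on ties.
            if score > best_score:
                best_combo, best_score = combo, score
    return best_combo, best_score
```

With n candidate features the exhaustive search evaluates up to 2^n - 1 combinations, each requiring a full train-and-test cycle, which is why a faster guide to promising combinations is attractive.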
Identifier | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-1011107-054209 |
Date | 11 October 2007 |
Creators | Lin, Jyun-syong |
Contributors | Chia-ning Yang, Chang-biau Yang, Shyue-horng Shiau |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-1011107-054209 |
Rights | off_campus_withheld, Copyright information available at source archive |