An Effective Feature Selection for Protein Fold Recognition

The protein fold recognition problem is an important topic in biophysics.
It is believed that the primary structure of a protein helps in deriving its three-dimensional (3D) structure.
Given a target protein (a sequence of amino acids), the
protein fold recognition problem is to decide which fold group
in a protein structure database the target protein belongs to.
Since there are more than two candidate fold groups, it
is a multi-class classification problem.
Recently, many researchers have approached this problem with
popular machine learning tools, such as neural networks (NN) and support
vector machines (SVM). In this thesis, we use an SVM to solve this
problem. Our strategy is to identify effective features that
can guide the classification.
We build a feature preference table to
help us find effective feature combinations quickly.
We take 27 well-known fold groups
in SCOP (Structural Classification of Proteins) as our data set. Our
experimental results show that our method achieves an overall prediction
accuracy of 61.4%, which is better than that of the previous method (56.5%).
With the same feature combinations, our prediction accuracy is also
higher than the previous results. These results show that our method
is indeed effective for the fold recognition problem.
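
The idea of searching for an effective feature combination, scored by overall prediction accuracy on a multi-class problem, can be illustrated with a small sketch. This is not the thesis's feature preference table or its SVM setup, neither of which is described in the abstract. It assumes a plain greedy forward search and swaps in a nearest-centroid classifier for the SVM so that the example needs no external library. The feature group names (`a`, `b`), fold labels, and data values are made up for illustration.

```python
def overall_accuracy(train, test, groups):
    """Overall accuracy (correct / total) of a nearest-centroid classifier
    that uses only the selected feature groups. Stand-in for an SVM."""
    def vec(feats):
        # Concatenate the selected feature groups into one vector.
        return [x for g in groups for x in feats[g]]

    sums, counts = {}, {}
    for feats, label in train:
        v = vec(feats)
        if label not in sums:
            sums[label], counts[label] = [0.0] * len(v), 0
        sums[label] = [s + x for s, x in zip(sums[label], v)]
        counts[label] += 1
    centroids = {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

    correct = 0
    for feats, label in test:
        v = vec(feats)
        pred = min(
            centroids,
            key=lambda lab: sum((x - c) ** 2 for x, c in zip(v, centroids[lab])),
        )
        correct += pred == label
    return correct / len(test)


def greedy_select(all_groups, score):
    """Greedy forward search: repeatedly add the feature group that most
    improves the score, and stop when no addition improves it."""
    chosen, best = [], 0.0
    while True:
        pick = None
        for g in all_groups:
            if g in chosen:
                continue
            s = score(chosen + [g])
            if s > best:
                best, pick = s, g
        if pick is None:
            return chosen, best
        chosen.append(pick)


# Toy three-class data (hypothetical): group "a" separates the folds,
# group "b" does not.
train = [
    ({"a": [0.0], "b": [5.0]}, "fold1"), ({"a": [0.2], "b": [1.0]}, "fold1"),
    ({"a": [1.0], "b": [4.0]}, "fold2"), ({"a": [1.2], "b": [0.0]}, "fold2"),
    ({"a": [2.0], "b": [3.0]}, "fold3"), ({"a": [2.2], "b": [2.0]}, "fold3"),
]
test = [
    ({"a": [0.1], "b": [2.5]}, "fold1"),
    ({"a": [1.1], "b": [3.0]}, "fold2"),
    ({"a": [2.1], "b": [1.0]}, "fold3"),
]

chosen, best = greedy_select(
    ["a", "b"], lambda groups: overall_accuracy(train, test, groups)
)
print(chosen, best)
```

On this toy data the search keeps only group `a`, since adding `b` does not raise the accuracy. A real experiment would score each combination with cross-validation rather than a single held-out set.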

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-1011107-054209
Date: 11 October 2007
Creators: Lin, Jyun-syong
Contributors: Chia-ning Yang, Chang-biau Yang, Shyue-horng Shiau
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-1011107-054209
Rights: off_campus_withheld, Copyright information available at source archive