Protein Fold Recognition Using Adaboost Learning Strategy

Protein structure prediction is one of the most important and difficult problems in computational molecular biology. Unlike sequence-only comparison, protein fold recognition based on machine learning algorithms attempts to detect similarities between protein structures that may not be accompanied by any significant sequence similarity. It takes advantage of structural and physical properties beyond sequence information. In this thesis, we present a novel classifier for protein fold recognition that uses the AdaBoost algorithm hybridized with a k-nearest-neighbor classifier. The experimental framework consists of two tasks: (i) cross-validation within the training dataset, and (ii) testing on an unseen validation dataset, in which 90% of the proteins have less than 25% sequence identity with the training samples. Our method achieves a 64.7% success rate in classifying the independent validation dataset into 27 types of protein folds. Our experiments on protein fold recognition demonstrate the merit of this approach: coupling the AdaBoost strategy with weak learning classifiers leads to improved and robust performance, with 64.7% accuracy versus 61.2% in the published literature using identical sample sets, feature representation, and class labels.
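
The abstract does not say how AdaBoost and the k-nearest-neighbor classifier are combined, so the following is only a minimal sketch of one possible coupling, not the thesis's method. It assumes the multi-class SAMME variant of AdaBoost and a k-NN weak learner in which each neighbor votes with its current boosting weight. All function names, the toy 2-D data, and the parameter values are hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


def knn_predict(train_X, train_y, weights, x, k=3):
    """Weighted k-NN: each of the k nearest training points votes with its boosting weight."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = defaultdict(float)
    for i in order[:k]:
        votes[train_y[i]] += weights[i]
    return max(votes, key=votes.get)


def adaboost_knn_fit(X, y, rounds=5, k=3):
    """SAMME-style boosting loop (assumed variant); requires at least two classes."""
    n = len(X)
    n_classes = len(set(y))
    w = [1.0 / n] * n
    ensemble = []  # one (alpha, weight snapshot) pair per boosting round
    for _ in range(rounds):
        preds = [knn_predict(X, y, w, x, k) for x in X]
        miss = [p != t for p, t in zip(preds, y)]
        err = sum(wi for wi, m in zip(w, miss) if m)
        if err >= 1.0 - 1.0 / n_classes:
            break  # weak learner is no better than random guessing
        err = max(err, 1e-10)  # avoid division by zero on a perfect round
        alpha = math.log((1.0 - err) / err) + math.log(n_classes - 1)
        ensemble.append((alpha, list(w)))
        if not any(miss):
            break  # nothing left to reweight
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(alpha) if m else wi for wi, m in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_knn_predict(ensemble, X, y, x, k=3):
    """Combine the per-round k-NN predictions by an alpha-weighted vote."""
    if not ensemble:  # fall back to plain unweighted k-NN
        return knn_predict(X, y, [1.0] * len(X), x, k)
    scores = defaultdict(float)
    for alpha, w in ensemble:
        scores[knn_predict(X, y, w, x, k)] += alpha
    return max(scores, key=scores.get)


# Hypothetical toy data: three well-separated 2-D clusters standing in for fold classes.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0, 5), (0, 6), (1, 5)]
y = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
model = adaboost_knn_fit(X, y)
print(adaboost_knn_predict(model, X, y, (0.2, 0.3)))
```

On this separable toy set the first round already classifies every training point correctly, so the loop stops after one round; reweighting only comes into play when some training points are misclassified. Real fold-recognition features would be much higher-dimensional than these 2-D points.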

Identifier	oai:union.ndltd.org:IUPUI/oai:scholarworks.iupui.edu:1805/2267
Date	29 September 2010
Creators	Su, Yijing
Source Sets	Indiana University-Purdue University Indianapolis
Language	en_US
Detected Language	English
Type	Thesis
