A Classification System For The Problem Of Protein Subcellular Localization

This study focuses on predicting the subcellular localization of proteins. Subcellular localization
information is important for protein function annotation, which is a fundamental problem in computational
biology. For this problem, a classification system is built that has two main parts: a predictor that uses
a feature mapping technique to extract biologically meaningful information from protein sequences,
and a client/server architecture for searching and predicting subcellular localizations. The first part of the
thesis describes a feature mapping technique based on frequent patterns. In this technique,
frequent patterns in a protein sequence dataset are identified using a search technique based on the a priori
property, and the distribution of these patterns over a new sample is used as a feature vector for classification.
The effect of several feature selection methods on classification performance is investigated, and the best
one is applied. The method is assessed on the subcellular localization
prediction problem with four compartments (endoplasmic reticulum (ER) targeted, cytosolic, mitochondrial, and nuclear),
using the same dataset as P2SL. Our method improves the overall accuracy from the 81.96% obtained by P2SL
to 91.71%. In the second part of the thesis, a client/server architecture is designed and implemented
using Simple Object Access Protocol (SOAP) technology, providing a user-friendly interface for accessing the
protein subcellular localization predictions. The client is a Cytoscape plug-in used for functional
enrichment of biological networks. Rather than presenting subcellular localization information for individual proteins,
this plug-in lets biologists analyze a set of genes/proteins from a system-level view.
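
The frequent-pattern feature mapping summarized above can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes that patterns are contiguous residue substrings, that "a priori property" refers to the Apriori-style rule that a pattern can be frequent only if its sub-patterns are frequent, and that the feature vector holds raw occurrence counts. The function names `frequent_patterns` and `feature_vector` and the parameters `min_support` and `max_len` are hypothetical.

```python
from collections import Counter


def frequent_patterns(sequences, min_support, max_len):
    """Return substrings found in at least `min_support` sequences.

    Assumption: patterns are contiguous substrings, grown one residue
    at a time with Apriori-style pruning.
    """
    # Level 1: residues that occur in enough sequences.
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))
    current = {p for p, c in counts.items() if c >= min_support}
    alphabet = sorted(current)

    frequent = []
    length = 1
    while current and length <= max_len:
        frequent.extend(sorted(current))
        # Extend each frequent pattern by one residue; keep a candidate
        # only if its suffix of the current length is also frequent.
        candidates = {
            p + a for p in current for a in alphabet if (p + a)[1:] in current
        }
        # Count each candidate once per sequence that contains it.
        counts = Counter()
        for seq in sequences:
            present = {seq[i:i + length + 1] for i in range(len(seq) - length)}
            counts.update(present & candidates)
        current = {p for p, c in counts.items() if c >= min_support}
        length += 1
    return frequent


def feature_vector(seq, patterns):
    """Count (possibly overlapping) occurrences of each pattern in `seq`."""
    return [
        sum(1 for i in range(len(seq) - len(p) + 1) if seq.startswith(p, i))
        for p in patterns
    ]
```

On a toy input, `frequent_patterns(["MKTAY", "MKTLL", "AMKTA"], min_support=3, max_len=3)` returns `['K', 'M', 'T', 'KT', 'MK', 'MKT']`, and `feature_vector("MKTMK", ...)` over those patterns gives `[2, 2, 1, 1, 2, 1]`. In a pipeline like the one the abstract describes, such vectors would then pass through feature selection before being given to a classifier.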

Identifier: oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/3/12608914/index.pdf
Date: 01 September 2007
Creators: Alay, Gokcen
Contributors: Atalay, Volkan
Publisher: METU
Source Sets: Middle East Technical Univ.
Language: English
Detected Language: English
Type: M.S. Thesis
Format: text/pdf
Rights: To liberate the content for public access