With the advances in sequencing technologies, the number of protein sequences with
unknown function increases rapidly. Hence, computational methods for functional annotation
of these protein sequences have become of the utmost importance. In this thesis,
we first defined a feature space mapping from protein primary sequences to fixed-dimensional
numerical vectors. This mapping, which is called the Subsequence Profile Map
(SPMap), takes into account the models of the subsequences of protein sequences. The
resulting vectors were used as an input to support vector machines (SVM) for functional
classification of proteins. Second, we defined the protein functional annotation problem
as a classification problem and constructed a classification framework defined on Gene Ontology
(GO) terms. Different classification methods, as well as their combinations, were
assessed on this framework, which is based on 300 GO molecular function terms. The
results showed that combination enhances the classification accuracy. The resultant system
is made publicly available as an online function annotation tool.
Identifier | oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/12609767/index.pdf |
Date | 01 August 2008 |
Creators | Sarac, Omer Sinan |
Contributors | Atalay, Volkan |
Publisher | METU |
Source Sets | Middle East Technical Univ. |
Language | English |
Detected Language | English |
Type | Ph.D. Thesis |
Format | text/pdf |
Rights | To liberate the content for public access |