Machine learning is applied to a challenging and biologically significant protein classification problem: the prediction of flavonoid UGT acceptor regioselectivity from primary protein sequence. Novel indices characterizing graphical models of protein residues are introduced. The indices are compared with existing amino acid indices and found to cluster residues appropriately. A variety of models employing the indices are then investigated by examining their performance when analyzed using nearest neighbor, support vector machine, and Bayesian neural network classifiers. Improvements over nearest neighbor classifications relying on standard alignment similarity scores are reported.
Identifer | oai:union.ndltd.org:ETSU/oai:dc.etsu.edu:etd-3172 |
Date | 19 December 2009 |
Creators | Jackson, Arthur Rhydon |
Publisher | Digital Commons @ East Tennessee State University |
Source Sets | East Tennessee State University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Electronic Theses and Dissertations |
Rights | Copyright by the authors. |
Page generated in 0.0695 seconds