High-throughput genomics projects have led to a rapid accumulation of protein sequences, so computational methods that predict protein functions and functional sites efficiently and accurately are in high demand. Prediction methods that use only sequence information are of particular interest because three-dimensional structures are not available for most proteins. However, developing such methods poses several key challenges: constructing representative datasets to train and evaluate the method, collecting features related to protein function, selecting the most useful features, and integrating the selected features into suitable computational models. In this proposed study, we tackle these challenges by developing procedures for benchmark dataset construction and protein feature extraction, implementing efficient feature selection strategies, and developing effective machine learning algorithms for predicting protein function and functional sites. We investigate these challenges in three bioinformatics tasks: the discovery of transmembrane beta-barrel (TMB) proteins in Gram-negative bacterial proteomes, the identification of deleterious non-synonymous single nucleotide polymorphisms (nsSNPs), and the identification of helix-turn-helix (HTH) motifs from protein sequence.
Identifier | oai:union.ndltd.org:UTAHS/oai:digitalcommons.usu.edu:etd-1291 |
Date | 01 May 2009 |
Creators | Hu, Jing |
Publisher | DigitalCommons@USU |
Source Sets | Utah State University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | All Graduate Theses and Dissertations |
Rights | Copyright for this work is held by the author. Transmission or reproduction of materials protected by copyright beyond that allowed by fair use requires the written permission of the copyright owners. Works not in the public domain cannot be commercially exploited without permission of the copyright owner. Responsibility for any use rests exclusively with the user. For more information contact Andrew Wesolek (andrew.wesolek@usu.edu). |