
Predicting Function of Genes and Proteins from Sequence, Structure and Expression Data

<p>Functional genomics refers to the task of determining gene and protein function for whole genomes, and requires computational analysis of large amounts of biological data, including DNA and protein sequences, protein structures and gene expression measurements. Machine learning methods are a powerful tool to this end: they first induce general models from such data and from already characterized genes or proteins, and then provide hypotheses on the functions of the remaining, uncharacterized ones.</p><p>This study comprises four parts, each making novel contributions to functional genomics through the analysis of different biological data and different aspects of biological function. Gene Ontology played an important part in this research, providing a controlled vocabulary for describing the cellular roles of genes and proteins in terms of specific molecular functions and broad biological processes.</p><p>The first part used gene expression time profiles to learn models capable of predicting the participation of genes in biological processes. The models consist of IF-THEN rules associating biological processes with minimal sets of discrete changes in expression level over limited periods of time. The models were used to hypothesize new biological processes for both characterized and uncharacterized genes.</p><p>The second part investigated the combinatorial nature of gene regulation by inducing IF-THEN rules associating minimal combinations of sequence motifs common to genes with similar expression profiles. Such combinations were shown to be significantly correlated with function, and provided hypotheses on the mechanisms behind the regulation of gene expression in several biological responses.</p><p>The third part used a novel concept of local descriptors of protein structure to investigate sequence patterns governing protein structure at a local level and to predict the topological class (fold) of protein domains from sequence. 
Finally, the fourth part used local descriptors to represent protein structure and induced IF-THEN rule models that predict molecular function from structure.</p>

Identifier oai:union.ndltd.org:UPSALLA/oai:DiVA.org:uu-4490
Date January 2004
Creators Hvidsten, Torgeir R.
Publisher Uppsala University, The Linnaeus Centre for Bioinformatics, Uppsala : Acta Universitatis Upsaliensis
Source Sets DiVA Archive at Uppsala University
Language English
Detected Language English
Type Doctoral thesis, comprehensive summary, text
Relation Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, 1104-232X ; 999
