In the past decade, automation has led to an immense increase of data in biology. Next-generation sequencing techniques will produce a vast amount of sequences across all species in the coming years. In many cases, identifying the function and biological role of a protein from its sequence can be a complicated and time-intensive task. Identifying a protein's interaction partners is a tremendous help for understanding the biological context in which it is involved. In order to fully characterise a protein-protein interaction (PPI), it is necessary to know the three-dimensional structure of the interacting partners. Despite optimisation efforts from projects such as the Protein Structure Initiative, determining the structure of a protein through crystallography remains a time- and cost-intensive procedure. The primary aim of the research described in this dissertation was to produce a World Wide Web resource that facilitates visual exploration and validation (or questioning) of data derived from functional genomics experiments, by building upon existing structural information about direct physical PPIs. Secondary aims were (i) to demonstrate the utility of the new resource and (ii) to apply it in biological research. We created a database that specifically emphasises the intersection between the PPI results emerging from the structural biology and functional genomics communities. The BISC database holds BInary SubComplexes and Modellable Interactions in current functional genomics databases (BISC-MI). It is publicly available at http://bisc.cse.ucsc.edu. BISC is divided into three sections that deliver three types of information of interest to users seeking to investigate or browse PPIs. The template section (BISCHom and BISCHet) is devoted to those PPIs that are characterised in structural detail, i.e. binary subcomplexes (SCs) extracted from experimentally determined three-dimensional structures.
BISCHom and BISCHet contain the homodimeric (13,583 records) and heterodimeric (5,612 records) portions of these, respectively. Besides interactive, embedded Jmol displays emphasising the interface, standard information and links are provided, e.g. sequence information and SCOP classification for both partners, interface size and energy scores (PISA). An automated launch of the MolSurfer program enables the user to investigate electrostatic and hydrophobic correlation between the partners at the intermolecular interface. The modellable interactions section (BISC-MI) identifies potentially modellable interactions in three major functional genomics interaction databases (BioGRID, IntAct, HPRD). To create BISC-MI, all PPIs that are amenable to automated homology modelling based on conservative similarity cut-offs, and whose partner protein sequences have records in the UniProt database, were extracted. The modellable interaction services (BISC-MI Services) section offers, upon user request, modelled SC structures for any PPI in BISC-MI. This is enabled through an automated template-based (homology) modelling protocol using the popular MODELLER program. First, a multiple sequence alignment (MSA) between the target and homologous proteins collected from UniProt is generated using MUSCLE (only reviewed proteins from organisms whose genome has been completely sequenced are included, in order to find putative orthologs). Then a sequence-to-profile alignment is generated to integrate the template structure into the MSA. All models are produced upon user request to ensure that the most recent sequence data are used for the MSAs. Models generated through this protocol are expected to be generally more accurate than models offered by other automated resources that rely on pairwise alignments, e.g. ModBase. Two small studies were carried out to demonstrate the usability and utility of BISC in biological research.
(1) Interaction data in functional genomics databases often suffer from insufficient experimental and reporting standards. For example, multi-protein complexes are typically recorded as an inferred set of binary interactions. Using the 20S core particle of the yeast proteasome as an example, we demonstrate how the BISC Web resource can be used as a starting point for further investigation of such inferred interactions. (2) Malaria, a mosquito-borne disease, affects 350-500 million people worldwide. Still, very little is known about the genes of the malarial parasites and the functions of their proteins. For Plasmodium falciparum, the most lethal of the malaria parasites, only one experimentally derived medium-scale PPI set is available. The validity of this set has been doubted in the malaria biology community. We modelled and investigated eleven binary interactions from this set using the BISC modelling pipeline. In addition, we compared the BISC models of the individual partners to those obtained from ModBase.
Identifier | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:563581 |
Date | January 2011 |
Creators | Jüttemann, Thomas |
Contributors | Gerloff, Dietlind; Walkinshaw, Malcolm; Goryanin, Igor |
Publisher | University of Edinburgh |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | http://hdl.handle.net/1842/5666 |