1 |
Using Chemical Crosslinking and Mass Spectrometry for Protein Model Validation and Fold Recognition
Mak, Esther W. M. January 2006
The 3D structures of proteins may provide important clues to their functions and roles in complex biological pathways. Traditional methods such as X-ray crystallography and NMR are not feasible for all proteins, while theoretical models are typically not validated by experimental data. This project investigates the use of chemical crosslinkers as an experimental means of validating these models. Five target proteins were successfully purified from yeast whole cell extract: transketolase (TKL1), inorganic pyrophosphatase (IPP1), amidotransferase/cyclase HIS7, phosphoglycerate kinase (PGK1) and enolase (ENO1). The TAP tag on these target proteins from the yeast Saccharomyces cerevisiae allowed each protein to be isolated in two affinity purification steps. Subsequent structural analysis used the homobifunctional chemical crosslinker BS3 to join pairs of lysine residues on the surface of the purified protein via a flexible spacer arm. Mass spectrometry (MS) analysis of the crosslinked protein generated a set of mass values for crosslinked and non-crosslinked peptides, which was used to identify surface lysine residues in close proximity. The Automatic Spectrum Assignment Program was used to assign sequence information to the crosslinked peptides. These data provided inter-residue distance constraints that can be used to validate or refute theoretical protein structure models generated by structure prediction software such as SWISS-MODEL and RAPTOR. This approach was able to validate the structure models for four of the target proteins: TKL1, IPP1, HIS7 and ENO1. It also successfully selected the correct models for TKL1 and IPP1 from a protein model library and provided weak support for the HIS7, PGK1 and ENO1 models.
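The distance-constraint check described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the thesis: the function name `check_crosslinks`, the toy coordinates, and the 24 Å Cα-Cα cutoff are all hypothetical. BS3 has a spacer arm of roughly 11.4 Å; the larger cutoff is an assumed allowance for the two lysine side chains and would need tuning in practice.

```python
import math

def check_crosslinks(ca_coords, crosslinks, max_distance=24.0):
    """Split observed crosslinks into those a model satisfies and those it violates.

    ca_coords    -- dict mapping residue number to an (x, y, z) C-alpha position
    crosslinks   -- list of (residue_i, residue_j) lysine pairs identified by MS
    max_distance -- assumed upper bound in angstroms on the C-alpha distance
    """
    satisfied, violated = [], []
    for i, j in crosslinks:
        d = math.dist(ca_coords[i], ca_coords[j])
        (satisfied if d <= max_distance else violated).append((i, j, d))
    return satisfied, violated

def support_fraction(ca_coords, crosslinks, max_distance=24.0):
    """Fraction of crosslinks consistent with the model (0.0 if none were observed)."""
    if not crosslinks:
        return 0.0
    satisfied, _ = check_crosslinks(ca_coords, crosslinks, max_distance)
    return len(satisfied) / len(crosslinks)

# Toy model: three lysine C-alpha positions (made-up coordinates).
model = {10: (0.0, 0.0, 0.0), 25: (12.0, 16.0, 0.0), 80: (30.0, 40.0, 0.0)}
observed = [(10, 25), (10, 80)]
ok, bad = check_crosslinks(model, observed)
```

A model library could then be ranked by `support_fraction`, with models that violate many observed crosslinks treated as refuted.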
|
3 |
Structural bioinformatics analysis of the family of human ubiquitin-specific proteases
Zhu, Xiao January 2007
Thesis digitized by the Division de la gestion de documents et des archives (Records Management and Archives Division) of the Université de Montréal.
|
4 |
A Fold Recognition Approach to Modeling of Structurally Variable Regions
Levefelt, Christer January 2004
A novel approach is proposed for modeling of structurally variable regions in proteins. In this approach, a prerequisite sequence-structure alignment is examined for regions where the target sequence is not covered by the structural template. These regions, extended with a number of residues from adjacent stem regions, are submitted to fold recognition. The alignments produced by fold recognition are integrated into the initial alignment to create a multiple alignment where gaps in the main structural template are covered by local structural templates. This multiple alignment is used to create a protein model by existing protein modeling techniques.
Several alternative parameters are evaluated using a set of ten proteins. One set of parameters is selected and evaluated using another set of 31 proteins. The most promising result is for loop regions not located at the C- or N-terminal of a protein, where the method produces an average RMSD 12% lower than the loop modeling provided with the program MODELLER. This improvement is shown to be statistically significant.
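The first step of this approach, locating target regions that the template does not cover and extending them into the adjacent stems, might look like the following Python sketch. The function name `uncovered_regions`, the `stem` parameter, and the toy alignment are assumptions made for illustration; the thesis's own implementation and parameter values are not given in the abstract.

```python
def uncovered_regions(target_aln, template_aln, stem=3):
    """Find target residue ranges with no aligned template residue.

    Both arguments are rows of one pairwise alignment, with '-' marking gaps.
    Returns half-open (start, end) index ranges into the ungapped target
    sequence, each widened by `stem` residues on both sides and clipped to
    the sequence ends. Ranges that overlap after widening are not merged.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("alignment rows must have equal length")

    regions = []
    pos = 0        # index of the current residue in the ungapped target
    start = None   # start of the uncovered run in progress, if any
    for t, s in zip(target_aln, template_aln):
        if t == "-":
            continue  # template-only column: no target residue is consumed
        if s == "-":
            if start is None:
                start = pos
        elif start is not None:
            regions.append((start, pos))
            start = None
        pos += 1
    if start is not None:
        regions.append((start, pos))

    return [(max(0, a - stem), min(pos, b + stem)) for a, b in regions]

# Toy alignment: the template has no residues aligned to target residues F, G, H.
target = "ACDEFGHIKLMNPQ"
template = "ACDE---IKLMNPQ"
loops = uncovered_regions(target, template, stem=2)
fragments = [target[a:b] for a, b in loops]
```

Each fragment would then be submitted to fold recognition on its own, and the resulting local alignments merged back into the initial alignment.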
|
6 |
Functional characterization of proteins involved in cell cycle by structure-based computational methods
Sontheimer, Jana 14 May 2012
In recent years, a rapidly increasing amount of experimental data has been generated by high-throughput technologies. Despite these large quantities of protein-related data and the development of computational prediction methods, the function of many proteins is still unknown. In the human proteome, at least 20% of the annotated proteins are not characterized. Thus, the question of how to predict a protein's function from its amino acid sequence remains to be answered for many proteins. Classical bioinformatics approaches for function prediction are based on inferring function from well-characterized homologs, which are identified based on sequence similarity. However, these methods fail to identify distant homologs with low sequence similarity. As protein structure is more conserved than sequence in protein families, structure-based methods (e.g. fold recognition) may recognize possible structural similarities even at low sequence similarity and therefore provide information for function inference. These fold recognition methods have already proven successful for individual proteins, but their automation for high-throughput application is difficult due to intrinsic challenges of these techniques, mainly caused by a high false positive rate. Automated identification of remote homologs based on fold recognition methods would allow a significant improvement in functional annotation of proteins. My approach was to combine structure-based computational prediction methods with experimental data from genome-wide RNAi screens to support the establishment of functional hypotheses by improving the analysis of protein structure prediction results.
In the first part of my thesis, I characterized proteins from the Ska complex by computational methods. I showed the benefit of including experimental information to identify remote homologs: integration of functional data helped to reduce the number of false positives in fold recognition results and made it possible to establish interesting functional hypotheses based on high-confidence structural predictions. Based on the structural hypothesis of a GLEBS motif in c13orf3 (Ska3), I could derive a potential molecular mechanism that could explain the observed phenotype.
In the second part of my thesis, my goal was to develop computational tools and automated analysis techniques to be able to perform structure-based functional annotation in a high-throughput way. I designed and implemented key tools that were successfully integrated into a computational platform, called StrAnno, which I set up together with my colleagues. These novel computational modules include a domain prediction algorithm and a graphical overview that facilitates and accelerates the analysis of results.
StrAnno can be seen as a first step towards automatic functional annotation of proteins by structure-based methods. First, the analysis of long hit lists to identify promising candidates for further analysis is substantially facilitated by integration and combination of various sequence-based computational tools and data from functional databases. Second, the developed post-processing tools accelerate the evaluation of structural and functional hypotheses. False positives from the threading result lists are removed by various filters, and analysis of the possible true positives is greatly enhanced by the graphical overview. With these two essential benefits, fold recognition techniques become applicable to large-scale approaches. By applying this methodology to hits from a genome-wide cell cycle RNAi screen and evaluating structural hypotheses by molecular modeling techniques, I aimed to associate biological functions with human proteins and link the RNAi phenotype to a molecular function. For two selected human proteins, c20orf43 and HJURP, I could establish interesting structural and functional hypotheses. These predictions were based on templates with low sequence identity (10-20%). The uncharacterized human protein c20orf43 might be an E3 SUMO-ligase that could be involved either in DNA repair or rRNA regulatory processes. Based on the structural hypotheses of two domains of HJURP, I predicted a potential link to ubiquitylation processes and direct DNA binding. In addition, I substantiated the cell cycle arrest phenotype of these two genes upon RNAi knockdown.
Fold recognition methods are a promising alternative for functional annotation of proteins that escape sequence-based annotation due to their low sequence identity to well-characterized protein families. The structural and functional hypotheses I established in my thesis open the door to investigate the molecular mechanisms of previously uncharacterized proteins, which may provide new insights into cellular mechanisms.
|
8 |
Functional characterization of proteins involved in cell cycle by structure-based computational methods
Sontheimer, Jana 16 April 2012
In recent years, a rapidly increasing amount of experimental data has been generated by high-throughput technologies. Despite these large quantities of protein-related data and the development of computational prediction methods, the function of many proteins is still unknown. In the human proteome, at least 20% of the annotated proteins are not characterized. Thus, the question of how to predict a protein's function from its amino acid sequence remains to be answered for many proteins. Classical bioinformatics approaches for function prediction are based on inferring function from well-characterized homologs, which are identified based on sequence similarity. However, these methods fail to identify distant homologs with low sequence similarity. As protein structure is more conserved than sequence in protein families, structure-based methods (e.g. fold recognition) may recognize possible structural similarities even at low sequence similarity and therefore provide information for function inference. These fold recognition methods have already proven successful for individual proteins, but their automation for high-throughput application is difficult due to intrinsic challenges of these techniques, mainly caused by a high false positive rate. Automated identification of remote homologs based on fold recognition methods would allow a significant improvement in functional annotation of proteins. My approach was to combine structure-based computational prediction methods with experimental data from genome-wide RNAi screens to support the establishment of functional hypotheses by improving the analysis of protein structure prediction results.
In the first part of my thesis, I characterized proteins from the Ska complex by computational methods. I showed the benefit of including experimental information to identify remote homologs: integration of functional data helped to reduce the number of false positives in fold recognition results and made it possible to establish interesting functional hypotheses based on high-confidence structural predictions. Based on the structural hypothesis of a GLEBS motif in c13orf3 (Ska3), I could derive a potential molecular mechanism that could explain the observed phenotype.
In the second part of my thesis, my goal was to develop computational tools and automated analysis techniques to be able to perform structure-based functional annotation in a high-throughput way. I designed and implemented key tools that were successfully integrated into a computational platform, called StrAnno, which I set up together with my colleagues. These novel computational modules include a domain prediction algorithm and a graphical overview that facilitates and accelerates the analysis of results.
StrAnno can be seen as a first step towards automatic functional annotation of proteins by structure-based methods. First, the analysis of long hit lists to identify promising candidates for further analysis is substantially facilitated by integration and combination of various sequence-based computational tools and data from functional databases. Second, the developed post-processing tools accelerate the evaluation of structural and functional hypotheses. False positives from the threading result lists are removed by various filters, and analysis of the possible true positives is greatly enhanced by the graphical overview. With these two essential benefits, fold recognition techniques become applicable to large-scale approaches. By applying this methodology to hits from a genome-wide cell cycle RNAi screen and evaluating structural hypotheses by molecular modeling techniques, I aimed to associate biological functions with human proteins and link the RNAi phenotype to a molecular function. For two selected human proteins, c20orf43 and HJURP, I could establish interesting structural and functional hypotheses. These predictions were based on templates with low sequence identity (10-20%). The uncharacterized human protein c20orf43 might be an E3 SUMO-ligase that could be involved either in DNA repair or rRNA regulatory processes. Based on the structural hypotheses of two domains of HJURP, I predicted a potential link to ubiquitylation processes and direct DNA binding. In addition, I substantiated the cell cycle arrest phenotype of these two genes upon RNAi knockdown.
Fold recognition methods are a promising alternative for functional annotation of proteins that escape sequence-based annotation due to their low sequence identity to well-characterized protein families. The structural and functional hypotheses I established in my thesis open the door to investigate the molecular mechanisms of previously uncharacterized proteins, which may provide new insights into cellular mechanisms.
|
9 |
Model Detection Based upon Amino Acid Properties
Menlove, Kit J. 09 August 2010
Similarity searches are an essential component of most bioinformatic applications. They form the basis of structural motif identification, gene identification, and insights into functional associations. With the rapid increase in the genetic data available through a wide variety of databases, similarity searches are an essential tool for accessing these data in an informative and productive way. In our chapter, we provide an overview of similarity searching approaches, related databases, and parameter options to achieve the best results for a variety of applications. We then provide a worked example and some notes for consideration.
Homology detection is one of the most basic and fundamental problems at the heart of bioinformatics. It is central to problems currently under intense investigation in protein structure prediction, phylogenetic analyses, and computational drug development. Currently, discriminative methods for homology detection, which are not readily interpretable, are substantially more powerful than their more interpretable counterparts, particularly when sequence identity is very low. Here I present a computational graph-based framework for homology inference using physicochemical amino acid properties, which aims both to reduce the gap in accuracy between discriminative and generative methods and to provide a framework for easily identifying the physicochemical basis for the structural similarity between proteins. The accuracy of my method slightly improves on the accuracy of PSI-BLAST, the most popular generative approach, and underscores the potential of this methodology given a more robust statistical foundation.
|
10 |
Fold recognition and alignment in the 'twilight zone'
Hill, Jamie Richard January 2013
At present, the most accurate approach to predicting protein structure, comparative modelling, builds a model of a target sequence using known protein structures as templates. Comparative modelling becomes markedly less accurate in the ‘twilight zone’, where the target protein shares little sequence identity with any known template. There are two main causes of this inaccuracy: first, it becomes difficult to identify good structural templates; second, it becomes difficult to determine which amino acids in the template are structurally equivalent to those in the target. These are problems of fold recognition and target-template alignment respectively. In this thesis, new approaches are developed to address both these problems. The alignment problem is investigated in the special case of membrane proteins. These are key modelling targets as they resist structure determination and are pharmaceutically important. The approach taken here is to use ‘environment specific substitution tables’ (ESSTs), that is, to alter the alignment scoring system for each local environment of the template structure. We show how ESSTs can be made for membrane proteins, tested for robustness of construction, and used to infer the most important evolutionary pressures acting on protein structure. The incorporation of ESSTs into a multiple sequence alignment method leads to more accurate alignments of membrane proteins, and so to more accurate models. Recently, algorithms have been developed that predict contacts in protein structures from a multiple sequence alignment of homologous sequences. We explore the potential of these predictions for fold recognition by developing an algorithm that makes no use of amino acid identity, and so should be agnostic to the existence of a ‘twilight zone’. We show that whilst this is not the case, our method is complementary to state-of-the-art approaches.
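The idea behind environment-specific scoring can be illustrated with a short Python sketch: the score for aligning a template residue with a target residue is looked up in a table chosen by the template residue's local environment. Everything concrete here is an assumption made for illustration: the two environment labels, the table values, the default score, and the function names are hypothetical and are not taken from the thesis.

```python
# Hypothetical substitution tables keyed by template environment.
# Keys inside each table are (template_residue, target_residue) pairs;
# the scores are invented for this example.
ESST = {
    "membrane": {("L", "I"): 3, ("L", "K"): -4, ("L", "L"): 4},
    "water":    {("L", "I"): 1, ("L", "K"): -1, ("L", "L"): 2},
}

def column_score(template_res, target_res, environment, tables, default=-1):
    """Score one aligned residue pair using the table for its environment."""
    return tables[environment].get((template_res, target_res), default)

def alignment_score(template_seq, target_seq, environments, tables):
    """Sum environment-specific scores over an ungapped alignment.

    environments -- one environment label per template position
    """
    if not (len(template_seq) == len(target_seq) == len(environments)):
        raise ValueError("sequences and environment labels must match in length")
    return sum(
        column_score(a, b, env, tables)
        for a, b, env in zip(template_seq, target_seq, environments)
    )

# The same substitutions score differently depending on where they occur.
buried_first = alignment_score("LL", "IK", ["membrane", "water"], ESST)
buried_last = alignment_score("LL", "IK", ["water", "membrane"], ESST)
```

In this toy setup a leucine-to-lysine substitution is penalised more heavily in the membrane environment than in water, which is the kind of environment dependence such tables are meant to capture.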
|