Global ETD Search

21	A computational investigation of solubility, functionality and the adaptation in subcellular compartments of proteins
Chan, Pedro / January 2011
A cell is considered to be the smallest unit of life. It carries out a variety of biochemical reactions through the activities of proteins and protein enzymes. To perform their functions, proteins must be in their native folded state under the correct environmental conditions. A slight change in pH or temperature can disrupt the electrostatic interactions within the protein, leading to conformational change and loss of activity. Studies have shown that solubility can be enhanced by increasing the number of charges on the protein surface, and studies of extremophiles suggest that the presence of non-polar aromatic residues could be key to thermostable proteins. Thus, charges are important in determining the function and adaptation of proteins.
Over the decades, a large amount of protein sequence and structure information relating to molecular biology has been produced. By employing algorithms and computational and statistical techniques, it is possible to analyse these data to solve biological problems. Often these investigations are based mainly on sequences, since their numbers outstrip the number of available structures. However, adding structures allows us to investigate problems such as the relationship between charges, sequence, structure and function, which is the aim of this study.
In this thesis, the relationships between proteins and function were examined using various electrostatic features derived from charges, together with geometric properties derived from structures. One interesting finding is that the average pH of maximum stability of the proteins within a subcellular location was highly correlated with the pH of that subcellular compartment, which was due to the pKas of histidines and their locations on the proteins.
We also found that the size of the largest non-charged patch on the protein surface correlates with solubility and provides a predictor with a maximum accuracy of 76%. The novel charge-based methods showed little improvement in distinguishing between enzymes and non-enzymes. However, the method of using real charges with a grid size of 1 angstrom has paved the way for using charge and dipole patterns from enzyme active sites to distinguish different enzymes. Finally, a web tool for displaying conserved residues on 3D protein structures is made available to the public for identifying residues that may be of functional importance.
572
22	Automated structural annotation of the malaria proteome and identification of candidate proteins for modelling and crystallization studies
Joubert, Yolandi / 29 July 2008
Malaria is the cause of over one million deaths per year, primarily in African children. The parasite responsible for the most virulent form of malaria is Plasmodium falciparum. Protein structure plays a pivotal role in elucidating mechanisms of parasite functioning and resistance to anti-malarial drugs. Protein structure furthermore aids the determination of protein function, which can, together with the structure, be used to identify novel drug targets in the parasite. However, various structural features in P. falciparum proteins complicate the experimental determination of three-dimensional protein structures. Furthermore, the presence of parasite-specific inserts results in reduced similarity of these proteins to orthologous proteins with experimentally determined structures. The lack of solved structures in the malaria parasite, together with limited similarities to proteins in the Protein Data Bank (PDB), necessitates genome-scale structural annotation of P. falciparum proteins. Additionally, the annotation of a range of structural features facilitates the identification of suitable targets for structural studies. An integrated structural annotation system was constructed and applied to all the predicted proteins in P. falciparum, Plasmodium vivax and Plasmodium yoelii. Similarity searches against the PDB, Pfam, Superfamily, PROSITE and PRINTS were included. In addition, the following predictions were made for the P. falciparum proteins: secondary structure, transmembrane helices, protein disorder, low complexity, coiled-coils and small molecule interactions. P. falciparum protein-protein interactions and proteins exported to the red blood cell (RBC) were annotated from the literature. Finally, a selection of proteins was threaded through a library of SCOP folds.
All the results are stored in a relational PostgreSQL database and can be viewed through a web interface (http://deepthought.bi.up.ac.za:8080/Annotation). A query tool was constructed to select groups of proteins that fulfill certain criteria with regard to structural and functional features. Using this tool, criteria regarding the presence or absence of all the predicted features can be specified. Analysis of the results revealed that P. falciparum protein-interacting proteins contain a higher percentage of predicted disordered residues than non-interacting proteins. Proteins interacting with 10 or more proteins have a disordered content concentrated in the range of 60-100%, while the disorder distribution for proteins having only one interacting partner was more evenly spread. Comparisons of structural and sequence features between the three species revealed that P. falciparum proteins tend to be longer and vary more in length than those of the other two species. P. falciparum proteins also contained more predicted low complexity and disorder content than proteins from P. yoelii and P. vivax. P. falciparum protein targets for experimental structure determination, comparative modeling and in silico docking studies were putatively identified based on structural features. For experimental structure determination, 178 targets were identified. These targets contain limited contents of predicted transmembrane helix, disorder, coiled-coils, low complexity and signal peptide, as these features may complicate steps in the experimental structure determination procedure. In addition, the targets display low similarity to proteins in the PDB. Comparisons of the targets to proteins with crystal structures revealed that the structures and predicted targets had similar sequence properties and predicted structural features.
A group of 373 proteins that displayed high levels of similarity to proteins in the PDB was identified as targets for comparative modeling studies. Finally, 197 targets for in silico docking were identified based on predicted small molecule interactions and the availability of a 3D structure.
/ Dissertation (MSc)--University of Pretoria, 2008. / Biochemistry / unrestricted
Annotation Protein disorder Protein function Protein structure Anti-malaria drugs Malaria Crystallization studies UCTD
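The target-selection step described in this abstract (keeping proteins with few features that hinder crystallization and low similarity to known structures) can be sketched as a simple filter. This is only an illustration: the record fields, the threshold values, the function name and the example records are all hypothetical, since the abstract does not state the actual cut-offs or database schema.

```python
from dataclasses import dataclass


@dataclass
class ProteinAnnotation:
    """Hypothetical per-protein summary of predicted structural features."""
    protein_id: str
    tm_helices: int                 # number of predicted transmembrane helices
    disorder_fraction: float        # fraction of residues predicted disordered
    low_complexity_fraction: float  # fraction of residues in low-complexity regions
    has_coiled_coil: bool
    has_signal_peptide: bool
    best_pdb_identity: float        # percent identity to the best PDB hit


def select_crystallization_targets(proteins, max_disorder=0.2,
                                   max_low_complexity=0.2,
                                   max_pdb_identity=30.0):
    """Return IDs of proteins passing all filters (thresholds are assumptions)."""
    targets = []
    for p in proteins:
        few_problem_features = (
            p.tm_helices == 0
            and p.disorder_fraction <= max_disorder
            and p.low_complexity_fraction <= max_low_complexity
            and not p.has_coiled_coil
            and not p.has_signal_peptide
        )
        # Low similarity to the PDB means no close structure is already solved.
        if few_problem_features and p.best_pdb_identity <= max_pdb_identity:
            targets.append(p.protein_id)
    return targets


# Made-up records, for illustration only.
EXAMPLE = [
    ProteinAnnotation("protein_A", 0, 0.05, 0.10, False, False, 12.0),
    ProteinAnnotation("protein_B", 3, 0.05, 0.10, False, False, 12.0),
    ProteinAnnotation("protein_C", 0, 0.70, 0.10, False, False, 12.0),
    ProteinAnnotation("protein_D", 0, 0.05, 0.10, False, False, 85.0),
]

print(select_crystallization_targets(EXAMPLE))
```

In the system the abstract describes, criteria of this kind would be specified through the query tool against the PostgreSQL database; the in-memory filter above only mirrors the logic.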
23	Machine Learning Approaches Towards Protein Structure and Function Prediction
Aashish Jain (10933737) / 04 August 2021
Proteins are drivers of almost all biological processes in the cell. The functions of a protein depend on its three-dimensional structure, and elucidating the structure and function of proteins is key to understanding how a biological system operates. In this research, we developed computational methods using machine learning techniques to predict the structure and function of proteins. Protein 3D structure prediction has advanced significantly in recent years, largely due to deep learning approaches that predict inter-residue contacts and, more recently, distances using multiple sequence alignments (MSAs). The performance of these models depends on the number of sequences similar to the query protein; in some cases similar sequences are few, but dissimilar sequences with local similarities are more numerous and can be helpful. We have developed a novel deep learning-based approach, AttentiveDist, which further improves over the previous state of the art. We added an attention mechanism in which dissimilar sequences are also used (increasing the number of sequences) and the model itself determines which information from such sequences it should attend to. We showed that the improvement in distance predictions was successfully transferred to achieve better protein tertiary structure modeling. We also show that structure prediction from a predicted distance map can be further enhanced by using predicted inter-residue sidechain center distances and main-chain hydrogen bonds. Protein function prediction is another avenue we explored, where the aim is to predict the function that a protein will perform. The crux of the approach is to predict the function of a protein based on the function of similar sequences. Here, we developed a method that uses dissimilar sequences to extract additional information and improve performance over previous approaches. We used phylogenetic analysis to determine whether a dissimilar sequence can be close to the query sequence and thus provide functional information. Our method was ranked highly in the worldwide protein function prediction competition CAFA3 (2016-2019). Further, we expanded the method with a neural network to predict protein toxicity, which can be used as a safety check for human-designed protein sequences.
Bioinformatics Bioinformatics Software protein structure prediction algorithms protein function prediction method Deep Learning Applications
24	Vyhledávání příbuzných proteinů s modifikovanou funkcí / Detection of Related Proteins with Modified Function
Hon, Jiří / January 2015
Protein engineering is a young, dynamic discipline with a great number of potential practical applications. However, its success depends primarily on thorough knowledge and use of all existing information about protein function and structure. To achieve that, protein engineering is supported by many bioinformatic tools and analyses. The goal of this project is to create a new tool for protein engineering that enables researchers to identify related proteins with modified function in ever-growing biological databases. The tool is designed as an automated workflow of existing bioinformatic analyses that leads to the identification of proteins with the same type of enzymatic function but with slightly modified properties, primarily in terms of selectivity, reaction speed and stability.
25	Protein Function Prediction Using Decision Tree Technique
Yedida, Venkata Rama Kumar Swamy / 02 September 2008
No description available.
Bioinformatics protein function prediction decision tree data mining gene ontology sequence similarity
26	Analysis of Gene Expression Data for Gene Ontology Based Protein Function Prediction
Macholan, Robert Daniel / 13 May 2011
No description available.
Bioinformatics Computer Science Gene Expression Gene Ontology Protein Function Prediction Slope Matrix
27	An Interdisciplinary Approach: Computational Sequence Motif Search and Prediction of Protein Function with Experimental Validation
Choi, Hyunjin / 29 October 2013
Pathogens colonize their hosts by releasing molecules that can enter host cells. A biotrophic oomycete plant pathogen, Phytophthora sojae, harbors a superfamily of effector genes whose protein products enter the cells of the host, soybean. Many of the effectors contain an RXLR-dEER motif in their N-terminus. More than 400 members belonging to this family have been previously identified using a Hidden Markov Model. Amino acids flanking the RXLR motif have been utilized to identify effector proteins from the P. sojae secretome, despite the high level of sequence divergence among the members of this protein family. I present here machine learning methods to identify protein candidates that belong to a particular class, such as the effector superfamily. Converting the flanking amino acid sequences of RXLR motifs (or other candidate motifs) into numeric values that reflect their physical properties enabled the protein sequences to be analyzed through these methods. The methods evaluated include Support Vector Machines and a related spherical classification method that I have developed. I also approached the effector prediction problem by building functional linkage networks and have produced lists of predicted P. sojae effector proteins. I tested the best candidate through gene gun bombardment assays using the beta-glucuronidase reporter system, which revealed that there is a high likelihood that the candidate can enter soybean cells.
/ Ph. D.
Protein function prediction Phytophthora sojae effectors support vector machines functional linkage network amino acid physical properties
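The encoding step mentioned in this abstract, turning the residues that flank an RXLR motif into numeric physical-property values, might look roughly like the sketch below. The choice of the Kyte-Doolittle hydropathy scale, the window of three residues on each side, the zero padding and the function name `flank_features` are all assumptions made for illustration; the abstract does not say which properties or window sizes were used.

```python
import re

# Kyte-Doolittle hydropathy index per amino acid (one possible physical property;
# using this particular scale is an assumption, not the thesis's stated choice).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def flank_features(seq, flank=3):
    """Encode residues flanking the first RXLR motif as hydropathy values.

    Returns a list of 2 * flank numbers (left flank, then right flank),
    zero-padded where the sequence ends, or None if no motif is found.
    """
    match = re.search(r"R[A-Z]LR", seq)
    if match is None:
        return None
    left = seq[max(0, match.start() - flank):match.start()]
    right = seq[match.end():match.end() + flank]
    # Unknown residue letters map to 0.0, the same value used for padding.
    left_vals = [0.0] * (flank - len(left)) + [KD.get(a, 0.0) for a in left]
    right_vals = [KD.get(a, 0.0) for a in right] + [0.0] * (flank - len(right))
    return left_vals + right_vals


print(flank_features("MKLSARFLRNEE"))
```

Fixed-length vectors of this kind are the sort of input a Support Vector Machine or similar classifier expects, which is why a numeric encoding is needed before such methods can be applied to sequences.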
28	Conséquences du contexte haplotypique sur la fonctionnalité des protéines : application à la mucoviscidose / Consequences of the haplotype context on protein function : application to cystic fibrosis
Cuppens, Tania / 07 May 2019
We all carry hundreds of thousands of genetic variants in our genome, most of which have no impact on our health. After sequencing, they must be filtered to retain only those potentially involved in a disease. We use annotators that predict the impact of variants. These predictions are made for each variant taken independently, without considering cis variants in the same gene. However, neutral variants can become deleterious when they occur together in one individual. I have developed the bioinformatics tool GEMPROT, which makes it possible to visualize the effect of genetic variants on the protein sequence and to highlight combinations of variants affecting the same functional domain.
I then studied the impact of two variants associated with p.Phe508del (508del) on CFTR protein function. The variant p.Val470M is present on all haplotypes carrying the deletion but not on the reference sequence, which is generally used for the construction of plasmids. We have shown differences in the function of the mutated CFTR protein 508del according to the amino acid at position 470. The function is increased with a valine, and it is therefore necessary to ensure, when constructing plasmids, that the haplotype context of the studied variants is respected. The variant p.Ile1027Thr leads to a degradation of the function of the 508del protein. This variant is present on only a portion of the 508del haplotypes and could therefore have a modifying effect on the expression of the deletion. In conclusion, we show the importance of considering haplotype contexts in disease studies and propose a bioinformatics tool to do so.
Bioinformatic tool Haplotype context Visualization Cystic fibrosis Protein function
29	From protein sequence to structural instability and disease
Wang, Lixiao / January 2010
A great challenge in bioinformatics is to accurately predict protein structure and function from amino acid sequence, including annotation of protein domains, identification of disordered regions and detection of protein stability changes resulting from amino acid mutations. The combination of bioinformatics, genomics and proteomics has become essential for investigating the biological, cellular and molecular aspects of disease, and can therefore greatly contribute to the understanding of protein structures and facilitate drug discovery. In this thesis, a PREDICTOR is presented, which consists of three machine learning methods applied to three different but related structural bioinformatics tasks: using profile Hidden Markov Models (HMMs) to identify remote sequence homologues on the basis of protein domains; predicting order and disorder in proteins using Conditional Random Fields (CRFs); and applying Support Vector Machines (SVMs) to detect protein stability changes due to single mutations. To facilitate structural instability and disease studies, these methods are implemented in three web servers: FISH, OnD-CRF and ProSMS, respectively. For FISH, most of the work presented in the thesis focuses on the design and construction of the web server. The server is based on a collection of structure-anchored hidden Markov models (saHMM), which are used to identify structural similarity at the protein domain level. For the order and disorder prediction server, OnD-CRF, I implemented two schemes to alleviate the imbalance between ordered and disordered amino acids in the training dataset. One prunes the protein sequences in order to obtain a balanced training dataset. The other tries to find the optimal p-value cut-off for discriminating between ordered and disordered amino acids.
Both of these schemes enhance the sensitivity of detecting disordered amino acids in proteins. In addition, the output from the OnD-CRF web server can also be used to identify flexible regions, as well as to predict the effect of mutations on protein stability. For ProSMS, we propose, after careful evaluation with different methods, a homology-clustered model and a non-clustered model for a three-state classification of protein stability changes due to single amino acid mutations. Results for the non-clustered model reveal that the sequence-only prediction accuracy is comparable to the accuracy based on protein 3D structure information. In the case of the clustered model, however, the prediction accuracy is significantly improved when protein tertiary structure information, in the form of local environmental conditions, is included. Comparing the prediction accuracies for the two models indicates that predicting the stability effect of mutations in proteins that are not homologous is still a challenging task. Benchmarking results show that, as stand-alone programs, these predictors can be comparable or superior to previously established predictors. Combined into a program package, these mutually complementary predictors will facilitate the understanding of structural instability and disease from protein sequence.
protein domain remote homologue protein function point mutation protein family protein stability HMMs CRFs SVMs
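The second imbalance-handling scheme in this abstract, searching for a cut-off that best separates ordered from disordered residues, can be illustrated with a small threshold scan. This is a sketch under stated assumptions, not the method used in OnD-CRF: it assumes each residue has a predicted disorder score and a 0/1 label, and it picks the cut-off that maximizes balanced accuracy, a criterion chosen here only because it is not dominated by the majority (ordered) class. The function name `best_cutoff` is hypothetical.

```python
def best_cutoff(scores, labels):
    """Scan observed scores as candidate cut-offs; return (cutoff, balanced accuracy).

    scores: predicted disorder score per residue (higher means more disordered).
    labels: 1 for disordered residues, 0 for ordered residues.
    A residue is called disordered when its score is >= the cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one ordered and one disordered residue")

    best_c, best_ba = None, -1.0
    for c in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        # Balanced accuracy averages sensitivity and specificity.
        balanced_acc = 0.5 * (true_pos / positives + true_neg / negatives)
        if balanced_acc > best_ba:  # ties keep the lowest cut-off
            best_c, best_ba = c, balanced_acc
    return best_c, best_ba


# Toy data with twice as many ordered as disordered residues.
scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1]
print(best_cutoff(scores, labels))
```

With plain accuracy as the criterion, a classifier on heavily imbalanced data can score well by rarely predicting disorder at all; averaging sensitivity and specificity avoids that, which matches the abstract's stated goal of improving sensitivity for disordered residues.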
30	Differential Roles of Tryptophan Residues in the Functional Expression of Human Anion Exchanger 1
Okawa, Yuka / 15 August 2012
Anion exchanger 1 (AE1) is a 95 kDa glycoprotein that facilitates Cl⁻/HCO₃⁻ exchange across the erythrocyte plasma membrane. Seven conserved tryptophan (Trp) residues are in the AE1 membrane domain: at the membrane interface (Trp648, Trp662 and Trp723), in transmembrane segment (TM) 4 (Trp492 and Trp496), and in hydrophilic loops (Trp831 and Trp848). All 7 Trp residues were individually mutated to alanine (Ala) and phenylalanine (Phe) and transiently expressed in human embryonic kidney (HEK)-293 cells. The 7 Trp residues could be grouped into three classes according to the impact of the mutations on the functional expression of AE1: class 1, normal expression; class 2, expression decreased; and class 3, expression decreased by Ala substitution. These results indicate that Trp residues play differential roles in AE1 expression depending on their location in the protein, and suggest that Trp mutants with low expression are misfolded and retained in the ER.
Anion exchanger 1 Membrane protein Tryptophan Protein expression Protein function Trafficking Cell surface expression N-glycosylation 0487 0379
