  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Improving protein docking with binding site prediction

Huang, Bingding 17 July 2008
Protein-protein and protein-ligand interactions are fundamental, as many proteins mediate their biological function through these interactions. Many important applications follow directly from the identification of residues in protein-protein and protein-ligand interfaces, such as drug design, protein mimetic engineering, elucidation of molecular pathways, and understanding of disease mechanisms. The identification of interface residues can also guide the docking process to build the structural model of protein-protein complexes. This dissertation focuses on developing computational approaches for protein-ligand and protein-protein binding site prediction and applying these predictions to improve protein-protein docking. First, we develop an automated approach, LIGSITEcs, to predict protein-ligand binding sites, based on the notion of surface-solvent-surface events and the degree of conservation of the involved surface residues. We compare our algorithm to four other approaches, LIGSITE, CAST, PASS, and SURFNET, and evaluate all on a dataset of 48 unbound/bound structures and 210 bound structures. LIGSITEcs performs slightly better than the other tools and achieves success rates of 71% and 75%, respectively. Second, for protein-protein binding sites, we develop metaPPI, a meta server for interface prediction. MetaPPI combines results from a number of tools, such as PPI_Pred, PPISP, PINUP, Promate, and SPPIDER, which predict enzyme-inhibitor interfaces with success rates of 23% to 55% and other interfaces with 10% to 28% on a benchmark dataset of 62 complexes. After refinement, metaPPI significantly improves prediction success rates to 70% for enzyme-inhibitor and 44% for other interfaces. Third, for protein-protein docking, we develop an FFT-based docking algorithm and system, BDOCK, which includes specific scoring functions for specific types of complexes. BDOCK uses family-based residue interface propensities as a scoring function and obtains improvement factors of 4-30 for enzyme-inhibitor and 4-11 for antibody-antigen complexes in two specific SCOP families. Furthermore, the degrees of buriedness of surface residues are integrated into BDOCK, which improves the shape discriminator for enzyme-inhibitor complexes. The predicted interfaces from metaPPI are integrated as well, either during docking or after docking. The evaluation results show that reliable interface predictions improve the discrimination between near-native solutions and false positives. Finally, we propose an implicit method to deal with the flexibility of proteins by softening the surface, to improve docking for non-enzyme-inhibitor complexes.
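The abstract does not say how metaPPI merges the outputs of the individual predictors. As a rough illustration only, the sketch below assumes a simple weighted vote: a residue is kept when the share of (weighted) tools predicting it reaches a threshold. The function name, the residue identifiers, the weights, and the threshold are all hypothetical and are not taken from the dissertation.

```python
from collections import defaultdict


def consensus_interface(predictions, weights=None, threshold=0.5):
    """Return residues whose weighted vote share reaches `threshold`.

    predictions: dict mapping tool name -> set of predicted interface residue ids
    weights:     optional dict mapping tool name -> non-negative weight (default 1.0)
    """
    if weights is None:
        weights = {tool: 1.0 for tool in predictions}
    total = sum(weights[tool] for tool in predictions)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Accumulate the weight of every tool that voted for each residue.
    votes = defaultdict(float)
    for tool, residues in predictions.items():
        for residue in residues:
            votes[residue] += weights[tool]

    return {residue for residue, v in votes.items() if v / total >= threshold}


# Hypothetical toy predictions; residue ids are made up for illustration.
toy_predictions = {
    "PPI_Pred": {"A10", "A11", "A12"},
    "PPISP": {"A11", "A12", "A40"},
    "PINUP": {"A12", "A13"},
}
consensus = consensus_interface(toy_predictions, threshold=0.5)
print(sorted(consensus))
```

With equal weights, only residues predicted by at least half of the tools survive. A real meta server would likely tune weights per tool and per complex type, which this sketch does not attempt.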
2

PePIP : a Pipeline for Peptide-Protein Interaction-site Prediction / PePIP : en Pipeline for Förutsägelse av Peptid-Protein Bindnings-site

Johansson-Åkhe, Isak January 2017
Protein-peptide interactions play a major role in several biological processes, such as cell proliferation and cancer cell life cycles. Accurate computational methods for predicting protein-protein interactions exist, but few of these methods can be extended to predicting interactions between a protein and a particularly small or intrinsically disordered peptide. In this thesis, PePIP is presented. PePIP is a pipeline for predicting where on a given protein a given peptide will most probably bind. The pipeline uses structural alignment to search the Protein Data Bank for possible templates for the interaction to be predicted, using the larger chain as the query. The possible templates are then evaluated as to whether they can represent the query protein and peptide using a Random Forest classifier, and the best templates are found by combining the Random Forest evaluation with hierarchical clustering. These final templates are then combined to give a prediction of the binding site. PePIP is shown to be highly accurate when tested on a set of 502 experimentally determined protein-peptide structures, suggesting a binding site on the correct part of the protein surface roughly 4 out of 5 times.
3

Improving computational predictions of Cis-regulatory binding sites in genomic data

Rezwan, Faisal Ibne January 2011
Cis-regulatory elements are the short regions of DNA to which specific regulatory proteins bind; these interactions subsequently influence the level of transcription of the associated genes by inhibiting or enhancing the transcription process. It is known that much of the genetic change underlying morphological evolution takes place in these regions, rather than in the coding regions of genes. Identifying these sites in a genome is a non-trivial problem. Experimental (wet-lab) methods for finding binding sites exist, but all have some limitations regarding their applicability, accuracy, availability or cost. Computational methods for predicting the position of binding sites, on the other hand, are less expensive and faster. Unfortunately, these algorithms perform rather poorly, some missing most binding sites and others over-predicting their presence. The aim of this thesis is to develop and improve computational approaches for the prediction of transcription factor binding sites (TFBSs) by integrating the results of computational algorithms and other sources of complementary biological evidence. Previous related work involved the use of machine learning algorithms for integrating predictions of TFBSs, with particular emphasis on the use of the Support Vector Machine (SVM). This thesis has built upon, extended and considerably improved this earlier work. Data from two organisms were used. First, the relatively simple genome of yeast was used. In yeast, the binding sites are fairly well characterised and they are normally located near the genes that they regulate. The techniques used on the yeast genome were then tested on the more complex genome of the mouse. It is known that the regulatory mechanisms of the mouse, a eukaryotic species, are considerably more complex, and it was therefore interesting to investigate the techniques described here on such an organism. The initial results were, however, not particularly encouraging: although a small improvement on the base algorithms could be obtained, the predictions were still of low quality. This was the case for both the yeast and mouse genomes. However, when the negatively labeled vectors in the training set were changed, a substantial improvement in performance was observed. The first change was to choose regions in the mouse genome a long way (distal) from a gene, over 4000 base pairs away, as regions not containing binding sites. This produced a major improvement in performance. The second change was simply to use randomised training vectors, which contained no meaningful biological information, as the negative class. This gave some improvement for the yeast genome, but had a very substantial benefit for the mouse data, considerably improving on the aforementioned distal negative training data. In fact, the resulting classifier found over 80% of the binding sites in the test set, and 80% of its predictions were correct. The final experiment used an updated version of the yeast dataset, with more state-of-the-art algorithms and more recent TFBS annotation data. Here it was found that using randomised or distal negative examples once again gave very good results, comparable to the results obtained on the mouse genome. Another source of negative data was tried for this yeast data, namely vectors taken from intronic regions. Interestingly, this gave the best results.
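The two figures quoted for the mouse classifier correspond to what are usually called recall (the share of true binding sites that were found) and precision (the share of predictions that were correct). The sketch below shows how the two are computed on a toy example. It assumes that sites are compared as exact position matches, which may differ from the evaluation protocol used in the thesis, and all names and numbers in it are hypothetical.

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) for two collections of site positions.

    precision = true positives / number of predictions
    recall    = true positives / number of annotated sites
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy data (assumption): 10 annotated positions and 10 predicted positions,
# 8 of which overlap.
actual_sites = set(range(0, 10))      # positions 0..9
predicted_sites = set(range(2, 12))   # positions 2..11
precision, recall = precision_recall(predicted_sites, actual_sites)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Reporting both numbers matters here because the abstract describes base algorithms that fail in opposite directions: some miss most sites (low recall) while others over-predict (low precision).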
4

Improving protein docking with binding site prediction

Huang, Bingding 10 July 2008
5

Enrichment of miRNA targets in REST-regulated genes allows filtering of miRNA target predictions

Gebhardt, Marie Luise 08 January 2016
Predictions of miRNA binding sites suffer from high false positive rates (24-70%), and measuring biological interactions of miRNAs and target transcripts on a genome-wide scale remains challenging. This thesis addresses the question of whether the ever-growing body of ChIP-sequencing data can be used to filter miRNA target predictions by making use of the underlying regulatory network of miRNAs and transcription factors, which jointly regulate target transcripts. First, different methods for associating ChIP-sequencing peaks with target genes were tested. Target gene lists of the transcriptional repressor RE1-silencing transcription factor (REST/NRSF) were generated from ChIP-sequencing data. An enrichment analysis tool based on predictions from TargetScanHuman was developed and applied to find "enrichment" miRNAs with over-represented targets in the REST gene lists. The detected miRNAs were shown to be part of a highly regulated REST-miRNA network. Possible functions were proposed for them, and their role in the regulatory network and in the resulting network motif (the incoherent feedforward loop of type 2) was analyzed. It turned out that miRNA target predictions for genes shared by enrichment miRNAs and REST had a higher proportion of true positive associations than the TargetScanHuman background; the procedure thus makes filtering possible.
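The abstract does not state which statistic the enrichment tool uses. A common choice for testing whether one gene set is over-represented in another is the one-sided hypergeometric test, and the sketch below assumes that choice purely for illustration. The function name and all counts are hypothetical and do not come from the thesis.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """Return P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: number of genes in the background
    successes:  background genes that are predicted targets of the miRNA
    draws:      size of the transcription factor's target gene list
    observed:   predicted miRNA targets found within that gene list
    """
    lower = max(observed, 0)
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability mass of every outcome at least as extreme as observed.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(lower, upper + 1)
    )
    return tail / total


# Toy counts (assumption): 20 background genes, 5 predicted miRNA targets,
# a gene list of size 5 that contains 3 of those targets.
p_value = hypergeom_enrichment_p(population=20, successes=5, draws=5, observed=3)
print(f"p = {p_value:.4f}")
```

A small p-value suggests that the miRNA's predicted targets occur in the gene list more often than chance would explain. In practice, many miRNAs are tested at once, so a multiple-testing correction would normally follow.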
