Global ETD Search

1	Métodos híbridos em docagem molecular: implementação, validação e aplicação / Hybrid methods in molecular docking: implementation, validation and application Muniz, Heloisa dos Santos 13 June 2018
Modeling the interactions between macromolecules and ligands still faces several challenges in computer-aided drug design. Despite the growth of the field, topics such as receptor flexibility, scoring functions and solvation remain under intense investigation. To analyze interactions across thousands or millions of complexes, docking methods that rank ligands by interaction energy must balance computational cost against accuracy. LiBELa (Ligand Binding Energy Landscape) is a molecular docking program with a hybrid approach: it uses both ligand and receptor information during docking. First, the steric and electrostatic characteristics of a reference ligand (a crystallographic one, for example) are used in similarity and overlay calculations, yielding a pre-optimized initial conformation of the ligand under test. Then the interaction energy is minimized within the receptor's active site using energy potentials. Four force-field-based scoring functions were tested and optimized, composed of van der Waals and Coulomb potentials and an empirical solvation function, the Stouten-Verkhivker (SV) function. System flexibility was handled in two ways: by generating conformers that sample the degrees of freedom of ligands otherwise treated as semi-rigid, and by attenuated potentials that smooth the interaction energy surface, allowing interactions at interatomic distances that would otherwise be repulsive.
As a starting point, the methods implemented in LiBELa performed satisfactorily in cross- and self-docking tests, reproducing crystallographic binding modes as well as or better than the programs compared. Through enrichment tests on the DUD, DUDE and CM-DUD datasets, the dielectric constant, the solvation term and the attenuation terms were systematically optimized. The same tests allowed a comparison between scoring functions, including the attenuation and solvation terms. They also showed that LiBELa outperformed the purely receptor-based program DOCK 6.6 by 39% and 15% in the mean area under the curve on a semi-logarithmic scale for the DUDE and DUD databases, respectively. Although the SV solvation function implemented in LiBELa correlated well with experimental data (r = 0.72) and with the Zou GB/SA solvation method implemented in DOCK6 (r = 0.88), it showed no significant correlation with the GB/SA and PB/SA methods implemented in AmberTools. Among the LiBELa scoring functions tested, those including the solvation correction gave worse enrichment, except for some specific targets. Finally, molecular docking experiments with LiBELa were carried out on a β-galactosidase from the GH42 family whose structure was solved in our group. The results allowed conclusions about how the binding mode affects the preference among disaccharides with distinct glycosidic bonds, consistent with experimental data from kinetic assays.
Molecular docking; Receptor-ligand interactions; Scoring functions; Solvation
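
To make the scoring terms described in entry 1 more concrete, the following is a minimal Python sketch of a softened pairwise energy: a Lennard-Jones 12-6 term plus a Coulomb term, each with a smoothing offset that keeps the energy finite at short distances. The functional form, the names `delta_vdw`, `delta_es` and `dielectric`, the default values and the combining rules are all assumptions chosen for illustration. The abstract does not give LiBELa's actual equations, and the SV solvation term is left out.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), the usual force-field convention.
COULOMB_K = 332.0636

def soft_pair_energy(r, epsilon, r_min, qi, qj,
                     dielectric=2.0, delta_vdw=0.0, delta_es=0.0):
    """Softened Lennard-Jones 12-6 plus Coulomb energy for one atom pair.

    delta_vdw and delta_es (in Angstrom) are hypothetical smoothing offsets.
    With both set to 0 this reduces to the ordinary unsoftened potentials.
    """
    # Replace r^6 by r^6 + delta^6 so the repulsive wall becomes finite.
    r6 = r ** 6 + delta_vdw ** 6
    vdw = epsilon * (r_min ** 12 / r6 ** 2 - 2.0 * r_min ** 6 / r6)
    # Same idea for the electrostatic distance.
    r_es = (r ** 3 + delta_es ** 3) ** (1.0 / 3.0)
    coulomb = COULOMB_K * qi * qj / (dielectric * r_es)
    return vdw + coulomb

def interaction_energy(ligand, receptor, **softening):
    """Sum pair energies over all ligand-receptor atom pairs.

    Each atom is a tuple (x, y, z, charge, epsilon, r_min). Pair parameters
    use Lorentz-Berthelot combining rules (an assumption of this sketch).
    """
    total = 0.0
    for x1, y1, z1, q1, e1, rm1 in ligand:
        for x2, y2, z2, q2, e2, rm2 in receptor:
            r = math.dist((x1, y1, z1), (x2, y2, z2))
            eps = math.sqrt(e1 * e2)   # geometric mean of well depths
            rmin = 0.5 * (rm1 + rm2)   # arithmetic mean of minimum distances
            total += soft_pair_energy(r, eps, rmin, q1, q2, **softening)
    return total
```

Raising `delta_vdw` lowers the penalty for close contacts, which is one way a rigid receptor can tolerate ligand poses that would clash under the unsoftened potential.
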
3	Multiple Biological Sequence Alignment: Scoring Functions, Algorithms, and Evaluations Nguyen, Ken D 14 December 2011
Aligning multiple biological sequences such as protein sequences or DNA/RNA sequences is a fundamental task in bioinformatics and sequence analysis. These alignments may contain invaluable information that scientists need to predict the sequences' structures, determine the evolutionary relationships between them, or discover drug-like compounds that can bind to the sequences. Unfortunately, multiple sequence alignment (MSA) is NP-complete. In addition, the lack of a reliable scoring method makes it very hard to align the sequences reliably and to evaluate the alignment outcomes. In this dissertation, we have designed a new scoring method for use in multiple sequence alignment. Our scoring method encapsulates stereo-chemical properties of sequence residues and their substitution probabilities into a tree-structured scoring scheme. This new technique provides a reliable scoring scheme with low computational complexity. In addition to the new scoring scheme, we have designed an overlapping sequence clustering algorithm for use in our three new multiple sequence alignment algorithms. One of our alignment algorithms uses a dynamic weighted guidance tree to perform multiple sequence alignment in a progressive fashion. The dynamic weighted tree allows errors made in the early alignment stages to be corrected in subsequent stages. The other two algorithms use sequence knowledge bases and sequence consistency to produce biologically meaningful sequence alignments. To improve the speed of multiple sequence alignment, we have developed a parallel algorithm that can be deployed on reconfigurable computer models. Analytically, our parallel algorithm is the fastest progressive multiple sequence alignment algorithm.
Multiple sequence alignments; Algorithms; Scoring functions; Computer Sciences
4	Scoring functions for protein docking and drug design Viswanath, Shruthi 26 June 2014
Predicting the structure of complexes formed by two interacting proteins is an important problem in computational structural biology. Proteins perform many of their functions by binding to other proteins. The structure of protein-protein complexes provides atomic details about protein function and biochemical pathways, and can help in designing drugs that inhibit binding. Docking computationally models the structure of protein-protein complexes, given three-dimensional structures of the individual chains. Protein docking methods have two phases. In the first phase, a comprehensive, coarse search is performed for optimally docked models. In the second phase, refinement and reranking, the models from the first phase are refined and reranked, with the expectation of extracting a small set of accurate models from the pool of thousands of models obtained in the first phase. In this thesis, new algorithms are developed for the refinement and reranking phase of docking. New scoring functions, or potentials, that rank models are developed. These potentials are learnt using large-scale machine learning methods based on mathematical programming. The procedure for learning these potentials involves examining hundreds of thousands of correct and incorrect models. In this thesis, hierarchical constraints were introduced into the learning algorithm. First, an atomic potential was developed using this learning procedure. A refinement procedure involving side-chain remodeling and conjugate gradient-based minimization was introduced. The refinement procedure combined with the atomic potential was shown to improve docking accuracy significantly. Second, a hydrogen bond potential was developed. Molecular dynamics-based sampling combined with the hydrogen bond potential improved docking predictions. Third, mathematical programming compared favorably to SVMs and neural networks in terms of accuracy and of training and test time for the task of designing potentials to rank docking models. The methods described in this thesis are implemented in the docking package DOCK/PIERR. DOCK/PIERR was shown to be among the best automated docking methods in community-wide assessments. Finally, DOCK/PIERR was extended to predict membrane protein complexes. A membrane-based score was added to the reranking phase and shown to improve the accuracy of docking. This docking algorithm for membrane proteins was used to study the dimers of amyloid precursor protein, implicated in Alzheimer's disease.
Protein structure prediction; Protein docking; Scoring functions; Knowledge-based potentials; Machine learning; Computational structural biology; Membrane proteins; Protein complexes
5	Une nouvelle approche computationnelle pour la découverte des sites de fixation de facteurs de transcription à l’ADN, adaptée aux données de ChIP-chip et de ChIP-séquençage Aid, Malika 09 1900
Transcription factors are specialized proteins that play an important role in biological processes such as differentiation, the cell cycle and tumorigenesis. They regulate gene transcription by binding to specific DNA sequences (cis-regulatory elements). Identifying these elements is a crucial step in understanding gene regulatory networks. With the advent of high-throughput sequencing technologies, the identification of all functional elements in genomes, including genes and cis-regulatory elements, has advanced considerably. While the number of genes in different species can now be estimated, information on the elements that control and orchestrate the regulation of these genes remains poorly defined. ChIP-chip and ChIP-sequencing techniques make it possible to identify all regions of the genome bound by a transcription factor of interest. Several computational approaches have been developed to predict the sites bound by transcription factors. They fall into two main categories: enumerative and probabilistic algorithms. However, several studies have shown that these approaches generate high rates of false negatives and false positives, which makes the results difficult to interpret and therefore to validate experimentally. This thesis had two objectives. The first was to develop a new approach for discovering transcription factor binding sites on DNA (SAMD-ChIP), adapted to ChIP-chip and ChIP-sequencing data. The approach implements a hybrid algorithm that combines the enumerative and probabilistic strategies in order to exploit the strengths of each. It performed well against existing motif discovery tools on simulated datasets and on ChIP-chip and ChIP-sequencing datasets. SAMD-ChIP also exploits the distribution of bound sites around the center of the bound regions, restricting prediction to motifs that are enriched in a fixed-length window around the center of these regions. Transcription factors rarely act alone. They often form complexes to interact with DNA and regulate their target genes. These interactions involve transcription factors whose DNA binding sites lie close to one another or are brought together by chromatin loops. The second objective was to exploit the spatial proximity of bound sites in ChIP-chip and ChIP-sequencing regions to develop an approach for predicting composite motifs (motifs made of two sites separated by a spacer of fixed size). This module was tested by predicting the co-localization of the two ERE half-sites that form the ERE site bound by the estrogen receptor ERα. The module was incorporated into the SAMD-ChIP motif discovery tool. /
Transcription factors (TF) play important roles in various biological processes such as differentiation, cell cycle progression and tumorigenesis. They regulate gene expression by binding to specific DNA sequences (TFBS). Identifying these cis-regulatory elements is a crucial step in understanding gene regulatory networks. Technological developments have enabled DNA sequencing at genomic scale. On the basis of the resulting sequences, computational biologists now attempt to localize the most important functional regions, starting with genes but also including the genome-wide characterization of transcription factor binding sites, which has led to the development of several computational DNA motif discovery tools. Although these tools are widely used and have been successful at discovering novel motifs, they are not adapted to ChIP-chip and ChIP-sequencing data. Their main drawback is that most of the predicted motifs are artifacts caused by an inefficient assessment of their enrichment. This thesis is about transcription factor proteins and the statistical analysis of their binding sites in ChIP-chip and ChIP-sequencing data. The first objective was to develop a new de novo DNA motif discovery tool adapted to ChIP-chip and ChIP-sequencing data. SAMD-ChIP combines enumerative and stochastic strategies to predict enriched motifs in the vicinity of the ChIP peak summits. Our approach is an automated pipeline that includes motif discovery, motif clustering, motif optimization and finally motif identification using transcription factor (TF) databases. SAMD-ChIP outperforms state-of-the-art motif discovery tools in terms of the number of predicted motifs and the prediction of rare and degenerate motifs. In particular, SAMD-ChIP efficiently identifies gapped motifs, such as inverted or direct repeats bound by nuclear receptors, and composite motifs resulting from the association of different single TF binding sites. The underlying assumption of the second objective is that in regulatory regions, binding sites of interacting transcription factors co-occur more often than expected by chance in the vicinity of the ChIP peak summits. We proposed an approach to predict the co-localization of transcription factor binding sites based on single motifs predicted by de novo motif discovery tools or on TFBS models from TF databases.
ChIP-chip; ChIP-sequencing; gene regulatory network; transcription factors; DNA motif discovery; scoring functions; cis-regulatory elements; breast cancer; estrogen receptor; TFBS; TF
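
The idea in entry 5 of restricting motif prediction to a fixed-length window around the center of each bound region can be illustrated with a small enumerative sketch in Python. It counts k-mers inside the central window and in the flanks, then reports a pseudocounted frequency ratio. The function names, the ratio score and the pseudocount are assumptions made for this example. This is not SAMD-ChIP's algorithm, which the abstract describes only at a high level and which also includes a probabilistic stage.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def central_enrichment(regions, k, half_window):
    """Score k-mers by how much more frequent they are near region centers.

    regions: DNA strings, each assumed to be centered on a ChIP peak summit.
    Returns {kmer: ratio}, where ratio > 1 means the k-mer is relatively more
    frequent inside the central window than in the flanking sequence.
    k-mers that straddle the window boundary are ignored for simplicity.
    """
    central, flank = Counter(), Counter()
    for seq in regions:
        mid = len(seq) // 2
        lo = max(0, mid - half_window)
        hi = min(len(seq), mid + half_window)
        central.update(kmer_counts(seq[lo:hi], k))
        flank.update(kmer_counts(seq[:lo], k))
        flank.update(kmer_counts(seq[hi:], k))
    total_c = sum(central.values())
    total_f = sum(flank.values())
    # A pseudocount of 1 avoids division by zero for k-mers absent from flanks.
    return {
        kmer: ((central[kmer] + 1) / (total_c + 1)) / ((flank[kmer] + 1) / (total_f + 1))
        for kmer in central
    }

# Toy regions with a motif-like word placed at the center.
regions = ["ACACACACGGTCAACACACAC"] * 2
scores = central_enrichment(regions, k=5, half_window=4)
```

A real tool would replace the simple ratio with a proper statistical test and would handle reverse complements and degenerate positions.
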
6	Multivariate design of molecular docking experiments : An investigation of protein-ligand interactions Andersson, David January 2010
To make informed decisions in the search for new drug molecules (ligands), it is crucial to have access to information about the chemical interaction between the drug and its biological target (protein). Computer-based methods have an established role in drug research today and, by using methods such as molecular docking, it is possible to investigate the way in which ligands and proteins interact. Despite the growth in computing power over recent decades, many problems persist in modelling these complicated interactions. The main objective of this thesis was to investigate and improve molecular modelling methods aimed at estimating protein-ligand binding. To do so, we applied chemometric tools, e.g. design of experiments (DoE) and principal component analysis (PCA), to molecular modelling. More specifically, molecular docking was investigated as a tool for reproducing ligand poses in protein 3D structures and for virtual screening. Adjustable parameters in two docking programs were varied using DoE, and parameter settings that led to improved results were identified. In an additional study, we explored the nature of ligand-binding cavities in proteins, since they are important factors in protein-ligand interactions, especially when predicting the function of newly found proteins. We developed a strategy, comprising a new set of descriptors and PCA, to map proteins based on the physicochemical properties of their cavities. Finally, we applied the strategies we had developed to design a set of glycopeptides that were used to study autoimmune arthritis. A combination of docking, statistical molecular design, synthesis and biological evaluation led to new binders for two different class II MHC proteins and to recognition by a panel of T-cell hybridomas. New and interesting SAR conclusions could be drawn, and the results will serve as a basis for selecting peptides to include in in vivo studies.
Molecular docking; chemometrics; multivariate analysis; principal component analysis; PCA; design of experiments; DoE; PLS; scoring functions; ligand-binding cavity; major histocompatibility complex; MHC; glycopeptide; T-cell; Pharmaceutical chemistry
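
As an illustration of varying docking parameters with design of experiments, as in entry 6, the sketch below enumerates a two-level full factorial design in Python. The parameter names and levels are hypothetical and do not come from the thesis or from any particular docking program. A full factorial is shown only because it is the simplest design; the abstract does not say which designs were used.

```python
from itertools import product

def full_factorial(levels):
    """Return every combination of factor levels as a list of dicts.

    levels: mapping from factor name to the list of values to try.
    """
    names = list(levels)
    return [
        dict(zip(names, combo))
        for combo in product(*(levels[name] for name in names))
    ]

# Hypothetical docking settings, two levels each: 2**3 = 8 runs.
design = full_factorial({
    "search_effort": [4, 16],
    "grid_spacing": [0.3, 0.5],
    "poses_kept": [10, 50],
})

for run_id, settings in enumerate(design, start=1):
    # In a real study each row would configure one docking run, and the
    # measured response (e.g. pose RMSD) would be modelled against the factors.
    print(run_id, settings)
```

With many factors the number of runs grows exponentially, which is why fractional designs are often preferred in practice.
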
