Global ETD Search

31	Técnicas de controle da diversidade de populações em algoritmos genéticos para determinação de estruturas de proteínas / Control of the Population Diversity in Genetic Algorithms for the Determination of Protein Structures Ó, Vinicius Tragante do 03 March 2009 (has links) Recentemente, pesquisadores têm proposto o uso de Algoritmos Genéticos (AGs) para a determinação da estrutura tridimensional de proteínas. No entanto, este é um problema difícil para um AG tradicional, pois na maioria das vezes ocorre a convergência prematura das soluções para ótimos locais. Isto ocorre porque o uso de mecanismos de seleção no AG acarreta uma perda da diversidade das soluções. Assim, neste trabalho, são investigadas estratégias para controlar a diversidade da população do AG e evitar que a solução fique rapidamente presa em ótimos locais. São empregadas bases de dados de ângulos de torção para a cadeia principal, cadeia lateral e técnicas de controle de diversidade em AGs conhecidas como Hipermutação e Imigrantes Aleatórios. Além disso, um novo algoritmo baseado no AG com Imigrantes Aleatórios Auto-Organizáveis é proposto. Os resultados mostram que estas variações são efetivas no objetivo de não manter o conjunto de soluções preso a uma região apenas, além de melhorar o desempenho para o problema de determinação de estruturas terciárias de proteínas. / Recently, researchers have proposed the use of Genetic Algorithms (GAs) for the determination of the three-dimensional structure of proteins. However, this is a difficult problem for the standard GA because, in most cases, the solutions converge prematurely to local optima instead of the global optimum. This happens because the selection mechanisms of the GA cause a loss of diversity among the solutions. With this in mind, this work investigates strategies to control the diversity of the GA population and to keep the solutions from becoming trapped early in local optima. 
Databases of torsion angles for the main chain and the side chain are employed, together with GA diversity-control techniques known as Hypermutation and Random Immigrants. In addition, a new algorithm based on the GA with Self-Organizing Random Immigrants is proposed. Results show that these variations are effective in keeping the set of solutions from being trapped in a single region, and that they also improve performance on the problem of determining protein tertiary structures. Algoritmos Genéticos Auto-Organização Estrutura de Proteínas Genetic Algorithms Hipermutação Hypermutation Imigrantes Aleatórios Protein Structure Prediction Random Immigrants Self-Organization
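The Random Immigrants idea named in the abstract above can be sketched as follows. This is a minimal illustration only, not the thesis's implementation: the abstract does not say which individuals are replaced, so replacing the worst ones is an assumption, and the names `random_immigrants`, `random_torsions` and `toy_energy`, the replacement rate, and the torsion-angle encoding are all hypothetical.

```python
import random


def random_immigrants(population, fitness, new_individual, rate=0.2, rng=random):
    """Replace the worst `rate` fraction of the population with random individuals.

    Lower fitness is treated as better (energy minimization).
    Replacing the worst individuals is an assumed variant of the technique.
    """
    n_replace = int(len(population) * rate)
    # Sort so the best (lowest-energy) individuals come first, then keep them.
    survivors = sorted(population, key=fitness)[: len(population) - n_replace]
    # Fresh random individuals restore diversity lost through selection.
    immigrants = [new_individual(rng) for _ in range(n_replace)]
    return survivors + immigrants


def random_torsions(rng, n_angles=4):
    # Hypothetical encoding: an individual is a list of torsion angles in degrees.
    return [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]


def toy_energy(angles):
    # Placeholder objective for illustration, not a real force field.
    return sum(a * a for a in angles)
```

A call such as `random_immigrants(pop, toy_energy, random_torsions, rate=0.2)` would be made once per generation, after selection and before the next round of crossover and mutation.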
32	Algoritmos evolutivos e modelos simplificados de proteínas para predição de estruturas terciárias / Evolutionary algorithms and simplified models for tertiary protein structure prediction Gabriel, Paulo Henrique Ribeiro 23 March 2010 (has links) A predição de estruturas de proteínas (Protein Structure Prediction PSP) é um problema computacionalmente complexo. Para tratar esse problema, modelos simplificados de proteínas, como o Modelo HP, têm sido empregados para representar as conformações e Algoritmos Evolutivos (AEs) são utilizados na busca por soluções adequadas para PSP. Entretanto, abordagens utilizando AEs muitas vezes não tratam adequadamente as soluções geradas, prejudicando o desempenho da busca. Neste trabalho, é apresentada uma formulação multiobjetivo para PSP em Modelo HP, de modo a avaliar de forma mais robusta as conformações produzidas combinando uma avaliação baseada no número de contatos hidrofóbicos com a distância entre os monômeros. Foi adotado o Algoritmo Evolutivo Multiobjetivo em Tabelas (AEMT) a fim de otimizar essas métricas. O algoritmo pode adequadamente explorar o espaço de busca com pequeno número de indivíduos. Como consequência, o total de avaliações da função objetivo é significativamente reduzido, gerando um método para PSP utilizando Modelo HP mais rápido e robusto / Protein Structure Prediction (PSP) is a computationally complex problem. To address it, simplified models of protein structures, such as the HP Model, have been investigated together with Evolutionary Algorithms (EAs) in order to find appropriate solutions for PSP. EAs with the HP Model have shown interesting results; however, they do not adequately evaluate potential solutions when they use only the usual metric of hydrophobic contacts, which harms the performance of the algorithm. 
In this work, we present a multi-objective approach for PSP using the HP Model that evaluates solutions more robustly by combining the number of hydrophobic contacts with the distance among the hydrophobic amino acids. We employ a Multi-objective Evolutionary Algorithm based on Sub-population Tables (MEAT) to deal with these two metrics. MEAT can adequately explore the search space with a relatively low number of individuals. As a consequence, the total number of objective function evaluations is significantly reduced, yielding a method for PSP using the HP Model that is faster and more robust. Algoritmos evolutivos Evolutionary algorithms HP model Modelo HP Multi-objective optimization Otimização multiobjetivo Predição de estrutura de proteínas Protein structure prediction
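The two metrics mentioned in the abstract above can be illustrated on a 2D square lattice. This is a hedged sketch, not the thesis's code: the lattice type, the function names `hp_contacts` and `mean_hh_distance`, and the choice of the mean pairwise distance as the way to aggregate distances are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def hp_contacts(sequence, coords):
    """Count H-H contacts: lattice neighbours that are not chain neighbours."""
    position = {xy: i for i, xy in enumerate(coords)}
    if len(position) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


def mean_hh_distance(sequence, coords):
    """Mean Euclidean distance over all pairs of hydrophobic residues (assumed aggregate)."""
    h_sites = [xy for s, xy in zip(sequence, coords) if s == "H"]
    pairs = list(combinations(h_sites, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
```

For the sequence `"HPPH"` folded around a unit square, `[(0, 0), (1, 0), (1, 1), (0, 1)]`, the two end residues touch, so a multi-objective search would see one contact (to maximize) and a mean H-H distance of one lattice unit (to minimize).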
33	Protein structure prediction : Zinc-binding sites, one-dimensional structure and remote homology Shu, Nanjiang January 2010 (has links) Predicting the three-dimensional (3D) structure of proteins is a central problem in biology. These computationally predicted 3D protein structures have been successfully applied in many fields of biomedicine, e.g. family assignments and drug discovery. The accurate detection of remotely homologous templates is critical for the successful prediction of the 3D structure of proteins. Also, the prediction of one-dimensional (1D) protein structures, such as secondary structures and shape strings, is useful for predicting the 3D structure of proteins and important for understanding the sequence-structure relationship. In addition, the prediction of the functional sites of proteins, such as metal-binding sites, can not only reveal the important function of proteins (even in the absence of the 3D structure) but also facilitate the prediction of the 3D structure. Here, three novel methods in the field of protein structure prediction are presented: PREDZINC, a method for predicting zinc-binding sites in proteins; Frag1D, a method for predicting the 1D structure of proteins; and FragMatch, a method for detecting remotely homologous proteins. These methods compete satisfactorily with the best methods previously published and contribute to the task of protein structure prediction. / At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript. / Protein structure prediction protein structure prediction zinc-binding profile homology detection shape string Bioinformatics Bioinformatik Molecular biology Molekylärbiologi Biochemistry Biokemi
34	Clustering System and Clustering Support Vector Machine for Local Protein Structure Prediction Zhong, Wei 02 August 2006 (has links) Protein tertiary structure plays a very important role in determining its possible functional sites and chemical interactions with other related proteins. Experimental methods to determine protein structure are time consuming and expensive. As a result, the gap between the number of known protein sequences and the number of known structures has widened substantially because of high-throughput sequencing techniques. The limitations of experimental methods motivate us to develop computational algorithms for protein structure prediction. In this work, a clustering system is used to predict local protein structure. First, recurring sequence clusters are explored with an improved K-means clustering algorithm. Carefully constructed sequence clusters are used to predict local protein structure. After obtaining the sequence clusters and motifs, we study how the sequence variation within a cluster influences its structural similarity. The analysis shows that sequence clusters with tight sequence variation have high structural similarity, while sequence clusters with wide sequence variation have poor structural similarity. Based on this knowledge, the established clustering system is used to predict the tertiary structure of local sequence segments. Test results indicate that the highest quality clusters give highly reliable predictions and that high quality clusters give reliable predictions. In order to improve the performance of the clustering system for local protein structure prediction, a novel computational model called Clustering Support Vector Machines (CSVMs) is proposed. In our previous work, the sequence-to-structure relationship was explored with the conventional K-means algorithm. 
The K-means clustering algorithm may not capture nonlinear sequence-to-structure relationships effectively. As a result, we consider using the Support Vector Machine (SVM) to capture the nonlinear sequence-to-structure relationship. However, SVM is not well suited to huge datasets containing millions of samples. Therefore, we propose a novel computational model called CSVMs. Taking advantage of both the theory of granular computing and advanced statistical learning methodology, CSVMs are built specifically for each information granule partitioned intelligently by the clustering algorithm. Compared with the clustering system introduced previously, our experimental results show that the accuracy of local structure prediction improves noticeably when CSVMs are applied. granular computing SVM (Support Vector Machine) K-means clustering algorithm sequence motif protein structure prediction Computer Sciences
35	Sequenz, Energie, Struktur - Untersuchungen zur Beziehung zwischen Primär- und Tertiärstruktur in globulären und Membran-Proteinen Dressel, Frank 30 September 2008 (has links) (PDF) Proteins play a fundamental role at the cellular level of an organism. They are, in effect, the "machines" of the cell. Their importance is reflected not least in their name, which was first used by J. Berzelius in 1838 and means "the first" or "the most important". Proteins are molecules built from amino acids. Under physiological conditions they have a defined three-dimensional shape that determines their biological function. It is assumed today that this stable three-dimensional structure is uniquely determined by the order of the individual amino acids, the sequence. This order is stored for each protein in deoxyribonucleic acid (DNA). However, what the relationship between sequence and 3D structure actually looks like is one of the greatest unsolved problems of recent decades. Answering this question requires interdisciplinary approaches from biology, computer science and physics. In this thesis, some of the associated aspects are investigated using methods of theoretical (bio)physics. The main focus is on the interactions between the individual amino acids of a protein, for which a corresponding energy model was developed in this thesis. Ground states and energy landscapes are examined and compared with experimental data. The strength of the interactions between individual amino acids additionally allows statements about the stability of proteins with respect to mechanical forces. The thesis is organized as follows: Chapter 2 serves as the introduction and presents proteins and their functions. 
Chapter 3 presents the modelling of protein structures in two different models that were developed in this thesis to describe 3D structures of proteins. Chapter 4 then presents an algorithm for finding the exact energy minimum. Chapter 5 addresses the question of how a suitable discrete energy function can be derived from experimental data. Chapter 6 presents first results of this model. Chapter 7 examines whether the experimentally determined state corresponds to the energetic ground state of a protein. Chapters 8 and 9 show the application of the model to two proteins: the tryptophan cage protein, as the smallest stable protein, and kinesin, a motor protein for which informative experiments on mechanical stability were carried out in 2007. Chapters 10 to 12 are devoted to membrane proteins. Chapter 10 deals with the prediction of stable regions (so-called unfolding barriers) under an external force; it opens with a short introduction to membrane proteins. In the following Chapter 11, unfolding is simulated using the model and Monte Carlo techniques. With the interaction model adapted to membrane proteins, it is possible to predict the effect of mutations even without explicit structural information; this topic is discussed in Chapter 12. The relationship between the primary and tertiary structure of a protein is treated in Chapter 13. An approach is outlined that is able to detect structural relationships between proteins that cannot be found with conventional bioinformatics methods. Finally, the last two chapters give a summary and an outlook on future developments and applications of the model. 
Proteinstrukturvorhersage AFM Membranprotein Modellierung Proteine protein structure prediction AFM membrane protein coarse grained model proteins ddc:530 rvk:WD 2100
36	Scoring functions for protein docking and drug design Viswanath, Shruthi 26 June 2014 (has links) Predicting the structure of complexes formed by two interacting proteins is an important problem in computational structural biology. Proteins perform many of their functions by binding to other proteins. The structure of protein-protein complexes provides atomic details about protein function and biochemical pathways, and can help in designing drugs that inhibit binding. Docking computationally models the structure of protein-protein complexes, given three-dimensional structures of the individual chains. Protein docking methods have two phases. In the first phase, a comprehensive, coarse search is performed for optimally docked models. In the second phase, refinement and reranking, the models from the first phase are refined and reranked, with the expectation of extracting a small set of accurate models from the pool of thousands of models obtained from the first phase. In this thesis, new algorithms are developed for the refinement and reranking phase of docking. New scoring functions, or potentials, that rank models are developed. These potentials are learnt using large-scale machine learning methods based on mathematical programming. The procedure for learning these potentials involves examining hundreds of thousands of correct and incorrect models. In this thesis, hierarchical constraints were introduced into the learning algorithm. First, an atomic potential was developed using this learning procedure. A refinement procedure involving side-chain remodeling and conjugate gradient-based minimization was introduced. The refinement procedure combined with the atomic potential was shown to improve docking accuracy significantly. Second, a hydrogen bond potential was developed. Molecular dynamics-based sampling combined with the hydrogen bond potential improved docking predictions. 
Third, mathematical programming compared favorably to SVMs and neural networks in terms of accuracy, training and test time for the task of designing potentials to rank docking models. The methods described in this thesis are implemented in the docking package DOCK/PIERR. DOCK/PIERR was shown to be among the best automated docking methods in community wide assessments. Finally, DOCK/PIERR was extended to predict membrane protein complexes. A membrane-based score was added to the reranking phase, and shown to improve the accuracy of docking. This docking algorithm for membrane proteins was used to study the dimers of amyloid precursor protein, implicated in Alzheimer's disease. / text Protein structure prediction Protein docking Scoring functions Knowledge-based potentials Machine learning Computational structural biology Membrane proteins Protein complexes
37	Refinement of reduced protein models with all-atom force fields Wróblewska, Liliana 14 November 2007 (has links) The goal of this thesis research was to develop a systematic approach for the refinement of low-resolution protein models as part of the protein structure prediction procedure. Significant progress has been made in the field of protein structure prediction, and contemporary methods are able to assemble the correct topology for a large fraction of protein domains. But such approximate models are often not detailed enough for some important applications, including studies of reaction mechanisms, functional annotation, drug design or virtual ligand screening. The development of a method that could bring those structures closer to the native state is therefore of great importance. The minimal requirements for a potential that can refine protein structures are a correlation between energy and native similarity and the scoring of the native structure as lowest in energy. Extensive tests of contemporary all-atom physics-based force fields were conducted to assess their applicability for refinement. The tests revealed the flatness of such potentials and enabled the identification of the key problems in the current approaches. Guided by these results, the AMBER (ff03) force field was optimized with the aim of creating a funnel-shaped potential with the native structure at the global minimum. Such a shape should facilitate the conformational search during refinement and drive it towards the native conformation. Adjusting the relative weights of particular energy components and adding an explicit hydrogen bond potential significantly improved the average correlation coefficient of the energy with native similarity (from 0.25 for the original ff03 potential to 0.65 for the optimized force field). The fraction of proteins for which the native structure had the lowest energy increased from 0.22 to 0.90. 
The new, optimized potential was subsequently used to refine protein models of varying native similarity. The test employed 47 proteins and 100 decoy structures per protein. When the lowest energy structure from each trajectory was compared with the starting decoy, we observed structural improvement for 70% of the models on average. Such an unprecedented result of a systematic refinement is extremely promising in the context of high-resolution structure prediction. Force field optimization Decoy scoring Protein structure prediction Protein model refinement Proteins Analysis Amino acids Analysis Computational biology Proteins Structure
38	Implementação de um framework de computação evolutiva multi-objetivo para predição Ab Initio da estrutura terciária de proteínas / Implementation of multi-objective evolutionary framework for Ab Initio protein structure prediction Rodrigo Antonio Faccioli 24 August 2012 (has links) A demanda criada pelos estudos biológicos resultou para predição da estrutura terciária de proteínas ser uma alternativa, uma vez que menos de 1% das sequências conhecidas possuem sua estrutura terciária determinada experimentalmente. As predições Ab initio foca nas funções baseadas da física, a qual se trata apenas das informações providas pela sequência primária. Por consequência, um espaço de busca com muitos mínimos locais ótimos deve ser pesquisado. Este cenário complexo evidencia uma carência de algoritmos eficientes para este espaço, tornando-se assim o principal obstáculo para este tipo de predição. A optimização Multi-Objetiva, principalmente os Algoritmos Evolutivos, vem sendo aplicados na predição da estrutura terciária já que na mesma se envolve um compromisso entre os objetivos. Este trabalho apresenta o framework ProtPred-PEO-GROMACS, ou simplesmente 3PG, que não somente faz predições com a mesma acurácia encontrada na literatura, mas também, permite investigar a predição por meio da manipulação de combinações de objetivos, tanto no aspecto energético quanto no estrutural. Além disso, o 3PG facilita a implementação de novas opções, métodos de análises e também novos algoritmos evolutivos. A fim de salientar a capacidade do 3PG, foi então discorrida uma comparação entre os algoritmos NSGA-II e SPEA2 aplicados na predição Ab initio da estrutura terciária de proteínas em seis combinações de objetivos. Ademais, o uso da técnica de refinamento por Dinâmica Molecular é avaliado. Os resultados foram adequados quando comparado com outras técnicas de predições: Algoritmos Evolutivo Multi-Objetivo, Replica Exchange Molecular Dynamics, PEP-FOLD e Folding@Home. 
/ The demand created by biological studies has made structure prediction an alternative, since less than 1% of the known protein primary sequences have their 3D structure experimentally determined. Ab initio predictions focus on physics-based functions, which use only information about the primary sequence. As a consequence, a search space with several local optima must be sampled, leading to insufficient sampling of this space, which is the main hindrance to better predictions. Multi-Objective Optimization approaches, particularly Evolutionary Algorithms, have been applied in protein structure prediction because it involves a compromise among conflicting objectives. In this paper we present the ProtPred-PEO-GROMACS framework, or 3PG, which not only makes protein structure predictions with the same accuracy standards as those found in the literature, but also allows the study of protein structures by handling several energetic and structural objective combinations. Moreover, the 3PG framework facilitates the fast implementation of new objective options, analysis methods and even new evolutionary algorithms. In this study, we perform a comparison between the NSGA-II and SPEA2 algorithms applied to six different combinations of objectives for ab initio protein structure prediction. In addition, the use of Molecular Dynamics simulations as a refinement technique is assessed. The results were suitable when compared with other prediction methodologies, such as Multi-Objective Evolutionary Algorithms, Replica Exchange Molecular Dynamics, PEP-FOLD and Folding@Home. Algoritmos evolutivos multi-objetivo Framework Ab initio protein structure prediction Framework Multi-objective evolutionary algorithms
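Both NSGA-II and SPEA2, compared in the abstract above, rank candidate solutions by Pareto dominance over conflicting objectives. The sketch below shows only that shared concept for minimization; it is not code from the 3PG framework, and the function names `dominates` and `pareto_front` and the two-objective example values are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and strictly better somewhere.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (simple quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With hypothetical (energy term, structural term) pairs such as `[(1, 5), (2, 2), (3, 3), (5, 1)]`, the point `(3, 3)` is dominated by `(2, 2)` and drops out, while the other three remain as trade-offs between the two objectives.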
39	Técnicas de controle da diversidade de populações em algoritmos genéticos para determinação de estruturas de proteínas / Control of the Population Diversity in Genetic Algorithms for the Determination of Protein Structures Vinicius Tragante do Ó 03 March 2009 (has links) Recentemente, pesquisadores têm proposto o uso de Algoritmos Genéticos (AGs) para a determinação da estrutura tridimensional de proteínas. No entanto, este é um problema difícil para um AG tradicional, pois na maioria das vezes ocorre a convergência prematura das soluções para ótimos locais. Isto ocorre porque o uso de mecanismos de seleção no AG acarreta uma perda da diversidade das soluções. Assim, neste trabalho, são investigadas estratégias para controlar a diversidade da população do AG e evitar que a solução fique rapidamente presa em ótimos locais. São empregadas bases de dados de ângulos de torção para a cadeia principal, cadeia lateral e técnicas de controle de diversidade em AGs conhecidas como Hipermutação e Imigrantes Aleatórios. Além disso, um novo algoritmo baseado no AG com Imigrantes Aleatórios Auto-Organizáveis é proposto. Os resultados mostram que estas variações são efetivas no objetivo de não manter o conjunto de soluções preso a uma região apenas, além de melhorar o desempenho para o problema de determinação de estruturas terciárias de proteínas. / Recently, researchers have proposed the use of Genetic Algorithms (GAs) for the determination of the three-dimensional structure of proteins. However, this is a difficult problem for the standard GA because, in most cases, the solutions converge prematurely to local optima instead of the global optimum. This happens because the selection mechanisms of the GA cause a loss of diversity among the solutions. With this in mind, this work investigates strategies to control the diversity of the GA population and to keep the solutions from becoming trapped early in local optima. 
Databases of torsion angles for the main chain and the side chain are employed, together with GA diversity-control techniques known as Hypermutation and Random Immigrants. In addition, a new algorithm based on the GA with Self-Organizing Random Immigrants is proposed. Results show that these variations are effective in keeping the set of solutions from being trapped in a single region, and that they also improve performance on the problem of determining protein tertiary structures. Algoritmos Genéticos Auto-Organização Estrutura de Proteínas Hipermutação Imigrantes Aleatórios Genetic Algorithms Hypermutation Protein Structure Prediction Random Immigrants Self-Organization
40	Protein Model Quality Assessment : A Machine Learning Approach Uziela, Karolis January 2017 (has links) Many protein structure prediction programs exist and they can efficiently generate a number of protein models of a varying quality. One of the problems is that it is difficult to know which model is the best one for a given target sequence. Selecting the best model is one of the major tasks of Model Quality Assessment Programs (MQAPs). These programs are able to predict model accuracy before the native structure is determined. The accuracy estimation can be divided into two parts: global (the whole model accuracy) and local (the accuracy of each residue). ProQ2 is one of the most successful MQAPs for prediction of both local and global model accuracy and is based on a Machine Learning approach. In this thesis, I present my own contribution to Model Quality Assessment (MQA) and the newest developments of ProQ program series. Firstly, I describe a new ProQ2 implementation in the protein modelling software package Rosetta. This new implementation allows use of ProQ2 as a scoring function for conformational sampling inside Rosetta, which was not possible before. Moreover, I present two new methods, ProQ3 and ProQ3D that both outperform their predecessor. ProQ3 introduces new training features that are calculated from Rosetta energy functions and ProQ3D introduces a new machine learning approach based on deep learning. ProQ3 program participated in the 12th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP12) and was one of the best methods in the MQA category. Finally, an important issue in model quality assessment is how to select a target function that the predictor is trying to learn. In the fourth manuscript, I show that MQA results can be improved by selecting a contact-based target function instead of more conventional superposition based functions. 
/ At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript. Protein Model Quality Assessment structural bioinformatics machine learning deep learning support vector machine proq Artificial Neural Network protein structure prediction

Search results