1

Loop Modeling in Proteins Using a Database Approach with Multi-Dimensional Scaling

Holtby, Daniel James 09 1900 (has links)
Modeling loops is often a necessary step in protein structure and function determination, even with experimental X-ray and NMR data. It is well known to be difficult. Database techniques have the advantage of producing a higher proportion of predictions with sub-angstrom accuracy when compared with ab initio techniques, but the disadvantage of often being unable to produce usable results, as they depend entirely on the loop already being represented within the database. My contribution is the LoopWeaver protocol, a database method that uses multidimensional scaling to rapidly achieve better clash-free, low-energy placement of loops obtained from a database of protein structures. This maintains the above-mentioned advantage while avoiding the disadvantage by permitting the use of lower-quality matches that would not otherwise fit. Test results show that this method achieves significantly better results than all other methods, including Modeler, Loopy, SuperLooper, and Rapper, before refinement. With refinement, the results (LoopWeaver and Loopy combined) are better than ROSETTA's, with 0.53Å RMSD on average for 206 loops of length 6, 0.75Å local RMSD for 168 loops of length 7, 0.93Å RMSD for 117 loops of length 8, and 1.13Å RMSD for loops of length 9, while ROSETTA scores 0.66Å, 0.93Å, 1.23Å, and 1.56Å, respectively, at the same average time limit (3 hours on a 2.2 GHz Opteron). When ROSETTA is allowed to run for over a week against LoopWeaver's and Loopy's combined 3 hours, it approaches, but does not surpass, this accuracy.
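The accuracy figures in this abstract are RMSD values in ångströms. As a rough illustration of what that metric measures, here is a minimal sketch that computes RMSD between a predicted loop and a reference loop. It assumes both coordinate sets are already in the same frame (no superposition step), and the function name and toy coordinates are hypothetical, not taken from the thesis.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points.

    Assumption: both structures are already expressed in the same coordinate
    frame, so no superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy data: every atom of the model is displaced by 0.5 along z.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.5), (1.5, 0.0, 0.5), (3.0, 0.0, 0.5)]
print(rmsd(native, model))  # 0.5
```

A uniform 0.5 Å displacement gives an RMSD of 0.5 Å, which is on the scale of the sub-angstrom results reported above.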
3

Variable Fidelity Optimization with Hardware-in-the-Loop for Flapping Flight

Duffield, Michael Luke 10 July 2013 (has links) (PDF)
Hardware-in-the-loop (HIL) modeling is a powerful way of modeling complicated systems. However, some hardware is expensive to use in terms of time or mechanical wear. In such cases, optimizing with the hardware can be prohibitively expensive because of the number of hardware calls needed. Variable fidelity optimization can help overcome these problems: it uses less expensive surrogates to optimize an expensive system while calling it fewer times. The surrogates are usually created by performing a design of experiments (DOE) on the expensive model and fitting a surface to the results. However, some systems are too expensive to create a surrogate from. One such case is that of a flapping flight model. In this thesis, a technique for variable fidelity optimization of HIL has been created that optimizes a system while calling it as few times as possible. This technique is referred to as an intelligent DOE. The intelligent DOE was tested using simple models of various dimensions. It was then used to find a flapping wing trajectory that maximizes lift. Through testing, the intelligent DOE was shown to be able to optimize expensive systems with fewer calls than traditional variable fidelity optimization would have needed. Savings as high as 97% were recorded. As the number of design variables increased, the intelligent DOE became more effective by comparison, because the number of calls needed by a traditional DOE-based variable fidelity optimization increased faster than linearly, whereas the number of hardware calls for the intelligent DOE increased linearly.
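To make the scaling remark concrete, the sketch below counts the runs required by two textbook designs of experiments as the number of design variables grows. The abstract does not say which traditional DOE the thesis used, so the full factorial and central composite designs here are assumptions chosen purely for illustration, as are the function names.

```python
def full_factorial_runs(n_vars, levels=3):
    """Runs in a full factorial design: every combination of `levels` settings per variable."""
    return levels ** n_vars

def central_composite_runs(n_vars, center_points=1):
    """Runs in a central composite design: 2^n corner points, 2n axial points, plus center points."""
    return 2 ** n_vars + 2 * n_vars + center_points

# Compare how quickly the run counts grow with the number of design variables.
for n in range(1, 7):
    print(n, full_factorial_runs(n), central_composite_runs(n))
```

Both counts grow exponentially in the number of variables, which is one way a DOE-based surrogate can become far more expensive than a method whose hardware calls grow linearly.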
4

A Fold Recognition Approach to Modeling of Structurally Variable Regions

Levefelt, Christer January 2004 (has links)
A novel approach is proposed for modeling of structurally variable regions in proteins. In this approach, a prerequisite sequence-structure alignment is examined for regions where the target sequence is not covered by the structural template. These regions, extended with a number of residues from adjacent stem regions, are submitted to fold recognition. The alignments produced by fold recognition are integrated into the initial alignment to create a multiple alignment where gaps in the main structural template are covered by local structural templates. This multiple alignment is used to create a protein model by existing protein modeling techniques.
Several alternative parameters are evaluated using a set of ten proteins. One set of parameters is selected and evaluated using another set of 31 proteins. The most promising result is for loop regions not located at the C- or N-terminus of a protein, where the method produces an average RMSD 12% lower than the loop modeling provided with the program MODELLER. This improvement is shown to be statistically significant.
5

Statistical Computation for Problems in Dynamic Systems and Protein Folding

Wong, Samuel Wing Kwong 21 August 2013 (has links)
Inference for dynamic systems and conformational sampling for protein folding are two problems motivated by applied data, which pose computational challenges from a statistical perspective. Dynamic systems are often described by a set of coupled differential equations, and methods of parametric estimation for these models from noisy data can require repeatedly solving the equations numerically. Many of these models also lead to rough likelihood surfaces, which makes sampling difficult. We introduce a method for Bayesian inference on these models, using a multiple chain framework that exploits the underlying mathematical structure and interpolates the posterior to improve efficiency. In protein folding, a large conformational space must be searched for low energy states, where any energy function constructed on the states is at best approximate. We propose a method for sampling fragment conformations that accounts for geometric and energetic constraints, and explore ideas for folding entire proteins that account for uncertain energy landscapes and learning from data more effectively. These ingredients are combined into a framework for tackling the problem of generating improvements to protein structure predictions. / Statistics
7

Machine Learning Algorithms for Characterization and Prediction of Protein Structural Properties

Shapovalov, Maxim V January 2019 (has links)
Proteins are large biomolecules that are functional building blocks of living organisms. There are about 22,000 protein-coding genes in the human genome. Each gene encodes a unique protein sequence, typically 100-1000 residues long, which is built using a 20-letter alphabet of amino acids. Each protein folds up into a unique 3D shape that enables it to perform its function. Each protein structure consists of some number of helical segments, extended segments called sheets, and loops that connect these elements. In the last two decades, machine learning methods coupled with exponentially expanding biological knowledge databases and computational power have enabled significant progress in the field of computational biology. In this dissertation, I carry out machine learning research on three major interconnected problems to advance protein structural biology as a field. A separate chapter in this dissertation is devoted to each problem. After the three chapters I conclude this doctoral research with a summary and directions for our future work. Chapter 1 describes the design, training and application of a convolutional neural network (SecNet) to achieve 84% accuracy for the 60-year-old problem of predicting protein secondary structure given a protein sequence. Our accuracy is 2-3% better than any previous result, which had only risen 5% in the last 20 years. We identified the key factors for successful prediction in a detailed ablation study. A paper submitted for publication includes our secondary-structure prediction software, data set generation, and training and testing protocols [1]. Chapter 2 characterizes the design and development of a protocol for clustering of beta turns, i.e. short structural motifs responsible for U-turns in protein loops. We identified 18 turn types, 11 of which are newly described [2]. We also developed a turn library and cross-platform software for turn assignment in new structures.
In Chapter 3 I build upon the results from these two problems and predict geometries in loops of unknown structure with custom Residual Neural Networks (ResNet). I demonstrate solid results on (a) locating turns and predicting 18 types and (b) prediction of backbone torsion angles in loops. Given the recent progress in machine learning, these two results provide a strong foundation for successful loop modeling and encourage us to develop a new loop structure prediction program, a critical step in protein structure prediction and modeling. / Computer and Information Science
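Backbone torsion angles, the quantity predicted in Chapter 3, are dihedral angles defined by four consecutive backbone atoms. The sketch below shows one common way to compute such an angle from Cartesian coordinates using the atan2 form. It is a generic geometry helper, not the dissertation's code, and the function names and test points are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points, about the p1-p2 axis."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane through p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    # atan2 keeps the sign and stays well behaved near 0 and 180 degrees.
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical test points forming a right-angle twist about the z axis.
angle = dihedral_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(round(angle, 1))  # 90.0
```

For a protein backbone, phi would use the atoms C(i-1), N(i), CA(i), C(i) and psi the atoms N(i), CA(i), C(i), N(i+1), passed in that order.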
