Global ETD Search

11	Feature Identification and Reduction for Improved Generalization Accuracy in Secondary-Structure Prediction Using Temporal Context Inputs in Machine-Learning Models Seeley, Matthew Benjamin 01 May 2015 (has links) (PDF) A protein's properties are influenced by both its amino-acid sequence and its three-dimensional conformation. Ascertaining a protein's sequence is relatively easy using modern techniques, but determining its conformation requires much more expensive and time-consuming techniques. Consequently, it would be useful to identify a method that can accurately predict a protein's secondary-structure conformation using only the protein's sequence data. This problem is not trivial, however, because identical amino-acid subsequences in different contexts sometimes have disparate secondary structures, while highly dissimilar amino-acid subsequences sometimes have identical secondary structures. We propose (1) to develop a set of metrics that facilitates better comparisons between dissimilar subsequences and (2) to design a custom set of inputs for machine-learning models that can harness contextual dependence information between the secondary structures of successive amino acids in order to achieve better secondary-structure prediction accuracy. Bioinformatics machine learning secondary-structure prediction amino-acid properties Computer Sciences
12	Computer-aided modeling and simulation of molecular systems and protein secondary structure prediction Soni, Ravi January 1993 (has links) No description available. Engineering, Mechanical computer-aided modeling molecular systems protein secondary structure prediction
13	Sparse RNA folding revisited Will, Sebastian, Jabbari, Hosna 09 June 2016 (has links) (PDF) Background: RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction was solved only for simple base-pair-based pseudo-energy models. Results: Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible by a trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold, which guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by n^2, but are typically much smaller. The time complexity of RNA folding is reduced from O(n^3) to O(n^2 + nZ); the space complexity, from O(n^2) to O(n + T + Z). Our empirical results show more than 80% space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases). 
Conclusions: The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA–RNA-interaction prediction are expected to profit even more strongly than "standard" MFE folding. SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold. Raumeinsparung Pseudoknoten-RNA RNA-Sekundärstruktur Space efficient sparsification Pseudoknot-free RNA folding RNA secondary structure prediction ddc:004 ddc:570
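For orientation, the "simple base-pair-based" setting that the abstract contrasts with full free energy minimization can be illustrated by the classic Nussinov-style dynamic program, which maximizes the number of nested base pairs in O(n^3) time and O(n^2) space. The sketch below is not SparseMFEFold and uses no sparsification or trace arrows; the function names (`can_pair`, `pair_table`, `fold`) and the minimum hairpin loop size of 3 are assumptions chosen for illustration.

```python
# Illustrative Nussinov-style base-pair maximization (NOT SparseMFEFold).
# Assumptions: Watson-Crick and G-U wobble pairs only, hairpin loops must
# contain at least `min_loop` unpaired bases, no pseudoknots.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True if bases a and b may form a pair in this toy model."""
    return (a, b) in PAIRS


def pair_table(seq, min_loop=3):
    """best[i][j] = maximum number of nested pairs in seq[i..j] (inclusive)."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best


def fold(seq, min_loop=3):
    """Return (pair count, dot-bracket string) for one optimal structure."""
    n = len(seq)
    best = pair_table(seq, min_loop)
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:  # iterative traceback over the full O(n^2) table
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if can_pair(seq[k], seq[j]):
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)


if __name__ == "__main__":
    print(fold("GGGAAAUCC"))  # (3, '(((...)))')
```

The traceback here relies on keeping the whole table in memory, which is exactly the quadratic space cost that the abstract's approach is designed to avoid.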
14	Discovering Protein Sequence-Structure Motifs and Two Applications to Structural Prediction Tang, Thomas Cheuk Kai January 2004 (has links) This thesis investigates the correlations between short protein peptide sequences and local tertiary structures. In particular, it introduces a novel algorithm for partitioning short protein segments into clusters of local sequence-structure motifs, and demonstrates that these motif clusters contain useful structural information via two applications to structural prediction. The first application utilizes motif clusters to predict local protein tertiary structures. A novel dynamic programming algorithm that performs comparably with some of the best existing algorithms is described. The second application exploits the capability of motif clusters in recognizing regular secondary structures to improve the performance of secondary structure prediction based on Support Vector Machines. Empirical results show significant improvement in overall prediction accuracy with no performance degradation in any specific aspect being measured. The encouraging results obtained illustrate the great potential of using local sequence-structure motifs to tackle protein structure predictions and possibly other important problems in computational biology. Computer Science Bioinformatics Data mining Clustering Secondary Structure Prediction Local Tertiary Structure Prediction SVM
16	Predikce sekundární struktury RNA sekvencí / Prediction of RNA secondary structure Klímová, Markéta January 2015 (has links) RNA secondary structure plays an important role in many biological processes. Efficient structure prediction can inform experimental investigations of these processes. Many programs for secondary structure prediction are available. Some use a single sequence; others use several related sequences. Pseudoknots remain problematic for most methods. This work presents several methods and publicly available software tools, and describes an implementation of the minimum free energy method.
17	Modeling RNA folding Hofacker, Ivo L., Stadler, Peter F. 04 February 2019 (has links) In recent years it has become evident that functional RNAs in living organisms are not just curious remnants from a primordial RNA world but a ubiquitous phenomenon complementing protein-enzyme-based activity. Functional RNAs, just like proteins, depend in many cases upon their well-defined and evolutionarily conserved three-dimensional structure. In contrast to protein folds, however, RNA molecules have a biophysically important coarse-grained representation: their secondary structure. At this level of resolution at least, RNA structures can be efficiently predicted given only the sequence information. As a consequence, computational studies of RNA routinely incorporate structural information explicitly. RNA secondary structure prediction has proven useful in diverse fields ranging from theoretical models of sequence evolution and biopolymer folding, to genome analysis and even the design of biotechnologically or pharmaceutically useful molecules. info:eu-repo/classification/ddc/572.88 ddc:572.88
18	Bayesian models and algorithms for protein secondary structure and beta-sheet prediction Aydin, Zafer 17 September 2008 (has links) In this thesis, we developed Bayesian models and machine learning algorithms for protein secondary structure and beta-sheet prediction problems. In protein secondary structure prediction, we developed hidden semi-Markov models, N-best algorithms and training set reduction procedures for proteins in the single-sequence category. We introduced three residue dependency models (both probabilistic and heuristic) incorporating the statistically significant amino acid correlation patterns at structural segment borders. We allowed dependencies on positions outside the segments to relax the condition of segment independence. Another novelty of the models is the dependency on downstream positions, which is important due to asymmetric correlation patterns observed uniformly in structural segments. Among the dataset reduction methods, we showed that the composition-based reduction generated the most accurate results. To incorporate non-local interactions characteristic of beta-sheets, we developed two N-best algorithms and a Bayesian beta-sheet model. In beta-sheet prediction, we developed a Bayesian model to characterize the conformational organization of beta-sheets and efficient algorithms to compute the optimum architecture, which includes beta-strand pairings, interaction types (parallel or anti-parallel) and residue-residue interactions (contact maps). We introduced a Bayesian model for proteins with six or fewer beta-strands, in which we model the conformational features in a probabilistic framework by combining the amino acid pairing potentials with a priori knowledge of beta-strand organizations. To select the optimum beta-sheet architecture, we analyzed the space of possible conformations using efficient heuristics, in which we significantly reduce the search space by enforcing the amino acid pairs that have strong interaction potentials. 
For proteins with more than six beta-strands, we first computed beta-strand pairings using the BetaPro method. Then, we computed gapped alignments of the paired beta-strands in parallel and anti-parallel directions and chose the interaction types and beta-residue pairings with maximum alignment scores. Accurate prediction of secondary structure, beta-sheets and non-local contacts should improve the accuracy and quality of the three-dimensional structure prediction. Bayesian models Machine learning Hidden Markov models Contact map prediction Protein beta-sheet prediction Protein secondary structure prediction Molecular biology Amino acid sequence Bioinformatics
19	Protein secondary structure prediction using amino acid regularities Senekal, Frederick Petrus 23 January 2009 (has links) The protein folding problem is examined. Specifically, the problem of predicting protein secondary structure from the amino acid sequence is investigated. A literature study is presented into the protein folding process and the different techniques that currently exist to predict protein secondary structures. These techniques include the use of expert rules, statistics, information theory and various computational intelligence techniques, such as neural networks, nearest neighbour methods, Hidden Markov Models and Support Vector Machines. A pattern recognition technique based on statistical analysis is developed to predict protein secondary structure from the amino acid sequence. The technique can be applied to any problem where an input pattern is associated with an output pattern and each element in both the input and output patterns can take its value from a set with finite cardinality. The technique is applied to discover the role that small sequences of amino acids play in the formation of protein secondary structures. By applying the technique, a performance score of Q8 = 59.2% is achieved, with a corresponding Q3 score of 69.7%. This compares well with state-of-the-art techniques, such as OSS-HMM and PSIPRED, which achieve Q3 scores of 67.9% and 66.8% respectively, when predictions on single sequences are made. / Dissertation (MEng)--University of Pretoria, 2009. / Electrical, Electronic and Computer Engineering / unrestricted Amino acid sequence Neural network Classification Secondary structure Protein secondary structure prediction Bioinformatics Pattern recognition Protein folding problem Amino acid (AA) Protein UCTD
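The Q8 and Q3 scores quoted in this abstract are per-residue accuracies: the percentage of residues whose predicted state matches the observed state, over eight or three secondary-structure classes respectively. The sketch below shows how such scores could be computed. The 8-to-3 reduction used here (H, G, I to helix; E, B to strand; everything else to coil) is one common convention and is an assumption, since the abstract does not say which mapping the thesis used; the names `q_score` and `reduce_to_3` and the example strings are likewise hypothetical.

```python
# Illustrative Q8 / Q3 scoring. The DSSP 8-to-3 mapping below is an assumed,
# commonly used convention; the thesis may have used a different reduction.

DSSP_TO_3 = {
    "H": "H", "G": "H", "I": "H",  # helix classes -> H
    "E": "E", "B": "E",            # strand classes -> E
    "T": "C", "S": "C", "-": "C",  # turn, bend, other -> coil
}


def reduce_to_3(states):
    """Map a string of 8-state labels to 3-state labels (H, E, C)."""
    return "".join(DSSP_TO_3[s] for s in states)


def q_score(predicted, observed):
    """Percentage of positions where the predicted label equals the observed."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    observed = "HHHHGGTTEEEE--"   # made-up example labels
    predicted = "HHHGGGTSEEEB--"
    q8 = q_score(predicted, observed)
    q3 = q_score(reduce_to_3(predicted), reduce_to_3(observed))
    print(f"Q8 = {q8:.1f}%, Q3 = {q3:.1f}%")  # Q8 = 78.6%, Q3 = 100.0%
```

Because confusions within a reduced class (for example H versus G) no longer count as errors, Q3 computed this way is never lower than Q8 for the same prediction, which is consistent with the Q3 figure in the abstract exceeding the Q8 figure.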
20	Applications of Deep Neural Networks in Computer-Aided Drug Design Ahmadreza Ghanbarpour Ghouchani (10137641) 01 March 2021 (has links) Deep neural networks (DNNs) have gained tremendous attention in recent years due to their outstanding performance in solving many problems in different fields of science and technology. Currently, this field is of interest to many researchers and is growing rapidly. The ability of DNNs to learn new concepts with minimal instructions facilitates applying current DNN-based methods to new problems. In this dissertation, three methods based on DNNs are discussed, tackling different problems in the field of computer-aided drug design. The first method addresses the problem of predicting hydration properties from 3D structures of proteins without requiring molecular dynamics simulations. Water plays a major role in protein-ligand interactions, and identifying (de)solvation contributions of water molecules can assist drug design. Two different model architectures are presented for the prediction of the hydration information of proteins. The performance of the methods is compared with other conventional methods and experimental data. In addition, their applications in ligand optimization and pose prediction are shown. The design of de novo molecules has always been of interest in the field of drug discovery. The second method describes a generative model that learns to derive features from protein sequences to design de novo compounds. We show how the model can be used to generate molecules similar to the known ones for targets the model has not seen before, and compare it with benchmark generative models. Finally, it is demonstrated how DNNs can learn to predict secondary structure propensity values derived from NMR ensembles. Secondary structure propensities are important in identifying flexible regions in proteins. Protein flexibility has a major role in drug-protein binding, and identifying such regions can assist in the development of methods for ligand binding prediction. The prediction performance of the method is shown for several proteins with two or more known secondary structure conformations. Computational Biology Cheminformatics Computational Chemistry Structural Biology Computer-Aided Drug Design Deep Neural Network (DNN) protein hydration sites De novo drug design Protein secondary structure prediction machine learning
