Global ETD Search

1	The effect of genome variation on human proteins: understanding variants and improving their deleteriousness prediction through extensive contextualisation
Raimondi, Daniele, 15 May 2017
Rapid technological advances are providing unprecedented insights in the biological sciences, with massive amounts of data generated on genomic and protein sequences. These data continue to grow exponentially, and they are extremely valuable for computational tools that predict the effect of genomic variants on human health. State-of-the-art tools in this field give varying results and only tend to agree in the case of single variants that are strongly correlated to disease. The aim of this work is to increase the reliability of these methods, as well as our understanding of the underlying biological mechanisms that lead to disease. We first developed machine learning (ML) based structural bioinformatics predictors that are able to predict molecular features of proteins from the sequence alone. We then used these tools for in silico analysis of the molecular effects of known variants on the affected proteins, and integrated these data with other heterogeneous sources of information, such as the essentiality of a gene, that put the variants into their broader biological context. With this information we created DEOGEN, a novel predictor in this field, which is able to deal with the two most common forms of genomic variation, namely Single Nucleotide Variants (SNVs) and short Insertions and DELetions (INDELs). DEOGEN performs at least on par with other state-of-the-art methods in this field on different datasets. The method was then extended with additional contextual data and is now available as DEOGEN2 via a web server, which visualizes the predicted results for all variants in most human proteins through an interactive interface targeted to both bioinformaticians and clinicians.
Doctorate in Sciences (Doctorat en Sciences) / info:eu-repo/semantics/nonPublished
exact and natural sciences, single nucleotide variants, variant effect prediction, bioinformatics, cysteine oxidation, machine learning
2	Identifying Pathogenic Amino Acid Substitutions in Human Proteins Using Deep Learning / Identifiering av patogena aminosyresubstitutioner i mänskliga proteiner genom deep learning
Kvist, Alexander, January 2018
Many genetic diseases originate from non-synonymous single nucleotide polymorphisms (nsSNPs), which change the final protein product encoded by a gene. Through large-scale sequencing and population studies, there is growing availability of information on which variations are tolerated and which are not. Variant effect predictors use a wide range of information about such variations to predict their effect, often focusing on evolutionary information. Here, a novel amino acid substitution variant effect predictor is developed. The predictor is a deep convolutional neural network that incorporates evolutionary, sequence, and structural information to predict both the pathogenicity and the severity of amino acid substitutions. The model achieves state-of-the-art performance on benchmark datasets.
bioinformatics, deep learning, proteomics, mutations, single nucleotide polymorphism, SNP, convolutional neural network, variant effect prediction, Medical Engineering
3	Predicting biomolecular function from 3D dynamics: sequence-sensitive coarse-grained elastic network model coupled to machine learning
Mailhot, Olivier, 08 1900
The dynamics of biomolecules are intimately tied to their functions but experimentally elusive, making their computational study attractive. When modelling the effects of thousands of mutations, time-stepping methods such as classical or enhanced sampling molecular dynamics are too costly for most applications. On the other hand, normal mode analysis of coarse-grained elastic network models (ENMs) provides fast analytical dynamics spanning all timescales. However, the vast majority of ENMs consider backbone geometry alone, making them a poor choice for studying point mutations that do not affect the equilibrium structure. The Elastic Network Contact Model (ENCoM) is the first sequence-sensitive ENM, enabling its use for the efficient exploration of full conformational spaces from sequence variants. The present work introduces the ENCoM-DynaSig-ML computational pipeline, in which the ENCoM conformational spaces are reduced to Dynamical Signatures and coupled to simple machine learning algorithms. ENCoM-DynaSig-ML predicts the function of sequence variants with significant accuracy, is complementary to all existing methods, and can generate new hypotheses about which dynamical features are important for the studied biomolecule's function. Examples given are the maturation efficiency of microRNA variants, the activation potential of mu-opioid receptor ligands, and the effect of point mutations on VIM-2 lactamase's enzymatic efficiency. This novel application of normal mode analysis is very fast, taking a few seconds of CPU time per variant, and is generalizable to any biomolecule for which experimental mutagenesis data exist.
Structural dynamics, Normal mode analysis, Effect of mutations, RNA dynamics, Ligand-receptor dynamics, Protein dynamics
