  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Computational ligand discovery for the human and zebrafish sex hormone binding globulin

Thorsteinson, Nels 11 1900 (has links)
Virtual screening is a fast, low-cost method to identify potential small-molecule therapeutics from large chemical databases for the vast number of target proteins emerging from the life sciences and bioinformatics. In this work, we applied several conventional and newly developed virtual screening approaches to identify novel non-steroidal ligands for the human and zebrafish sex hormone binding globulin (SHBG). The ‘benchmark set of steroids’ is a set of steroids with known affinities for human SHBG that has been widely used for validation in the development of different virtual screening methods. We have updated this data set by including additional steroidal SHBG ligands and by modifying the predicted binding orientations of several benchmark steroids in the SHBG binding site based on the use of an improved docking protocol and information from recent crystallographic data. The new steroid binding orientations and the expanded version of the benchmark set were then used to create new in silico models, which were applied in virtual screening to identify high-affinity non-steroidal human SHBG ligands from a large chemical database. Anthropogenic compounds with the capacity to interact with the steroid-binding site of SHBG pose health risks to humans and other vertebrates, including fish. We constructed a homology model of SHBG from zebrafish and applied virtual screening to identify ligands for zebrafish SHBG from a set of 80 000 existing commercial substances, many of which may reach the aquatic environment. Six hits from this in silico screen were tested experimentally for zebrafish SHBG binding, and three of them (hexestrol, 4-tert-octylcatechol, and dihydrobenzo(a)pyren-7(8H)-one) demonstrated micromolar binding affinity for the zebrafish SHBG. These findings demonstrate the feasibility of using virtual screening to identify anthropogenic compounds that may disrupt or hijack functionally important protein:ligand interactions.
Studies applying this new computational toxicology method could increase the awareness of hazards posed by existing commercial chemicals at relatively low cost.
3

Computational ligand discovery for the human and zebrafish sex hormone binding globulin

Thorsteinson, Nels 11 1900 (has links)
Virtual screening is a fast, low-cost method to identify potential small-molecule therapeutics from large chemical databases for the vast number of target proteins emerging from the life sciences and bioinformatics. In this work, we applied several conventional and newly developed virtual screening approaches to identify novel non-steroidal ligands for the human and zebrafish sex hormone binding globulin (SHBG). The ‘benchmark set of steroids’ is a set of steroids with known affinities for human SHBG that has been widely used for validation in the development of different virtual screening methods. We have updated this data set by including additional steroidal SHBG ligands and by modifying the predicted binding orientations of several benchmark steroids in the SHBG binding site based on the use of an improved docking protocol and information from recent crystallographic data. The new steroid binding orientations and the expanded version of the benchmark set were then used to create new in silico models, which were applied in virtual screening to identify high-affinity non-steroidal human SHBG ligands from a large chemical database. Anthropogenic compounds with the capacity to interact with the steroid-binding site of SHBG pose health risks to humans and other vertebrates, including fish. We constructed a homology model of SHBG from zebrafish and applied virtual screening to identify ligands for zebrafish SHBG from a set of 80 000 existing commercial substances, many of which may reach the aquatic environment. Six hits from this in silico screen were tested experimentally for zebrafish SHBG binding, and three of them (hexestrol, 4-tert-octylcatechol, and dihydrobenzo(a)pyren-7(8H)-one) demonstrated micromolar binding affinity for the zebrafish SHBG. These findings demonstrate the feasibility of using virtual screening to identify anthropogenic compounds that may disrupt or hijack functionally important protein:ligand interactions.
Studies applying this new computational toxicology method could increase the awareness of hazards posed by existing commercial chemicals at relatively low cost. / Science, Faculty of / Graduate
4

Computationally Linking Chemical Exposure to Molecular Effects with Complex Data: Comparing Methods to Disentangle Chemical Drivers in Environmental Mixtures and Knowledge-based Deep Learning for Predictions in Environmental Toxicology

Krämer, Stefan 30 May 2022 (has links)
Chemical exposures affect the environment and may lead to adverse outcomes in its organisms. Omics-based approaches, like standardised microarray experiments, have expanded the toolbox to monitor the distribution of chemicals and assess the risk to organisms in the environment. The resulting complex data have extended the scope of toxicological knowledge bases and published literature. A plethora of computational approaches have been applied in environmental toxicology considering systems biology and data integration. Still, the complexity of environmental and biological systems captured in the data challenges investigations of exposure-related effects. This thesis aimed at computationally linking chemical exposure to biological effects on the molecular level considering sources of complex environmental data. The first study employed data from an omics-based exposure study considering mixture effects in a freshwater environment. We compared three data-driven analyses for their suitability to disentangle how chemical exposures in mixtures lead to biological effects and for their reliability in attributing potentially adverse outcomes to chemical drivers, using toxicological databases on gene and pathway levels. Differential gene expression analysis and a network inference approach resulted in toxicologically meaningful outcomes and uncovered individual chemical effects, both stand-alone and in combination. We developed an integrative computational strategy to harvest exposure-related gene associations from environmental samples considering mixtures of low-concentration compounds. The applied approaches allowed assessing the hazard of chemicals more systematically with correlation-based compound groups. This dissertation presents another achievement toward data-driven hypothesis generation for molecular exposure effects. The approach combined text mining and deep learning. The study was entirely data-driven and involved state-of-the-art computational methods of artificial intelligence.
We employed literature-based relational data and curated toxicological knowledge to predict chemical-biomolecule interactions. A word embedding neural network with a subsequent feed-forward network was implemented. Data augmentation and recurrent neural networks were beneficial for training with curated toxicological knowledge. The trained models reached accuracies of up to 94% for unseen test data of the employed knowledge base. However, we could not reliably confirm known chemical-gene interactions across selected data sources. Still, the predictive models might derive unknown information from toxicological knowledge sources, like literature, databases or omics-based exposure studies. Thus, the deep learning models might allow predicting hypotheses of exposure-related molecular effects. Both achievements of this dissertation might support the prioritisation of chemicals for testing and an intelligent selection of chemicals for monitoring in future exposure studies.

Table of Contents
Abstract
Acknowledgements
Prelude
1 Introduction
  1.1 An overview of environmental toxicology
    1.1.1 Environmental toxicology
    1.1.2 Chemicals in the environment
    1.1.3 Systems biological perspectives in environmental toxicology
  1.2 Computational toxicology
    1.2.1 Omics-based approaches
    1.2.2 Linking chemical exposure to transcriptional effects
    1.2.3 Up-scaling from the gene level to higher biological organisation levels
    1.2.4 Biomedical literature-based discovery
    1.2.5 Deep learning with knowledge representation
  1.3 Research question and approaches
2 Methods and Data
  2.1 Linking environmental relevant mixture exposures to transcriptional effects
    2.1.1 Exposure and microarray data
    2.1.2 Preprocessing
    2.1.3 Differential gene expression
    2.1.4 Association rule mining
    2.1.5 Weighted gene correlation network analysis
    2.1.6 Method comparison
  2.2 Predicting exposure-related effects on a molecular level
    2.2.1 Input
    2.2.2 Input preparation
    2.2.3 Deep learning models
    2.2.4 Toxicogenomic application
3 Method comparison to link complex stream water exposures to effects on the transcriptional level
  3.1 Background and motivation
    3.1.1 Workflow
  3.2 Results
    3.2.1 Data preprocessing
    3.2.2 Differential gene expression analysis
    3.2.3 Association rule mining
    3.2.4 Network inference
    3.2.5 Method comparison
    3.2.6 Application case of method integration
  3.3 Discussion
  3.4 Conclusion
4 Deep learning prediction of chemical-biomolecule interactions
  4.1 Motivation
    4.1.1 Workflow
  4.2 Results
    4.2.1 Input preparation
    4.2.2 Model selection
    4.2.3 Model comparison
    4.2.4 Toxicogenomic application
    4.2.5 Horizontal augmentation without tail-padding
    4.2.6 Four-class problem formulation
    4.2.7 Training with CTD data
  4.3 Discussion
    4.3.1 Transferring biomedical knowledge towards toxicology
    4.3.2 Deep learning with biomedical knowledge representation
    4.3.3 Data integration
  4.4 Conclusion
5 Conclusion and Future perspectives
  5.1 Conclusion
    5.1.1 Investigating complex mixtures in the environment
    5.1.2 Complex knowledge from literature and curated databases predict chemical-biomolecule interactions
    5.1.3 Linking chemical exposure to biological effects by integrating CTD
  5.2 Future perspectives
S1 Supplement Chapter 1
  S1.1 Example of an estrogen bioassay
  S1.2 Types of mode of action
  S1.3 The dogma of molecular biology
  S1.4 Transcriptomics
S2 Supplement Chapter 3
S3 Supplement Chapter 4
  S3.1 Hyperparameter tuning results
  S3.2 Functional enrichment with predicted chemical-gene interactions and CTD reference pathway genesets
  S3.3 Reduction of learning rate in a model with large word embedding vectors
  S3.4 Horizontal augmentation without tail-padding
  S3.5 Four-relationship classification
  S3.6 Interpreting loss observations for SemMedDB trained models
List of Abbreviations
List of Figures
List of Tables
Bibliography
Curriculum scientiae
Selbständigkeitserklärung (declaration of authorship)
5

Contributions to Computational Methods for Association Extraction from Biomedical Data: Applications to Text Mining and In Silico Toxicology

Raies, Arwa B. 29 November 2018 (has links)
The task of association extraction involves identifying links between different entities. Here, we make contributions to two applications related to the biomedical field. The first application is in the domain of text mining and aims at extracting associations between methylated genes and diseases from biomedical literature. Gathering such associations can benefit disease diagnosis and treatment decisions. We developed the DDMGD database to provide a comprehensive repository of information related to genes methylated in diseases, gene expression, and disease progression. Using DEMGD, a text mining system that we developed, and with additional post-processing, we extracted ~100,000 such associations from free text. The accuracy of extracted associations is 82% as estimated on 2,500 hand-curated entries. The second application is in the domain of computational toxicology, which aims at identifying relationships between chemical compounds and toxicity effects. Identifying toxicity effects of chemicals is a necessary step in many processes including drug design. To extract these associations, we propose using multi-label classification (MLC) methods. These methods have not undergone comprehensive benchmarking in the domain of predictive toxicology that could help in identifying guidelines for overcoming their existing deficiencies. Therefore, we performed extensive benchmarking and analysis of ~19,000 MLC models. We demonstrated variability in the performance of these models under several conditions and determined the best performing model, which achieves an accuracy of 91% on an independent testing set. Finally, we propose a novel framework, LDR (learning from dense regions), for developing MLC and multi-target regression (MTR) models from datasets with missing labels. The framework is generic, so it can be applied to predict associations between samples and discrete or continuous labels.
Our assessment shows that LDR performed better than the baseline approach (i.e., the binary relevance algorithm) when evaluated using four MLC and five MTR datasets. LDR achieved accuracy scores of up to 97% on MLC test datasets, and R² scores of up to 88% on MTR test datasets. Additionally, we developed a novel method for minority oversampling to tackle the problem of imbalanced MLC datasets. Our method improved the precision score of LDR by 10%.
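
The binary relevance baseline named in this abstract trains one independent binary classifier per label. The following is a minimal sketch of that idea only, not of LDR and not of the author's implementation: the nearest-centroid classifier, the function names, and the toy data are all assumptions chosen for illustration, and a missing label is simply skipped when training the classifier for that label.

```python
from math import dist

def fit_binary_relevance(X, Y):
    """Fit one nearest-centroid model per label column (binary relevance).

    Y[i][j] is 1, 0, or None; None marks a missing label, and such samples
    are left out when fitting the model for label j (illustrative choice).
    """
    models = []
    for j in range(len(Y[0])):
        centroids = {}
        for cls in (0, 1):
            rows = [x for x, y in zip(X, Y) if y[j] == cls]
            if rows:  # a class with no observed samples gets no centroid
                centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        models.append(centroids)
    return models

def predict_binary_relevance(models, x):
    """Predict each label independently as the class of the closest centroid."""
    return [min(c, key=lambda cls: dist(x, c[cls])) for c in models]

# Hypothetical toy data: two features, two labels, one missing label.
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
Y = [[0, 1], [0, None], [1, 0], [1, 0]]
models = fit_binary_relevance(X, Y)
print(predict_binary_relevance(models, [0.2, 0.1]))
```

Because each label is modelled separately, correlations between labels are ignored; that limitation is a common reason to treat binary relevance as a baseline.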
6

Using random forest and decision tree models for a new vehicle prediction approach in computational toxicology

Mistry, Pritesh, Neagu, Daniel, Trundle, Paul R., Vessey, J.D. 22 October 2015 (has links)
Drug vehicles are chemical carriers that provide beneficial aid to the drugs they bear. Taking advantage of their favourable properties can potentially allow the safer use of drugs that are considered highly toxic. A means for vehicle selection without experimental trial would therefore be of benefit in saving time and money for the industry. Although machine learning is increasingly used in predictive toxicology, to our knowledge there is no reported work on using machine learning techniques to model drug-vehicle relationships for vehicle selection to minimise toxicity. In this paper we demonstrate the use of data mining and machine learning techniques to process, extract and build models based on classifiers (decision trees and random forests) that allow us to predict which vehicle would be most suited to reduce a drug’s toxicity. Using data acquired from the National Institutes of Health’s (NIH) Developmental Therapeutics Program (DTP), we propose a methodology using an area under a curve (AUC) approach that allows us to distinguish which vehicle provides the best toxicity profile for a drug and build classification models based on this knowledge. Our results show that we can achieve prediction accuracies of 80 % using random forest models whilst the decision tree models produce accuracies in the 70 % region. We consider our methodology widely applicable within the scientific domain and beyond for comprehensively building classification models for the comparison of functional relationships between two variables.
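
As a rough illustration of the area-under-a-curve idea described in this abstract, the sketch below integrates a dose-response curve per vehicle with the trapezoidal rule and labels the vehicle with the smallest area as the preferred one. The response measure, the "lower area is better" rule, the vehicle names, and all numbers are hypothetical assumptions, not details taken from the paper; the resulting label is the kind of class target a decision tree or random forest could then be trained on.

```python
def trapezoid_auc(doses, responses):
    """Area under a piecewise-linear dose-response curve (trapezoidal rule)."""
    pairs = sorted(zip(doses, responses))  # integrate in order of increasing dose
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(pairs, pairs[1:])
    )

def best_vehicle(curves):
    """Return the vehicle with the smallest AUC and the AUC of every vehicle.

    Assumes the response is a toxicity score, so a smaller area is better.
    """
    aucs = {name: trapezoid_auc(d, r) for name, (d, r) in curves.items()}
    return min(aucs, key=aucs.get), aucs

# Hypothetical toy curves for one drug: (doses, toxicity scores) per vehicle.
curves = {
    "vehicle_a": ([1, 2, 4], [0.1, 0.3, 0.7]),
    "vehicle_b": ([1, 2, 4], [0.1, 0.2, 0.4]),
}
label, aucs = best_vehicle(curves)
print(label, aucs)
```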
7

Extraction et sélection de motifs émergents minimaux : application à la chémoinformatique / Extraction and selection of minimal emerging patterns : application to chemoinformatics

Kane, Mouhamadou bamba 06 September 2017 (has links)
Pattern discovery is an important field of knowledge discovery in databases. This work deals with the extraction of minimal emerging patterns. We propose a new, efficient method that extracts minimal emerging patterns with or without a support constraint, unlike existing methods, which typically extract the most supported minimal emerging patterns at the risk of missing interesting but less supported patterns. Moreover, our method takes into account the absence of an attribute, which brings interesting new knowledge. Considering the rules associated with highly supported emerging patterns as prototype rules, we have shown experimentally that this set of rules has good confidence on the covered objects but unfortunately does not cover a significant part of the objects, which is a disadvantage for their use in classification. We propose a prototype-based selection method that improves the coverage of the set of prototype rules without a significant loss in their confidence. In view of the encouraging results obtained, we apply this selection method to a chemical data set relating to the aquatic environment: Aquatox. In a classification context, it allows chemists to better explain the classification of molecules which, without this selection method, would be predicted by a default rule.
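
To make the notion of a minimal emerging pattern concrete, here is a brute-force sketch under the usual growth-rate definition: an itemset is emerging when its support in the target class divided by its support in the other class reaches a threshold, and it is minimal when no proper subset is also emerging. This is not the efficient method proposed in the thesis and it ignores absent attributes; the function names, thresholds, and toy transactions are assumptions for illustration.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    if not transactions:
        return 0.0
    return sum(itemset <= t for t in transactions) / len(transactions)

def minimal_emerging_patterns(target, other, min_growth=2.0, min_support=0.0):
    """Enumerate itemsets by increasing size and keep the minimal emerging ones."""
    items = sorted(set().union(*target))
    found = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if any(p <= candidate for p in found):
                continue  # a smaller emerging pattern is contained: not minimal
            s_target = support(candidate, target)
            if s_target == 0 or s_target < min_support:
                continue
            s_other = support(candidate, other)
            growth = float("inf") if s_other == 0 else s_target / s_other
            if growth >= min_growth:
                found.append(candidate)
    return found

# Hypothetical toy data: each transaction is the set of attributes of one molecule.
target = [{"a", "b"}, {"a", "b", "c"}, {"a", "b"}]
other = [{"a"}, {"b"}, {"a", "c"}, {"b", "c"}]
patterns = minimal_emerging_patterns(target, other, min_growth=3.0)
print(patterns)
```

Enumeration is exponential in the number of attributes, so this only suits toy inputs; it is meant to show the definition, not a practical miner.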
