  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Computationally Linking Chemical Exposure to Molecular Effects with Complex Data: Comparing Methods to Disentangle Chemical Drivers in Environmental Mixtures and Knowledge-based Deep Learning for Predictions in Environmental Toxicology

Krämer, Stefan 30 May 2022
Chemical exposures affect the environment and may lead to adverse outcomes in its organisms. Omics-based approaches, like standardised microarray experiments, have expanded the toolbox to monitor the distribution of chemicals and assess the risk to organisms in the environment. The resulting complex data have extended the scope of toxicological knowledge bases and published literature. A plethora of computational approaches have been applied in environmental toxicology considering systems biology and data integration. Still, the complexity of environmental and biological systems captured in such data challenges investigations of exposure-related effects. This thesis aimed at computationally linking chemical exposure to biological effects on the molecular level considering sources of complex environmental data. The first study employed data from an omics-based exposure study considering mixture effects in a freshwater environment. We compared three data-driven analyses for their suitability to disentangle the mixture effects of chemical exposures on biological effects, and for their reliability in attributing potentially adverse outcomes to chemical drivers using toxicological databases at the gene and pathway levels. Differential gene expression analysis and a network inference approach produced toxicologically meaningful outcomes and uncovered individual chemical effects, both stand-alone and in combination. We developed an integrative computational strategy to harvest exposure-related gene associations from environmental samples containing mixtures of compounds at low concentrations. The applied approaches allowed assessing the hazard of chemicals more systematically with correlation-based compound groups. A second achievement of this dissertation is a step toward data-driven hypothesis generation for molecular exposure effects. The approach combined text mining and deep learning. The study was entirely data-driven and involved state-of-the-art computational methods of artificial intelligence.
We employed literature-based relational data and curated toxicological knowledge to predict chemical-biomolecule interactions. A word embedding neural network with a subsequent feed-forward network was implemented. Data augmentation and recurrent neural networks were beneficial for training with curated toxicological knowledge. The trained models reached accuracies of up to 94% for unseen test data of the employed knowledge base. However, we could not reliably confirm known chemical-gene interactions across selected data sources. Still, the predictive models might derive unknown information from toxicological knowledge sources, like literature, databases or omics-based exposure studies. Thus, the deep learning models might allow predicting hypotheses of exposure-related molecular effects. Both achievements of this dissertation might support the prioritisation of chemicals for testing and an intelligent selection of chemicals for monitoring in future exposure studies.

Table of Contents
Abstract
Acknowledgements
Prelude
1 Introduction
  1.1 An overview of environmental toxicology
    1.1.1 Environmental toxicology
    1.1.2 Chemicals in the environment
    1.1.3 Systems biological perspectives in environmental toxicology
  Computational toxicology
    1.2.1 Omics-based approaches
    1.2.2 Linking chemical exposure to transcriptional effects
    1.2.3 Up-scaling from the gene level to higher biological organisation levels
    1.2.4 Biomedical literature-based discovery
    1.2.5 Deep learning with knowledge representation
  1.3 Research question and approaches
2 Methods and Data
  2.1 Linking environmental relevant mixture exposures to transcriptional effects
    2.1.1 Exposure and microarray data
    2.1.2 Preprocessing
    2.1.3 Differential gene expression
    2.1.4 Association rule mining
    2.1.5 Weighted gene correlation network analysis
    2.1.6 Method comparison
  Predicting exposure-related effects on a molecular level
    2.2.1 Input
    2.2.2 Input preparation
    2.2.3 Deep learning models
    2.2.4 Toxicogenomic application
3 Method comparison to link complex stream water exposures to effects on the transcriptional level
  3.1 Background and motivation
    3.1.1 Workflow
  3.2 Results
    3.2.1 Data preprocessing
    3.2.2 Differential gene expression analysis
    3.2.3 Association rule mining
    3.2.4 Network inference
    3.2.5 Method comparison
    3.2.6 Application case of method integration
  3.3 Discussion
  3.4 Conclusion
4 Deep learning prediction of chemical-biomolecule interactions
  4.1 Motivation
    4.1.1 Workflow
  4.2 Results
    4.2.1 Input preparation
    4.2.2 Model selection
    4.2.3 Model comparison
    4.2.4 Toxicogenomic application
    4.2.5 Horizontal augmentation without tail-padding
    4.2.6 Four-class problem formulation
    4.2.7 Training with CTD data
  4.3 Discussion
    4.3.1 Transferring biomedical knowledge towards toxicology
    4.3.2 Deep learning with biomedical knowledge representation
    4.3.3 Data integration
  4.4 Conclusion
5 Conclusion and Future perspectives
  5.1 Conclusion
    5.1.1 Investigating complex mixtures in the environment
    5.1.2 Complex knowledge from literature and curated databases predict chemical-biomolecule interactions
    5.1.3 Linking chemical exposure to biological effects by integrating CTD
  5.2 Future perspectives
S1 Supplement Chapter 1
  S1.1 Example of an estrogen bioassay
  S1.2 Types of mode of action
  S1.3 The dogma of molecular biology
  S1.4 Transcriptomics
S2 Supplement Chapter 3
S3 Supplement Chapter 4
  S3.1 Hyperparameter tuning results
  S3.2 Functional enrichment with predicted chemical-gene interactions and CTD reference pathway genesets
  S3.3 Reduction of learning rate in a model with large word embedding vectors
  S3.4 Horizontal augmentation without tail-padding
  S3.5 Four-relationship classification
  S3.6 Interpreting loss observations for SemMedDB trained models
List of Abbreviations
List of Figures
List of Tables
Bibliography
Curriculum scientiae
Declaration of Authorship
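
The abstract of this record names differential gene expression analysis as one of the three compared methods. As a minimal sketch of the general idea only, and not the thesis's actual pipeline, the following computes a log2 fold change and Welch's t statistic per gene. The gene names and intensity values are hypothetical, and the input is assumed to be log2-transformed already.

```python
import math
from statistics import mean, variance


def log2_fold_change(exposed, control):
    """Difference of mean log2 expression (exposed minus control)."""
    return mean(exposed) - mean(control)


def welch_t(exposed, control):
    """Welch's t statistic for two groups with possibly unequal variances."""
    standard_error = math.sqrt(
        variance(exposed) / len(exposed) + variance(control) / len(control)
    )
    return (mean(exposed) - mean(control)) / standard_error


# Hypothetical log2 intensities: two genes, three replicates per group.
expression = {
    "gene_a": {"exposed": [9.1, 9.4, 9.3], "control": [7.0, 7.2, 6.9]},
    "gene_b": {"exposed": [5.0, 5.2, 4.9], "control": [5.1, 4.9, 5.0]},
}

# Map each gene to (log2 fold change, t statistic).
results = {
    gene: (
        log2_fold_change(groups["exposed"], groups["control"]),
        welch_t(groups["exposed"], groups["control"]),
    )
    for gene, groups in expression.items()
}
```

In practice, microarray analyses usually rely on dedicated statistical packages with variance moderation and multiple-testing correction, which this sketch leaves out.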
2

Reverse engineering signalling networks in cancer cells

Dorel, Mathurin 16 January 2023
Specialization: Theoretical Biology / Although cancer therapy has made great progress over the last century, resistance to drug treatment remains a major obstacle in the fight against cancer. In this thesis I developed an R package named STASNet that generates semi-quantitative models of signal transduction from signalling perturbation-response data using Least Square Modular Response Analysis models. To examine how well STASNet can quantify the activity of signalling pathways, we used perturbation data from a pair of isogenic colon cancer cell lines with and without SHP2 knock-out, a known resistance mechanism in this cancer type. I then investigated resistance to MEK and ALK inhibition in neuroblastoma, a pediatric cancer with a poor prognosis. A drug screen showed that the MEK inhibitor Selumetinib separated a panel of neuroblastoma cell lines into three sensitive and six resistant cell lines, a split that could not be explained by individual molecular markers. STASNet models showed that the strong resistance to Selumetinib was driven by a strong feedback from ERK to MEK or by a multi-layered feedback to both MEK and IGF1R. From the model, a combination therapy could be derived that targets MEK together with either RAF or IGF1R, depending on the type of feedback present in the cell line. Finally, studying the effect of NF1-KO on signalling revealed that the loss of NF1 hyper-sensitized the MAPK pathway to ligand-induced activation but disrupted the ERK-RAF feedback. The insights from the models developed in this thesis will thus help to design personalized combinations of inhibitors that could be used as second-line therapy after molecular examination of the tumor response to the initial treatment.
/ Cancer therapy has seen immense progress over the last century, but resistance to drug treatments remains a major obstacle in the war against cancer. I developed an R package named STASNet to generate models of signal transduction from signalling perturbation-response data using Least Square Modular Response Analysis models. I used these models to study how differences in signal transduction relate to drug resistance and how they can be used to make predictions about resistance mechanisms and optimal treatments. To show how accurately STASNet can quantify the activity of signalling pathways, I used perturbation data from a pair of isogenic colon cancer cell lines with and without SHP2 knock-out, a known resistance mechanism in this cancer type. The models showed that MAPK signalling is more affected by SHP2 knock-out than PI3K signalling, confirming the role of SHP2 as a primary MAPK component. I investigated resistance to MEK and ALK inhibition in neuroblastoma, a pediatric cancer with a dismal prognosis. The MEK inhibitor Selumetinib separated a panel of neuroblastoma cell lines into three sensitive and six resistant cell lines, a split that could not be explained with individual molecular markers. STASNet models trained on perturbation-response data from these cell lines revealed that the strong resistance to Selumetinib was driven by a strong feedback from ERK to MEK or a multi-layered feedback to both MEK and IGF1R. This was confirmed by phosphoproteomics and suggested a therapy targeting MEK in combination with either RAF or IGF1R, depending on the type of feedback present in the cell line, a prediction that was confirmed experimentally. Finally, studying the effect of NF1-KO on signalling revealed that the loss of NF1 hyper-sensitized the MAPK pathway to ligand-induced activation but disrupted the ERK-RAF feedback. Those insights can help to design personalized combinations of inhibitors that could be used as second-line therapy after molecularly monitoring the tumor response to the initial treatment.
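
For readers unfamiliar with Modular Response Analysis, the classical noise-free form recovers local interaction strengths r from a matrix R of measured global responses via r = -[diag(R^-1)]^-1 R^-1, assuming each perturbation directly targets exactly one node. The sketch below illustrates only that textbook relation; it is not STASNet, which fits least-squares models to noisy data. The three-node network and perturbation strengths are hypothetical.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def local_response(global_response):
    """Classical MRA: scale row i of R^-1 by -1 / (R^-1)_ii, so that r_ii = -1."""
    inverse = invert(global_response)
    n = len(inverse)
    return [[-inverse[i][j] / inverse[i][i] for j in range(n)] for i in range(n)]


# Hypothetical three-node network; entry [i][j] is the direct effect of node j on node i.
true_r = [[-1.0, 0.0, -0.5],
          [0.8, -1.0, 0.0],
          [0.0, 0.6, -1.0]]
# Hypothetical direct perturbation strengths, one perturbation per node.
perturbation = [[0.5, 0.0, 0.0],
                [0.0, -0.3, 0.0],
                [0.0, 0.0, 0.2]]
# Simulated global responses from the steady-state relation R = -r^-1 P.
global_r = [[-value for value in row] for row in matmul(invert(true_r), perturbation)]

recovered = local_response(global_r)
```

Here `recovered` matches `true_r` up to floating-point error because the simulated data are noise-free; with real measurements an exact inversion amplifies noise, which is one motivation for fitted variants.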
3

Network Inference from Perturbation Data: Robustness, Identifiability and Experimental Design

Groß, Torsten 29 January 2021
High-throughput methods quantify a multitude of cellular components but can rarely describe their interactions. A wide variety of network reconstruction methods have therefore been developed over the past 20 years. Perturbation data in particular allow conclusions to be drawn about functional mechanisms in gene regulation, signal transduction, intra-cellular communication and other processes. Nevertheless, network inference remains an unsolved problem, because most methods rest on unsuitable assumptions and do not clarify the identifiability of network edges. In this regard, this dissertation describes a new reconstruction method based on simple assumptions about perturbation propagation. It is therefore applicable in a wide variety of contexts and outperforms other methods in standard benchmarks. For MAPK and PI3K signalling pathways in an adenocarcinoma cell line, it generates plausible network hypotheses that convincingly explain the differing sensitivities of PI3K mutants to various inhibitors. It is further shown that network identifiability can be described by an intuitive max-flow problem. This analytical result makes it possible to determine effective, identifiable networks and to optimize the experimental design of costly perturbation experiments. Extensive tests show that, compared with randomly generated perturbation sequences, the approach reduces the number of perturbations required for full identifiability to less than a third. Finally, the dissertation describes a mathematical advance of Modular Response Analysis. It is shown that the problem can be approximated as an analytically solvable orthogonal regression. This allows a drastic reduction of the numerical effort, so that considerably larger networks can be reconstructed and the latest high-throughput perturbation data can be analyzed.
/ 'Omics' technologies provide extensive quantifications of components of biological systems but rarely characterize the interactions between them. To fill this gap, various network reconstruction methods have been developed over the past twenty years. Using perturbation data, these methods can deduce functional mechanisms in gene regulation, signal transduction, intra-cellular communication and many other cellular processes. Nevertheless, this reverse engineering problem remains essentially unsolved because inferred networks are often based on inapt assumptions and lack both interpretability and a rigorous description of identifiability. To overcome these shortcomings, this thesis first presents a novel inference method based on a simple response logic. The underlying assumptions are so mild that the approach is suitable for a wide range of applications while also outperforming existing methods in standard benchmark data sets. For MAPK and PI3K signalling pathways in an adenocarcinoma cell line, it derived plausible network hypotheses, which explain distinct sensitivities of PI3K mutants to targeted inhibitors. Second, an intuitive maximum-flow problem is shown to describe identifiability of network interactions. This analytical result makes it possible to devise identifiable effective network models in underdetermined settings and to optimize the design of costly perturbation experiments. Benchmarked on a database of human pathways, full network identifiability is obtained with fewer than a third of the perturbations that are needed in random experimental designs. Finally, the thesis presents mathematical advances within Modular Response Analysis (MRA), which is a popular framework to quantify network interaction strengths. It is shown that MRA can be approximated as an analytically solvable total least squares problem.
This insight drastically reduces computational complexity, which makes it possible to model much bigger networks and to handle novel large-scale perturbation data.
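
To illustrate what "analytically solvable" means for total least squares, the simplest case is orthogonal regression with a single predictor, where the slope has a closed form. The sketch below covers only that one-dimensional case under the assumption of equal error variance in x and y; the thesis's MRA formulation is multivariate and is not reproduced here. The function name and data are hypothetical.

```python
import math


def orthogonal_regression(xs, ys):
    """Fit y = slope * x + intercept by minimising perpendicular distances."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("slope is undefined when x and y are uncorrelated")
    # Closed-form slope of the line with minimal summed squared orthogonal distance.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, mean_y - slope * mean_x


# Hypothetical noisy measurements with error in both coordinates.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.2, 2.8]

slope_xy, intercept_xy = orthogonal_regression(xs, ys)
# Swapping the roles of x and y describes the same line, so the slopes are reciprocal.
slope_yx, _ = orthogonal_regression(ys, xs)
```

Unlike ordinary least squares, the fit treats both variables symmetrically, which is why swapping x and y yields the reciprocal slope rather than a different line.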
