91 |
Understanding of Salmonella-phytopathogen-environment-plant interactions and development of novel antimicrobial to reduce the Salmonella burden in fresh tomato production
Deblais, Loic (January 2018)
No description available.
|
92 |
Study of Spatiotemporal Responses of Bacterial Cells
Montagud Martínez, Roser (28 April 2023)
Modern biotechnology applies a mix of experimental and computational tools to carry out genetic engineering in a directed way. The aim is to obtain (re)programmed cells that implement new functions or serve as tools for studying biological systems. In this context, the use of bacteria in biotechnology is widespread. However, the implementation of genetic circuits in these organisms may be limited by natural biological processes; that is, engineered (or natural) circuits may be affected by the passage of time or by changes in the environment in which the bacteria grow. In this thesis, we set out to follow an integrative approach to study how bacteria respond in time and space to genetic and environmental changes that may affect the functionality of circuits of biotechnological interest. We used Escherichia coli as a model organism, exploiting a variety of experimental tools to work with it. First, we studied how environmental and genetic changes affect the functionality of a synthetic genetic circuit that implements a sophisticated logic behavior. We found that there are wide input concentration ranges that the system can process correctly, that the engineered circuitry is quite sensitive to temperature effects, that the expression of heterologous small RNAs is costly for the cell, and that a suitable genetic reorganization of the system to reduce the amount of heterologous DNA in the cell can improve its evolutionary stability. Second, we studied bacterial growth in environments containing nanostructured materials. We found that bacterial populations can be controlled to a large extent through the use of metal-organic frameworks, as these nanostructured materials can slowly decompose in biological media, releasing antimicrobials (metals and organic compounds, including antibiotics). We analyzed the spatiotemporal bacterial response in such a complex and challenging environment, in both liquid and solid media, following a combined experimental and theoretical approach. In addition to variations in performance due to environmental changes, it must also be considered that those gene circuits will evolve over time due to the stochastic accumulation of mutations. These mutations can lead to changes in the functionality of the regulatory circuits. Third, therefore, we performed a long-term evolution experiment to study the contribution of a protein chaperone system to modulating evolutionary stability. In recent years, it has been shown that chaperone systems, such as GroES/EL, can buffer or purge mutations. We performed whole-genome sequencing on different lines with varying expression levels of GroEL, and also measured the growth rate of the cells at the beginning and the end of the evolution experiment. However, our results were not conclusive, so further research is needed to fully understand the role of GroES/EL in evolution and to assess its potential utility in biotechnology. Taken together, this thesis tries to advance our knowledge of how bacteria, and E. coli in particular, behave as expected when the environment is perturbed, the physiology changes, and a long time passes, with a view to potential industrial or (pre)clinical applications.
Montagud Martínez, R. (2023). Study of Spatiotemporal Responses of Bacterial Cells [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/193030
|
93 |
Développement d’outils pour l’étude in vivo de la régulation post-transcriptionnelle chez Caenorhabditis elegans / Tools development for in vivo post-transcriptional regulation study in Caenorhabditis elegans
Zniber, Ilyass (17 December 2012)
The regulation of gene expression is fundamental to coordinating the synthesis, assembly and localization of macromolecular complexes in cells. This expression is regulated at various levels. It begins in the nucleus, where transcription factors bind to specific DNA sequences and recruit RNA polymerases to synthesize RNA. Regulation at this level is called transcriptional. RNA-binding proteins associate with the RNA during synthesis and carry out various modifications such as the addition of a 5' cap, splicing, editing and 3' polyadenylation. The transcripts are then exported to the cytoplasm, where they are targeted to subcellular regions and stored. mRNAs then associate with translation factors and ribosomes to initiate protein synthesis in a controlled manner. Finally, mRNAs are degraded. Regulation that affects any of these steps is called post-transcriptional. The recent development of genome-scale analysis tools has allowed a better overall understanding of gene regulation programs at the transcriptional level. However, the overall architecture of the systems that regulate the post-transcriptional steps of gene expression is still poorly understood. Such a system of post-transcriptional regulation must be controlled by the hundreds of RNA-binding proteins and microRNAs (miRNAs) encoded in eukaryotic genomes. This is why it is important to have tools and platforms suited to the study of post-transcriptional regulation on a genomic scale. In this thesis, we focused on two post-transcriptional regulation programs in Caenorhabditis elegans: alternative splicing and regulation by miRNAs. We used alternative splicing reporter worm lines expressing dual GFP and RFP fluorescence to study the architecture of this regulation and to identify or validate trans-acting factors and cis-elements by forward genetics, using random mutagenesis, automated screening with the COPAS Biosorter and whole-genome sequencing. We also extensively modified the ReFlx module of the COPAS Biosorter, a flow cytometer adapted to large organisms, to eliminate carry-over contamination problems and reduce the processing time by a factor of seven, with the aim of conducting a high-throughput reverse genetics study using RNA interference. Finally, we generated two-color fluorescent lines to study 3' UTR-dependent regulation mediated by microRNAs.
|
94 |
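Impact des changements climatiques et de la variabilité génétique sur le développement et la virulence du nématode à kyste du soya (Heterodera glycines) / Impact of climate change and genetic variability on the development and virulence of the soybean cyst nematode (Heterodera glycines)
Gendron St-Marseille, Anne-Frédérique (05 1900)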
Biological invasions in agroecosystems are a major cause of economic losses. Plant-parasitic nematodes, microscopic worms that mainly attack roots, are among the many species causing significant crop damage. The soybean cyst nematode (SCN), Heterodera glycines, is present in all major soybean-producing countries and is thought to cause several billion dollars of losses annually on its own. Rotation with resistant cultivars is the most effective means of controlling SCN populations, but overuse of the same lines has led to the selection of virulent individuals and rendered this resistance ineffective. To this day, the virulence genes and mechanisms associated with the circumvention of resistance continue to mystify scientists. In this thesis, I explored the effects of climate change on the reproduction and establishment of SCN as well as on the phenology of its host, soybean. The first bioclimatic model simulating the life cycles of SCN and soybean was developed. It showed that the nematode can already reproduce in all regions of Québec and that the expected rise in temperatures in the near future (2041-2070) will allow SCN to nearly double the number of generations produced per growing season in all regions. In addition, I have demonstrated that the area suitable for the production of soybean from maturity group I will expand toward the north by 2070, further facilitating the expansion of SCN. I have also explored the genetic variability among more than 64 SCN populations from North America and analyzed the genes associated with various bioclimatic components and their role in adaptation. These analyses revealed that genetic diversity was very high among SCN populations. This diversity, together with continuous gene flow between populations, has facilitated the adaptation of SCN to various bioclimatic conditions and its establishment in all US and Canadian soybean-producing regions. Finally, this thesis presents an analysis of SCN genotypes and of the differentially expressed genes associated with virulence on two resistant soybean lines (Peking and PI88788) and the susceptible line Essex. This work identified several proteins associated with virulence and revealed a genomic region under strong evolutionary pressure. This genomic island contains several genes in tandem duplications that have diverged and are now used selectively to overcome different sources of resistance.
|
95 |
Exploring DeepSEA CNN and DNABERT for Regulatory Feature Prediction of Non-coding DNA
Stachowicz, Jacob (January 2021)
Predicting and understanding the regulatory effects of non-coding DNA is an extensive research area in genomics. Convolutional neural networks (CNNs) have been used successfully to predict regulatory features, making chromatin feature predictions based solely on non-coding DNA sequences. Non-coding DNA shares various similarities with human language. This makes language models such as the transformer attractive candidates for deciphering the language of non-coding DNA. This thesis investigates how well the transformer model, usually used for NLP problems, predicts chromatin features from genome sequences compared with convolutional neural networks. More specifically, the CNN DeepSEA, which is used for regulatory feature prediction based on non-coding DNA, is compared with the transformer DNABERT. The study also explores the impact that different parameters and training strategies have on performance. Other models (DeeperDeepSEA and DanQ) are compared on the same tasks to give a broader basis for comparison. Lastly, the same experiments are conducted on modified versions of the dataset in which the labels cover different amounts of the DNA sequence. This could prove beneficial to the transformer model, which can capture long-range dependencies in natural language problems. The replication of DeepSEA was successful and gave results similar to those of the original model. The experiments used for DeepSEA were also conducted on DNABERT, DeeperDeepSEA, and DanQ. All the models were trained on different datasets, and their results on the same dataset were compared. Lastly, a prediction voting mechanism that combines the outputs of DeepSEA and DNABERT was implemented, which gave better results than either model individually. The results showed that DeepSEA performed slightly better than DNABERT in terms of AUC ROC. The Wilcoxon signed-rank test showed that, even though the two models obtained similar AUC ROC scores, there is a statistically significant difference between the distributions of their predictions. This means that the models look at the dataset differently, which might be why combining their predictions gives good results. Because of the time needed to train the computationally heavy DNABERT, the best hyperparameters and training strategies for the model were not found, only improved. The datasets used in this thesis were severely unbalanced, which needs to be addressed in future projects. This project is a good continuation of the paper "Whole-genome deep-learning analysis identifies contribution of non-coding mutations to autism risk", which uses the DeepSEA model to learn more about how specific mutations correlate with autism spectrum disorder.
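The abstract above does not say how the prediction voting was implemented or how AUC ROC was computed. As a minimal sketch, assuming the simplest form of voting (averaging each model's predicted probability per example) and a rank-based AUC, the idea can be illustrated as follows. The function names `roc_auc` and `soft_vote` and all scores are hypothetical toy values, not results or code from the thesis.

```python
def roc_auc(labels, scores):
    """Rank-based AUC ROC: the fraction of (positive, negative) pairs in which
    the positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def soft_vote(*model_scores):
    """Assumed voting rule: average the per-example probabilities of all models."""
    return [sum(column) / len(column) for column in zip(*model_scores)]


# Toy data (made up for illustration): binary labels for one chromatin feature
# and the predicted probabilities of two hypothetical models.
labels = [1, 0, 1, 0, 1, 0]
cnn_scores = [0.9, 0.2, 0.4, 0.5, 0.8, 0.1]          # stands in for a CNN
transformer_scores = [0.6, 0.4, 0.7, 0.2, 0.5, 0.6]  # stands in for a transformer

ensemble_scores = soft_vote(cnn_scores, transformer_scores)

auc_cnn = roc_auc(labels, cnn_scores)
auc_transformer = roc_auc(labels, transformer_scores)
auc_ensemble = roc_auc(labels, ensemble_scores)
```

On this toy data the averaged scores rank every positive above every negative even though neither individual score list does, which illustrates how two models that err on different examples can complement each other.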
|
96 |
Prioritizing Causative Genomic Variants by Integrating Molecular and Functional Annotations from Multiple Biomedical Ontologies
Althagafi, Azza Th. (20 July 2023)
Whole-exome and whole-genome sequencing are widely used to diagnose individual patients. However, despite its success, this approach leaves many patients undiagnosed. This could be because more disease genes and variants remain to be discovered, or because disease phenotypes are novel and arise from a combination of variants in multiple known genes related to the disease. Recent rapid increases in available genomic, biomedical, and phenotypic data enable computational analyses that reduce the search space for disease-causing genes or variants and facilitate the prediction of causal variants. Artificial intelligence, data mining, machine learning, and deep learning have therefore become essential tools for identifying biological associations, including protein-protein interactions, gene-disease associations, and variant-disease associations. Predicting these biological associations is a critical step in diagnosing patients with rare or complex diseases.
In recent years, computational methods have emerged to improve gene-disease prioritization by incorporating phenotype information. These methods evaluate a patient's phenotype against a database of gene-phenotype associations to identify the closest match. However, inadequate knowledge of phenotypes linked with specific genes in humans and model organisms limits the effectiveness of the prediction. Information about gene product functions and anatomical locations of gene expression is accessible for many genes and can be associated with phenotypes through ontologies and machine-learning models. Incorporating this information can enhance gene-disease prioritization methods and more accurately identify potential disease-causing genes.
This dissertation aims to address key limitations in gene-disease prediction and variant prioritization by developing computational methods that systematically relate human phenotypes that arise as a consequence of the loss or change of gene function to gene functions and anatomical and cellular locations of activity. To achieve this objective, this work focuses on crucial problems in the causative variant prioritization pipeline and presents novel computational methods that significantly improve prediction performance by leveraging large background knowledge data and integrating multiple techniques.
To this end, this dissertation presents novel approaches that utilize graph-based machine-learning techniques to leverage biomedical ontologies and linked biological data as background knowledge graphs. The methods employ representation learning with knowledge graphs and introduce generic models that address computational problems in gene-disease associations and variant prioritization. I demonstrate that my approach is capable of compensating for incomplete information in public databases and efficiently integrating with other biomedical data for similar prediction tasks. Moreover, my methods outperform other relevant approaches that rely on manually crafted features and laborious pre-processing. I systematically evaluate my methods and illustrate their potential applications for data analytics in biomedicine. Finally, I demonstrate how my prediction tools can be used in the clinic to assist geneticists in decision-making. In summary, this dissertation contributes to the development of more effective methods for predicting disease-causing variants and advancing precision medicine.
|
97 |
Cis-regulation and genetic control of gene expression in neuroblastoma
Burkert, Christian Martin (28 June 2021)
Gene regulation controls phenotypes in health and disease. In cancer, the interplay between germline variation, genetic aberrations and epigenetic factors modulates gene expression in cis. The childhood cancer neuroblastoma originates from progenitor cells of the sympathetic nervous system. It is characterized by a sparsity of recurrent exonic mutations but frequent somatic copy-number alterations, including gene amplifications on extrachromosomal circular DNA. So far, little is known about how local genetic and epigenetic factors regulate genes in neuroblastoma to establish disease phenotypes. Here I combine allele-specific analysis of whole genomes, transcriptomes and circular DNA from neuroblastoma patients to characterize genetic and cis-regulatory effects, and prioritize germline regulatory variants by cis-QTL mapping and chromatin profiles. The results show that somatic copy-number dosage dominates local genetic effects and regulates pathways involved in telomere maintenance, genomic stability and neuronal processes. Gene amplifications show strong dosage effects and are frequently located on large but not small extrachromosomal circular DNAs. My analysis implicates 11q loss in the upregulation of histone variants H3.3 and H2A in tumors with alternative lengthening of telomeres (ALT), and cooperative effects of somatic rearrangements and somatic copy-number gains in the upregulation of TERT. Both 17p copy-number imbalances with the associated downregulation of neuronal genes and the upregulation of the imprinted gene RTL1 by copy-number-independent allelic dosage effects are associated with an unfavorable prognosis. cis-QTL analysis confirms the previously reported regulation of the LMO1 gene by a super-enhancer risk polymorphism and characterizes the regulatory potential of additional GWAS risk loci. My work highlights the importance of dosage effects in neuroblastoma and provides a detailed map of regulatory variation active in this disease.
|