  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
361

Microfluidics for low input epigenomic analysis and application to oncology and brain neuroscience

Liu, Zhengzhi 07 September 2023
Microfluidics is a versatile tool with many applications in biology. Its ability to manipulate small volumes of liquid precisely has led to the development of many microfluidic assay platforms. These platforms can handle small amounts of sample and carry out analyses with high sensitivity and throughput. Microfluidic assays have provided new insights into scarce biological samples at higher resolution. In this thesis, we developed microfluidic tools to conduct low-input ChIP-seq and ChIRP-seq. We applied them to a variety of samples, profiling different targets. The native MOWChIP-seq platform was developed to map RNA polymerase II, transcription factor, and histone deacetylase binding in 1,000-50,000 cells. We examined mouse prefrontal cortex and cerebellum using this technology and found extensive differences that correlated with the distinct neurological functions of the two brain regions. The same platform and workflow were used to profile five key histone modifications in human lung tumor and normal tissue samples. Integrative analysis with gene expression data revealed extensive chromatin remodeling in lung tumors. Spatial histone modification mapping was conducted in mouse neocortex in a similar fashion. We generated an epigenomic tomography that demonstrated the molecular state of the brain in 3D. Lastly, we developed a microfluidic version of the ChIRP-seq process, which successfully conducted the assay using only 500,000 cells. This improvement makes ChIRP-seq in tissue samples feasible. / Doctor of Philosophy / Microfluidics is a type of technology that can control small volumes of liquid in a miniature system. It can carry out reactions on very small scales with higher precision and sensitivity than conventional methods. Microfluidics has found many uses in the field of biology, especially when dealing with samples available in limited quantities. These low-input microfluidic platforms have helped researchers gain new knowledge on many complex questions.
In this thesis, we developed microfluidic tools to carry out low-input ChIP-seq and ChIRP-seq. These are two established techniques used to map where certain targets are located on the genome of an organism. These targets include specific chemical modifications to the proteins that DNA wraps around (histone modifications), proteins that take part in the transcription and expression of genes (RNA polymerase II, transcription factors), and other molecules. Our nMOWChIP-seq system removed the need for chemical fixation. It was able to examine RNA polymerase II, transcription factors, and other enzymes using 1,000-50,000 cells. Traditional ChIP-seq requires more than 10 million cells and time-consuming chemical treatment steps. Our technology greatly improved sensitivity and ease of use. We also used this platform to profile five important histone modifications in human lung tumors and healthy tissues. We constructed a spatial map of histone modifications in the mouse brain by analyzing slices of the cortex. Finally, we developed a microfluidic version of the ChIRP-seq process to map the locations of long non-coding RNAs in cultured human cells. The number of cells needed for a successful assay was reduced from the 20 million of the original workflow to 500,000.
362

Poly(A) Tail Regulation in the Nucleus

Alles, Jonathan 19 May 2022
Ribonucleic acid (RNA) metabolism comprises several steps, from transcription of the RNA through translation to RNA decay. Poly(A) tails are found at the end of most messenger RNAs, protect the RNA from degradation, and stimulate translation. Deadenylation of poly(A) tails is rate-limiting for RNA decay. So far, RNA decay has mostly been studied in the context of cytoplasmic processes; whether and how RNA deadenylation and decay occur in the nucleus has remained unclear. A new method for genome-wide determination of poly(A) tail length, named FLAM-Seq, was therefore developed. FLAM-Seq was used to analyze RNA from cell lines, organoids, and C. elegans, and a significant correlation between 3’-UTR length and poly(A) tail length was found, as well as, for many genes, an association between alternative 3’-UTR isoforms and poly(A) tail length. Analysis of the poly(A) tails of unspliced RNA molecules showed that these tails were more than 200 nt long. This analysis was validated by inhibiting the splicing process. RNA labeling methods, which provide temporal resolution of RNA processing, pointed to deadenylation of poly(A) tails within only a few minutes of their synthesis. Analysis of subcellular fractions showed that this initial deadenylation is a nuclear process. The process is gene-specific, and the poly(A) tails of certain types of transcripts, such as nuclear long non-coding RNAs, were not deadenylated. To identify enzymes that catalyze deadenylation in the nucleus, several methods were used, including RNA-degrading Cas systems, siRNAs, and shRNA cell lines. Despite efficient reduction of the RNA expression of the corresponding enzyme complexes, no molecular phenotypes affecting poly(A) tail length in the nucleus could be identified.
/ RNA metabolism involves different steps, from transcription to translation and decay of messenger RNAs (mRNAs). Most mRNAs have a poly(A) tail attached to their 3’-end, which protects them from degradation and stimulates translation. Removal of the poly(A) tail is the rate-limiting step in RNA decay, controlling stability and translation. It is still unclear whether and to what extent RNA deadenylation occurs in the mammalian nucleus. A novel method for genome-wide determination of poly(A) tail length, termed FLAM-Seq, was developed to overcome current challenges in sequencing mRNAs, enabling genome-wide analysis of complete RNAs, including their poly(A) tail sequence. FLAM-Seq analysis of different model systems uncovered a strong correlation between poly(A) tail length and 3’-UTR length or alternative polyadenylation. Cytosine nucleotides were furthermore significantly enriched in poly(A) tails. Analyzing poly(A) tails of unspliced RNAs from FLAM-Seq data revealed the genome-wide synthesis of poly(A) tails with a length of more than 200 nt. This was validated by splicing inhibition experiments, which uncovered potential links between the completion of splicing and poly(A) tail shortening. Measuring RNA deadenylation kinetics using metabolic labeling experiments hinted at a rapid shortening of tails within minutes. The analysis of subcellular fractions obtained from HeLa cells and a mouse brain showed that initial deadenylation is a nuclear process. Nuclear deadenylation is gene-specific, and poly(A) tails of lncRNAs retained in the nucleus were not shortened. To identify enzymes responsible for nuclear deadenylation, RNA-targeting Cas systems, siRNAs, and shRNA cell lines were used to target different deadenylase complexes. Despite efficient mRNA knockdown, subcellular analysis of poly(A) tail length did not yield molecular phenotypes of changing nuclear poly(A) tail length.
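The reported correlation between poly(A) tail length and 3’-UTR length can be illustrated with a per-gene Pearson correlation. The following is a minimal sketch and not the FLAM-Seq analysis itself: the `pearson_r` helper, the variable names, and all numeric values are hypothetical and chosen only for illustration, and the abstract does not state which correlation statistic was actually used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene values (illustrative only, not data from the thesis):
# 3'-UTR length and median poly(A) tail length, both in nucleotides.
utr_length_nt = [150, 420, 800, 1300, 2100]
median_tail_nt = [60, 85, 95, 120, 140]

r = pearson_r(utr_length_nt, median_tail_nt)
print(f"Pearson r = {r:.3f}")
```

A rank-based statistic such as Spearman's rho may be preferable when the relationship is monotonic but not linear; the choice here is an assumption made for brevity.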
363

A Comparative Transcriptome Analysis of Human and Porcine Choroid Plexus Cells in Response to Streptococcus suis Serotype 2 Infection Points to a Role of Hypoxia

Lauer, Alexa N., Scholtysik, Rene, Beineke, Andreas, Baums, Christoph Georg, Klose, Kristin, Valentin-Weigand, Peter, Ishikawa, Hiroshi, Schroten, Horst, Klein-Hitpass, Ludger, Schwerk, Christian 03 April 2023
Streptococcus suis (S. suis) is an important opportunistic pathogen that can cause septicemia and meningitis in pigs and humans. Previous in vivo observations in S. suis-infected pigs revealed lesions at the choroid plexus (CP). In vitro experiments with primary porcine CP epithelial cells (PCPEC) and human CP epithelial papilloma (HIBCPP) cells demonstrated that S. suis can invade and traverse the CP epithelium, and that the CP contributes to the inflammatory response via cytokine expression. Here, next generation sequencing (RNA-seq) was used to compare global transcriptome profiles of PCPEC and HIBCPP cells infected in vitro with S. suis serotype (ST) 2, and of the CP of pigs infected in vivo. Identified differentially expressed genes (DEGs) were, amongst others, involved in inflammatory responses and hypoxia. The RNA-seq data were validated via quantitative PCR of selected DEGs. Employing Gene Set Enrichment Analysis (GSEA), 18, 28, and 21 enriched hallmark gene sets (GSs) were identified for infected HIBCPP cells, PCPEC, and the CP of pigs suffering from S. suis ST2 meningitis, respectively, of which eight GSs overlapped between the three sample sets. The majority of these GSs are involved in cellular signaling and pathways, immune response, and development, including inflammatory response and hypoxia. In contrast, suppressed GSs observed during in vitro and in vivo S. suis ST2 infections included those involved in cellular proliferation and metabolic processes. This study suggests that similar cellular processes occur in infected human and porcine CP epithelial cells, especially in terms of inflammatory response.
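The overlap of enriched gene sets between the three sample sets amounts to a set intersection. The sketch below is only an illustration of that step: the helper `shared_gene_sets` and the `PLACEHOLDER_*` entries are hypothetical, and the lists are not the study's results. Only hypoxia and inflammatory response are named in the abstract as shared; they are written here using the MSigDB hallmark naming style.

```python
def shared_gene_sets(*collections):
    """Return the gene sets that occur in every collection passed in."""
    if not collections:
        return set()
    return set.intersection(*(set(c) for c in collections))


# Hypothetical, heavily shortened lists of enriched hallmark gene sets.
# PLACEHOLDER_* names stand in for gene sets the abstract does not name.
hibcpp = {"HALLMARK_HYPOXIA", "HALLMARK_INFLAMMATORY_RESPONSE", "PLACEHOLDER_SET_A"}
pcpec = {"HALLMARK_HYPOXIA", "HALLMARK_INFLAMMATORY_RESPONSE", "PLACEHOLDER_SET_B"}
pig_cp = {
    "HALLMARK_HYPOXIA",
    "HALLMARK_INFLAMMATORY_RESPONSE",
    "PLACEHOLDER_SET_A",
    "PLACEHOLDER_SET_B",
}

shared = shared_gene_sets(hibcpp, pcpec, pig_cp)
print(sorted(shared))
```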
364

Antibiotic Resistance Characterization in Human Fecal and Environmental Resistomes using Metagenomics and Machine Learning

Gupta, Suraj 03 November 2021
Antibiotic resistance is a global threat that can severely imperil public health. To curb the spread of antibiotic resistance, it is imperative that efforts commensurate with a “One Health” approach are undertaken. Given that interconnectivities among ecosystems can serve as conduits for the proliferation and dissemination of antibiotic resistance, it is increasingly being recognized that a robust global environmental surveillance framework is required to promote One Health. The ideal aim would be to develop approaches that inform global distribution of antibiotic resistance, help prioritize monitoring targets, present robust data analysis frameworks to profile resistance, and ultimately help build strategies to curb the dissemination of antibiotic resistance. The work described in this dissertation was aimed at evaluating and developing different data analysis paradigms and their applications in investigating and characterizing antibiotic resistance across different resistomes. The applications presented in Chapter 2 illustrate challenges associated with various environmental data types (especially metagenomics data) and present a path to advance incorporation of data analytics approaches in Environmental Science and Engineering research and applications. Chapter 3 presents a novel approach, ExtrARG, that identifies discriminatory ARGs among resistomes based on factors of interest. The results in Chapter 4 provide insight into the global distribution of ARGs across human fecal and sewage resistomes across different socioeconomics. Chapter 5 demonstrates a data analysis paradigm using machine learning algorithms that helps bridge the gap between information obtained via culturing and metagenomic sequencing. Lastly, the results of Chapter 6 illustrate the contribution of phages to antibiotic resistance. Overall, the findings provide guidance and approaches for profiling antibiotic resistance using metagenomics and machine learning. The results reported further expand the knowledge on the distribution of antibiotic resistance across different resistomes.
/ Doctor of Philosophy / Antibiotic resistance is the ability of bacteria to withstand an antibiotic to which they were once sensitive. Antibiotic resistance is a global problem that can pose a serious threat to public health. In order to curb the spread of antibiotic resistance, it is imperative that efforts commensurate with the “One Health” approach are undertaken. Since ecosystem networks can act as channels for the proliferation and spread of antibiotic resistance, there is growing recognition that a robust global environmental monitoring framework is required to promote a true One Health approach. The ideal goal would be to develop approaches that can inform the global spread of antibiotic resistance, help prioritize monitoring objectives, present robust data analysis frameworks for resistance profiling, and ultimately help develop strategies to contain the spread of antibiotic resistance. The objective of the work described in this thesis was to evaluate and develop different data analysis paradigms and their applications in the study and characterization of antibiotic resistance in different resistomes. The applications presented in Chapter 2 illustrate challenges associated with various environmental data types (especially metagenomics data) and present a path to advance incorporation of data analytics approaches in Environmental Science and Engineering research and applications. Chapter 3 presents a novel approach, ExtrARG, that identifies discriminatory ARGs among resistomes based on factors of interest. Chapter 5 demonstrates a data analysis paradigm using machine learning algorithms that helps bridge the gap between information obtained via culturing and metagenomic sequencing. The results of Chapter 4 provide insight into the global distribution of ARGs across human fecal and sewage resistomes across different socioeconomics. Lastly, the results of Chapter 6 illustrate the contribution of phages to antibiotic resistance. Overall, the findings provide guidance and approaches for profiling antibiotic resistance using metagenomics and machine learning. The results reported further expand the knowledge on the distribution of antibiotic resistance across different resistomes.
365

Profiling of Microbial Communities, Antibiotic Resistance, Functional Genes, and Biodegradable Dissolved Organic Carbon in a Carbon-Based Potable Water Reuse System

Blair, Matthew Forrest 17 March 2023
Water reuse has become a promising alternative to alleviate stress on conventional freshwater resources in the face of population growth, sea level rise, source water depletion, eutrophication of water bodies, and climate change. Potable water reuse aims to purify wastewater effluent to drinking water quality or better through the development and implementation of advanced treatment trains. While membrane-based treatment has become a widely adopted treatment step for this purpose, there is growing interest in implementing treatment trains that harness microorganisms as a more sustainable and less energy-intensive means of removing contaminants of emerging concern (CECs) through biological degradation or transformation. In this dissertation, various aspects of the operation of a microbially active, carbon-based advanced treatment train producing water intended for potable reuse are examined, including the fate of dissolved organic carbon, the underlying microbial populations, and functional genes. Further, dynamics associated with antibiotic resistance genes (ARGs), identified as microbially relevant CECs, are also assessed. Overall, this dissertation advances understanding of the interplay between and within treatment processes as they relate to removal of various organic carbon fractions, microbial community dynamics, functional genes, and ARGs. Further, when relevant, these insights are contextualized to operational conditions, process upsets, water quality parameters, and other intended water uses within the water industry, with the goal of broadening the application of advanced molecular tools beyond the scope of academic research.
Specifically, this dissertation illuminates relationships among organic carbon fractions and molecular markers within an advanced treatment train employing flocculation, coagulation, and sedimentation (FlocSed), ozonation, biologically active carbon (BAC) filtration, granular activated carbon (GAC) contacting, and UV disinfection. Biodegradable dissolved organic carbon (BDOC) analysis was adapted specifically as an assay for assessing the biodegradability of dissolved organic carbon by BAC/GAC biofilms and applied to profile biodegradable and non-biodegradable organic carbon as wastewater effluent passed through each of these treatment stages. Of particular interest was the role of ozonation in producing bioavailable organic carbon that can be effectively removed by BAC filtration. In addition to understanding the removal of fractionated organic carbon, next generation DNA sequencing technologies (NGS) were utilized to better understand the dynamics of complex microbial communities during disinfection and biological treatment. Specifically, this analysis focused on succession and colonization of taxa, genes related to a wide range of functional interests (e.g. metabolic processes, horizontal gene transfer, DNA repair, and nitrogen cycling), and microbial CECs. Finally, NGS technologies were employed to assess the differences between the microbiomes of a wide range of water use categories, including conventional drinking water, potable reuse, and non-potable reuse effluents, to identify core and discriminatory taxa associated with intended water usage. The outcomes of this dissertation provide valuable information for optimizing carbon-based treatment trains as an alternative to membrane-based treatment for sustainable water reuse and also advance the application of NGS as a diagnostic tool for assessing the efficacy of various water treatment technologies in achieving treatment goals.
/ Doctor of Philosophy / Several factors have led to increased stress on conventional drinking water sources and widespread global water scarcity. Projections indicate that continued population growth, increased water demand, and degradation of current freshwater resources will exacerbate water needs and underscore the need to secure new potable (i.e. fit for human consumption) sources. Water reuse is a promising alternative to offset the growing demands on traditional potable sources and ameliorate negative consequences associated with water scarcity. Discharge of treated wastewater to marine environments is an especially clear lost opportunity, as the water will no longer be of value to freshwater habitats or as a drinking water source. Water reuse challenges the conventional wastewater treatment paradigm by providing advanced treatment of wastewater effluent to produce a valuable resource that can be safely used directly for either non-potable (e.g., irrigation, firefighting) or potable (i.e., drinking water) applications. The means of achieving advanced treatment of wastewater effluents can take many forms, commonly relying on membrane filtration. However, membrane filtration is an energy-intensive process and suffers from high initial costs, high operational costs, membrane fouling over time, and the production of a salty waste stream that is difficult to dispose of. These drawbacks have motivated the water reuse industry to explore more sustainable approaches to achieving high quality effluents. One such alternative relies on microorganisms to provide biological degradation and transformation of contaminants through a process known as biologically active filtration (BAF). Compared with membrane systems, BAF is more cost effective and produces significantly fewer byproducts while still producing high quality treated water for reuse.
However, the range in quality of the resulting treated water has not yet been fully established, in part due to the lack of understanding of the complex microbial communities responsible for biological treatment. As water and wastewater treatment technologies have evolved over the past century, many biological treatments have remained largely a 'black box' due to the lack of effective tools to identify the tens of thousands of species of microbes that inhabit a typical system and to track their dynamics over time. Instead, analysis has largely focused on basic water quality indicators. This dissertation takes important steps in advancing the use of DNA-based analysis and biodegradable dissolved organic carbon (BDOC) analysis to improve understanding of the mechanisms that drive different water reuse treatment technologies and to identify potential vulnerabilities. Insights gained through application of these tools are contextualized to observed operational conditions, process upsets, and water quality measurements. This helped to advance the use of DNA-based tools to better inform water treatment engineering practice. Specifically, this dissertation dives into the relationships between organic carbon and DNA-based markers within an advanced treatment train employing flocculation, coagulation, and sedimentation (FlocSed), ozonation, biologically active carbon (BAC) filtration, granular activated carbon (GAC) contacting, and UV disinfection. Development and application of the BDOC test revealed that the bulk of organic carbon entering the treatment train is dissolved. Further, BDOC analysis served to characterize the impact of specific treatment processes and changes in operational conditions on both biodegradable and non-biodegradable organic carbon fractions. Such information can help to inform continued process optimization.
Utilization of DNA-based technologies shed light on the functional capacity of microbial communities present within each stage of treatment and on the fate of antibiotic resistance genes (ARGs). ARGs are of concern because, when present in human pathogens, they can result in the failure of antibiotics to cure deadly infections. Other functional genes of interest were also examined using the DNA-based analysis, including genes driving metabolic processes and nitrogen cycling that are critical to water purification during BAF treatment. The DNA-based analyses also made it possible to better understand the effects of disinfectants on microbes. Interestingly, some ARG types increased in relative abundance (a measure analogous to percent composition) in response to treatments such as disinfection, while others decreased. The microbial communities and their dynamic responses to changing operating conditions were also characterized. For example, it was possible to characterize how the profiles of microbes changed with time, an ecological process called succession, during BAC filtration and GAC contacting. Generally, this analysis, coupled with the functional analysis, shed light on the important, divergent roles of bacterial communities in organic degradation during both BAC and GAC treatment. Finally, a study was conducted that compared the microbiome (i.e. the entire microbial community) across a wide range of conventional drinking water, potable reuse water, and non-potable reuse waters. Here it was found that significant differences existed between the microbial communities of water intended for potable and non-potable usage. This work also sought to expand the application of NGS technologies beyond strictly academic research by developing the application of more advanced DNA-based tools for treatment train assessment and monitoring.
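Relative abundance, described above as analogous to percent composition, can be sketched as each category's share of the total count. This is an illustrative sketch only: the `relative_abundance` helper, the ARG class names, and the counts are hypothetical, and the dissertation may normalize differently (for example, to a marker gene rather than to total ARG counts).

```python
def relative_abundance(counts):
    """Convert raw counts per category into fractions of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total count must be positive")
    return {name: value / total for name, value in counts.items()}


# Hypothetical ARG read counts per resistance class (illustrative only).
before_disinfection = {"sulfonamide": 40, "tetracycline": 40, "beta-lactam": 20}
after_disinfection = {"sulfonamide": 30, "tetracycline": 10, "beta-lactam": 10}

before = relative_abundance(before_disinfection)
after = relative_abundance(after_disinfection)

for arg_class in before:
    print(f"{arg_class}: {before[arg_class]:.0%} -> {after[arg_class]:.0%}")
```

In this made-up example the sulfonamide class rises from 40% to 60% of the total even though its raw count falls from 40 to 30, which is one way a relative measure can increase after a treatment that reduces absolute numbers.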
366

Statistical methods for variant discovery and functional genomic analysis using next-generation sequencing data

Tang, Man 03 January 2020
The development of high-throughput next-generation sequencing (NGS) techniques produces massive amounts of data, allowing the identification of biomarkers in early disease diagnosis and driving the transformation of most disciplines in biology and medicine. Greater effort is needed in developing novel, powerful, and efficient tools for NGS data analysis. This dissertation focuses on modeling "omics" data in various NGS applications, with a primary goal of developing novel statistical methods to identify sequence variants, find transcription factor (TF) binding patterns, and decode the relationship between TF binding and gene expression levels. Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in NGS applications. Existing methods for calling these variants often make the simplifying assumption of positional independence and fail to leverage the dependence of genotypes at nearby loci induced by linkage disequilibrium. We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short read data. Simulation experiments show that, under various sequencing depths, vi-HMM outperforms existing methods in terms of sensitivity and F1 score. When applied to human whole-genome sequencing data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs. One important NGS application is chromatin immunoprecipitation followed by sequencing (ChIP-seq), which characterizes protein-DNA relations through genome-wide mapping of TF binding sites. Multiple TFs binding to DNA sequences often show complex binding patterns, which indicate how TFs with similar functionalities work together to regulate the expression of target genes. To help uncover the transcriptional regulation mechanism, we propose a novel nonparametric Bayesian method to detect the clustering pattern of multiple-TF bindings from ChIP-seq datasets.
A simulation study demonstrates that our method performs best with regard to precision, recall, and F1 score in comparison to traditional methods. We also apply the method to real data and observe several TF clusters that have been recognized previously in mouse embryonic stem cells. Recent advances in ChIP-seq and RNA sequencing (RNA-Seq) technologies provide more reliable and accurate characterization of TF binding sites and gene expression measurements, which serve as a basis for studying the regulatory functions of TFs on gene expression. We propose a log Gaussian Cox process with a wavelet-based functional model to quantify the relationship between TF binding site locations and gene expression levels. Through a simulation study, we demonstrate that our method performs well, especially with large sample size and small variance. It also shows a remarkable ability to distinguish real local features in the function estimates. / Doctor of Philosophy / The development of high-throughput next-generation sequencing (NGS) techniques produces massive amounts of data and brings about innovations in biology and medicine. Greater effort is needed in developing novel, powerful, and efficient tools for NGS data analysis. In this dissertation, we mainly focus on three problems closely related to NGS and its applications: (1) how to improve variant calling accuracy, (2) how to model transcription factor (TF) binding patterns, and (3) how to quantify the contribution of TF binding to gene expression. We develop novel statistical methods to identify sequence variants, find TF binding patterns, and explore the relationship between TF binding and gene expression. We expect our findings will be helpful in promoting a better understanding of disease causality and facilitating the design of personalized treatments.
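The evaluation metrics named in this abstract (sensitivity, which is the same quantity as recall, plus precision and F1 score) can be sketched for a set of variant calls compared against a truth set. This is a generic illustration, not the vi-HMM evaluation code: the `precision_recall_f1` helper, the tuple representation of a variant, and the example variants are all hypothetical.

```python
def precision_recall_f1(called, truth):
    """Compare called variants with a truth set.

    Variants are hashable records, here (chrom, pos, ref, alt) tuples.
    Returns (precision, recall, f1); recall is also known as sensitivity.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical truth set and call set (illustrative only).
truth_variants = {
    ("chr1", 100, "A", "G"),   # SNP
    ("chr1", 250, "C", "T"),   # SNP
    ("chr2", 75, "G", "GA"),   # insertion
    ("chr2", 900, "TC", "T"),  # deletion
}
called_variants = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "GA"),
    ("chr3", 10, "A", "C"),    # false positive
}

precision, recall, f1 = precision_recall_f1(called_variants, truth_variants)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Exact matching on position and alleles is a simplifying assumption; INDELs in particular can have several equivalent representations, so real comparisons usually normalize variants first.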
367

Computational Analysis of Gene Expression Regulation from Cross Species Comparison to Single Cell Resolution

Lee, Jiyoung 31 August 2020
Gene expression regulation is dynamic and specific to various factors such as developmental stages, environmental conditions, and stimulation by pathogens. Nowadays, a tremendous number of transcriptome data sets are available from diverse species. This trend enables us to perform comparative transcriptome analysis that identifies conserved or diverged gene expression responses across species using transcriptome data. The goal of this dissertation is to develop and apply approaches of comparative transcriptomics to transfer knowledge from model species to non-model species, with the hope that such an approach can contribute to the improvement of crop yield and human health. First, we presented a comprehensive method to identify cross-species modules between two plant species. We adapted an unsupervised network-based module finding method to identify conserved patterns of co-expression and functional conservation between Arabidopsis, a model species, and soybean, a crop species. Second, we compared drought-responsive genes across Arabidopsis, soybean, rice, corn, and Populus in order to explore the genomic characteristics that are conserved under drought stress across species. We identified hundreds of common gene families and conserved regulatory motifs between monocots and dicots. We also presented a BLS-based clustering method, which takes into account evolutionary relationships among species, to identify conserved co-expressed genes. Last, we analyzed single-cell RNA-seq data from monocytes to attempt to understand the regulatory mechanism of the innate immune system under low-grade inflammation. We identified novel subpopulations of cells treated with lipopolysaccharide (LPS) that show distinct expression patterns from pro-inflammatory genes. The data revealed that a promising therapeutic reagent, sodium 4-phenylbutyrate, masked the effect of LPS.
We inferred the existence of specific cellular transitions under different treatments and prioritized important motifs that modulate the transitions using feature selection by a random forest method. There has been a transition in genomics research from bulk RNA-seq to single-cell RNA-seq (scRNA-seq), and scRNA-seq has become a widely used approach for transcriptome analysis. With the experience we gained by analyzing scRNA-seq data, we plan to conduct comparative single-cell transcriptome analysis across multiple species. / Doctor of Philosophy / All cells in an organism have the same set of genes, but different cell types, tissues, and organs with different functions arise as the organism ages or under different conditions. Gene expression regulation is one mechanism that modulates complex, dynamic, and specific changes in tissues or cell types in any living organism. Understanding gene regulation is of fundamental importance in biology. With the rapid advancement of sequencing technologies, there is a tremendous amount of gene expression data (transcriptome) from individual species in public repositories. However, major studies have been reported from several model species, and research on non-model species has relied on comparisons with a few model species. Comparative transcriptome analysis across species will help us to transfer knowledge from model species to non-model species, and such knowledge transfer can contribute to the improvement of crop yields and human health. The focus of my dissertation is to develop and apply approaches for comparative transcriptome analysis that can help us better understand what makes each species unique or special, and what kinds of common functions across species have been passed down from ancestors (evolutionarily conserved functions). Three research chapters are presented in this dissertation. First, we developed a method to identify groups of genes that are commonly co-expressed in two species.
We chose seed development data from soybean in the hope of contributing to crop improvement. Second, we compared gene expression data across five plant species including soybean, rice, and corn to provide new perspectives about crop plants. We chose drought stress to identify conserved functions and regulatory factors across species since drought stress is one of the major stresses that negatively impact agricultural production. We also proposed a method that groups genes with evolutionary relationships from an unlimited number of species. Third, we analyzed single-cell RNA-seq data from mouse monocytes to understand the regulatory mechanism of the innate immune system under low-grade inflammation. We observed how innate immune cells respond to inflammation that may cause no symptoms but persists for a long period of time. Also, we reported an effect of a promising therapeutic reagent (sodium 4-phenylbutyrate) on chronic inflammatory diseases. The third project will be extended to comparative single-cell transcriptome analysis with multiple species.
368

Bridging the TB data gap: in silico extraction of rifampicin-resistant tuberculosis diagnostic test results from whole genome sequence data

05 November 2019
Background: Mycobacterium tuberculosis rapid diagnostic tests (RDTs) are widely employed in routine laboratories and national surveys for detection of rifampicin-resistant (RR)-TB. However, as next-generation sequencing technologies have become more commonplace in research and surveillance programs, RDTs are being increasingly complemented by whole genome sequencing (WGS). While comparison between RDTs is difficult, all RDT results can be derived from WGS data. This can facilitate continuous analysis of RR-TB burden regardless of the data generation technology employed. By converting WGS to RDT results, we enable comparison of data with different formats and sources, particularly for low- and middle-income high TB-burden countries that employ different diagnostic algorithms for drug resistance surveys. This allows national TB control programs (NTPs) and epidemiologists to utilize all available data in the setting for improved RR-TB surveillance. Methods: We developed the Python-based MycTB Genome to Test (MTBGT) tool that transforms WGS-derived data into laboratory-validated results of the primary RDTs: Xpert MTB/RIF, Xpert MTB/RIF Ultra, GenoType MDRTBplus v2.0, and Genoscholar NTM+MDRTB II. The tool was validated through RDT results of RR-TB strains with diverse resistance patterns and geographic origins and applied to routine-derived WGS data. Results: The MTBGT tool correctly transformed the single nucleotide polymorphism (SNP) data into the RDT results and generated tabulated frequencies of the RDT probes as well as rifampicin-susceptible cases. The tool supplemented the RDT probe reactions output with the RR-conferring mutation based on identified SNPs. The MTBGT tool facilitated continuous analysis of RR-TB and Xpert probe reactions from different platforms and collection periods in Rwanda. Conclusion: Overall, the MTBGT tool allows low- and middle-income countries to make sense of the increasingly generated WGS data in light of the readily available RDT results.
/ Erasmus Mundus Joint Doctorate Fellowship grant 2016-1346.
369

The Systematic Design and Application of Robust DNA Barcodes

Buschmann, Tilo 19 September 2016
High-throughput sequencing technologies are improving in quality, capacity, and costs, providing versatile applications in DNA and RNA research. For small genomes or fractions of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multiplexing approach relies on a specific DNA tag, index, or barcode that is attached to the sequencing or amplification primer and hence accompanies every read. After sequencing, each sample read is identified on the basis of the respective barcode sequence. Alterations of DNA barcodes during synthesis, primer ligation, DNA amplification, or sequencing may lead to incorrect sample identification unless the error is revealed and corrected. This can be accomplished by implementing error-correcting algorithms and codes. This barcoding strategy increases the total number of correctly identified samples, thus improving overall sequencing efficiency. Two popular sets of error-correcting codes are Hamming codes and codes based on the Levenshtein distance. Levenshtein-based codes operate only on words of known length. Since a DNA sequence with an embedded barcode is essentially one continuous long word, application of the classical Levenshtein algorithm is problematic. In this thesis we demonstrate the decreased error correction capability of Levenshtein-based codes in a DNA context and suggest an adaptation of Levenshtein-based codes that is proven to efficiently correct nucleotide errors in DNA sequences. In our adaptation, we take any DNA context into account and impose stricter rules for the selection of barcode sets. In simulations we show the superior error correction capability of the new method compared to traditional Levenshtein- and Hamming-based codes in the presence of multiple errors. We present an adaptation of Levenshtein-based codes to DNA contexts capable of guaranteed correction of a pre-defined number of insertion, deletion, and substitution mutations.
Our improved method is additionally capable of correcting on average more random mutations than traditional Levenshtein-based or Hamming codes. As part of this work we prepared software for the flexible generation of DNA codes based on our new approach. To adapt codes to specific experimental conditions, the user can customize sequence filtering, the number of correctable mutations, and barcode length for highest performance. However, not every platform is susceptible to a large number of both indel and substitution errors. The Illumina “Sequencing by Synthesis” platform shows a very large number of substitution errors as well as a very specific shift of the read that results in inserted and deleted bases at the 5’-end and the 3’-end (which we call phaseshifts). We argue that in this scenario the application of Sequence-Levenshtein-based codes is not efficient because it targets a category of errors that barely occurs on this platform, which reduces the code size needlessly. As a solution, we propose the “Phaseshift distance” that exclusively supports the correction of substitutions and phaseshifts. Additionally, we enable the correction of arbitrary combinations of substitution and phaseshift errors. Thus, we address the lopsided number of substitutions compared to phaseshifts on the Illumina platform. To compare codes based on the Phaseshift distance to Hamming codes as well as codes based on the Sequence-Levenshtein distance, we simulated an experimental scenario based on the error pattern we identified on the Illumina platform. Furthermore, we generated a large number of different sets of DNA barcodes using the Phaseshift distance and compared codes of different lengths and error correction capabilities. We found that codes based on the Phaseshift distance can correct a number of errors comparable to codes based on the Sequence-Levenshtein distance while offering a number of DNA barcodes comparable to Hamming codes.
Thus, codes based on the Phaseshift distance show a higher efficiency in the targeted scenario. In some cases (e.g., with PacBio SMRT in Continuous Long Read mode), the position of the barcode and DNA context is not well defined. Many reads start inside the genomic insert so that adjacent primers might be missed. The matter is further complicated by coincidental similarities between barcode sequences and reference DNA. Therefore, a robust strategy is required in order to detect barcoded reads and avoid a large number of false positives or negatives. For mass inference problems such as this one, false discovery rate (FDR) methods are powerful and balanced solutions. Since existing FDR methods cannot be applied to this particular problem, we present an adapted FDR method that is suitable for the detection of barcoded reads and suggest possible improvements.
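To make the distance-based error correction described in this abstract concrete, the following is a minimal Python sketch of the two classical distances it names (Hamming and Levenshtein) and of nearest-barcode decoding. It is not the thesis software: the barcode set `BARCODES`, the function names, and the tie-handling rule are illustrative assumptions, and the Sequence-Levenshtein and Phaseshift distances proposed in the thesis are not implemented here.

```python
from itertools import combinations

# Hypothetical toy barcode set (not from the thesis); minimum Hamming distance is 3.
BARCODES = ["ACGTAC", "ACCATC", "TGGTTG"]


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Classical edit distance over substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x from a
                            curr[j - 1] + 1,          # insert y into a
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def min_distance(barcodes, dist) -> int:
    """Smallest pairwise distance within a barcode set."""
    return min(dist(a, b) for a, b in combinations(barcodes, 2))


def correctable_errors(barcodes, dist) -> int:
    """A code with minimum distance d guarantees correction of (d - 1) // 2 errors."""
    return (min_distance(barcodes, dist) - 1) // 2


def decode(observed: str, barcodes, dist, max_errors: int):
    """Return the unique nearest barcode within max_errors, or None."""
    scored = sorted((dist(observed, b), b) for b in barcodes)
    best_d, best = scored[0]
    if best_d > max_errors:
        return None
    if len(scored) > 1 and scored[1][0] == best_d:
        return None  # ambiguous: two barcodes are equally close
    return best


if __name__ == "__main__":
    t = correctable_errors(BARCODES, hamming)
    print("guaranteed correctable substitutions:", t)
    print("decoded:", decode("ACGTAA", BARCODES, hamming, t))

    # One deleted base inside the barcode pulls a base of the adjacent DNA
    # into the fixed-length barcode window.
    read = "ACTAC" + "TTTT"   # barcode ACGTAC with G deleted, then insert DNA
    window = read[:6]         # "ACTACT"
    print("classical Levenshtein to ACGTAC:", levenshtein(window, "ACGTAC"))
```

In the final lines, a single deletion yields a fixed-length window whose classical Levenshtein distance to the original barcode is 2 rather than 1, because the shifted-in base from the adjacent DNA counts as a second edit. This is a small illustration of the problem the abstract describes when barcodes are embedded in a longer DNA sequence.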
370

Detection and Identification of Viruses Using Next-Generation Sequencing (NGS) [Detekce a identifikace virů pomocí sekvenování nové generace (NGS)]

PODRÁBSKÁ, Kateřina January 2017
Next-generation sequencing is a modern method applied in plant virology for sensitive detection of previously characterized and novel pathogens without any prior knowledge of them. In this study, three novel and two already described viruses were detected by de novo assembly of Illumina single-end reads (HiSeq 2500 system) from total poly(A)-enriched RNA of diseased red clover (Trifolium pratense) and an indicator plant (Nicotiana occidentalis 37B). The complete genomic sequence of the novel Red clover carlavirus A (RCCA) was determined from Illumina reads, 5′ and 3′ RACE, cloning, RT-PCR, and Sanger sequencing. The presence of RCCA was also confirmed in a mechanically inoculated tobacco plant.
