Global ETD Search

1	Heterogeneity and Context-Specificity in Biological Systems Litvin, Oren January 2014 High-throughput technologies and statistical analyses have transformed the way biological research is performed. These technologies accomplish tasks that were labeled as science fiction only 20 years ago - identifying millions of genetic variations in a genome, measuring the expression levels of all genes on a single chip, quantifying the concentration of dozens of proteins at single-cell resolution. High-throughput genome-wide approaches allowed us, for the first time, to perform unbiased research that doesn't depend on existing knowledge. Thanks to these new technologies, we now have a much better understanding of what goes awry in cancer, what the genetic predispositions for numerous diseases are, and how to select the best available treatment for each patient based on his/her genetic and genomic features. The emergence of new technologies, however, also introduced many new problems that need to be addressed in order to fully exploit the information within the data. Tasks start with data normalization and artifact identification, continue with how to properly model the data using statistical tools, and end with suitable ways to translate those statistical results into informative and correct biological insights. A new field - computational biology - emerged to address those problems and bridge the gap between statistics and biology. Here I present 3 studies on computational modeling of heterogeneity and context-specificity in biological systems. My work focused on the identification of genomic features that can predict or explain a phenotype. In my studies of both yeast and cancer, I found vast heterogeneity between individuals that hampers the predictive power of many statistical models. 
I developed novel computational models that account for the heterogeneity and discovered that, in most cases, the relationship between the genomic feature and the phenotype is context-specific - genomic features explain, predict or exert influence on the phenotype in only a subset of cases. In the first project I studied the landscape of genetic interactions in yeast using gene expression data. I found that roughly 80% of interactions are context-specific, where genetic mutations influence expression levels only in the context of other mutations. In the second project I used gene expression and copy number data to identify drivers of oncogenesis. By using gene expression as a phenotype, and by accounting for context-specificity, I identified two novel copy number drivers that were validated experimentally. In the third project I studied the transcriptional and phenotypic effects of MAPK pathway inhibition in melanoma. I show that most MAPK targets are context-specific - under the control of the pathway only in a subset of cell lines. A computational model I designed to detect context-specific interactions of the MAPK pathway identified the interferon pathway as a major player in the cytotoxic response to MAPK inhibition. Taken together, my research demonstrates the importance of context-specificity in the analysis of biological systems. Context-specific computational modeling, combined with high-throughput technologies, is a powerful tool for dissecting biological networks. Biology Bioinformatics Genetics--Research Bacterial diversity--Genetic aspects Genetics--Mathematical models
2	Development of algorithms for metagenomics and applications to the study of evolutionary processes that maintain microbial biodiversity Luo, Chengwei 20 December 2012 Understanding microbial evolution lies at the heart of microbiology and environmental sciences. Numerous studies have been dedicated to elucidating the underlying mechanisms that create microbial genetic diversity and adaptation. However, due to technical limitations such as the high proportion of uncultured cells in almost every natural habitat, most current knowledge is based primarily on axenic cultures grown under laboratory conditions, which typically do not simulate the natural environment well. How well the knowledge from isolates translates to in-situ processes and natural microbial communities remains essentially speculative. The recent development of culture-independent genomic techniques (aka metagenomics) makes it possible to bypass some of these limitations and provides new insights into microbial evolution in-situ. To date, most metagenomic studies have focused on a few reduced-diversity model communities, e.g., acid mine drainage. Highly complex communities such as those of soil and sediment habitats remain comparatively less understood. Furthermore, a great power of metagenomics, which has not yet been fully capitalized on, is the ability to follow the evolution of natural microbial communities over time and across environmental perturbations, i.e., time-series metagenomics. Although the recent developments in DNA sequencing technologies have enabled (inexpensive) time-series studies, the bioinformatics approaches to analyze the resulting data have clearly fallen behind. 
Taken together, to scale up metagenomics for complex community studies, three major challenges remain: 1) the difficulty of processing and analyzing massive short-read sequencing data, often at the terabyte level; 2) the difficulty of effectively assembling genomes from complex metagenomes; and 3) the lack of methods for tracking genotypes and mutational events such as horizontal gene transfer (HGT) through time. Therefore, developing efficient bioinformatics approaches to address these challenges represents an important and timely issue. This thesis aimed to develop novel bioinformatics pipelines and algorithms for high performance computing, and, subsequently, apply these tools to natural microbial communities to generate quantitative insights into the relative importance of the molecular mechanisms creating or maintaining microbial diversity. The tools are not specific to a particular habitat or group of organisms and thus can be broadly used to advance our understanding of microbial evolution in different settings. In particular, the comparative whole-genome analysis of 24 Escherichia isolates from various habitats, including human and non-human associated habitats such as freshwater ecosystems and beaches, showed that organisms with more similar ecologies tend to exchange more genes, which has important implications for the prokaryotic species concept. To more directly test these findings from isolates and quantify the patterns of genetic exchange among co-occurring populations, three years of time-series metagenomics data from planktonic samples from Lake Lanier (Atlanta, GA) were analyzed. For this, it was first important to develop bioinformatics algorithms to robustly assemble population genomes from complex community metagenomes, identify the phylogenetic affiliation of assembled genome and contig sequences, and detect horizontal gene transfer among these sequences. 
Using these novel algorithms, in situ bacterial lineage evolution was quantitatively assessed, especially with respect to whether or not ecologically distinct lineages evolve according to the recently proposed fragmented speciation model (Retchless and Lawrence, Science 2008). Evidence in support of this model was rarely observed. Instead, it appeared that rampant HGT disseminated ecologically important genes within the population, maintaining intra-population diversity. By expanding the previous approaches to include methods to assess differential gene abundance and selection pressure between samples, it was possible to quantify how soil microbial communities respond to a decade of warming by 2 °C, which simulated the predicted effects of climate change. It was found that the heated communities showed significant shifts in composition and predicted metabolism, reflecting the release of additional soil carbon compared to the unheated (control) communities, and these shifts were community-wide as opposed to being attributable to a few taxa. These findings indicated that the microbial communities of temperate grassland soils play important roles in mediating the feedback responses to climate change. Collectively, the findings presented here advance our understanding of the modes and tempo of microbial community adaptation to environmental perturbations and have important implications for better modeling the microbial diversity on the planet. The bioinformatics algorithms and approaches developed as part of this thesis are expected to facilitate future genomic and metagenomic studies across the fields of microbiology, ecology, evolution and engineering. Biodiversity Bacterial community Metagenomics Microbial genomics Microbial diversity Microbial ecology Bacterial diversity--Genetic aspects Bacterial diversity
