Global ETD Search

1	Strategy for prokaryotic genome sequencing Jiang, Jingwei., 江经纬. January 2011 Prokaryotes are single-celled microorganisms that are further classified into bacteria and archaea. Their genetic material is spread through the cytoplasm rather than enclosed in a nucleus. Unlike in eukaryotes, a high percentage of a prokaryotic genome is composed of genes, and prokaryotes evolve differently from eukaryotes. Prokaryotes are found almost everywhere, including the harshest environments on Earth. A complete picture of their genomes will aid new enzyme discovery, the decoding of drug resistance, biofuel development, and related work. High-throughput sequencing technology is becoming increasingly popular in sequencing projects. With different platforms, scientists can obtain millions to billions of sequences within days. In the last two years, many prokaryotic genomes have been sequenced on these platforms. However, only 1,542 complete chromosomes were available in NCBI GenBank (September 2011) since the first complete genome of Bacillus subtilis was published in 1997. The most difficult step in finishing a complete genome is closing all gaps between contigs. In this thesis, a series of comprehensive simulation studies based on the 1,542 complete chromosomes was performed in search of a cost-effective way to obtain complete prokaryotic genomes. Computer simulation provided solutions for both draft and complete genome sequencing. Moreover, classification studies were performed to identify any prokaryotic phyla or orders that do not fit the proposed strategies. Our results indicate that: 1) low-coverage (6x-10x) pyrosequencing with long reads (400 bp) is sufficient to produce highly continuous and complete assemblies with only a tiny proportion of false gene duplication/loss. 
High-quality draft genomes could be generated by this strategy; 2) long repeats influence assembly quality to some extent, especially genome coverage and contig number. The number of contigs and the genome coverage rate are significantly correlated with the total size of repeat regions; 3) with a combination of one run of single-end reads (10x, 400 bp read length) and one run of paired-end reads (10x, 8 kb library, 400 bp read length), ~90% of chromosome assemblies have fewer than 10 scaffolds and ~95% have fewer than 150 contigs. Most chromosomes can be assembled into high-quality draft chromosomes (on average <50 contigs, ~4 scaffolds, >370 kb contig N50 size, >99.99% single-base accuracy and <0.5% false gene duplication/loss rate); 4) similar patterns found in both simulated and real reads imply that our simulation analysis does not overestimate performance; 5) greater attention is needed for the orders Thiotrichales, Enterobacteriales and Nostocales when applying the above strategies to complete genome sequencing; 6) for prokaryotic species with multiple chromosomes, Pulsed-Field Gel Electrophoresis is needed to separate the chromosomes, which are then individually collected by electroelution prior to draft/complete genome sequencing. A comprehensive computer simulation study based on 1,542 chromosomes (all available prokaryotic complete chromosomes, September 2011) has been performed in this thesis. The sequencing strategies for both draft and complete prokaryotic genomes proposed by the simulation study could facilitate ongoing prokaryotic complete genome sequencing projects. / published_or_final_version / Biological Sciences / Doctoral / Doctor of Philosophy Prokaryotes - Genetics. Gene mapping.
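The coverage and contig N50 figures quoted in the abstract above can be illustrated with a minimal sketch. The function names and the sample numbers are hypothetical and are not taken from the thesis; the code only shows the standard definitions of sequencing depth and N50.

```python
def coverage_depth(num_reads, read_length, genome_size):
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def n50(contig_lengths):
    """Return the contig N50.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical example: 100,000 reads of 400 bp on a 4 Mb chromosome.
    print(coverage_depth(100_000, 400, 4_000_000))
    # Hypothetical contig lengths in bp.
    print(n50([100, 200, 300, 400]))
```

Under these definitions, a 10x run of 400 bp reads on a 4 Mb chromosome corresponds to about 100,000 reads.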
2	Towards in silico detection and classification of prokaryotic Mobile Genetic Elements Lima Mendez, Gipsi 07 January 2008 Bacteriophage genomes show pervasive mosaicism, indicating that horizontal gene exchange plays a crucial role in their evolution. Phage genomes represent unique combinations of modules, each with a different phylogenetic history. Thus, a web-like rather than a hierarchical scheme is needed to represent phage evolutionary relationships appropriately. Part of the virology community has long recognized this fact and calls for changing the traditional taxonomy, which classifies tailed phages according to the type of genetic material and the morphology of the phage tail and head/capsid. Moreover, because it is based on morphological features, the current system depends on inspection of phage virions under the electron microscope and cannot directly classify prophages. In the genomic era, many phages have been sequenced without being classified, which calls for an automatic classification procedure that can keep up with the pace of sequencing. The ACLAME database provides a classification of phage proteins into families and assigns the families with at least 3 members to one or several functions. In the first contribution of this work, the relative contribution of those different protein families to the similarities between phages is assessed using pair-wise similarity matrices. The modular character of phage genomes is readily visualized using heatmaps, which differ depending on the function of the proteins used to measure the similarity. Next, I propose a framework that allows for a reticulate classification of phages based on gene content (with statistical assessment of the significance of the number of shared genes). Starting from gene/protein families, we built a weighted graph in which nodes represent phages and edges represent phage-phage similarities in terms of shared families. 
The topology of the network shows that most dsDNA phages form an interconnected group, confirming that dsDNA phages share a common gene pool, as proposed earlier. Temperate and virulent phages differ in the values of several centrality measures, which may correlate with different constraints on rampant recombination dictated by the phage lifestyle, and thus with a distinct evolutionary role in the phage population. To this graph I applied a two-step clustering method to generate a fuzzy classification of phages. Using this methodology, each phage is associated with a membership vector, which quantitatively characterizes the membership of the phage in the clusters. Alternatively, genes were clustered based on their ‘phylogenetic profiles’ to define ‘evolutionary cohesive modules’. Phages can then be described as composites of a set of modules drawn from the collection of modules of the whole phage population. The relationships between phages define a network based on module sharing. Unlike the first network, built from a statistically significant number of shared genes, this second network allows for a direct exploration of the nature of the functions shared between the connected phages. This functionality of the module-based network comes at the expense of missing links due to genes that are not part of modules but are encoded in the first network. These approaches can easily focus on pre-defined modules for tracing one or several traits across the population. They provide an automatic and dynamic way to study relationships within the phage population. Moreover, they can be extended to the representation of populations of other mobile genetic elements or even to the entire mobilome. Finally, to enrich the phage sequence space, which in turn allows for a better assessment of phage diversity and evolution, I devise a prophage prediction tool. 
With this methodology, approximately 800 prophages are predicted in 266 of the 800 replicons screened. Comparison of a subset of these predictions with a manually annotated set shows a sensitivity of 79% and a positive predictive value of 91%, the latter value suggesting that the procedure makes few false predictions. Preliminary analysis of the predicted prophages indicates that many may constitute novel phage types. This work allows tracing guidelines for the classification and analysis of other mobile genetic elements. One can foresee that a pool of putative mobile genetic element sequences can be extracted from prokaryotic genomes and further broken down into groups of related elements and evolutionarily conserved modules. This would widen the picture of the evolutionary and functional relationships between these elements. / Doctorate in Sciences / info:eu-repo/semantics/nonPublished Exact and natural sciences Biology Bacteriophages Prokaryotes -- Genetics graph network prophage bacteriophage mobile genetic element
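The gene-content network described in the abstract above (nodes are phages, edges reflect shared protein families, with a significance test on the number shared) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: the abstract does not name the statistical test, so a hypergeometric tail probability is assumed here, and the function names, the `alpha` threshold and the toy data are all hypothetical.

```python
from itertools import combinations
from math import comb


def shared_family_pvalue(n_total, n_a, n_b, n_shared):
    """Hypergeometric P(X >= n_shared) for two phages carrying n_a and n_b
    families drawn from n_total families. The choice of test is an assumption."""
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, i) * comb(n_total - n_a, n_b - i)
        for i in range(n_shared, upper + 1)
    )
    return tail / comb(n_total, n_b)


def build_phage_graph(families_by_phage, alpha=0.05):
    """Return {(phage_a, phage_b): p_value} for pairs whose number of shared
    families is significant at the (hypothetical) threshold alpha."""
    universe = set().union(*families_by_phage.values())
    n_total = len(universe)
    edges = {}
    for a, b in combinations(sorted(families_by_phage), 2):
        fam_a, fam_b = families_by_phage[a], families_by_phage[b]
        shared = len(fam_a & fam_b)
        if shared == 0:
            continue  # no shared families, no edge
        p = shared_family_pvalue(n_total, len(fam_a), len(fam_b), shared)
        if p <= alpha:
            edges[(a, b)] = p
    return edges


if __name__ == "__main__":
    # Toy data, not from the thesis or from ACLAME.
    toy = {
        "P1": {"f1", "f2", "f3"},
        "P2": {"f2", "f3", "f4"},
        "P3": {"f5"},
    }
    print(build_phage_graph(toy, alpha=1.0))
```

Storing the p-value on each edge is one possible weighting; a transformed score would serve equally well for downstream clustering.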
3	Systems approaches to characterize phenotypic heterogeneity in bacterial populations Blattman, Sydney Borg January 2024 Gene expression heterogeneity underlies critical bacterial phenotypes including antibiotic tolerance, pathogenesis, and communication. Though microbial population heterogeneity has been appreciated for decades, we still lack a complete view of single-cell gene expression and phenotypic states. Various tools, including bulk RNA-seq and proteomics, are available for probing all genes at the population level. Conversely, fluorescent protein reporters and in situ hybridization can capture single-cell states but only for a limited number of genes. Single-cell RNA-sequencing (scRNA-seq), which can quantify expression of all genes at the resolution of individual cells, has revolutionized studies of heterogeneous eukaryotic populations. However, adaptation to bacteria has been hindered by technical barriers. This thesis describes the development of high-throughput scRNA-seq for bacteria and its application to uncover a distinct transcriptional state of rare antibiotic-tolerant cells called persisters. Chapter 2 presents prokaryotic expression profiling by tagging RNA in situ and sequencing (PETRI-seq), our novel scRNA-seq technology. I detail how PETRI-seq was optimized to overcome bacteria-specific challenges, including lack of mRNA polyadenylation, thick cell walls, and extremely low mRNA abundance. Using combinatorial indexing, PETRI-seq uniquely barcodes tens of thousands of gram-negative and/or gram-positive cells in a single experiment at low cost. In proof-of-concept experiments, we show robust discrimination of E. coli growth phases and identification of rare prophage activation in S. aureus. PETRI-seq will be broadly useful for characterizing bacterial heterogeneity in many contexts. Chapter 3 describes an expansive investigation into antibiotic persistence in E. coli. 
When a population is treated with lethal antibiotics, persisters are rare cells that can survive the exposure by assuming a relatively dormant state. Understanding the gene expression state and molecular drivers of persistence has been a longstanding goal with major potential to inform drug development and clinical practice. We have applied PETRI-seq to multiple models of E. coli persistence and discovered a distinct transcriptional state underlying this phenotype. In parallel, we used genome-wide CRISPR-interference to probe the functional contribution of every gene to the persistence phenotype. We discovered multiple driver genes and pathways. Comprehensive validation established Lon protease and YqgE as key gene products modulating translation rate, post-starvation dormancy, and persistence. Our work is a major step in defining the physiological state of persistence and the molecular processes leading cells into this state. In all, this thesis demonstrates how a new generation of systems approaches, including scRNA-seq and CRISPR-interference, enables new discoveries about long-studied phenomena. The overarching approach is broadly applicable, with potential to inspire a wide range of microbiology studies. Molecular biology Bacteriology Gene expression Phenotype Prokaryotes--Genetics Escherichia coli--Growth Escherichia coli--Genetics Antibiotics Microorganisms--Effect of antibiotics on Biology--Classification
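The combinatorial indexing mentioned in the abstract above relies on the number of possible barcode combinations being much larger than the number of cells. A minimal sketch of that arithmetic follows; the abstract gives no barcode counts, so the values of 96 barcodes per round and 3 rounds are assumptions chosen only for illustration, and the function names are hypothetical.

```python
def barcode_space(barcodes_per_round, rounds):
    """Number of distinct barcode combinations after sequential rounds."""
    return barcodes_per_round ** rounds


def expected_collision_fraction(num_cells, num_combinations):
    """Probability that a given cell shares its combination with at least
    one other cell, assuming combinations are assigned uniformly at random."""
    return 1.0 - (1.0 - 1.0 / num_combinations) ** (num_cells - 1)


if __name__ == "__main__":
    # Assumed layout: 3 rounds of 96 barcodes (not stated in the abstract).
    space = barcode_space(96, 3)
    print(space)
    # Assumed load of 30,000 cells in one experiment.
    print(expected_collision_fraction(30_000, space))
```

The collision fraction grows roughly linearly with the number of cells while it stays small, which is why adding a round of indexing is an effective way to scale up.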
