1 |
Ancestral Genome Reconstruction in Bacteria
Yang, Kuan 25 June 2012
The rapid accumulation of sequenced genomes has provided a golden opportunity for ancestral state reconstruction studies, especially at the whole-genome level. However, most ancestral genome reconstruction methods developed so far focus only on gene or replicon sequences rather than whole genomes. They rely largely on either detailed modeling of evolutionary events or edit distance computation, both of which can be computationally prohibitive for large data sets. Hence, most of these methods can be applied only to a small number of features and species. In this dissertation, we describe the design, implementation, and evaluation of an ancestral genome reconstruction system (REGEN) for bacteria. It is the first bacterial genome reconstruction tool that focuses on ancestral state reconstruction at the genome scale instead of the gene scale. It reconstructs not only ancestral gene content and contiguous gene runs, using either a maximum parsimony or a maximum likelihood criterion, but also the replicon structure of each ancestor. Based on the reconstructed genomes, it can infer all major events at both the gene scale, such as insertion, deletion, and translocation, and the replicon scale, such as replicon gain, loss, and merge. REGEN finishes by producing a visual representation of the entire evolutionary history of all genomes in the study. With a model-free reconstruction method at its core, the computational requirement for ancestral genome reconstruction is reduced sufficiently for the tool to be applied to large data sets with dozens of genomes and thousands of features. To achieve as accurate a reconstruction as possible, we also develop a homologous gene family prediction tool for preprocessing. Furthermore, we build our in-house Prokaryote Genome Evolution simulator (PEGsim) for evaluation purposes.
The homologous gene family prediction refinement module refines the predictions generated by third-party de novo prediction programs by combining phylogeny and local gene synteny. We show that such refinement can be accomplished for up to 80% of homologous gene family predictions with ambiguity (mixed families). The genome evolution simulator, PEGsim, is the first random-event-based, high-level bacterial genome evolution simulator with models for all common evolutionary events at the gene, replicon, and genome scales. The concepts of conserved gene runs and horizontal gene transfer (HGT) are also built in. We present the validation of PEGsim itself and the evaluation of the last reconstruction component using simulated data produced by PEGsim. REGEN, REconstruction of GENomes, is an ancestral genome reconstruction tool based on the concept of neighboring gene pairs (NGPs). Although it does not cover the reconstruction of actual nucleotide sequences, it is capable of reconstructing the gene content, contiguous gene runs, and replicon structure of each ancestor using either a maximum parsimony or a maximum likelihood criterion. Based on the reconstructed genomes, it can infer all major events at both the gene scale, such as insertion, deletion, and translocation, and the replicon scale, such as replicon gain, loss, and merge. REGEN finishes by producing a visual representation of the entire evolutionary history of all genomes in the study. / Ph. D.
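The abstract names a maximum parsimony criterion but does not spell out the procedure REGEN uses. As a rough illustration only, the sketch below assumes Fitch's small-parsimony algorithm applied to a single binary character (1 = gene family present, 0 = absent) on a rooted binary tree. The tree, the genome names, and the function name `fitch_sets` are hypothetical and are not taken from the dissertation.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name (str) or a pair (left, right) of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_sets(tree: Tree, leaf_state: Dict[str, int]) -> Tuple[Set[int], int]:
    """Bottom-up Fitch pass for one binary character (1 = gene present).

    Returns the candidate state set at the root of `tree` and the minimum
    number of state changes (gains or losses) needed within that subtree.
    """
    if isinstance(tree, str):
        return {leaf_state[tree]}, 0
    left_set, left_cost = fitch_sets(tree[0], leaf_state)
    right_set, right_cost = fitch_sets(tree[1], leaf_state)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-genome tree and presence/absence of one gene family.
toy_tree: Tree = (("A", "B"), ("C", "D"))
toy_presence = {"A": 1, "B": 1, "C": 0, "D": 1}
root_states, changes = fitch_sets(toy_tree, toy_presence)
```

On this toy input the root is inferred to carry the gene and a single loss explains the data. A neighboring gene pair could in principle be scored the same way by treating its presence as the binary character, though that is an assumption here rather than a description of REGEN.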
|
2 |
Reconstruction conjointe de l’ordre des gènes de génomes actuels et ancestraux et de leur évolution structurale dans un cadre phylogénétique / Joint reconstruction of ancestral and extant genome structure in a phylogenetic framework
Anselmetti, Yoann 29 November 2017
The early 2000s saw the emergence of high-throughput sequencing technologies that brought down the time and cost of sequencing complete genomes and opened the way to analyses of species phylogeny at the whole-genome scale. With this in mind, methods have been developed to infer the evolutionary history of the order of genomic markers along a phylogeny. However, the assemblies of most large eukaryotic genomes remain incompletely resolved and therefore cannot, as such, be used to reconstruct the evolutionary history of gene order in these species. It is in this context that we developed the ADseq algorithm, which jointly reconstructs the evolutionary history of gene order while accounting for the fragmentation of extant genomes, and improves the assembly of those genomes through comparative genomics.
|
3 |
The Design, Implementation and Application of a Computational Pipeline for the Reconstruction of the Gene Order on the Chromosomes of Very Ancient Ancestral Species
Xu, Qiaoji 11 September 2023
This thesis presents a novel approach to reconstructing the ancestral genomes of a number of descendant species related by a phylogeny. Traditional methods face challenges due to cycles of whole genome doubling followed by fractionation in plant lineages. In response, the thesis proposes a new approach that first accumulates a large number of candidate gene adjacencies specific to each ancestor in a phylogeny. A subset of these that produces long ancestral contigs is then chosen through maximum weight matching. The strategy results in more complete reconstructions than existing methods, and a number of quality measures are deployed to assess the results.
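The adjacency selection step can be illustrated with a small sketch. It assumes, as is common in adjacency-based reconstruction but not stated in the abstract, that the matching is over gene extremities (head `_h` and tail `_t`), so that each extremity joins at most one neighbor and the chosen adjacencies chain genes into contigs. The solver below is an exhaustive search suitable only for toy inputs; a real pipeline would use a polynomial-time matching algorithm. All names and weights are hypothetical.

```python
from functools import lru_cache
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]


def max_weight_matching(weights: Dict[Edge, float]) -> Tuple[float, List[Edge]]:
    """Exact maximum weight matching by exhaustive search (toy sizes only)."""
    vertices = sorted({v for edge in weights for v in edge})
    # Symmetric lookup: neighbours[u][v] is the weight of adjacency u-v.
    neighbours: Dict[str, Dict[str, float]] = {v: {} for v in vertices}
    for (u, v), w in weights.items():
        neighbours[u][v] = w
        neighbours[v][u] = w

    @lru_cache(maxsize=None)
    def solve(remaining: FrozenSet[str]) -> Tuple[float, Tuple[Edge, ...]]:
        if not remaining:
            return 0.0, ()
        first = min(remaining)
        rest = remaining - {first}
        # Option 1: leave `first` unmatched.
        best = solve(rest)
        # Option 2: match `first` with any neighbour that is still free.
        for other, w in neighbours[first].items():
            if other in rest:
                score, chosen = solve(rest - {other})
                if score + w > best[0]:
                    best = (score + w, chosen + ((first, other),))
        return best

    total, chosen = solve(frozenset(vertices))
    return total, sorted(chosen)


# Hypothetical candidate adjacencies between gene extremities, with weights
# standing in for how strongly the descendants support each adjacency.
candidates = {
    ("g1_t", "g2_h"): 5,
    ("g1_t", "g3_h"): 3,
    ("g2_t", "g3_h"): 4,
    ("g2_h", "g3_t"): 2,
}
total_weight, kept = max_weight_matching(candidates)
```

Here the two retained adjacencies, `g1_t`-`g2_h` and `g2_t`-`g3_h`, chain the genes into the single contig g1, g2, g3, while the conflicting lower-weight candidates are discarded.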
The thesis also presents a new computational technique for estimating the ancestral monoploid number of chromosomes, involving a "g-mer" analysis to resolve a bias due to long contigs and gap statistics to estimate the number. The method is applied to a set of phylogenetically related descendant species, and the monoploid number is found to be 9 for all rosid and asterid orders. Additionally, the thesis demonstrates that this result is not an artifact of the method, by deriving a monoploid number of approximately 20 for the metazoan ancestor.
The reconstructed ancestral genomes are functionally annotated and visualized through painting ancestral projections on descendant genomes and highlighting syntenic ancestor-descendant relationships. The proposed method is applied to genomes drawn from a broad range of plant orders. The Raccroche pipeline reconstructs ancestral gene orders and chromosomal contents of the ancestral genomes at all internal vertices of a phylogenetic tree, and constructs chromosomes by counting the frequencies of ancestral contig co-occurrence on the extant genomes, clustering these for each ancestor, and ordering them.
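The chromosome construction step (count co-occurrences, then cluster) can be sketched as follows. The abstract does not specify the clustering method, so this example substitutes a deliberately simple stand-in: contigs are grouped when they share an extant chromosome at least `min_count` times, using a union-find structure. The function names, the threshold, and the input data are all hypothetical, and the final ordering step is not shown.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Set


def cooccurrence_counts(chromosomes: Iterable[Set[str]]) -> Counter:
    """Count, for each contig pair, the extant chromosomes carrying both."""
    counts: Counter = Counter()
    for contigs in chromosomes:
        for pair in combinations(sorted(contigs), 2):
            counts[pair] += 1
    return counts


def cluster_by_threshold(counts: Counter, min_count: int) -> List[List[str]]:
    """Group contigs linked by pairs seen together at least `min_count` times."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), n in counts.items():
        # Register both contigs even if the pair is too weak to merge them.
        root_a, root_b = find(a), find(b)
        if n >= min_count:
            parent[root_a] = root_b

    groups: Dict[str, List[str]] = {}
    for contig in parent:
        groups.setdefault(find(contig), []).append(contig)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical input: the ancestral contigs found on each extant chromosome.
extant_chromosomes = [
    {"c1", "c2"},
    {"c1", "c2", "c3"},
    {"c3", "c4"},
    {"c3", "c4"},
]
pair_counts = cooccurrence_counts(extant_chromosomes)
clusters = cluster_by_threshold(pair_counts, min_count=2)
```

With a threshold of 2, the toy data separate into two putative ancestral chromosomes, one holding `c1` and `c2` and the other holding `c3` and `c4`. Contigs that never share a chromosome with another contig do not appear in the counts and would need separate handling.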
Overall, this thesis presents a significant contribution to the field of ancestral genome reconstruction, offering a new approach that produces more complete reconstructions and provides valuable insights into the evolutionary process giving rise to the gene content and order of extant genomes.
|
4 |
Taxonomic and functional exploration of the biosphere of serpentinizing hydrothermal systems by metagenomics / Exploration de la diversité taxonomique et fonctionnelle de la biosphère des systèmes hydrothermaux serpentinisés
Frouin, Eléonore 17 December 2018
Serpentinizing hydrothermal systems are anoxic and enriched in $H_2$, $CH_4$ and organic molecules. These compounds support microbes that colonize serpentinizing systems, despite high pH and low concentrations of electron acceptors and dissolved inorganic carbon.
In this work, two axes were explored to study the microbial communities. On the one hand, we focused on Prony, a coastal serpentinizing site in New Caledonia, and on the other hand, we compared different serpentinizing systems to reveal taxonomic and functional similarities. At Prony, our metabarcoding analyses highlighted the importance of the rare biosphere. Moreover, 82 prokaryotic genomes were successfully reconstructed using five metagenomes from Prony. One of these genomes was phylogenetically close to the species of the genus Serpentinomonas, chemolithotrophic bacteria isolated at the serpentinizing site The Cedars that are capable of growth up to pH 12.5. These species and other phylotypes, such as taxa affiliated with Lost City Methanosarcinales, were identified in several serpentinizing sites and could contribute to the definition of a biological signature associated with serpentinization. By specifically targeting the metabolisms enriched in serpentinizing environments, we highlighted key functions associated with hydrogen metabolism and environmental stress response mechanisms. The comparison of serpentinizing metagenomes revealed the importance of a phosphonate degradation pathway based on the activity of a C-P lyase. This metabolic pathway, which plays a key role in the uptake of phosphorus and the release of organic molecules, was integrated into the ecological models of serpentinizing systems.
|