  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Recoloração convexa de grafos: algoritmos e poliedros / Convex recoloring of graphs: algorithms and polyhedra

Phablo Fernando Soares Moura 07 August 2013
In this work we study the convex recoloring problem on graphs, denoted CR. A vertex coloring of a graph G is convex if, for each assigned color d, the vertices of G with color d induce a connected subgraph. In the CR problem, given a graph G and a coloring of its vertices, we want to recolor the minimum number of vertices so that the resulting coloring is convex. The motivation for this problem comes from the study of phylogenetic trees. The problem is known to be NP-hard even when G is a path. We show that CR parameterized by the number of color changes is W[2]-hard even if the initial coloring uses only two colors. Moreover, we prove some inapproximability results for this problem. We also present an integer programming formulation for the weighted version of the problem on arbitrary graphs, and then specialize it for trees. We study the facial structure of the polytope defined as the convex hull of the integer points satisfying the constraints of the proposed ILP formulation, and present several classes of facet-defining inequalities together with the corresponding separation algorithms. We also present a branch-and-cut algorithm that we implemented for the special case of trees, and report computational results on a large number of instances representing real phylogenetic trees. The experiments show that this approach can solve instances of up to 1500 vertices in 40 minutes, comparing favorably to other approaches proposed in the literature.
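The convexity condition defined in this abstract can be checked directly. The following is a minimal sketch, not the thesis's algorithm: it only tests whether a given coloring is convex (each color class induces a connected subgraph) and does not compute an optimal recoloring. The function name `is_convex_coloring` and the adjacency-list representation are assumptions made for illustration.

```python
from collections import deque


def is_convex_coloring(adj, color):
    """Return True if every color class induces a connected subgraph.

    adj   -- dict mapping each vertex to a list of its neighbours
    color -- dict mapping each vertex to its color
    """
    # Group the vertices by color.
    classes = {}
    for vertex, c in color.items():
        classes.setdefault(c, set()).add(vertex)

    # Breadth-first search inside each color class only.
    for members in classes.values():
        start = next(iter(members))
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in members and w not in seen:
                    seen.add(w)
                    queue.append(w)
        if seen != members:
            return False
    return True


# Illustrative path a - b - c - d (hypothetical instance).
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
convex = {"a": 1, "b": 1, "c": 2, "d": 2}
not_convex = {"a": 1, "b": 2, "c": 1, "d": 2}
```

On the path above, the first coloring is convex, while the second is not, because the vertices of color 1 are separated by a vertex of color 2.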
32

Bioinformatický nástroj pro klasifikaci bakterií do taxonomických kategorií na základě sekvence genu 16S rRNA / Bioinformatic Tool for Classification of Bacteria into Taxonomic Categories Based on the Sequence of 16S rRNA Gene

Valešová, Nikola January 2019
This thesis deals with the automated classification and recognition of bacteria after their DNA has been obtained by sequencing. A new classification method based on the 16S rRNA segment is proposed and described. The presented approach follows the tree structure of taxonomic categories and uses well-known machine learning algorithms to classify bacteria into one of the classes at the next lower taxonomic level. The thesis also includes an implementation of the described algorithm and an evaluation of its prediction accuracy. The classification accuracy of different classifier types and their settings is examined, and the setting that achieves the best results is determined. The accuracy of the implemented algorithm is also compared with several existing methods. During validation, the implemented application KTC achieved more than 45% accuracy in genus prediction on both the BLAST 16S and BLAST V4 datasets. Finally, several possible improvements and extensions of the current implementation are mentioned.
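The tree-structured classification idea in this abstract, where a classifier at each taxonomic node picks one child at the next lower level, can be sketched as a top-down descent. This is only an illustration under stated assumptions: the node layout, the names `classify_top_down`, `leaf` and `gc_content`, the placeholder group names, and the threshold rules standing in for trained classifiers are all hypothetical and are not taken from the KTC implementation.

```python
def gc_content(seq):
    """Fraction of G and C bases; a toy feature assumed for this sketch."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def leaf(name):
    """Build a taxonomy node without children."""
    return {"name": name, "classifier": None, "children": {}}


def classify_top_down(node, features):
    """Descend the taxonomy, letting each node's classifier choose a child."""
    path = [node["name"]]
    while node["children"]:
        chosen = node["classifier"](features)
        node = node["children"][chosen]
        path.append(node["name"])
    return path


# Hypothetical two-level taxonomy; the lambdas are placeholders for
# trained machine learning models at each node.
taxonomy = {
    "name": "Bacteria",
    "classifier": lambda gc: "high_gc_group" if gc >= 0.5 else "low_gc_group",
    "children": {
        "high_gc_group": leaf("high_gc_group"),
        "low_gc_group": {
            "name": "low_gc_group",
            "classifier": lambda gc: "genus_x" if gc < 0.4 else "genus_y",
            "children": {"genus_x": leaf("genus_x"), "genus_y": leaf("genus_y")},
        },
    },
}
```

A call such as `classify_top_down(taxonomy, gc_content("ATATGC"))` returns the list of node names visited from the root down to the predicted leaf.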
33

Techniky pro zarovnávání skupin biologických sekvencí / Techniques for Multiple Sequence Alignments

Hrazdil, Jiří January 2009
This thesis summarizes how biological sequences are represented and the file formats used for sequence exchange and storage. The next part deals with techniques for pairwise sequence alignment, followed by the extension of these techniques to the problem of multiple sequence alignment. Additional methods are introduced that are suboptimal but compute results in reasonable time. The practical part of the thesis consists of implementing a multiple sequence alignment application in the Java programming language.
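Pairwise alignment, the building block mentioned in this abstract, is commonly done with the Needleman-Wunsch dynamic programming algorithm. The sketch below is an illustration in Python rather than the thesis's Java code, and the abstract does not say which pairwise method the thesis uses. The scoring values (`match=1`, `mismatch=-1`, `gap=-2`) are assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Progressive multiple alignment methods typically reuse a pairwise routine like this one to build a guide tree and then merge alignments, which is one way to trade optimality for running time.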
34

Phylogenetic analysis of aquatic microbiomes : Evolution of the brackish microbiome

Deng, Ziling January 2020
Microorganisms play crucial roles in aquatic environments in determining ecosystem stability and driving the turnover of elements essential to life. Understanding the distribution and evolution of aquatic microorganisms will help us predict how aquatic ecosystems will respond to Global Change, and such understanding can be gained by studying these processes of the past. In this project, we investigate the evolutionary relationship between brackish water bacteria from the Baltic Sea and Caspian Sea with freshwater and marine bacteria, with the goal of understanding how brackish water bacteria have evolved. 11,276 bacterial metagenome-assembled genomes (MAGs) from seven metagenomic datasets were used to conduct a comparative analysis of freshwater, brackish and marine bacteria. When clustering the genomes by pairwise average nucleotide identity (ANI) at the approximate species level (96.5% ANI), the Baltic Sea genomes were more likely to form clusters with the Caspian Sea genomes than with Swedish lakes genomes, even though geographic distances between Swedish lakes and the Baltic Sea are much smaller. Phylogenomic analysis and ancestral state reconstruction showed that approximately half of the brackish MAGs had freshwater ancestors and half had marine ancestors. Phylogenetic distances were on average shorter to freshwater ancestors, but when subsampling the tree to the same number of freshwater and marine MAG clusters, the distances were not significantly different. Brackish genomes belonging to Acidimicrobiia, Actinobacteria and Cyanobacteriia tended to originate from freshwater bacteria, while those of Alphaproteobacteria and Bacteroidia mainly had evolved from marine bacteria.
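Clustering genomes at a pairwise ANI threshold, as described in this abstract, can be illustrated with single-linkage grouping via union-find. This is a sketch under assumptions: the abstract does not state which clustering tool or linkage rule was used, and the genome names and ANI values below are invented for the example. Only the 96.5% threshold comes from the abstract.

```python
def cluster_by_ani(genomes, ani, threshold=96.5):
    """Single-linkage clustering: join genomes whose pairwise ANI >= threshold.

    genomes -- list of genome identifiers
    ani     -- dict mapping (genome_a, genome_b) pairs to ANI in percent
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster representative (path halving).
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), value in ani.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(sorted(members) for members in clusters.values())


# Invented values, for illustration only.
genomes = ["baltic_1", "caspian_1", "lake_1"]
ani = {
    ("baltic_1", "caspian_1"): 97.2,
    ("baltic_1", "lake_1"): 88.0,
    ("caspian_1", "lake_1"): 87.5,
}
```

With these made-up numbers, the Baltic and Caspian genomes fall into one cluster and the lake genome stays alone, mirroring the kind of pattern the abstract reports.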
35

Genome-scaled molecular clock studies of invasive mosquitoes and other organisms of societal relevance

Zadra, Nicola 21 April 2022
Molecular dating (or molecular clock) is a powerful technique that uses the mutation rate of biomolecules to estimate divergence times among organisms. In the last two decades, the theory behind the molecular clock has been intensively developed, and it is now possible to employ sophisticated evolutionary models on genome-scaled datasets in a Bayesian framework. The molecular clock has been successfully applied to virtually all types of organisms and molecules to estimate timing of speciation, timing of gene duplications, and generation times: this knowledge allows contextualizing past and present events in the light of (paleo)ecological scenarios. Molecular clock studies are routinely used in evolutionary and ecological studies, but their use in applied fields such as agricultural and medical entomology is still scarce, in particular because of a paucity of genome data. Genome-scaled clocks have been successfully applied, for example, to various model organisms such as Anopheles and Drosophila, as well as to the invasive mosquitoes Aedes aegypti and Aedes albopictus. Many other invasive pests are emerging worldwide, aided by global trade, increased connectivity among countries, lack of prevention, and flawed invasive species management. Among them are Aedes koreicus and Aedes japonicus, two invasive mosquito species that are monitored for public health reasons because they harbour human pathogenic viruses. For these, as well as for other insects of societal relevance, such as the parasitoid Trissolcus japonicus, there is a paucity of gene markers and no genome data for large-scale molecular clock studies. Invasive pests are typically studied using microevolutionary approaches that tackle events at an intraspecific level: these approaches provide important information for pest management, for example, by revealing invasion routes and insecticide resistances. 
Approaches that tackle the deep-time evolution of the pest, such as the molecular clock, are instead less used in pest science. Many important traits associated with invasiveness have evolved by speciation over a long time frame: the molecular clock can reveal the paleo-ecological conditions that favoured these traits, supporting a better understanding of pest biology. The molecular clock, when coupled with phylogenomics, can further identify genes and patterns that characterize the pest: this knowledge can be used to enhance management practices. Although this is a data-driven thesis, its major aim is to provide new results to demonstrate the utility of the molecular clock in pest science. This has been done by systematically applying the molecular clock to various neglected organisms of medical and agricultural relevance. To this aim, I generated new genome data and/or assembled the largest genome-scaled data to date. I studied the molecular clock in mosquitoes, focusing on the Aedini radiation (Chapter 2), and identified a strong incongruence between the mitochondrial and nuclear phylogeny with respect to their molecular clock. This result highlighted the importance of employing genome-scaled data for these species to exclude stochastic effects due to poor/inaccurate sampling in clock studies. To tackle the absence of data, I further assembled the whole mitogenome of the emerging invasive species Aedes koreicus and Aedes japonicus with the aim of producing useful data for molecular typing and of inferring divergence estimates using whole mitogenomes (Chapter 3). Dated phylogenies point toward more recent diversification of Aedini and Culicini compared to estimates from previous works, addressing the issue of taxon sampling sensitivity in dated phylogeny. Although it is possible to perform molecular clock studies on single/few gene markers, the current trend is to couple this methodology with genome-scaled datasets to reduce the stochastic effect of using few genes. 
For this reason, I sequenced the draft genome of A. koreicus and A. japonicus (Chapter 4). The assemblies were extremely fragmented, highlighting the problem of sequencing large genomes using short reads. The assemblies provided, however, enough information for genome skimming, allowing extraction of BUSCO genes for downstream analyses, whole mitogenome assemblies (used in Chapter 3), and characterisation of the associated metagenome. These data need to be integrated with long reads; they provide, however, a first framework to investigate the genome evolution of these species. I further sequenced and assembled the genome of Trissolcus japonicus, the parasitoid wasp of the invasive pest Halyomorpha halys. To elucidate its divergence estimate and define an intraspecific typing system to differentiate strains for biocontrol strategies, I reconstructed the mitochondrial genomes of two populations: the mitogenomes were surprisingly identical, suggesting that they belong to the same de facto population. I further provide a detailed clock investigation of Zika, a virus harboured and transmitted by some Aedes species (Chapter 5). Using the largest set of genomes to date, I could set the origin of ZIKV in the Middle Ages and its first diversification in the mid-19th century. From a methodological point of view, the clocking of this virus highlighted the importance of checking for recombination and for cell passages to obtain correct divergence estimates. I finally show my contributions to molecular clock studies of three other invasive species (Chapter 6): I helped disentangle the divergence times of Bactrocera, a genus of invasive fruit flies that are agricultural pests; I contributed to a phylogenomic study of opsin genes in Diptera; I used chloroplast and nuclear genome data to reconstruct the divergences of the invasive reed Arundo. 
In the various Chapters of my thesis, I highlighted the limits and the problems of current molecular clock methodologies and identified the best practices for different types of organisms in order to develop a cross-discipline understanding of the molecular clock techniques. The various results presented in this thesis further demonstrate the utility of the molecular clock approach in pest studies.
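The basic idea behind molecular dating can be shown with the simplest strict-clock relation, T = d / (2r), where d is the genetic distance between two lineages and r is the substitution rate per lineage. This is only a back-of-the-envelope sketch. The thesis describes Bayesian analyses on genome-scaled data, which are far more elaborate, and the function name and the numbers below are assumptions made for illustration.

```python
def strict_clock_divergence_time(distance, rate):
    """Estimate divergence time under a strict molecular clock.

    distance -- pairwise genetic distance (substitutions per site)
    rate     -- substitution rate per lineage (substitutions per site per unit time)

    The factor 2 accounts for substitutions accumulating independently
    along both lineages since their common ancestor.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Invented values: 0.02 substitutions/site, 0.001 substitutions/site/Myr.
example_time = strict_clock_divergence_time(0.02, 0.001)
```

With these illustrative inputs the estimate is about 10 million years. Real analyses also need calibrations and a model of rate variation among lineages.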
36

New hypotheses about the origin of Pseudomonas syringae crop pathogens

Cai, Rongman 31 May 2012
Pseudomonas syringae is a common foliar plant pathogenic bacterium that causes diseases on many crop plants. We hypothesized that today's highly virulent P. syringae crop pathogens with narrow host range might have evolved after the advent of agriculture from ancestral P. syringae strains with wide host range that were adapted to mixed plant communities. The model tomato and Arabidopsis pathogen P. syringae pv. tomato (Pto) DC3000 and its close relatives isolated from crop plants were thus selected to unravel basic principles of host range evolution by applying molecular evolutionary analysis and comparative genomics approaches. Phylogenetic analysis was combined with host range tests to reconstruct the host range of the most recent common ancestor of all analyzed strains isolated from crop plants. Even though reconstruction of the host range of the most recent common ancestor of all analyzed strains was not conclusive, support for this hypothesis was found in some sub-groups of strains. The focus of my studies then turned to Pto T1, which was found to represent the most common P. syringae lineage causing bacterial speck disease on tomato worldwide. Five genomes were sequenced and compared to each other. Identical genotypes were found in North America and Europe, suggesting frequent pathogen movement between these continents. Moreover, the type III-secreted effector gene hopM1 was found to be under strong selection for loss of function, and non-synonymous mutations in the fliC gene made it possible to identify a region that triggers plant immunity. Finally, Pto T1 was compared to closely related bacteria isolated from snow pack and surface water in the French Alps. Recombination between alpine strains and crop strains was inferred, and virulence gene repertoires of alpine strains and crop strains were found to overlap. Alpine strains cause disease on tomato and have relatively wider host ranges than Pto T1. 
The conclusion from these studies is that Pto T1 and other crop pathogens may have evolved from ancestors similar to the characterized environmental strains isolated in the French Alps by adapting their effector repertoire to individual crops becoming more virulent on these crops but losing virulence on other plants. / Ph. D.
37

Modules réactionnels : un nouveau concept pour étudier l'évolution des voies métaboliques / Reaction modules : a new concept to study the evolution of metabolic pathways

Barba, Matthieu 16 December 2011
I designed a methodology to annotate enzyme superfamilies, explain their history and place them in the context of the evolution of their metabolic pathways. Three superfamilies were studied: (1) cyclic amidohydrolases, including DHOases (dihydroorotases, third step of pyrimidine biosynthesis), for which I proposed a new classification. The phylogenetic tree also includes dihydropyrimidinases (DHPases) and allantoinases (ALNases), which catalyze similar reactions in other pathways (pyrimidine and purine degradation, respectively). (2) The DHODase superfamily (the enzymes that follow DHOases) shows a phylogeny similar to that of the DHOases and also includes enzymes from other pathways, DHPDases in particular (which follow DHPases). This observation led to the concept of the reaction module, i.e. a conserved series of similar reactions in different metabolic pathways. The concept was used to study (3) the carbamoyltransferases (TCases), which include ATCases (the enzymes that precede DHOases). I first identified a new kind of TCase, potentially involved in purine degradation, and proposed a new role for it in the light of reaction modules (linked with ALNase). In these three superfamilies I also found three groups of unidentified paralogs that nevertheless occur in the same genetic context, called “Yge”, and would therefore form a reaction module that is part of a new hypothetical pathway. Applied to various pathways, the concept of reaction modules may thus reflect the ancestral metabolic pathways of which they would be the basic elements.
38

Structuration des communautés de fourmis de la litière en forêt guyanaise / Organization of leaf litter ant communities in french guianese forest

Fichaux, Mélanie 26 September 2018
The overall aim of this thesis is to determine the role of competitive exclusion, environmental filtering and dispersal limitation in the distribution of leaf-litter ant species in French Guianese forest. To this end, we evaluated how the diversity of ant assemblages varies along environmental and geographic gradients, considering the three facets of diversity (i.e. taxonomic, phylogenetic and functional) at different spatial scales. Observed patterns of functional and phylogenetic structure lower than expected by chance suggest that environmental filtering acts on the distribution of ant species at the scale of the sampled site. In contrast, the hypothesis of functional and/or phylogenetic overdispersion between locally co-occurring species, resulting from the exclusion of similar species, is not supported by our results. At the regional scale, our results show that ant communities are strongly structured by environmental variation. Spatial distance also influences the distribution of ant species throughout the region. Taken together, our results suggest that environmental filtering is the main driver structuring assemblages of ant species in French Guianese rainforest, at both local and regional scales. Species are distributed in a patchy way throughout the region, in response to environmental variation. Patterns of diversity are also influenced by spatial distance at the regional scale, leading to a turnover in the species composition of ant assemblages between distant localities.
39

Comparaison de différentes méthodes de classification : application aux langues bantu du nord-ouest / New approaches in linguistic classification : application to Northwestern Bantu languages

Grollemund, Rebecca 17 September 2012
This dissertation presents a linguistic classification based on phylogenetic methods borrowed from biology. The languages considered belong to the Bantu family, a sub-branch of the Niger-Congo languages spoken in Africa. Numerous publications have shown the complexity and diversity of the Bantu languages. Our study focuses on the North-West region, which includes the following countries: Cameroon, Equatorial Guinea, Gabon, Congo and the Democratic Republic of Congo. This new classification is based on the comparison of lexical items. We built a database of 100 basic-vocabulary words for the 207 languages selected. Several trees were generated using the Neighbor-Joining (Saitou and Nei, 1987) and Neighbor-Net (Bryant and Moulton, 2004) algorithms. This study gives a better understanding of the linguistic proximity between the languages spoken in this region. The analysis of the classification also suggests a scenario for the migrations of the Bantu languages.
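The core step of the Neighbor-Joining algorithm cited in this abstract is choosing which two taxa (here, languages) to join next. Given n taxa and a distance matrix d, the standard criterion computes Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k) and joins the pair with the smallest Q. The sketch below shows only this selection step. The labels and distances are invented, and how the thesis derives lexical distances from its 100-word lists is not stated in the abstract.

```python
from itertools import combinations


def nj_pick_pair(names, dist):
    """Return the pair minimizing the Neighbor-Joining Q criterion and its Q value.

    names -- list of taxon (language) labels
    dist  -- dict mapping frozenset({a, b}) to the distance between a and b
    """
    n = len(names)

    def d(x, y):
        return 0.0 if x == y else dist[frozenset((x, y))]

    # Total distance from each taxon to all others.
    row_sum = {x: sum(d(x, y) for y in names) for x in names}

    def q(pair):
        i, j = pair
        return (n - 2) * d(i, j) - row_sum[i] - row_sum[j]

    best = min(combinations(names, 2), key=q)
    return best, q(best)


# Hypothetical distances between five languages labelled A to E.
names = ["A", "B", "C", "D", "E"]
raw = {
    ("A", "B"): 5, ("A", "C"): 9, ("A", "D"): 9, ("A", "E"): 8,
    ("B", "C"): 10, ("B", "D"): 10, ("B", "E"): 9,
    ("C", "D"): 8, ("C", "E"): 7, ("D", "E"): 3,
}
dist = {frozenset(pair): value for pair, value in raw.items()}
```

Note that the pair with the smallest raw distance (D and E) is not the one selected here: the criterion corrects each distance by how far both taxa are from everything else, so A and B are joined first.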
40

Combining approaches for predicting genomic evolution / Combinaison d'approches pour résoudre le problème du réarrangement de génomes

Alkindy, Bassam 17 December 2015
In bioinformatics, understanding how DNA molecules have evolved over time remains an open and complex problem. Algorithms have been proposed to solve it, but they are limited either to the evolution of a given character (for example, a specific nucleotide) or, conversely, to large nuclear genomes (several billion base pairs) that have undergone multiple recombination events. Because the problem is NP-complete when the set of all possible operations on these sequences is considered, no solution exists at present. In this thesis, we tackle the reconstruction of ancestral DNA sequences by focusing on nucleotide chains of intermediate size that have experienced relatively little recombination over time: chloroplast genomes. We show that at this scale the ancestral reconstruction problem can be solved, even when all currently available complete chloroplast genomes are considered. We focus specifically on ancestral gene order and content, and on the technical problems this reconstruction raises in the case of chloroplasts. We show how to obtain a prediction of the coding sequences of sufficient quality to allow this reconstruction, and then how to obtain a phylogenetic tree in agreement with the largest possible number of genes, on which we can base our journey back in time; the latter is being finalized. These methods, which combine existing tools (whose quality has been assessed) with high-performance computing, artificial intelligence and biostatistics, were applied to a collection of more than 450 chloroplast genomes.
