1 |
Plane Permutations and their Applications to Graph Embeddings and Genome Rearrangements
Chen, Xiaofeng 27 April 2017
Maps have been extensively studied and are important in many research fields. A map is a 2-cell embedding of a graph on an orientable surface. Motivated by a new way to read the information provided by the skeleton of a map, we introduce new objects called plane permutations. Plane permutations not only provide new insight into enumeration of maps and related graph embedding problems, but they also provide a powerful framework to study less related genome rearrangement problems.
Among our results, we refine and extend several existing results on the enumeration of maps by counting plane permutations filtered by different criteria. In the spirit of the topological, graph-theoretical study of graph embeddings, we study the behavior of graph embeddings under local changes. We obtain a local version of the interpolation theorem, a local genus distribution, and an easy-to-check necessary condition for a given embedding to be of minimum genus. Applying the plane permutation paradigm to genome rearrangement problems, we present a unified, simple framework to study transposition distances and block-interchange distances of permutations as well as reversal distances of signed permutations. The essential idea is to associate a plane permutation with a given permutation or signed permutation to be sorted, and then to apply the developed plane permutation theory. / Ph. D. / This work is mainly concerned with two problems. The first problem starts with a graph <i>G</i> consisting of vertices and lines (called edges) linking some pairs of vertices. Intuitively, if the graph <i>G</i> cannot be drawn on the sphere without crossing edges, it may be possible to draw it on a torus (i.e., the surface of a doughnut) without crossing edges; if that is still impossible, it may be possible to draw the graph <i>G</i> on the surface obtained by “gluing” several tori together. Once a graph <i>G</i> is drawn on a surface without crossing edges, there is a cyclic order of the edges incident to each vertex of the graph. Suppose you are not satisfied with how the edges around a vertex are cyclically arranged, and you want to arrange them differently. A question that arises naturally is: is the adjusted drawing still cross-free on the original surface, or do we need to glue on more (or fewer) tori for it to be cross-free? The second problem stems from genome rearrangements.
In bioinformatics, people try to understand the evolution of species by comparing the genome sequences (e.g., DNA sequences) of different species. Certain operations on genome sequences are believed to be potential ways in which species evolve. The operations studied in this work are transpositions, block-interchanges and reversals. For example, a transposition is an operation that swaps two consecutive segments of a given genome sequence. As a candidate indicator of how far one species is from another from an evolutionary perspective, we can compute how many transpositions are required to transform the genome sequence of one species into that of the other. In this work, we propose a plane permutation framework that is effective for solving the two problems mentioned above. In addition, plane permutations are interesting objects in their own right and are studied as such.
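The three operations named in this abstract can be illustrated on small permutations. The following is a minimal Python sketch using the standard textbook definitions of the operations (the abstract itself only defines the transposition); it is not the plane permutation framework of the thesis, and the function names and index conventions (0-based, half-open slices) are assumptions made for illustration.

```python
from typing import List


def transposition(perm: List[int], i: int, j: int, k: int) -> List[int]:
    """Swap the two consecutive segments perm[i:j] and perm[j:k]."""
    if not 0 <= i < j < k <= len(perm):
        raise ValueError("need 0 <= i < j < k <= len(perm)")
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def block_interchange(perm: List[int], i: int, j: int, k: int, l: int) -> List[int]:
    """Swap the segments perm[i:j] and perm[k:l], which need not be adjacent."""
    if not 0 <= i < j <= k < l <= len(perm):
        raise ValueError("need 0 <= i < j <= k < l <= len(perm)")
    return perm[:i] + perm[k:l] + perm[j:k] + perm[i:j] + perm[l:]


def reversal(signed_perm: List[int], i: int, j: int) -> List[int]:
    """Reverse signed_perm[i:j] and flip the sign of every element in it."""
    if not 0 <= i < j <= len(signed_perm):
        raise ValueError("need 0 <= i < j <= len(signed_perm)")
    flipped = [-x for x in reversed(signed_perm[i:j])]
    return signed_perm[:i] + flipped + signed_perm[j:]


if __name__ == "__main__":
    print(transposition([1, 2, 3, 4, 5, 6], 1, 3, 5))         # [1, 4, 5, 2, 3, 6]
    print(block_interchange([1, 2, 3, 4, 5, 6], 0, 2, 3, 5))  # [4, 5, 3, 1, 2, 6]
    print(reversal([1, -2, 3, 4], 1, 3))                      # [1, -3, 2, 4]
```

A transposition is the special case of a block-interchange in which the two segments are adjacent (j equals k). The corresponding distance between two permutations is the minimum number of such operations needed to turn one into the other; computing that minimum is the hard part and is not attempted here.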
|
2 |
Population Dynamics in Random Environment, Random Walks on Symmetric Group, and Phylogeny Reconstruction
Jamshidpey, Arash January 2016
This thesis concerns applications of some probabilistic tools to phylogeny reconstruction and population genetics. Modelling the evolution of species by continuous-time random walks on the signed permutation groups, we study the asymptotic medians of a set of random permutations sampled from simple random walks at time 0.25cn, for c > 0. Running k independent random walks, all starting at the identity, we prove that the medians approximate the ancestor (the identity permutation) up to time 0.25n, while there exists a constant c > 1 after which the medians lose credibility as an estimator. We study the median of a set of random permutations on the symmetric group endowed with different metrics. In particular, for a special dissimilarity metric called the breakpoint distance, under which the space is not geodesic, we find a large group of medians of random permutations using the concept of partial geodesics (or geodesic patches). We also study the Fleming-Viot process in random environment (FVRE) via martingale and duality methods. We extend the duality method to the case of time-dependent and quenched martingale problems. Using a family of dual processes, we prove the convergence of the Moran processes in random environments to the FVRE in the Skorokhod topology. We also study the long-time behaviour of the FVRE, prove the existence of an equilibrium for the joint annealed-environment process, and prove an ergodic theorem for the latter.
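To make the notions of breakpoint distance and median concrete, here is a small Python sketch. It assumes one common convention for unsigned permutations (frame the permutation with 0 and n+1, and count the adjacencies of one permutation that are missing from the other) and takes a median to be any permutation minimizing the total distance to the sample; the thesis may use different conventions. The function names are hypothetical, and the brute-force search over all n! candidates is only feasible for very small n.

```python
from itertools import permutations
from typing import FrozenSet, List, Sequence, Set, Tuple


def adjacencies(perm: Sequence[int]) -> Set[FrozenSet[int]]:
    """Unordered pairs of neighbours in perm, framed by 0 and n+1 (assumed convention)."""
    framed = [0, *perm, len(perm) + 1]
    return {frozenset(pair) for pair in zip(framed, framed[1:])}


def breakpoint_distance(p: Sequence[int], q: Sequence[int]) -> int:
    """Number of adjacencies of p that do not occur in q."""
    return len(adjacencies(p) - adjacencies(q))


def medians(sample: List[Sequence[int]]) -> Tuple[List[Tuple[int, ...]], int]:
    """Brute-force all permutations minimizing the total breakpoint distance to the sample."""
    n = len(sample[0])
    best: List[Tuple[int, ...]] = []
    best_score = None
    for cand in permutations(range(1, n + 1)):
        score = sum(breakpoint_distance(cand, s) for s in sample)
        if best_score is None or score < best_score:
            best, best_score = [cand], score
        elif score == best_score:
            best.append(cand)
    return best, best_score


if __name__ == "__main__":
    print(breakpoint_distance([1, 2, 3, 4], [1, 3, 2, 4]))   # 2
    print(medians([[1, 2, 3], [1, 2, 3], [2, 1, 3]]))        # ([(1, 2, 3)], 2)
```

In the setting of the abstract, the sample would consist of the endpoints of k independent random walks started at the identity, and the question is whether the medians stay close to the identity.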
|
3 |
Discovery and Analysis of Genomic Patterns: Applications to Transcription Factor Binding and Genome Rearrangement
Sinha, Amit U. 22 April 2008
No description available.
|
4 |
Robust and Efficient Algorithms for Protein 3-D Structure Alignment and Genome Sequence Comparison
Zhao, Zhiyu 07 August 2008
Sequence analysis and structure analysis are two of the fundamental areas of bioinformatics research. This dissertation discusses protein structure problems, including protein structure alignment and query, and genome sequence problems, including haplotype reconstruction and genome rearrangement. It first presents an algorithm for pairwise protein structure alignment that is tested with structures from the Protein Data Bank (PDB). In many cases it outperforms two other well-known algorithms, DaliLite and CE. The preliminary algorithm is a graph-theoretic approach that uses the concept of “stars” to reduce the complexity of clique-finding algorithms. The algorithm is then improved by introducing “double-center stars” in the graph and applying a self-learning strategy. The updated algorithm is tested with a much larger set of protein structures and shown to be more accurate, especially in cases of weak similarity. A protein structure query algorithm is designed to search for similar structures in the PDB using the improved alignment algorithm. It is compared with SSM and shows better performance, with lower maximum and average Q-score for missing proteins. An interesting problem concerning the calculation of the diameter of a 3-D sequence of points arose, and its connection to sublinear-time computation is discussed. The diameter of a 3-D sequence is approximated by a series of sublinear-time deterministic, zero-error and bounded-error randomized algorithms, and we obtain a series of separations concerning the power of sublinear-time computation. This dissertation also discusses two genome sequence problems. A probabilistic model is proposed for reconstructing haplotypes from SNP matrices with incomplete and inconsistent errors. Experiments with simulated data show both high accuracy and speed, conforming to the theoretically provable efficiency and accuracy of the algorithm.
Finally, a genome rearrangement problem is studied. The concept of non-breaking similarity is introduced. Approximating the exemplar non-breaking similarity to within a factor of n^(1-ε) is proven to be NP-hard. Interestingly, polynomial-time algorithms are presented for several practical cases.
|
5 |
Studies of Genome Diversity in <i>Bartonella</i> Populations: A journey through cats, mice, men and lice
Lindroos, Hillevi Lina January 2007
<p>Bacteria of the genus <i>Bartonella</i> inhabit the red blood cells of many mammals, including humans, and are transmitted by blood-sucking arthropod vectors. Different species of <i>Bartonella</i> are associated with different mammalian host species, to which they have adapted and in which they normally do not cause any symptoms. Incidental infection of other hosts, however, is often followed by various disease symptoms, and several <i>Bartonella</i> species are considered emerging human pathogens.</p><p>In this work, I have studied the genomic diversity within and between different <i>Bartonella</i> species, with a focus on the feline-associated human pathogen <i>B. henselae</i> and its close relatives, the similarly feline-associated <i>B. koehlerae</i> and the trench-fever agent <i>B. quintana</i>, which is restricted to humans.</p><p>In <i>B. henselae</i>, the overall variability in sequence and genome content was modest and well correlated, suggesting low levels of intra-species recombination in the core genome. The variably present genes were located in the prophage and the genomic islands, which are also absent from <i>B. quintana</i> and <i>B. koehlerae</i>, indicating multiple independent excision events. In contrast, diversity of genome structures was immense and probably associated with rearrangements between the repeated genomic islands located around the terminus of replication, possibly to avoid the host’s immune system. In both <i>B. henselae</i> and the mouse-associated species <i>B. grahamii</i>, a large portion of the chromosome was manifold amplified in long-time cultures and packaged into phage particles, allowing for different recombination rates for different chromosomal regions.</p><p>In <i>B. quintana</i>, diversity was studied by sequencing non-coding spacers. The low variability might be due to the recent emergence of this species. Surprisingly, this species also displayed high variability in genome structures, despite its lack of repeated sequences.</p><p>The results indicate that genome rearrangements and gain or loss of mobile elements are major mechanisms of evolution in <i>Bartonella</i>.</p>
|
6 |
Studies of Genome Diversity in Bartonella Populations: A journey through cats, mice, men and lice
Lindroos, Hillevi Lina January 2007
Bacteria of the genus Bartonella inhabit the red blood cells of many mammals, including humans, and are transmitted by blood-sucking arthropod vectors. Different species of Bartonella are associated with different mammalian host species, to which they have adapted and in which they normally do not cause any symptoms. Incidental infection of other hosts, however, is often followed by various disease symptoms, and several Bartonella species are considered emerging human pathogens. In this work, I have studied the genomic diversity within and between different Bartonella species, with a focus on the feline-associated human pathogen B. henselae and its close relatives, the similarly feline-associated B. koehlerae and the trench-fever agent B. quintana, which is restricted to humans. In B. henselae, the overall variability in sequence and genome content was modest and well correlated, suggesting low levels of intra-species recombination in the core genome. The variably present genes were located in the prophage and the genomic islands, which are also absent from B. quintana and B. koehlerae, indicating multiple independent excision events. In contrast, diversity of genome structures was immense and probably associated with rearrangements between the repeated genomic islands located around the terminus of replication, possibly to avoid the host’s immune system. In both B. henselae and the mouse-associated species B. grahamii, a large portion of the chromosome was manifold amplified in long-time cultures and packaged into phage particles, allowing for different recombination rates for different chromosomal regions. In B. quintana, diversity was studied by sequencing non-coding spacers. The low variability might be due to the recent emergence of this species. Surprisingly, this species also displayed high variability in genome structures, despite its lack of repeated sequences.
The results indicate that genome rearrangements and gain or loss of mobile elements are major mechanisms of evolution in Bartonella.
|
7 |
Patterns of somatic genome rearrangement in human cancer
Roberts, Nicola Diane January 2018
Cancer development is driven by somatic genome alterations, ranging from single point mutations to larger structural variants (SV) affecting kilobases to megabases of one or more chromosomes. Studies of somatic rearrangement have previously been limited by a paucity of whole genome sequencing data, and a lack of methods for comprehensive structural classification and downstream analysis. The ICGC project on the Pan-Cancer Analysis of Whole Genomes provides an unprecedented opportunity to analyse somatic SVs at base-pair resolution in more than 2500 samples from 30 common cancer types. In this thesis, I build on a recently developed SV classification pipeline to present a census of rearrangement across the pan-cancer cohort, including chromoplexy, replicative two-jumps, and templated insertions connecting as many as eight distant loci. By identifying the precise structure of individual breakpoint junctions and separating out complex clusters, the classification scheme empowers detailed exploration of all simple SV properties and signatures. After illustrating the various SV classes and their frequency across cancer types and samples, Chapter 2 focuses on structural properties including event size and breakpoint homology. Then, in Chapter 3, I consider the SV distribution across the genome, and show patterns of association with various genome properties. Upon examination of rearrangement hotspot loci, I describe tissue-specific fragile site deletion patterns, and a variety of SV profiles around known cancer genes, including recurrent templated insertion cycles affecting TERT and RB1. Turning to co-occurring alteration patterns, Chapter 4 introduces the Hierarchical Dirichlet Process as a non-parametric Bayesian model of mutational signatures. 
After developing methods for consensus signature extraction, I detour to the domain of single nucleotide variants to test the HDP method on real and simulated data, and to illustrate its utility for simultaneous signature discovery and matching. Finally, I return to the PCAWG SV dataset, and extract SV signatures delineated by structural class, size, and replication timing. In Chapter 5, I move on to the complex SV clusters (largely set aside throughout Chapters 2–4), and develop an improved breakpoint clustering method to subdivide the complex rearrangement landscape. I propose a raft of summary metrics for groups of five or more breakpoint junctions, and explore their utility for preliminary classification of chromothripsis and other complex phenomena. This comprehensive study of somatic genome rearrangement provides detailed insight into SV patterns and properties across event classes, genome regions, samples, and cancer types. To extrapolate from the progress made in this thesis, Chapter 6 suggests future strategies for addressing unanswered questions about complex SV mechanisms, annotation of functional consequences, and selection analysis to discover novel drivers of the cancer phenotype.
|
8 |
Comparative mitochondrial genomics toward understanding genetics and evolution of arbuscular mycorrhizal fungi
Nadimi, Maryam 03 1900
Arbuscular mycorrhizal fungi (AMF) are widespread in soil, where they form symbiotic associations, called arbuscular mycorrhizae, with the majority of plants. AMF development depends so strongly on the host plant that they cannot live saprotrophically; they are therefore considered obligate biotrophs. AMF form a basal evolutionary lineage of fungi and belong to the phylum Glomeromycota. Their mycelia consist of a network of coenocytic hyphae in which nuclei and cell organelles can move freely from one compartment to another. AMF give the host plant better mineral nutrition through the network of extraradical hyphae, which extends beyond the zone of soil explored by the roots. These hyphae have a high capacity to absorb nutrients, which they transport to the roots. AMF thus improve plant growth while protecting plants from biotic and abiotic stresses. Despite the importance of AMF, their genetics and evolution remain poorly understood. They are difficult to study because their lifestyle prevents culturing them in the absence of host plants. In addition, the intra-isolate genetic diversity of their nuclear genomes further complicates such studies, in particular the development of molecular markers for biological and ecological studies and for studies of AMF function. For these reasons, mitochondrial genomes offer attractive opportunities and alternatives for studying AMF. Indeed, the mitochondrial (mt) genomes published to date show no intra-isolate genetic polymorphism, although exceptions may exist. To move forward with mitochondrial genomics, we need to generate a large amount of mitochondrial DNA (mtDNA) sequencing data in order to study the evolutionary mechanisms, population genetics, community ecology, and function of AMF. In this context, the objectives of my doctoral project were to: 1) study the evolution of mt genomes using comparative genomics across closely related species, isolates, and phylogenetically distant AMF species; 2) study the inheritance of mt genomes among isolates of the model species Rhizophagus irregularis through anastomosis; and 3) study mtDNA organization and mt genes in order to develop molecular markers for phylogenetic studies. We used the whole-genome shotgun approach with 454 pyrosequencing and Illumina HiSeq to sequence several AMF taxa selected for their importance and availability. De novo assembly, conventional Sanger sequencing, annotation, and comparative genomics were carried out to characterize complete mtDNAs. We discovered several interesting evolutionary mechanisms in Gigaspora rosea, whose mt genome is completely rearranged compared with that of Rhizophagus irregularis isolate DAOM 197198. We also showed that two genes, cox1 and rns, are each fragmented into two pieces. We demonstrated that the RNAs transcribed from the two cox1 fragments are joined by trans-splicing with the help of the RNA of the nad5 I3 gene, which brings together the two RNAs, cox1.1 and cox1.2, to form a complete, functional RNA. We also found a highly unusual mtDNA organization in Rhizophagus sp. isolate DAOM 213198, whose mt genome consists of two circular chromosomes. In addition, we found a considerable quantity of plasmid-related sequences in the Glomeraceae compared with the Gigasporaceae, contributing to rapid mtDNA evolution in the Glomeromycota. We also sequenced several isolates of R. irregularis and Rhizophagus sp. to resolve their phylogenetic position and infer the evolutionary relationships among them. Comparative mt genomics revealed several mobile elements, such as mobile open reading frames (mORFs), short inverted repeats (SIRs), and plasmid-related sequences (dpo), which affect mt gene order and enable chromosomal rearrangement of mtDNA. All these evolutionary mechanisms observed at the isolate level allow us to develop molecular markers specific to each AMF isolate or species. The data generated in my doctoral project have advanced fundamental knowledge of mitochondrial genomes not only in the Glomeromycetes but also in the kingdom Fungi and in eukaryotes in general. The molecular toolkits developed in this project can be used to study the population genetics, genetic exchange, and ecology of AMF, which will contribute to understanding the primary role of AMF in agriculture and the environment. / Arbuscular mycorrhizal fungi (AMF) are the most widespread eukaryotic symbionts,
forming mutualistic associations known as arbuscular mycorrhizae with the majority of plant roots. AMF are obligate biotrophs belonging to an ancient fungal lineage, the phylum Glomeromycota. Their mycelia are formed by a complex network of coenocytic hyphae, in which nuclei and cell organelles can move freely from one compartment to another. AMF are commonly acknowledged to improve plant growth by enhancing mineral nutrient uptake, in particular of phosphate and nitrate, and they confer tolerance to abiotic and biotic stressors on plants. Despite their significant roles in ecosystems, their genetics and evolution are not well understood. Studying AMF is challenging because of their obligate biotrophy, their slow growth, and the limited morphological criteria available. In addition, intra-isolate genetic polymorphism of nuclear DNA brings another level of complexity to the investigation of the biology, ecology and function of AMF. Genetic polymorphism of nuclear DNA within a single isolate limits the development of efficient molecular markers, mainly at lower taxonomic levels (i.e. the inter-isolate level). Mitochondrial (mt) genomics has therefore been used as an attractive alternative for studying AMF. In AMF, mt genomes have been shown to be homogeneous, or at least much less polymorphic than nuclear DNA. By generating large mt sequence datasets, we can investigate the efficiency and usefulness of developing molecular marker toolkits to study the dynamics and evolutionary mechanisms of AMF. This approach also elucidates the population genetics, community ecology and functions of Glomeromycota. Therefore, the objectives of my Ph.D. project were: 1) to investigate mitochondrial genome evolution using comparative mitogenomic analyses of closely related species and isolates as well as phylogenetically distant taxa of AMF; 2) to explore mt genome inheritance among compatible isolates of the model AMF Rhizophagus irregularis through anastomosis formation; and 3) to assess mtDNA and mt genes for marker development and phylogenetic analyses. We used whole-genome shotgun sequencing with 454 pyrosequencing and Illumina HiSeq to sequence AMF taxa selected according to their importance and availability in our lab collections. De novo assemblies, Sanger sequencing, annotation and comparative genomics were then performed to characterize complete mtDNAs. We discovered interesting evolutionary mechanisms in Gigaspora rosea: 1) we found a fully reshuffled mt genome synteny compared to Rhizophagus irregularis DAOM 197198; and 2) we discovered the presence of fragmented cox1 and rns genes. We demonstrated that two cox1 transcripts are joined by trans-splicing. We also reported an unusual mtDNA organization in Rhizophagus sp. DAOM 213198, whose mt genome consisted of two circular mtDNAs. In addition, we observed a considerably higher number of mt plasmid-related sequences in Glomeraceae compared with Gigasporaceae, contributing a mechanism for faster evolution of mtDNA in Glomeromycota. We also sequenced other isolates of R. irregularis and Rhizophagus sp. in order to unravel their evolutionary relationships and to develop molecular toolkits for their discrimination. Comparative mitogenomic analyses of these mtDNAs revealed the occurrence of many mobile elements such as mobile open reading frames (mORFs), short inverted repeats (SIRs), and plasmid-related sequences (dpo) that impact mt genome synteny and mtDNA alteration. Altogether, these evolutionary mechanisms among closely related AMF isolates give us clues for designing reliable and efficient intra- and inter-specific markers to discriminate closely related AMF taxa and isolates.
Data generated in my Ph.D. project advance our knowledge of mitochondrial genome evolution not only in Glomeromycota, but also in the larger framework of the fungal kingdom and eukaryotes in general. Molecular toolkits developed in this project will offer new opportunities to study population genetics, genetic exchanges and ecology of AMF. In turn, this work will contribute to understanding the role of these fungi in nature, with potential applications in both agriculture and environmental protection.
|