  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Viral Quasispecies Reconstruction Using Next Generation Sequencing Reads

Tork, Bassam A 12 August 2013
The genomic diversity of viral quasispecies is a subject of great interest, especially for chronic infections. Characterization of viral diversity can be addressed by high-throughput sequencing technology (454 Life Sciences, Illumina, SOLiD, Ion Torrent, etc.). Standard assembly software was originally designed for single genome assembly and cannot be used to assemble and estimate the frequency of closely related quasispecies sequences. This work focuses on parsimonious and maximum likelihood models for assembling viral quasispecies and estimating their frequencies from 454 sequencing data. Our methods have been applied to several RNA viruses (HCV, IBV) as well as DNA viruses (HBV), genotyped using 454 Life Sciences amplicon and shotgun methods.
3

Computational methods for de novo assembly of next-generation genome sequencing data / Méthodes de calcul pour assemblage de novo de nouvelle génération des techniques de séquençage du génome

Chikhi, Rayan 02 July 2012
In this thesis, we discuss computational methods (theoretical models and algorithms) for reconstructing DNA sequences, that is, de novo genome assembly from reads (short DNA sequences) produced by high-throughput sequencers. This problem is challenging, both theoretically and practically. The theoretical difficulty arises from the complex structure of genomes: the assembly process has to deal with reconstruction ambiguities. The reads can admit an exponential number of possible reconstructions, of which only one is correct. Since the correct one cannot be determined, only a fragmented approximation of the genome is returned. The practical difficulty stems from the huge volume of highly redundant data produced by sequencers, which requires significant computing power to process. As larger genomes and meta-genomes are being sequenced, the need for efficient computational methods for de novo assembly is increasing rapidly. This thesis introduces novel contributions to genome assembly, both in incorporating more information to improve the quality of results and in processing data efficiently to reduce computational complexity. Specifically, we propose a novel algorithm to quantify the maximum theoretical genome coverage achievable by sequencing data (paired reads), and apply this algorithm to several model genomes. We formulate a set of computational problems that take pairing information into account in assembly, and study their complexity. Then, two novel concepts that cover practical aspects of assembly are proposed: localized assembly and memory-efficient read indexing. Localized assembly consists of constructing and traversing a partial assembly graph. These ingredients are implemented in a complete de novo assembly software package, the Monument assembler.
Monument is compared with other state-of-the-art assembly methods. Finally, we conclude with a series of smaller projects exploring concepts beyond classical de novo assembly.
4

Expanding the Knowledgebase of Earth’s Microbiome Using Culture Dependent and Independent Methods

Murphy, Trevor 01 June 2021
Microorganisms exist ubiquitously on Earth, yet their functions and ecological roles remain elusive. Investigating these microbes is accomplished by using culture-dependent and culture-independent methodologies. This study employs both methodologies to characterize: 1) the genomic potential of the novel deep-subsurface bacterial isolate Thermanaerosceptrum fracticalcis strain DRI-13T by combining next-generation and nanopore sequencing technologies and 2) the microbiome of the artificial marine environment for the Hawaiian Bobtail Squid in aquaculture using next-generation sequencing of the 16S rRNA gene. Microbial ecology of the deep subsurface remains understudied in terms of microbial diversity and function. The genomic information of DRI-13T revealed a potential for syntrophic relationships, diverse metabolic potential including prophages/antiviral defenses, and novel methylation motifs. Artificial marine environments housing the Hawaiian Bobtail Squid (Euprymna scolopes) contain microorganisms that can directly influence animal and aquaculture health. No studies presently show whether bacterial communities of the tank environment correlate with the health and productivity of E. scolopes. This study sought to address this by sampling from a year of unproductive aquaculture yield and comparing the bacterial communities with those from productive cohorts. Bacterial communities from unproductive samples show less bacterial diversity and abundance, coupled with shifts in bacterial composition. Nitrate and pH levels in the tanks were found to strongly influence the bacterial populations of productive and unproductive cohorts.
5

De novo Genome Assembly and SNP Marker Development of Pyrenophora semeniperda

Soliai, Marcus Makina 17 March 2011
Pyrenophora semeniperda (anamorph Drechslera campulata) is a necrotrophic fungal seed pathogen of a variety of grass genera and species, including Bromus tectorum, an exotic grass that has invaded many natural ecosystems of the U.S. Intermountain West. As a natural seed pathogen of B. tectorum, P. semeniperda has potential as a biocontrol agent due to its effectiveness at killing dormant B. tectorum seeds; however, few genetic resources exist for this fungus. Here, the genome assembly of a P. semeniperda isolate using 454 GS-FLX genomic and paired-end pyrosequencing techniques is presented. The total assembly is 32.5 Mb and contains 11,453 gene models greater than 24 amino acids. The assembly contains a variety of predicted genes that are involved in pathogenic pathways typically found in necrotrophic fungi. In addition, 454 sequence reads were used to identify single nucleotide polymorphisms between two isolates of P. semeniperda. In total, 20 SNP markers were developed for the purpose of recombination assessment in 600 individual P. semeniperda isolates representing 36 populations from throughout the U.S. Intermountain West. Although 17 of the fungal populations were fixed at all SNP loci, linkage disequilibrium was determined in the remaining 18 populations. This research demonstrates the effectiveness of the 454 GS-FLX sequencing technology for de novo assembly and marker development of filamentous fungal genomes. Many features of the assembly match those of other Pyrenophora genomes, including P. tritici-repentis and P. teres f. teres, which lends validity to our assembly. These findings present a significant resource for examining and furthering our understanding of P. semeniperda biology.
6

The Bioluminescence Heterozygous Genome Assembler

Price, Jared Calvin 01 December 2014
High-throughput DNA sequencing technologies are currently revolutionizing the fields of biology and medicine by elucidating the structure and function of the components of life. Modern DNA sequencing machines typically produce relatively short reads of DNA which are then assembled by software in an attempt to produce a representation of the entire genome. Due to the complex structure of all but the smallest genomes, especially the abundant presence of exact or almost exact repeats, all genome assemblers introduce errors into the final sequence and output a relatively large set of contigs instead of full-length chromosomes (a contig is a DNA sequence built from the overlaps between many reads). These problems are dramatically worse when homologous copies of the same chromosome differ substantially. Currently such genomes are usually avoided as assembly targets and, when they are not avoided, they generally produce assemblies of relatively low quality. An improved algorithm for the assembly of such data would dramatically improve our understanding of the genetics of a large class of organisms. We present a unique algorithm for the assembly of diploid genomes which have a high degree of variation between homologous chromosomes. The approach uses coverage, graph patterns and machine-learning classification to identify haplotype-specific sequences in the input reads. It then uses these haplotype-specific markers to guide an improved assembly. We validate the approach with a large experiment that isolates and elucidates the effect of single nucleotide polymorphisms (SNPs) on genome assembly more clearly than any previous study. The experiment conclusively demonstrates that the Bioluminescence heterozygous genome assembler produces dramatically longer contigs with fewer haplotype-switch errors than competing algorithms under conditions of high heterozygosity.
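The abstract's parenthetical definition of a contig (a sequence built from the overlaps between many reads) can be illustrated with a toy sketch. This is not the Bioluminescence algorithm, which the abstract describes as relying on coverage, graph patterns, and machine-learning classification; it is a naive greedy overlap merge, and the names `overlap`, `greedy_assemble`, and `min_len` are hypothetical and chosen only for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Toy assembler: repeatedly merge the pair with the largest overlap.

    Assumes error-free reads from a single strand; real assemblers must
    also handle sequencing errors, repeats, and heterozygosity.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            # No remaining overlaps: what is left is the set of contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
```

When reads from unrelated regions share no overlap, the function returns several contigs rather than one sequence, which mirrors the fragmented output the abstract describes.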
7

Characterization of the genetic diversity and thermal tolerance of Pocilloporid Corals in the Red Sea

Buitrago-López, Carol 07 1900
This dissertation characterizes the genetic diversity and thermal tolerance of the coral holobiont Stylophora pistillata and Pocillopora verrucosa (family Pocilloporidae) across the Saudi Arabian Red Sea coast (~1500 km). The population genetic structure and holobiont diversity were assessed using genome-wide single nucleotide polymorphisms (SNPs) identified with reference genome-based RAD-Seq, while the associated microbial communities of the algal symbiont (Symbiodiniaceae) and bacteria were inferred from metabarcoding analyses of the ITS2 and 16S rRNA genes. Thermal tolerance of Stylophora pistillata colonies was assessed using standardized short-term heat stress assays on the novel Coral Bleaching Automated Stress System (CBASS). Chapter 1 details the assembly and annotation of the P. verrucosa genome (~380 Mbp; 27,439 gene models), which was highly complete and compared well to the already available S. pistillata genome. Chapter 2 presents population genetic analyses of both coral species, which revealed pronounced differences in their population genetic structure. While P. verrucosa seemed to be highly connected across the Red Sea basin with the exception of the far south, S. pistillata showed a complex population genetic structure. Microbial communities of Symbiodiniaceae and bacteria were overall less diverse in P. verrucosa than in S. pistillata, and followed an association pattern that was partly determined by the environment and partly by host genotype. Chapter 3 identifies thermally tolerant S. pistillata genotypes by comparing the heat stress response of colonies collected at two sites within the same reef. Ex-situ heat-stress assays confirmed that colonies from the more temperature-stable site (fore reef) were less thermally tolerant than their conspecifics from the back reef, where the diel temperature is more variable. This chapter also highlights the utility of acute heat-stress assays as a tool to identify thermotolerant colonies.
Taken together, the work of this dissertation provides a foundation for coral conservation in the Red Sea. It highlights that the genetic structure differs between coral species, suggesting that effective conservation through marine protected areas needs to incorporate data from multiple species. Coral population genetic data should further be complemented by thermal tolerance assays across the Red Sea to associate genetic diversity with patterns of heat stress tolerance.
8

Draft Assembly and Baseline Annotation of the Ziziphus spina-christi Genome

Shuwaikan, Raghad H. 07 1900
Third-generation sequencing has revolutionized our understanding of genomics and enabled in-depth exploration of complex plant genomes. In this project I aimed to assemble and annotate the genome of Z. spina-christi, a plant native to Saudi Arabia, as part of the Kingdom of Saudi Arabia Native Genome Project established at the Center for Desert Agriculture at KAUST. Initially, a voucher plant was selected from the Al Lith region of Western Saudi Arabia. Fresh leaf tissue was collected for high-molecular-weight (HMW) DNA extraction, as well as seed for greenhouse propagation. After HMW DNA extraction, library construction, and PacBio HiFi sequencing, I generated a de novo assembly of the Z. spina-christi genome using the Hifiasm assembler, which yielded a 1.9 Gbp assembly with high levels of duplication. The assembled contigs were scaffolded using an in-house script based on the RagTag software, which yielded a 406 Mbp scaffold with 331 gaps (85.45% of the estimated genome size). A preliminary analysis of the assembly for transposable elements revealed a TE content of 32.36%, with Long Terminal Repeat retrotransposons (LTR-RTs) being the major contributor to the total TE content. Baseline annotation was completed using OmicsBox, revealing 18,330 functional genes. This work describes the first genomic resource for the desert plant Z. spina-christi. To improve the assembly, I suggest scaffolding with optical mapping, long Nanopore reads, and Hi-C data to capture the spatial organization of the genome. Further experimental, genetic, and TE analyses are needed to explore the plant’s resilience to abiotic stresses in extreme environments.
9

Using Pan-Genomes to Include Functional Data in Ancient Pathogen Studies / Ancient DNA and Gene Function Analyses

Long, George S. January 2024
Ancient DNA analyses are reliant on reference genomes to authenticate and identify endogenous genomes. While this has led to many successful studies involving proboscideans, hominids, and ancient pathogens such as Yersinia pestis, our reliance on at most a small number of genomes greatly limits our ability to functionally describe the genome of interest. Further, given the existence of open bacterial genomes and horizontal gene transfers, it is likely that reference biases have been incorporated and cited in subsequent studies as representative of past gene diversity. By implementing and standardizing the use of bacterial pan-genomes, the effect of these biases is greatly diminished, while also revealing the relative capabilities of the target genome compared to the modern diversity. Describing an ancient strain by both its phylogenetic and functional similarities to modern strains allows for a more nuanced analysis of the species' evolutionary history. Incongruencies between the phylogeny and genetic function are ripe for deeper analyses, and the implications of their findings resonate beyond the characterization of an ancient genome. A pan-genome-centric approach to ancient bacterial studies offers significant improvements compared to the current paradigm. / Dissertation / Doctor of Philosophy (PhD)
10

Analyse bioinformatique du génome et de l’épigénome du pommier / Bioinformatic analysis of the apple genome and epigenome

Daccord, Nicolas 27 November 2018
Apple is one of the most consumed fruits in the world. Using the latest sequencing (PacBio) and optical mapping (BioNano) technologies, we generated a high-quality de novo assembly of the apple (Malus domestica Borkh.) genome. We annotated genes and transposable elements so that this assembly can be used as a reference genome. The high contiguity of the assembly allowed exhaustive detection of the transposable elements, which represented over half the assembly, thus providing an unprecedented opportunity to investigate the uncharacterized regions of a tree genome. We also found that the apple genome is entirely duplicated, as shown by the synteny links between chromosomes. Using Whole Genome Bisulfite Sequencing (WGBS) and the previously generated assembly, we produced genome-wide DNA methylation maps and showed a general correlation between DNA methylation near genes and gene expression. Moreover, we identified several Differentially Methylated Regions (DMRs) between apple fruit and leaf methylomes, associated with candidate genes that could be involved in agronomically relevant traits such as fruit development. Finally, we developed a fast, complete, and easy-to-use pipeline that handles the full analysis of WGBS data, from read mapping to DMR computation. It can handle datasets with a low number of biological replicates.
