Global ETD Search

1	Přímá klasifikace metagenomických signálů ze sekvenace nanopórem / Direct Binning of Metagenomic Signals from Nanopore Sequencing Lebó, Marko January 2019 This diploma thesis deals with taxonomy-independent methods for classifying metagenomic signals acquired by a MinION sequencer. It describes how metagenomic data arise and what characterises them, and reviews existing methods of metagenomic data classification and their development. The thesis also evaluates the impact of third-generation sequencing techniques on metagenomics and focuses on how the Oxford Nanopore MinION sequencing device works. Lastly, a custom method for classifying metagenomic data obtained from a MinION sequencer is proposed and compared with an existing classification method.
2	Draft Assembly and Baseline Annotation of the Ziziphus spina-christi Genome Shuwaikan, Raghad H. 07 1900 Third-generation sequencing has revolutionized our understanding of genomics and enabled the in-depth exploration of complex plant genomes. In this project I aimed to assemble and annotate the genome of Z. spina-christi, a plant native to Saudi Arabia, as part of the Kingdom of Saudi Arabia Native Genome Project established at the Center for Desert Agriculture at KAUST. Initially, a voucher plant was selected from the Al Lith region of Western Saudi Arabia. Fresh leaf tissue was collected for high-molecular-weight (HMW) DNA extraction, as well as seed for greenhouse propagation. After HMW DNA extraction, library construction and PacBio HiFi sequencing, I generated a de novo assembly of the Z. spina-christi genome using the Hifiasm assembler, which yielded a 1.9 Gbp assembly with high levels of duplication. The assembled contigs were scaffolded using an in-house script based on the software RagTag, which yielded a 406 Mbp scaffold with 331 gaps (85.45% of the estimated genome size). A preliminary analysis of the assembly for transposable elements (TEs) revealed a TE content of 32.36%, with long terminal repeat retrotransposons (LTR-RTs) being the major contributor to the total TE content. Baseline annotation was completed using OmicsBox, revealing 18,330 functional genes. This work describes the first genomic resource for the desert plant Z. spina-christi. To improve the assembly, I suggest scaffolding with optical mapping, long Nanopore reads and Hi-C data to capture the spatial organization of the genome. Further experimental, genetic and TE analyses are needed to explore the plant's resilience to abiotic stresses in extreme environments. sequencing, genome assembly, plant genomics, plant genomes, third-generation sequencing
3	Evolution of multi-drug resistant HCV clones from pre-existing resistant-associated variants during direct-acting antiviral therapy determined by third-generation sequencing / 第三世代シーケンシングにより明らかになった、抗ウイルス薬投与下におけるC型肝炎ウイルスの多剤耐性クローンの進化 Takeda, Haruhiko 26 March 2018 Kyoto University / 0048 / Doctorate by coursework (new system) / Doctor of Medical Science / Kō No. 20989 / Medical Doctorate No. 4335 / 新制||医||1027 (University Library) / Kyoto University Graduate School of Medicine, Division of Medicine / (Chief examiner) Prof. Keizo Tomonaga, Prof. Fumihiko Matsuda, Prof. Yoshio Koyanagi / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Medical Science / Kyoto University / DFAM hepatitis C virus, direct acting antivirals, resistant associated variant, third-generation sequencing, phylogenetic analysis, clonal evolution 490
4	Machine learning for epigenetics : algorithms for next generation sequencing data Mayo, Thomas Richard January 2018 The advent of Next Generation Sequencing (NGS), a little over a decade ago, has led to a vast and rapid increase in the generation of genomic data. The drastically reduced cost has in turn enabled powerful modifications that can be used to investigate not just genetic, but epigenetic, phenomena. Epigenetics refers to the study of mechanisms affecting gene expression other than the genetic code itself and thus, at the transcription level, incorporates DNA methylation, transcription factor binding and histone modifications, amongst others. This thesis outlines and tackles two major challenges in the computational analysis of such data using techniques from machine learning. Firstly, I address the problem of testing for differential methylation between groups of bisulfite sequencing data sets. DNA methylation plays an important role in genomic imprinting, X-chromosome inactivation and the repression of repetitive elements, as well as being implicated in numerous diseases, such as cancer. Bisulfite sequencing provides single-nucleotide-resolution methylation data at the whole-genome scale, but a sensitive analysis of such data is difficult. I propose a solution that uses a powerful kernel-based machine learning technique, the Maximum Mean Discrepancy, to leverage well-characterised spatial correlations in DNA methylation, and adapt the method for this particular use. I use this tailored method to analyse a novel data set from a study of ageing in three different tissues in the mouse. This study motivates further modifications to the method and highlights the utility of the underlying measure as an exploratory tool for methylation analysis. Secondly, I address the problem of predictive and explanatory modelling of chromatin immunoprecipitation sequencing data (ChIP-Seq).
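As a rough illustration of the Maximum Mean Discrepancy named in this abstract, the sketch below computes the biased estimate of squared MMD between two groups of samples using a Gaussian kernel. It is not the tailored method developed in the thesis: the kernel choice, the bandwidth, the function names and the toy methylation values are all assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length numeric vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys."""

    def mean_kernel(a, b):
        # Average kernel value over all pairs drawn from a and b.
        total = sum(rbf_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical methylation levels at two neighbouring CpG sites, one tuple per
# sample. These numbers are invented for the example, not taken from the thesis.
group_a = [(0.10, 0.15), (0.12, 0.18), (0.08, 0.14)]
group_b = [(0.80, 0.85), (0.78, 0.90), (0.82, 0.88)]

print("same group:      ", mmd_squared(group_a, group_a))
print("different groups:", mmd_squared(group_a, group_b))
```

A value near zero suggests the two groups look alike under the chosen kernel, while larger values suggest a difference. Turning the statistic into a test usually requires a permutation step, which is omitted here.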
ChIP-Seq is typically used to assay the binding of a protein of interest, such as a transcription factor or histone, to the DNA, and as such is one of the most widely used sequencing assays. While peak callers are a powerful tool for identifying binding sites in sparse and clean ChIP-Seq profiles, broader signals defy analysis in this framework. Instead, generative models that explain the data in terms of the underlying sequence can help uncover mechanisms that predict binding or the lack thereof. I explore current problems with ChIP-Seq analysis, such as zero-inflation and the use of the control experiment, known as the input. I then devise a method for representing k-mers that enables the use of longer DNA sub-sequences within a flexible model development framework, such as generalised linear models, without heavy programming requirements. Finally, I use these insights to develop an appropriate Bayesian generative model that predicts ChIP-Seq count data in terms of the underlying DNA sequence, incorporating DNA methylation information where available, and fit the model with the Expectation-Maximization algorithm. The model is tested on simulated data and on real data pertaining to the histone mark H3K27me3. This thesis therefore straddles the fields of bioinformatics and machine learning. Bioinformatics is both plagued and blessed by the plethora of different techniques available for gathering data and their continual innovations. Each technique presents a unique challenge, and hence out-of-the-box machine learning techniques have had little success in solving biological problems. While I have focused on NGS data, the methods developed in this thesis are likely to be applicable to future technologies, such as Third Generation Sequencing methods, and the lessons learned in their adaptation will be informative for the next wave of computational challenges.
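To make the k-mer idea in this abstract concrete, the sketch below builds the plain k-mer count vector of a DNA sequence, the kind of feature vector that could be passed to a generalised linear model. This is the textbook baseline, not the representation devised in the thesis, which the abstract does not describe. The function names and the example sequence are assumptions.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_feature_vector(sequence, k, alphabet="ACGT"):
    """Fixed-length vector of k-mer counts, ordered lexicographically by k-mer."""
    counts = kmer_counts(sequence, k)
    vocabulary = ["".join(letters) for letters in product(alphabet, repeat=k)]
    return [counts.get(kmer, 0) for kmer in vocabulary]


# Hypothetical sequence, chosen only for illustration.
example = "ACGTAC"
print(kmer_counts(example, 2))
print(kmer_feature_vector(example, 2))
```

The vector has 4^k entries, so it grows quickly with k. That growth is one reason a more compact representation is attractive when longer sub-sequences are needed.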
