Global ETD Search

1	Structural Variant Detection: A Novel Approach January 2014 abstract: Genomic structural variation (SV) is defined as gross alterations in the genome, broadly classified as insertions/duplications, deletions, inversions and translocations. DNA sequencing ushered structural variant discovery beyond laboratory detection techniques to high-resolution informatics approaches. Bioinformatics tools for computational discovery of SVs, however, are still missing variants in the complex cancer genome. This study aimed to define the genomic context leading to tool failure and to design a novel algorithm addressing this context. Methods: The study tested the widely held but unproven hypothesis that tools fail to detect variants which lie in repeat regions. The publicly available 1000 Genomes dataset with experimentally validated variants was tested with the SVDetect tool for the presence of true positive (TP) SVs versus false negative (FN) SVs, expecting that FNs would be overrepresented in repeat regions. Further, the novel algorithm, designed to informatically capture the biological etiology of translocations (the context: non-allelic homologous recombination and 3-D placement of chromosomes in cells), was tested using a simulated dataset. Translocations were created in known translocation hotspots and the novel-algorithm tool was compared with SVDetect and BreakDancer. Results: 53% of false negative (FN) deletions were within repeat structure compared to 81% of true positive (TP) deletions. Similarly, 33% of FN insertions versus 42% of TP, 26% of FN duplications versus 57% of TP and 54% of FN novel sequences versus 62% of TP were within repeats. Repeat structure was not driving the tool's inability to detect variants and could not be used as context. The novel algorithm with a redefined context, when tested against SVDetect and BreakDancer, was able to detect 10/10 simulated translocations with the 30X coverage dataset and 100% allele frequency, while SVDetect captured 4/10 and BreakDancer detected 6/10.
For the 15X coverage dataset with 100% allele frequency, the novel algorithm was able to detect all ten translocations, albeit with fewer supporting reads. BreakDancer detected 4/10 and SVDetect detected 2/10. Conclusion: This study showed that the presence of repetitive elements in general within a structural variant did not influence the tool's ability to capture it. This context-based algorithm proved better than current tools even with half the genome coverage of the accepted protocol and provides an important first step for novel translocation discovery in the cancer genome. / Dissertation/Thesis / Ph.D. Biomedical Informatics 2014 Bioinformatics genomic structural variation structural variation translocation
2	Structural Comparative Genomics of Four African Species of Oryza Goicoechea, Jose Luis January 2009 Rice is one of the most important crops in the world and is the first whose genome was completely sequenced. This landmark accomplishment placed O. sativa as a leading model in plant biology, especially for cereals. The genus Oryza includes 23 species, two of them independently domesticated in Asia and Africa. Wild species of Oryza contain a reservoir of useful agronomic traits which could be exploited for the benefit of rice agriculture, which, like other crops, is facing global problems, mainly due to a rampant increase in the human population and progressive deterioration of soils and water supplies. The Oryza Mapping Alignment Project has opened great opportunities to tap the genetic potential encapsulated in these species. Four BAC libraries generated from the African species of Oryza, O. barthii and O. glaberrima (AA genome), O. punctata (BB genome) and O. brachyantha (FF genome), were fully characterized and shown to provide enough coverage to represent their respective genomes. BAC clones from these libraries were fingerprinted and end-sequenced to assemble physical maps that were heavily manually edited using the sequence of O. sativa as a reference genome. The physical maps showed high coverage for all the species across all chromosomes. Both the BAC libraries and the physical maps were used to investigate synteny and structural variation. The four species show high colinearity to the reference genome, although synteny perturbations were detected, including contractions, expansions, and putative inversions and translocations, which potentially have an important impact on the evolution of these species. Africa Comparative Genomics Oryza Physical Maps Structural Variation
3	Methods for Inter- and Intra-Species Genomics for the Detection of Variation and Function Kural, Deniz January 2014 Thesis advisor: Gabor T. Marth / This thesis concerns itself with the development of methods for comparing genomes. Chapter 2 is a comparative genomics investigation of coding regions across multiple species. Regions of the genome coding for proteins show higher conservation than non-coding regions. Furthermore, we show that a portion of coding regions are conserved beyond the requirements of protein conservation, supporting functions such as microRNA binding and splicing enhancement, providing the non-coding functional impetus to conservation. In Chapter 3, we focus on the detection and characterization of a particular type of structural variation - mobile element insertions (MEIs). While there are many types of mobile elements in the human genome, three of these are active and cause most of the MEI variation observed in humans: ALU, L1 and SVA elements. We detect variation across 1000 Genomes Pilot populations caused by these elements, assemble ALU elements to single-nucleotide resolution, and determine actively copying species of this element. We have developed a variety of algorithmic approaches to MEI detection, and present these. Chapter 4 outlines an approach to remedy reference bias via the incorporation of variation data into the reference. In particular, we construct a pan-genome reference, demonstrated concretely via resolving ALU regions, and develop new alignment software to align against this enriched reference structure. / Thesis (PhD) - Boston College, 2014. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Biology. algorithms evolution graph mobile elements sequencing structural variation
4	Structual variation detection in the human genome Wu, Jiantao January 2013 Thesis advisor: Gabor T. Marth / Structural variations (SVs), like single nucleotide polymorphisms (SNPs) and short insertion-deletion polymorphisms (INDELs), are a ubiquitous feature of genomic sequences and are major contributors to human genetic diversity and disease. Due to technical difficulties, i.e. the high data-acquisition cost and/or low detection resolution of previous genome-scanning technologies, this source of genetic variation was not well studied until the completion of the Human Genome Project and the emergence of next-generation sequencing (NGS) technologies. The assembly of the human genome and economical high-throughput sequencing technologies have enabled the development of numerous new SV detection algorithms with unprecedented accuracy, sensitivity and precision. Although a number of SV detection programs have been developed for various SV types, such as copy number variations, deletions, tandem duplications, inversions and translocations, some types of SVs, e.g. copy number variations (CNVs) in capture sequencing data and mobile element insertions (MEIs), have undergone limited study. This is a result of the lack of suitable statistical models and computational approaches, e.g. an efficient mapping method to handle multiply aligned reads from mobile element (ME) sequences. The focus of my dissertation was to identify and characterize CNVs in capture sequencing data and MEIs from large-scale whole-genome sequencing data. This was achieved by building sophisticated statistical models and developing efficient algorithms and analysis methods for NGS data. In Chapter 2, I present a novel algorithm that uses the read depth (RD) signal to detect CNVs in deep-coverage exon capture sequencing data that were originally generated for SNP discovery. We were among the first to tackle this problem.
In Chapter 3, I present a fast, convenient and memory-efficient program, Tangram, that integrates read-pair (RP) and split-read (SR) signals to detect and genotype MEI events. Based on the results from both simulated and experimental data, Tangram has superior sensitivity, specificity, breakpoint resolution and genotyping accuracy when compared to other recently published MEI detection methods. Lastly, Chapter 4 summarizes my work on SV detection in human genomes during my PhD study and describes the future direction of genetic variant research. / Thesis (PhD) - Boston College, 2013. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Biology. copy number variation mobile element insertion structural variation
5	The role of structural variation in cleft lip and palate Lansdon, Lisa Ann 01 January 2018 Clefts of the lip and/or palate (CL/P) are among the most common birth defects in the world, occurring in about 1 in 700 live births. Individuals with non-syndromic clefting (NSCL/P) account for about 70% of all cleft cases and exhibit a cleft only, whereas syndromic occurrences (SCL/P) include additional cognitive or structural abnormalities. Linkage, genome-wide association, candidate gene, animal model, sequencing and copy number variant (CNV) analyses have been used to study CL/P and have established that it is a heterogeneous, complex disorder. However, the impact of identified sequence variants on protein structure and the contribution of structural genetic variation to CL/P remain poorly understood. In our first analysis we reassessed the phenotype of a 30-year-old individual with SCL/P and noticed phenotypic overlap with Hartsfield syndrome, a rare syndrome resulting from sequence variants in Fibroblast growth factor receptor 1 (FGFR1). We sequenced the coding region of FGFR1 and identified a novel, de novo variant. Because sequence variants in FGFR1 contribute to multiple syndromes encompassing a wide phenotypic spectrum, we performed an extensive literature search to record every published sequence variant of FGFR1 and mapped each to the protein structure by disease and phenotype. Although no statistically significant protein domain-phenotype correlations were identified, many regions neared significance. This work stresses the need for systematic, comprehensive phenotyping of patients and provides a method for assessing the impact of the location of sequence variants within the 3D structure of the protein. Although rare and common CNVs have been identified in individuals with CL/P, prior to our work no large-scale studies of rare CNVs for the identification of novel clefting genes had been performed.
For our second set of analyses, we conducted two such studies, first focusing on a smaller cohort of 140 individuals with NSCL/P from the Philippines to establish our informatic and functional validation pipeline. We used whole-genome tiling arrays to assess rare deletions overlapping genes not previously implicated in clefting, and identified one deletion overlapping Isthmin1 (ISM1) and a deletion just 3’ of the gene in a second affected individual. Functional validation of Ism1 in Xenopus laevis showed strong expression in structures necessary for craniofacial development, and morpholino and CRISPR/Cas9 knockdown of Ism1 resulted in a median cleft lip in some embryos, establishing ISM1 as a novel craniofacial patterning gene. We then expanded our study and assessed genomic CNVs in 1021 individuals with NSCL/P and 81 individuals with SCL/P, finding no differences in CNV number, load or burden between these groups. We also identified 8 putative clefting genes overlapped by deletions in two or more individuals but at a rare frequency (< 1%) in the cohort. Functional validation of these genes using CRISPR/Cas9 in zebrafish and Xenopus tropicalis is currently underway. This work has identified a novel sequence variant leading to the diagnosis of Hartsfield syndrome in an individual with SCL/P, developed an innovative method for assessing the impact of sequence variation on protein structure, improved our understanding of the contribution of CNVs to SCL/P and NSCL/P and identified several putative novel clefting loci which may help explain a portion of the missing heritability of CL/P. Cleft lip and palate Copy number variant Craniofacial development Structural variation Genetics
6	Investigation of Mechanics of Mutation and Selection by Comparative Sequencing Zody, Michael C. January 2009 Diss. (summary) Uppsala: Uppsala universitet, 2009.
7	Indels and large scale variation in archaic hominins compared to present day humans Chintalapati, Manjusha 07 January 2019 No description available. info:eu-repo/classification/ddc/000 ddc:000
8	Long Read Based Individual Molecule Sequencing and Real-time Pathogen Detection Bi, Chongwei 10 1900 With the ability to produce reads hundreds of kilobases in length, long-read sequencing technology is emerging as a powerful tool to decode complex genetic sequences that were previously inaccessible to short reads. Though the sequencing chemistry and base-calling algorithms are being actively developed, the accuracy of current long-read sequencing is still low, thus limiting its applications. In this dissertation, I present three long-read-based DNA sequencing methods to overcome the limitation of read accuracy, contribute to a better understanding of Cas9 editing outcomes and mitochondrial DNA heterogeneity, and pave the way for real-time pathogen detection and mutation surveillance. The development of IDMseq enables the single-base-resolution, haplotype-resolved, quantitative characterization of diverse types of rare variants. IDMseq provides the first quantitative evidence of persistent nonrandom large structural variants following repair of double-strand breaks induced by CRISPR-Cas9 in human ESCs. The development of iMiGseq represents the first mitochondrial DNA sequencing method that provides ultra-sensitive variant detection, complete haplotyping, and unbiased evaluation of heteroplasmy levels, all at the individual mitochondrial DNA molecule level. iMiGseq uncovers unappreciated levels of heteroplasmic variants in single healthy human oocytes well below the current 1% detection limit, of which numerous variants are deleterious and associated with late-onset mitochondrial disease and cancer. It could comprehensively characterize and haplotype single-nucleotide and structural variants of mitochondrial DNA and their genetic linkage in NARP/Leigh syndrome patient-derived cells. The development of NIRVANA addresses the COVID-19 pandemic.
NIRVANA can simultaneously detect SARS-CoV-2 and three co-infecting respiratory viruses, and monitor mutations for up to 96 samples in real time. It provides a promising solution for rapid field-deployable detection and mutation surveillance of pandemic viruses. Taken together, IDMseq, iMiGseq and NIRVANA utilize the advantage of long reads, overcome the limitation of low accuracy, and facilitate the application of long-read sequencing technologies in multidisciplinary fields. DNA Sequencing rare mutation structural variation pathogen detection individual molecule sequencing
9	Structural Variation Discovery and Genotyping from Whole Genome Sequencing: Methodology and Applications: A Dissertation Zhuang, Jiali 15 September 2015 A comprehensive understanding of how genetic variants and mutations contribute to phenotypic variations and alterations requires experimental technologies and analytical methodologies that are able to detect genetic variants/mutations from various biological samples in a timely and accurate manner. High-throughput sequencing technology represents the latest achievement in a series of efforts to facilitate genetic variant discovery and genotyping and promises to transform the way we tackle healthcare and biomedical problems. The tremendous amount of data generated by this new technology, however, needs to be processed and analyzed accurately and efficiently in order to fully harness its potential. Structural variation (SV) encompasses a wide range of genetic variations of different sizes, generated by diverse mechanisms. Due to the technical difficulties of reliably detecting SVs, their characterization lags behind that of SNPs and indels. In this dissertation I present two novel computational methods: one for detecting transposable element (TE) transpositions and the other for detecting SVs in general using a local assembly approach. Both methods are able to pinpoint breakpoint junctions at single-nucleotide resolution and estimate variant allele frequencies in the sample. I also applied those methods to study the impact of TE transpositions on genomic stability, the inheritance patterns of TE insertions in the population and the molecular mechanisms and potential functional consequences of somatic SVs in cancer genomes. Genomic Structural Variation DNA Transposable Elements Genome-Wide Association Study Dissertations, UMMS Bioinformatics Computational Biology Genomics
10	Variant Detection Using Next Generation Sequencing Data Pyon, Yoon Soo 08 March 2013 No description available. Bioinformatics Computer Science Genetics structural variation SNP next generation sequencing naive Bayes classifier

Search results