Thesis advisor: Gabor T. Marth / Structural variations (SVs), like single nucleotide polymorphisms (SNPs) and short insertion-deletion polymorphisms (INDELs), are a ubiquitous feature of genomic sequences and are major contributors to human genetic diversity and disease. Because of technical limitations, namely the high data-acquisition cost and/or low detection resolution of earlier genome-scanning technologies, this source of genetic variation was not well studied until the completion of the Human Genome Project and the emergence of next-generation sequencing (NGS) technologies. The assembly of the human genome and economical high-throughput sequencing technologies have enabled the development of numerous new SV detection algorithms with unprecedented accuracy, sensitivity and precision. Although a number of SV detection programs have been developed for various SV types, such as copy number variations (CNVs), deletions, tandem duplications, inversions and translocations, some types of SVs, e.g. CNVs in capture sequencing data and mobile element insertions (MEIs), have undergone limited study. This is a result of the lack of suitable statistical models and computational approaches, e.g. efficient mapping methods for handling reads from mobile element (ME) sequences that align to multiple locations. The focus of my dissertation was to identify and characterize CNVs in capture sequencing data and MEIs in large-scale whole-genome sequencing data. This was achieved by building sophisticated statistical models and developing efficient algorithms and analysis methods for NGS data. In Chapter 2, I present a novel algorithm that uses the read depth (RD) signal to detect CNVs in deep-coverage exon capture sequencing data originally generated for SNP discovery. We were among the first to tackle this problem.
In Chapter 3, I present a fast, convenient and memory-efficient program, Tangram, that integrates read-pair (RP) and split-read (SR) signals to detect and genotype MEI events. Based on results from both simulated and experimental data, Tangram has superior sensitivity, specificity, breakpoint resolution and genotyping accuracy compared with other recently published MEI detection methods. Lastly, Chapter 4 summarizes my PhD work on SV detection in human genomes and describes future directions for genetic variant research. / Thesis (PhD) — Boston College, 2013. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Biology.
Identifier | oai:union.ndltd.org:BOSTON/oai:dlib.bc.edu:bc-ir_101332 |
Date | January 2013 |
Creators | Wu, Jiantao |
Publisher | Boston College |
Source Sets | Boston College |
Language | English |
Detected Language | English |
Type | Text, thesis |
Format | electronic, application/pdf |
Rights | Copyright is held by the author, with all rights reserved, unless otherwise noted. |