
Algorithms for genomics and genetics : compression-accelerated search and admixture analysis

Thesis (Ph. D.)--Massachusetts Institute of Technology, Department of Mathematics, 2013. / This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. / Cataloged from student-submitted PDF version of thesis. / Includes bibliographical references (pages 133-139). / Rapid advances in next-generation sequencing technologies are revolutionizing genomics, with data sets at the scale of thousands of human genomes fast becoming the norm. These technological leaps promise to enable corresponding advances in biology and medicine, but the deluge of raw data poses substantial mathematical, computational and statistical challenges that must first be overcome. This thesis consists of two research thrusts along these lines. First, we propose an algorithmic framework, "compressive genomics," that accelerates bioinformatic computations through analysis-aware compression. We demonstrate this methodology with proof-of-concept implementations of compression-accelerated search (CaBLAST and CaBLAT). Second, we develop new computational tools for investigating population admixture, a phenomenon of importance in understanding demographic histories of human populations and facilitating association mapping of disease genes. Our recently released ALDER and MixMapper software packages provide fast, sensitive, and robust methods for detecting and analyzing signatures of admixture created by genetic drift and recombination on genome-wide, large-sample scales. / by Po-Ru Loh. / Ph.D.

Identifier oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/83631
Date January 2013
Creators Loh, Po-Ru
Contributors Bonnie Berger., Massachusetts Institute of Technology. Department of Mathematics.
Publisher Massachusetts Institute of Technology
Source Sets M.I.T. Theses and Dissertation
Language English
Detected Language English
Type Thesis
Format 139 pages, application/pdf
Rights M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
