
Discovering rare variants from populations to families

Thesis advisor: Gabor T. Marth / Partitioning an individual's phenotype into genetic and environmental components has been a major goal of genetics since the early 20th century. Formally, the proportion of phenotypic variance attributable to genetic variation in the population is known as heritability. Genome-wide association studies, which genotype common variants, have explained only a modest percentage of the variability of complex traits. Currently, there is great interest in the role rare variants play in explaining the missing heritability of complex traits. Advances in next-generation sequencing and genomic enrichment technologies over the past several years have made it feasible to re-sequence large numbers of individuals, enabling the discovery of the full spectrum of genetic variation segregating in the human population, including rare variants. The four projects that comprise my dissertation all revolve around the discovery of rare variants from next-generation sequencing datasets. In my first project, I analyzed data from the exon sequencing pilot of the 1000 Genomes Project, where I discovered variants from exome capture sequencing experiments in a worldwide sample of nearly 700 individuals. My results show that the allele frequency spectrum of the dataset has an excess of rare variants. My next project demonstrated that whole-genome amplified (WGA) DNA can be used in capture sequencing. Whole-genome amplification is a method that amplifies DNA from nanogram starting amounts of template. In two separate capture experiments, I compared variant calls derived from WGA DNA and from genomic DNA for concordance at both the site and genotype levels. WGA-derived calls show excellent concordance at both levels, suggesting that WGA DNA can be used in lieu of genomic DNA.
The results of this study have ramifications for medical sequencing experiments, where DNA stocks are a finite quantity and re-collecting samples may be too expensive or not possible. My third project kept its focus on capture sequencing, but in a different context. Here, I analyzed sequencing data from a Mendelian exome study of non-sensorineural hearing loss (NSHL). A subset of six individuals (5 affected, 1 unaffected) from a family of European descent was whole-exome sequenced in an attempt to uncover the causative mutation responsible for the hearing loss phenotype in the family. Previous linkage analysis had uncovered a linkage region on chr12, but no mutations were found in previously identified candidate genes, suggesting that a novel mutation segregates in the family. Using a discrete filtering approach with a minor allele frequency cutoff, I uncovered a putative causative non-synonymous mutation in a gene that encodes a transmembrane protein. The variant segregates perfectly with the phenotype in the family and is enriched in frequency in an unrelated cohort of individuals. Finally, for my last project I implemented a variant calling method for family sequencing datasets, named Pgmsnp, which incorporates the Mendelian relationships of family members using a Bayesian network inference algorithm. My method achieves detection sensitivity similar to that of other pedigree-aware callers and increases the power of detection for non-founder individuals. / Thesis (PhD), Boston College, 2013. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Biology.
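
The discrete filtering step mentioned in the abstract (restrict to the linkage region, keep rare non-synonymous variants, require segregation with the phenotype) could be sketched roughly as follows. This is only an illustrative sketch, not the thesis's actual pipeline: the `Variant` record, the `discrete_filter` function, the 1% cutoff, the sample IDs, and the toy variants are all hypothetical, the linkage region is simplified to a whole chromosome, and "segregates perfectly" is assumed to mean that every affected individual carries the variant and no unaffected individual does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not a real library type)."""
    chrom: str
    pos: int
    effect: str           # e.g. "non-synonymous" or "synonymous"
    maf: float            # minor allele frequency in a reference population
    carriers: frozenset   # IDs of sequenced samples carrying the variant


def discrete_filter(variants, affected, unaffected, region_chrom, max_maf):
    """Keep variants that pass every filter in turn.

    Assumption: a candidate must be carried by all affected individuals
    and by no unaffected individual.
    """
    affected = frozenset(affected)
    unaffected = frozenset(unaffected)
    kept = []
    for v in variants:
        if v.chrom != region_chrom:        # outside the linkage region
            continue
        if v.effect != "non-synonymous":   # not protein-altering
            continue
        if v.maf > max_maf:                # too common in the population
            continue
        if not affected <= v.carriers:     # missing in an affected sample
            continue
        if v.carriers & unaffected:        # present in an unaffected sample
            continue
        kept.append(v)
    return kept


# Toy data only: 5 affected and 1 unaffected sample, invented variants.
affected = {"A1", "A2", "A3", "A4", "A5"}
unaffected = {"U1"}
all_aff = frozenset(affected)
toy_variants = [
    Variant("chr12", 1000, "non-synonymous", 0.001, all_aff),
    Variant("chr12", 2000, "non-synonymous", 0.200, all_aff),          # common
    Variant("chr12", 3000, "synonymous", 0.001, all_aff),              # silent
    Variant("chr12", 4000, "non-synonymous", 0.001, all_aff | {"U1"}),
    Variant("chr1", 5000, "non-synonymous", 0.001, all_aff),           # off region
    Variant("chr12", 6000, "non-synonymous", 0.001, all_aff - {"A5"}),
]

candidates = discrete_filter(toy_variants, affected, unaffected,
                             region_chrom="chr12", max_maf=0.01)
print([(v.chrom, v.pos) for v in candidates])
```

Each filter is a hard yes/no cut, which is what makes the approach "discrete"; in practice the order of the filters does not change the result, only how quickly the candidate list shrinks.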

Identifier: oai:union.ndltd.org:BOSTON/oai:dlib.bc.edu:bc-ir_101399
Date: January 2013
Creators: Indap, Amit R.
Publisher: Boston College
Source Sets: Boston College
Language: English
Detected Language: English
Type: Text, thesis
Format: electronic, application/pdf
Rights: Copyright is held by the author, with all rights reserved, unless otherwise noted.
