The genome-wide association study (GWAS) approach has identified novel loci for a variety of complex diseases. However, for most of these disorders, much of the heritability is not explained by this approach, which focuses on identifying common variants that are associated with disease risk. The unexplained heritability may be due to genetic or phenotypic heterogeneity or the influence of rare variants. The motivation behind this thesis was to uncover the unexplained heritability by applying joint analyses of sets of variants (gene-based association tests) and of multiple disease-related phenotypes (multivariate gene-based association tests). First, we evaluated multivariate gene-based methods for detecting association of common genetic variants with correlated phenotypes. An extensive simulation study showed that the method combining the MultiPhen and GATES software performed best in most tested scenarios, especially when correlations among phenotypes were relatively low. Second, we developed a new multivariate gene-based test for rare variants, called VEMPHAS. A simulation study
using VEMPHAS showed that this method correctly controls the type I error rate in all tested
scenarios. We applied VEMPHAS to the analysis of various phenotypes related to Alzheimer
disease (AD) and found a suggestive association (P < 4.15 × 10⁻⁶) with the gene TRIM22,
which has been identified in a previous sequencing study of AD onset in PSEN1/2
mutation carriers. We also developed software with a graphical user interface that is
designed to integrate information from different types of data sources, including
genetic data (from GWAS or sequencing), expression data (from RNA-Seq), and protein
structures (from protein data banks). This software has several features, including 1)
testing associations between genetic variants and gene expression; 2) locating the amino
acids encoded by the variants in a protein structure; and 3) retrieving the genomic locations
(chromosome and base pair positions) of amino acids of interest in the protein structure.
The last feature can be used to prioritize coding variants for gene-based association
testing. The methods and strategies developed for this dissertation project can effectively
uncover a portion of the remaining heritability of complex diseases that is unexplained by
traditional GWAS approaches.
Identifier | oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/27693 |
Date | 18 March 2018 |
Creators | Chung, Jaeyoon |
Contributors | Farrer, Lindsay A. |
Source Sets | Boston University |
Language | en_US |
Detected Language | English |
Type | Thesis/Dissertation |