
Hypothesis Testing for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits

acase@tulane.edu / Extreme phenotype sampling (EPS) is a widely used design for identifying candidate genetic factors that contribute to the variation of quantitative traits. By enriching the signal in samples drawn from the top and bottom percentiles of the phenotype distribution, EPS can boost study power compared with random sampling at the same sample size. Existing statistical methods for EPS data test variants or regions individually. However, many disorders are caused by multiple genetic factors. It is therefore critical to model the effects of genetic factors simultaneously, which may increase the power of current genetic studies and identify novel disease-associated genetic factors in EPS. The challenge of simultaneous analysis is that the number of genetic factors (p ~10,000) is typically greater than the sample size (n ~1,000) in a single study. The standard linear model is inappropriate for this p>n problem because the design matrix is rank deficient. An alternative is to apply a penalized regression method, the least absolute shrinkage and selection operator (LASSO).
LASSO can handle this high-dimensional (p>n) problem by forcing certain regression coefficients to be zero. Although the application of LASSO in genetic studies under random sampling has been widely studied, its statistical inference and testing under EPS remain unknown. We propose a novel sparse model (EPS-LASSO) with a hypothesis test for high-dimensional regression under EPS, based on a decorrelated score function, to investigate genetic associations, including gene expression and rare variant analyses. Comprehensive simulations show that EPS-LASSO outperforms existing methods, with superior power when the effects are large and with stable type I error and FDR control. Together with a real data analysis from a genetic study of obesity, our results indicate that EPS-LASSO is an effective method for EPS data analysis that can account for correlated predictors. / 1 / Chao Xu

  1. tulane:78817
Identifier: oai:union.ndltd.org:TULANE/oai:http://digitallibrary.tulane.edu/:tulane_78817
Date: January 2018
Contributors: Xu, Chao (author), Deng, Hong-Wen (Thesis advisor), School of Public Health & Tropical Medicine Biostatistics and Bioinformatics (Degree granting institution)
Publisher: Tulane University
Source Sets: Tulane University
Language: English
Detected Language: English
Type: Text
Format: electronic, 88
Rights: No embargo, Copyright is in accordance with U.S. Copyright law.
