Background/Aims: Case-control designs are commonly employed in genetic association studies. In addition to the primary trait of interest, data on secondary traits related to the primary trait are often collected. Traditional association analyses between genetic variants and secondary traits can be biased in such cases, and several methods have been proposed to address this issue, including the inverse-probability-of-sampling-weighted (IPW) approach and the semi-parametric maximum likelihood (SPML) approach.
Methods: Here, we propose a set of new estimating-equation-based approaches that combine observed and counterfactual outcomes to provide unbiased estimation of genetic associations with secondary traits. We extend the estimating equation framework to both generalized linear models (GLMs) and non-parametric regressions, and compare it with the existing approaches.
Results: We demonstrate analytically and numerically that our proposed approach provides robust and fairly efficient unbiased estimation in all simulations we consider. Unlike existing methods, it is less sensitive to the sampling scheme and to the specification of the underlying disease model. In addition, we illustrate our new approach using two real data examples. The first analyzes the binary secondary trait diabetes under the GLM framework using a stroke case-control study. The second analyzes the continuous secondary trait serum IgE level under linear and quantile regression models using an asthma case-control study.
Conclusion: The proposed estimating equation approach accommodates a wide range of regressions, and it outperforms the existing approaches in some of the scenarios we consider.
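
To make the secondary-trait bias and the existing IPW correction mentioned in the background concrete, the following is a minimal simulation sketch. It is not the estimating equation approach proposed in the thesis. The population size, allele frequency, effect sizes, disease model, and the helper names `simulate_case_control` and `weighted_slope` are all illustrative assumptions, and the sampling probabilities are taken as known from the design.

```python
import numpy as np

def simulate_case_control(n_pop=200_000, beta=0.5, seed=0):
    """Hypothetical population: genotype g, continuous secondary trait y, disease d.

    All parameter values below are assumptions chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n_pop)            # additive genotype, MAF 0.3
    y = beta * g + rng.normal(size=n_pop)           # secondary trait
    p = 1.0 / (1.0 + np.exp(-(-5.0 + 1.0 * g + 1.0 * y)))
    d = rng.random(n_pop) < p                       # disease depends on g and y

    # Case-control sampling: keep every case and an equal number of controls.
    case_idx = np.flatnonzero(d)
    control_pool = np.flatnonzero(~d)
    control_idx = rng.choice(control_pool, size=case_idx.size, replace=False)
    idx = np.concatenate([case_idx, control_idx])

    # Sampling probability of each sampled subject (assumed known by design).
    pi = np.where(d[idx], 1.0, case_idx.size / control_pool.size)
    return g[idx], y[idx], pi

def weighted_slope(g, y, w):
    """Slope of y on g from weighted least squares with weights w."""
    X = np.column_stack([np.ones_like(g, dtype=float), g])
    XtW = X.T * w                                   # weight each observation
    coef = np.linalg.solve(XtW @ X, XtW @ y)
    return coef[1]

g, y, pi = simulate_case_control()
naive = weighted_slope(g, y, np.ones_like(pi))      # ignores the sampling scheme
ipw = weighted_slope(g, y, 1.0 / pi)                # inverse-probability weights
print(f"true slope 0.5, naive {naive:.3f}, IPW {ipw:.3f}")
```

In this sketch the unweighted regression pools cases and controls as if they were a random sample, while the IPW fit reweights each subject by the inverse of its sampling probability so that the sample again resembles the source population.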
Identifier | oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/D8T15DB5 |
Date | January 2015 |
Creators | Song, Xiaoyu |
Source Sets | Columbia University |
Language | English |
Detected Language | English |
Type | Theses |