Fine mapping of causal HLA variants using penalised regression

The identification of risk loci in the Human Leukocyte Antigen (HLA) region using single-SNP association tests has been hampered by the extent of linkage disequilibrium (LD). Penalised regression via the Least Absolute Shrinkage and Selection Operator (LASSO) can be used both to select variables in multi-SNP analysis and to deal with multi-collinearity among predictors. This method applies a penalty that shrinks the estimates of the regression coefficients towards zero. In a Bayesian framework, this is equivalent to placing a double exponential (DE) prior distribution with a mode at zero on the coefficients, corresponding to the prior belief that most of the effects are negligible. Parameter inference is based on the posterior mode, with non-zero values indicating marker-disease associations. Single-SNP analysis, stepwise regression and the LASSO approach were applied to case-control studies of rheumatoid arthritis, a disease which has been associated with markers from the HLA region. A generalisation of the LASSO called the HyperLasso (HLASSO), which uses the normal-exponential-gamma prior in place of the DE, was also investigated. These approaches were applied to data from the Genetics of Rheumatoid Arthritis (GoRA) study. Genotype imputation was used as a means to jointly analyse the GoRA and the Wellcome Trust Case Control Consortium (WTCCC) HLA SNPs. The North American Rheumatoid Arthritis Consortium (NARAC) study was used to validate the findings. After controlling for type-I error, the penalised approaches greatly reduced the number of positive signals compared to single-SNP analysis, suggesting that correlation among SNP loci was better handled. The HLASSO results were sparser than, but similar to, the LASSO results. One SNP in HLA-DPB1 was replicated in the NARAC study. In both models, the robustness of the retained variables was verified by bootstrapping.
The results suggest that SNP-selection using LASSO or HLASSO shows a substantial benefit in identifying risk loci in regions of high LD.
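
To make the shrinkage idea concrete, the following is a minimal sketch of LASSO-penalised logistic regression on simulated, correlated SNP genotypes. It is not the code used in the thesis: the solver (proximal gradient descent), the simulation scheme, the function names `soft_threshold`, `lasso_logistic` and `simulate_snps`, and all parameter values are assumptions chosen for illustration. Only the DE-prior LASSO is covered; the normal-exponential-gamma prior of the HLASSO is not implemented here.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink each entry of z towards zero by t; entries within [-t, t] become 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_logistic(X, y, lam, n_iter=500):
    """L1-penalised logistic regression fitted by proximal gradient descent.

    Minimises  mean negative log-likelihood + lam * sum_j |beta_j|.
    The minimiser is the posterior mode under independent double exponential
    priors on the coefficients. The intercept is left unpenalised.
    """
    n, p = X.shape
    beta = np.zeros(p)
    intercept = 0.0
    # Step size 1/L, where L bounds the curvature of the mean logistic loss:
    # L <= ||[1, X]||_2^2 / (4 n).
    design = np.hstack([np.ones((n, 1)), X])
    step = 4.0 * n / np.linalg.norm(design, 2) ** 2
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(intercept + X @ beta)))
        resid = prob - y
        intercept -= step * resid.mean()
        # Gradient step on the likelihood, then soft-threshold for the penalty.
        beta = soft_threshold(beta - step * (X.T @ resid) / n, step * lam)
    return intercept, beta


def simulate_snps(n=400, p=30, causal=5, seed=0):
    """Hypothetical toy data: p SNPs in strong mutual correlation, one causal."""
    rng = np.random.default_rng(seed)
    base = rng.binomial(2, 0.3, size=n)
    X = np.empty((n, p))
    for j in range(p):
        # Each SNP copies a shared genotype with probability 0.8 (LD-like).
        copy = rng.random(n) < 0.8
        X[:, j] = np.where(copy, base, rng.binomial(2, 0.3, size=n))
    logit = -1.0 + 0.8 * X[:, causal]
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
    return X, y


if __name__ == "__main__":
    X, y = simulate_snps()
    _, beta = lasso_logistic(X, y, lam=0.05)  # lam is an arbitrary choice
    print("SNPs retained:", np.flatnonzero(beta))
```

In practice the penalty `lam` would be tuned, for example to control the type-I error rate as described above, and the stability of the retained SNPs could be checked by refitting on bootstrap resamples.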

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:526298
Date: January 2010
Creators: Vignal, Charlotte
Contributors: Balding, David ; Bansal, Aruna
Publisher: Imperial College London
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://hdl.handle.net/10044/1/6145
