Return to search

Statistical issues in Mendelian randomization : use of genetic instrumental variables for assessing causal associations

Mendelian randomization is an epidemiological method for using genetic variationto estimate the causal effect of the change in a modifiable phenotype onan outcome from observational data. A genetic variant satisfying the assumptionsof an instrumental variable for the phenotype of interest can be usedto divide a population into subgroups which differ systematically only in thephenotype. This gives a causal estimate which is asymptotically free of biasfrom confounding and reverse causation. However, the variance of the causalestimate is large compared to traditional regression methods, requiring largeamounts of data and necessitating methods for efficient data synthesis. Additionally,if the association between the genetic variant and the phenotype is notstrong, then the causal estimates will be biased due to the “weak instrument”in finite samples in the direction of the observational association. This biasmay convince a researcher that an observed association is causal. If the causalparameter estimated is an odds ratio, then the parameter of association willdiffer depending on whether viewed as a population-averaged causal effect ora personal causal effect conditional on covariates. We introduce a Bayesian framework for instrumental variable analysis, whichis less susceptible to weak instrument bias than traditional two-stage methods,has correct coverage with weak instruments, and is able to efficiently combinegene–phenotype–outcome data from multiple heterogeneous sources. Methodsfor imputing missing genetic data are developed, allowing multiple genetic variantsto be used without reduction in sample size. We focus on the question ofa binary outcome, illustrating how the collapsing of the odds ratio over heterogeneousstrata in the population means that the two-stage and the Bayesianmethods estimate a population-averaged marginal causal effect similar to thatestimated by a randomized trial, but which typically differs from the conditionaleffect estimated by standard regression methods. We show how thesemethods can be adjusted to give an estimate closer to the conditional effect. We apply the methods and techniques discussed to data on the causal effect ofC-reactive protein on fibrinogen and coronary heart disease, concluding withan overall estimate of causal association based on the totality of available datafrom 42 studies.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:549804
Date January 2012
CreatorsBurgess, Stephen
ContributorsThompson, Simon G.
PublisherUniversity of Cambridge
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttps://www.repository.cam.ac.uk/handle/1810/242184

Page generated in 0.0021 seconds