Estimating the Local False Discovery Rate via a Bootstrap Solution to the Reference Class Problem: Application to Genetic Association Data

Modern scientific technologies such as microarrays, imaging devices, genome-wide association studies, and social science surveys provide statisticians with hundreds or even thousands of tests to consider simultaneously. Testing many thousands of null hypotheses can increase the number of Type I errors. In large-scale hypothesis testing, researchers can use statistical techniques such as family-wise error rates, false discovery rates, permutation methods, and the local false discovery rate, under which all available data are usually analyzed together. In applications, the thousands of tests are often related by a scientifically meaningful structure. Ignoring that structure can be misleading, as it may increase the number of false positives and false negatives. For example, in genome-wide association studies each test corresponds to a specific genetic marker, and the relevant structure for each marker can be its minor allele frequency.
In this research, the local false discovery rate is used to analyze the thousands of tests together. We present a model for multiple hypothesis testing in which the scientific structure of each test is incorporated as a covariate, with the aim of improving the performance of testing procedures. The method yields different estimates depending on a tuning parameter, whose optimal value we estimate from the observed statistics. Among the candidate estimators, the bootstrap is used to choose the one that minimizes the estimated error due to bias and variance. This estimation method is called the adaptive reference class method. By contrast, under the combined reference class method, the effect of the covariates is ignored and all null hypotheses are analyzed together.
Under some assumptions on the covariates and the prior probabilities, the proposed adaptive reference class method shows smaller error than the combined reference class method in estimating the local false discovery rate as the number of tests grows large. We apply the adaptive reference class method to coronary artery disease data, and we use simulated data to evaluate the performance of the associated estimator.
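For reference, under the standard two-group mixture model the local false discovery rate of a test statistic $z$ can be written as below. The notation is an assumption, since the abstract does not fix one: $\pi_0$ is the prior probability that a null hypothesis is true, $f_0$ and $f_1$ are the null and alternative densities of the statistic, and $x$ denotes the covariate (for example, the minor allele frequency of a marker).

$$
f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z), \qquad \mathrm{lfdr}(z) = \frac{\pi_0 f_0(z)}{f(z)}
$$

One way to read "incorporating the covariate", offered as a sketch and not necessarily the exact model of the thesis, is to let the prior probability depend on $x$:

$$
\mathrm{lfdr}(z \mid x) = \frac{\pi_0(x)\, f_0(z)}{\pi_0(x)\, f_0(z) + \bigl(1 - \pi_0(x)\bigr) f_1(z)}
$$

In this reading, the combined reference class would correspond to treating $\pi_0(x)$ as constant in $x$, so that the second expression reduces to the first.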

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/33367
Date: January 2015
Creators: Abbas Aghababazadeh, Farnoosh
Contributors: Bickel, David; Alvo, Mayer
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d’Ottawa
Language: English
Detected Language: English
Type: Thesis
