False Discovery Rates, Higher Criticism and Related Methods in High-Dimensional Multiple Testing

The technical advances in genomics, functional magnetic resonance imaging
and other areas of scientific research seen in the last two decades
have led to a surge of interest in multiple testing procedures.

A driving factor for innovations in the field of multiple testing has been the problem of
large-scale simultaneous testing, where the goal is to uncover low-dimensional signals
in high-dimensional data. Mathematically speaking, this means that the dimension d
is usually in the thousands while the sample size n is comparatively small (typically no
more than about 100, often due to cost constraints), a setting commonly abbreviated as d >> n.

In my thesis I look at several multiple testing problems and corresponding
procedures from a false discovery rate (FDR) perspective, a methodology originally introduced in the seminal paper by Benjamini and Hochberg (1995).
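
To make the FDR idea concrete, the classical Benjamini-Hochberg step-up procedure can be sketched in a few lines of Python. This is only an illustration of the standard procedure, not of the specific estimators developed in the thesis.

    import numpy as np

    def benjamini_hochberg(pvalues, alpha=0.05):
        """Classical BH step-up: return a boolean rejection mask that
        controls the false discovery rate at level alpha."""
        p = np.asarray(pvalues)
        m = p.size
        order = np.argsort(p)                      # ranks of the p-values, ascending
        thresholds = alpha * np.arange(1, m + 1) / m
        below = p[order] <= thresholds             # compare sorted p-values with i*alpha/m
        reject = np.zeros(m, dtype=bool)
        if below.any():
            k = np.nonzero(below)[0].max()         # largest rank meeting its threshold
            reject[order[:k + 1]] = True           # reject all hypotheses up to that rank
        return reject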

FDR analysis starts by fitting a two-component mixture model to the observed test statistics. The mixture consists of a null density, describing the uninteresting cases, and an alternative density from which the interesting cases are assumed to be drawn.
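
In the standard two-component notation, with \eta_0 the proportion of null cases, f_0 the null density and f_1 the alternative density, the mixture and the resulting local false discovery rate read

    f(z) = \eta_0 f_0(z) + (1 - \eta_0) f_1(z),
    fdr(z) = Pr(null | z) = \eta_0 f_0(z) / f(z),

so that a small fdr(z) marks a test statistic as interesting.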

In the thesis I propose a new approach to the estimation of false discovery
rates, called log-FDR. Specifically, a truncated maximum likelihood approach
yields accurate estimates of the null model. This is complemented by
constrained maximum likelihood estimation of the alternative density using
log-concave density estimation.
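
As a rough illustration of the truncated maximum likelihood idea (not the exact log-FDR algorithm of the thesis), the null parameters can be estimated from the central z-values alone by maximizing a truncated normal likelihood on an interval assumed to contain mostly null cases; the interval [-1.5, 1.5] below is an arbitrary choice for the sketch.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    def fit_truncated_null(z, lo=-1.5, hi=1.5):
        """Fit an empirical null N(mu, sigma^2) by maximum likelihood using
        only the z-values inside [lo, hi], correcting for the truncation."""
        zc = z[(z >= lo) & (z <= hi)]              # central observations, assumed mostly null

        def neg_loglik(theta):
            mu, log_sigma = theta
            sigma = np.exp(log_sigma)              # keep sigma positive
            mass = norm.cdf(hi, mu, sigma) - norm.cdf(lo, mu, sigma)
            return -(norm.logpdf(zc, mu, sigma).sum() - zc.size * np.log(mass))

        res = minimize(neg_loglik, x0=np.array([0.0, 0.0]), method="Nelder-Mead")
        return res.x[0], np.exp(res.x[1])          # (mu_hat, sigma_hat)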

A recent competitor to the FDR is the method of "Higher Criticism".
It has been strongly advocated in the context of variable selection in
classification, a problem that is deeply linked to multiple comparisons.
Hence, I also study variable selection in class prediction, which can be viewed as
a special signal identification problem. Both FDR methods and Higher Criticism
can be highly useful for signal identification. This is discussed in the context of
variable selection in linear discriminant analysis (LDA),
a popular classification method.
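
For reference, in the formulation popularized by Donoho and Jin the Higher Criticism statistic compares the ordered p-values p_(1) <= ... <= p_(d) with their expected uniform behaviour,

    HC_d = max_{1 <= i <= \alpha_0 d} \sqrt{d} (i/d - p_(i)) / \sqrt{p_(i) (1 - p_(i))},

where \alpha_0 (often 1/2) restricts the maximum to the smaller p-values; large values of HC_d indicate the presence of sparse signals.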

FDR methods are useful not only for multiple testing situations in the strict sense;
they are also applicable to related problems. I examine several applications of FDR in linear classification, presenting and extending statistical techniques for effect size estimation based on false discovery rates and showing how to use these for variable selection. The resulting fdr-effect
method for effect size estimation is shown to work as well as competing
approaches while being conceptually simple and computationally inexpensive.
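
The abstract does not spell out the estimator, but the empirical Bayes reasoning behind effect size estimation with false discovery rates can be made explicit. Under the two-component model the effect \delta is zero for null cases, so the posterior mean decomposes as

    E[\delta | z] = (1 - fdr(z)) * E[\delta | z, non-null],

i.e. the non-null effect estimate is shrunk towards zero by the factor 1 - fdr(z). The precise form of E[\delta | z, non-null] used in the fdr-effect method is not given here, so this display should be read as the general shrinkage principle rather than the exact estimator of the thesis.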

Additionally, I apply the fdr-effect method to variable selection by minimizing
the misclassification rate and show that it performs very well and leads to compact
and interpretable feature sets.
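
A minimal sketch of such a selection step, under simple assumptions: the array ranked_features below is hypothetical and is taken to hold the feature indices already ordered by decreasing shrunken effect size; the number of retained features is then chosen by the cross-validated misclassification rate of LDA.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    def select_features(X, y, ranked_features, max_k=50):
        """Keep the prefix of the ranking that minimizes the cross-validated
        misclassification rate of LDA."""
        best_k, best_err = 1, np.inf
        for k in range(1, min(max_k, len(ranked_features)) + 1):
            cols = ranked_features[:k]
            acc = cross_val_score(LinearDiscriminantAnalysis(), X[:, cols], y, cv=5).mean()
            err = 1.0 - acc                        # misclassification rate
            if err < best_err:
                best_err, best_k = err, k
        return ranked_features[:best_k], best_err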

Identifier: oai:union.ndltd.org:DRESDEN/oai:qucosa.de:bsz:15-qucosa-102676
Date: 16 January 2013
Creators: Klaus, Bernd
Contributors: Universität Leipzig, Professor Dr. Anne-Laure Boulesteix, Professor Dr. Bernd Kirstein
Publisher: Universitätsbibliothek Leipzig
Source Sets: Hochschulschriftenserver (HSSS) der SLUB Dresden
Language: English
Detected Language: English
Type: doc-type:doctoralThesis
Format: application/pdf
