The purpose of this dissertation is to study model selection techniques specifically designed for the classification of high-dimensional data with a large number of classes. To the best of our knowledge, this problem has not previously been studied in depth. We assume that the number of components p is much larger than the number of samples n, and that only a few of those p components are useful for subsequent classification. In what follows, we introduce two Bayesian models that take different approaches to the problem: one which discards components with “almost constant” values (Model 1) and another which retains the components whose between-group variation is larger than their within-group variation (Model 2). We show that particular cases of these two models recover familiar variance-based or ANOVA-based component selection. When there are only two classes and the features are a priori independent, Model 2 reduces to the Feature Annealed Independence Rule (FAIR) introduced by Fan and Fan (2008), and it can therefore be viewed as a natural generalization of FAIR to the case of L > 2 classes. A nontrivial result of the dissertation is that the precision of feature selection under Model 2 improves as the number of classes grows. Finally, we examine the misclassification rate with and without feature selection based on Model 2.
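For intuition, the two selection mechanisms can be sketched via their simple frequentist analogues mentioned in the abstract: variance thresholding for Model 1 and a one-way ANOVA F-statistic for Model 2. The sketch below is illustrative only; the function names, the feature budget k, and the NumPy-based implementation are assumptions, not the dissertation's actual (Bayesian) models.

```python
import numpy as np

def select_by_variance(X, k):
    """Model 1 analogue: keep the k components with the largest sample
    variance, discarding those with 'almost constant' values."""
    variances = X.var(axis=0)
    return np.argsort(variances)[::-1][:k]

def select_by_anova(X, y, k):
    """Model 2 analogue: keep the k components whose between-group
    variation is largest relative to their within-group variation,
    as measured by the one-way ANOVA F-statistic."""
    classes = np.unique(y)
    n, p = X.shape
    L = len(classes)
    grand_mean = X.mean(axis=0)
    between = np.zeros(p)  # between-group sum of squares per component
    within = np.zeros(p)   # within-group sum of squares per component
    for c in classes:
        Xc = X[y == c]
        nc = Xc.shape[0]
        mean_c = Xc.mean(axis=0)
        between += nc * (mean_c - grand_mean) ** 2
        within += ((Xc - mean_c) ** 2).sum(axis=0)
    F = (between / (L - 1)) / (within / (n - L))
    return np.argsort(F)[::-1][:k]
```

For example, with a data matrix X of shape (n, p) where p is much larger than n, and class labels y taking L distinct values, select_by_anova(X, y, k) returns the indices of the k components with the largest F-statistics; with L = 2 this ranking corresponds to the two-sample setting underlying FAIR.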
Identifier | oai:union.ndltd.org:ucf.edu/oai:stars.library.ucf.edu:etd-2836
Date | 01 January 2011
Creators | Davis, Justin Kyle
Publisher | STARS
Source Sets | University of Central Florida
Language | English
Type | text
Format | application/pdf
Source | Electronic Theses and Dissertations