Data-Driven Supervised Classifiers in High-Dimensional Spaces: Application on Gene Expression Data
Efrem, Nabiel H. January 2024
Several ready-to-use supervised classifiers predict well in large-sample cases, but the same generally cannot be expected in high-dimensional settings. One explanation is that classical supervised learning theory was not developed for high-dimensional spaces, which leaves several classifiers struggling against the curse of dimensionality. A rise in parsimonious classification procedures, particularly techniques incorporating feature selectors, can be observed. Such a technique can be interpreted as a two-step procedure: an arbitrary selector first obtains a feature subset independently of a ready-to-use model, and unlabelled instances are subsequently classified within the selected subset. Accounts of the two-step procedure are often heavy on motivation, while theoretical and algorithmic descriptions are frequently overlooked. In this thesis, we aim to describe the theoretical and algorithmic framework for employing a feature selector as a pre-processing step for the Support Vector Machine and to assess its validity in high-dimensional settings. The validity of the proposed classifier is evaluated on predictive performance through a comparative study with a state-of-the-art algorithm designed for advanced learning tasks. The chosen algorithm effectively employs feature relevance during training, making it suitable for high-dimensional settings. The results suggest that the proposed classifier achieves better predictive performance than the Support Vector Machine in lower input dimensions; however, its performance tends to converge rapidly towards that of the Support Vector Machine for input dimensions beyond a certain threshold. Additionally, the thesis could not establish that either the chosen state-of-the-art algorithm or the proposed classifier strictly outperforms the other. Nonetheless, the state-of-the-art algorithm exhibits a more balanced performance across both labels.
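The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the t-like ranking score, the linear Support Vector Machine fitted by stochastic subgradient descent, the synthetic data, and every function and parameter name below are assumptions chosen only to keep the example self-contained. The selector is fitted on the training instances alone, and unlabelled instances are then classified within the selected subset.

```python
import random
from statistics import mean, pvariance


def select_features(X, y, k):
    """Step 1 (assumed selector): rank features by a two-sample t-like score, keep the top k."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        spread = (pvariance(a) / len(a) + pvariance(b) / len(b)) ** 0.5
        scores.append(abs(mean(a) - mean(b)) / (spread + 1e-12))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(X, idx):
    """Restrict every instance to the selected feature subset."""
    return [[row[j] for j in idx] for row in X]


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=200, seed=0):
    """Step 2: linear soft-margin SVM via stochastic subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # The L2 penalty always shrinks w; the hinge term acts only inside the margin.
            w = [wj - lr * lam * wj for wj in w]
            if margin < 1:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(X, w, b):
    """Label each instance by the sign of the decision function."""
    return [1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1 for row in X]


# Synthetic high-dimensional data (assumption): 60 instances, 200 features,
# of which only the first 5 carry a class-dependent mean shift.
rng = random.Random(0)
n_samples, n_features, k = 60, 200, 5
y = [1 if i % 2 == 0 else -1 for i in range(n_samples)]
X = [[rng.gauss(0, 1) + (1.5 * label if j < 5 else 0.0) for j in range(n_features)]
     for label in y]
X_train, y_train, X_test, y_test = X[:40], y[:40], X[40:], y[40:]

selected = select_features(X_train, y_train, k)            # selector sees training data only
w, b = train_linear_svm(project(X_train, selected), y_train)
pred = predict(project(X_test, selected), w, b)
accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
print("selected features:", selected, "test accuracy:", accuracy)
```

Fitting the selector only on the training split matters in this setting: choosing features with the test labels in view would bias the estimated predictive performance upward.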