Quantifying the stability of feature selection

Feature selection is central to modern data science, from exploratory data analysis to predictive model-building. The "stability" of a feature selection algorithm refers to the robustness of its feature preferences with respect to data sampling and to its stochastic nature. An algorithm is "unstable" if a small change in the data leads to large changes in the chosen feature subset. Whilst the idea is simple, quantifying it has proven more challenging: we note numerous proposals in the literature, each with a different motivation and justification. We present a rigorous statistical and axiomatic treatment of this issue. In particular, with this work we consolidate the literature and provide (1) a deeper understanding of existing work based on a small set of properties, and (2) a clearly justified statistical approach with several novel benefits. This approach serves to identify a stability measure that obeys all desirable properties and (for the first time in the literature) allows confidence intervals and hypothesis tests on the stability of an approach, enabling rigorous comparison of feature selection algorithms.
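
To make the notion of stability concrete, the following is a minimal sketch of one simple way to score it: the average pairwise Jaccard similarity between the feature subsets chosen on different resamples of the data. This is an illustrative stand-in only, not the measure proposed in the thesis, and the function names and example subsets are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature subsets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(subsets):
    """Average Jaccard similarity over all pairs of selected feature subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections (feature indices) from three resampled datasets.
stable_runs = [{0, 1, 2}, {0, 1, 2}, {0, 1, 2}]      # same subset every time
unstable_runs = [{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]    # no overlap at all

print(mean_pairwise_jaccard(stable_runs))
print(mean_pairwise_jaccard(unstable_runs))
```

Under this score, an algorithm that returns the same subset on every resample sits at the top of the scale and one that returns disjoint subsets sits at the bottom. A score of this kind is only a point estimate; the abstract's contribution is a measure that additionally supports confidence intervals and hypothesis tests.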

https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.740328

004

Identifier	oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:740328
Date	January 2018
Creators	Nogueira, Sarah
Contributors	Brown, Gavin ; Shapiro, Jonathan
Publisher	University of Manchester
Source Sets	Ethos UK
Detected Language	English
Type	Electronic Thesis or Dissertation
Source	https://www.research.manchester.ac.uk/portal/en/theses/quantifying-the-stability-of-feature-selection(6b69098a-58ee-4182-9a30-693d714f0c9f).html