Feature selection is an effective technique for reducing dimensionality in the many applications where datasets involve hundreds or thousands of features. Its objective is to find an optimal subset of relevant features, so that the number of features is reduced and the learning process becomes easier to understand, without significantly decreasing overall accuracy and applicability. This thesis focuses on the consistency measure, under which a feature subset is consistent if there exists a set of more than two instances with the same feature values and the same class labels. The thesis introduces a new consistency-based algorithm, Automatic Hybrid Search (AHS), and reviews several existing feature selection algorithms (ES, PS and HS) that are based on the consistency rate. The work concludes with an empirical study that provides a comparative analysis of the different search algorithms.
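As an illustration of the consistency rate mentioned above, the following minimal sketch uses the inconsistency-count formulation common in the consistency-based feature selection literature: instances are grouped by their values on the candidate subset, and every instance outside the majority class of its group counts as inconsistent. This is an assumption made for illustration and may differ from the exact definition used in the thesis; the function name `consistency_rate` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def consistency_rate(rows, labels, subset):
    """Return 1 minus the inconsistency rate of a feature subset.

    Assumed formulation (not taken from the thesis): instances that share
    the same values on `subset` form a pattern group; within each group,
    instances that do not belong to the majority class are inconsistent.
    """
    if not rows:
        raise ValueError("rows must be non-empty")
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")

    # Map each projected pattern to a count of the class labels seen with it.
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)
        groups[pattern][label] += 1

    # Inconsistent instances per group = group size - majority class count.
    inconsistent = sum(
        sum(counts.values()) - max(counts.values())
        for counts in groups.values()
    )
    return 1.0 - inconsistent / len(rows)


# Toy data: three discrete features, two classes (hypothetical).
rows = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1)]
labels = ["a", "a", "b", "a"]

rate_single = consistency_rate(rows, labels, [0])
rate_pair = consistency_rate(rows, labels, [0, 2])
```

On this toy data, feature 0 alone leaves one instance in conflict with the majority class of its group, while features 0 and 2 together separate the classes completely. A search algorithm would compare such rates across candidate subsets and prefer the smallest subset that meets a chosen consistency threshold.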
Identifier | oai:union.ndltd.org:WKU/oai:digitalcommons.wku.edu:theses-1062 |
Date | 01 May 2009 |
Creators | Lin, Pengpeng |
Publisher | TopSCHOLAR® |
Source Sets | Western Kentucky University Theses |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Masters Theses & Specialist Projects |