
A Comparison of SVM Classifiers with Embedded Feature Selection

Since their introduction in 1995, Support Vector Machines (SVMs) have become a widely employed machine learning model for binary classification, owing to their explainable architecture, efficient inference, and good generalization ability. A common desire, not only for SVMs but for machine learning classifiers in general, is to have the model perform feature selection, using only a limited subset of the available attributes in its predictions. Various alterations to the SVM problem formulation address this, and in this report we compare a range of such SVM models. We compare accuracy and feature selection across the models on different datasets, both real and synthetic, and we also investigate the impact of dataset size on these quantities. We conclude that models trained to classify samples based on a smaller subset of features tend to perform at a level comparable to dense models, with a particular advantage when the dataset is small. Furthermore, as the training dataset grows in size, the number of selected features also increases, yielding a more complex classifier when a larger data supply is available.
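As an illustration of the kind of embedded feature selection the abstract refers to, the sketch below fits an L1-penalized linear SVM with scikit-learn and counts how many features receive non-zero weights. This is only one common formulation of sparsity-inducing SVMs, chosen here for illustration; it is not necessarily among the specific model variants compared in the thesis, and the dataset is synthetic.

# Minimal sketch of embedded feature selection via an L1-penalized linear SVM.
# Illustrative only; not the thesis authors' exact models.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic binary classification data: 50 features, only 5 of them informative.
X, y = make_classification(n_samples=500, n_features=50, n_informative=5,
                           n_redundant=0, random_state=0)

# The L1 penalty drives many weights exactly to zero, so feature selection
# happens inside the training of the classifier itself.
clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=0.1)
clf.fit(X, y)

selected = np.flatnonzero(clf.coef_.ravel())
print(f"Selected {selected.size} of {X.shape[1]} features: {selected}")
print(f"Training accuracy: {clf.score(X, y):.3f}")

Increasing the number of training samples or the penalty parameter C typically lets more weights stay non-zero, which mirrors the abstract's observation that larger datasets lead to more selected features.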

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-348822
Date: January 2024
Creators: Johansson, Adam; Mattsson, Anton
Publisher: KTH, Skolan för teknikvetenskap (SCI)
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: TRITA-SCI-GRU ; 2024:128
