Supervised Classification Leveraging Refined Unlabeled Data

This thesis focuses on how unlabeled data can improve supervised learning classifiers across settings that range from scarce to abundant labels. It is meant to address the limitations of supervised learning with regard to label availability. Extending the training set with unlabeled data can overcome issues such as selection bias, noise, and insufficient data. Based on the overall data distribution and the initial set of labels, semi-supervised methods provide labels for additional data points. The semi-supervised approaches considered in this thesis belong to one of the following categories: transductive SVMs, Cluster-then-Label, and graph-based techniques. Further, we evaluate the behavior of logistic regression, the single-layer perceptron, SVMs, and decision trees. By learning on the extended training set, supervised classifiers are able to generalize better. Based on the results, this thesis recommends data-processing and algorithmic solutions appropriate to real-world situations.
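
As a rough illustration of the Cluster-then-Label category named in the abstract, the following Python sketch clusters labeled and unlabeled points together and gives each unlabeled point the majority label of its cluster. It is not the thesis's implementation: the use of k-means, the majority vote, the deterministic initialization, the toy data, and the names `kmeans` and `cluster_then_label` are all assumptions made for this example.

```python
from collections import Counter
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. Returns one cluster index per point.

    Assumption for determinism: the first k points serve as initial centroids.
    """
    centroids = [tuple(p) for p in points[:k]]

    def nearest(p):
        return min(range(k), key=lambda c: dist(p, centroids[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    # Final assignment against the last centroids.
    return [nearest(p) for p in points]


def cluster_then_label(labeled, unlabeled, k):
    """labeled: list of (point, label); unlabeled: list of points.

    Returns the extended training set as a list of (point, label).
    """
    points = [p for p, _ in labeled] + list(unlabeled)
    assign = kmeans(points, k)

    # Count the known labels that fall into each cluster.
    votes = {}
    for (_, y), a in zip(labeled, assign):
        votes.setdefault(a, Counter())[y] += 1

    extended = list(labeled)
    for p, a in zip(unlabeled, assign[len(labeled):]):
        if a in votes:  # clusters without any labeled point stay unlabeled
            extended.append((p, votes[a].most_common(1)[0][0]))
    return extended


# Toy data (hypothetical): two labeled points and four unlabeled ones.
labeled = [((0.0, 0.0), "a"), ((5.0, 5.0), "b")]
unlabeled = [(0.2, 0.1), (0.1, 0.3), (4.8, 5.1), (5.2, 4.9)]
extended = cluster_then_label(labeled, unlabeled, k=2)
```

The resulting `extended` list can then be handed to any supervised classifier, which corresponds to the step the abstract describes as learning on the extended training set.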

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-119320
Date January 2015
Creators Bocancea, Andreea
Publisher Linköpings universitet, Institutionen för datavetenskap, Linköpings universitet, Tekniska fakulteten
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess