Global ETD Search

Semi-Supervised Self-Learning on Imbalanced Data Sets

Semi-supervised self-learning algorithms have been shown to improve classifier accuracy under a variety of conditions. In this thesis, semi-supervised self-learning using ensembles of random forests and fuzzy c-means clustering similarity was applied to three data sets to show where improvement is possible over random forests alone. Two of the data sets are emulations of large simulations in which the data may be distributed. Additionally, the ratio of majority to minority class examples in the training set was altered to examine the effect of training set bias on performance when applying the semi-supervised algorithm.

Identifier	oai:union.ndltd.org:USF/oai:scholarcommons.usf.edu:etd-2685
Date	05 April 2010
Creators	Korecki, John Nicholas
Publisher	Scholar Commons
Source Sets	University of South Florida
Detected Language	English
Type	text
Format	application/pdf
Source	Graduate Theses and Dissertations
Rights	default
