Global ETD Search

Return to search

Improving Multi-label Classification by Avoiding Implicit Negativity with Incomplete Data

Many real world problems require multi-label classification, in which each training instance is associated with a set of labels. There are many existing learning algorithms for multi-label classification; however, these algorithms assume implicit negativity, where missing labels in the training data are automatically assumed to be negative. Additionally, many of the existing algorithms do not handle incremental learning in which new labels could be encountered later in the learning process. A novel multi-label adaptation of the backpropagation algorithm is proposed that does not assume implicit negativity. In addition, this algorithm can, using a naive Bayesian approach, infer missing labels in the training data. This algorithm can also be trained incrementally as it dynamically considers new labels. This solution is compared with existing multi-label algorithms using data sets from multiple domains and the performance is measured with standard multi-label evaluation metrics. It is shown that our algorithm improves classification performance for all metrics by an overall average of 7.4% when at least 40% of the labels are missing from the training data, and improves by 18.4% when at least 90% of the labels are missing.

implicit negativity

multi-label classification

thesis

Computer Sciences

Identifer	oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-3843
Date	11 October 2011
Creators	Heath, Derrall L.
Publisher	BYU ScholarsArchive
Source Sets	Brigham Young University
Detected Language	English
Type	text
Format	application/pdf
Source	Theses and Dissertations
Rights	http://lib.byu.edu/about/copyright/

Page generated in 0.0025 seconds

Improving Multi-label Classification by Avoiding Implicit Negativity with Incomplete Data

Description

Links & Downloads

Tags

Additional Fields