Improving Multi-label Classification by Avoiding Implicit Negativity with Incomplete Data

Many real-world problems require multi-label classification, in which each training instance is associated with a set of labels. Many learning algorithms exist for multi-label classification; however, they assume implicit negativity: any label missing from the training data is automatically treated as negative. Many of them also do not handle incremental learning, in which new labels may be encountered later in the learning process. This work proposes a novel multi-label adaptation of the backpropagation algorithm that does not assume implicit negativity. The algorithm can also infer missing labels in the training data using a naive Bayesian approach, and it can be trained incrementally because it dynamically considers new labels. The solution is compared with existing multi-label algorithms on data sets from multiple domains, with performance measured by standard multi-label evaluation metrics. The algorithm improves classification performance on all metrics by an overall average of 7.4% when at least 40% of the labels are missing from the training data, and by 18.4% when at least 90% of the labels are missing.
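To illustrate what "implicit negativity" means in practice, here is a minimal Python sketch. It is not the thesis's algorithm: it swaps in a plain masked binary cross-entropy on sigmoid outputs as a stand-in, and the names `masked_bce` and `implicit_negative`, as well as the use of `None` to mark a missing label, are assumptions made for this example only. The point is that a missing label contributes no error signal when it is masked, whereas treating it as negative pushes the output toward zero.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def masked_bce(logits, labels):
    """Mean binary cross-entropy over observed labels only (illustrative).

    labels[i] is 1 (positive), 0 (explicit negative), or None (missing).
    Returns (loss, grads), where grads[i] is d(loss)/d(logits[i]).
    """
    observed = [i for i, y in enumerate(labels) if y is not None]
    grads = [0.0] * len(logits)  # missing labels keep a zero gradient
    if not observed:
        return 0.0, grads
    loss = 0.0
    for i in observed:
        p = sigmoid(logits[i])
        y = labels[i]
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        grads[i] = (p - y) / len(observed)
    return loss / len(observed), grads

def implicit_negative(labels):
    """The assumption being avoided: every missing label becomes 0."""
    return [0 if y is None else y for y in labels]

# One instance, three labels: the first is known positive, the rest are missing.
logits = [2.0, -1.0, 0.5]
labels = [1, None, None]

loss_masked, grad_masked = masked_bce(logits, labels)
loss_implicit, grad_implicit = masked_bce(logits, implicit_negative(labels))
```

With masking, the two missing labels receive a zero gradient, so an output-layer update would leave them untouched. Under implicit negativity, the third output (logit 0.5) receives a positive gradient that drives it down even though its true label is unknown. How the thesis handles this inside backpropagation, and how its naive Bayesian inference fills in missing labels, is not specified in the abstract.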

Identifier: oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-3843
Date: 11 October 2011
Creators: Heath, Derrall L.
Publisher: BYU ScholarsArchive
Source Sets: Brigham Young University
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations
Rights: http://lib.byu.edu/about/copyright/
