Return to search

Multi-label Learning under Different Labeling Scenarios

Traditional multi-class classification problems assume that each instance is associated with a single label from category set Y where |Y| > 2. Multi-label classification generalizes multi-class classification by allowing each instance to be associated with multiple labels from Y. In many real world data analysis problems, data objects can be assigned into multiple categories and hence produce multi-label classification problems. For example, an image for object categorization can be labeled as 'desk' and 'chair' simultaneously if it contains both objects. A news article talking about the effect of Olympic games on tourism industry might belong to multiple categories such as 'sports', 'economy', and 'travel', since it may cover multiple topics. Regardless of the approach used, multi-label learning in general requires a sufficient amount of labeled data to recover high quality classification models. However due to the label sparsity, i.e. each instance only carries a small number of labels among the label set Y, it is difficult to prepare sufficient well-labeled data for each class. Many approaches have been developed in the literature to overcome such challenge by exploiting label correlation or label dependency. In this dissertation, we propose a probabilistic model to capture the pairwise interaction between labels so as to alleviate the label sparsity. Besides of the traditional setting that assumes training data is fully labeled, we also study multi-label learning under other scenarios. For instance, training data can be unreliable due to missing values. A conditional Restricted Boltzmann Machine (CRBM) is proposed to take care of such challenge. Furthermore, labeled training data can be very scarce due to the cost of labeling but unlabeled data are redundant. We proposed two novel multi-label learning algorithms under active setting to relieve the pain, one for standard single level problem and one for hierarchical problem. Our empirical results on multiple multi-label data sets demonstrate the efficacy of the proposed methods. / Computer and Information Science

Identiferoai:union.ndltd.org:TEMPLE/oai:scholarshare.temple.edu:20.500.12613/3182
Date January 2015
CreatorsLi, Xin
ContributorsGuo, Yuhong, Vucetic, Slobodan, Dragut, Eduard Constantin, Dong, Yuexiao
PublisherTemple University. Libraries
Source SetsTemple University
LanguageEnglish
Detected LanguageEnglish
TypeThesis/Dissertation, Text
Format97 pages
RightsIN COPYRIGHT- This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available., http://rightsstatements.org/vocab/InC/1.0/
Relationhttp://dx.doi.org/10.34944/dspace/3164, Theses and Dissertations

Page generated in 0.0064 seconds