Traditional multi-class classification problems assume that each instance is associated with a single label from category set Y where |Y| > 2. Multi-label classification generalizes multi-class classification by allowing each instance to be associated with multiple labels from Y. In many real world data analysis problems, data objects can be assigned into multiple categories and hence produce multi-label classification problems. For example, an image for object categorization can be labeled as 'desk' and 'chair' simultaneously if it contains both objects. A news article talking about the effect of Olympic games on tourism industry might belong to multiple categories such as 'sports', 'economy', and 'travel', since it may cover multiple topics. Regardless of the approach used, multi-label learning in general requires a sufficient amount of labeled data to recover high quality classification models. However due to the label sparsity, i.e. each instance only carries a small number of labels among the label set Y, it is difficult to prepare sufficient well-labeled data for each class. Many approaches have been developed in the literature to overcome such challenge by exploiting label correlation or label dependency. In this dissertation, we propose a probabilistic model to capture the pairwise interaction between labels so as to alleviate the label sparsity. Besides of the traditional setting that assumes training data is fully labeled, we also study multi-label learning under other scenarios. For instance, training data can be unreliable due to missing values. A conditional Restricted Boltzmann Machine (CRBM) is proposed to take care of such challenge. Furthermore, labeled training data can be very scarce due to the cost of labeling but unlabeled data are redundant. We proposed two novel multi-label learning algorithms under active setting to relieve the pain, one for standard single level problem and one for hierarchical problem. Our empirical results on multiple multi-label data sets demonstrate the efficacy of the proposed methods. / Computer and Information Science
Identifer | oai:union.ndltd.org:TEMPLE/oai:scholarshare.temple.edu:20.500.12613/3182 |
Date | January 2015 |
Creators | Li, Xin |
Contributors | Guo, Yuhong, Vucetic, Slobodan, Dragut, Eduard Constantin, Dong, Yuexiao |
Publisher | Temple University. Libraries |
Source Sets | Temple University |
Language | English |
Detected Language | English |
Type | Thesis/Dissertation, Text |
Format | 97 pages |
Rights | IN COPYRIGHT- This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available., http://rightsstatements.org/vocab/InC/1.0/ |
Relation | http://dx.doi.org/10.34944/dspace/3164, Theses and Dissertations |
Page generated in 0.0064 seconds