Global ETD Search

1	To Encourage or to Restrict: the Label Dependency in Multi-Label Learning Yang, Zhuo 06 1900 Multi-label learning addresses the problem that one instance can be associated with multiple labels simultaneously. Understanding and exploiting the Label Dependency (LD) is widely accepted as the key to building high-performance multi-label classifiers, i.e., classifiers whose abilities include, but are not limited to, generalizing well on clean data and being robust under evasion attack. From the perspective of generalization on clean data, previous works have demonstrated the advantage of exploiting LD in multi-label classification. To further verify the positive role of LD in multi-label classification and address previous limitations, we propose an approach named Prototypical Networks for Multi-Label Learning (PNML). Specifically, PNML addresses multi-label classification by estimating the positive and negative class distributions of each label in a shared nonlinear embedding space. PNML achieves State-Of-The-Art (SOTA) classification performance on clean data. From the perspective of robustness under evasion attack, we are the first to define the attackability of a multi-label classifier as the expected maximum number of decision outputs flipped by injecting budgeted perturbations into the feature distribution of the data. We denote the attackability of a multi-label classifier as C∗; evaluating C∗ empirically is an NP-hard problem. We thus develop a method named Greedy Attack Space Exploration (GASE) to estimate C∗ efficiently. More interestingly, we derive an information-theoretic upper bound on the adversarial risk faced by multi-label classifiers. The bound unveils the key factors determining the attackability of multi-label classifiers and points out the negative role of LD in multi-label classifiers’ adversarial robustness, i.e., LD helps attacks transfer across labels, which makes multi-label classifiers more attackable. Going one step further, inspired by the derived bound, we propose a Soft Attackability Estimator (SAE) and further develop Adversarial Robust Multi-label learning with regularized SAE (ARM-SAE) to improve the adversarial robustness of multi-label classifiers. This work gives a more comprehensive understanding of LD in multi-label learning. Exploiting LD should be encouraged because of its positive role in models’ generalization on clean data, but restricted because of its negative role in models’ adversarial robustness. multi-label learning label dependency adversarial evasion attackability
2	Zero-shot visual recognition via latent embedding learning Wang, Qian January 2018 Traditional supervised visual recognition methods require a great number of annotated examples for each class of interest. Collecting and annotating visual data (e.g., images and videos) can be laborious, tedious and time-consuming when the number of classes involved is very large. In addition, there are situations in which the test instances come from novel classes for which no training examples are available at the training stage. These issues can be addressed by zero-shot learning (ZSL), an emerging machine learning technique that enables the recognition of novel classes. The key issue in zero-shot visual recognition is the semantic gap between visual and semantic representations. We address this issue in this thesis from three different perspectives: visual representations, semantic representations and the learning models. We first propose a novel bidirectional latent embedding framework for zero-shot visual recognition. By learning a latent space from the visual representations and labelling information of the training examples, instances of different classes can be mapped into the latent space while preserving both visual and semantic relatedness, so that the semantic gap can be bridged. We conduct experiments on both object and human action recognition benchmarks to validate the effectiveness of the proposed ZSL framework. We then extend ZSL to the multi-label scenario, for multi-label zero-shot human action recognition based on weakly annotated video data. We employ a long short-term memory (LSTM) neural network to explore the multiple actions underlying the video data. A joint latent space is learned by two component models (i.e., the visual model and the semantic model) to bridge the semantic gap. The two component embedding models are trained alternately to optimize ranking-based objectives. Extensive experiments are carried out on two multi-label human action datasets to evaluate the proposed framework. Finally, we propose alternative semantic representations for human actions, narrowing the semantic gap from the perspective of semantic representation. A simple yet effective solution based on the exploration of web data is investigated to enhance the semantic representations for human actions. The novel semantic representations are shown to benefit zero-shot human action recognition significantly compared to traditional attributes and word vectors. In summary, we propose novel frameworks for zero-shot visual recognition that narrow and bridge the semantic gap, and achieve state-of-the-art performance in different settings on multiple benchmarks. 004
3	Multi-Label Dimensionality Reduction January 2011 Multi-label learning, which deals with data associated with multiple labels simultaneously, is ubiquitous in real-world applications. To overcome the curse of dimensionality in multi-label learning, in this thesis I study multi-label dimensionality reduction, which extracts a small number of features by removing irrelevant, redundant, and noisy information while considering the correlation among different labels. Specifically, I propose Hypergraph Spectral Learning (HSL) to perform dimensionality reduction for multi-label data by exploiting correlations among different labels using a hypergraph. The effect of regularization on the classical dimensionality reduction algorithm known as Canonical Correlation Analysis (CCA) is elucidated in this thesis. The relationship between CCA and Orthonormalized Partial Least Squares (OPLS) is also investigated. To perform dimensionality reduction efficiently for large-scale problems, two efficient implementations are proposed for a class of dimensionality reduction algorithms, including canonical correlation analysis, orthonormalized partial least squares, linear discriminant analysis, and hypergraph spectral learning. The first is a direct least squares approach, which allows the use of different regularization penalties but is applicable only under a certain assumption; the second is a two-stage approach, which can be applied in the regularization setting without any assumption. Furthermore, an online implementation for the same class of dimensionality reduction algorithms is proposed for data that arrive sequentially. A Matlab toolbox for multi-label dimensionality reduction has been developed and released. The proposed algorithms have been applied successfully to Drosophila gene expression pattern image annotation. The experimental results on several benchmark data sets in multi-label learning also demonstrate the effectiveness and efficiency of the proposed algorithms. / Dissertation/Thesis / Ph.D. Computer Science 2011 Computer Science canonical correlation analysis dimensionality reduction hypergraph spectral learning multi-label learning partial least squares
4	Multi-Label Classification Methods for Image Annotation BRHANIE, BEKALU MULLU January 2016 No description available. Image annotation Empirical study Multi-label learning classification Machine Learning Image Analysis. Computer Sciences Datavetenskap (datalogi)
5	A piRNA regulation landscape in C. elegans and a computational model to predict gene functions Chen, Hao 28 October 2020 Investigating the mechanisms that regulate genes, and the functions of those genes, is essential to understanding a biological system. This dissertation consists of two research projects under these aims: understanding the regulation mechanism of piRNAs and predicting gene functions computationally. The first project presents a piRNA regulation landscape in C. elegans. piRNAs (Piwi-interacting small RNAs) form a complex with Piwi Argonautes to maintain fertility and silence transposons in animal germlines. In C. elegans, previous studies have suggested that piRNAs tolerate mismatched pairing and in principle could target all transcripts. In this project, by computationally analyzing the chimeric reads directly captured by cross-linking piRNAs and their targets in vivo, piRNAs are found to target all germline mRNAs with microRNA-like pairing rules. The number of targeting chimeric reads correlates better with binding energy than with piRNA abundance, suggesting that piRNA concentration does not limit targeting. Furthermore, in mRNAs silenced by piRNAs, secondary small RNAs are found to accumulate at the center and ends of piRNA binding sites. In germline-expressed mRNAs, by contrast, reduced piRNA binding density and suppression of piRNA-associated secondary small RNA targeting correlate with the presence of the CSR-1 Argonaute. These findings reveal physiologically important and nuanced regulation of piRNA targets and provide evidence for a comprehensive post-transcriptional regulatory step in germline gene expression. The second project develops a computational model to predict gene function. Predicting the genes involved in a biological function facilitates many kinds of research, such as prioritizing candidates in a screening project. Following the “Guilt By Association” principle, multiple datasets are treated as biological networks and integrated under a multi-label learning framework for predicting gene functions. Specifically, the functional labels are propagated and smoothed using a label propagation method on the networks and then integrated using an “Error correction of code” multi-label learning framework, in which a “codeword” defines all the labels annotated to a specific gene. The model is then trained by finding the optimal projections between the code matrix and the biological datasets using canonical correlation analysis. Its performance is benchmarked against a state-of-the-art algorithm and the results of a large-scale screen for piRNA pathway genes in D. melanogaster. Finally, the roles of piRNA targeting in epigenetics and physiology and its cross-talk with the CSR-1 pathway are discussed, together with a survey of additional biological datasets and a discussion of benchmarking methods for gene function prediction. Bioinformatics Biological networks Error correction of code Gene function prediction Multi-label learning piRNA Regulation
6	Aprendizado de máquina multirrótulo: explorando a dependência de rótulos e o aprendizado ativo / Multi-label machine learning: exploring label dependency and active learning Cherman, Everton Alvares 10 January 2014 Traditional supervised learning methods, called single-label learning, assume that each example in a labeled dataset is associated with only one label. However, an increasing number of applications deal with examples that are associated with multiple labels. These applications require multi-label learning methods. This learning scenario introduces new challenges and demands approaches that differ from those traditionally used in single-label learning. The cost of labeling examples, already a problem in single-label learning, is even higher in the multi-label context. Developing methods to reduce this cost is a research challenge in this area. Moreover, new learning methods should also be developed to, among other things, take label dependency into account: a new characteristic present in multi-label learning problems. There is a consensus in the community that multi-label learning methods can achieve better predictive performance when label dependency is considered. The main aims of this work are related to these challenges: reducing the cost of the labeling process, and developing multi-label learning methods that exploit label dependency. In the first case, among other contributions, a new multi-label active learning method, called score dev, is proposed to reduce the cost of the multi-label labeling process. Experimental results show that score dev outperforms other methods in many domains. In the second case, a method to identify label dependency, called UBC, is proposed, as well as BR+, a method to exploit this characteristic. Results show that the BR+ method outperforms other state-of-the-art methods. Active learning Label dependency Machine learning Multi-label learning
8	Multi-label Learning under Different Labeling Scenarios Li, Xin January 2015 Traditional multi-class classification problems assume that each instance is associated with a single label from a category set Y, where |Y| > 2. Multi-label classification generalizes multi-class classification by allowing each instance to be associated with multiple labels from Y. In many real-world data analysis problems, data objects can be assigned to multiple categories and hence give rise to multi-label classification problems. For example, an image for object categorization can be labeled as 'desk' and 'chair' simultaneously if it contains both objects. A news article about the effect of the Olympic Games on the tourism industry might belong to multiple categories such as 'sports', 'economy', and 'travel', since it may cover multiple topics. Regardless of the approach used, multi-label learning in general requires a sufficient amount of labeled data to recover high-quality classification models. However, because of label sparsity, i.e., each instance carries only a small number of labels from the label set Y, it is difficult to prepare sufficient well-labeled data for each class. Many approaches have been developed in the literature to overcome this challenge by exploiting label correlation or label dependency. In this dissertation, we propose a probabilistic model that captures the pairwise interaction between labels so as to alleviate label sparsity. Besides the traditional setting, which assumes that the training data are fully labeled, we also study multi-label learning under other scenarios. For instance, training data can be unreliable due to missing values. A conditional Restricted Boltzmann Machine (CRBM) is proposed to address this challenge. Furthermore, labeled training data can be very scarce due to the cost of labeling, while unlabeled data are abundant. We propose two novel multi-label learning algorithms in the active learning setting to address this problem, one for the standard single-level problem and one for the hierarchical problem. Our empirical results on multiple multi-label data sets demonstrate the efficacy of the proposed methods. / Computer and Information Science Computer Science Artificial Intelligence Active Learning Computer Vision Hierarchical Classification Image Classification Machine Learning Multi-label Learning
