31

Construções de comitês de classificadores multirrótulos no aprendizado semissupervisionado multidescrição

Silva, Wilamis Kleiton Nunes da 18 August 2017 (has links)
Multi-label problems, in which a label can be assigned to more than one instance, have become increasingly common; these are called multi-label classification problems. Among the different multi-label classification methods we can mention BR (Binary Relevance), LP (Label Powerset) and RAkEL (RAndom k-labELsets). These are known as problem-transformation methods, since they turn the multi-label problem into several traditional (single-label) classification problems. The adoption of classifier committees in multi-label classification problems, however, is still recent, leaving a broad field open for research. This work studies the construction of multi-label classifier committees built through the application of multi-view (multi-description) semi-supervised learning techniques, in order to verify whether applying this type of learning to committee construction improves the results. The classifier committees used in the experiments were Bagging, Boosting and Stacking; the problem-transformation methods used were BR, LP and RAkEL; and Co-Training was used for multi-view semi-supervised multi-label classification. The experimental analysis showed that the semi-supervised approach produced satisfactory results, since the supervised and semi-supervised approaches used in the work presented similar results
/ São cada vez mais comuns problemas multirrótulo onde um rótulo pode ser atribuído a mais de uma instância, sendo chamados de problemas de classificação multirrótulo. Dentre os diferentes métodos de classificação multirrótulo, podemos citar os métodos BR (Binary Relevance), LP (Label Powerset) e RAkEL (RAndom k-labELsets). Tais métodos são ditos métodos de transformação do problema, pois consistem em transformar o problema multirrótulo em vários problemas de classificação tradicional (monorrótulo). A adoção de comitês de classificadores em problemas de classificação multirrótulo ainda é algo muito recente, com muito a ser explorado para a realização de pesquisas. O objetivo deste trabalho é realizar um estudo sobre a construção de comitês de classificadores multirrótulos construídos através da aplicação das técnicas de aprendizado semissupervisionado multidescrição, a fim de verificar se a aplicação desse tipo de aprendizado na construção de comitês acarreta melhorias nos resultados. Os comitês de classificadores utilizados nos experimentos foram o Bagging, o Boosting e o Stacking; como métodos de transformação do problema foram utilizados os métodos BR, LP e RAkEL; e para a classificação multirrótulo semissupervisionada multidescrição foi utilizado o Co-Training. Ao fim das análises experimentais, verificou-se que a utilização da abordagem semissupervisionada apresentou resultados satisfatórios, uma vez que as duas abordagens, supervisionada e semissupervisionada, utilizadas no trabalho apresentaram resultados semelhantes
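As a concrete illustration of the problem-transformation idea above, the sketch below implements the Binary Relevance (BR) strategy: one independent binary classifier per label. The base learner, data and numbers are placeholders, not the setup used in the dissertation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


class BinaryRelevance:
    """Binary Relevance: train one independent binary classifier per label."""

    def __init__(self, base_factory=lambda: LogisticRegression(max_iter=1000)):
        self.base_factory = base_factory
        self.models_ = []

    def fit(self, X, Y):
        # Y is a binary indicator matrix of shape (n_samples, n_labels).
        self.models_ = [self.base_factory().fit(X, Y[:, j]) for j in range(Y.shape[1])]
        return self

    def predict(self, X):
        # Stack the per-label binary predictions back into an indicator matrix.
        return np.column_stack([m.predict(X) for m in self.models_])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    # Synthetic 3-label problem: each label depends on a different feature.
    Y = np.column_stack([(X[:, k] > 0) for k in range(3)]).astype(int)
    pred = BinaryRelevance().fit(X, Y).predict(X)
    print("training subset accuracy:", (pred == Y).all(axis=1).mean())
```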
32

Utilizando aprendizado semissupervisionado multidescrição em problemas de classificação hierárquica multirrótulo

Araújo, Hiury Nogueira de 17 November 2017 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Data classification is a task applied in several areas of knowledge and is therefore the focus of ongoing research. Classification tasks can be divided according to the available data, which may be labeled or unlabeled. One approach has proven very effective when working with data sets containing both labeled and unlabeled data: semi-supervised learning, whose objective is to label the unlabeled data using the labeled data contained in the set, improving the success rate. Such data can also be assigned more than one label, which is known as multi-label classification. Furthermore, the data can be organized hierarchically, with relations among the classes, which is called hierarchical classification. This work proposes the use of multi-view (multi-description) semi-supervised learning, one of the branches of semi-supervised learning, in hierarchical multi-label classification problems, with the objective of investigating whether semi-supervised learning is an appropriate approach to solve the problem of the low dimensionality of the data. An experimental analysis of the methods showed that supervised learning performed better than the semi-supervised approaches; even so, semi-supervised learning may yet become a widely used approach, since there is still much to be contributed in this area
/ A classificação de dados é uma tarefa aplicada em diversas áreas do conhecimento, sendo assim, foco de constantes pesquisas. A classificação de dados pode ser dividida de acordo com a disposição dos dados, sendo estes rotulados ou não rotulados. Uma abordagem vem se mostrando bastante eficiente ao se trabalhar com conjuntos de dados contendo dados rotulados e não rotulados, esta chamada de aprendizado semissupervisionado, cujo objetivo é classificar os dados não rotulados através da quantidade de dados rotulados contidos no conjunto, melhorando sua taxa de acerto. Tais dados podem ser classificados com mais de um rótulo, o que é conhecido como classificação multirrótulo. Além disso, estes dados podem estar organizados de forma hierárquica, contendo assim uma relação entre os mesmos, esta, por sua vez, denominada classificação hierárquica. Neste trabalho é proposta a utilização do aprendizado semissupervisionado multidescrição, que é uma das vertentes do aprendizado semissupervisionado, em problemas de classificação hierárquica multirrótulo, com o objetivo de investigar se o aprendizado semissupervisionado é uma abordagem apropriada para resolver o problema de baixa dimensionalidade de dados. Uma análise experimental dos métodos verificou que o aprendizado supervisionado obteve melhor desempenho em relação às abordagens semissupervisionadas; contudo, o aprendizado semissupervisionado pode vir a ser uma abordagem amplamente utilizada, pois há bastante a ser contribuído nesta área
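To make the multi-view (multi-description) semi-supervised idea concrete, here is a minimal single-label Co-Training loop in the spirit of the method used in these works: two classifiers, one per feature view, take turns labeling the unlabeled examples they are most confident about. The views, base learner and data are synthetic placeholders; the dissertation applies the idea to hierarchical multi-label data.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB


def co_training(X1, X2, y, n_rounds=10, per_round=5):
    """Basic Co-Training: two classifiers, one per view, take turns labeling
    the unlabeled examples (marked -1 in y) they are most confident about."""
    y = y.copy()
    clf1, clf2 = GaussianNB(), GaussianNB()
    for _ in range(n_rounds):
        labeled = y != -1
        clf1.fit(X1[labeled], y[labeled])
        clf2.fit(X2[labeled], y[labeled])
        unlabeled = np.where(~labeled)[0]
        for clf, X in ((clf1, X1), (clf2, X2)):
            if len(unlabeled) == 0:
                break
            proba = clf.predict_proba(X[unlabeled])
            top = unlabeled[np.argsort(-proba.max(axis=1))[:per_round]]
            y[top] = clf.predict(X[top])          # pseudo-label for the next round
            unlabeled = np.setdiff1d(unlabeled, top)
    return clf1, clf2, y


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    truth = rng.integers(0, 2, size=300)
    X1 = truth[:, None] + rng.normal(scale=0.8, size=(300, 4))   # view 1
    X2 = truth[:, None] + rng.normal(scale=0.8, size=(300, 6))   # view 2
    y = np.full(300, -1)
    y[:20] = truth[:20]                                          # only 20 labeled examples
    _, _, y_new = co_training(X1, X2, y)
    newly = (y == -1) & (y_new != -1)
    print("newly labeled:", newly.sum(), "accuracy:", (y_new[newly] == truth[newly]).mean())
```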
33

Redes neurais e algoritmos genéticos para problemas de classificação hierárquica multirrótulo / Neural networks and genetic algorithms for hierarchical multi-label classification

Ricardo Cerri 05 December 2013 (has links)
Em problemas convencionais de classificação, cada exemplo de um conjunto de dados é associado a apenas uma dentre duas ou mais classes. No entanto, existem problemas de classificação mais complexos, nos quais as classes envolvidas no problema são estruturadas hierarquicamente, possuindo subclasses e superclasses. Nesses problemas, exemplos podem ser atribuídos simultaneamente a classes pertencentes a dois ou mais caminhos de uma hierarquia, ou seja, exemplos podem ser classificados em várias classes localizadas em um mesmo nível hierárquico. Tal hierarquia pode ser estruturada como uma árvore ou como um grafo acíclico direcionado. Esses problemas são chamados de problemas de classificação hierárquica multirrótulo, sendo mais difíceis devido à alta complexidade, diversidade de soluções, difícil modelagem e desbalanceamento dos dados. Duas abordagens são utilizadas para tratar esses problemas, chamadas global e local. Na abordagem global, um único classificador é induzido para lidar com todas as classes do problema simultaneamente, e a classificação de novos exemplos é realizada em apenas um passo. Já na abordagem local, um conjunto de classificadores é induzido, sendo cada classificador responsável pela predição de uma classe ou de um conjunto de classes, e a classificação de novos exemplos é realizada em vários passos, considerando as predições dos vários classificadores. Nesta Tese de Doutorado, são propostos e investigados dois métodos para classificação hierárquica multirrótulo. O primeiro deles é baseado na abordagem local, e associa uma rede neural Multi-Layer Perceptron (MLP) a cada nível da hierarquia, sendo cada MLP responsável pelas predições no seu nível associado. O método é chamado Hierarchical Multi- Label Classification with Local Multi-Layer Perceptrons (HMC-LMLP). O segundo método é baseado na abordagem global, e induz regras de classificação hierárquicas multirrótulo utilizando um Algoritmo Genético. O método é chamado Hierarchical Multi-Label Classification with a Genetic Algorithm (HMC-GA). Experimentos utilizando hierarquias estruturadas como árvores mostraram que o método HMC-LMLP obteve desempenhos de classificação superiores ao método estado-da-arte na literatura, e desempenhos superiores ou competitivos quando utilizando hierarquias estruturadas como grafos. O método HMC-GA obteve resultados competitivos com outros métodos da literatura em hierarquias estruturadas como árvores e grafos, sendo capaz de induzir, em muitos casos, regras menores e em menor quantidade / conventional classification problems, each example of a dataset is associated with just one among two or more classes. However, there are more complex classification problems where the classes are hierarchically structured, having subclasses and superclasses. In these problems, examples can be simultaneously assigned to classes belonging to two or more paths of a hierarchy, i.e., examples can be classified in many classes located in the same hierarchical level. Such a hierarchy can be structured as a tree or a directed acyclic graph. These problems are known as hierarchical multi-label classification problems, being more difficult due to the high complexity, diversity of solutions, modeling difficulty and data imbalance. Two main approaches are used to deal with these problems, called global and local. In the global approach, only one classifier is induced to deal with all classes simultaneously, and the classification of new examples is done in just one step. 
In the local approach, a set of classifiers is induced, where each classifier is responsible for the predictions of one class or a set of classes, and the classification of new examples is done in many steps, considering the predictions of all classifiers. In this Thesis, two methods for hierarchical multi-label classification are proposed and investigated. The first one is based on the local approach, and associates a Multi-Layer Perceptron (MLP) to each hierarchical level, with each MLP responsible for the predictions at its associated level. The method is called Hierarchical Multi-Label Classification with Local Multi-Layer Perceptrons (HMC-LMLP). The second method is based on the global approach, and induces hierarchical multi-label classification rules using a Genetic Algorithm. The method is called Hierarchical Multi-Label Classification with a Genetic Algorithm (HMC-GA). Experiments using hierarchies structured as trees showed that HMC-LMLP obtained classification performances superior to the state-of-the-art method in the literature, and superior or competitive performances when using graph-structured hierarchies. The HMC-GA method obtained competitive results with other methods from the literature on both tree- and graph-structured hierarchies, being able, in many cases, to induce fewer and smaller rules
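A much simplified sketch of the local (per-level) strategy behind HMC-LMLP: one multi-label MLP per hierarchical level, with a prediction kept only if its parent class was also predicted. The toy hierarchy, data and hyperparameters are invented, and the real method's chaining of level outputs is not reproduced here.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy 2-level hierarchy: level-0 classes {A, B}; level-1 classes {A1, A2, B1}.
PARENT = {"A1": "A", "A2": "A", "B1": "B"}
LEVELS = [["A", "B"], ["A1", "A2", "B1"]]


def train_per_level(X, labels_per_example):
    """Train one multi-label MLP per hierarchical level (local approach)."""
    models = []
    for level_classes in LEVELS:
        Y = np.array([[int(c in labs) for c in level_classes]
                      for labs in labels_per_example])
        clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
        clf.fit(X, Y)                      # multi-label indicator targets
        models.append(clf)
    return models


def predict_hierarchy(models, X):
    """Predict level by level, keeping a class only if its parent was predicted."""
    out = [set() for _ in range(len(X))]
    for level, (clf, level_classes) in enumerate(zip(models, LEVELS)):
        pred = np.atleast_2d(clf.predict(X))
        for i, row in enumerate(pred):
            for c, keep in zip(level_classes, row):
                if keep and (level == 0 or PARENT[c] in out[i]):
                    out[i].add(c)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 8))
    # Synthetic labels: feature 0 chooses A vs B, feature 1 the subclass of A.
    labels = [{"A", "A1"} if x[0] > 0 and x[1] > 0 else
              {"A", "A2"} if x[0] > 0 else {"B", "B1"} for x in X]
    models = train_per_level(X, labels)
    print(predict_hierarchy(models, X[:5]), labels[:5])
```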
34

Classification multi-labels graduée : découverte des relations entre les labels, et adaptation à la reconnaissance des odeurs et au contexte big data des systèmes de recommandation / Graded multi-label classification : discovery of label relations, and adaptation to odor recognition and the big data context of recommendation systems

Laghmari, Khalil 23 March 2018 (has links)
En classification multi-labels graduée (CMLG), chaque instance est associée à un ensemble de labels avec des degrés d’association gradués. Par exemple, une même molécule odorante peut être associée à une odeur forte ‘musquée’, une odeur modérée ‘animale’, et une odeur faible ‘herbacée’. L’objectif est d’apprendre un modèle permettant de prédire l’ensemble gradué de labels associé à une instance à partir de ses variables descriptives. Par exemple, prédire l’ensemble gradué d’odeurs à partir de la masse moléculaire, du nombre de liaisons doubles, et de la structure de la molécule. Un autre domaine intéressant de la CMLG est les systèmes de recommandation. En effet, les appréciations des utilisateurs par rapport à des items (produits, services, livres, films, etc) sont d’abord collectées sous forme de données MLG (l’échelle d’une à cinq étoiles est souvent utilisée). Ces données sont ensuite exploitées pour recommander à chaque utilisateur des items qui ont le plus de chance de l’intéresser. Dans cette thèse, une étude théorique approfondie de la CMLG permet de ressortir les limites des approches existantes, et d’assoir un ensemble de nouvelles approches apportant des améliorations évaluées expérimentalement sur des données réelles. Le cœur des nouvelles approches proposées est l’exploitation des relations entre les labels. Par exemple, une molécule ayant une forte odeur ‘musquée’ émet souvent une odeur faible ou modérée ‘animale’. Cette thèse propose également de nouvelles approches adaptées au cas des molécules odorantes et au cas des gros volumes de données collectées dans le cadre des systèmes de recommandation. / In graded multi-label classification (GMLC), each instance is associated with a set of labels with graded membership degrees. For example, the same odorous molecule may be associated with a strong 'musky' odor, a moderate 'animal' odor, and a weak 'grassy' odor. The goal is to learn a model to predict the graded set of labels associated with an instance from its descriptive variables, for example, predicting the graded set of odors from the molecular weight, the number of double bonds, and the structure of the molecule. Another interesting application area of GMLC is recommendation systems. In fact, users' assessments of items (products, services, books, films, etc.) are first collected in the form of GML data (a one-to-five star rating is often used). These data are then used to recommend to each user the items most likely to interest them. In this thesis, an in-depth theoretical study of GMLC highlights the limits of existing approaches and introduces a set of new approaches whose improvements are evaluated experimentally on real data. The main point of the new proposed approaches is the exploitation of the relations between labels. For example, a molecule with a strong 'musky' odor often has a weak or moderate 'animal' odor. This thesis also proposes new approaches adapted to the case of odorous molecules and to the large volumes of data collected in the context of recommendation systems.
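For a concrete picture of the graded setting, the toy sketch below assigns each label a membership degree on an ordinal scale (0 = absent to 3 = strong) and fits one classifier per label. It ignores the label-relation modelling that is the thesis's main contribution; the odour names, features and data are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

GRADES = ["absent", "weak", "moderate", "strong"]   # ordinal scale 0..3
LABELS = ["musky", "animal", "grassy"]              # illustrative odour labels


def fit_graded(X, Y):
    """One multinomial classifier per label; each predicts a membership grade 0..3.
    (A plain baseline -- real GMLC methods also model ordinality and label relations.)"""
    return [LogisticRegression(max_iter=1000).fit(X, Y[:, j]) for j in range(Y.shape[1])]


def predict_graded(models, X):
    return np.column_stack([m.predict(X) for m in models])


if __name__ == "__main__":
    rng = np.random.default_rng(3)
    X = rng.normal(size=(400, 5))               # e.g. molecular descriptors
    # Synthetic grades loosely tied to the features, with one label relation:
    # a high "musky" grade pushes the "animal" grade up, as in the example above.
    musky = np.clip((X[:, 0] * 1.5 + 1.5).round(), 0, 3).astype(int)
    animal = np.clip(musky - 1 + (X[:, 1] > 0), 0, 3).astype(int)
    grassy = np.clip((X[:, 2] + 1.5).round(), 0, 3).astype(int)
    Y = np.column_stack([musky, animal, grassy])
    models = fit_graded(X, Y)
    for row in predict_graded(models, X[:3]):
        print({lab: GRADES[g] for lab, g in zip(LABELS, row)})
```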
35

Deep Learning For RADAR Signal Processing

Wharton, Michael K. January 2021 (has links)
No description available.
36

Uso de confiabilidade na rotulação de exemplos em problemas de classificação multirrótulo com aprendizado semissupervisionado

Rodrigues, Fillipe Morais 21 February 2014 (has links)
Conselho Nacional de Desenvolvimento Científico e Tecnológico / Machine Learning techniques are applied in classification tasks to acquire knowledge from a set of data or information. Some learning methods proposed in the literature are based on semi-supervised learning, in which a small percentage of labeled examples (supervised learning) is combined with a quantity of labeled and unlabeled examples (unsupervised learning) during the training phase, thereby reducing the need for a large number of labeled instances when only a small labeled dataset is available for training. A common problem in semi-supervised learning is the random selection of instances: most works select these instances at random, which can have a negative impact. In addition, most machine learning methods deal with single-label problems, that is, problems in which each example of a given set is associated with a single class; however, given the existing need to classify data in many domains into more than one class, this task is called multi-label classification. This work presents an experimental analysis of the results obtained by using semi-supervised learning in multi-label classification problems, employing a reliability parameter as an aid in labeling the data. The use of semi-supervised learning techniques, as well as of multi-label classification methods, was therefore essential for producing the results
/ As técnicas de Aprendizado de Máquina são aplicadas em tarefas de classificação para a aquisição de conhecimento através de um conjunto de dados ou informações. Alguns métodos de aprendizado utilizados pela literatura são baseados em aprendizado semissupervisionado; este é representado por pequeno percentual de exemplos rotulados (aprendizado supervisionado) combinados com uma quantidade de exemplos rotulados e não rotulados (não-supervisionado) durante a fase de treinamento, reduzindo, portanto, a necessidade de uma grande quantidade de dados rotulados quando apenas um pequeno conjunto de exemplos rotulados está disponível para treinamento. O problema da escolha aleatória das instâncias é comum no aprendizado semissupervisionado, pois a maioria dos trabalhos usa a escolha aleatória dessas instâncias, o que pode causar um impacto negativo. Por outro lado, grande parte dos métodos de aprendizado de máquina trata de problemas unirrótulo, ou seja, problemas onde exemplos de um determinado conjunto são associados a uma única classe. Entretanto, diante da necessidade existente de classificar dados em uma grande quantidade de domínios, ou em mais de uma classe, essa classificação citada é denominada classificação multirrótulo. Este trabalho apresenta uma análise experimental dos resultados obtidos por meio da utilização do aprendizado semissupervisionado em problemas de classificação multirrótulo usando um parâmetro de confiabilidade como auxílio na classificação dos dados. Dessa maneira, a utilização de técnicas de aprendizado semissupervisionado, bem como de métodos de classificação multirrótulos, foram imprescindíveis na apresentação dos resultados
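A rough sketch of the reliability idea in a multi-label self-training loop: unlabeled examples are pseudo-labeled and added to the training set only when the classifier's confidence exceeds a reliability threshold for every label, instead of being chosen at random. The threshold, base classifier and data are placeholders, not the dissertation's actual configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def self_train_with_reliability(X_lab, Y_lab, X_unl, threshold=0.8, rounds=5):
    """Iteratively pseudo-label unlabeled multi-label examples whose per-label
    confidence max(p, 1-p) exceeds `threshold` for every label."""
    X_lab, Y_lab = X_lab.copy(), Y_lab.copy()
    remaining = X_unl.copy()
    models = []
    for _ in range(rounds):
        if len(remaining) == 0:
            break
        models = [LogisticRegression(max_iter=1000).fit(X_lab, Y_lab[:, j])
                  for j in range(Y_lab.shape[1])]
        proba = np.column_stack([m.predict_proba(remaining)[:, 1] for m in models])
        confidence = np.maximum(proba, 1 - proba).min(axis=1)   # weakest label decides
        reliable = confidence >= threshold
        if not reliable.any():
            break
        X_lab = np.vstack([X_lab, remaining[reliable]])
        Y_lab = np.vstack([Y_lab, (proba[reliable] >= 0.5).astype(int)])
        remaining = remaining[~reliable]
    return models, X_lab, Y_lab


if __name__ == "__main__":
    rng = np.random.default_rng(4)
    X = rng.normal(size=(500, 6))
    Y = np.column_stack([(X[:, 0] > 0), (X[:, 1] > 0)]).astype(int)
    models, X_aug, Y_aug = self_train_with_reliability(X[:30], Y[:30], X[30:])
    print("labeled set grew from 30 to", len(X_aug))
```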
37

Multimodal Deep Learning for Multi-Label Classification and Ranking Problems

Dubey, Abhishek January 2015 (has links) (PDF)
In recent years, deep neural network models have been shown to outperform many state-of-the-art algorithms. The reason is that unsupervised pretraining with multi-layered deep neural networks learns better features, which further improves many supervised tasks. These models not only automate the feature extraction process but also provide robust features for various machine learning tasks. However, unsupervised pretraining and feature extraction using multi-layered networks are restricted to the input features and are not applied to the output. The performance of many supervised learning algorithms (or models) depends on how well the output dependencies are handled by these algorithms [Dembczyński et al., 2012]. Adapting standard neural networks to handle these output dependencies for any specific type of problem has been an active area of research [Zhang and Zhou, 2006, Ribeiro et al., 2012]. On the other hand, inference on multimodal data is considered a difficult problem in machine learning, and recently ‘deep multimodal neural networks’ have shown significant results [Ngiam et al., 2011, Srivastava and Salakhutdinov, 2012]. Several problems, such as classification with complete or missing modality data, generating the missing modality, etc., have been shown to work very well with these models. In this work, we consider three nontrivial supervised learning tasks: (i) multi-class classification (MCC), (ii) multi-label classification (MLC) and (iii) label ranking (LR), mentioned in order of increasing complexity of the output. While multi-class classification deals with predicting one class for every instance, multi-label classification deals with predicting more than one class for every instance, and label ranking deals with assigning a rank to each label for every instance. Most work in this field centers on formulating new error functions that can force the network to identify the output dependencies. The aim of our work is to adapt neural networks to implicitly handle the feature extraction (dependencies) for the output within the network structure, removing the need for hand-crafted error functions. We show that multimodal deep architectures can be adapted for these types of problems (or data) by considering the labels as one of the modalities. This also brings unsupervised pretraining to the output along with the input. We show that these models can not only outperform standard deep neural networks, but also outperform standard adaptations of neural networks for individual domains under various metrics over several data sets considered by us. We can observe that the performance gain of our models over other models increases as the complexity of the output/problem increases.
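The core idea of treating the label vector as an extra modality can be caricatured with a small denoising-style autoencoder: it is trained to reconstruct the joint vector [features | labels] while the label block is randomly blanked out, so at test time the labels can be read off the reconstruction when only the features are given. This uses a generic sklearn MLPRegressor rather than the deep multimodal networks the thesis builds on, so it is only a conceptual sketch with invented data.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor


def train_labels_as_modality(X, Y, hidden=32, seed=0):
    """Denoising-style autoencoder over [X | Y]: the label block is randomly
    zeroed in the input so the network learns to infer labels from features."""
    rng = np.random.default_rng(seed)
    Y_in = Y.astype(float).copy()
    Y_in[rng.random(len(X)) < 0.5] = 0.0          # simulate the missing label modality
    Z_in, Z_out = np.hstack([X, Y_in]), np.hstack([X, Y])
    ae = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=3000, random_state=seed)
    ae.fit(Z_in, Z_out)                           # reconstruct the joint vector
    return ae


def predict_labels(ae, X, n_labels):
    # Fill the missing label modality with zeros, then read the label block back.
    Z = np.hstack([X, np.zeros((len(X), n_labels))])
    recon = ae.predict(Z)
    return (recon[:, -n_labels:] > 0.5).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(5)
    X = rng.normal(size=(400, 8))
    Y = np.column_stack([(X[:, 0] > 0), (X[:, 0] + X[:, 1] > 0)]).astype(int)
    ae = train_labels_as_modality(X[:300], Y[:300])
    pred = predict_labels(ae, X[300:], n_labels=2)
    print("hamming accuracy:", (pred == Y[300:]).mean())
```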
38

New Methods for Learning from Heterogeneous and Strategic Agents

Divya, Padmanabhan January 2017 (has links) (PDF)
In this doctoral thesis, we address several representative problems that arise in the context of learning from multiple heterogeneous agents. These problems are relevant to many modern applications such as crowdsourcing and internet advertising. In scenarios such as crowdsourcing, there is a planner who is interested in learning a task, and a set of noisy agents provide the training data for this learning task. Any learning algorithm making use of the data provided by these noisy agents must account for their noise levels. The noise levels of the agents are unknown to the planner, leading to a non-trivial difficulty. Further, the agents are heterogeneous, as they differ in terms of their noise levels. A key challenge in such settings is to learn the noise levels of the agents while simultaneously learning the underlying model. Another challenge arises when the agents are strategic. For example, when the agents are required to perform a task, they could be strategic about the effort they put in. As another example, when required to report the costs incurred towards performing the task, the agents could be strategic and may not report the costs truthfully. In general, the performance of the learning algorithms could be severely affected if the information elicited from the agents is incorrect. We address the above challenges in the following representative learning problems.

Multi-label Classification from Heterogeneous Noisy Agents
Multi-label classification is a well-known supervised machine learning problem where each instance is associated with multiple classes. Since several labels can be assigned to a single instance, one of the key challenges in this problem is to learn the correlations between the classes. We first assume labels from a perfect source and propose a novel topic model called Multi-Label Presence-Absence Latent Dirichlet Allocation (ML-PA-LDA). In the current-day scenario, a natural source for procuring the training dataset is mining user-generated content or obtaining labels directly from users on a crowdsourcing platform. In the more practical scenario of crowdsourcing, an additional challenge arises as the labels of the training instances are provided by noisy, heterogeneous crowd-workers with unknown qualities. With this as the motivation, we further adapt our topic model to the scenario where the labels are provided by multiple noisy sources and refer to this model as ML-PA-LDA-MNS (ML-PA-LDA with Multiple Noisy Sources). With experiments on standard datasets, we show that the proposed models achieve superior performance over existing methods.

Active Linear Regression with Heterogeneous, Noisy and Strategic Agents
In this work, we study the problem of training a linear regression model by procuring labels from multiple noisy agents or crowd annotators, under a budget constraint. We propose a Bayesian model for linear regression from multiple noisy sources and use variational inference for parameter estimation. When labels are sought from agents, it is important to minimize the number of labels procured, as every call to an agent incurs a cost. Towards this, we adopt an active learning approach. In this specific context, we prove the equivalence of well-studied criteria of active learning such as entropy minimization and expected error reduction. For the purpose of annotator selection in active learning, we observe a useful connection with the multi-armed bandit framework. Due to the nature of the distribution of the rewards on the arms, we resort to the Robust Upper Confidence Bound (UCB) scheme with a truncated empirical mean estimator to solve the annotator selection problem. This yields provable guarantees on the regret. We apply our model to the scenario where annotators are strategic and design suitable incentives to induce them to put in their best efforts.

Ranking with Heterogeneous Strategic Agents
We look at the problem where a planner must rank multiple strategic agents, a problem that has many applications including sponsored search auctions (SSA). Stochastic multi-armed bandit (MAB) mechanisms have been used in the literature to solve this problem. Existing stochastic MAB mechanisms with a deterministic payment rule, proposed in the literature, necessarily suffer a regret of Ω(T^(2/3)), where T is the number of time steps. This happens because these mechanisms address the worst-case scenario where the means of the agents' stochastic rewards are separated by a very small amount that depends on T. We, however, take a detour and allow the planner to indicate the resolution with which the agents must be distinguished, which leads us to introduce a corresponding resolution-dependent notion of regret. We propose a dominant strategy incentive compatible (DSIC) and individually rational (IR), deterministic MAB mechanism based on ideas from the Upper Confidence Bound (UCB) family of MAB algorithms. The proposed UCB-based mechanism achieves a regret of O(log T) at the specified resolution. We first establish the results for single-slot SSA and then non-trivially extend the results to the case of multi-slot SSA.
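As a small illustration of the bandit view of annotator selection mentioned above, the sketch below runs a plain UCB1 loop in which each arm is an annotator and the hidden reward probability stands for annotation quality. The Robust UCB variant with truncated means and the incentive design from the thesis are not reproduced; the quality values are invented.

```python
import numpy as np


def ucb1_annotator_selection(qualities, horizon=2000, seed=0):
    """UCB1 over annotators: a pull = request one label, reward = 1 if it is correct.
    `qualities` are the true (hidden) probabilities of a correct label."""
    rng = np.random.default_rng(seed)
    k = len(qualities)
    counts = np.zeros(k)
    means = np.zeros(k)
    for t in range(1, horizon + 1):
        if t <= k:                                  # initialisation: query each annotator once
            arm = t - 1
        else:
            ucb = means + np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(ucb))
        reward = float(rng.random() < qualities[arm])
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]   # incremental mean update
    return counts, means


if __name__ == "__main__":
    counts, means = ucb1_annotator_selection([0.55, 0.7, 0.9])
    print("queries per annotator:", counts, "estimated qualities:", means.round(2))
```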
39

Sparse Multiclass And Multi-Label Classifier Design For Faster Inference

Bapat, Tanuja 12 1900 (has links) (PDF)
Many real-world problems like hand-written digit recognition or semantic scene classification are treated as multiclass or multi-label classification problems. Solutions to these problems using support vector machines (SVMs) are well studied in the literature. In this work, we focus on building sparse max-margin classifiers for multiclass and multi-label classification. A sparse representation of the resulting classifier is important both from the efficient-training and the fast-inference viewpoints. This is true especially when the training and test set sizes are large. Very few of the existing multiclass and multi-label classification algorithms have given importance to directly controlling the sparsity of the designed classifiers. Further, these algorithms were not found to be scalable. Motivated by this, we propose new formulations for sparse multiclass and multi-label classifier design and also give efficient algorithms to solve them. The formulation for sparse multi-label classification also incorporates prior knowledge of label correlations. In both cases, the classification model is designed using a common set of basis vectors across all the classes. These basis vectors are greedily added to an initially empty model to approximate the target function. The sparsity of the classifier can be controlled by a user-defined parameter, d_max, which indicates the maximum number of common basis vectors. The computational complexity of these algorithms for multiclass and multi-label classifier design is O(l·k²·d_max²), where l is the number of training set examples and k is the number of classes. The inference time for the proposed multiclass and multi-label classifiers is O(k·d_max). Numerical experiments on various real-world benchmark datasets demonstrate that the proposed algorithms result in sparse classifiers that require fewer basis vectors than state-of-the-art algorithms to attain the same generalization performance. A very small value of d_max results in a significant reduction in inference time. Thus, the proposed algorithms provide useful alternatives to the existing algorithms for sparse multiclass and multi-label classifier design.
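To visualise the idea of a common set of basis vectors shared across classes, here is a heavily simplified greedy sketch: RBF basis functions centred on training points are added one at a time, up to a budget d_max, according to how much they reduce the one-vs-rest least-squares error summed over all classes. The actual max-margin formulation and solver of the thesis are different; everything below is illustrative.

```python
import numpy as np


def rbf(X, centers, gamma=0.5):
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)


def greedy_shared_basis(X, Y, d_max=10, gamma=0.5):
    """Greedily pick up to d_max training points as shared RBF basis vectors,
    refitting least-squares weights for all classes on the selected columns."""
    n = len(X)
    chosen = []
    for _ in range(d_max):
        best, best_err = None, np.inf
        for c in range(n):
            if c in chosen:
                continue
            Phi = rbf(X, X[chosen + [c]], gamma)
            W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
            err = ((Phi @ W - Y) ** 2).sum()       # error summed over all classes
            if err < best_err:
                best, best_err = c, err
        chosen.append(best)
    Phi = rbf(X, X[chosen], gamma)
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return np.array(chosen), W


if __name__ == "__main__":
    rng = np.random.default_rng(6)
    X = rng.normal(size=(120, 2))
    y = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0).astype(int)   # 4 classes
    Y = np.eye(4)[y]                                # one-vs-rest targets
    centers, W = greedy_shared_basis(X, Y, d_max=8)
    pred = rbf(X, X[centers]) @ W
    print("train accuracy with 8 shared basis vectors:", (pred.argmax(1) == y).mean())
```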
40

[pt] APRENDIZADO SEMI E AUTO-SUPERVISIONADO APLICADO À CLASSIFICAÇÃO MULTI-LABEL DE IMAGENS DE INSPEÇÕES SUBMARINAS / [en] SEMI AND SELF-SUPERVISED LEARNING APPLIED TO THE MULTI-LABEL CLASSIFICATION OF UNDERWATER INSPECTION IMAGE

AMANDA LUCAS PEREIRA 11 July 2023 (has links)
[pt] O segmento offshore de produção de petróleo é o principal produtor nacional desse insumo. Nesse contexto, inspeções submarinas são cruciais para a manutenção preventiva dos equipamentos, que permanecem toda a vida útil em ambiente oceânico. A partir dos dados de imagem e sensor coletados nessas inspeções, especialistas são capazes de prevenir e reparar eventuais danos. Tal processo é profundamente complexo, demorado e custoso, já que profissionais especializados têm que assistir a horas de vídeos atentos a detalhes. Neste cenário, o presente trabalho explora o uso de modelos de classificação de imagens projetados para auxiliar os especialistas a encontrarem o(s) evento(s) de interesse nos vídeos de inspeções submarinas. Esses modelos podem ser embarcados no ROV ou na plataforma para realizar inferência em tempo real, o que pode acelerar o ROV, diminuindo o tempo de inspeção e gerando uma grande redução nos custos de inspeção. No entanto, existem alguns desafios inerentes ao problema de classificação de imagens de inspeção submarina, tais como: dados rotulados balanceados são caros e escassos; presença de ruído entre os dados; alta variância intraclasse; e características físicas da água que geram certas especificidades nas imagens capturadas. Portanto, modelos supervisionados tradicionais podem não ser capazes de cumprir a tarefa. Motivado por esses desafios, busca-se solucionar o problema de classificação de imagens submarinas a partir da utilização de modelos que requerem menos supervisão durante o seu treinamento. Neste trabalho, são explorados os métodos DINO (Self-DIstillation with NO labels, auto-supervisionado) e uma nova versão multi-label proposta para o PAWS (Predicting View Assignments With Support Samples, semi-supervisionado), que chamamos de mPAWS (multi-label PAWS). Os modelos são avaliados com base em sua performance como extratores de features para o treinamento de um classificador simples, formado por uma camada densa. Nos experimentos realizados, para uma mesma arquitetura, se obteve uma performance que supera em 2.7 por cento o f1-score do equivalente supervisionado.
/ [en] The offshore oil production segment is the main national producer of this input. In this context, underwater inspections are crucial for the preventive maintenance of equipment, which remains in the ocean environment for its entire useful life. From the image and sensor data collected in these inspections, experts are able to prevent and repair damage. Such a process is deeply complex, time-consuming and costly, as specialized professionals have to watch hours of video, attentive to details. In this scenario, the present work explores the use of image classification models designed to help experts find the event(s) of interest in underwater inspection videos. These models can be embedded in the ROV or on the platform to perform real-time inference, which can speed up the ROV, reducing inspection time and greatly reducing inspection costs. However, there are some challenges inherent to the problem of classifying underwater inspection images, such as: balanced labeled data are expensive and scarce; there is noise in the data; intraclass variance is high; and the physical characteristics of the water give the captured images certain specific properties. Therefore, traditional supervised models may not be able to fulfill the task. Motivated by these challenges, we seek to solve the underwater image classification problem using models that require less supervision during their training. In this work, we explore the DINO method (Self-DIstillation with NO labels, self-supervised) and a new multi-label version proposed for PAWS (Predicting View Assignments With Support Samples, semi-supervised), which we call mPAWS (multi-label PAWS). The models are evaluated based on their performance as feature extractors for training a simple classifier formed by a dense layer. In the experiments carried out, for the same architecture, the resulting performance exceeds the f1-score of the supervised equivalent by 2.7 percent.
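The evaluation protocol described above, freezing the pretrained backbone and training only a single dense layer on its features, can be approximated as follows. The random feature matrix stands in for embeddings produced by DINO or mPAWS, which are not reproduced here, and the labels are synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score


def train_linear_probe(features, Y):
    """One logistic unit per label on frozen features -- roughly a single dense
    layer with sigmoid outputs trained for multi-label classification."""
    return [LogisticRegression(max_iter=1000).fit(features, Y[:, j])
            for j in range(Y.shape[1])]


def predict(models, features):
    return np.column_stack([m.predict(features) for m in models])


if __name__ == "__main__":
    rng = np.random.default_rng(7)
    # Placeholder for backbone embeddings of inspection frames (e.g. 128-d).
    feats = rng.normal(size=(600, 128))
    Y = (feats[:, :4] @ rng.normal(size=(4, 3)) > 0).astype(int)   # 3 synthetic events
    tr, te = slice(0, 450), slice(450, 600)
    models = train_linear_probe(feats[tr], Y[tr])
    preds = predict(models, feats[te])
    print("macro f1:", f1_score(Y[te], preds, average="macro").round(3))
```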
