Global ETD Search

11	Técnicas para o problema de dados desbalanceados em classificação hierárquica / Techniques for the problem of imbalanced data in hierarchical classification Victor Hugo Barella 24 July 2015 (has links) Os recentes avanços da ciência e tecnologia viabilizaram o crescimento de dados em quantidade e disponibilidade. Junto com essa explosão de informações geradas, surge a necessidade de analisar dados para descobrir conhecimento novo e útil. Desse modo, áreas que visam extrair conhecimento e informações úteis de grandes conjuntos de dados se tornaram grandes oportunidades para o avanço de pesquisas, tal como o Aprendizado de Máquina (AM) e a Mineração de Dados (MD). Porém, existem algumas limitações que podem prejudicar a acurácia de alguns algoritmos tradicionais dessas áreas, por exemplo o desbalanceamento das amostras das classes de um conjunto de dados. Para mitigar tal problema, algumas alternativas têm sido alvos de pesquisas nos últimos anos, tal como o desenvolvimento de técnicas para o balanceamento artificial de dados, a modificação dos algoritmos e propostas de abordagens para dados desbalanceados. Uma área pouco explorada sob a visão do desbalanceamento de dados são os problemas de classificação hierárquica, em que as classes são organizadas em hierarquias, normalmente na forma de árvore ou DAG (Direct Acyclic Graph). O objetivo deste trabalho foi investigar as limitações e maneiras de minimizar os efeitos de dados desbalanceados em problemas de classificação hierárquica. Os experimentos realizados mostram que é necessário levar em consideração as características das classes hierárquicas para a aplicação (ou não) de técnicas para tratar problemas dados desbalanceados em classificação hierárquica. / Recent advances in science and technology have made possible the data growth in quantity and availability. Along with this explosion of generated information, there is a need to analyze data to discover new and useful knowledge. 
Thus, areas for extracting knowledge and useful information from large datasets have become great opportunities for the advancement of research, such as Machine Learning (ML) and Data Mining (DM). However, there are some limitations that may reduce the accuracy of some traditional algorithms in these areas, for example the imbalance of class samples in a dataset. To mitigate this drawback, some solutions have been the target of research in recent years, such as the development of techniques for artificially balancing data, algorithm modification and new approaches for imbalanced data. One area little explored from the perspective of data imbalance is hierarchical classification, in which the classes are organized into hierarchies, commonly in the form of a tree or DAG (Directed Acyclic Graph). This work aims to investigate the limitations of, and approaches to minimize, the effects of imbalanced data in hierarchical classification problems. The experimental results show the need to take into account the characteristics of hierarchical classes when deciding whether to apply techniques for imbalanced data in hierarchical classification. Aprendizado supervisionado Classificação hierárquica Dados desbalanceados Desbalanceamento de dados Data imbalance Hierarchical classification Imbalanced data Supervised learning
12	Etude de la classification dans un trés grand nombre de catégories. / Very large number of classes classification study Puget, Raphael 04 July 2016 La croissance des données disponibles aujourd'hui génère de nouvelles problématiques pour lesquelles l'apprentissage statistique ne possède pas de réponses adaptées. Ainsi le cadre classique de la classification qui consiste à affecter une ou plusieurs classes à une instance est étendu à des problèmes avec des milliers, voire des millions de classes différentes. Avec ces problèmes viennent de nouveaux axes de recherches comme la réduction de la complexité de classification qui est habituellement linéaire en fonction du nombre de classes du problème, ce qui est problématique lorsque le nombre de classes devient trop important. Plusieurs familles de solutions pour cette problématique ont émergé comme la construction d'une hiérarchie de classifieurs ou bien l'adaptation de méthodes ensemblistes de type ECOC. Le travail présenté ici propose deux nouvelles méthodes pour répondre au problème de classification extrême. Le premier travail consiste en une nouvelle mesure asymétrique pour le partitionnement de classes dans le cadre d'une classification hiérarchique alors que le second axe explore l'élaboration d'un algorithme séquentiel actif d'agrégation des classifieurs les plus intéressants. / The increase in the volume of data today raises new problems for which machine learning lacks suitable answers. The usual classification task, which assigns one or more classes to an example, is extended to problems with thousands or even millions of different classes. These problems bring new research directions, such as reducing the complexity of the classification process, which is usually linear in the number of classes and becomes an issue when that number is too large. 
Various ways to deal with those new problems have emerged like the construction of a hierarchy of classifiers or the adaptation of ECOC ensemble methods. The work presented here describes two new methods to answer this extreme classification task. The first one consists in a new asymmetrical measure to help the partitioning of the classes in order to build a hierarchy of classes. The second one proposes a sequential way to aggregate effectively the most interesting classifiers. Classification Hiérarchisation Méthodes ensemblistes Big data Ontologie Hierarchical classification Complexity reduction Ontology 004
13	Resource efficient travel mode recognition / Resurseffektiv transportlägesigenkänning Runhem, Lovisa January 2017 (has links) In this report we attempt to provide insights to how a resource efficient solution for transportation mode recognition can be implemented on a smartphone using the accelerometer and magnetometer as sensors for data collection. The proposed system uses a hierarchical classification process where instances are first classified as vehicles or non-vehicles, then as wheel or rail vehicles, and lastly as belonging to one of the transportation modes: bus, car, motorcycle, subway, or train. A virtual gyroscope is implemented as a low-power source of simulated gyroscope data. Features are extracted from the accelerometer, magnetometer and virtual gyroscope readings that are sampled at 30 Hz, before they are classified using machine learning algorithms from the WEKA machine learning library. An Android application was developed to classify real-time data, and the resource consumption of the application was measured using the Trepn profiler application. The proposed system achieves an overall accuracy of 82.7% and a vehicular accuracy of 84.9% using a 5 second window with 75% overlap while having an average power consumption of 8.5 mW. / I denna rapport försöker vi ge insikter om hur en resurseffektiv lösning för transportlägesigenkänning kan implementeras på en smartphone genom att använda accelerometern och magnetometern som sensorer för datainsamling. Det föreslagna systemet använder en hierarkisk klassificeringsprocess där instanser först klassificeras som fordon eller icke-fordon, sedan som hjul- eller järnvägsfordon, och slutligen som tillhörande ett av transportsätten: buss, bil, motorcykel, tunnelbana eller tåg. Ett virtuellt gyroskop implementeras som en lågenergi källa till simulerad gyroskopdata. 
Olika särdrag extraheras från accelerometer, magnetometer och virtuella gyroskopläsningar som samlas in vid 30 Hz, innan de klassificeras med hjälp av maskininlärningsalgoritmer från WEKA-maskinlärningsbiblioteket. En Android-applikation har utvecklats för att klassificera realtidsdata, och programmets resursförbrukning mättes med hjälp av Trepn profiler-applikationen. Det föreslagna systemet uppnår en övergripande noggrannhet av 82.7% och en fordonsnoggrannhet av 84.9% genom att använda ett 5 sekunders fönster med 75% överlappning med en genomsnittlig energiförbrukning av 8.5 mW. transportation mode recognition hierarchical classification smartphone sensors Computer Sciences Datavetenskap (datalogi)
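The windowing and three-stage cascade described in entry 13 can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation (which uses WEKA on Android): the function names, the classifier callables, the rounding of the window step, and the split of modes into wheel (bus, car, motorcycle) and rail (subway, train) groups are all hypothetical.

```python
from typing import Callable, Iterator, Sequence

SAMPLE_RATE_HZ = 30   # sampling rate stated in the abstract
WINDOW_SECONDS = 5    # window length stated in the abstract
OVERLAP = 0.75        # window overlap stated in the abstract


def sliding_windows(samples: Sequence[float],
                    rate_hz: int = SAMPLE_RATE_HZ,
                    seconds: int = WINDOW_SECONDS,
                    overlap: float = OVERLAP) -> Iterator[Sequence[float]]:
    """Yield fixed-length, overlapping windows over a sensor stream."""
    window_len = rate_hz * seconds                   # 150 samples
    # 25% of 150 is 37.5 samples; rounding down to 37 is an assumption.
    step = max(1, int(window_len * (1 - overlap)))
    for start in range(0, len(samples) - window_len + 1, step):
        yield samples[start:start + window_len]


def classify_hierarchically(features: Sequence[float],
                            is_vehicle: Callable[[Sequence[float]], bool],
                            is_rail: Callable[[Sequence[float]], bool],
                            wheel_mode: Callable[[Sequence[float]], str],
                            rail_mode: Callable[[Sequence[float]], str]) -> str:
    """Walk the hierarchy: vehicle or not, then wheel or rail, then mode."""
    if not is_vehicle(features):
        return "non-vehicle"
    if is_rail(features):
        return rail_mode(features)    # assumed to return "subway" or "train"
    return wheel_mode(features)       # assumed: "bus", "car" or "motorcycle"
```

Each stage only runs when its parent stage selects it, so a non-vehicle window costs a single classifier call, which is one plausible reason a cascade suits a resource-constrained device.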
14	Automating Deep-Sea Video Annotation Egbert, Hanson 01 June 2021 As the world explores opportunities to develop offshore renewable energy capacity, there will be a growing need for pre-construction biological surveys and post-construction monitoring in the challenging marine environment. Underwater video is a powerful tool to facilitate such surveys, but the interpretation of the imagery is costly and time-consuming. Emerging technologies have improved automated analysis of underwater video, but these technologies are not yet accurate or accessible enough for widespread adoption in the scientific community or industries that might benefit from these tools. To address these challenges, prior research developed a website that allows users to: (1) Quickly play and annotate underwater videos, (2) Create a short tracking video for each annotation that shows how an annotated concept moves in time, (3) Verify the accuracy of existing annotations and tracking videos, (4) Create a neural network model from existing annotations, and (5) Automatically annotate unwatched videos using a model that was previously created. It uses both validated and unvalidated annotations and automatically generated annotations from trackings to count the number of Rathbunaster californicus (starfish) and Strongylocentrotus fragilis (sea urchin) with count accuracy of 97% and 99%, respectively, and F1 scores of 0.90 and 0.81, respectively. The thesis explores several improvements to the model above. First, a method to sync JavaScript video frames to a stable Python environment. Second, reinforcement training using marine biology experts and the verification feature. Finally, a hierarchical method that allows the model to combine predictions of related concepts. On average, this method improved the F1 scores from 0.42 to 0.45 (a relative increase of 7%) and count accuracy from 58% to 69% (a relative increase of 19%) for the concepts Umbellula Lindahli and Funiculina. 
Object Detection Hierarchical Classification Biodiversity Monitoring Object Tracking Object Counting Automatic Video Annotation Other Computer Engineering
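The relative increases quoted in entry 14 follow from the usual (after - before) / before formula. The short check below reproduces them; the function name is hypothetical and only the four input figures come from the abstract.

```python
def relative_increase_percent(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Figures taken from the abstract: F1 score 0.42 -> 0.45, count accuracy 58% -> 69%.
f1_gain = relative_increase_percent(0.42, 0.45)
count_gain = relative_increase_percent(58.0, 69.0)
print(round(f1_gain), round(count_gain))  # prints "7 19"
```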
15	Naturally Generated Decision Trees for Image Classification Ravi, Sumved Reddy 31 August 2021 Image classification has been a pivotal area of research in Deep Learning, with a vast body of literature working to tackle the problem, constantly striving to achieve higher accuracies. This push to achieve greater prediction accuracy, however, has further exacerbated the black box phenomenon which is inherent in neural networks, and more so in CNN-style deep architectures. Likewise, it has led to the development of highly tuned methods, suitable only for specific data sets, requiring significant work to alter given new data. Although these models are capable of producing highly accurate predictions, we have little to no ability to understand the decision process taken by a network to reach a conclusion. This factor poses a difficulty in use cases such as medical diagnostics tools or autonomous vehicles, which require insight into prediction reasoning to validate a conclusion or to debug a system. In essence, modern applications which utilize deep networks are able to learn to produce predictions, but lack interpretability and a deeper understanding of the data. Given this key point, we look to decision trees, opposite in nature to deep networks, with a high level of interpretability but a low capacity for learning. In our work we strive to merge these two techniques as a means to maintain the capacity for learning while providing insight into the decision process. More importantly, we look to expand the understanding of class relationships through a tree architecture. Our ultimate goal in this work is to create a technique able to automatically create a visual feature based knowledge hierarchy for class relations, applicable broadly to any data set or combination thereof. We maintain these goals in an effort to move away from specific systems and instead toward artificial general intelligence (AGI). 
AGI requires a deeper understanding over a broad range of information, and more so the ability to learn new information over time. In our work we embed networks of varying sizes and complexity within decision trees on a node level, where each node network is responsible for selecting the next branch path in the tree. Each leaf node represents a single class and all parent and ancestor nodes represent groups of classes. We designed the method such that classes are reasonably grouped by their visual features, where parent and ancestor nodes represent hidden super classes. Our work aims to introduce this method as a small step towards AGI, where class relations are understood through an automatically generated decision tree (representing a class hierarchy), capable of accurate image classification. / Master of Science / Many modern day applications make use of deep networks for image classification. Often these networks are incredibly complex in architecture, and applicable only for specific tasks and data. Standard approaches use just a neural network to produce predictions. However, the internal decision process of the network remains a black box due to the nature of the technique. As more complex human related applications, such as medical image diagnostic tools or autonomous driving software, are being created, they require an understanding of reasoning behind a prediction. To provide this insight into the prediction reasoning, we propose a technique which merges decision trees and deep networks. Tested on the MNIST image data set we were able to achieve an accuracy over 99.0%. We were also able to achieve an accuracy over 73.0% on the CIFAR-10 image data set. Our method is found to create decision trees that are easily understood and are reasonably capable of image classification. Image Classification Hierarchical Classification Deep learning (Machine learning) Adaptive Trees Artificial General Intelligence
16	Balance-guaranteed optimized tree with reject option for live fish recognition Huang, Xuan January 2014 (has links) This thesis investigates the computer vision application of live fish recognition, which is needed in application scenarios where manual annotation is too expensive, when there are too many underwater videos. This system can assist ecological surveillance research, e.g. computing fish population statistics in the open sea. Some pre-processing procedures are employed to improve the recognition accuracy, and then 69 types of features are extracted. These features are a combination of colour, shape and texture properties in different parts of the fish such as tail/head/top/bottom, as well as the whole fish. Then, we present a novel Balance-Guaranteed Optimized Tree with Reject option (BGOTR) for live fish recognition. It improves the normal hierarchical method by arranging more accurate classifications at a higher level and keeping the hierarchical tree balanced. BGOTR is automatically constructed based on inter-class similarities. We apply a Gaussian Mixture Model (GMM) and Bayes rule as a reject option after the hierarchical classification to evaluate the posterior probability of being a certain species to filter less confident decisions. This novel classification-rejection method cleans up decisions and rejects unknown classes. After constructing the tree architecture, a novel trajectory voting method is used to eliminate accumulated errors during hierarchical classification and, therefore, achieves better performance. The proposed BGOTR-based hierarchical classification method is applied to recognize the 15 major species of 24150 manually labelled fish images and to detect new species in an unrestricted natural environment recorded by underwater cameras in south Taiwan sea. It achieves significant improvements compared to the state-of-the-art techniques. 
Furthermore, the sequence of feature selection and constructing a multi-class SVM is investigated. We propose that an Individual Feature Selection (IFS) procedure can be applied directly to the binary One-versus-One SVMs before assembling the full multiclass SVM. The IFS method selects different subsets of features for each One-versus-One SVM inside the multiclass classifier so that each vote is optimized to discriminate the two specific classes. The proposed IFS method is tested on four different datasets, comparing performance and time cost. Experimental results demonstrate significant improvements compared to the normal Multiclass Feature Selection (MFS) method on all datasets. 333.95
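The reject option in entry 16 (Bayes rule on class likelihoods, rejecting low-confidence decisions) can be illustrated with a much simplified sketch. The assumptions here are significant: a single one-dimensional Gaussian per class stands in for the thesis's Gaussian Mixture Model, and the function names, the `models` layout and the 0.9 threshold are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify_with_reject(x: float,
                         models: Dict[str, Tuple[float, float, float]],
                         threshold: float = 0.9) -> Optional[str]:
    """Return the most probable label, or None when the decision is rejected.

    `models` maps each label to (prior, mean, std). Bayes rule gives the
    posterior as prior * likelihood divided by the sum over all classes.
    """
    joint = {label: prior * gaussian_pdf(x, mean, std)
             for label, (prior, mean, std) in models.items()}
    evidence = sum(joint.values())
    if evidence == 0.0:
        return None  # x is far from every class model
    label, best = max(joint.items(), key=lambda item: item[1])
    return label if best / evidence >= threshold else None
```

A sample midway between two class means gets a posterior near 0.5 for each and is rejected, while a sample near one mean is accepted. Rejecting truly unknown classes would likely also need a floor on the likelihood itself, since posteriors over known classes always sum to one.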
17	Abordagens para aprendizado semissupervisionado multirrótulo e hierárquico / Multi-label and hierarchical semi-supervised learning approaches Metz, Jean 25 October 2011 (has links) A tarefa de classificação em Aprendizado de Máquina consiste da criação de modelos computacionais capazes de identificar automaticamente a classe de objetos pertencentes a um domínio pré-definido a partir de um conjunto de exemplos cuja classe é conhecida. Existem alguns cenários de classificação nos quais cada objeto pode estar associado não somente a uma classe, mas a várias classes ao mesmo tempo. Adicionalmente, nesses cenários denominados multirrótulo, as classes podem ser organizadas em uma taxonomia que representa as relações de generalização e especialização entre as diferentes classes, definindo uma hierarquia de classes, o que torna a tarefa de classificação ainda mais específica, denominada classificação hierárquica. Os métodos utilizados para a construção desses modelos de classificação são complexos e dependem fortemente da disponibilidade de uma quantidade expressiva de exemplos previamente classificados. Entretanto, para muitas aplicações é difícil encontrar um número significativo desses exemplos. Além disso, com poucos exemplos, os algoritmos de aprendizado supervisionado não são capazes de construir modelos de classificação eficazes. Nesses casos, é possível utilizar métodos de aprendizado semissupervisionado, cujo objetivo é aprender as classes do domínio utilizando poucos exemplos conhecidos conjuntamente com um número considerável de exemplos sem a classe especificada. Neste trabalho são propostos, entre outros, métodos que fazem uso do aprendizado semissupervisionado baseado em desacordo coperspectiva, tanto para a tarefa de classificação multirrótulo plana quanto para a tarefa de classificação hierárquica. 
São propostos, também, outros métodos que utilizam o aprendizado ativo com intuito de melhorar a performance de algoritmos de classificação semissupervisionada. Além disso, são propostos dois métodos para avaliação de algoritmos multirrótulo e hierárquico, os quais definem estratégias para identificação dos multirrótulos majoritários, que são utilizados para calcular os valores baseline das medidas de avaliação. Foi desenvolvido um framework para realizar a avaliação experimental da classificação hierárquica, no qual foram implementados os métodos propostos e um módulo completo para realizar a avaliação experimental de algoritmos hierárquicos. Os métodos propostos foram avaliados e comparados empiricamente, considerando conjuntos de dados de diversos domínios. A partir da análise dos resultados observa-se que os métodos baseados em desacordo não são eficazes para tarefas de classificação complexas como multirrótulo e hierárquica. Também é observado que o problema central de degradação do modelo dos algoritmos semissupervisionados agrava-se nos casos de classificação multirrótulo e hierárquica, pois, nesses casos, há um incremento nos fatores responsáveis pela degradação nos modelos construídos utilizando aprendizado semissupervisionado baseado em desacordo coperspectiva / In machine learning, the task of classification consists on creating computational models that are able to automatically identify the class of objects belonging to a predefined domain from a set of examples whose class is known a priori. There are some classification scenarios in which each object can be associated to more than one class at the same time. Moreover, in such multilabeled scenarios, classes can be organized in a taxonomy that represents the generalization and specialization relationships among the different classes, which defines a class hierarchy, making the classification task, known as hierarchical classification, even more specific. 
The methods used to build such classification models are complex and highly dependent on the availability of a large quantity of previously classified examples. However, for a large number of applications, it is difficult to find a significant number of such examples. Moreover, when few examples are available, supervised learning algorithms are not able to build effective classification models. In such situations it is possible to use semi-supervised learning, whose aim is to learn the classes of the domain using a few classified examples in conjunction with a considerable number of examples with no specified class. In this work, we propose, among others, methods that use the co-perspective disagreement based learning approach for both the flat multilabel classification and the hierarchical classification tasks. We also propose other methods that use active learning, aiming at improving the performance of semi-supervised learning algorithms. Additionally, two methods for the evaluation of multilabel and hierarchical learning algorithms are proposed. These methods define strategies for the identification of the majority multilabels, which are used to estimate the baseline evaluation measures. A framework for the experimental evaluation of the hierarchical classification was developed. This framework includes the implementations of the proposed methods as well as a complete module for the experimental evaluation of the hierarchical algorithms. The proposed methods were empirically evaluated considering datasets from various domains. From the analysis of the results, it can be observed that the methods based on co-perspective disagreement are not effective for complex classification tasks, such as the multilabel and hierarchical classification. 
It can also be observed that the main degradation problem of the models of the semi-supervised algorithms worsens for the multilabel and hierarchical classification due to the fact that, for these cases, there is an increase in the causes of the degradation of the models built using semi-supervised learning based on co-perspective disagreement Active learning Aprendizado ativo Aprendizado semissupervisionado Classificação hierárquica Classificação multirrótulo Hierarchical classification Multi-label classification Semi-supervised learning
18	Partitionnement non supervisé d'images hyperspectrales : application à l'identification de la végétation littorale / Unsupervised partitioning approach of hyperspectral image : application to the identification of the algal vegetation Chen, Bai Yang 02 December 2016 (has links) La première partie de ce travail présente un état de l'art des principaux critères non supervisés, non paramétriques, d'évaluation d'une partition, des méthodes d'estimation préliminaires du nombre de classes, et enfin des méthodes de classification supervisées, semi-supervisées et non supervisées. Une analyse des avantages et des inconvénients de ces critères et méthodes est menée. L'analyse des performances des méthodes de classification et des critères d'évaluation a été également conduite via l'application visée dans cette thèse. Une approche de partitionnement non supervisée, non paramétrique et hiérarchique s'avère la plus adaptée au problème posé. En effet, ce type d'approche et plus particulièrement la classification descendante donne un partitionnement à plusieurs niveaux et met en évidence des informations plus détaillées d'un niveau à l'autre, ce qui permet une meilleure interprétation de la richesse d'information apportée par l'imagerie hyperspectrale et ainsi conduire à une meilleure décision. Dans ce sens, la deuxième partie de cette thèse présente, tout d'abord l'approche de classification descendante hiérarchique non supervisée (CDHNS) développée. Cette approche non paramétrique, permet l'obtention de résultats stables et objectifs indépendamment des utilisateurs finaux. Le second développement conduit, porte sur la sélection de bandes spectrales parmi celles qui composent l'image hyperspectrale originale afin de réduire la quantité d'information à traiter avant le processus de classification. Cette méthode est également non supervisée et non paramétrique. 
L'approche de classification et la méthode de réduction ont été expérimentées et validées sur une image hyperspectrale synthétique construite à partir des images réelles puis sur des images réelles dont l'application porte sur l'identification des différentes classes algales. Les résultats de partitionnement obtenus sans réduction montrent d'une part, la stabilité des résultats et, d'autre part, la discrimination des classes principales (végétation, substrat et eau) dès les premiers niveaux. Les résultats de la sélection des bandes spectrales font apparaître leur bonne répartition sur toute la gamme spectrale du capteur (visible et proche-infrarouge). Les résultats montrent aussi que le partitionnement avec et sans réduction sont globalement similaires. De plus, le temps de calcul est fortement réduit. / The upstream location of the different algal species causing clogging in the EDF nuclear power plants cooling systems along the Channel coastline, by analyzing hyperspectral aerial image is today the most appropriate means. Indeed, hyperspectral imaging allows, through its spatial resolution and its broad spectral range covering the areas of visible and near infrared, the objective discrimination of plant species on the foreshore, necessarily yielding accurate maps on large coastal areas. To provide a solution to this problem and achieve the objectives, the work conducted within the framework of this thesis lies in the development of unsupervised partitioning approaches to data with large spectral and spatial dimensions. The first part of this work presents a state of the art of main unsupervised criteria, and nonparametric, for partitioning evaluation, the preliminary methods for estimating the number of classes, and finally, supervised, semi-supervised and unsupervised classification methods. An analysis of the advantages and drawbacks of these methods and criteria is conducted. 
The analysis of the performances of these classification methods and evaluation criteria was also conducted through the application targeted in this thesis. An unsupervised, nonparametric, hierarchical partitioning approach appears best suited to the problem. Indeed, this type of approach, and particularly the descending classification, gives a partitioning at several levels and highlights more detailed information from one level to another, allowing a better interpretation of the wealth of information provided by hyperspectral imaging and therefore leading to a better decision. In this sense, the second part of this thesis presents, firstly the unsupervised hierarchical descending classification (UHDC) approach developed. This nonparametric approach allows obtaining stable and objective results regardless of end users. The second development proposed concerns the selection of spectral bands from those that make up the original hyperspectral image, in order to reduce the amount of information to be processed before the classification process. This method is also unsupervised and nonparametric. The classification approach and the reduction method have been tested and validated on a synthetic hyperspectral image constructed from real images, and then on real images, with application to the identification of different algal classes. The partitioning results obtained without reduction show firstly, the stability of the results and, secondly, the discrimination of the main classes (vegetation, substrate and water) from the first levels. The results of the spectral bands selection method show that the retained bands are well distributed over the entire spectral range of the sensor (visible and near-infrared). The results also show that partitioning results with and without reduction are broadly similar. Moreover, the computation time is greatly reduced. 
Classification hiérarchique Partitionnement non supervisé Imagerie hyperspectrale Végétation algale Hierarchical classification Unsupervised partitioning Nonparametric Hyperspectral imagery Algal vegetation
19	Abordagens para aprendizado semissupervisionado multirrótulo e hierárquico / Multi-label and hierarchical semi-supervised learning approaches Jean Metz 25 October 2011 (has links) A tarefa de classificação em Aprendizado de Máquina consiste da criação de modelos computacionais capazes de identificar automaticamente a classe de objetos pertencentes a um domínio pré-definido a partir de um conjunto de exemplos cuja classe é conhecida. Existem alguns cenários de classificação nos quais cada objeto pode estar associado não somente a uma classe, mas a várias classes ao mesmo tempo. Adicionalmente, nesses cenários denominados multirrótulo, as classes podem ser organizadas em uma taxonomia que representa as relações de generalização e especialização entre as diferentes classes, definindo uma hierarquia de classes, o que torna a tarefa de classificação ainda mais específica, denominada classificação hierárquica. Os métodos utilizados para a construção desses modelos de classificação são complexos e dependem fortemente da disponibilidade de uma quantidade expressiva de exemplos previamente classificados. Entretanto, para muitas aplicações é difícil encontrar um número significativo desses exemplos. Além disso, com poucos exemplos, os algoritmos de aprendizado supervisionado não são capazes de construir modelos de classificação eficazes. Nesses casos, é possível utilizar métodos de aprendizado semissupervisionado, cujo objetivo é aprender as classes do domínio utilizando poucos exemplos conhecidos conjuntamente com um número considerável de exemplos sem a classe especificada. Neste trabalho são propostos, entre outros, métodos que fazem uso do aprendizado semissupervisionado baseado em desacordo coperspectiva, tanto para a tarefa de classificação multirrótulo plana quanto para a tarefa de classificação hierárquica. 
São propostos, também, outros métodos que utilizam o aprendizado ativo com intuito de melhorar a performance de algoritmos de classificação semissupervisionada. Além disso, são propostos dois métodos para avaliação de algoritmos multirrótulo e hierárquico, os quais definem estratégias para identificação dos multirrótulos majoritários, que são utilizados para calcular os valores baseline das medidas de avaliação. Foi desenvolvido um framework para realizar a avaliação experimental da classificação hierárquica, no qual foram implementados os métodos propostos e um módulo completo para realizar a avaliação experimental de algoritmos hierárquicos. Os métodos propostos foram avaliados e comparados empiricamente, considerando conjuntos de dados de diversos domínios. A partir da análise dos resultados observa-se que os métodos baseados em desacordo não são eficazes para tarefas de classificação complexas como multirrótulo e hierárquica. Também é observado que o problema central de degradação do modelo dos algoritmos semissupervisionados agrava-se nos casos de classificação multirrótulo e hierárquica, pois, nesses casos, há um incremento nos fatores responsáveis pela degradação nos modelos construídos utilizando aprendizado semissupervisionado baseado em desacordo coperspectiva / In machine learning, the task of classification consists on creating computational models that are able to automatically identify the class of objects belonging to a predefined domain from a set of examples whose class is known a priori. There are some classification scenarios in which each object can be associated to more than one class at the same time. Moreover, in such multilabeled scenarios, classes can be organized in a taxonomy that represents the generalization and specialization relationships among the different classes, which defines a class hierarchy, making the classification task, known as hierarchical classification, even more specific. 
The methods used to build such classification models are complex and highly dependent on the availability of a large quantity of previously classified examples. However, for a large number of applications, it is difficult to find a significant number of such examples. Moreover, when few examples are available, supervised learning algorithms are not able to build effective classification models. In such situations it is possible to use semi-supervised learning, whose aim is to learn the classes of the domain using a few classified examples in conjunction with a considerable number of examples with no specified class. In this work, we propose, among others, methods that use the co-perspective disagreement based learning approach for both the flat multilabel classification and the hierarchical classification tasks. We also propose other methods that use active learning, aiming at improving the performance of semi-supervised learning algorithms. Additionally, two methods for the evaluation of multilabel and hierarchical learning algorithms are proposed. These methods define strategies for the identification of the majority multilabels, which are used to estimate the baseline evaluation measures. A framework for the experimental evaluation of the hierarchical classification was developed. This framework includes the implementations of the proposed methods as well as a complete module for the experimental evaluation of the hierarchical algorithms. The proposed methods were empirically evaluated considering datasets from various domains. From the analysis of the results, it can be observed that the methods based on co-perspective disagreement are not effective for complex classification tasks, such as the multilabel and hierarchical classification. 
It can also be observed that the main degradation problem of the models of the semi-supervised algorithms worsens for the multilabel and hierarchical classification due to the fact that, for these cases, there is an increase in the causes of the degradation of the models built using semi-supervised learning based on co-perspective disagreement Aprendizado ativo Aprendizado semissupervisionado Classificação hierárquica Classificação multirrótulo Active learning Hierarchical classification Multi-label classification Semi-supervised learning
20	Molecular analysis of honey bee foraging ecology Richardson, Rodney Trey January 2018 (has links) No description available. Entomology Ecology Bioinformatics Biology Agriculture Pollen quantitative metabarcoding pollen molecular palynology hierarchical classification pollinator nutrition waggle dance pollinator foraging
