11 |
Efficient extreme classification / Classification extrême à faible complexité. Cisse, Mouhamadou Moustapha, 25 July 2014
In this thesis, we propose new low-complexity methods for classification problems with a very large number of labels, also called extreme classification. The proposed approaches reduce inference complexity compared with classical methods such as one-versus-rest, making the resulting classifiers usable in real-life applications. We propose two types of methods, for single-label and multilabel classification respectively. The first method uses existing hierarchical information among the categories to learn compact, low-dimensional binary representations of them. The second, dedicated to multilabel problems, adapts the Bloom Filter framework to represent subsets of labels as sparse, low-dimensional binary vectors. In both cases, binary classifiers are learned to predict the low-dimensional representation of the categories/labels, and algorithms are proposed to recover the set of relevant labels from the predicted representation. Large-scale experiments validate the methods and show performance superior to methods classically used for extreme classification.
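As a rough illustration of the Bloom Filter idea described in this abstract, the sketch below encodes a subset of labels as a sparse binary vector using k hash functions and decodes it by membership testing. The hash scheme, vector length, and decoding rule are illustrative assumptions, not the thesis's exact construction.

```python
import hashlib

# Illustrative parameters (assumptions, not the thesis's tuned values).
M = 64   # length of the binary representation
K = 2    # number of hash functions per label

def label_positions(label, num_bits=M, num_hashes=K):
    """Map one label to K bit positions, Bloom-filter style."""
    return [
        int(hashlib.sha1(f"{label}:{i}".encode()).hexdigest(), 16) % num_bits
        for i in range(num_hashes)
    ]

def encode(label_subset):
    """Encode a subset of labels as a sparse binary vector."""
    bits = [0] * M
    for label in label_subset:
        for pos in label_positions(label):
            bits[pos] = 1
    return bits

def decode(bits, label_universe):
    """Recover candidate labels: a label is predicted present only if all
    of its bit positions are set (false positives are possible, which is
    the classic Bloom-filter trade-off)."""
    return [
        label for label in label_universe
        if all(bits[p] for p in label_positions(label))
    ]

code = encode({"sports", "politics"})
print(decode(code, ["sports", "politics", "science", "art"]))
```

In the thesis's setting, each bit of the representation would be predicted by a binary classifier rather than computed from known labels, and the decoding step recovers the relevant label set from that (possibly noisy) prediction.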
|
12 |
Topical Classification of Images in Wikipedia: Development of topical classification models followed by a study of the visual content of Wikipedia / Ämneklassificering av bilder i Wikipedia: Utveckling av ämneklassificeringsmodeller följd av studier av Wikipedias bilddata. Vieira Bernat, Matheus, January 2023
With over 53 million articles and 11 million images, Wikipedia is the largest encyclopedia in history, and its user base is equally significant, with daily views surpassing 1 billion. A system this large needs task automation so that volunteers can maintain it. For textual data, a machine-learning system called ORES already automates tasks such as article quality estimation and article topic routing. A visual counterpart is needed to support tasks such as vandalism detection in images and to better understand Wikipedia's visual data. Researchers from the Wikimedia Foundation identified a hindrance to implementing such a visual counterpart of ORES: the images of Wikipedia lack topical metadata. This work therefore develops a deep learning model that classifies images into a set of topics pre-determined in parallel work, using state-of-the-art image classification models together with methods to mitigate the existing class imbalance. The experiments show, among other things, that using data that respects the label hierarchy performs better; that resampling techniques are ineffective at mitigating imbalance because of high label co-occurrence; that sample weighting improves metrics; and that initializing parameters pre-trained on ImageNet rather than randomly yields better metrics. Moreover, we find interesting outlier labels that obtain better performance metrics despite having fewer samples, which is believed to be due either to bias from pre-training or simply to a stronger signal in the label. The distribution of Wikipedia's visual data predicted by the models is also presented. Finally, qualitative examples of model predictions are given, demonstrating the model's ability to find correct labels that are missing from the ground truth.
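The sample-weighting result mentioned above can be illustrated with a weighted multilabel loss. The sketch below uses per-label positive weights in PyTorch's `BCEWithLogitsLoss`, one common way to counter class imbalance; the weighting rule and counts are assumptions, not the thesis's reported configuration.

```python
import torch
import torch.nn as nn

# Suppose label_counts[i] is the number of positive samples for label i
# (hypothetical counts for illustration).
label_counts = torch.tensor([9000.0, 500.0, 120.0, 30.0])
num_samples = 10000.0

# Up-weight positives of rare labels: pos_weight = negatives / positives.
pos_weight = (num_samples - label_counts) / label_counts
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.randn(8, 4)             # batch of 8 images, 4 topic labels
targets = torch.randint(0, 2, (8, 4)).float()
loss = criterion(logits, targets)      # errors on rare labels now cost more
print(loss.item())
```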
|
13 |
[en] RDS - RECOVERING DISCARDED SAMPLES WITH NOISY LABELS: TECHNIQUES FOR TRAINING DEEP LEARNING MODELS WITH NOISY SAMPLES / [pt] RDS - RECUPERANDO AMOSTRAS DESCARTADAS COM RÓTULOS RUIDOSOS: TÉCNICAS PARA TREINAMENTO DE MODELOS DE DEEP LEARNING COM AMOSTRAS RUIDOSAS. VITOR BENTO DE SOUSA, 20 May 2024
[en] Deep Learning models for image classification have achieved state-of-the-art performance across a wide range of applications. However, noisy samples, that is, samples with incorrect labels, are frequently encountered in datasets derived from real-world applications, and training Deep Learning models on such datasets compromises their performance. State-of-the-art models such as Co-teaching+ and Jocor use the Small Loss Approach (SLA) technique to handle noisy samples in the multiclass scenario. In this work, a new technique for handling noisy samples, named Recovering Discarded Samples (RDS), was developed to work in conjunction with SLA. To demonstrate its effectiveness, RDS was applied to the Co-teaching+ and Jocor models, resulting in two new models, RDS-C and RDS-J; the results indicate gains of up to 6 percent in test metrics for both. A third model, named RDS-Contrastive, was also developed and surpassed the state of the art by up to 4 percent in test accuracy. Furthermore, this work extended the SLA technique to the multilabel scenario, leading to the SLA Multilabel (SLAM) technique, with which two additional models for the multilabel scenario with noisy samples were developed. The multiclass models developed in this work were applied to a real-world environmental problem, while the multilabel models were applied as a solution to a real problem in the oil and gas industry.
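The Small Loss Approach that this abstract builds on treats low-loss samples as probably clean and trains only on those. A minimal per-batch sketch follows, with the keep ratio and selection rule as illustrative assumptions; RDS itself, which recovers some of the discarded samples, is not reproduced here.

```python
import torch
import torch.nn.functional as F

def small_loss_selection(logits, targets, keep_ratio=0.7):
    """Small Loss Approach: keep the keep_ratio fraction of samples with
    the lowest loss, assuming that high-loss samples are label noise."""
    per_sample_loss = F.cross_entropy(logits, targets, reduction="none")
    num_keep = int(keep_ratio * len(targets))
    keep_idx = torch.argsort(per_sample_loss)[:num_keep]
    # Train only on the presumed-clean subset.
    return per_sample_loss[keep_idx].mean(), keep_idx

logits = torch.randn(16, 10, requires_grad=True)   # toy batch, 10 classes
targets = torch.randint(0, 10, (16,))
loss, kept = small_loss_selection(logits, targets)
loss.backward()
print(f"kept {len(kept)} of 16 samples")
```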
|
14 |
[en] A STUDY OF MULTILABEL TEXT CLASSIFICATION ALGORITHMS USING NAIVE-BAYES / [pt] UM ESTUDO DE ALGORITMOS PARA CLASSIFICAÇÃO AUTOMÁTICA DE TEXTOS UTILIZANDO NAIVE-BAYES. DAVID STEINBRUCH, 12 March 2007
[en] The amount of electronic information has been growing fast, mainly due to the ease of publication and dissemination that the Internet provides. It is therefore necessary to organise this information so as to facilitate its retrieval. Many works have addressed this problem through automatic text classification, associating several labels with each text (multilabel classification). However, these works transform the problem into binary classification subproblems, assuming independence among the categories. Moreover, they use thresholds that are very specific to the training set employed and therefore have little generalization capacity in the learning process. This dissertation proposes two automatic text classification algorithms based on the multinomial naive Bayes algorithm, and their use in an on-line text classification environment with user relevance feedback. To test the efficiency of the proposed algorithms, experiments were performed on the Reuters-21578 news base and on the Ohsumed medical document base.
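A hedged sketch of the binary-relevance strategy criticised above: one independent classifier per label, here with multinomial naive Bayes in scikit-learn. The corpus and labels are made up for illustration, and this is the baseline the dissertation improves on, not its proposed algorithms.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MultiLabelBinarizer

# Toy corpus (hypothetical); each document can carry several labels.
docs = [
    "wheat prices rose on export news",
    "central bank raises interest rates",
    "grain exports and interest rate policy",
]
labels = [{"grain"}, {"money"}, {"grain", "money"}]

X = CountVectorizer().fit_transform(docs)
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# Binary relevance: one multinomial naive Bayes model per label, which
# implicitly assumes the labels are independent of one another.
clf = OneVsRestClassifier(MultinomialNB()).fit(X, Y)
print(mlb.inverse_transform(clf.predict(X)))
```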
|
15 |
Etude de relaxations en traitement d'images. Application à la segmentation et autres problèmes multi-étiquettes. / Relaxations in image processing, application to segmentation and other multi-label problems. Yildizoglu, Romain, 08 July 2014
In this thesis we study different relaxations of non-convex functionals that arise in image processing. Problems such as image segmentation can indeed be written as the minimization of a functional whose minimizer represents the sought segmentation. Different methods have been proposed to find local or global minima of the non-convex functional of the two-phase piecewise constant Mumford-Shah model; with a convex relaxation of this model, global minima of the non-convex functional can be obtained. We recall and compare some of these methods, and we propose a new narrow-band model that finds local minima while using robust algorithms from convex optimization. We then build a convex relaxation of a two-phase segmentation model that compares two given histograms with the histograms estimated globally over the two regions of the segmentation. Relaxations of high-dimensional multi-label problems such as optical flow computation are also studied. We propose a convex relaxation with an iterative algorithm that involves only exactly computable projections, as well as a new algorithm for a relaxation that is convex in each variable but not convex globally. Finally, we study how to estimate a solution of the original non-convex problem from a solution of the relaxed problem, comparing existing methods with new ones.
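For reference, here is a standard statement of the two-phase piecewise constant Mumford-Shah (Chan-Vese) functional and its usual convex relaxation, written from the general literature rather than from the thesis itself, so the notation may differ from the author's.

```latex
% Two-phase piecewise constant Mumford-Shah (Chan-Vese) energy for an
% image f on a domain \Omega, partitioned into \Sigma and \Omega\setminus\Sigma:
E(\Sigma, c_1, c_2) = \operatorname{Per}(\Sigma)
  + \lambda \int_{\Sigma} (f(x) - c_1)^2 \, dx
  + \lambda \int_{\Omega \setminus \Sigma} (f(x) - c_2)^2 \, dx

% Standard convex relaxation: replace the set \Sigma by a function
% u : \Omega \to [0,1] and minimize the total-variation energy
\min_{0 \le u \le 1} \; \int_{\Omega} |\nabla u|
  + \lambda \int_{\Omega} \big( (f - c_1)^2 - (f - c_2)^2 \big) \, u \, dx

% Thresholding a minimizer u at almost any level in (0,1) recovers a
% global minimizer of the original non-convex two-phase problem
% (Chan, Esedoglu, Nikolova).
```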
|
16 |
Modeling documents combining text and image: application to categorization and multimedia information retrieval / Modélisation de documents combinant texte et image : application à la catégorisation et à la recherche d'information multimédia. Moulin, Christophe, 22 June 2011
Exploiting multimedia documents raises the problem of representing the textual and visual information they contain. Our goal is to propose a model for representing each of these modalities and combining them for two tasks: categorization and information retrieval. The model represents documents as bags of words, which requires building dedicated vocabularies. The textual vocabulary, generally very large, consists of the words appearing in the documents. The visual vocabulary is built by extracting low-level features from the images; we study the different steps of its construction and a tf-idf weighting of visual words in images, inspired by the approaches classically used for textual words. In the context of textual document categorization, we introduce a criterion that selects the words most discriminative for the categories, reducing the vocabulary size without degrading classification results. In the multilabel setting, we also present a method for selecting the categories to associate with a document. For information retrieval, we propose an analytical learning approach that linearly combines the results obtained from the textual and visual information, significantly improving retrieval. Our model is validated on these tasks through participation in international competitions such as XML Mining and ImageCLEF, on collections of substantial size.
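A small sketch of the tf-idf weighting of visual words mentioned above, assuming the images have already been quantized into visual-word counts against a learned codebook (the counts below are hypothetical); it simply mirrors the textual tf-idf formula.

```python
import numpy as np

# Rows: images, columns: visual words (hypothetical counts obtained by
# quantizing low-level image descriptors against a codebook).
counts = np.array([
    [4, 0, 1],
    [0, 3, 2],
    [1, 1, 0],
], dtype=float)

tf = counts / counts.sum(axis=1, keepdims=True)   # term frequency per image
df = (counts > 0).sum(axis=0)                     # images containing each word
idf = np.log(len(counts) / df)                    # inverse document frequency
tfidf = tf * idf                                  # weight of each visual word
print(tfidf.round(3))
```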
|
17 |
An interactive system for the analysis of electroacoustic music / Un système interactif pour l'analyse des musiques électroacoustiques. Gulluni, Sébastien, 20 December 2011
Electroacoustic music is still relatively little addressed by research aiming to retrieve information from musical content; most research on this music focuses on composition tools, pedagogy, and musical analysis. In this thesis, we address the scientific problems related to the analysis of electroacoustic music. After placing this music in its historical context, a study of the analysis practices of three professionals provides directions for designing an analysis system. We then propose an interactive system to assist the analysis of electroacoustic music, which retrieves the different instances of the sound objects composing a polyphonic piece. The system first performs a segmentation to extract the initial instances of the main sound objects. The user can then select the objects of interest before entering an interaction loop based on active learning and relevance feedback provided by the user. This feedback is used by the system to perform a multilabel classification of the sound segments according to the targeted sound objects. An evaluation by user simulation, carried out on a corpus of synthetic pieces, shows that our approach achieves satisfactory results within a reasonable number of interactions.
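The interaction loop described above (active learning driven by user relevance feedback) can be sketched roughly as follows for a single sound object; the thesis's system is multilabel, which amounts to running one such relevance model per targeted object. The uncertainty-sampling query rule, the scikit-learn classifier, and the synthetic features are assumptions standing in for the system's actual components.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))           # audio-segment features (synthetic)
true_y = (X[:, 0] > 0).astype(int)      # hidden relevance of one sound object

# Seed with two user-labeled segments, one from each class.
labeled = [int(np.argmax(X[:, 0])), int(np.argmin(X[:, 0]))]
for _ in range(10):                     # interaction loop
    clf = LogisticRegression().fit(X[labeled], true_y[labeled])
    proba = clf.predict_proba(X)[:, 1]
    # Uncertainty sampling: query the segment the model is least sure about.
    candidates = [i for i in range(len(X)) if i not in labeled]
    query = min(candidates, key=lambda i: abs(proba[i] - 0.5))
    labeled.append(query)               # user feedback simulated via true_y
print(f"accuracy after feedback: {clf.score(X, true_y):.2f}")
```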
|
18 |
Multilabel classification of EEG-based combined motor imageries implemented for the 3D control of a robotic arm / Classification multilabels à partir de signaux EEG d'imaginations motrices combinées : application au contrôle 3D d'un bras robotique. Lindig León, Cecilia, 10 January 2017
Brain-Computer Interfaces (BCIs) replace the natural outputs of the nervous system with artificial ones that do not require the use of peripheral nerves, allowing people with severe motor impairments to interact, using only their brain activity, with different types of applications such as spellers, neuroprostheses, wheelchairs or other robotic devices. A very popular technique for recording signals for BCI purposes is electroencephalography (EEG), since, in contrast with other alternatives, it is noninvasive and inexpensive. In addition, because Motor Imagery (MI, i.e., the brain oscillations generated when subjects imagine performing a movement without actually accomplishing it) produces patterns suitable for self-paced, stimulus-free paradigms, this combination has become a common solution for BCI neuroprosthesis design. However, even though important progress has been made in recent years, full 3D control remains an unaccomplished objective. To explore new solutions for overcoming the existing limitations, we present a multiclass approach based on the detection of combined motor imageries, i.e., two or more body parts used at the same time. The proposed paradigm uses the left hand, the right hand, and both feet together, from which eight commands are derived by combination to direct a robotic arm through fourteen different movements affording full 3D control. To this end, an innovative scheme for switching between three modes (arm, wrist, or finger movements), which allows different actions to be managed with the same command, was designed and implemented on the OpenViBE platform. Furthermore, for feature extraction, a novel signal processing scheme was developed based on the specific location of the activity sources related to the considered body parts. This insight allows the conditions engaging the same limb to be grouped within a single class, so that the original multiclass task is transformed into an equivalent problem involving a series of binary classification models. This approach allows the use of the Common Spatial Pattern (CSP) algorithm, whose power to discriminate sensorimotor rhythms has been widely shown, but which has the drawback of only being applicable to two classes. We have therefore also contributed a new strategy that combines CSP with Riemannian geometry: the CSP-projected trials are mapped onto the Riemannian manifold, where more discriminative features can be obtained as the distances separating the input data from the considered class means. These strategies were applied in three new classification approaches, which were compared with classical multiclass discrimination methods using the EEG signals of a group of naive healthy subjects, showing that the proposed alternatives not only outperform existing schemes but also reduce the complexity of the classification task.
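A rough sketch of the CSP-plus-Riemannian-distance idea on toy data: CSP filters from a generalized eigenproblem, then the affine-invariant Riemannian distance from a trial's covariance to each class mean as a feature. The filter count, toy signals, and minimum-distance classification rule are illustrative assumptions, not the thesis's exact pipeline (real EEG trials would be band-pass filtered first).

```python
import numpy as np
from scipy.linalg import eigh, eigvalsh

def csp_filters(cov_a, cov_b, n_filters=4):
    """CSP via the generalized eigenproblem cov_a w = lambda (cov_a+cov_b) w;
    keep the filters with the most extreme eigenvalues."""
    vals, vecs = eigh(cov_a, cov_a + cov_b)
    order = np.argsort(vals)
    picks = np.r_[order[: n_filters // 2], order[-(n_filters // 2):]]
    return vecs[:, picks]                      # (channels, n_filters)

def riemann_dist(A, B):
    """Affine-invariant Riemannian distance between SPD matrices."""
    return np.sqrt(np.sum(np.log(eigvalsh(A, B)) ** 2))

rng = np.random.default_rng(1)
cov = lambda x: x @ x.T / x.shape[1]           # sample covariance of a trial
class_a = [cov(rng.normal(size=(8, 256))) for _ in range(20)]  # toy "EEG"
class_b = [cov(rng.normal(size=(8, 256))) for _ in range(20)]

means = [np.mean(class_a, axis=0), np.mean(class_b, axis=0)]
W = csp_filters(means[0], means[1])

def features(trial_cov):
    """Distances from the CSP-projected trial covariance to each
    CSP-projected class mean (minimum distance gives the class)."""
    c = W.T @ trial_cov @ W
    return [riemann_dist(c, W.T @ m @ W) for m in means]

print(features(class_a[0]))
```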
|
19 |
Extraction of medical knowledge from clinical reports and chest x-rays using machine learning techniques. Bustos, Aurelia, 19 June 2019
This thesis addresses the extraction of medical knowledge from clinical text using deep learning techniques. In particular, the proposed methods focus on cancer clinical trial protocols and chest x-ray reports. The main results are a proof of concept that machine learning methods can discern inclusion from exclusion criteria in short free-text clinical notes, and a large-scale chest x-ray image dataset labeled with radiological findings, diagnoses and anatomic locations. Clinical trials provide the evidence needed to determine the safety and effectiveness of new medical treatments. These trials form the basis of clinical practice guidelines and greatly assist clinicians in their daily practice when making treatment decisions. However, the eligibility criteria used in oncology trials are too restrictive. Patients are often excluded on the basis of comorbidity, past or concomitant treatments, or the fact that they are over a certain age, so the patients that are selected do not mimic clinical practice. This means that the results obtained in clinical trials cannot be extrapolated to patients whose clinical profiles were excluded from the trial protocols, and the efficacy and safety of new treatments for patients with these characteristics are therefore not defined. Given the clinical characteristics of particular patients, their type of cancer and the intended treatment, discovering whether or not they are represented in the corpus of available clinical trials requires the manual review of numerous eligibility criteria, which is impracticable for clinicians on a daily basis. In this thesis, a large medical corpus comprising all cancer clinical trial protocols published by competent authorities in the last 18 years was used to extract medical knowledge and automatically learn patients' eligibility for these trials. To this end, a model was built to automatically predict whether short clinical statements are inclusion or exclusion criteria: a method based on deep neural networks was trained on a dataset of 6 million short free texts to classify them as eligible or not eligible, using pretrained word embeddings as inputs. The semantic reasoning of the resulting word-embedding representations was also analyzed, and it was possible to identify equivalent treatments for a tumor type by analogy with the drugs used to treat other tumors. Results show that representation learning using deep neural networks can be successfully leveraged to extract medical knowledge from clinical trial protocols and potentially assist practitioners when prescribing treatments. The second main task addressed in this thesis concerns knowledge extraction from medical reports associated with radiographs. Conventional radiology remains the most performed technique in radiodiagnosis services, with a percentage close to 75% (Radiología Médica, 2010). In particular, chest x-ray is the most common medical imaging exam, with over 35 million taken every year in the US alone (Kamel et al., 2017). Chest x-rays allow for inexpensive screening of several pathologies, including masses, pulmonary nodules, effusions, cardiac abnormalities and pneumothorax.
For this task, all the chest x-rays interpreted and reported by radiologists at the Hospital Universitario de San Juan (Alicante) from Jan 2009 to Dec 2017 were used to build a novel large-scale dataset in which each high-resolution radiograph is labeled with its corresponding metadata, radiological findings and pathologies. This dataset, named PadChest, includes more than 160,000 images from 67,000 patients, covering six different position views, with additional information on image acquisition and patient demographics. The free-text reports, written in Spanish by radiologists, were labeled with 174 different radiographic findings, 19 differential diagnoses and 104 anatomic locations, organized as a hierarchical taxonomy and mapped onto standard Unified Medical Language System (UMLS) terminology. For this, a subset of the reports (27%) was manually annotated by trained physicians, whereas the remaining set was automatically labeled with deep supervised learning methods using attention mechanisms fed with the text reports. The generated labels were then validated on an independent test set, achieving a 0.93 Micro-F1 score. To the best of our knowledge, this is one of the largest public chest x-ray databases suitable for training supervised models on radiographs, and the first to contain radiographic reports in Spanish. The PadChest dataset can be downloaded on request from http://bimcv.cipf.es/bimcv-projects/padchest/. PadChest is intended for training image classifiers based on deep learning techniques to extract medical knowledge from chest x-rays. It is essential that automatic radiology reporting methods be integrated, in a clinically validated manner, into radiologists' workflow in order to help specialists improve their efficiency and enable safer, actionable reporting. Computer vision methods capable of identifying the large spectrum of thoracic abnormalities (and also normality) need to be trained on large-scale, comprehensively labeled x-ray datasets such as PadChest; once clinically validated, such tools could serve a broad range of unmet needs. Beyond implementing and obtaining results for both clinical trials and chest x-rays, this thesis studies the nature of health data, the novelty of applying deep learning methods to obtain large-scale labeled medical datasets, and the relevance of their applications in medical research, which have contributed to the work's extramural diffusion and worldwide reach. The thesis describes this journey so that the reader is guided across multiple disciplines, from engineering to medicine, up to ethical considerations in artificial intelligence applied to medicine.
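The eligibility-criteria classifier described above, pretrained word embeddings feeding a neural network that labels short free-text statements, can be sketched as follows; the embedding source, architecture, and dimensions are assumptions, not the thesis's reported configuration.

```python
import torch
import torch.nn as nn

VOCAB, DIM = 5000, 100                     # hypothetical vocabulary and size

class CriteriaClassifier(nn.Module):
    """Mean-pooled pretrained embeddings -> eligible / not eligible."""
    def __init__(self, pretrained: torch.Tensor):
        super().__init__()
        # freeze=False would instead fine-tune the pretrained vectors
        self.emb = nn.Embedding.from_pretrained(pretrained, freeze=True)
        self.head = nn.Sequential(
            nn.Linear(DIM, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, token_ids):          # (batch, seq_len)
        pooled = self.emb(token_ids).mean(dim=1)
        return self.head(pooled)

# Random stand-in for embeddings pretrained on clinical text.
model = CriteriaClassifier(torch.randn(VOCAB, DIM))
tokens = torch.randint(0, VOCAB, (4, 12))  # 4 tokenized short statements
logits = model(tokens)
print(logits.shape)                        # torch.Size([4, 2])
```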
|