11

Topical Classification of Images in Wikipedia : Development of topical classification models followed by a study of the visual content of Wikipedia / Ämneklassificering av bilder i Wikipedia : Utveckling av ämneklassificeringsmodeller följd av studier av Wikipedias bilddata

Vieira Bernat, Matheus, January 2023
With over 53 million articles and 11 million images, Wikipedia is the largest encyclopedia in history. Its readership is equally significant, with daily views surpassing 1 billion. A system of this size needs task automation so that volunteers can maintain it. For textual data, a machine learning system called ORES automates tasks such as article quality estimation and article topic routing. A visual counterpart is needed to support tasks such as vandalism detection in images and to better understand Wikipedia's visual data. Researchers from the Wikimedia Foundation identified a hindrance to implementing the visual counterpart of ORES: the images of Wikipedia lack topical metadata. This work therefore develops a deep learning model that classifies images into a set of topics pre-determined in parallel work, using state-of-the-art image classification models together with methods to mitigate the existing class imbalance. The experiments show, among other things, that using data that respects the label hierarchy performs better; that resampling techniques are ineffective at mitigating imbalance because of high label co-occurrence; that sample weighting improves metrics; and that initializing parameters from ImageNet pre-training rather than randomly yields better metrics. Moreover, we find interesting outlier labels that obtain better performance metrics despite having fewer samples, which is believed to be due either to bias from pre-training or simply to more signal in the label. The distribution of the visual data predicted by the models is also presented. Finally, qualitative examples of model predictions on selected images demonstrate the model's ability to find correct labels that are missing from the ground truth.
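A minimal sketch of the imbalance-aware setup this abstract describes: multilabel fine-tuning of an ImageNet-pretrained backbone with per-label weighting in the loss. The backbone choice, topic count, and weighting scheme below are illustrative assumptions, not details from the thesis.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_TOPICS = 30  # hypothetical number of topic labels

# Start from ImageNet-pretrained weights (the abstract reports this
# outperforms random initialization) and replace the classifier head.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_TOPICS)

# Per-label positive weights: rarer labels get larger weights, a simple
# stand-in for the sample weighting the abstract says improves metrics.
label_counts = torch.randint(50, 5000, (NUM_TOPICS,)).float()  # placeholder counts
pos_weight = label_counts.sum() / (NUM_TOPICS * label_counts)

# Multilabel objective: one sigmoid/BCE term per topic.
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
optimizer = torch.optim.AdamW(backbone.parameters(), lr=1e-4)

def train_step(images, targets):
    """One optimization step; targets is a float {0,1} matrix (batch x topics)."""
    optimizer.zero_grad()
    loss = criterion(backbone(images), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```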
12

[en] A STUDY OF MULTILABEL TEXT CLASSIFICATION ALGORITHMS USING NAIVE-BAYES / [pt] UM ESTUDO DE ALGORITMOS PARA CLASSIFICAÇÃO AUTOMÁTICA DE TEXTOS UTILIZANDO NAIVE-BAYES

DAVID STEINBRUCH, 12 March 2007
The amount of electronic information has been growing fast, mainly because of the ease of publication and dissemination that the Internet provides, so information must be organized in a way that facilitates its retrieval. Many works have addressed this problem through automatic text classification, associating several labels with each text (multilabel classification). However, these works transform the problem into binary classification subproblems, assuming that the categories are independent. Moreover, they use thresholds that are very specific to the training set and therefore generalize poorly. This thesis proposes two text classifiers based on the multinomial naive Bayes algorithm, together with their use in an on-line text classification environment with user relevance feedback. To test the efficiency of the proposed algorithms, experiments were performed on the Reuters-21578 news base and on the Ohsumed medical document base.
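A compact sketch of the binary-relevance scheme this abstract critiques, assuming scikit-learn and a toy corpus: one independent multinomial naive Bayes classifier per label, with the kind of fixed decision threshold the thesis argues generalizes poorly.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MultiLabelBinarizer

# Toy multilabel corpus (stand-ins for Reuters-21578 documents).
docs = ["oil prices rise on supply fears",
        "central bank raises interest rates",
        "oil exporters discuss interest rate impact"]
labels = [{"crude"}, {"interest"}, {"crude", "interest"}]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)                 # one binary column per label
vec = CountVectorizer()
X = vec.fit_transform(docs)                   # term counts for multinomial NB

# Binary relevance: an independent MultinomialNB per label, which is
# exactly the category-independence assumption the thesis questions.
clf = OneVsRestClassifier(MultinomialNB()).fit(X, Y)

THRESHOLD = 0.5  # fixed threshold, tuned per training set in prior works
probs = clf.predict_proba(X)
predicted = [set(mlb.classes_[(p >= THRESHOLD).nonzero()[0]]) for p in probs]
print(predicted)
```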
13

Etude de relaxations en traitement d'images. Application à la segmentation et autres problèmes multi-étiquettes. / Relaxations in image processing: application to segmentation and other multi-label problems

Yildizoglu, Romain, 08 July 2014
In this thesis we study different relaxations of non-convex functionals that arise in image processing. Problems such as image segmentation can indeed be written as the minimization of a functional whose minimizer represents the desired segmentation. Different methods have been proposed to find local or global minima of the non-convex functional of the two-phase piecewise constant Mumford-Shah model. With a convex relaxation of this model, a global minimum of the non-convex functional can be found. We present and compare some of these methods, and we propose a new narrow-band model that finds local minima while using robust algorithms from convex optimization. We then build a convex relaxation of a two-phase segmentation model that compares two given histograms with those estimated globally on the two regions of the segmentation. We also study relaxations of high-dimensional multi-label problems such as optical flow. We propose a convex relaxation with an iterative algorithm involving only projections that can be computed exactly, as well as a new algorithm for a relaxation that is convex in each variable but not globally convex. Finally, we study how to recover a solution of the original non-convex problem from a solution of the relaxed problem, comparing existing methods with new ones.
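For concreteness, the standard convex relaxation of the two-phase piecewise constant Mumford-Shah (Chan-Vese) model referred to above can be written as follows, in the style of Chan, Esedoglu and Nikolova; the exact formulation in the thesis may differ.

```latex
% f : given image, c_1, c_2 : fixed region means, \lambda > 0.
% The non-convex set variable is relaxed to u : \Omega \to [0,1].
\min_{u \in BV(\Omega;[0,1])} \;
  \int_\Omega |\nabla u| \, dx
  \;+\; \lambda \int_\Omega u(x)\,\big[(c_1 - f(x))^2 - (c_2 - f(x))^2\big] \, dx
```

Thresholding a minimizer $u^\ast$ at almost any level $t \in (0,1)$ yields a global minimizer of the original non-convex two-phase problem, which is what makes this relaxation exact.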
14

Modélisation de documents combinant texte et image : application à la catégorisation et à la recherche d'information multimédia / Modeling documents combining text and image: application to categorization and multimedia information retrieval

Moulin, Christophe, 22 June 2011
Exploiting multimedia documents raises the problem of representing the textual and visual information they contain. Our goal is to propose a model that represents each of these types of information and combines them for two tasks: categorization and information retrieval. This model represents documents as bags of words, which requires building dedicated vocabularies. The textual vocabulary, generally very large, consists of the words appearing in the documents. The visual vocabulary is built by extracting low-level features from the images. We study the different steps of its creation, as well as a tf-idf weighting of visual words in images inspired by the approaches classically used for textual words. In the context of textual document categorization, we introduce a criterion that selects the words most discriminative for the categories, in order to reduce the vocabulary size without degrading classification results. In the multilabel setting, we also present a method for selecting the categories to associate with a document. In information retrieval, we propose an analytical, learning-based approach for linearly combining the results obtained from the textual and visual information, which significantly improves retrieval. Our model is validated on these tasks through participation in international competitions such as XML Mining and ImageCLEF, and on collections of substantial size.
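A minimal sketch of the visual bag-of-words pipeline with tf-idf weighting described above, assuming local descriptors (e.g., SIFT-like vectors) have already been extracted per image; the vocabulary size and the random stand-in descriptors are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfTransformer

VOCAB_SIZE = 500  # hypothetical number of visual words

def build_vocabulary(descriptors):
    """Quantize local descriptors into visual words with k-means."""
    return KMeans(n_clusters=VOCAB_SIZE, n_init=10).fit(descriptors)

def bow_histogram(kmeans, image_descriptors):
    """Bag-of-visual-words histogram for one image."""
    words = kmeans.predict(image_descriptors)
    return np.bincount(words, minlength=VOCAB_SIZE)

# Stand-in data: 10 images, each with 200 random 128-d descriptors.
rng = np.random.default_rng(0)
per_image = [rng.normal(size=(200, 128)) for _ in range(10)]
kmeans = build_vocabulary(np.vstack(per_image))
counts = np.array([bow_histogram(kmeans, d) for d in per_image])

# tf-idf over visual words, mirroring the weighting used for textual words.
tfidf = TfidfTransformer().fit_transform(counts)
print(tfidf.shape)  # (10, VOCAB_SIZE)
```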
15

Un système interactif pour l'analyse des musiques électroacoustiques / An interactive system for the analysis of electroacoustic music

Gulluni, Sébastien, 20 December 2011
Electroacoustic music is still relatively little addressed by research that aims to retrieve information from musical content. Most research on this music focuses on composition tools, pedagogy and musical analysis. In this thesis, we address the scientific problems raised by the analysis of electroacoustic music. After placing this music in its historical context, a study of the analysis practices of three professionals provides guidelines for designing an analysis system. We then propose an interactive system for assisting the analysis of electroacoustic music, which retrieves the different instances of the sound objects composing a polyphonic piece. The system first performs a segmentation in order to identify the initial instances of the main sound objects. The user can then select the objects of interest before entering an interaction loop that uses active learning and relevance feedback provided by the user. This feedback is used by the system to perform a multilabel classification of the sound segments according to the target sound objects. An evaluation by user simulation is carried out on a corpus of synthetic pieces and shows that our approach obtains satisfactory results within a reasonable number of interactions.
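A schematic sketch of such an interaction loop, assuming pre-computed features per audio segment and a simulated user oracle; the feature set, classifier, and uncertainty criterion are all assumptions rather than the thesis's actual design.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 20))                      # stand-in features, 300 segments
true_Y = (rng.random((300, 4)) < 0.3).astype(int)   # 4 hypothetical sound objects
true_Y[:4], true_Y[4:8] = 1, 0                      # seed pool covers both classes

labeled = list(range(8))                            # segments initially marked by the user
model = OneVsRestClassifier(LogisticRegression(max_iter=1000))

for round_ in range(10):                            # interaction loop
    model.fit(X[labeled], true_Y[labeled])
    probs = model.predict_proba(X)
    # Uncertainty sampling: query the segment whose least certain label
    # is closest to 0.5, then read its labels as simulated relevance feedback.
    uncertainty = np.abs(probs - 0.5).min(axis=1)
    candidates = [i for i in np.argsort(uncertainty) if i not in labeled]
    labeled.append(candidates[0])

print(f"{len(labeled)} segments labeled after the simulated session")
```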
16

Classification multilabels à partir de signaux EEG d'imaginations motrices combinées : application au contrôle 3D d'un bras robotique / Multilabel classification of EEG-based combined motor imageries implemented for the 3D control of a robotic arm

Lindig León, Cecilia, 10 January 2017
Brain-Computer Interfaces (BCIs) replace the natural outputs of the nervous system with artificial ones that do not require the use of peripheral nerves, allowing people with severe motor impairments to interact, using only their brain activity, with different types of applications such as spellers, neuroprostheses, wheelchairs and other robotic devices. A very popular technique for recording signals for BCI purposes is electroencephalography (EEG), since, in contrast with other alternatives, it is noninvasive and inexpensive. In addition, because Motor Imagery (MI, i.e., the brain oscillations generated when subjects imagine performing a movement without actually accomplishing it) can produce suitable patterns for self-paced paradigms, this combination has become a common solution for the design of BCI neuroprostheses. However, even though important progress has been made in recent years, full 3D control remains an unaccomplished objective. In order to explore new solutions for overcoming the existing limitations, we present a multiclass approach based on the detection of combined motor imageries, i.e., two or more body parts used at the same time. The proposed paradigm uses the left hand, the right hand, and both feet together, from which eight commands are derived to direct a robotic arm with fourteen different movements affording full 3D control. To this end, an innovative switching-mode scheme that allows different actions to be managed with the same command was designed and implemented on the OpenViBE platform. Furthermore, a novel signal processing scheme for feature extraction was developed, based on the specific location of the activity sources related to the body parts considered. This insight allows the conditions engaging the same limb to be grouped into a single class, so that the original multiclass task is transformed into an equivalent problem involving a series of binary classification models. Such an approach makes it possible to use the Common Spatial Pattern (CSP) algorithm, which has been shown to be powerful at discriminating sensorimotor rhythms but has the drawback of only differentiating between two classes. From this perspective, we also contribute a new strategy that combines the CSP algorithm with Riemannian geometry: the CSP-projected trials are mapped onto the Riemannian manifold, where more discriminative features can be obtained as the distances separating the input data from the considered class means. These strategies were applied to three new classification approaches and compared with classical multiclass methods using the EEG signals of a group of naive healthy subjects, showing that the proposed alternatives not only outperform existing schemes but also reduce the complexity of the classification task.
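A bare-bones sketch of binary CSP of the kind mentioned above, assuming band-pass-filtered EEG trials shaped (trials, channels, samples); the Riemannian distance-to-mean features and the one-vs-rest limb grouping are not shown.

```python
import numpy as np
from scipy.linalg import eigh

def class_covariance(trials):
    """Average normalized spatial covariance over trials (trials x channels x samples)."""
    covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
    return np.mean(covs, axis=0)

def csp_filters(trials_a, trials_b, n_pairs=3):
    """Solve the generalized eigenproblem C_a w = l (C_a + C_b) w and keep
    the eigenvectors with the most extreme eigenvalues."""
    Ca, Cb = class_covariance(trials_a), class_covariance(trials_b)
    eigvals, eigvecs = eigh(Ca, Ca + Cb)
    order = np.argsort(eigvals)
    picks = np.concatenate([order[:n_pairs], order[-n_pairs:]])
    return eigvecs[:, picks].T  # (2*n_pairs, channels)

def csp_features(W, trial):
    """Log-variance of the spatially filtered trial, the classic CSP feature."""
    z = W @ trial
    var = z.var(axis=1)
    return np.log(var / var.sum())

# Stand-in data: 20 trials per class, 16 channels, 512 samples.
rng = np.random.default_rng(2)
a, b = rng.normal(size=(20, 16, 512)), rng.normal(size=(20, 16, 512))
W = csp_filters(a, b)
print(csp_features(W, a[0]).shape)  # (6,)
```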
17

Extraction of medical knowledge from clinical reports and chest x-rays using machine learning techniques

Bustos, Aurelia, 19 June 2019
This thesis addresses the extraction of medical knowledge from clinical text using deep learning techniques. In particular, the proposed methods focus on cancer clinical trial protocols and chest x-ray reports. The main results are a proof of concept of the capability of machine learning methods to discern which statements constitute inclusion or exclusion criteria in short free-text clinical notes, and a large-scale chest x-ray image dataset labeled with radiological findings, diagnoses and anatomic locations. Clinical trials provide the evidence needed to determine the safety and effectiveness of new medical treatments. These trials are the basis of clinical practice guidelines and greatly assist clinicians in their daily treatment decisions. However, the eligibility criteria used in oncology trials are too restrictive. Patients are often excluded on the basis of comorbidity, past or concomitant treatments, or being over a certain age, and the patients that are selected therefore do not mimic clinical practice. This means that the results obtained in clinical trials cannot be extrapolated to patients whose clinical profiles were excluded from the trial protocols, so the efficacy and safety of new treatments for patients with these characteristics are not defined. Given the clinical characteristics of a particular patient, their type of cancer and the intended treatment, discovering whether or not they are represented in the corpus of available clinical trials requires the manual review of numerous eligibility criteria, which is impracticable for clinicians on a daily basis. In this thesis, a large medical corpus comprising all cancer clinical trial protocols published by competent authorities in the last 18 years was used to extract medical knowledge in order to help automatically determine patients' eligibility for these trials. To this end, a model was built to automatically predict whether short clinical statements were considered inclusion or exclusion criteria. A method based on deep neural networks was trained on a dataset of 6 million short free texts to classify them as eligible or not eligible, using pretrained word embeddings as inputs. The semantic reasoning of the resulting word-embedding representations was also analyzed, making it possible to identify equivalent treatments for a type of tumor by analogy with the drugs used to treat other tumors. The results show that representation learning using deep neural networks can be successfully leveraged to extract medical knowledge from clinical trial protocols and can potentially assist practitioners when prescribing treatments. The second main task addressed in this thesis concerns knowledge extraction from the medical reports associated with radiographs. Conventional radiology remains the most frequently performed technique in radiodiagnosis services, with a percentage close to 75% (Radiología Médica, 2010). In particular, chest x-ray is the most common medical imaging exam, with over 35 million taken every year in the US alone (Kamel et al., 2017). They allow for inexpensive screening of several pathologies, including masses, pulmonary nodules, effusions, cardiac abnormalities and pneumothorax.
For this task, all the chest x-rays interpreted and reported by radiologists at the Hospital Universitario de San Juan (Alicante) from January 2009 to December 2017 were used to build a novel large-scale dataset in which each high-resolution radiograph is labeled with its corresponding metadata, radiological findings and pathologies. This dataset, named PadChest, includes more than 160,000 images from 67,000 patients, covering six different position views together with additional information on image acquisition and patient demography. The free-text reports, written in Spanish by radiologists, were labeled with 174 different radiographic findings, 19 differential diagnoses and 104 anatomic locations organized as a hierarchical taxonomy and mapped onto standard Unified Medical Language System (UMLS) terminology. To this end, a subset of the reports (27%) was manually annotated by trained physicians, while the remaining set was automatically labeled with deep supervised learning methods using attention mechanisms fed with the text reports. The generated labels were then validated on an independent test set, achieving a 0.93 Micro-F1 score. To the best of our knowledge, this is one of the largest public chest x-ray databases suitable for training supervised models on radiographs, and the first to contain radiographic reports in Spanish. The PadChest dataset can be downloaded on request from http://bimcv.cipf.es/bimcv-projects/padchest/. PadChest is intended for training image classifiers based on deep learning techniques to extract medical knowledge from chest x-rays. It is essential that automatic radiology reporting methods be integrated, in a clinically validated manner, into radiologists' workflow in order to help specialists improve their efficiency and enable safer and more actionable reporting. Computer vision methods capable of identifying the full spectrum of thoracic abnormalities (as well as normality) need to be trained on large-scale, comprehensively labeled x-ray datasets such as PadChest. Once clinically validated, such computer vision tools could serve to fulfill a broad range of unmet needs. Beyond implementing and obtaining results for both clinical trials and chest x-rays, this thesis studies the nature of health data, the novelty of applying deep learning methods to obtain large-scale labeled medical datasets, and the relevance of these applications in medical research, which has contributed to its extramural diffusion and worldwide reach. The thesis describes this journey so that the reader is guided across multiple disciplines, from engineering to medicine and on to ethical considerations in artificial intelligence applied to medicine.
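A toy sketch of the first task, classifying short free-text statements as inclusion or exclusion criteria. The thesis trains deep networks over pretrained word embeddings; this stand-in uses a simple scikit-learn pipeline and invented criteria strings purely to make the setup concrete.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Stand-in criteria statements; the real dataset holds ~6 million of these.
texts = [
    "histologically confirmed breast cancer",
    "prior chemotherapy within 6 months",
    "age 18 years or older",
    "uncontrolled intercurrent illness",
]
labels = ["inclusion", "exclusion", "inclusion", "exclusion"]

# tf-idf n-grams plus a linear classifier as a stand-in for the
# embedding-based deep network described in the abstract.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["prior radiotherapy within 6 months"]))  # likely 'exclusion'
```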
