Global ETD Search

501	Weakly supervised learning of deformable part models and convolutional neural networks for object detection / Détection d'objets faiblement supervisée par modèles de pièces déformables et réseaux de neurones convolutionnels Tang, Yuxing 14 December 2016 (has links) Dans cette thèse, nous nous intéressons au problème de la détection d’objets faiblement supervisée. Le but est de reconnaître et de localiser des objets dans les images, n’ayant à notre disposition durant la phase d’apprentissage que des images partiellement annotées au niveau des objets. Pour cela, nous avons proposé deux méthodes basées sur des modèles différents. Pour la première méthode, nous avons proposé une amélioration de l’approche ”Deformable Part-based Models” (DPM) faiblement supervisée, en insistant sur l’importance de la position et de la taille du filtre racine initial spécifique à la classe. Tout d’abord, un ensemble de candidats est calculé, ceux-ci représentant les positions possibles de l’objet pour le filtre racine initial, en se basant sur une mesure générique d’objectness (par region proposals) pour combiner les régions les plus saillantes et potentiellement de bonne qualité. Ensuite, nous avons proposé l’apprentissage du label des classes latentes de chaque candidat comme un problème de classification binaire, en entrainant des classifieurs spécifiques pour chaque catégorie afin de prédire si les candidats sont potentiellement des objets cible ou non. De plus, nous avons amélioré la détection en incorporant l’information contextuelle à partir des scores de classification de l’image. Enfin, nous avons élaboré une procédure de post-traitement permettant d’élargir et de contracter les régions fournies par le DPM afin de les adapter efficacement à la taille de l’objet, augmentant ainsi la précision finale de la détection. 
Pour la seconde approche, nous avons étudié dans quelle mesure l’information tirée des objets similaires d’un point de vue visuel et sémantique pouvait être utilisée pour transformer un classifieur d’images en détecteur d’objets d’une manière semi-supervisée sur un large ensemble de données, pour lequel seul un sous-ensemble des catégories d’objets est annoté avec des boîtes englobantes nécessaires pour l’apprentissage des détecteurs. Nous avons proposé de transformer des classifieurs d’images basés sur des réseaux convolutionnels profonds (Deep CNN) en détecteurs d’objets en modélisant les différences entre les deux en considérant des catégories disposant à la fois de l’annotation au niveau de l’image globale et l’annotation au niveau des boîtes englobantes. Cette information de différence est ensuite transférée aux catégories sans annotation au niveau des boîtes englobantes, permettant ainsi la conversion de classifieurs d’images en détecteurs d’objets. Nos approches ont été évaluées sur plusieurs jeux de données tels que PASCAL VOC, ImageNet ILSVRC et Microsoft COCO. Ces expérimentations ont démontré que nos approches permettent d’obtenir des résultats comparables à ceux de l’état de l’art et qu’une amélioration significative a pu être obtenue par rapport à des méthodes récentes de détection d’objets faiblement supervisées. / In this dissertation we address the problem of weakly supervised object detection, wherein the goal is to recognize and localize objects in weakly-labeled images where object-level annotations are incomplete during training. To this end, we propose two methods which learn two different models for the objects of interest. In our first method, we propose a model enhancing the weakly supervised Deformable Part-based Models (DPMs) by emphasizing the importance of location and size of the initial class-specific root filter. 
We first compute a candidate pool that represents the potential locations of the object and serves as the root filter estimate, using a generic objectness measure (region proposals) to combine the most salient regions with “good” region proposals. We then cast the learning of the latent class label of each candidate window as a binary classification problem, training category-specific classifiers that coarsely classify a candidate window as either a target object or a non-target class. Furthermore, we improve detection by incorporating contextual information from image classification scores. Finally, we design a flexible enlarging-and-shrinking post-processing procedure that modifies the DPM outputs so that they better match the approximate object aspect ratios, further improving final accuracy. Second, we investigate how knowledge about object similarities from both the visual and semantic domains can be transferred to adapt an image classifier into an object detector in a semi-supervised setting on a large-scale database, where only a subset of object categories is annotated with bounding boxes. We propose to transform deep Convolutional Neural Network (CNN)-based image-level classifiers into object detectors by modeling the differences between the two on categories with both image-level and bounding box annotations, and transferring this information to convert classifiers into detectors for categories without bounding box annotations. We have evaluated both approaches extensively on several challenging detection benchmarks, e.g., PASCAL VOC, ImageNet ILSVRC and Microsoft COCO. Both approaches compare favorably to the state of the art and show significant improvement over several other recent weakly supervised detection methods. 
Détection d’objets Apprentissage faiblement supervisé Deformable parts models Apprentissage profond Réseaux de neurones convolutionnels Transfert d’apprentissage Object detection Weakly supervised learning Deformable part models Region proposals Deep learning Convolutional neural networks Transfer learning
502	Obstacle detection and emergency exit sign recognition for autonomous navigation using camera phone Mohammed, Abdulmalik January 2017 In this research work, we develop an obstacle detection and emergency exit sign recognition system on a mobile phone by extending the Features from Accelerated Segment Test (FAST) detector with a Harris corner filter. The first step required by many vision-based applications is the detection of objects of interest in an image. Hence, we introduce an emergency exit sign detection method based on a colour histogram. The hue and saturation components of the HSV colour model are processed into features to build a 2D colour histogram. We backproject this 2D colour histogram onto a captured image to detect the emergency exit sign, which is the first task required before performing emergency exit sign recognition. The classification results show that the 2D histogram is fast and can accurately discriminate between objects and background. One of the challenges confronting object recognition methods is the type of image feature to compute. We therefore present two feature detector and descriptor methods based on the FAST detector with a Harris corner filter. The first method is called Upright FAST-Harris and Binary (U-FaHB), and the second Scale Interpolated FAST-Harris and Binary (SIFaHB). In both methods, feature points are extracted using the accelerated segment test detector, and the Harris filter is used to return the strongest corner points as features. In the case of SIFaHB, however, feature points are extracted across the image plane and along the scale-space. The modular design of these detectors allows for the integration of descriptors of any kind. Therefore, we combine these detectors with a binary test descriptor such as BRIEF to compute feature regions. 
These detectors and the combined descriptor are evaluated on different images observed under various geometric and photometric transformations, and their performance is compared with that of other detectors and descriptors. The results show that our proposed feature detector and descriptor method is fast and performs better than other methods such as SIFT, SURF, ORB, BRISK and CenSurE. Building on the potential of the U-FaHB detector and descriptor, we extended it for use in optical flow computation, in a method we term Nearest-flow. This method can compute flow vectors for use in obstacle detection. We evaluated the Nearest-flow method using real and synthetic image sequences and compared its performance with that of other methods such as Lucas and Kanade, Farneback and SIFT-flow. The results show that our Nearest-flow method is faster to compute and performs better on real scene images than the other methods. In the final part of this research, we demonstrate the application potential of our proposed methods by developing an obstacle detection and exit sign recognition system on a camera phone; the results show that the methods have the potential to solve this vision-based object detection and recognition problem. 004
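The hue-saturation histogram backprojection described in record 502 can be illustrated with a minimal sketch. This is not the author's implementation: the bin counts, the function names (`hs_histogram`, `backproject`) and the peak normalisation are assumptions made for illustration, and only the Python standard library is used.

```python
import colorsys


def _hs_bin(pixel, h_bins, s_bins):
    """Map an RGB pixel (0-255 per channel) to its (hue, saturation) bin."""
    r, g, b = pixel
    h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Clamp so that h == 1.0 or s == 1.0 falls into the last bin.
    return min(int(h * h_bins), h_bins - 1), min(int(s * s_bins), s_bins - 1)


def hs_histogram(pixels, h_bins=8, s_bins=8):
    """Build a 2D hue-saturation histogram from sample pixels of the target.

    The histogram is scaled so that its largest bin equals 1.0
    (an assumed normalisation, chosen to keep the output in [0, 1]).
    """
    hist = [[0.0] * s_bins for _ in range(h_bins)]
    for pixel in pixels:
        hi, si = _hs_bin(pixel, h_bins, s_bins)
        hist[hi][si] += 1.0
    peak = max(max(row) for row in hist)
    if peak > 0:
        hist = [[value / peak for value in row] for row in hist]
    return hist


def backproject(image, hist):
    """Replace every pixel of `image` (rows of RGB tuples) by the histogram
    value of its hue-saturation bin, giving a per-pixel likelihood map."""
    h_bins, s_bins = len(hist), len(hist[0])
    result = []
    for row in image:
        out_row = []
        for pixel in row:
            hi, si = _hs_bin(pixel, h_bins, s_bins)
            out_row.append(hist[hi][si])
        result.append(out_row)
    return result
```

Thresholding the resulting map would give candidate sign regions; the value channel is ignored, which is one reason a hue-saturation histogram tends to be less sensitive to brightness changes.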
503	Aide au diagnostic du cancer de la prostate par IRM multi-paramétrique : une approche par classification supervisée / Computer-aided diagnosis of prostate cancer using multi-parametric MRI : a supervised learning approach Niaf, Émilie 10 December 2012 (has links) Le cancer de la prostate est la deuxième cause de mortalité chez l’homme en France. L’IRM multiparamétrique est considérée comme la technique la plus prometteuse pour permettre une cartographie du cancer, ouvrant la voie au traitement focal, alternatif à la prostatectomie radicale. Néanmoins, elle reste difficile à interpréter et est sujette à une forte variabilité inter- et intra-expert, d’où la nécessité de développer des systèmes experts capables d’aider le radiologue dans son diagnostic. Nous proposons un système original d’aide au diagnostic (CAD) offrant un second avis au radiologue sur des zones suspectes pointées sur l’image. Nous évaluons notre système en nous appuyant sur une base de données clinique de 30 patients, annotées de manière fiable et exhaustive grâce à l’analyse des coupes histologiques obtenues par prostatectomie. Les performances mesurées dans des conditions cliniques auprès de 12 radiologues, sans et avec notre outil, démontrent l’apport significatif de ce CAD sur la qualité du diagnostic, la confiance des radiologues et la variabilité inter-expert. La création d’une base de corrélations anatomo-radiologiques est une tâche complexe et fastidieuse. Beaucoup d’études n’ont pas d’autre choix que de s’appuyer sur l’analyse subjective d’un radiologue expert, entâchée d’incertitude. Nous proposons un nouveau schéma de classification, basé sur l’algorithme du séparateur à vaste marge (SVM), capable d’intégrer, dans la fonction d’apprentissage, l’incertitude sur l’appartenance à une classe (ex. sain/malin) de certains échantillons de la base d’entraînement. 
Les résultats obtenus, tant sur des exemples simulés que sur notre base de données cliniques, démontrent le potentiel de ce nouvel algorithme, en particulier pour les applications CAD, mais aussi de manière plus générale pour toute application de machine learning s’appuyant sur un étiquetage quantitatif des données. / Prostate cancer is one of the leading causes of death in France. Multi-parametric MRI is considered the most promising technique for cancer visualisation, opening the way to focal treatments as an alternative to prostatectomy. Nevertheless, its interpretation remains difficult and subject to inter- and intra-observer variability, which motivates the development of expert systems to assist radiologists in making their diagnosis. We propose an original computer-aided diagnosis (CAD) system that assigns a malignancy score to any suspicious region outlined on MR images, which radiologists can use as a second opinion. The performance of the CAD system is evaluated on a clinical database of 30 patients, exhaustively and reliably annotated thanks to the histological ground truth obtained via prostatectomy. We then demonstrate the influence of this system under clinical conditions through a ROC analysis involving 12 radiologists, and show a significant increase in diagnostic accuracy and rating confidence and a decrease in inter-expert variability. Building an anatomo-radiological correlation database is a complex and tedious task, so numerous studies base their evaluation on the expertise of a single experienced radiologist, which inevitably carries uncertainty. We propose a new classification scheme, based on the support vector machine (SVM) algorithm, which is able to account for uncertain class labels during the learning step. 
The results obtained, both on toy examples and on our clinical database, demonstrate the potential of this new approach, which can be extended to any machine learning problem relying on a probabilistically labelled dataset. Cancer de la prostate Systèmes d’aide au diagnostic Apprentissage supervisé Séparateurs à vaste marge Prostate cancer Aided diagnosis system Supervised learning Support vector machine 616.075
504	Aprendizado semissupervisionado multidescrição em classificação de textos / Multi-view semi-supervised learning in text classification Braga, Ígor Assis 23 April 2010 (has links) Algoritmos de aprendizado semissupervisionado aprendem a partir de uma combinação de dados rotulados e não rotulados. Assim, eles podem ser aplicados em domínios em que poucos exemplos rotulados e uma vasta quantidade de exemplos não rotulados estão disponíveis. Além disso, os algoritmos semissupervisionados podem atingir um desempenho superior aos algoritmos supervisionados treinados nos mesmos poucos exemplos rotulados. Uma poderosa abordagem ao aprendizado semissupervisionado, denominada aprendizado multidescrição, pode ser usada sempre que os exemplos de treinamento são descritos por dois ou mais conjuntos de atributos disjuntos. A classificação de textos é um domínio de aplicação no qual algoritmos semissupervisionados vêm obtendo sucesso. No entanto, o aprendizado semissupervisionado multidescrição ainda não foi bem explorado nesse domínio dadas as diversas maneiras possíveis de se descrever bases de textos. O objetivo neste trabalho é analisar o desempenho de algoritmos semissupervisionados multidescrição na classificação de textos, usando unigramas e bigramas para compor duas descrições distintas de documentos textuais. Assim, é considerado inicialmente o difundido algoritmo multidescrição CO-TRAINING, para o qual são propostas modificações a fim de se tratar o problema dos pontos de contenção. É também proposto o algoritmo COAL, o qual pode melhorar ainda mais o algoritmo CO-TRAINING pela incorporação de aprendizado ativo como uma maneira de tratar pontos de contenção. Uma ampla avaliação experimental desses algoritmos foi conduzida em bases de textos reais. 
Os resultados mostram que o algoritmo COAL, usando unigramas como uma descrição das bases textuais e bigramas como uma outra descrição, atinge um desempenho significativamente melhor que um algoritmo semissupervisionado monodescrição. Levando em consideração os bons resultados obtidos por COAL, conclui-se que o uso de unigramas e bigramas como duas descrições distintas de bases de textos pode ser bastante compensador / Semi-supervised learning algorithms learn from a combination of both labeled and unlabeled data. Thus, they can be applied in domains where few labeled examples and a vast amount of unlabeled examples are available. Furthermore, semi-supervised learning algorithms may achieve a better performance than supervised learning algorithms trained on the same few labeled examples. A powerful approach to semi-supervised learning, called multi-view learning, can be used whenever the training examples are described by two or more disjoint sets of attributes. Text classification is a domain in which semi-supervised learning algorithms have shown some success. However, multi-view semi-supervised learning has not yet been well explored in this domain despite the possibility of describing textual documents in a myriad of ways. The aim of this work is to analyze the effectiveness of multi-view semi-supervised learning in text classification using unigrams and bigrams as two distinct descriptions of text documents. To this end, we initially consider the widely adopted CO-TRAINING multi-view algorithm and propose some modifications to it in order to deal with the problem of contention points. We also propose the COAL algorithm, which further improves CO-TRAINING by incorporating active learning as a way of dealing with contention points. A thorough experimental evaluation of these algorithms was conducted on real text data sets. 
The results show that the COAL algorithm, using unigrams as one description of text documents and bigrams as another description, achieves significantly better performance than a single-view semi-supervised algorithm. Taking into account the good results obtained by COAL, we conclude that the use of unigrams and bigrams as two distinct descriptions of text documents can be very effective. Aprendizado de máquina Aprendizado multidescrição Aprendizado semissupervisionado Bigramas Bigrams Classificação de textos Co-training Coal Machine learning Multi-view learning Self-training Semi-supervised learning Text classification Unigramas Unigrams
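The two-view setup in record 504 can be sketched as follows: each document is described once by its unigrams and once by its bigrams, and a separate classifier would be trained on each view. The whitespace tokenisation and the function names are assumptions, not the author's code. The abstract does not define contention points; the helper below assumes, purely for illustration, that they are the unlabeled examples on which the two view-specific classifiers disagree.

```python
from collections import Counter


def unigram_view(text):
    """First description of a document: counts of single tokens."""
    tokens = text.lower().split()
    return Counter(tokens)


def bigram_view(text):
    """Second description of the same document: counts of adjacent token pairs."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def disagreements(predictions_a, predictions_b):
    """Indices of examples on which two view-specific classifiers disagree.

    Treating these as contention points is an assumption made for this
    sketch; the abstract does not spell out the definition.
    """
    return [
        index
        for index, (a, b) in enumerate(zip(predictions_a, predictions_b))
        if a != b
    ]
```

In a co-training loop, each classifier would label the unlabeled examples it is most confident about for the other view; an active-learning variant such as the one the abstract describes could instead send disputed examples to a human annotator.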
505	L'approche Support Vector Machines (SVM) pour le traitement des données fonctionnelles / Support Vector Machines (SVM) for Fonctional Data Analysis Henchiri, Yousri 16 October 2013 (has links) L'Analyse des Données Fonctionnelles est un domaine important et dynamique en statistique. Elle offre des outils efficaces et propose de nouveaux développements méthodologiques et théoriques en présence de données de type fonctionnel (fonctions, courbes, surfaces, ...). Le travail exposé dans cette thèse apporte une nouvelle contribution aux thèmes de l'apprentissage statistique et des quantiles conditionnels lorsque les données sont assimilables à des fonctions. Une attention particulière a été réservée à l'utilisation de la technique Support Vector Machines (SVM). Cette technique fait intervenir la notion d'Espace de Hilbert à Noyau Reproduisant. Dans ce cadre, l'objectif principal est d'étendre cette technique non-paramétrique d'estimation aux modèles conditionnels où les données sont fonctionnelles. Nous avons étudié les aspects théoriques et le comportement pratique de la technique présentée et adaptée sur les modèles de régression suivants. Le premier modèle est le modèle fonctionnel de quantiles de régression quand la variable réponse est réelle, les variables explicatives sont à valeurs dans un espace fonctionnel de dimension infinie et les observations sont i.i.d.. Le deuxième modèle est le modèle additif fonctionnel de quantiles de régression où la variable d'intérêt réelle dépend d'un vecteur de variables explicatives fonctionnelles. Le dernier modèle est le modèle fonctionnel de quantiles de régression quand les observations sont dépendantes. Nous avons obtenu des résultats sur la consistance et les vitesses de convergence des estimateurs dans ces modèles. Des simulations ont été effectuées afin d'évaluer la performance des procédures d'inférence. Des applications sur des jeux de données réelles ont été considérées. 
Le bon comportement de l'estimateur SVM est ainsi mis en évidence. / Functional Data Analysis is an important and dynamic area of statistics. It offers effective tools and proposes new methodological and theoretical developments for functional data (functions, curves, surfaces, ...). The work outlined in this dissertation provides a new contribution to the themes of statistical learning and quantile regression when data can be considered as functions. Special attention is devoted to the use of the Support Vector Machines (SVM) technique, which involves the notion of a Reproducing Kernel Hilbert Space. In this context, the main goal is to extend this nonparametric estimation technique to conditional models that take functional data into account. We investigated the theoretical aspects and the practical behavior of the proposed and adapted technique on the following regression models. The first model is the conditional quantile functional model in which the covariate takes its values in a bounded subspace of an infinite-dimensional functional space, the response variable takes its values in a compact subset of the real line, and the observations are i.i.d. The second model is the functional additive quantile regression model, where the response variable depends on a vector of functional covariates. The last model is the conditional quantile functional model in the case of dependent functional data. We obtained the weak consistency and a convergence rate of these estimators. Simulation studies are performed to evaluate the performance of the inference procedures. Applications to chemometrics, environmental and climatic data analysis are considered. The good behavior of the SVM estimator is thus highlighted. 
Analyse des Données Fonctionnelles Support Vector Machines Quantiles de régression Apprentissage statistique Apprentissage supervisé Espace de Hilbert à noyau reproduisant Functional Data Analysis Support Vector Machines Quantile Regression Statistical learning Supervised learning Reproducing kernel Hilbert space
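Record 505 does not spell out the loss function, but kernel-based quantile regression is usually built on the pinball (check) loss. The sketch below is an illustration under that assumption, not the author's estimator: it restricts the model to a constant and shows that minimising the empirical pinball risk recovers a sample quantile. The function names are hypothetical.

```python
def pinball_loss(y, q, tau):
    """Pinball loss of predicting q for target y at quantile level tau in (0, 1).

    Under-predictions (y >= q) are weighted by tau,
    over-predictions (y < q) by 1 - tau.
    """
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def empirical_risk(ys, q, tau):
    """Average pinball loss of the constant prediction q over the sample ys."""
    return sum(pinball_loss(y, q, tau) for y in ys) / len(ys)


def best_constant(ys, tau):
    """Sample value that minimises the empirical pinball risk.

    Searching over the sample points is enough because the risk is
    piecewise linear in q with kinks only at the observations.
    """
    return min(ys, key=lambda q: empirical_risk(ys, q, tau))
```

A kernel estimator replaces the constant with a function in a reproducing kernel Hilbert space and adds a norm penalty; for functional covariates the kernel would be evaluated on curves rather than on vectors.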
506	Identificação automatizada de espécies de abelhas através de imagens de asas. / Automated bee species identification through wing images. Felipe Leno da Silva 19 February 2015 Diversas pesquisas focam no estudo e conservação das abelhas, em grande parte por sua importância para a agricultura. Entretanto, a identificação de espécies de abelhas vem sendo um impedimento para a condução de novas pesquisas, já que demanda tempo e um conhecimento muito especializado. Apesar de existirem diversos métodos para realizar esta tarefa, muitos deles são excessivamente custosos, restringindo sua aplicabilidade. Por serem facilmente acessíveis, as asas das abelhas vêm sendo amplamente utilizadas para a extração de características, já que é possível aplicar técnicas morfométricas utilizando apenas uma foto da asa. Como a medição manual de diversas características é tediosa e propensa a erros, sistemas foram desenvolvidos com este propósito. Entretanto, os sistemas ainda possuem limitações e não há um estudo voltado às técnicas de classificação que podem ser utilizadas para este fim. Esta pesquisa visa avaliar as técnicas de extração de características e classificação de modo a determinar o conjunto de técnicas mais apropriado para a discriminação de espécies de abelhas. Nesta pesquisa foi demonstrado que o uso de uma conjunção de características morfométricas e fotométricas obtém melhores resultados que o uso de somente características morfométricas. Também foram analisados os melhores algoritmos de classificação tanto usando somente características morfométricas, quanto usando uma conjunção de características morfométricas e fotométricas, os quais são, respectivamente, o Naïve Bayes e o classificador Logístico. Os resultados desta pesquisa podem guiar o desenvolvimento de novos sistemas para identificação de espécies de abelha, objetivando auxiliar pesquisas conduzidas por biólogos. 
/ Several studies focus on the study and conservation of bees, largely because of their importance for agriculture. However, the identification of bee species has been hampering new studies, since it is time-consuming and demands very specialized knowledge. Although there are several methods to accomplish this task, many of them are excessively costly, restricting their applicability. Because they are easily accessible, bee wings have been widely used for feature extraction, since morphometric techniques can be applied using just one image of the wing. As the manual measurement of various features is tedious and error-prone, some systems have been developed for this purpose. However, these systems also have limitations, and there is no study of the classification techniques that can be used for this purpose. This research aims to evaluate feature extraction and classification techniques in order to determine the most appropriate combination of techniques for discriminating bee species. The results of our research indicate that using a conjunction of Morphometric and Pixel-based features is more effective than using Morphometric features alone. Our analysis also concluded that the best classification algorithms using only Morphometric features and using a conjunction of Morphometric and Pixel-based features are, respectively, Naïve Bayes and the Logistic classifier. The results of this research can guide the development of new systems to identify bee species in order to assist research conducted by biologists. Aprendizado supervisionado Aprendizagem de máquina Classificação de abelhas Extração de características Inteligência artificial Reconhecimento de padrões Seleção de características Visão computacional Artificial intelligence Bee species recognition Computer vision Feature extraction Feature selection Machine learning Pattern recognition Supervised learning
507	Identificação automatizada de espécies de abelhas através de imagens de asas. / Automated bee species identification through wing images. Silva, Felipe Leno da 19 February 2015 Diversas pesquisas focam no estudo e conservação das abelhas, em grande parte por sua importância para a agricultura. Entretanto, a identificação de espécies de abelhas vem sendo um impedimento para a condução de novas pesquisas, já que demanda tempo e um conhecimento muito especializado. Apesar de existirem diversos métodos para realizar esta tarefa, muitos deles são excessivamente custosos, restringindo sua aplicabilidade. Por serem facilmente acessíveis, as asas das abelhas vêm sendo amplamente utilizadas para a extração de características, já que é possível aplicar técnicas morfométricas utilizando apenas uma foto da asa. Como a medição manual de diversas características é tediosa e propensa a erros, sistemas foram desenvolvidos com este propósito. Entretanto, os sistemas ainda possuem limitações e não há um estudo voltado às técnicas de classificação que podem ser utilizadas para este fim. Esta pesquisa visa avaliar as técnicas de extração de características e classificação de modo a determinar o conjunto de técnicas mais apropriado para a discriminação de espécies de abelhas. Nesta pesquisa foi demonstrado que o uso de uma conjunção de características morfométricas e fotométricas obtém melhores resultados que o uso de somente características morfométricas. Também foram analisados os melhores algoritmos de classificação tanto usando somente características morfométricas, quanto usando uma conjunção de características morfométricas e fotométricas, os quais são, respectivamente, o Naïve Bayes e o classificador Logístico. Os resultados desta pesquisa podem guiar o desenvolvimento de novos sistemas para identificação de espécies de abelha, objetivando auxiliar pesquisas conduzidas por biólogos. 
/ Several studies focus on the study and conservation of bees, largely because of their importance for agriculture. However, the identification of bee species has been hampering new studies, since it is time-consuming and demands very specialized knowledge. Although there are several methods to accomplish this task, many of them are excessively costly, restricting their applicability. Because they are easily accessible, bee wings have been widely used for feature extraction, since morphometric techniques can be applied using just one image of the wing. As the manual measurement of various features is tedious and error-prone, some systems have been developed for this purpose. However, these systems also have limitations, and there is no study of the classification techniques that can be used for this purpose. This research aims to evaluate feature extraction and classification techniques in order to determine the most appropriate combination of techniques for discriminating bee species. The results of our research indicate that using a conjunction of Morphometric and Pixel-based features is more effective than using Morphometric features alone. Our analysis also concluded that the best classification algorithms using only Morphometric features and using a conjunction of Morphometric and Pixel-based features are, respectively, Naïve Bayes and the Logistic classifier. The results of this research can guide the development of new systems to identify bee species in order to assist research conducted by biologists. Aprendizado supervisionado Aprendizagem de máquina Artificial intelligence Bee species recognition Classificação de abelhas Computer vision Extração de características Feature extraction Feature selection Inteligência artificial Machine learning Pattern recognition Reconhecimento de padrões Seleção de características Supervised learning Visão computacional
508	Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision Täckström, Oscar January 2013 (has links) Contemporary approaches to natural language processing are predominantly based on statistical machine learning from large amounts of text, which has been manually annotated with the linguistic structure of interest. However, such complete supervision is currently only available for the world's major languages, in a limited number of domains and for a limited range of tasks. As an alternative, this dissertation considers methods for linguistic structure prediction that can make use of incomplete and cross-lingual supervision, with the prospect of making linguistic processing tools more widely available at a lower cost. An overarching theme of this work is the use of structured discriminative latent variable models for learning with indirect and ambiguous supervision; as instantiated, these models admit rich model features while retaining efficient learning and inference properties. The first contribution to this end is a latent-variable model for fine-grained sentiment analysis with coarse-grained indirect supervision. The second is a model for cross-lingual word-cluster induction and the application thereof to cross-lingual model transfer. The third is a method for adapting multi-source discriminative cross-lingual transfer models to target languages, by means of typologically informed selective parameter sharing. The fourth is an ambiguity-aware self- and ensemble-training algorithm, which is applied to target language adaptation and relexicalization of delexicalized cross-lingual transfer parsers. The fifth is a set of sequence-labeling models that combine constraints at the level of tokens and types, and an instantiation of these models for part-of-speech tagging with incomplete cross-lingual and crowdsourced supervision. 
In addition to these contributions, comprehensive overviews are provided of structured prediction with no or incomplete supervision, as well as of learning in the multilingual and cross-lingual settings. Through careful empirical evaluation, it is established that the proposed methods can be used to create substantially more accurate tools for linguistic processing, compared to both unsupervised methods and to recently proposed cross-lingual methods. The empirical support for this claim is particularly strong in the latter case; our models for syntactic dependency parsing and part-of-speech tagging achieve the hitherto best published results for a wide number of target languages, in the setting where no annotated training data is available in the target language. linguistic structure prediction structured prediction latent-variable model semi-supervised learning multilingual learning cross-lingual learning indirect supervision partial supervision ambiguous supervision part-of-speech tagging dependency parsing named-entity recognition sentiment analysis
509	Técnicas de Sistemas Automáticos de Soporte Vectorial en la Réplica del Rating Crediticio Campos Espinoza, Ricardo Álex 10 July 2012 (has links) La correcta qualificació de risc de crèdit d'un emissor és un factor crític en l’economia actual. Aquest és un punt d’acord entre professionals i acadèmics. Actualment, des dels mitjans de comunicació s’han difós sovint notícies d'impacte provocades per agències de ràting. És per aquest motiu que treball d'anàlisi realitzat per experts financers aporta importants recursos a les empreses de consultoria d'inversió i agències qualificadores. Avui en dia, hi ha molts avenços metodològics i tècnics que permeten donar suport a la tasca que fan els professionals de la qualificació de la qualitat de crèdit dels emissors. Tanmateix encara queden molts buits per completar i àrees a desenvolupar per tal què aquesta tasca sigui tan precisa com cal. D'altra banda, els sistemes d'aprenentatge automàtic basats en funcions nucli, particularment les Support Vector Machines (SVM), han donat bons resultats en problemes de classificació quan les dades no són linealment separables o quan hi ha patrons amb soroll. A més, al usar estructures basades en funcions nucli és possible tractar qualsevol espai de dades, ampliant les possibilitats per trobar relacions entre els patrons, tasca que no resulta fàcil amb tècniques estadístiques convencionals. L’objectiu d'aquesta tesi és examinar les aportacions que s'han fet en la rèplica de ràting, i alhora, examinar diferents alternatives que permetin millorar l'acompliment de la rèplica amb SVM. Per a això, primer s'ha revisat la literatura financera amb la idea d'obtenir una visió general i panoràmica dels models usats per al mesurament del risc de crèdit. S'han revisat les aproximacions de mesurament de risc de crèdit individuals, utilitzades principalment per a la concessió de crèdits bancaris i per l'avaluació individual d'inversions en títols de renda fixa. 
També s'han revisat models de carteres d'actius, tant aquells proposats des del món acadèmic com els patrocinats per institucions financeres. A més, s'han revisat les aportacions dutes a terme per avaluar el risc de crèdit usant tècniques estadístiques i sistemes d'aprenentatge automàtic. S'ha fet especial èmfasi en aquest últim conjunt de mètodes d'aprenentatge i en el conjunt de metodologies usades per realitzar adequadament la rèplica de ràting. Per millorar l'acompliment de la rèplica, s'ha triat una tècnica de discretització de les variables sota la suposició que, per emetre l'opinió tècnica del ràting de les companyies, els experts financers en forma intuïtiva avaluen les característiques de les empreses en termes intervalars. En aquesta tesi, per fer la rèplica de ràting, s'ha fet servir una mostra de dades de companyies de països desenvolupats. S'han usat diferents tipus de SVM per replicar i s'ha exposat la bondat dels resultats d'aquesta rèplica, comparant-la amb altres dues tècniques estadístiques àmpliament usades en la literatura financera. S'ha concentrat l'atenció de la mesura de la bondat de l'ajust dels models en les taxes d'encert i en la forma en què es distribueixen els errors. D'acord amb els resultats obtinguts es pot sostenir que l'acompliment de les SVM és millor que el de les tècniques estadístiques usades en aquesta tesi, i després de la discretització de les dades d'entrada s'ha mostrat que no es perd informació rellevant en aquest procés. Això contribueix a la idea que els experts financers instintivament realitzen un procés similar de discretització de la informació financera per lliurar la seva opinió creditícia de les companyies qualificades. / La correcta calificación de riesgo crediticio de un emisor es un factor crítico en nuestra actual economía. Profesionales y académicos están de acuerdo en esto, y los medios de comunicación han difundido mediáticamente eventos de impacto provocados por agencias de rating. 
Por ello, el trabajo de análisis del deudor realizado por expertos financieros conlleva importantes recursos en las empresas de consultoría de inversión y agencias calificadoras. Hoy en día, muchos avances metodológicos y técnicos permiten el apoyo a la labor que hacen los profesionales en de calificación de la calidad crediticia de los emisores. No obstante aún quedan muchos vacíos por completar y áreas que desarrollar para que esta tarea sea todo lo precisa que necesita. Por otra parte, los sistemas de aprendizaje automático basados en funciones núcleo, particularmente las Support Vector Machines (SVM), han dado buenos resultados en problemas de clasificación cuando los datos no son linealmente separables o cuando hay patrones ruidosos. Además, al usar estructuras basadas en funciones núcleo resulta posible tratar cualquier espacio de datos, expandiendo las posibilidades para encontrar relaciones entre los patrones, tarea que no resulta fácil con técnicas estadísticas convencionales. El propósito de esta tesis es examinar los aportes que se han hecho en la réplica de rating, y a la vez, examinar diferentes alternativas que permitan mejorar el desempeño de la réplica con SVM. Para ello, primero se ha revisado la literatura financiera con la idea de obtener una visión general y panorámica de los modelos usados para la medición del riesgo crediticio. Se han revisado las aproximaciones de medición de riesgo crediticio individuales, utilizadas principalmente para la concesión de créditos bancarios y para la evaluación individual de inversiones en títulos de renta fija. También se han revisado modelos de carteras de activos, tanto aquellos propuestos desde el mundo académico como los patrocinados por instituciones financieras. Además, se han revisado los aportes llevados a cabo para evaluar el riesgo crediticio usando técnicas estadísticas y sistemas de aprendizaje automático. 
Se ha hecho especial énfasis en este último conjunto de métodos de aprendizaje y en el conjunto de metodologías usadas para realizar adecuadamente la réplica de rating. Para mejorar el desempeño de la réplica, se ha elegido una técnica de discretización de las variables bajo la suposición de que, para emitir la opinión técnica del rating de las compañías, los expertos financieros en forma intuitiva evalúan las características de las empresas en términos intervalares. En esta tesis, para realizar la réplica de rating, se ha usado una muestra de datos de compañías de países desarrollados. Se han usado diferentes tipos de SVM para replicar y se ha expuesto la bondad de los resultados de dicha réplica, comparándola con otras dos técnicas estadísticas ampliamente usadas en la literatura financiera. Se ha concentrado la atención de la medición de la bondad del ajuste de los modelos en las tasas de acierto y en la forma en que se distribuyen los errores. De acuerdo con los resultados obtenidos se puede sostener que el desempeño de las SVM es mejor que el de las técnicas estadísticas usadas en esta tesis; y luego de la discretización de los datos de entrada se ha mostrado que no se pierde información relevante en dicho proceso. Esto contribuye a la idea de que los expertos financieros instintivamente realizan un proceso similar de discretización de la información financiera para entregar su opinión crediticia de las compañías calificadas. / An accurate credit rating of an issuer is a critical factor in today's economy. Professionals and academics agree on this, and the media have widely reported high-impact events caused by rating agencies. For this reason, the analysis of debtors carried out by financial experts consumes significant resources at investment consulting firms and rating agencies. Today, many methodological and technical advances exist to support the work of professionals who rate the credit quality of issuers. 
However, there are still many gaps to fill and areas to develop before this task is as precise as it needs to be. Moreover, machine learning systems based on kernel functions, particularly Support Vector Machines (SVM), have been successful in classification problems when the data are not linearly separable or when the patterns are noisy. In addition, structures based on kernel functions make it possible to handle any data space, expanding the possibilities for finding relationships between patterns, a task that is not easy with conventional statistical techniques. The purpose of this thesis is to examine the contributions made to rating replication and, at the same time, to examine different alternatives for improving the performance of replication with SVM. To do this, we first reviewed the financial literature to obtain a general overview of the models used to measure credit risk. We reviewed individual credit risk measurement approaches, used mainly for granting bank loans and for the individual assessment of investments in fixed-income securities. Asset portfolio models have also been reviewed, both those proposed by academia and those sponsored by financial institutions. In addition, we reviewed the contributions that assess credit risk using statistical techniques and machine learning systems. Particular emphasis has been placed on this latter group of learning methods and on the methodologies used to perform rating replication properly. To improve replication performance, a discretization technique was chosen for the variables, under the assumption that, when issuing their technical rating opinion on companies, financial experts intuitively evaluate company characteristics in interval terms. In this thesis, rating replication was carried out on a data sample of companies from developed countries. Different types of SVM were used for replication, and the quality of the replication results is reported and compared with two other statistical techniques widely used in the financial literature. 
Special attention has been given to measuring the goodness of fit of the models in terms of hit rates and of how the errors are distributed. According to the results, it can be argued that SVM perform better than the statistical techniques used in this thesis. In addition, it has been shown that no relevant information is lost when the input data are discretized. This supports the idea that financial experts instinctively carry out a similar discretization of financial information when delivering their credit opinion on the rated companies. Màquina de vector de suport; Qualificació de crèdit; Risc creditici; Máquina de soporte vectorial; Calificación crediticia; Riesgo crediticio; Support vector machine; Supervised learning machines; Credit rating; Credit risk; Management Sciences; 004; 336
510	L’analyse de composants émotionnels dans des stratégies d’apprentissage Cioboiu, Emilia Alina 08 1900 Un certain nombre de théories pédagogiques ont été établies depuis plus de 20 ans. Elles font appel aux réactions de l’apprenant en situation d’apprentissage, mais aucune théorie pédagogique n’a pu décrire complètement un processus d’enseignement en tenant compte de toutes les réactions émotionnelles de l’apprenant. Nous souhaitons intégrer les émotions de l’apprenant dans ces processus d’apprentissage, car elles sont importantes dans les mécanismes d’acquisition de connaissances et dans la mémorisation. Récemment on a vu que le facteur émotionnel est considéré jouer un rôle très important dans les processus cognitifs. Modéliser les réactions émotionnelles d’un apprenant en cours du processus d’apprentissage est une nouveauté pour un Système Tutoriel Intelligent. Pour réaliser notre recherche, nous examinerons les théories pédagogiques qui n’ont pas considéré les émotions de l’apprenant. Jusqu’à maintenant, aucun Système Tutoriel Intelligent destiné à l’enseignement n’a incorporé la notion de facteur émotionnel pour un apprenant humain. Notre premier objectif est d’analyser quelques stratégies pédagogiques et de détecter les composantes émotionnelles qui peuvent y être ou non. Nous cherchons à déterminer dans cette analyse quel type de méthode didactique est utilisé, autrement dit, que fait le tuteur pour prévoir et aider l’apprenant à accomplir sa tâche d’apprentissage dans des conditions optimales. Le deuxième objectif est de proposer l’amélioration de ces méthodes en ajoutant les facteurs émotionnels. On les nommera des « méthodes émotionnelles ». Le dernier objectif vise à expérimenter le modèle d’une théorie pédagogique améliorée en ajoutant les facteurs émotionnels. Dans le cadre de cette recherche nous analyserons un certain nombre de théories pédagogiques, parmi lesquelles les théories de Robert Gagné, Jerome Bruner, Herbert J. 
Klausmeier et David Merrill, pour chercher à identifier les composantes émotionnelles. Aucune théorie pédagogique n’a mis l’accent sur les émotions au cours du processus d’apprentissage. Ces théories pédagogiques sont développées en tenant compte de plusieurs facteurs externes qui peuvent influencer le processus d’apprentissage. Nous proposons une approche basée sur la prédiction d’émotions qui est liée à de potentielles causes déclenchées par différents facteurs déterminants au cours du processus d’apprentissage. Nous voulons développer une technique qui permette au tuteur de traiter la réaction émotionnelle de l’apprenant à un moment donné au cours de son processus d’apprentissage et de l’inclure dans une méthode pédagogique. Pour atteindre le deuxième objectif de notre recherche, nous utiliserons un module tuteur apprenant basé sur le principe de l’éducation des émotions de l’apprenant, modèle qui vise premièrement sa personnalité et deuxièmement ses connaissances. Si on défini l’apprenant, on peut prédire ses réactions émotionnelles (positives ou négatives) et on peut s’assurer de la bonne disposition de l’apprenant, de sa coopération, sa communication et l’optimisme nécessaires à régler les problèmes émotionnels. Pour atteindre le troisième objectif, nous proposons une technique qui permet au tuteur de résoudre un problème de réaction émotionnelle de l’apprenant à un moment donné du processus d’apprentissage. Nous appliquerons cette technique à une théorie pédagogique. Pour cette première théorie, nous étudierons l’effet produit par certaines stratégies pédagogiques d’un tuteur virtuel au sujet de l’état émotionnel de l’apprenant, et pour ce faire, nous développerons une structure de données en ligne qu’un agent tuteur virtuel peut induire à l’apprenant des émotions positives. Nous analyserons les résultats expérimentaux en utilisant la première théorie et nous les comparerons ensuite avec trois autres théories que nous avons proposées d’étudier. 
En procédant de la sorte, nous atteindrons le troisième objectif de notre recherche, celui d’expérimenter un modèle d’une théorie pédagogique et de le comparer ensuite avec d’autres théories dans le but de développer ou d’améliorer les méthodes émotionnelles. Nous analyserons les avantages, mais aussi les insuffisances de ces théories par rapport au comportement émotionnel de l’apprenant. En guise de conclusion de cette recherche, nous retiendrons de meilleures théories pédagogiques ou bien nous suggérerons un moyen de les améliorer. / A number of educational theories have been established over the past 20 years and more. They draw on the learner’s reactions in a learning situation, but no educational theory has been able to fully describe a teaching process that takes into account all of the learner’s emotional reactions. We want to integrate the learner’s emotions into these learning processes, as they are important in the mechanisms of knowledge acquisition and memorization. Recently, the emotional factor has come to be regarded as playing a very important role in cognitive processes. Modeling a learner’s emotional reactions during the learning process is a novelty for an Intelligent Tutorial System. To carry out our research, we will examine educational theories that did not consider the learner’s emotions. Until now, no Intelligent Tutorial System for teaching has incorporated the notion of an emotional factor for a human learner. Our first objective is to analyze a few teaching strategies and detect the emotional components that may or may not be present in them. In this analysis we seek to determine what type of teaching method is used, in other words, what the tutor does to anticipate and help the learner accomplish his/her learning task under optimal conditions. The second objective is to propose improvements to these methods by adding emotional factors. We will call them “emotional methods”. The final objective is to test the model of an educational theory improved by adding emotional factors. 
As part of this research we analyze a number of educational theories, including those of Robert Gagné, Jerome Bruner, Herbert J. Klausmeier and David Merrill, seeking to identify their emotional components. No educational theory has emphasized emotions during the learning process. These educational theories were developed taking into account several external factors that can influence the learning process. We propose an approach based on predicting emotions, linked to potential causes triggered by different determining factors during the learning process. We want to develop a technique that allows the tutor to handle the learner’s emotional reaction at a given moment during the learning process and to include it in a teaching method. To achieve the second objective of our research, we use a learning tutor model based on the principle of educating the learner’s emotions, a model that addresses first the learner’s personality and second the learner’s knowledge. If we know the learner’s personality, we can predict his/her emotional reactions (positive or negative) and ensure the learner’s good disposition, cooperation, communication and the optimism needed to resolve emotional problems. To achieve the third objective, we propose a technique that allows the tutor to resolve a problem with the learner’s emotional reaction at a given moment in the learning process. We apply this technique to an educational theory. For this first theory, we study the effect of certain teaching strategies of a virtual tutor on the learner’s emotional state; to this end, we develop an online data structure with which a virtual tutor agent can induce positive emotions in the learner. We analyze the experimental results obtained with the first theory and then compare them with three other theories that we proposed to study. 
In doing so, we reach the third objective of our research, which is to test a model of an educational theory and then compare it with other theories in order to develop or improve the emotional methods. We analyze the advantages, but also the shortcomings, of these theories with respect to the learner’s emotional behaviour. In conclusion, we will retain the best educational theories or suggest a way to improve them. Système Tutorial Intelligent; modèle computationnel des émotions; apprentissage supervisé; méthode d’intervention; Intelligent Tutorial System; computational model of emotions; predictions of emotional reactions; supervised learning; intervention method
