Global ETD Search

361	Utilizando aprendizado semissupervisionado multidescrição em problemas de classificação hierárquica multirrótulo / Using multi-view semi-supervised learning in hierarchical multi-label classification problems Araújo, Hiury Nogueira de 17 November 2017
Submitted by Lara Oliveira (lara@ufersa.edu.br) on 2018-03-14T20:25:58Z. Approved for entry into archive by Vanessa Christiane (referencia@ufersa.edu.br) on 2018-06-18T16:58:58Z (GMT). Made available in DSpace on 2018-06-18T16:59:31Z (GMT). No. of bitstreams: 1 HiuryNA_DISSERT.pdf: 3188162 bytes, checksum: d40d42a78787557868ebc6d3cd5af945 (MD5). Previous issue date: 2017-11-17. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Data classification is a task applied in many areas of knowledge and is therefore the focus of ongoing research. Classification problems can be divided according to whether the available data are labeled or unlabeled. Semi-supervised learning has proven very effective on data sets that contain both labeled and unlabeled data: its objective is to label the unlabeled data by using the labeled data in the set, improving the success rate. Data may be assigned more than one label, which is known as multi-label classification. They may also be organized hierarchically, so that the labels are related to one another, which is known as hierarchical classification. This work proposes the use of multi-view semi-supervised learning, one of the branches of semi-supervised learning, in hierarchical multi-label classification problems, with the objective of investigating whether semi-supervised learning is an appropriate approach to the problem of low dimensionality of data. An experimental analysis of the methods found that supervised learning performed better than the semi-supervised approaches; however, semi-supervised learning may yet become a widely used approach, because there is still much to be contributed in this area.
2018-03-14 Aprendizado semissupervisionado Classificação hierárquica Multirrótulo Co-training Self-training Semi-supervised learning Hierarchical multi-label classification Co-training Self-training CNPQ::CIENCIAS EXATAS E DA TERRA
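As a rough illustration of the self-training keyword in record 361, the sketch below grows a labeled set by repeatedly labeling the unlabeled points that a simple classifier is most sure about. It is a minimal, single-label, flat sketch and not the multi-view hierarchical multi-label method of the thesis; the nearest-centroid classifier, the `margin` confidence rule, and all function names are assumptions made for the example.

```python
from statistics import mean


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(mean(axis) for axis in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def confident_label(point, centroids, margin):
    """Return the nearest class, or None when the runner-up is too close.

    The confidence rule (distance gap >= margin) is an assumption of this sketch.
    """
    ranked = sorted(centroids, key=lambda c: sq_dist(point, centroids[c]))
    if len(ranked) > 1:
        gap = sq_dist(point, centroids[ranked[1]]) - sq_dist(point, centroids[ranked[0]])
        if gap < margin:
            return None
    return ranked[0]


def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """labeled: list of (point, label); unlabeled: list of points.

    Returns the grown labeled list and the points that stayed unlabeled.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        # Fit step: one centroid per class from the current labeled set.
        by_class = {}
        for point, label in labeled:
            by_class.setdefault(label, []).append(point)
        centroids = {label: centroid(pts) for label, pts in by_class.items()}
        # Labeling step: accept only confident predictions.
        accepted, rest = [], []
        for point in pool:
            label = confident_label(point, centroids, margin)
            if label is None:
                rest.append(point)
            else:
                accepted.append((point, label))
        if not accepted:
            break  # nothing confident left, stop early
        labeled.extend(accepted)
        pool = rest
    return labeled, pool


# Toy data (hypothetical): two labeled seeds and three unlabeled points.
seeds = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
grown, leftover = self_train(seeds, [(1.0, 1.0), (9.0, 9.0), (5.0, 5.0)])
```

The point halfway between the two seeds stays unlabeled because neither class wins by the required margin, which is the behavior a confidence threshold is meant to give.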
362	Applying Machine Learning to Reduce the Adaptation Space in Self-Adaptive Systems : an exploratory work Buttar, Sarpreet Singh January 2018
Self-adaptive systems are capable of autonomously adjusting their behavior at runtime to accomplish particular adaptation goals. The most common way to realize self-adaptation is to use one or more feedback loops that contain four actions: collect runtime data from the system and its environment, analyze the collected data, decide whether an adaptation plan is required, and act according to the adaptation plan to achieve the adaptation goals. Existing approaches achieve the adaptation goals by using formal methods and exhaustively verifying all available adaptation options, i.e., the adaptation space. However, verifying the entire adaptation space is often not feasible because of the time and resources it requires. In this thesis, we present an approach that uses machine learning to reduce the adaptation space in self-adaptive systems. The approach integrates with the feedback loop and selects a subset of the adaptation options that are valid in the current situation. The approach is applied to a simulator of a self-adaptive Internet of Things application deployed at KU Leuven, Belgium. We compare our results with a formal model-based self-adaptation approach called ActivFORMS. The results show that, on average, the adaptation space is reduced by 81.2% and the adaptation time by 85% compared to ActivFORMS, while achieving the same quality guarantees.
Self-adaptive systems Architecture-based self-adaptation Adaptation space MAPE-K feedback loop DeltaIoT ActivFORMS Machine learning Online supervised learning Classification Regression Engineering and Technology Teknik och teknologier
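The idea in record 362 of verifying only the adaptation options that a learner predicts to be valid can be sketched roughly as below. This is a toy under stated assumptions, not ActivFORMS or the thesis implementation: `predict_ok` stands in for a trained classifier, `verify` stands in for the expensive formal verification, and the power-setting scenario and its numbers are invented for illustration.

```python
def reduce_and_verify(options, predict_ok, verify):
    """Analysis step: run the costly verifier only on options predicted valid."""
    candidates = [o for o in options if predict_ok(o)]
    if not candidates:
        # Assumption of this sketch: fall back to the full adaptation space
        # when the learner rejects every option.
        candidates = list(options)
    verified = [(o, verify(o)) for o in candidates]
    return candidates, verified


def plan(verified, goal_met, cost):
    """Planning step: cheapest verified option that meets the goal, or None."""
    valid = [o for o, result in verified if goal_met(result)]
    return min(valid, key=cost) if valid else None


# Hypothetical scenario: 16 transmission power settings.
options = list(range(16))


def verify(power):
    """Stand-in for verification: estimated packet loss in percent."""
    return max(0, 20 - 2 * power)


def predict_ok(power):
    """Stand-in for a trained classifier that filters out hopeless options."""
    return power >= 4


candidates, verified = reduce_and_verify(options, predict_ok, verify)
chosen = plan(verified, goal_met=lambda loss: loss <= 10, cost=lambda p: p)
reduction = 1 - len(candidates) / len(options)
```

Here the verifier runs on 12 of the 16 options, and the planner still finds the cheapest setting that meets the goal; in an online setting the verification results could also be fed back to retrain the classifier.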
363	Apports des ontologies à l'analyse exploratoire des images satellitaires / Contribution of ontologies to the exploratory analysis of satellite images Chahdi, Hatim 04 July 2017
Satellite images have become a valuable source of information for Earth observation. They are used to address and analyze many environmental issues, such as deforestation, landscape characterization, land-use and urban planning, and biodiversity conservation. Despite the large number of existing knowledge extraction techniques, the complexity of satellite images, their large volume, and the specific needs of each community of practice give rise to new challenges for data mining methods and require the development of highly efficient approaches. In this thesis, we investigate the potential of combining knowledge representation systems with statistical learning. Our goal is to develop novel methods that facilitate and automate the extraction of relevant information from remote sensing images. In this context, we propose two new approaches that treat the images as massive unlabeled quantitative data and exploit the available domain knowledge. Our first contribution is a hybrid approach that combines ontology-based reasoning and semi-supervised clustering for semantic classification. An inference engine first reasons over the available domain knowledge to obtain semantically labeled pixels. These labeled instances are then used to generate constraints that guide and enhance the clustering. In this way, the method improves the labeling of existing classes while discovering new ones. Our second contribution proceeds in the opposite direction and focuses on reasoning time and on scaling ontology reasoning to large datasets. We propose a two-step approach in which topographic clustering is first applied to summarize the data as a set of prototypes, thereby reducing the number of instances to be processed by the reasoner. The representative prototypes are then labeled using the ontology, and the labels are automatically propagated to all the input data. We applied both methods to the real-world problem of satellite image classification and interpretation, and the results are promising. They show, on the one hand, that the quality of the classification can be improved by automatic knowledge integration and that the involvement of experts can be reduced. On the other hand, applying topographic clustering upstream avoids computing inferences over all the pixels of the image.
Ontologie Apprentissage semi-supervisé Clustering par contraintes Raisonnement Images satellites Fossé sémantique Ontology Semi-Supervised learning Constrained clustering Reasoning Satellite images Semantic gap
364	Prediction of Inter-Frequency Measurements in a LTE Network with Deep Learning / Prediktering av inter-frekvensmätningar i ett LTE-nätverk med Deep Learning Holm, Rasmus January 2018
The telecommunications industry faces difficult challenges as more and more devices communicate over the internet. A telecommunications network is a complex system with many parts, some of which are candidates for further automation. We have focused on inter-frequency measurements, which are used during inter-frequency handovers, among other procedures. A handover is the procedure in which, for instance, a phone changes the base station it communicates with, and the inter-frequency measurements are rather expensive to perform. More specifically, we have investigated the possibility of using deep learning, an ever-expanding field in machine learning, to predict inter-frequency measurements in a Long Term Evolution (LTE) network. We have focused on the multi-layer perceptron and extended it with (variational) autoencoders or modified it through dropout so that it approximates the predictive distribution of a Gaussian process. The telecommunications network consists of many cells, and each cell gathers its own data. One of the strengths of deep learning models is that their performance usually improves as more data is used. We have investigated whether performance increases when data from multiple cells is combined, and the results show that this is not necessarily the case: models trained on combined data from multiple cells and models trained on data from individual cells perform comparably. We can expect the multi-layer perceptron to perform better than a linear regression model. The best performing multi-layer perceptron architectures have been rather shallow, with 1-2 hidden layers, and the extensions and modifications we applied have not shown improvements significant enough to warrant their presence. For the particular LTE network we have worked with, we would recommend shallow multi-layer perceptron architectures as far as deep learning models are concerned.
Telecommunications Mobile Networks 4G LTE Handover Load Balancing Machine Learning Deep Learning Neural Networks Supervised Learning Autoencoder Probability Theory and Statistics Sannolikhetsteori och statistik
365	Gestion de données manquantes dans des cascades de boosting : application à la détection de visages / Management of missing data in boosting cascades : application to face detection Bouges, Pierre 06 December 2012
This thesis was carried out in the ISPR group (ImageS, Perception systems and Robotics) of the Institut Pascal, within the ComSee team (Computers that See). The research is part of a project called Bio Rafale, created in 2008 by the Clermont-Ferrand company Vesalis and funded by OSEO. Its goal is to improve security in stadiums by identifying people who are banned from them. The applications of this work concern face detection, which is the first step in the project's processing chain. The most efficient detectors use a cascade of boosted classifiers. The term cascade refers to a sequential succession of several classifiers. The term boosting refers to a set of machine learning algorithms that linearly combine several weak classifiers. The detector selected for this thesis also uses a cascade of boosted classifiers. Training such a cascade requires a training database and an image descriptor; here, covariance matrices are used as the image descriptor. The training stage of an object detector fixes the conditions under which it can be used. One of our contributions is to adapt a detector to conditions of use that were not anticipated during training. The proposed adaptations lead to a classification problem with missing data. A probabilistic formulation of the cascade structure is then used to incorporate the uncertainty introduced by the missing data. This formulation involves the estimation of a posteriori probabilities and the computation of new rejection thresholds at each level of the modified cascade. For these two problems, several solutions are proposed and extensive tests are carried out to find the best configuration. Finally, the solution is applied to the detection of turned or occluded faces using only a frontal, upright face detector. Adapting the detector to turned faces requires a 3D geometric model to adjust the positions of the subwindows associated with the weak classifiers.
Reconnaissance de forme Détection d’objets Apprentissage supervisé Classification Base d’apprentissage Visage Données manquantes Adaptation Pattern recognition Object detection Supervised learning Classification Training database Face Missing data Adaptation
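The cascade structure described in record 365 (a sequence of boosted classifiers, each a weighted sum of weak classifiers with a rejection threshold) can be sketched as follows. This covers only the plain cascade, not the probabilistic missing-data formulation of the thesis; the decision stumps, weights, and thresholds are hypothetical values chosen for the example.

```python
def stump(index, cut):
    """Weak classifier: votes +1 if feature `index` exceeds `cut`, else -1."""
    return lambda x: 1 if x[index] > cut else -1


def boosted_score(weak_classifiers, x):
    """Linear combination of weak votes: sum of alpha * h(x)."""
    return sum(alpha * h(x) for alpha, h in weak_classifiers)


def cascade_predict(stages, x):
    """Run the stages in order and reject at the first one scoring below threshold.

    stages: list of (weak_classifiers, threshold).
    Returns (accepted, number_of_stages_passed).
    """
    for level, (weak_classifiers, threshold) in enumerate(stages):
        if boosted_score(weak_classifiers, x) < threshold:
            return False, level
    return True, len(stages)


# Hypothetical two-stage cascade over 3-dimensional feature vectors.
stages = [
    ([(1.0, stump(0, 0.5))], 0.0),
    ([(0.6, stump(1, 0.2)), (0.4, stump(2, 0.7))], 0.1),
]

accepted = cascade_predict(stages, (0.9, 0.5, 0.1))
rejected_early = cascade_predict(stages, (0.1, 0.9, 0.9))
rejected_late = cascade_predict(stages, (0.9, 0.1, 0.9))
```

Most windows are expected to be rejected in the first, cheap stages, which is why the later and larger stages run on only a small fraction of the input.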
366	Machine Learning for Solar Energy Prediction Ferrer Martínez, Claudia January 2018
This thesis studies different Machine Learning models used to predict solar power data in photovoltaic plants. The process of implementing a Machine Learning model is reviewed step by step: collecting the data, pre-processing the data so that it can be used as input for the model, dividing the data into training data and testing data, training the Machine Learning algorithm with the training data, evaluating the algorithm with the testing data, and making the changes necessary to achieve the best results. The thesis starts with a brief introduction to solar energy, followed by an introduction to Machine Learning. The theory of different supervised learning models and algorithms is reviewed, including Decision Trees, Naïve Bayes Classification, Support Vector Machines (SVM), K-Nearest Neighbor (KNN), Linear Regression, Logistic Regression, and Artificial Neural Networks (ANN). Then, Linear Regression, SVM Regression, and an Artificial Neural Network are implemented in MATLAB to predict solar energy from historical data of photovoltaic plants. The data used to train and test the models comes from the National Renewable Energy Laboratory (NREL), which provides a dataset called “Solar Power Data for Integration Studies” intended for use by project developers and university researchers. The dataset consists of 1 year of hourly power data for approximately 6000 simulated PV plants throughout the United States. Finally, once the different models have been implemented, the results show that the technique that provides the best results is Linear Regression.
Machine Learning Solar Energy Forecasting MATLAB Supervised Learning Regression Model Engineering and Technology Teknik och teknologier Elektroteknik och elektronik
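The workflow listed in record 366 (split the data, train on one part, evaluate on the other) can be sketched with a one-variable linear regression. The thesis used MATLAB and NREL data; this sketch swaps in Python and a synthetic, noise-free data set, so the numbers and the helper names (`train_test_split`, `fit_line`, `rmse`) are assumptions for illustration only.

```python
import math
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows reproducibly and hold out a fraction for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def fit_line(rows):
    """Ordinary least squares for y = a + b * x on (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(model, rows):
    """Root mean squared error of the fitted line on held-out rows."""
    intercept, slope = model
    return math.sqrt(sum((intercept + slope * x - y) ** 2 for x, y in rows) / len(rows))


# Synthetic stand-in for (input, power) pairs: power = 2.0 + 0.5 * input.
rows = [(float(x), 2.0 + 0.5 * x) for x in range(20)]
train, test = train_test_split(rows)
model = fit_line(train)
error = rmse(model, test)
```

Because the synthetic data lie exactly on a line, the held-out error is essentially zero; with real measurements the same held-out error is what one would compare across models.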
367	Agrupamento de dados semissupervisionado na geração de regras fuzzy / Semi-supervised data clustering in the generation of fuzzy rules Lopes, Priscilla de Abreu 27 August 2010
Submitted by Izabel Franco (izabel-franco@ufscar.br) on 2016-09-06T18:25:30Z. Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-09-12T14:03:53Z (GMT). Made available in DSpace on 2016-09-12T14:04:09Z (GMT). No. of bitstreams: 1 DissPAL.pdf: 2245333 bytes, checksum: 24abfad37e7d0675d6cef494f4f41d1e (MD5). Previous issue date: 2010-08-27. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Inductive learning is traditionally categorized as supervised or unsupervised. In supervised learning, the learning method is given a labeled data set, that is, data whose classes are known. Such data sets are suited to classification and regression problems. In unsupervised learning, unlabeled data are analyzed in order to identify structures embedded in the data set. Typically, clustering methods do not make use of prior knowledge, such as class labels, to perform their task. The characteristics of current data sets, namely large volume and mixed attribute structures, motivate the search for better solutions to machine learning tasks. The proposed research fits into this context. It concerns the application of semi-supervised fuzzy clustering methods to the generation of fuzzy rule bases. Semi-supervised clustering methods perform their task by incorporating some prior knowledge about the data set. The clustering result is then used to label the remaining unlabeled data in the set. Next, supervised learning algorithms that aim to generate fuzzy rules come into play. This document presents the theoretical concepts needed to understand the research proposal and a discussion of the context in which the proposal fits. Some experiments were carried out to show that this may be an interesting solution for machine learning tasks that encounter difficulties due to a lack of available information about the data.
Aprendizado Semi-Supervisionado Agrupamento Fuzzy de Dados Geração de Regras Fuzzy Semi-Supervised Learning Fuzzy Data Clustering Fuzzy Rules Generation
368	Investigando a combinação de técnicas de aprendizado semissupervisionado e classificação hierárquica multirrótulo / Investigating the combination of semi-supervised learning techniques and hierarchical multi-label classification Santos, Araken de Medeiros 25 May 2012
Made available in DSpace on 2015-03-03T15:48:39Z (GMT). No. of bitstreams: 1 ArakenMS_TESE.pdf: 4060697 bytes, checksum: 5efe25ac134a602cc32c96b66e749ea0 (MD5). Previous issue date: 2012-05-25
Data classification is a task with high applicability in many areas. Most methods for classification problems found in the literature deal with traditional, single-label problems. In recent years, a series of classification tasks has been identified in which the samples can be assigned to more than one class simultaneously (multi-label classification). Additionally, these classes can be hierarchically organized (hierarchical classification and hierarchical multi-label classification). On the other hand, a new category of learning has also been studied, called semi-supervised learning, which combines labeled data (supervised learning) and unlabeled data (unsupervised learning) during the training phase, thus reducing the need for a large amount of labeled data when only a small set of labeled samples is available. Since both multi-label and hierarchical multi-label classification techniques and semi-supervised learning have shown favorable results, this work proposes and applies semi-supervised learning to hierarchical multi-label classification tasks, so as to efficiently meet the main needs of the two areas. An experimental analysis of the proposed methods found that the use of semi-supervised learning in hierarchical multi-label methods produced satisfactory results, since the two approaches yielded statistically similar results.
Classificação multirrótulo Classificação hierárquica multirrótulo Aprendizado semissupervisionado Multi-label classification Hierarchical multi-label classification Semi-supervised learning
369	Semi-supervised co-selection : instances and features : application to diagnosis of dry port by rail / Co-selection instances-variables en mode semi-supervisé : application au diagnostic de transport ferroviaire. Makkhongkaew, Raywat 15 December 2016
With the proliferation of partially labeled databases, machine learning has developed substantially in the semi-supervised setting. This trend is due, on the one hand, to the difficulty of labeling data and, on the other hand, to the cost of labeling when it is possible. Semi-supervised learning generally consists of modeling a statistical function from a database that contains both labeled and unlabeled examples. Two families of approaches address this problem: those based on propagating the supervision with a view to supervised classification, and those based on constraints with a view to (unsupervised) clustering. We are interested here in the second family, with one particular difficulty: learning from data in which the labeled part is very small relative to the unlabeled part. In this thesis, we focus on optimizing statistical databases in order to improve learning models. This optimization can be horizontal and/or vertical: the first defines instance selection and the second defines feature selection. The two tasks are usually studied independently, with a considerable body of work in the literature. We propose to study them simultaneously, which defines the topic of co-selection. To do so, we propose two unified frameworks that consider both the labeled and the unlabeled parts of the data. The first framework is based on weighted constrained clustering and the second on preserving similarities between the data. Both approaches score the instances and the features in order to select the most relevant ones simultaneously. Finally, we present a series of empirical studies on public data sets well known in the literature, to validate the proposed approaches and compare them with other known approaches. In addition, an experimental validation is provided on a real problem concerning rail transport diagnosis for the State Railway of Thailand.
/ We are drowning in massive data but starved for knowledge retrieval. It is well known from the dimensionality tradeoff that more data is more informative but comes at a price in computational complexity, which has to be made up in some way. When the labeled sample is too small to carry sufficient information about the target concept, supervised learning fails. Unsupervised learning can be an alternative for this problem. However, because these algorithms ignore label information, important hints from labeled data are left out, which generally degrades the performance of unsupervised learning algorithms. Using both labeled and unlabeled data is expected to work better; this is semi-supervised learning, which is better suited to large-domain applications in which labels are hard and costly to obtain. In addition, when data are large, feature selection and instance selection are two important dual operations for removing irrelevant information. Combined with semi-supervised learning, both tasks pose distinct challenges to the machine learning and data mining communities for data dimensionality reduction and knowledge retrieval. In this thesis, we focus on co-selection of instances and features in the context of semi-supervised learning. In this context, co-selection becomes a more challenging problem because the data contain labeled and unlabeled examples sampled from the same population. To perform such semi-supervised co-selection, we propose two unified frameworks that efficiently integrate the labeled and unlabeled parts into the co-selection process. The first framework is based on weighted constrained clustering and the second on similarity-preserving selection. Both approaches evaluate the usefulness of features and instances in order to select the most relevant ones simultaneously. Finally, we present a variety of empirical studies over high-dimensional data sets that are well known in the literature. The results are promising and demonstrate the efficiency and effectiveness of the proposed approaches. In addition, the developed methods are validated on a real-world application, over data provided by the State Railway of Thailand (SRT). The purpose is to derive application models from our methodological contributions to diagnose the performance of rail dry port systems. First, we present the results of some ensemble methods applied to a first data set, which is fully labeled. Second, we show how our co-selection approaches can improve the performance of learning algorithms over partially labeled data provided by SRT.
Sélection d'instances Sélection de variables Co-selection Apprentissage semi-supervisé Classification sous contraintes Instance selection Feature selection Co-selection Semi-supervised learning Constrained clustering 006.3
370	Uma plataforma móvel para estudos de autonomia. / A mobile platform for autonomy studies. Sergio Ribeiro Augusto 29 March 2007
This work proposes a mobile robotic platform, designed in a modular and hierarchical way, aimed at the study of several aspects of indoor navigation, both autonomous and semi-autonomous. The proposed system allows the implementation of reactive and hybrid architectures with learning; the importance and limitations of the latter are discussed. Using the platform, an application of robotic navigation with supervised learning is implemented with ultrasonic sensors and teleoperation. The aim is for the agent to associate, in real time, its own sensory responses with the motor actions performed by the teleoperator, allowing the task to be repeated autonomously with some generalization. To perform this mapping, a radial basis function (RBF) network trained by a sequential learning algorithm is presented and used.
Aprendizagem supervisionada Arquitetura de robótica móvel Autonomia Redes de função de base radial Tele-operação Autonomy Mobile robotic architecture Radial basis function network Supervised learning Teleoperation
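The mapping in record 370 from sensor readings to motor commands, learned one sample at a time, can be illustrated with a small radial basis function network. The thesis does not specify its sequential algorithm here, so this sketch assumes fixed Gaussian centers and a plain least-mean-squares weight update; the two-sensor readings, the steering targets, and the class name `RBFNetwork` are hypothetical.

```python
import math


class RBFNetwork:
    """Gaussian RBF network with fixed centers and sequentially updated weights."""

    def __init__(self, centers, width=1.0, rate=0.5):
        self.centers = [tuple(c) for c in centers]
        self.width = width
        self.rate = rate
        self.weights = [0.0] * len(self.centers)

    def _features(self, x):
        # One Gaussian activation per center.
        return [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * self.width ** 2))
            for c in self.centers
        ]

    def predict(self, x):
        return sum(w * phi for w, phi in zip(self.weights, self._features(x)))

    def update(self, x, target):
        """One sequential step: move the weights along the error (LMS rule)."""
        phi = self._features(x)
        error = target - sum(w * p for w, p in zip(self.weights, phi))
        self.weights = [w + self.rate * error * p for w, p in zip(self.weights, phi)]
        return error


# Hypothetical demonstrations: (left distance, right distance) -> steering command.
net = RBFNetwork(centers=[(0.2, 1.0), (1.0, 0.2)], width=0.5, rate=0.5)
demonstrations = [((0.2, 1.0), 1.0), ((1.0, 0.2), -1.0)]
for _ in range(200):
    for reading, command in demonstrations:
        net.update(reading, command)  # samples arrive one at a time

nearby = net.predict((0.3, 0.9))  # a reading close to the first demonstration
```

A reading that was never demonstrated but lies near one that was produces a command of the same sign, which is the kind of generalization the abstract refers to.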
