Global ETD Search

111	Semi-Supervised Gait Recognition Mitra, Sirshapan 01 January 2024 (has links) (PDF) In this work, we examine semi-supervised learning for Gait recognition with a limited number of labeled samples. Our research focus on two distinct aspects for limited labels, 1)closed-set: with limited labeled samples per individual, and 2) open-set: with limited labeled individuals. We find open-set poses greater challenge compared to closed-set thus, having more labeled ids is important for performance than having more labeled samples per id. Moreover, obtaining labeled samples for a large number of individuals is usually more challenging, therefore limited id setup (closed-setup) is more important to study where most of the training samples belong to unknown ids. We further analyze that existing semi-supervised learning approaches are not well suited for scenario where unlabeled samples belong to novel ids. We propose a simple prototypical self-training approach to solve this problem, where, we integrate semi-supervised learning for closed set setting with self-training which can effectively utilize unlabeled samples from unknown ids. To further alleviate the challenges of limited labeled samples, we explore the role of synthetic data where we utilize diffusion model to generate samples from both known and unknown ids. We perform our experiments on two different Gait recognition benchmarks, CASIA-B and OUMVLP, and provide a comprehensive evaluation of the proposed method. The proposed approach is effective and generalizable for both closed and open-set settings. With merely 20% of labeled samples, we were able to achieve performance competitive to supervised methods utilizing 100% labeled samples while outperforming existing semi-supervised methods. Deep Learning Semi-Supervised Learning Gait Recognition Computer Sciences
112	Defect prediction on production line Khalfaoui, S., Manouvrier, E., Briot, A., Delaux, D., Butel, S., Ibrahim, Jesutofunmi, Kanyere, Tatenda, Orimogunje, Bola, Abdullatif, Amr A.A., Neagu, Daniel 29 March 2022 (has links) Yes / Quality control has long been one of the most challenging fields of manufacturing. The development of advanced sensors and the easier collection of high amounts of data designate the machine learning techniques as a timely natural step forward to leverage quality decision support and manufacturing challenges. This paper introduces an original dataset provided by the automotive supplier company VALEO, coming from a production line, and hosted by the École Normale Supérieure (ENS) Data Challenge to predict defects using non-anonymised features, without access to final test results, to validate the part status (defective or not). We propose in this paper a complete workflow from data exploration to the modelling phase while addressing at each stage challenges and techniques to solve them, as a benchmark reference. The proposed workflow is validated in series of experiments that demonstrate the benefits, challenges and impact of data science adoption in manufacturing. Manufacturing Quality control Defect prediction Machine learning Supervised learning
113	General discriminative optimization for point set registration Zhao, Y., Tang, W., Feng, J., Wan, Tao Ruan, Xi, L. 26 March 2022 (has links) Yes / Point set registration has been actively studied in computer vision and graphics. Optimization algorithms are at the core of solving registration problems. Traditional optimization approaches are mainly based on the gradient of objective functions. The derivation of objective functions makes it challenging to find optimal solutions for complex optimization models, especially for those applications where accuracy is critical. Learning-based optimization is a novel approach to address this problem, which learns the gradient direction from datasets. However, many learning-based optimization algorithms learn gradient directions via a single feature extracted from the dataset, which will cause the updating direction to be vulnerable to perturbations around the data, thus falling into a bad stationary point. This paper proposes the General Discriminative Optimization (GDO) method that updates a gradient path automatically through the trade-off among contributions of different features on updating gradients. We illustrate the benefits of GDO with tasks of 3D point set registrations and show that GDO outperforms the state-of-the-art registration methods in terms of accuracy and robustness to perturbations. Point set registration Supervised learning Learning-based optimisation
114	Classificação semi-supervisionada baseada em desacordo por similaridade / Semi-supervised learning based in disagreement by similarity Gutiérrez, Victor Antonio Laguna 03 May 2010 (has links) O aprendizado semi-supervisionado é um paradigma do aprendizado de máquina no qual a hipótese é induzida aproveitando tanto os dados rotulados quantos os dados não rotulados. Este paradigma é particularmente útil quando a quantidade de exemplos rotulados é muito pequena e a rotulação manual dos exemplos é uma tarefa muito custosa. Nesse contexto, foi proposto o algoritmo Cotraining, que é um algoritmo muito utilizado no cenário semi-supervisionado, especialmente quando existe mais de uma visão dos dados. Esta característica do algoritmo Cotraining faz com que a sua aplicabilidade seja restrita a domínios multi-visão, o que diminui muito o potencial do algoritmo para resolver problemas reais. Nesta dissertação, é proposto o algoritmo Co2KNN, que é uma versão mono-visão do algoritmo Cotraining na qual, ao invés de combinar duas visões dos dados, combina duas estratégias diferentes de induzir classificadores utilizando a mesma visão dos dados. Tais estratégias são chamados de k-vizinhos mais próximos (KNN) Local e Global. No KNN Global, a vizinhança utilizada para predizer o rótulo de um exemplo não rotulado é conformada por aqueles exemplos que contém o novo exemplo entre os seus k vizinhos mais próximos. Entretanto, o KNN Local considera a estratégia tradicional do KNN para recuperar a vizinhança de um novo exemplo. A teoria do Aprendizado Semi-supervisionado Baseado em Desacordo foi utilizada para definir a base teórica do algoritmo Co2KNN, pois argumenta que para o sucesso do algoritmo Cotraining, é suficiente que os classificadores mantenham um grau de desacordo que permita o processo de aprendizado conjunto. Para avaliar o desempenho do Co2KNN, foram executados diversos experimentos que sugerem que o algoritmo Co2KNN tem melhor performance que diferentes algoritmos do estado da arte, especificamente, em domínios mono-visão. Adicionalmente, foi proposto um algoritmo otimizado para diminuir a complexidade computacional do KNN Global, permitindo o uso do Co2KNN em problemas reais de classificação / Semi-supervised learning is a machine learning paradigm in which the induced hypothesis is improved by taking advantage of unlabeled data. Semi-supervised learning is particularly useful when labeled data is scarce and difficult to obtain. In this context, the Cotraining algorithm was proposed. Cotraining is a widely used semisupervised approach that assumes the availability of two independent views of the data. In most real world scenarios, the multi-view assumption is highly restrictive, impairing its usability for classifification purposes. In this work, we propose the Co2KNN algorithm, which is a one-view Cotraining approach that combines two different k-Nearest Neighbors (KNN) strategies referred to as global and local k-Nearest Neighbors. In the global KNN, the nearest neighbors used to classify a new instance are given by the set of training examples which contains this instance within its k-nearest neighbors. In the local KNN, on the other hand, the neighborhood considered to classify a new instance is the set of training examples computed by the traditional KNN approach. The Co2KNN algorithm is based on the theoretical background given by the Semi-supervised Learning by Disagreement, which claims that the success of the combination of two classifiers in the Cotraining framework is due to the disagreement between the classifiers. We carried out experiments showing that Co2KNN improves significatively the classification accuracy specially when just one view of training data is available. Moreover, we present an optimized algorithm to cope with time complexity of computing the global KNN, allowing Co2KNN to tackle real classification problems Aprendizado baseado em desacordo Aprendizado semi-supervisionado Classificação Classification Contraining Cotraining Semi-supervised leaning
115	On discriminative semi-supervised incremental learning with a multi-view perspective for image concept modeling Byun, Byungki 17 January 2012 (has links) This dissertation presents the development of a semi-supervised incremental learning framework with a multi-view perspective for image concept modeling. For reliable image concept characterization, having a large number of labeled images is crucial. However, the size of the training set is often limited due to the cost required for generating concept labels associated with objects in a large quantity of images. To address this issue, in this research, we propose to incrementally incorporate unlabeled samples into a learning process to enhance concept models originally learned with a small number of labeled samples. To tackle the sub-optimality problem of conventional techniques, the proposed incremental learning framework selects unlabeled samples based on an expected error reduction function that measures contributions of the unlabeled samples based on their ability to increase the modeling accuracy. To improve the convergence property of the proposed incremental learning framework, we further propose a multi-view learning approach that makes use of multiple features such as color, texture, etc., of images when including unlabeled samples. For robustness to mismatches between training and testing conditions, a discriminative learning algorithm, namely a kernelized maximal- figure-of-merit (kMFoM) learning approach is also developed. Combining individual techniques, we conduct a set of experiments on various image concept modeling problems, such as handwritten digit recognition, object recognition, and image spam detection to highlight the effectiveness of the proposed framework. Discriminative learning Semi-supervised learning Incremental learning Image modeling Multi-view learning Machine learning Supervised learning (Machine learning) Boosting (Algorithms)
116	Learning with Limited Supervision by Input and Output Coding Zhang, Yi 01 May 2012 (has links) In many real-world applications of supervised learning, only a limited number of labeled examples are available because the cost of obtaining high-quality examples is high. Even with a relatively large number of labeled examples, the learning problem may still suffer from limited supervision as the complexity of the prediction function increases. Therefore, learning with limited supervision presents a major challenge to machine learning. With the goal of supervision reduction, this thesis studies the representation, discovery and incorporation of extra input and output information in learning. Information about the input space can be encoded by regularization. We first design a semi-supervised learning method for text classification that encodes the correlation of words inferred from seemingly irrelevant unlabeled text. We then propose a multi-task learning framework with a matrix-normal penalty, which compactly encodes the covariance structure of the joint input space of multiple tasks. To capture structure information that is more general than covariance and correlation, we study a class of regularization penalties on model compressibility. Then we design the projection penalty, which encodes the structure information from a dimension reduction while controlling the risk of information loss. Information about the output space can be exploited by error correcting output codes. Using the composite likelihood view, we propose an improved pairwise coding for multi-label classification, which encodes pairwise label density (as opposed to label comparisons) and decodes using variational methods. We then investigate problemdependent codes, where the encoding is learned from data instead of being predefined. We first propose a multi-label output code using canonical correlation analysis, where predictability of the code is optimized. We then argue that both discriminability and predictability are critical for output coding, and propose a max-margin formulation that promotes both discriminative and predictable codes. We empirically study our methods in a wide spectrum of applications, including document categorization, landmine detection, face recognition, brain signal classification, handwritten digit recognition, house price forecasting, music emotion prediction, medical decision, email analysis, gene function classification, outdoor scene recognition, and so forth. In all these applications, our proposed methods for encoding input and output information lead to significantly improved prediction performance. regularization error-correcting output codes supervised learning semi-supervised learning multi-task learning multi-label classification dimensionality reduction Computer Sciences
117	Semi-Supervised Learning for Object Detection Rosell, Mikael January 2015 (has links) Many automotive safety applications in modern cars make use of cameras and object detection to analyze the surrounding environment. Pedestrians, animals and other vehicles can be detected and safety actions can be taken before dangerous situations arise. To detect occurrences of the different objects, these systems are traditionally trained to learn a classification model using a set of images that carry labels corresponding to their content. To obtain high performance with a variety of object appearances, the required amount of data is very large. Acquiring unlabeled images is easy, while the manual work of labeling is both time-consuming and costly. Semi-supervised learning refers to methods that utilize both labeled and unlabeled data, a situation that is highly desirable if it can lead to improved accuracy and at the same time alleviate the demand of labeled data. This has been an active area of research in the last few decades, but few studies have investigated the performance of these algorithms in larger systems. In this thesis, we investigate if and how semi-supervised learning can be used in a large-scale pedestrian detection system. With the area of application being automotive safety, where real-time performance is of high importance, the work is focused around boosting classifiers. Results are presented on a few publicly available UCI data sets and on a large data set for pedestrian detection captured in real-life traffic situations. By evaluating the algorithms on the pedestrian data set, we add the complexity of data set size, a large variety of object appearances and high input dimension. It is possible to find situations in low dimensions where an additional set of unlabeled data can be used successfully to improve a classification model, but the results show that it is hard to efficiently utilize semi-supervised learning in large-scale object detection systems. The results are hard to scale to large data sets of higher dimensions as pair-wise computations are of high complexity and proper similarity measures are hard to find. semi-supervised learning object detection pedestrian detection boosting machine learning supervised learning adaboost semiboost regboost self-learning
118	Classificação semi-supervisionada baseada em desacordo por similaridade / Semi-supervised learning based in disagreement by similarity Victor Antonio Laguna Gutiérrez 03 May 2010 (has links) O aprendizado semi-supervisionado é um paradigma do aprendizado de máquina no qual a hipótese é induzida aproveitando tanto os dados rotulados quantos os dados não rotulados. Este paradigma é particularmente útil quando a quantidade de exemplos rotulados é muito pequena e a rotulação manual dos exemplos é uma tarefa muito custosa. Nesse contexto, foi proposto o algoritmo Cotraining, que é um algoritmo muito utilizado no cenário semi-supervisionado, especialmente quando existe mais de uma visão dos dados. Esta característica do algoritmo Cotraining faz com que a sua aplicabilidade seja restrita a domínios multi-visão, o que diminui muito o potencial do algoritmo para resolver problemas reais. Nesta dissertação, é proposto o algoritmo Co2KNN, que é uma versão mono-visão do algoritmo Cotraining na qual, ao invés de combinar duas visões dos dados, combina duas estratégias diferentes de induzir classificadores utilizando a mesma visão dos dados. Tais estratégias são chamados de k-vizinhos mais próximos (KNN) Local e Global. No KNN Global, a vizinhança utilizada para predizer o rótulo de um exemplo não rotulado é conformada por aqueles exemplos que contém o novo exemplo entre os seus k vizinhos mais próximos. Entretanto, o KNN Local considera a estratégia tradicional do KNN para recuperar a vizinhança de um novo exemplo. A teoria do Aprendizado Semi-supervisionado Baseado em Desacordo foi utilizada para definir a base teórica do algoritmo Co2KNN, pois argumenta que para o sucesso do algoritmo Cotraining, é suficiente que os classificadores mantenham um grau de desacordo que permita o processo de aprendizado conjunto. Para avaliar o desempenho do Co2KNN, foram executados diversos experimentos que sugerem que o algoritmo Co2KNN tem melhor performance que diferentes algoritmos do estado da arte, especificamente, em domínios mono-visão. Adicionalmente, foi proposto um algoritmo otimizado para diminuir a complexidade computacional do KNN Global, permitindo o uso do Co2KNN em problemas reais de classificação / Semi-supervised learning is a machine learning paradigm in which the induced hypothesis is improved by taking advantage of unlabeled data. Semi-supervised learning is particularly useful when labeled data is scarce and difficult to obtain. In this context, the Cotraining algorithm was proposed. Cotraining is a widely used semisupervised approach that assumes the availability of two independent views of the data. In most real world scenarios, the multi-view assumption is highly restrictive, impairing its usability for classifification purposes. In this work, we propose the Co2KNN algorithm, which is a one-view Cotraining approach that combines two different k-Nearest Neighbors (KNN) strategies referred to as global and local k-Nearest Neighbors. In the global KNN, the nearest neighbors used to classify a new instance are given by the set of training examples which contains this instance within its k-nearest neighbors. In the local KNN, on the other hand, the neighborhood considered to classify a new instance is the set of training examples computed by the traditional KNN approach. The Co2KNN algorithm is based on the theoretical background given by the Semi-supervised Learning by Disagreement, which claims that the success of the combination of two classifiers in the Cotraining framework is due to the disagreement between the classifiers. We carried out experiments showing that Co2KNN improves significatively the classification accuracy specially when just one view of training data is available. Moreover, we present an optimized algorithm to cope with time complexity of computing the global KNN, allowing Co2KNN to tackle real classification problems Aprendizado baseado em desacordo Aprendizado semi-supervisionado Classificação Contraining Classification Cotraining Semi-supervised leaning
119	Using Semi-supervised Clustering for Neurons Classification Fakhraee Seyedabad, Ali January 2013 (has links) We wish to understand brain; discover its sophisticated ways of calculations to invent improved computational methods. To decipher any complex system, first its components should be understood. Brain comprises neurons. Neurobiologists use morphologic properties like “somatic perimeter”, “axonal length”, and “number of dendrites” to classify neurons. They have discerned two types of neurons: “interneurons” and “pyramidal cells”, and have a consensus about five classes of interneurons: PV, 2/3, Martinotti, Chandelier, and NPY. They still need a more refined classification of interneurons because they suppose its known classes may contain subclasses or new classes may arise. This is a difficult process because of the great number and diversity of interneurons and lack of objective indices to classify them. Machine learning—automatic learning from data—can overcome the mentioned difficulties, but it needs a data set to learn from. To meet this demand neurobiologists compiled a data set from measuring 67 morphologic properties of 220 interneurons of mouse brains; they also labeled some of the samples—i.e. added their opinion about the sample’s classes. This project aimed to use machine learning to determine the true number of classes within the data set, classes of the unlabeled samples, and the accuracy of the available class labels. We used K-means, seeded K-means, and constrained K-means, and clustering validity techniques to achieve our objectives. Our results indicate that: the data set contains seven classes; seeded K-means outperforms K-means and constrained K-means; chandelier and 2/3 are the most consistent classes, whereas PV and Martinotti are the least consistent ones. machine learning clustering semi-supervised learning semi-supervised clustering data mining number of clusters Computer Sciences Datavetenskap (datalogi)
120	Classification d’objets au moyen de machines à vecteurs supports dans les images de sonar de haute résolution du fond marin / Object classification using support vector machines in high resolution sonar seabed imagery Rousselle, Denis 28 November 2016 (has links) Cette thèse a pour objectif d'améliorer la classification d'objets sous-marins dans des images sonar haute résolution. En particulier, il s'agit de distinguer les mines des objets inoffensifs parmi une collection d'objets ressemblant à des mines. Nos recherches ont été dirigées par deux contraintes classiques en guerre de la mine : d'une part, le manque de données et d'autre part, le besoin de lisibilité des décisions. Nous avons donc constitué une base de données la plus représentative possible et simulé des objets dans le but de la compléter. Le manque d'exemples nous a mené à utiliser une représentation compacte, issue de la reconnaissance de visages : les Structural Binary Gradient Patterns (SBGP). Dans la même optique, nous avons dérivé une méthode d'adaptation de domaine semi-supervisée, basée sur le transport optimal, qui peut être facilement interprétable. Enfin, nous avons développé un nouvel algorithme de classification : les Ensemble of Exemplar-Maximum Excluding Ball (EE-MEB) qui sont à la fois adaptés à des petits jeux de données mais dont la décision est également aisément analysable / This thesis aims to improve the classification of underwater objects in high resolution sonar images. Especially, we seek to make the distinction between mines and harmless objects from a collection of mine-like objects. Our research was led by two classical constraints of the mine warfare : firstly, the lack of data and secondly, the need for readability of the classification. In this context, we built a database as much representative as possible and simulated objects in order to complete it. The lack of examples led us to use a compact representation, originally used by the face recognition community : the Structural Binary Gradient Patterns (SBGP). To the same end, we derived a method of semi-supervised domain adaptation, based on optimal transport, that can be easily interpreted. Finally, we developed a new classification algorithm : the Ensemble of Exemplar-Maximum Excluding Ball (EE-MEB) which is suitable for small datasets and with an easily interpretable decision function Guerre des mines Classification d’images Adaptation de domaine semi-supervisée. Mine warfare Images classification Supervised learning Optimal transport Semi-supervised domain adaptation

Search results