  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

Design Optimization of Fuzzy Logic Systems

Dadone, Paolo 29 May 2001 (has links)
Fuzzy logic systems are widely used for control, system identification, and pattern recognition problems. To maximize their performance, it is often necessary to undertake a design optimization process in which the adjustable parameters defining a particular fuzzy system are tuned to maximize a given performance criterion. Some data to approximate are commonly available, yielding what is called the supervised learning problem, in which we typically wish to minimize the sum of squared errors in approximating the data. We first introduce fuzzy logic systems and the supervised learning problem, which is in effect a nonlinear optimization problem that can at times be non-differentiable. We review the existing approaches and discuss their weaknesses and the issues involved. We then focus on one of these issues, namely non-differentiability of the objective function, and show how current approaches that do not account for non-differentiability can diverge. We also show that non-differentiability may have an adverse practical impact on algorithmic performance. We reformulate both the supervised learning problem and piecewise linear membership functions in order to obtain a polynomial or factorable optimization problem, and propose the application of a global nonconvex optimization approach, namely a reformulation and linearization technique. The expanded problem dimensionality does not make this approach feasible at this time, even though the reformulation, along with the proposed technique, still bears theoretical interest; some future research directions are identified. We then propose a novel approach to step-size selection in batch training, which uses a limited-memory quadratic fit on past convergence data.
It is thus similar to response surface methodologies, but differs from them in the type of data used to fit the model: already available data from the history of the algorithm are used instead of data obtained according to an experimental design. The step-size along the update direction (e.g., the negative gradient or a deflected negative gradient) is chosen according to a criterion of minimum distance from the vertex of the quadratic model. This rescales the complexity of step-size selection from the order of the (large) number of training data, as in exact line searches, to the order of the number of parameters (generally lower than the number of training data). The quadratic fit approach and a reduced variant are tested on function approximation examples, yielding distributions of final mean square errors that are improved (i.e., skewed toward lower errors) with respect to those of the commonly used pattern-by-pattern approach. The quadratic fit is also competitive with, and sometimes better than, batch training with optimal step-sizes. Tested in conjunction with gradient deflection strategies and memoryless variable metric methods, it yields errors smaller by 1 to 7 orders of magnitude. Moreover, the convergence speed using either the negative gradient direction or a deflected direction is higher than that of the pattern-by-pattern approach, although the computational cost per iteration is moderately higher than that of the pattern-by-pattern method. Finally, some directions for future research are identified. / Ph. D.
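The limited-memory quadratic fit described in this abstract can be sketched in a few lines: fit a parabola to recent (step-size, loss) pairs drawn from the algorithm's own history, and take the step-size at the parabola's vertex. This is a minimal illustration under assumed conventions, not the thesis's exact procedure; the function name, the three-point minimum, and the fallback rule are illustrative choices.

```python
import numpy as np

def quadratic_fit_step(history, fallback=0.01):
    """Pick a step-size from a quadratic fit to past (step, loss) pairs.

    `history` is a list of (step_size, loss) tuples gathered during
    training. The fallback value and the >= 3 point requirement are
    assumptions made for this sketch.
    """
    if len(history) < 3:
        return fallback  # need at least 3 points to fit a parabola
    steps, losses = map(np.asarray, zip(*history))
    a, b, _c = np.polyfit(steps, losses, deg=2)
    if a <= 0:  # concave fit has no minimum; keep the fallback
        return fallback
    return -b / (2.0 * a)  # vertex of the fitted parabola
```

For points lying on an exact parabola with minimum at 0.25, the routine recovers that step-size; with too little history it returns the fallback.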
112

Interactively Guiding Semi-Supervised Clustering via Attribute-based Explanations

Lad, Shrenik 01 July 2015 (has links)
Unsupervised image clustering is a challenging and often ill-posed problem. Existing image descriptors fail to capture the clustering criterion well, and, more importantly, the criterion itself may depend on (unknown) user preferences. Semi-supervised approaches such as distance metric learning and constrained clustering thus leverage user-provided annotations indicating which pairs of images belong to the same cluster (must-link) and which do not (cannot-link). These approaches require many such constraints before achieving good clustering performance, because each constraint provides only weak cues about the desired clustering. In this work, we propose to use image attributes as a modality for the user to provide more informative cues. In particular, the clustering algorithm iteratively and actively queries the user with an image pair. Instead of simply providing a must-link/cannot-link constraint for the pair, the user also provides an attribute-based explanation, e.g., "these two images are similar because both are natural and have still water" or "these two people are dissimilar because one is way older than the other". Under the guidance of this explanation, and equipped with attribute predictors, many additional constraints are generated automatically. We demonstrate the effectiveness of our approach by incorporating the proposed attribute-based explanations into three standard semi-supervised clustering algorithms (Constrained K-Means, MPCK-Means, and Spectral Clustering) on three domains (scenes, shoes, and faces), using both binary and relative attributes. / Master of Science
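To make the mechanism concrete, here is a toy sketch of how one attribute-based explanation can fan out into many must-link constraints: given (assumed) binary attribute predictions over the unlabeled pool, every pair of images that shares the attributes named in the explanation is linked. All names here are illustrative; the actual system also handles relative attributes and cannot-link explanations.

```python
def propagate_must_links(attr_preds, explained_attrs):
    """Generate must-link pairs from one attribute-based explanation.

    attr_preds: dict mapping image id -> set of predicted binary attributes
    explained_attrs: set of attributes named in the user's explanation
    (an illustrative simplification; the real system relies on learned
    attribute predictors rather than hand-given attribute sets).
    """
    # images whose predictions contain every attribute in the explanation
    matching = sorted(img for img, a in attr_preds.items()
                      if explained_attrs <= a)
    # link every pair of matching images
    return {(a, b) for i, a in enumerate(matching) for b in matching[i + 1:]}
```

For example, an explanation naming "natural" and "still_water" links all images predicted to have both attributes, multiplying one user answer into many constraints.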
113

Semi-Supervised Gait Recognition

Mitra, Sirshapan 01 January 2024 (has links) (PDF)
In this work, we examine semi-supervised learning for gait recognition with a limited number of labeled samples. Our research focuses on two distinct limited-label settings: 1) closed-set, with limited labeled samples per individual, and 2) open-set, with limited labeled individuals. We find that the open-set setting poses a greater challenge than the closed-set one; having more labeled ids thus matters more for performance than having more labeled samples per id. Moreover, obtaining labeled samples for a large number of individuals is usually more challenging, so the limited-id (open-set) setup, where most of the training samples belong to unknown ids, is the more important one to study. We further observe that existing semi-supervised learning approaches are not well suited to scenarios where unlabeled samples belong to novel ids. We propose a simple prototypical self-training approach to solve this problem, integrating semi-supervised learning for the closed-set setting with self-training that can effectively utilize unlabeled samples from unknown ids. To further alleviate the challenge of limited labeled samples, we explore the role of synthetic data, utilizing a diffusion model to generate samples from both known and unknown ids. We perform our experiments on two gait recognition benchmarks, CASIA-B and OUMVLP, and provide a comprehensive evaluation of the proposed method. The proposed approach is effective and generalizable in both closed- and open-set settings. With merely 20% of labeled samples, we achieve performance competitive with supervised methods utilizing 100% labeled samples, while outperforming existing semi-supervised methods.
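The prototypical pseudo-labeling step at the core of such self-training can be sketched as follows: class prototypes are the mean embeddings of the labeled samples, and each unlabeled embedding receives the label of its nearest prototype. This is a generic sketch under an assumed Euclidean metric, not the thesis's full self-training loop (which also handles unknown ids and confidence filtering).

```python
import numpy as np

def prototype_pseudo_labels(X_lab, y_lab, X_unlab):
    """Pseudo-label unlabeled embeddings by nearest class prototype."""
    classes = np.unique(y_lab)
    # one prototype per class: the mean of its labeled embeddings
    protos = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    # Euclidean distance from every unlabeled point to every prototype
    d = np.linalg.norm(X_unlab[:, None, :] - protos[None, :, :], axis=2)
    return classes[d.argmin(axis=1)]
```

In a full self-training loop, the most confident pseudo-labels would be folded back into the labeled set and the prototypes re-estimated.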
114

Classificação semi-supervisionada baseada em desacordo por similaridade / Semi-supervised learning based in disagreement by similarity

Gutiérrez, Victor Antonio Laguna 03 May 2010 (has links)
Semi-supervised learning is a machine learning paradigm in which the induced hypothesis is improved by taking advantage of unlabeled data. Semi-supervised learning is particularly useful when labeled data is scarce and difficult to obtain. In this context, the Cotraining algorithm was proposed. Cotraining is a widely used semi-supervised approach that assumes the availability of two independent views of the data. In most real-world scenarios, the multi-view assumption is highly restrictive, impairing its usability for classification purposes. In this work, we propose the Co2KNN algorithm, a one-view Cotraining approach that combines two different k-Nearest Neighbors (KNN) strategies, referred to as global and local KNN. In the global KNN, the nearest neighbors used to classify a new instance are given by the set of training examples that contain this instance within their own k nearest neighbors. In the local KNN, on the other hand, the neighborhood considered to classify a new instance is the set of training examples computed by the traditional KNN approach. The Co2KNN algorithm is based on the theoretical background given by Semi-supervised Learning by Disagreement, which claims that the success of combining two classifiers in the Cotraining framework is due to the disagreement between the classifiers. We carried out experiments showing that Co2KNN significantly improves classification accuracy, especially when just one view of the training data is available. Moreover, we present an optimized algorithm to cope with the time complexity of computing the global KNN, allowing Co2KNN to tackle real classification problems.
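The two neighborhood notions can be contrasted in a small sketch: the local KNN is the usual set of k training points nearest to a query, while the global KNN is the set of training points that would count the query among their own k nearest neighbors (a reverse-nearest-neighbor query). The brute-force version below is illustrative only; the dissertation's optimized algorithm avoids this quadratic cost.

```python
import numpy as np

def local_knn(X, x, k):
    """Standard kNN: indices of the k training points closest to x."""
    d = np.linalg.norm(X - x, axis=1)
    return np.argsort(d)[:k]

def global_knn(X, x, k):
    """Reverse kNN: indices of training points that would count x among
    their own k nearest neighbors (brute-force sketch, O(n^2))."""
    hits = []
    for i, xi in enumerate(X):
        # candidate pool: all other training points plus x itself
        pool = np.vstack([np.delete(X, i, axis=0), x])
        d = np.linalg.norm(pool - xi, axis=1)
        # x is the last row of the pool; keep i if x ranks in xi's top k
        if len(pool) - 1 in np.argsort(d)[:k]:
            hits.append(i)
    return hits
```

Note that the two neighborhoods can differ: a query can be the nearest neighbor of points that are not themselves its nearest neighbors, which is what gives the two induced classifiers room to disagree.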
115

On discriminative semi-supervised incremental learning with a multi-view perspective for image concept modeling

Byun, Byungki 17 January 2012 (has links)
This dissertation presents the development of a semi-supervised incremental learning framework with a multi-view perspective for image concept modeling. For reliable image concept characterization, having a large number of labeled images is crucial. However, the size of the training set is often limited by the cost of generating concept labels for objects in a large quantity of images. To address this issue, we propose to incrementally incorporate unlabeled samples into the learning process to enhance concept models originally learned with a small number of labeled samples. To tackle the sub-optimality of conventional techniques, the proposed incremental learning framework selects unlabeled samples using an expected error reduction function that measures their ability to increase modeling accuracy. To improve the convergence of the framework, we further propose a multi-view learning approach that makes use of multiple image features, such as color and texture, when including unlabeled samples. For robustness to mismatches between training and testing conditions, a discriminative learning algorithm, namely a kernelized maximal-figure-of-merit (kMFoM) learning approach, is also developed. Combining these techniques, we conduct experiments on various image concept modeling problems, such as handwritten digit recognition, object recognition, and image spam detection, to highlight the effectiveness of the proposed framework.
116

Learning with Limited Supervision by Input and Output Coding

Zhang, Yi 01 May 2012 (has links)
In many real-world applications of supervised learning, only a limited number of labeled examples are available because the cost of obtaining high-quality examples is high. Even with a relatively large number of labeled examples, the learning problem may still suffer from limited supervision as the complexity of the prediction function increases. Learning with limited supervision therefore presents a major challenge to machine learning. With the goal of supervision reduction, this thesis studies the representation, discovery and incorporation of extra input and output information in learning. Information about the input space can be encoded by regularization. We first design a semi-supervised learning method for text classification that encodes the correlation of words inferred from seemingly irrelevant unlabeled text. We then propose a multi-task learning framework with a matrix-normal penalty, which compactly encodes the covariance structure of the joint input space of multiple tasks. To capture structure information more general than covariance and correlation, we study a class of regularization penalties on model compressibility. We then design the projection penalty, which encodes the structure information from a dimension reduction while controlling the risk of information loss. Information about the output space can be exploited by error-correcting output codes. Using the composite likelihood view, we propose an improved pairwise coding for multi-label classification, which encodes pairwise label density (as opposed to label comparisons) and decodes using variational methods. We then investigate problem-dependent codes, where the encoding is learned from data instead of being predefined. We first propose a multi-label output code using canonical correlation analysis, where the predictability of the code is optimized.
We then argue that both discriminability and predictability are critical for output coding, and propose a max-margin formulation that promotes both discriminative and predictable codes. We empirically study our methods in a wide spectrum of applications, including document categorization, landmine detection, face recognition, brain signal classification, handwritten digit recognition, house price forecasting, music emotion prediction, medical decision, email analysis, gene function classification, outdoor scene recognition, and so forth. In all these applications, our proposed methods for encoding input and output information lead to significantly improved prediction performance.
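Error-correcting output codes, the starting point for the output-coding contributions summarized above, reduce at test time to a simple decoding step: obtain one bit per binary classifier, then pick the class whose codeword is closest in Hamming distance. A minimal sketch of that classic decode (the thesis's variational and max-margin decoders are more elaborate):

```python
import numpy as np

def ecoc_decode(predicted_bits, codebook):
    """Return the index of the class whose codeword has minimum
    Hamming distance to the predicted bit vector.

    codebook: (n_classes, n_bits) array of 0/1 codewords, one row per class.
    """
    hamming = (codebook != predicted_bits).sum(axis=1)
    return int(hamming.argmin())
```

With redundant bits, a few binary-classifier mistakes can still decode to the right class, which is exactly the error-correcting property the coding schemes above try to strengthen.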
117

Semi-Supervised Learning for Object Detection

Rosell, Mikael January 2015 (has links)
Many automotive safety applications in modern cars make use of cameras and object detection to analyze the surrounding environment. Pedestrians, animals and other vehicles can be detected and safety actions can be taken before dangerous situations arise. To detect occurrences of the different objects, these systems are traditionally trained to learn a classification model using a set of images that carry labels corresponding to their content. To obtain high performance with a variety of object appearances, the required amount of data is very large. Acquiring unlabeled images is easy, while the manual work of labeling is both time-consuming and costly. Semi-supervised learning refers to methods that utilize both labeled and unlabeled data, a situation that is highly desirable if it can lead to improved accuracy and at the same time alleviate the demand of labeled data. This has been an active area of research in the last few decades, but few studies have investigated the performance of these algorithms in larger systems. In this thesis, we investigate if and how semi-supervised learning can be used in a large-scale pedestrian detection system. With the area of application being automotive safety, where real-time performance is of high importance, the work is focused around boosting classifiers. Results are presented on a few publicly available UCI data sets and on a large data set for pedestrian detection captured in real-life traffic situations. By evaluating the algorithms on the pedestrian data set, we add the complexity of data set size, a large variety of object appearances and high input dimension. It is possible to find situations in low dimensions where an additional set of unlabeled data can be used successfully to improve a classification model, but the results show that it is hard to efficiently utilize semi-supervised learning in large-scale object detection systems. 
These approaches are also hard to scale to large data sets of higher dimension, as the pair-wise computations involved are of high complexity and proper similarity measures are hard to find.
119

Using Semi-supervised Clustering for Neurons Classification

Fakhraee Seyedabad, Ali January 2013 (has links)
We wish to understand the brain and discover its sophisticated ways of computation in order to invent improved computational methods. To decipher any complex system, its components must first be understood. The brain comprises neurons. Neurobiologists use morphologic properties like "somatic perimeter", "axonal length", and "number of dendrites" to classify neurons. They have discerned two types of neurons, "interneurons" and "pyramidal cells", and have reached a consensus about five classes of interneurons: PV, 2/3, Martinotti, Chandelier, and NPY. They still need a more refined classification of interneurons, because the known classes may contain subclasses or new classes may arise. This is a difficult process because of the great number and diversity of interneurons and the lack of objective indices with which to classify them. Machine learning, i.e. automatic learning from data, can overcome these difficulties, but it needs a data set to learn from. To meet this demand, neurobiologists compiled a data set by measuring 67 morphologic properties of 220 interneurons from mouse brains; they also labeled some of the samples, i.e. added their opinion about the samples' classes. This project aimed to use machine learning to determine the true number of classes within the data set, the classes of the unlabeled samples, and the accuracy of the available class labels. We used K-means, seeded K-means, constrained K-means, and clustering validity techniques to achieve our objectives. Our results indicate that the data set contains seven classes; that seeded K-means outperforms K-means and constrained K-means; and that Chandelier and 2/3 are the most consistent classes, whereas PV and Martinotti are the least consistent ones.
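Seeded K-means, the best performer in this study, differs from plain K-means only in initialization: the labeled "seed" samples fix the initial centroids (one per known class), after which standard Lloyd iterations run on all data. A minimal sketch under an assumed Euclidean metric:

```python
import numpy as np

def seeded_kmeans(X, seed_X, seed_y, n_iter=20):
    """K-means initialized from labeled seed points (seeded K-means sketch)."""
    classes = np.unique(seed_y)
    # initial centroids: per-class means of the seed samples
    centroids = np.stack([seed_X[seed_y == c].mean(axis=0) for c in classes])
    assign = None
    for _ in range(n_iter):
        # assign every point to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # recompute centroids from current assignments
        for j in range(len(classes)):
            members = X[assign == j]
            if len(members):  # keep old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return assign, centroids
```

Unlike constrained K-means, the seed labels are not enforced during the iterations here; they only anchor the starting centroids, which is often enough to steer the clustering toward the intended classes.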
120

Classification d’objets au moyen de machines à vecteurs supports dans les images de sonar de haute résolution du fond marin / Object classification using support vector machines in high resolution sonar seabed imagery

Rousselle, Denis 28 November 2016 (has links)
This thesis aims to improve the classification of underwater objects in high-resolution sonar images. In particular, we seek to distinguish mines from harmless objects within a collection of mine-like objects. Our research was guided by two classical constraints of mine warfare: first, the lack of data, and second, the need for readability of the classification decisions. In this context, we built a database as representative as possible and simulated objects in order to complete it. The lack of examples led us to use a compact representation originally developed for face recognition: the Structural Binary Gradient Patterns (SBGP). To the same end, we derived a semi-supervised domain adaptation method, based on optimal transport, that can be easily interpreted. Finally, we developed a new classification algorithm, the Ensemble of Exemplar-Maximum Excluding Ball (EE-MEB), which is suitable for small datasets and has an easily interpretable decision function.