Global ETD Search

61	Discriminant Analysis for Longitudinal Data Matira, Kevin January 2017 (has links) Various approaches for discriminant analysis of longitudinal data are investigated, with some focus on model-based approaches. The latter are typically based on the modi ed Cholesky decomposition of the covariance matrix in a Gaussian mixture; however, non-Gaussian mixtures are also considered. Where applicable, the Bayesian information criterion is used to select the number of components per class. The various approaches are demonstrated on real and simulated data. / Thesis / Master of Science (MSc) mixture models supervised learning longitudinal data classification statistical learning
62	Self-Supervised Remote Sensing Image Change Detection and Data Fusion Chen, Yuxing 27 November 2023 (has links) Self-supervised learning models, which are called foundation models, have achieved great success in computer vision. Meanwhile, the limited access to labeled data has driven the development of self-supervised methods in remote sensing tasks. In remote sensing image change detection, the generative models are extensively utilized in unsupervised binary change detection tasks, while they overly focus on pixels rather than on abstract feature representations. In addition, the state-of-the-art satellite image time series change detection approaches fail to effectively leverage the spatial-temporal information of image time series or generalize well to unseen scenarios. Similarly, in the context of multimodal remote sensing data fusion, the recent successes of deep learning techniques mainly focus on specific tasks and complete data fusion paradigms. These task-specific models lack of generalizability to other remote sensing tasks and become overfitted to the dominant modalities. Moreover, they fail to handle incomplete modalities inputs and experience severe degradation in downstream tasks. To address these challenges associated with individual supervised learning models, this thesis presents two novel contributions to self-supervised learning models on remote sensing image change detection and multimodal remote sensing data fusion. The first contribution proposes a bi-temporal / multi-temporal contrastive change detection framework, which employs contrastive loss on image patches or superpixels to get fine-grained change maps and incorporates an uncertainty method to enhance the temporal robustness. In the context of satellite image time series change detection, the proposed approach improves the consistency of pseudo labels through feature tracking and tackles the challenges posed by seasonal changes in long-term remote sensing image time series using supervised contrastive loss and the random walk loss in ConvLSTM. The second contribution develops a self-supervised multimodal RS data fusion framework, with a specific focus on addressing the incomplete multimodal RS data fusion challenges in downstream tasks. Within this framework, multimodal RS data are fused by applying a multi-view contrastive loss at the pixel level and reconstructing each modality using others in a generative way based on MultiMAE. In downstream tasks, the proposed approach leverages a random modality combination training strategy and an attention block to enable fusion across modal-incomplete inputs. The thesis assesses the effectiveness of the proposed self-supervised change detection approach on single-sensor and cross-sensor datasets of SAR and multispectral images, and evaluates the proposed self-supervised multimodal RS data fusion approach on the multimodal RS dataset with SAR, multispectral images, DEM and also LULC maps. The self-supervised change detection approach demonstrates improvements over state-of-the-art unsupervised change detection methods in challenging scenarios involving multi-temporal and multi-sensor RS image change detection. Similarly, the self-supervised multimodal remote sensing data fusion approach achieves the best performance by employing an intermediate fusion strategy on SAR and optical image pairs, outperforming existing unsupervised data fusion approaches. Notably, in incomplete multimodal fusion tasks, the proposed method exhibits impressive performance on all modal-incomplete and single modality inputs, surpassing the performance of vanilla MultiViT, which tends to overfit on dominant modality inputs and fails in tasks with single modality inputs.
63	Self-supervised Learning Methods for Vision-based Tasks Turrisi Da Costa, Victor Guilherme 22 May 2024 (has links) Dealing with large amounts of unlabeled data is a very challenging task. Recently, many different approaches have been proposed to leverage this data for training many machine learning models. Among them, self-supervised learning appears as an efficient solution capable of training powerful and generalizable models. More specifically, instead of relying on human-generated labels, it proposes training objectives that use ``labels'' generated from the data itself, either via data augmentation or by masking the data in some way and trying to reconstruct it. Apart from being able to train models from scratch, self-supervised methods can also be used in specific applications to further improve a pre-trained model. In this thesis, we propose to leverage self-supervised methods in novel ways to tackle different application scenarios. We present four published papers: an open-source library for self-supervised learning that is flexible, scalable, and easy to use; two papers tackling unsupervised domain adaptation in action recognition; and one paper on self-supervised learning for continual learning. The published papers highlight that self-supervised techniques can be leveraged for many scenarios, yielding state-of-the-art results.
64	Design Optimization of Fuzzy Logic Systems Dadone, Paolo 29 May 2001 (has links) Fuzzy logic systems are widely used for control, system identification, and pattern recognition problems. In order to maximize their performance, it is often necessary to undertake a design optimization process in which the adjustable parameters defining a particular fuzzy system are tuned to maximize a given performance criterion. Some data to approximate are commonly available and yield what is called the supervised learning problem. In this problem we typically wish to minimize the sum of the squares of errors in approximating the data. We first introduce fuzzy logic systems and the supervised learning problem that, in effect, is a nonlinear optimization problem that at times can be non-differentiable. We review the existing approaches and discuss their weaknesses and the issues involved. We then focus on one of these problems, i.e., non-differentiability of the objective function, and show how current approaches that do not account for non-differentiability can diverge. Moreover, we also show that non-differentiability may also have an adverse practical impact on algorithmic performances. We reformulate both the supervised learning problem and piecewise linear membership functions in order to obtain a polynomial or factorable optimization problem. We propose the application of a global nonconvex optimization approach, namely, a reformulation and linearization technique. The expanded problem dimensionality does not make this approach feasible at this time, even though this reformulation along with the proposed technique still bears a theoretical interest. Moreover, some future research directions are identified. We propose a novel approach to step-size selection in batch training. This approach uses a limited memory quadratic fit on past convergence data. Thus, it is similar to response surface methodologies, but it differs from them in the type of data that are used to fit the model, that is, already available data from the history of the algorithm are used instead of data obtained according to an experimental design. The step-size along the update direction (e.g., negative gradient or deflected negative gradient) is chosen according to a criterion of minimum distance from the vertex of the quadratic model. This approach rescales the complexity in the step-size selection from the order of the (large) number of training data, as in the case of exact line searches, to the order of the number of parameters (generally lower than the number of training data). The quadratic fit approach and a reduced variant are tested on some function approximation examples yielding distributions of the final mean square errors that are improved (i.e., skewed toward lower errors) with respect to the ones in the commonly used pattern-by-pattern approach. Moreover, the quadratic fit is also competitive and sometimes better than the batch training with optimal step-sizes, thus showing an improved performance of this approach. The quadratic fit approach is also tested in conjunction with gradient deflection strategies and memoryless variable metric methods, showing errors smaller by 1 to 7 orders of magnitude. Moreover, the convergence speed by using either the negative gradient direction or a deflected direction is higher than that of the pattern-by-pattern approach, although the computational cost of the algorithm per iteration is moderately higher than the one of the pattern-by-pattern method. Finally, some directions for future research are identified. / Ph. D. Non-differentiable optimization Supervised learning Optimization Fuzzy logic systems
65	General discriminative optimization for point set registration Zhao, Y., Tang, W., Feng, J., Wan, Tao Ruan, Xi, L. 26 March 2022 (has links) Yes / Point set registration has been actively studied in computer vision and graphics. Optimization algorithms are at the core of solving registration problems. Traditional optimization approaches are mainly based on the gradient of objective functions. The derivation of objective functions makes it challenging to find optimal solutions for complex optimization models, especially for those applications where accuracy is critical. Learning-based optimization is a novel approach to address this problem, which learns the gradient direction from datasets. However, many learning-based optimization algorithms learn gradient directions via a single feature extracted from the dataset, which will cause the updating direction to be vulnerable to perturbations around the data, thus falling into a bad stationary point. This paper proposes the General Discriminative Optimization (GDO) method that updates a gradient path automatically through the trade-off among contributions of different features on updating gradients. We illustrate the benefits of GDO with tasks of 3D point set registrations and show that GDO outperforms the state-of-the-art registration methods in terms of accuracy and robustness to perturbations. Point set registration Supervised learning Learning-based optimisation
66	Defect prediction on production line Khalfaoui, S., Manouvrier, E., Briot, A., Delaux, D., Butel, S., Ibrahim, Jesutofunmi, Kanyere, Tatenda, Orimogunje, Bola, Abdullatif, Amr A.A., Neagu, Daniel 29 March 2022 (has links) Yes / Quality control has long been one of the most challenging fields of manufacturing. The development of advanced sensors and the easier collection of high amounts of data designate the machine learning techniques as a timely natural step forward to leverage quality decision support and manufacturing challenges. This paper introduces an original dataset provided by the automotive supplier company VALEO, coming from a production line, and hosted by the École Normale Supérieure (ENS) Data Challenge to predict defects using non-anonymised features, without access to final test results, to validate the part status (defective or not). We propose in this paper a complete workflow from data exploration to the modelling phase while addressing at each stage challenges and techniques to solve them, as a benchmark reference. The proposed workflow is validated in series of experiments that demonstrate the benefits, challenges and impact of data science adoption in manufacturing. Manufacturing Quality control Defect prediction Machine learning Supervised learning
67	Semi-Supervised Gait Recognition Mitra, Sirshapan 01 January 2024 (has links) (PDF) In this work, we examine semi-supervised learning for Gait recognition with a limited number of labeled samples. Our research focus on two distinct aspects for limited labels, 1)closed-set: with limited labeled samples per individual, and 2) open-set: with limited labeled individuals. We find open-set poses greater challenge compared to closed-set thus, having more labeled ids is important for performance than having more labeled samples per id. Moreover, obtaining labeled samples for a large number of individuals is usually more challenging, therefore limited id setup (closed-setup) is more important to study where most of the training samples belong to unknown ids. We further analyze that existing semi-supervised learning approaches are not well suited for scenario where unlabeled samples belong to novel ids. We propose a simple prototypical self-training approach to solve this problem, where, we integrate semi-supervised learning for closed set setting with self-training which can effectively utilize unlabeled samples from unknown ids. To further alleviate the challenges of limited labeled samples, we explore the role of synthetic data where we utilize diffusion model to generate samples from both known and unknown ids. We perform our experiments on two different Gait recognition benchmarks, CASIA-B and OUMVLP, and provide a comprehensive evaluation of the proposed method. The proposed approach is effective and generalizable for both closed and open-set settings. With merely 20% of labeled samples, we were able to achieve performance competitive to supervised methods utilizing 100% labeled samples while outperforming existing semi-supervised methods. Deep Learning Semi-Supervised Learning Gait Recognition Computer Sciences
68	On discriminative semi-supervised incremental learning with a multi-view perspective for image concept modeling Byun, Byungki 17 January 2012 (has links) This dissertation presents the development of a semi-supervised incremental learning framework with a multi-view perspective for image concept modeling. For reliable image concept characterization, having a large number of labeled images is crucial. However, the size of the training set is often limited due to the cost required for generating concept labels associated with objects in a large quantity of images. To address this issue, in this research, we propose to incrementally incorporate unlabeled samples into a learning process to enhance concept models originally learned with a small number of labeled samples. To tackle the sub-optimality problem of conventional techniques, the proposed incremental learning framework selects unlabeled samples based on an expected error reduction function that measures contributions of the unlabeled samples based on their ability to increase the modeling accuracy. To improve the convergence property of the proposed incremental learning framework, we further propose a multi-view learning approach that makes use of multiple features such as color, texture, etc., of images when including unlabeled samples. For robustness to mismatches between training and testing conditions, a discriminative learning algorithm, namely a kernelized maximal- figure-of-merit (kMFoM) learning approach is also developed. Combining individual techniques, we conduct a set of experiments on various image concept modeling problems, such as handwritten digit recognition, object recognition, and image spam detection to highlight the effectiveness of the proposed framework. Discriminative learning Semi-supervised learning Incremental learning Image modeling Multi-view learning Machine learning Supervised learning (Machine learning) Boosting (Algorithms)
69	Learning with Limited Supervision by Input and Output Coding Zhang, Yi 01 May 2012 (has links) In many real-world applications of supervised learning, only a limited number of labeled examples are available because the cost of obtaining high-quality examples is high. Even with a relatively large number of labeled examples, the learning problem may still suffer from limited supervision as the complexity of the prediction function increases. Therefore, learning with limited supervision presents a major challenge to machine learning. With the goal of supervision reduction, this thesis studies the representation, discovery and incorporation of extra input and output information in learning. Information about the input space can be encoded by regularization. We first design a semi-supervised learning method for text classification that encodes the correlation of words inferred from seemingly irrelevant unlabeled text. We then propose a multi-task learning framework with a matrix-normal penalty, which compactly encodes the covariance structure of the joint input space of multiple tasks. To capture structure information that is more general than covariance and correlation, we study a class of regularization penalties on model compressibility. Then we design the projection penalty, which encodes the structure information from a dimension reduction while controlling the risk of information loss. Information about the output space can be exploited by error correcting output codes. Using the composite likelihood view, we propose an improved pairwise coding for multi-label classification, which encodes pairwise label density (as opposed to label comparisons) and decodes using variational methods. We then investigate problemdependent codes, where the encoding is learned from data instead of being predefined. We first propose a multi-label output code using canonical correlation analysis, where predictability of the code is optimized. We then argue that both discriminability and predictability are critical for output coding, and propose a max-margin formulation that promotes both discriminative and predictable codes. We empirically study our methods in a wide spectrum of applications, including document categorization, landmine detection, face recognition, brain signal classification, handwritten digit recognition, house price forecasting, music emotion prediction, medical decision, email analysis, gene function classification, outdoor scene recognition, and so forth. In all these applications, our proposed methods for encoding input and output information lead to significantly improved prediction performance. regularization error-correcting output codes supervised learning semi-supervised learning multi-task learning multi-label classification dimensionality reduction Computer Sciences
70	Semi-Supervised Learning for Object Detection Rosell, Mikael January 2015 (has links) Many automotive safety applications in modern cars make use of cameras and object detection to analyze the surrounding environment. Pedestrians, animals and other vehicles can be detected and safety actions can be taken before dangerous situations arise. To detect occurrences of the different objects, these systems are traditionally trained to learn a classification model using a set of images that carry labels corresponding to their content. To obtain high performance with a variety of object appearances, the required amount of data is very large. Acquiring unlabeled images is easy, while the manual work of labeling is both time-consuming and costly. Semi-supervised learning refers to methods that utilize both labeled and unlabeled data, a situation that is highly desirable if it can lead to improved accuracy and at the same time alleviate the demand of labeled data. This has been an active area of research in the last few decades, but few studies have investigated the performance of these algorithms in larger systems. In this thesis, we investigate if and how semi-supervised learning can be used in a large-scale pedestrian detection system. With the area of application being automotive safety, where real-time performance is of high importance, the work is focused around boosting classifiers. Results are presented on a few publicly available UCI data sets and on a large data set for pedestrian detection captured in real-life traffic situations. By evaluating the algorithms on the pedestrian data set, we add the complexity of data set size, a large variety of object appearances and high input dimension. It is possible to find situations in low dimensions where an additional set of unlabeled data can be used successfully to improve a classification model, but the results show that it is hard to efficiently utilize semi-supervised learning in large-scale object detection systems. The results are hard to scale to large data sets of higher dimensions as pair-wise computations are of high complexity and proper similarity measures are hard to find. semi-supervised learning object detection pedestrian detection boosting machine learning supervised learning adaboost semiboost regboost self-learning

Search results