  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Towards Robust Machine Learning Models for Data Scarcity

January 2020 (has links)
abstract: A well-designed and well-trained neural network can now yield state-of-the-art results across many domains, including data mining, computer vision, and medical image analysis. But progress has been limited for tasks where labels are difficult or impossible to obtain. This reliance on exhaustive labeling is a critical limitation to the rapid deployment of neural networks. Moreover, current research scales poorly to large numbers of unseen concepts and is passively spoon-fed with data and supervision. To overcome these data-scarcity and generalization issues, in my dissertation I first propose two unsupervised conventional machine learning algorithms, hyperbolic stochastic coding and multi-resemble multi-target low-rank coding, to address the incomplete-data and missing-label problems. I further introduce a deep multi-domain adaptation network that leverages the power of deep learning by transferring rich knowledge from a large labeled source dataset. I also introduce a novel time-sequence dynamically hierarchical network that adaptively simplifies the network to cope with scarce data. To learn a large number of unseen concepts, lifelong machine learning enjoys many advantages, including abstracting knowledge from prior learning and using that experience to help future learning, regardless of how much data is currently available. Incorporating this capability and making it versatile, I propose deep multi-task weight consolidation to accumulate knowledge continuously and significantly reduce data requirements across a variety of domains. Inspired by recent breakthroughs in automatically learning suitable neural network architectures (AutoML), I develop a nonexpansive AutoML framework to train an online model without an abundance of labeled data. This work automatically expands the network to increase model capability when necessary, then compresses the model to maintain efficiency. 
In my current ongoing work, I propose an alternative form of supervised learning that does not require direct labels. It uses various kinds of supervision derived from an image or object as target values for the target tasks, without explicit labels, and this turns out to be surprisingly effective. The proposed method requires only few-shot labeled data to train, can learn the information it needs in a self-supervised manner, and generalizes to datasets not seen during training. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2020
12

Urban Image Analysis with Convolutional Sparse Coding

Affara, Lama Ahmed 18 September 2018 (has links)
Urban image analysis is one of the most important problems lying at the intersection of computer graphics and computer vision research. In addition, Convolutional Sparse Coding (CSC) is a well-established image representation model especially suited for image restoration tasks. This dissertation handles urban image analysis using an asset extraction framework, studies CSC for the reconstruction of both urban and general images using supervised data, and proposes a better computational approach to CSC. Our asset extraction framework uses object proposals, which are currently used for increasing the computational efficiency of object detection. In this dissertation, we propose a novel adaptive pipeline for interleaving object proposals with object classification and use it as a formulation for asset detection. We first preprocess the images using a novel and efficient rectification technique. We then employ a particle filter approach to keep track of three priors, which guide proposed samples and get updated using classifier output. Tests performed on over 1000 urban images demonstrate that our rectification method is faster than existing methods without loss in quality, and that our interleaved proposal method outperforms the current state of the art. We further demonstrate that other methods can be improved by incorporating our interleaved proposals. We also extend the applicability of the CSC model by proposing a supervised approach to the problem, which aims at learning discriminative dictionaries instead of purely reconstructive ones. We incorporate a supervised regularization term into the traditional unsupervised CSC objective to encourage the final dictionary elements to be discriminative. Experimental results show that using supervised convolutional learning yields two key advantages. First, we learn more semantically relevant filters in the dictionary; second, we achieve improved image reconstruction on unseen data. 
We finally present two computational contributions to the state of the art in CSC. First, we significantly speed up the computation by proposing a new optimization framework that tackles the problem in the dual domain. Second, we extend the original formulation to higher dimensions in order to process a wider range of inputs, such as RGB images and videos. Our results show up to a 20-fold speedup over current state-of-the-art CSC solvers.
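The CSC model above represents a signal as a sum of short filters convolved with sparse coefficient maps, and frequency-domain formulations are fast because convolution becomes pointwise multiplication under the Fourier transform. A minimal NumPy sketch of the model and of that equivalence, with assumed sizes and a 1-D signal for simplicity (an illustration only, not the dissertation's solver):

```python
import numpy as np

# CSC model: signal ≈ sum_k (filter_k * map_k), with very sparse maps.
rng = np.random.default_rng(0)
filters = [rng.standard_normal(5) for _ in range(3)]   # short dictionary filters
maps = [np.zeros(96) for _ in range(3)]                # sparse coefficient maps
maps[0][10], maps[1][40], maps[2][70] = 1.0, -0.5, 2.0
signal = sum(np.convolve(z, d) for z, d in zip(maps, filters))  # length 96+5-1 = 100

# Convolution theorem: zero-padded FFT multiplication reproduces the
# linear convolution exactly, which is one reason Fourier-domain solvers are fast.
n = 100
via_fft = sum(np.fft.irfft(np.fft.rfft(z, n) * np.fft.rfft(d, n), n)
              for z, d in zip(maps, filters))
print(np.allclose(signal, via_fft))  # True
```

Because the padded length equals the full linear-convolution length, circular and linear convolution coincide here, so the two reconstructions match to machine precision.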
13

High-spatial-resolution diffusion MRI acquisitions: new perspectives through spatially adaptive and angular denoising

St-Jean, Samuel January 2015 (has links)
The early 2000s saw the mapping of the human genome completed after 13 years of research. The challenge of the coming century lies in constructing the human connectome, which consists of mapping the brain's connections using diffusion magnetic resonance imaging (MRI). This technique makes it possible to study the brain's white matter completely non-invasively. Although the challenge is monumental, the resolution of an MRI image sits at the macroscopic scale and is roughly 1000 times coarser than the axons that must be mapped. To help mitigate this problem, this thesis proposes a new denoising technique specifically designed for diffusion imaging. The Non Local Spatial and Angular Matching (NLSAM) algorithm builds on the principles of block matching and dictionary learning to exploit the redundancy of diffusion MRI data. A thresholding over angular neighbors is also performed via sparse coding, where the l2-norm reconstruction error is bounded by the local noise variance. The algorithm is also designed to handle the bias of Rician and non-central Chi noise, since MRI images contain non-Gaussian noise. This makes it possible to acquire diffusion MRI data at a higher spatial resolution than currently available in clinical settings. This work thus paves the way to a better type of acquisition, which could help reveal new anatomical details not discernible at the spatial resolution currently used by the diffusion MRI community. It could also eventually help identify new biomarkers for understanding degenerative diseases such as multiple sclerosis, Alzheimer's disease, and Parkinson's disease.
14

Sparse coding for machine learning, image processing and computer vision

Mairal, Julien 30 November 2010 (has links)
We study in this thesis a particular machine learning approach for representing signals, which consists of modeling data as linear combinations of a few elements from a learned dictionary. It can be viewed as an extension of the classical wavelet framework, whose goal is to design such dictionaries (often orthonormal bases) adapted to natural signals. An important success of dictionary learning methods has been their ability to model natural image patches and the performance of the image denoising algorithms it has yielded. 
We address several open questions related to this framework: How can the dictionary be optimized efficiently? How can the model be enriched by adding structure to the dictionary? Can current image processing tools based on this method be further improved? How should one learn the dictionary when it is used for a task other than signal reconstruction? How can it be used for solving computer vision problems? We answer these questions with a multidisciplinary approach, using tools from statistical machine learning, convex and stochastic optimization, image and signal processing, and computer vision, but also optimization on graphs.
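The sparse coding step at the heart of this framework, representing a signal as a sparse linear combination of dictionary atoms, is typically posed as a Lasso problem. A minimal NumPy sketch of one classical solver, iterative shrinkage-thresholding (ISTA), on a fixed random dictionary (a generic illustration with assumed sizes and regularization; the thesis develops far more efficient algorithms):

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, y, lam, n_iter=500):
    """ISTA for the Lasso: min_x 0.5*||y - D x||^2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the smooth part
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        # gradient step on the quadratic term, then shrinkage on the l1 term
        x = soft_threshold(x + D.T @ (y - D @ x) / L, lam / L)
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 40))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
y = rng.standard_normal(20)
lam = 0.5
x = ista(D, y, lam)
obj = 0.5 * np.sum((y - D @ x) ** 2) + lam * np.sum(np.abs(x))
print(obj <= 0.5 * np.sum(y ** 2))       # True: never worse than the all-zero code
```

ISTA with step size 1/L decreases the objective monotonically, so starting from the zero vector the final code is guaranteed to score at least as well as coding nothing at all.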
15

Time warp invariant sparse coding and dictionary learning for time series classification and clustering

Varasteh Yazdi, Saeed 15 November 2018 (has links)
Learning dictionaries for the sparse representation of time series is an important problem for extracting latent temporal features, revealing salient primitives, and sparsely representing complex temporal data. 
This thesis addresses the sparse coding and dictionary learning problem for time series classification and clustering under time warp. For that, we propose a time warp invariant sparse coding and dictionary learning framework where both input samples and atoms define time series of different lengths that involve varying delays. In the first part, we formalize an L0 sparse coding problem and propose a time warp invariant orthogonal matching pursuit based on a new cosine maximization time warp operator. For the dictionary learning stage, a non-linear time warp invariant kSVD (TWI-kSVD) is proposed. Thanks to a rotation transformation between each atom and its sibling atoms, a singular value decomposition is used to jointly approximate the coefficients and update the dictionary, similar to the standard kSVD. In the second part, a time warp invariant dictionary learning for time series clustering is formalized and a gradient descent solution is proposed. The proposed methods are compared against major shift-invariant, convolved, and kernel dictionary learning methods on several public and real temporal datasets. The conducted experiments show the potential of the proposed frameworks to efficiently sparsely represent, classify, and cluster time series under time warp.
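The time-warp-invariant operators above build on dynamic-programming alignments between series of different lengths with varying delays. For reference, a generic sketch of the classic DTW recursion with absolute-difference cost (a standard baseline, not the thesis's cosine-maximization operator):

```python
import numpy as np

def dtw(a, b):
    """Classic dynamic time warping distance between two 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible alignment moves
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

x = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
x_delayed = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0])  # same shape, delayed onset
print(dtw(x, x_delayed))  # 0.0: the warping path absorbs the delay
```

Note that the two series have different lengths yet zero DTW cost, which is exactly the invariance the thesis's sparse coding framework needs between samples and atoms.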
16

Multi-Column Neural Networks and Sparse Coding: Novel Techniques in Machine Learning

Hoori, Ammar O 01 January 2019 (has links)
Accurate and fast machine learning (ML) algorithms are vital in artificial intelligence (AI) applications. On complex dataset problems, traditional ML methods such as the radial basis function neural network (RBFN), sparse coding (SC) using dictionary learning, and particle swarm optimization (PSO) can produce poor results, large structures, slow training, and/or slow testing. This dissertation introduces four novel ML techniques: the multi-column RBFN network (MCRN), the projected dictionary learning algorithm (PDL), and the multi-column adaptive and non-adaptive particle swarm optimization techniques (MC-APSO and MC-PSO). These techniques provide efficient alternatives to traditional ML methods, demonstrating more accurate results, faster training and testing times, and parallelized structured solutions. MCRN deploys small RBFNs in a parallel structure to speed up both training and testing; each RBFN is trained on a subset of the dataset, and the overall structure yields more accurate results. PDL introduces a conceptual dictionary learning method that updates the dictionary atoms with the reconstructed input blocks, improving the sparsity of the extracted features and hence the image denoising results. MC-PSO and MC-APSO provide fast and more accurate alternatives to the slow evolutionary PSO and APSO techniques, using a multi-column parallelized RBFN structure to improve results and speed on a wide range of classification dataset problems. The novel techniques are trained and tested on benchmark dataset problems, and the results are compared with state-of-the-art counterpart techniques to evaluate their performance. In most experiments the novel techniques show superior accuracy and speed, making them good alternatives for solving difficult ML problems.
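For reference, a radial basis function network with fixed centers reduces to a linear least-squares fit of the output weights over Gaussian activations. A minimal single-column sketch with assumed sizes and width parameter (the dissertation's MCRN instead trains many small RBFNs in parallel on subsets of the data):

```python
import numpy as np

def rbf_design(X, centers, gamma):
    """Gaussian activations: phi[i, j] = exp(-gamma * ||x_i - c_j||^2)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def rbfn_fit(X, y, centers, gamma):
    # least-squares output weights over the fixed radial basis
    w, *_ = np.linalg.lstsq(rbf_design(X, centers, gamma), y, rcond=None)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 2))
y = np.sin(X[:, 0]) + X[:, 1]
centers = X[:10]                          # e.g. a subset of the data as centers
w = rbfn_fit(X, y, centers, gamma=0.5)
pred = rbf_design(X, centers, gamma=0.5) @ w
print(np.linalg.norm(pred - y) <= np.linalg.norm(y))  # True: fit beats the zero predictor
```

Because training each RBFN is just one least-squares solve, splitting the data across many small columns, as MCRN does, parallelizes naturally.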
17

Panic Detection in Human Crowds using Sparse Coding

Kumar, Abhishek 21 August 2012 (has links)
Recently, the surveillance of human activities has drawn a lot of attention from the research community, and camera-based surveillance is being attempted with the aid of computers. Cameras are used extensively for surveilling human activities; however, placing cameras and transmitting visual data is not the end of a surveillance system. Surveillance needs to detect abnormal or unwanted activities, which are very infrequent compared to regular activities. At present, surveillance is done manually: operators watch a set of surveillance video screens to discover abnormal events. This is expensive and prone to error. The limitations of these surveillance systems could be effectively removed by an automated anomaly detection system. With powerful computers, computer vision is seen as a panacea for surveillance: a computer-vision-aided anomaly detection system enables the selection of only those video frames that contain an anomaly for manual verification. A panic is a type of anomaly in a human crowd that appears when a group of people starts to move faster than usual. Such situations can arise from a frightening event near a crowd, such as a fight, robbery, or riot. A variety of computer vision algorithms have been developed to detect panic in human crowds; however, most of the proposed algorithms are computationally expensive and hence too slow to run in real time. Dictionary learning is a robust tool for modelling a behaviour as a linear combination of dictionary elements. A few panic detection algorithms have shown high accuracy using dictionary learning; however, the dictionary learning approach is computationally expensive. Orthogonal matching pursuit (OMP) is an inexpensive way to model a behaviour using dictionary elements, and in this research OMP is used to design a panic detection algorithm. 
The proposed algorithm has been tested on two datasets and results are found to be comparable to state-of-the-art algorithms.
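Orthogonal matching pursuit, the inexpensive coding step named above, greedily picks the dictionary atom most correlated with the current residual and then re-fits all coefficients by least squares on the chosen support. A minimal NumPy sketch with assumed dictionary sizes (a generic illustration of OMP, not the thesis's detector):

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy OMP: select atoms by residual correlation, refit by least squares."""
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with residual
        if j not in support:
            support.append(j)
        # least-squares refit of the coefficients on the current support
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = y - D @ coef
    return coef

rng = np.random.default_rng(0)
D = rng.standard_normal((30, 50))
D /= np.linalg.norm(D, axis=0)                       # unit-norm atoms
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 0.7]
y = D @ x_true                                       # a 3-sparse synthetic signal
x_hat = omp(D, y, n_nonzero=3)
print(np.count_nonzero(x_hat) <= 3)                  # True: at most 3 atoms used
```

Each iteration costs only a matrix-vector product and a small least-squares solve, which is why OMP is attractive when dictionary-learning-based detection must run fast.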
18

On sparse representations and new meta-learning paradigms for representation learning

Mehta, Nishant A. 27 August 2014 (has links)
Given the "right" representation, learning is easy. This thesis studies representation learning and meta-learning, with a special focus on sparse representations. Meta-learning is fundamental to machine learning: it amounts to learning to learn. The presentation unfolds in two parts. In the first part, we establish learning-theoretic results for learning sparse representations. The second part introduces new multi-task and meta-learning paradigms for representation learning. On the sparse representations front, our main pursuit is generalization error bounds to support a supervised dictionary learning model for Lasso-style sparse coding. Such predictive sparse coding algorithms have been applied with much success in the literature; even more common have been applications of unsupervised sparse coding followed by supervised linear hypothesis learning. We present two generalization error bounds for predictive sparse coding, handling the overcomplete setting (more learned features than original dimensions) and the infinite-dimensional setting. Our analysis led to a fundamental stability result for the Lasso, showing the stability of the solution vector under perturbations of the design matrix. We also introduce and analyze new multi-task models for (unsupervised) sparse coding and predictive sparse coding, allowing one dictionary per task but with sharing between the tasks' dictionaries. The second part introduces new meta-learning paradigms that realize unprecedented types of learning guarantees for meta-learning. Specifically sought are guarantees on a meta-learner's performance on new tasks encountered in an environment of tasks. Nearly all previous work produced bounds on the expected risk, whereas we produce tail bounds on the risk, thereby providing performance guarantees on the risk for a single new task drawn from the environment. The new paradigms include minimax multi-task learning (minimax MTL) and sample variance penalized meta-learning (SVP-ML). 
Regarding minimax MTL, we provide a high probability learning guarantee on its performance on individual tasks encountered in the future, the first of its kind. We also present two continua of meta-learning formulations, each interpolating between classical multi-task learning and minimax multi-task learning. The idea of SVP-ML is to minimize the task average of the training tasks' empirical risks plus a penalty on their sample variance. Controlling this sample variance can potentially yield a faster rate of decrease for upper bounds on the expected risk of new tasks, while also yielding high probability guarantees on the meta-learner's average performance over a draw of new test tasks. An algorithm is presented for SVP-ML with feature selection representations, as well as a quite natural convex relaxation of the SVP-ML objective.
19

Storing sequences in binary neural networks with high efficiency

JIANG, Xiaoran 08 January 2014 (has links) (PDF)
Sequential structure imposed by the forward linear progression of time is omnipresent in all cognitive behaviors. This thesis proposes a novel model to store sequences of any length, scalar or vectorial, in binary neural networks. In particular, the model we introduce resolves some well-known problems in sequential learning, such as error intolerance, catastrophic forgetting, and interference when storing complex sequences. The total amount of sequential information the network can store grows quadratically with the number of nodes, and the efficiency (the ratio between the capacity and the total amount of information consumed by the storage device) can reach around 30%. This work can be considered an extension of the non-oriented clique-based neural networks previously proposed and studied within our team. Those networks, composed of binary neurons and binary connections, exploit graph redundancy and sparsity to achieve a quadratic learning diversity. To obtain the ability to store sequences, connections are given an orientation to form a tournament-based neural network. This is more natural biologically speaking, since communication between neurons is unidirectional, from axons to synapses. Any component of the network, a cluster or a node, can be revisited several times within a sequence or by multiple sequences. This allows the network to store sequences of any length, independent of the number of clusters and limited only by the total available resources of the network. Moreover, to allow error correction and provide robustness, both spatial assembly redundancy and sequential redundancy, with or without anticipation, may be combined to offer a large amount of redundancy in the activation of a node. Subsequently, a double-layered structure is introduced for accurate retrieval. 
The lower layer of tournament-based hetero-associative network stores sequential oriented associations between patterns. An upper auto-associative layer of mirror nodes is superposed to emphasize the co-occurrence of the elements belonging to the same pattern, in the form of a clique. This model is then extended to a hierarchical structure, which helps resolve the interference issue while storing complex sequences. This thesis also contributes in proposing and assessing new decoding rules with respect to sparse messages in order to fully benefit from the theoretical quadratic law of the learning diversity. Besides the performance aspect, the biological plausibility is also a constant concern during this thesis work.
