  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Learning Linear, Sparse, Factorial Codes

Olshausen, Bruno A. 01 December 1996 (has links)
In previous work (Olshausen & Field 1996), an algorithm was described for learning linear sparse codes which, when trained on natural images, produces a set of basis functions that are spatially localized, oriented, and bandpass (i.e., wavelet-like). This note shows how the algorithm may be interpreted within a maximum-likelihood framework. Several useful insights emerge from this connection: it makes explicit the relation to statistical independence (i.e., factorial coding), it shows a formal relationship to the algorithm of Bell and Sejnowski (1995), and it suggests how to adapt parameters that were previously fixed.
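The kind of model this abstract describes — a linear generative model with a sparsity-inducing prior, learned by alternating sparse inference with updates to the basis functions — can be sketched as follows. This is a hedged illustration, not Olshausen & Field's original algorithm: it uses ISTA (iterative soft-thresholding) for inference and a plain gradient step on the dictionary, and all function names, parameters, and the toy data are illustrative.

```python
import numpy as np

def soft_threshold(u, t):
    """Proximal operator of the L1 norm (the sparsity-inducing prior)."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def infer_codes(X, D, lam=0.1, n_steps=50):
    """ISTA inference: find sparse coefficients A such that X ~ D @ A."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the smooth part
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_steps):
        grad = D.T @ (D @ A - X)           # gradient of 0.5*||X - D A||^2
        A = soft_threshold(A - grad / L, lam / L)
    return A

def learn_dictionary(X, n_basis=32, lam=0.1, n_iters=20, lr=0.05, seed=0):
    """Alternate sparse inference and a gradient step on the basis functions."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_basis))
    D /= np.linalg.norm(D, axis=0)         # keep basis functions unit-norm
    for _ in range(n_iters):
        A = infer_codes(X, D, lam)
        D -= lr * (D @ A - X) @ A.T / X.shape[1]
        D /= np.linalg.norm(D, axis=0)
    return D

# toy data standing in for whitened image patches
rng = np.random.default_rng(1)
X = rng.standard_normal((64, 200))
D = learn_dictionary(X)
A = infer_codes(X, D)
print(D.shape, float(np.mean(A == 0)))     # basis shape, fraction of exact zeros
```

Under the maximum-likelihood reading the note develops, the L1 penalty corresponds to a Laplacian prior on the coefficients, and the dictionary step is a gradient step on the data log-likelihood.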
2

Receptive field structures for recognition

Balas, Benjamin, Sinha, Pawan 01 March 2005 (has links)
Localized operators, like Gabor wavelets and difference-of-Gaussian filters, are considered to be useful tools for image representation. This is due to their ability to form a ‘sparse code’ that can serve as a basis set for high-fidelity reconstruction of natural images. However, for many visual tasks, the more appropriate criterion of representational efficacy is ‘recognition’, rather than ‘reconstruction’. It is unclear whether simple local features provide the stability necessary to subserve robust recognition of complex objects. In this paper, we search the space of two-lobed differential operators for those that constitute a good representational code under recognition/discrimination criteria. We find that a novel operator, which we call the ‘dissociated dipole’, displays useful properties in this regard. We describe simple computational experiments to assess the merits of such dipoles relative to the more traditional local operators. The results suggest that non-local operators constitute a vocabulary that is stable across a range of image transformations.
3

Effective Gene Expression Annotation Approaches for Mouse Brain Images

January 2016 (has links)
abstract: Understanding the temporal and spatial characteristics of gene expression over brain development is one of the crucial research topics in neuroscience. Accurately describing the locations and expression status of the relevant genes requires extensive experimental resources. The Allen Developing Mouse Brain Atlas provides a large number of in situ hybridization (ISH) images of gene expression across seven mouse brain developmental stages; studying mouse brain models helps us understand gene expression in the human brain. The atlas covers thousands of genes, which are currently annotated manually by biologists. Because of the high labor cost of manual annotation, an efficient approach to automated gene expression annotation of mouse brain images is needed. In this thesis, a novel and efficient approach based on a machine learning framework is proposed. Features are extracted from raw brain images, and both binary and multi-class classification models are built with supervised learning methods. For feature generation, one of the most widely adopted methods in current research is the bag-of-words (BoW) algorithm; however, neither its efficiency nor its accuracy is satisfactory on large-scale data. Thus, an augmented sparse coding method called Stochastic Coordinate Coding is adopted to generate high-level features. In addition, a new multi-label classification model is proposed, in which a label hierarchy is built from the given brain ontology structure. Experiments conducted on the atlas show that the approach is efficient and classifies the images with relatively high accuracy. / Dissertation/Thesis / Masters Thesis Computer Science 2016
4

An Equivalence Between Sparse Approximation and Support Vector Machines

Girosi, Federico 01 May 1997 (has links)
In the first part of this paper we show a similarity between the principle of Structural Risk Minimization (SRM) (Vapnik, 1982) and the idea of Sparse Approximation, as defined in Chen, Donoho and Saunders (1995) and Olshausen and Field (1996). We then focus on two specific (approximate) implementations of SRM and Sparse Approximation which have been used to solve the problem of function approximation. For SRM we consider the Support Vector Machine technique proposed by V. Vapnik and his team at AT&T Bell Labs, and for Sparse Approximation we consider a modification of the Basis Pursuit De-Noising algorithm proposed by Chen, Donoho and Saunders (1995). We show that, under certain conditions, these two techniques are equivalent: they give the same solution and they require the solution of the same quadratic programming problem.
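The quadratic program behind Basis Pursuit De-Noising can be made concrete with the classical variable-splitting trick: writing the coefficients as a = p − q with p, q ≥ 0 turns the L1-penalized least-squares objective into a smooth bound-constrained QP. The sketch below is illustrative only — it uses a generic bound-constrained solver rather than any algorithm from the paper, and the problem sizes and names are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def bpdn_qp(D, x, eps=0.1):
    """Solve min_a 0.5*||x - D a||^2 + eps*||a||_1 as a QP over p, q >= 0, a = p - q."""
    n = D.shape[1]
    H = D.T @ D
    c = D.T @ x

    def obj(z):
        a = z[:n] - z[n:]
        return 0.5 * a @ H @ a - c @ a + 0.5 * x @ x + eps * np.sum(z)

    def grad(z):
        g = H @ (z[:n] - z[n:]) - c
        return np.concatenate([g + eps, -g + eps])

    bounds = [(0.0, None)] * (2 * n)       # nonnegativity of p and q
    res = minimize(obj, np.zeros(2 * n), jac=grad, bounds=bounds, method="L-BFGS-B")
    return res.x[:n] - res.x[n:]

# recover a 2-sparse coefficient vector from noisy measurements
rng = np.random.default_rng(0)
D = rng.standard_normal((30, 10))
a_true = np.zeros(10)
a_true[[2, 7]] = [1.5, -2.0]
x = D @ a_true + 0.01 * rng.standard_normal(30)
a = bpdn_qp(D, x)
print(np.round(a, 2))
```

The paper's equivalence statement concerns exactly this kind of QP: under the stated conditions, the SVM dual and the (modified) BPDN problem share the same quadratic form and constraints, hence the same solution.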
5

Image/Video Deblocking via Sparse Representation

Chiou, Yi-Wen 08 September 2012 (has links)
Blocking artifacts, characterized by visually noticeable discontinuities in pixel values along block boundaries, are a common problem in block-based image/video compression, especially at low bitrates. Various post-processing techniques have been proposed to reduce them, but they usually introduce excessive blurring or ringing. This paper proposes a self-learning-based image/video deblocking framework that formulates deblocking as an image decomposition problem based on MCA (morphological component analysis) and sparse representation. The proposed method first decomposes an image or video frame into low-frequency and high-frequency parts by applying the BM3D (block-matching and 3D filtering) algorithm. The high-frequency part is then decomposed into a "blocking component" and a "non-blocking component" by performing dictionary learning and sparse coding based on MCA. As a result, the blocking component can be removed from the frame while preserving most of the original image/video details. Experimental results demonstrate the efficacy of the proposed algorithm.
6

Supervised feature learning via sparse coding for music information retrieval

O'Brien, Cian John 08 June 2015 (has links)
This thesis explores the ideas of feature learning and sparse coding for Music Information Retrieval (MIR). Sparse coding is an algorithm which aims to learn new feature representations from data automatically. In contrast to previous work that uses sparse coding in an MIR context, the concept of supervised sparse coding is also investigated, which makes explicit use of the ground-truth labels during the learning process. Here, sparse coding and supervised coding are applied to two MIR problems: classification of musical genre and recognition of the emotional content of music. A variation of Label Consistent K-SVD is used to add supervision during the dictionary learning process. In the case of Music Genre Recognition (MGR), an additional discriminative term is added to encourage tracks from the same genre to have similar sparse codes. For Music Emotion Recognition (MER), a linear regression term is added to learn an optimal classifier and dictionary pair. The results indicate that while sparse coding performs well for MGR, the additional supervision fails to improve performance. In the case of MER, supervised coding significantly outperforms both standard sparse coding and commonly used designed features, namely MFCCs and pitch chroma.
7

Sparse coding for machine learning, image processing and computer vision

Mairal, Julien 30 November 2010 (has links) (PDF)
We study in this thesis a particular machine learning approach to representing signals that consists of modelling data as linear combinations of a few elements from a learned dictionary. It can be viewed as an extension of the classical wavelet framework, whose goal is to design such dictionaries (often orthonormal bases) adapted to natural signals. An important success of dictionary learning methods has been their ability to model natural image patches and the image denoising performance this has yielded. We address several open questions related to this framework: How can the dictionary be optimized efficiently? How can the model be enriched by adding structure to the dictionary? Can current image processing tools based on this method be further improved? How should one learn the dictionary when it is used for a task other than signal reconstruction? How can it be used for solving computer vision problems? We answer these questions with a multidisciplinary approach, using tools from statistical machine learning, convex and stochastic optimization, image and signal processing, computer vision, and optimization on graphs.
8

Towards Scalable Analysis of Images and Videos

Zhao, Bin 01 September 2014 (has links)
With the widespread availability of low-cost devices capable of photo shooting and high-volume video recording, we are facing an explosion of both image and video data. The sheer volume of such visual data poses both challenges and opportunities in machine learning and computer vision research. In image classification, most previous research has focused on small- to medium-scale data sets containing objects from dozens of categories, yet we can easily access images spanning thousands of categories. Unfortunately, despite the well-known advantages and recent advancements of multi-class classification techniques in machine learning, complexity concerns have driven most research on such super-large-scale data sets back to simple methods such as nearest neighbor search and one-vs-one or one-vs-rest approaches. Facing an image classification problem with such a huge task space, it is no surprise that these classical algorithms, often favored for their simplicity, are brought to their knees, not only because of the training time and storage cost they incur, but also because of their conceptual awkwardness in massive multi-class settings. Our goal is therefore to directly address the scale of image data: not only the large number of training images and high-dimensional image features, but also the large task space. Specifically, we present algorithms capable of efficiently and effectively training classifiers that can differentiate tens of thousands of image classes. As with images, one of the major difficulties in video analysis is the huge amount of data: videos can be hours long or even endless, yet often only a small portion of a video contains important information.
Consequently, algorithms that automatically detect unusual events within streaming or archival video would significantly improve the efficiency of video analysis and reserve valuable human attention for only the most salient content. Moreover, given lengthy recorded videos, such as those captured by digital cameras on mobile phones or by surveillance cameras, most users do not have the time or energy to edit them so that only the most salient and interesting parts are kept. To this end, we also develop an algorithm for automatic video summarization without human intervention. Finally, we extend our research on video summarization to a supervised formulation, in which users are asked to generate summaries for a subset of a class of videos of similar nature. Given such manually generated summaries, our algorithm learns the preferred storyline within the given class of videos and automatically generates summaries for the remaining videos in the class, capturing a storyline similar to that of the manually summarized videos.
9

Supervised Multi-layer Dictionary Learning for Image and Video Classification

Chan wai tim, Stefen 17 November 2016 (has links)
In recent years, numerous works have been published on dictionary learning and sparse coding. These methods were initially developed for image reconstruction and restoration tasks. More recently, researchers have become interested in using dictionaries for classification tasks because of their ability to capture underlying patterns in images, and good results have been obtained under specific conditions: centered objects of interest, homogeneous sizes, and similar points of view. Outside this restrictive setting, however, performance drops. In this thesis, we are interested in finding dictionaries well suited to classification. The learning methods classically used for dictionaries rely on unsupervised learning; here, we study how to perform supervised dictionary learning. To push the discriminative power of the resulting codes further, we also introduce a multilayer architecture of dictionaries. The proposed architecture is based on the local description of an input image and its transformation through a succession of encoding and processing steps, and it outputs a set of descriptors effective for classification. The learning method we developed is based on the backpropagation algorithm, allowing a joint learning of the different dictionaries and an optimization solely with respect to a classification cost. The proposed architecture has been tested on the MNIST, CIFAR-10, and STL-10 datasets, with good results compared to other dictionary-based methods, and it can be extended to video analysis.
10

Exploring Latent Structure in Data: Algorithms and Implementations

January 2014 (has links)
abstract: Feature representations for raw data are one of the most important components of a machine learning system. Traditionally, features are hand-crafted by domain experts, which can be a time-consuming process; furthermore, such features do not generalize well to unseen data and novel tasks. Recently, there have been many efforts to generate data-driven representations using clustering and sparse models. This dissertation focuses on building data-driven unsupervised models for analyzing raw data and developing efficient feature representations. Simultaneous segmentation and feature extraction approaches for silicon-pore sensor data are considered. Aggregating the data into a matrix and performing low-rank and sparse matrix decompositions with additional smoothness constraints is proposed to solve this problem. Comparisons of several variants of the approach and results for signal de-noising and translocation/trapping event extraction are presented. Algorithms to improve transform-domain features for ion-channel time-series signals based on matrix completion are also presented; the improved features achieve better performance in classification tasks and reduce false alarm rates when applied to analyte detection. Developing representations for multimedia is an important and challenging problem, with applications ranging from scene recognition, multimedia retrieval, and personal life-logging systems to field robot navigation. In this dissertation, we present a new framework for feature extraction from challenging natural environment sounds; the proposed features outperform traditional spectral features on challenging environmental sound datasets. Several algorithms are proposed that perform supervised tasks such as recognition and tag annotation, and ensemble methods are proposed to improve the tag annotation process. To facilitate the use of large datasets, fast implementations are developed for sparse coding, the key component in our algorithms.
Several strategies are proposed to speed up the Orthogonal Matching Pursuit (OMP) algorithm using CUDA kernels on a GPU. Implementations are also developed for a large-scale image retrieval system, performing image-based "exact search" and "visually similar search" using image-patch sparse codes. The results demonstrate a large speed-up over CPU implementations, and good retrieval performance is also achieved. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2014
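For reference, the CPU baseline that such GPU strategies accelerate is the greedy OMP loop itself: pick the atom most correlated with the residual, re-fit by least squares on the selected atoms, repeat. The sketch below is a plain NumPy illustration, not the dissertation's CUDA implementation; it uses an orthonormal dictionary so that exact recovery of the toy signal is guaranteed (with overcomplete dictionaries, recovery is only approximate).

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal Matching Pursuit: greedily select k atoms of D to approximate x."""
    residual = x.copy()
    support = []
    a = np.zeros(D.shape[1])
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit least squares on all selected atoms (the "orthogonal" step)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    a[support] = coef
    return a

# orthonormal dictionary via QR; x is an exact 3-sparse combination of atoms
rng = np.random.default_rng(0)
D, _ = np.linalg.qr(rng.standard_normal((64, 64)))
a_true = np.zeros(64)
a_true[[3, 11, 40]] = [1.0, -0.7, 0.5]
x = D @ a_true
a = omp(D, x, k=3)
print(sorted(np.nonzero(a)[0].tolist()))  # [3, 11, 40]
```

The correlation step (`D.T @ residual`) and the small least-squares solve are the two kernels that dominate the running time, which is why they are natural targets for GPU parallelization.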
