21 |
New Directions in Sparse Models for Image Analysis and Restoration. January 2013
abstract: Effective modeling of high dimensional data is crucial in information processing and machine learning. Classical subspace methods have been very effective in such applications. However, over the past few decades, there has been considerable research towards the development of new modeling paradigms that go beyond subspace methods. This dissertation focuses on the study of sparse models and their interplay with modern machine learning techniques such as manifold, ensemble and graph-based methods, along with their applications in image analysis and recovery. By considering graph relations between data samples while learning sparse models, graph-embedded codes can be obtained for use in unsupervised, supervised and semi-supervised problems. Using experiments on standard datasets, it is demonstrated that the codes obtained from the proposed methods outperform several baseline algorithms. In order to facilitate sparse learning with large scale data, the paradigm of ensemble sparse coding is proposed, and different strategies for constructing weak base models are developed. Experiments with image recovery and clustering demonstrate that these ensemble models perform better when compared to conventional sparse coding frameworks. When examples from the data manifold are available, manifold constraints can be incorporated with sparse models and two approaches are proposed to combine sparse coding with manifold projection. The improved performance of the proposed techniques in comparison to sparse coding approaches is demonstrated using several image recovery experiments. In addition to these approaches, it might be required in some applications to combine multiple sparse models with different regularizations. In particular, combining an unconstrained sparse model with non-negative sparse coding is important in image analysis, and it poses several algorithmic and theoretical challenges. 
A convex and an efficient greedy algorithm for recovering combined representations are proposed. Theoretical guarantees on sparsity thresholds for exact recovery using these algorithms are derived and recovery performance is also demonstrated using simulations on synthetic data. Finally, the problem of non-linear compressive sensing, where the measurement process is carried out in feature space obtained using non-linear transformations, is considered. An optimized non-linear measurement system is proposed, and improvements in recovery performance are demonstrated in comparison to using random measurements as well as optimized linear measurements. / Dissertation/Thesis / Ph.D. Electrical Engineering 2013
|
22 |
Sparse Methods in Image Understanding and Computer Vision. January 2013
abstract: Image understanding has been playing an increasingly crucial role in vision applications. Sparse models form an important component in image understanding, since the statistics of natural images reveal the presence of sparse structure. Sparse methods lead to parsimonious models, in addition to being efficient for large scale learning. In sparse modeling, data is represented as a sparse linear combination of atoms from a "dictionary" matrix. This dissertation focuses on understanding different aspects of sparse learning, thereby enhancing the use of sparse methods by incorporating tools from machine learning. With the growing need to adapt models for large scale data, it is important to design dictionaries that can model the entire data space and not just the samples considered. By exploiting the relation of dictionary learning to 1-D subspace clustering, a multilevel dictionary learning algorithm is developed, and it is shown to outperform conventional sparse models in compressed recovery, and image denoising. Theoretical aspects of learning such as algorithmic stability and generalization are considered, and ensemble learning is incorporated for effective large scale learning. In addition to building strategies for efficiently implementing 1-D subspace clustering, a discriminative clustering approach is designed to estimate the unknown mixing process in blind source separation. By exploiting the non-linear relation between the image descriptors, and allowing the use of multiple features, sparse methods can be made more effective in recognition problems. The idea of multiple kernel sparse representations is developed, and algorithms for learning dictionaries in the feature space are presented. Using object recognition experiments on standard datasets it is shown that the proposed approaches outperform other sparse coding-based recognition frameworks. 
Furthermore, a segmentation technique based on multiple kernel sparse representations is developed, and successfully applied for automated brain tumor identification. Using sparse codes to define the relation between data samples can lead to a more robust graph embedding for unsupervised clustering. By performing discriminative embedding using sparse coding-based graphs, an algorithm for measuring the glomerular number in kidney MRI images is developed. Finally, approaches to build dictionaries for local sparse coding of image descriptors are presented, and applied to object recognition and image retrieval. / Dissertation/Thesis / Ph.D. Electrical Engineering 2013
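As a hedged illustration of the sparse model this abstract describes, the standard dictionary learning problem can be written as below. The symbols are assumptions introduced here rather than notation taken from the dissertation: the columns of $X$ are the data samples, the columns $d_k$ of $D$ are the dictionary atoms, the columns $a_i$ of $A$ are the sparse codes, and $s$ is the allowed number of nonzero coefficients per sample.

```latex
\min_{D,\,A}\ \lVert X - DA \rVert_F^2
\quad \text{subject to} \quad
\lVert a_i \rVert_0 \le s \ \ \forall i,
\qquad \lVert d_k \rVert_2 = 1 \ \ \forall k .
```

With $s = 1$, each sample is approximated by a scalar multiple of a single atom, so the samples are effectively grouped by the line (1-D subspace) that fits them best. This is one way to read the relation between dictionary learning and 1-D subspace clustering mentioned in the abstract.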
|
23 |
Semantic Sparse Learning in Images and Videos. January 2014
abstract: Many learning models have been proposed for various tasks in visual computing. Popular examples include hidden Markov models and support vector machines. Recently, sparse-representation-based learning methods have attracted a lot of attention in the computer vision field, largely because of their impressive performance in many applications. In the literature, many such sparse learning methods focus on designing or applying learning techniques for a particular feature space, without much explicit consideration of the possible interaction between the underlying semantics of the visual data and the learning technique employed. The rich semantic information in most visual data, if properly incorporated into algorithm design, should help achieve improved performance while delivering an intuitive interpretation of the algorithmic outcomes. My study addresses the problem of how to explicitly account for the semantic information of visual data in sparse learning algorithms. In this work, we identify four problems which are of great importance and broad interest to the community. 
Specifically, a novel approach is proposed to incorporate label information to learn a dictionary which is not only reconstructive but also discriminative; considering the formation process of face images, a novel image decomposition approach for an ensemble of correlated images is proposed, where a subspace is built from the decomposition and applied to face recognition; based on the observation that the foreground (or salient) objects are sparse in the input domain and the background is sparse in the frequency domain, a novel and efficient spatio-temporal saliency detection algorithm is proposed to identify the salient regions in video; and a novel hidden Markov model learning approach is proposed that uses a sparse set of pairwise comparisons among the data, which in many scenarios (e.g., evaluating motion skills in surgical simulations) is easier to obtain and more meaningful and consistent than traditional labels. In these four problems, different types of semantic information are modeled and incorporated into the design of sparse learning algorithms for the corresponding visual computing tasks. Several real-world applications are selected to demonstrate the effectiveness of the proposed methods, including face recognition, spatio-temporal saliency detection, abnormality detection, spatio-temporal interest point detection, motion analysis and emotion recognition. These applications involve data of different modalities, ranging from audio signals and images to video. Experiments on large-scale real-world data, with comparisons to state-of-the-art methods, confirm that the proposed approaches deliver salient advantages, showing that adding this semantic information dramatically improves the performance of general sparse learning methods. / Dissertation/Thesis / Ph.D. Computer Science 2014
|
24 |
Segmentação de lesões melanocíticas usando uma abordagem baseada no aprendizado de dicionários / Segmentation of melanocytic lesions using a dictionary learning based approach. Flores, Eliezer Soares. January 2015
Segmentation is an essential step for the automated pre-screening of melanocytic lesions. In this work, a new method for segmenting melanocytic lesions in standard camera images (i.e., macroscopic images) is presented. Initially, to reduce unwanted artifacts, shading effects are attenuated in the macroscopic image and a pre-segmentation is obtained using a scheme that combines the wavelet transform and the watershed transform. 
Afterwards, a textural variation image designed to enhance the discriminability of the skin lesion against the background is obtained, and the pre-segmented skin lesion region is used to learn an initial dictionary and an initial representation via a nonnegative matrix factorization method. An unsupervised and non-parametric version of the information-theoretic dictionary learning method is proposed to optimize this representation by selecting the subset of atoms that maximizes the compactness and representativeness of the learned dictionary. Finally, the skin lesion image is represented using the learned dictionary and segmented with the normalized graph cuts method. Our experimental results, based on a widely used image dataset, suggest that the proposed method tends to provide more accurate skin lesion segmentations than comparable state-of-the-art methods (in terms of the XOR error).
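The abstract says the initial dictionary and representation come from a nonnegative matrix factorization, without naming the algorithm. The following is a minimal sketch under that gap: it uses the classic multiplicative-update rules of Lee and Seung as a stand-in, in pure Python. The names `nmf`, `matmul`, `transpose` and `frobenius_error`, and the parameters `rank`, `n_iter`, `seed` and `eps`, are illustrative assumptions and not taken from the thesis.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V as W @ H with W, H >= 0.

    W (n x rank) plays the role of the dictionary, H (rank x m) the
    representation. Multiplicative updates are an assumed stand-in here.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H


def frobenius_error(V, W, H):
    """Frobenius norm of the residual V - W @ H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra)) ** 0.5
```

Because the updates only multiply by non-negative ratios, the factors stay non-negative as long as the initialization is, which is why the random initial values are kept strictly positive.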
|
25 |
Apprentissage d'atlas fonctionnel du cerveau modélisant la variabilité inter-individuelle / Learning functional brain atlases modeling inter-subject variability. Abraham, Alexandre. 30 November 2015
Recent studies have shown that spontaneous brain activity observed at rest makes it possible to study the functional organization of the brain, complementing the information provided by task-based protocols. From these signals, we will extract a functional atlas of the brain that models inter-subject variability. The novelty of our approach lies in integrating neuroscientific priors and inter-subject variability directly into a probabilistic model of resting-state activity. These models will be applied to large datasets. This variability, ignored until now, may lead to the learning of fuzzy atlases, which are therefore limited in terms of resolution. The program poses both numerical and algorithmic challenges, owing to the size of the datasets studied and the complexity of the modeling involved.
|
26 |
Optimal Transport Dictionary Learning and Non-negative Matrix Factorization / 最適輸送辞書学習と非負値行列因子分解. Rolet, Antoine. 23 March 2021
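Kyoto University / Doctoral degree by coursework (new system) / Doctor of Informatics / Degree No. 甲第23314号 / 情博第750号 / 新制||情||128 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / Examiners: Professor Akihiro Yamamoto (chief examiner), Professor Hisashi Kashima, Professor Tatsuya Kawahara / Conferred under Article 4, Paragraph 1 of the Degree Regulations / DFAM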
|
27 |
Aplikace metod učení slovníku pro Audio Inpainting / Applications of Dictionary Learning Methods for Audio Inpainting. Ozdobinski, Roman. January 2014
This diploma thesis discusses dictionary learning methods for inpainting missing sections of an audio signal. The K-SVD and INK-SVD dictionary learning algorithms were analyzed theoretically and applied in practice. The learned dictionaries were used to reconstruct audio signals with OMP (Orthogonal Matching Pursuit). Furthermore, an algorithm was proposed for selecting stationary segments and using them as training data for K-SVD and INK-SVD. The practical part of the thesis compares reconstruction performance when the training set is drawn from the whole signal against the case where the stationary-segment selection algorithm is used. The influence of mutual coherence on reconstruction quality with an incoherent dictionary was also studied. Using Matlab scripts written for repeated testing, these methods were compared on songs from distinct genres.
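Two quantities named in this abstract, mutual coherence and OMP, are easy to illustrate. The sketch below is a generic textbook version in pure Python, not the thesis's Matlab implementation; the function names (`normalize_columns`, `mutual_coherence`, `solve`, `omp`) and the toy dictionary in the usage note are assumptions made for illustration. Dictionaries are stored as lists of rows, with one atom per column.

```python
import math


def normalize_columns(D):
    """Return a copy of D (list of rows) whose columns have unit Euclidean norm."""
    n_rows, n_cols = len(D), len(D[0])
    norms = [math.sqrt(sum(D[i][j] ** 2 for i in range(n_rows))) for j in range(n_cols)]
    return [[D[i][j] / norms[j] for j in range(n_cols)] for i in range(n_rows)]


def mutual_coherence(D):
    """Largest absolute inner product between two distinct normalized atoms."""
    D = normalize_columns(D)
    n_rows, n_cols = len(D), len(D[0])
    best = 0.0
    for a in range(n_cols):
        for b in range(a + 1, n_cols):
            dot = sum(D[i][a] * D[i][b] for i in range(n_rows))
            best = max(best, abs(dot))
    return best


def solve(A, b):
    """Solve a small square system A c = b by Gaussian elimination with pivoting."""
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def omp(D, x, k):
    """Greedy k-sparse code of x over D, assuming D has unit-norm columns."""
    n_rows, n_cols = len(D), len(D[0])
    residual, support, coeffs = x[:], [], []
    for _ in range(k):
        # Select the unused atom most correlated with the current residual.
        corr = [abs(sum(D[i][j] * residual[i] for i in range(n_rows)))
                for j in range(n_cols)]
        for j in support:
            corr[j] = -1.0
        support.append(max(range(n_cols), key=lambda j: corr[j]))
        # Least-squares refit on the selected atoms via the normal equations.
        G = [[sum(D[i][a] * D[i][b] for i in range(n_rows)) for b in support]
             for a in support]
        rhs = [sum(D[i][a] * x[i] for i in range(n_rows)) for a in support]
        coeffs = solve(G, rhs)
        residual = [x[i] - sum(c * D[i][a] for c, a in zip(coeffs, support))
                    for i in range(n_rows)]
    code = [0.0] * n_cols
    for c, a in zip(coeffs, support):
        code[a] = c
    return code
```

For a toy dictionary made of the three standard basis vectors plus one diagonal atom, `mutual_coherence` returns about 0.707, and `omp` with `k=2` recovers a signal built from two of the basis atoms. A lower coherence generally makes this kind of greedy recovery more reliable, which is the motivation for incoherent dictionaries such as those produced by INK-SVD.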
|
28 |
Temporal signals classification / Classification de signaux temporels. Rida, Imad. 03 February 2017
Nowadays, there are many applications related to machine vision and hearing that try to reproduce human capabilities on machines. Our interest in this subject stems from the fact that these problems are mainly modeled as temporal signal classification problems. We were interested in two distinct problems: human gait recognition and audio signal recognition, covering both environmental and music signals. In the former, we proposed a novel method that automatically learns and selects the dynamic human body parts to tackle the problem of intra-class variations, in contrast to state-of-the-art methods, which rely on predefined knowledge. 
To achieve this, a group fused lasso algorithm is applied to segment the human body into parts with coherent motion values across subjects. In the latter, since no conventional feature representation has shown the ability to tackle both environmental and music recognition problems, we propose to model audio classification as a supervised dictionary learning problem. This is done by learning one dictionary per class and encouraging dissimilarity between the dictionaries by penalizing their pairwise similarities. In addition, the coefficients of a signal's representation over these dictionaries are sought to be as sparse as possible. The experimental evaluations provide encouraging results.
|
29 |
Contribution à la décomposition de données multimodales avec des applications en apprentisage de dictionnaires et la décomposition de tenseurs de grande taille. / Contribution to multimodal data processing with applications to dictionary learning and large-scale decomposition. Traoré, Abraham. 26 November 2019
In this work, we are interested in special mathematical tools called tensors, which are multidimensional arrays defined on the tensor product of vector spaces, each of which has its own coordinate system; the number of spaces involved in this product is generally referred to as the order of the tensor. The interest in these tools stems from empirical work, across a range of applications encompassing both classification and regression, showing that processing multidimensional data with tensors rather than matrices gives better results. 
In this thesis, we focus on a specific tensor model, the Tucker decomposition, and establish new approaches for several tasks: dictionary learning, online dictionary learning, large-scale tensor decomposition, and the decomposition of a tensor that grows along each of its modes. New theoretical results on convergence and convergence rate are established, and the efficiency of the proposed algorithms, which are based on either alternating minimization or coordinate gradient descent, is demonstrated on real-world problems.
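For readers unfamiliar with the Tucker model named in this abstract, its usual textbook form is sketched below. The notation is an assumption introduced here and may differ from the thesis: $\mathcal{X}$ is an order-$N$ tensor, $\mathcal{G}$ is a smaller core tensor of size $R_1 \times \dots \times R_N$, $U^{(n)}$ is the factor matrix for mode $n$, and $\times_n$ denotes the mode-$n$ product.

```latex
\mathcal{X} \;\approx\; \mathcal{G} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)},
\qquad
x_{i_1 \dots i_N} \;\approx\; \sum_{r_1=1}^{R_1} \cdots \sum_{r_N=1}^{R_N}
g_{r_1 \dots r_N}\, u^{(1)}_{i_1 r_1} \cdots u^{(N)}_{i_N r_N}.
```

In a dictionary learning reading of this model, the factor matrices $U^{(n)}$ can be viewed as one dictionary per mode and the core $\mathcal{G}$ as the coefficients, and alternating minimization updates one of these blocks at a time while the others are held fixed.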
|
30 |
Solving Linear and Bilinear Inverse Problems using Approximate Message Passing Methods. Sarkar, Subrata. January 2020
No description available.
|