Global ETD Search

1	Dictionary learning methods for single-channel source separation
Lefèvre, Augustin 03 October 2012
In this thesis we provide three main contributions to blind source separation methods based on NMF. Our first contribution is a group-sparsity-inducing penalty specifically tailored for Itakura-Saito NMF. In many music tracks, there are whole intervals where only one source is active. The group-sparsity penalty we propose makes it possible to blindly identify these intervals and learn source-specific dictionaries. As a consequence, those learned dictionaries can be used to do source separation in other parts of the track where several sources are active. These two tasks of identification and separation are performed simultaneously in one run of group-sparsity Itakura-Saito NMF. Our second contribution is an online algorithm for Itakura-Saito NMF that makes it possible to learn dictionaries on very large audio tracks. Indeed, the memory complexity of a batch implementation of NMF grows linearly with the length of the recordings and becomes prohibitive for signals longer than an hour. In contrast, our online algorithm is able to learn NMF on arbitrarily long signals with limited memory usage. Our third contribution deals with user-informed NMF. In short mixed signals, blind learning becomes very hard and sparsity does not retrieve interpretable dictionaries. Our contribution is very similar in spirit to inpainting. It relies on the empirical fact that, when observing the spectrogram of a mixture signal, an overwhelming proportion of it consists of regions where only one source is active. We describe an extension of NMF to take into account time-frequency localized information on the absence/presence of each source. We also investigate inferring this information with tools from machine learning.
Keywords: Informed source separation; Incremental algorithms; Structured norms; Nonnegative matrix factorization
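As a rough illustration of the batch Itakura-Saito NMF that the abstract above builds on, here is a minimal NumPy sketch. It is not the thesis's algorithm: it omits the group-sparsity penalty and the online variant, and it uses the generic majorization-minimization multiplicative updates (exponent 1/2) for the Itakura-Saito divergence. The function names `is_divergence` and `is_nmf`, the parameter defaults, and the small constant `eps` are assumptions made for this example.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all entries (V, V_hat > 0)."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, rank, n_iter=100, seed=0, eps=1e-12):
    """Batch IS-NMF sketch: V (freq x time, strictly positive) ~ W @ H.

    W holds the dictionary (spectral templates), H the activations.
    Uses multiplicative updates with exponent 1/2 (an assumption of this
    sketch, not necessarily the update used in the thesis).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    costs = []
    for _ in range(n_iter):
        # Update activations with the dictionary held fixed.
        V_hat = W @ H + eps
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat)))
        # Update the dictionary with the activations held fixed.
        V_hat = W @ H + eps
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T))
        costs.append(is_divergence(V, W @ H + eps))
    return W, H, costs
```

In this batch form every frame of `V` and `H` stays in memory, which is the linear memory growth the abstract mentions; in audio use, `V` would typically be a power spectrogram.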
2	Dictionary learning methods for single-channel source separation / Méthodes d'apprentissage de dictionnaire pour la séparation de sources audio avec un seul capteur
Lefèvre, Augustin 03 October 2012
In this thesis we propose three main contributions to dictionary learning methods. The first is a group-sparsity criterion adapted to NMF when the chosen distortion measure is the Itakura-Saito divergence. In most music signals one can find long intervals where only one source is active (solos). The group-sparsity criterion we propose makes it possible to find such segments automatically and to learn a dictionary adapted to each source. These dictionaries then make it possible to perform the separation task in the intervals where the sources are mixed. These two tasks of identification and separation are carried out simultaneously in a single pass of the algorithm we propose. Our second contribution is an online algorithm for learning the dictionary at large scale, on signals several hours long. The memory required by an NMF estimated online is constant, whereas in the standard version it grows linearly with the size of the input signals, which is impractical for signals longer than an hour. Our third contribution concerns interaction with the user. For short signals, blind learning is particularly difficult, and supplying information specific to the signal being processed is indispensable. Our contribution is similar to inpainting and makes it possible to take time-frequency annotations into account. It relies on the observation that almost all of the spectrogram can be divided into regions specifically assigned to each source. We describe an extension of NMF that takes this information into account and discuss the possibility of inferring it automatically with simple statistical learning tools. / In this thesis we provide three main contributions to blind source separation methods based on NMF. Our first contribution is a group-sparsity-inducing penalty specifically tailored for Itakura-Saito NMF. In many music tracks, there are whole intervals where only one source is active. The group-sparsity penalty we propose makes it possible to blindly identify these intervals and learn source-specific dictionaries. As a consequence, those learned dictionaries can be used to do source separation in other parts of the track where several sources are active. These two tasks of identification and separation are performed simultaneously in one run of group-sparsity Itakura-Saito NMF. Our second contribution is an online algorithm for Itakura-Saito NMF that makes it possible to learn dictionaries on very large audio tracks. Indeed, the memory complexity of a batch implementation of NMF grows linearly with the length of the recordings and becomes prohibitive for signals longer than an hour. In contrast, our online algorithm is able to learn NMF on arbitrarily long signals with limited memory usage. Our third contribution deals with user-informed NMF. In short mixed signals, blind learning becomes very hard and sparsity does not retrieve interpretable dictionaries. Our contribution is very similar in spirit to inpainting. It relies on the empirical fact that, when observing the spectrogram of a mixture signal, an overwhelming proportion of it consists of regions where only one source is active. We describe an extension of NMF to take into account time-frequency localized information on the absence/presence of each source. We also investigate inferring this information with tools from machine learning.
Keywords: Apprentissage statistique; Factorisation en matrices positives; Normes structurées; Algorithme incrémental; Séparation de sources informée; Informed source separation; Incremental algorithms; Structured norms; Nonnegative matrix factorization
3	Méthodes informées de factorisation matricielle non négative : Application à l'identification de sources de particules industrielles / Informed methods of Non-negative Matrix Factorization. A study of industrial source identification
Limem, Abdelhakim 21 November 2014
NMF methods perform the blind factorization of a non-negative matrix X into the product X = G . F of two non-negative matrices G and F. Although these approaches are studied with great interest by the scientific community, they very often suffer from a lack of robustness with respect to the data and the initial conditions, and may yield multiple solutions. With this in mind, and in order to reduce the space of admissible solutions, the work of this thesis aims to inform NMF, thus positioning our work between regression and classical blind factorizations. In addition, parametric cost functions called αβ-divergences are used, which make it possible to tolerate the presence of outliers in the data. We introduce three types of constraints on the matrix F, namely (i) exact or bounded knowledge of some of its elements and (ii) the sum to 1 of each of its rows. Update rules are proposed that allow all of these constraints to coexist, by combining multiplicative methods with projections. Furthermore, we propose to constrain the structure of the matrix G through the use of a physical model capable of distinguishing the sources present at the receiver. An application to the identification of sources of airborne particulate matter around an industrial region on the northern coast of France made it possible to test the value of the approach. Through a series of tests on synthetic and real data, we show how the different pieces of information make the factorization results more consistent from the standpoint of physical interpretation and less dependent on the initialization. / NMF methods aim to factorize a non-negative observation matrix X as the product X = G.F of two non-negative matrices G and F. Although these approaches have been studied with great interest in the scientific community, they often suffer from a lack of robustness to data and to initial conditions, and may provide multiple solutions. To address this, and in order to reduce the space of admissible solutions, the work proposed in this thesis aims to inform NMF, thus placing our work in between regression and classic blind factorization. In addition, parametric cost functions called αβ-divergences are used, so that the resulting NMF methods are robust to outliers in the data. Three types of constraints are introduced on the matrix F, i.e., (i) the "exact" or "bounded" knowledge of some components, and (ii) the sum to 1 of each row of F. Update rules are proposed so that all these constraints are taken into account by mixing multiplicative methods with projection. Moreover, we propose to constrain the structure of the matrix G by the use of a physical model, in order to discern the sources that are influential at the receiver. The considered application, source identification of particulate matter in the air around an industrial area on the French northern coast, showed the value of the proposed methods. Through a series of experiments on both synthetic and real data, we show how the different kinds of information make the factorization results more consistent in terms of physical interpretation and less dependent on the initialization.
Keywords: Séparation informée de sources; Factorisation matricielle non négative; Contraintes expertes; Divergence αβ; Identification de particules de l'air; Informed source separation; Non-negative Matrix Factorization; Expert constraints; αβ-divergence; Air particulate matter identification
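To make the idea of mixing multiplicative updates with a projection concrete, here is a small NumPy sketch of an informed factorization X ≈ G·F in which some entries of F are known exactly and each row of F sums to 1. It is only an illustration under stated assumptions: it uses the plain Frobenius (least-squares) multiplicative updates rather than the αβ-divergences of the thesis, it handles exact knowledge only (not bounds), and it leaves out the physical model for G. The names `informed_nmf`, `project_rows`, `known_mask` and `known_values` are hypothetical.

```python
import numpy as np


def project_rows(F, known_mask, known_values, eps=1e-12):
    """Impose known entries, then rescale free entries so each row sums to 1.

    Assumes the known entries of each row sum to at most 1.
    """
    F = np.where(known_mask, known_values, F)
    known_sum = np.sum(np.where(known_mask, F, 0.0), axis=1, keepdims=True)
    free_sum = np.sum(np.where(known_mask, 0.0, F), axis=1, keepdims=True)
    scale = (1.0 - known_sum) / (free_sum + eps)
    return np.where(known_mask, F, F * scale)


def informed_nmf(X, rank, known_mask, known_values, n_iter=200, seed=0, eps=1e-12):
    """Sketch of informed NMF: X (samples x species) ~ G @ F.

    G holds source contributions, F the source profiles (rows sum to 1).
    Frobenius multiplicative updates are an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_species = X.shape
    G = rng.random((n_samples, rank)) + eps
    F = project_rows(rng.random((rank, n_species)) + eps, known_mask, known_values, eps)
    for _ in range(n_iter):
        # Multiplicative update of the contributions, profiles fixed.
        G *= (X @ F.T) / (G @ F @ F.T + eps)
        # Multiplicative update of the profiles, then projection onto the constraints.
        F *= (G.T @ X) / (G.T @ G @ F + eps)
        F = project_rows(F, known_mask, known_values, eps)
    return G, F
```

Because the projection is applied after each multiplicative step, this simple scheme is not guaranteed to decrease the cost monotonically; it only shows how the constraints can be enforced at every iteration.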
4	Reverse audio engineering for active listening and other applications
Gorlow, Stanislaw 16 December 2013
This work deals with the problem of reverse audio engineering for active listening. The format under consideration corresponds to the audio CD. The musical content is viewed as the result of a concatenation of the composition, the recording, the mixing, and the mastering. The inversion of the two latter stages constitutes the core of the problem at hand. The audio signal is treated as a post-nonlinear mixture. Thus, the mixture is "decompressed" before being "decomposed" into audio tracks. The problem is tackled in an informed context: the inversion is accompanied by information which is specific to the content production. In this manner, the quality of the inversion is significantly improved. The information is reduced in size by the use of quantization and coding methods, and some facts from psychoacoustics. The proposed methods are applicable in real time and have a low complexity. The obtained results advance the state of the art and contribute new insights.
Keywords: Computer Science/Other; Reverse audio engineering; Active listening; Informed Source Separation; Multichannel object-based audio coding
5	Reverse audio engineering for active listening and other applications / Rétroingénierie du son pour l’écoute active et autres applications
Gorlow, Stanislaw 16 December 2013
This work addresses the problem of reverse audio engineering for active listening. The format considered is the audio CD. The musical content is seen as the result of a chain of composition, recording, mixing, and mastering. The inversion of the last two stages is the heart of the present problem. The audio signal is treated as a post-nonlinear mixture. Thus, the mixture is “decompressed” before being “decomposed” into audio tracks. The problem is approached in an informed context: the inversion is accompanied by information that is specific to the production of the content. In this way, the quality of the inversion is significantly improved. The information is reduced in size by means of quantization and coding methods and of facts from psychoacoustics. The proposed methods apply in real time and exhibit low complexity. The results obtained improve on the state of the art and contribute new knowledge. / This work deals with the problem of reverse audio engineering for active listening. The format under consideration corresponds to the audio CD. The musical content is viewed as the result of a concatenation of the composition, the recording, the mixing, and the mastering. The inversion of the two latter stages constitutes the core of the problem at hand. The audio signal is treated as a post-nonlinear mixture. Thus, the mixture is “decompressed” before being “decomposed” into audio tracks. The problem is tackled in an informed context: the inversion is accompanied by information which is specific to the content production. In this manner, the quality of the inversion is significantly improved. The information is reduced in size by the use of quantization and coding methods, and some facts from psychoacoustics. The proposed methods are applicable in real time and have a low complexity. The obtained results advance the state of the art and contribute new insights.
Keywords: Rétroingénierie du son; Écoute active; Séparation de sources informée; Reverse audio engineering; Active listening; Informed Source Separation; Multichannel object-based audio coding
