Global ETD Search

11	Utilisation de splines monotones afin de condenser des tables de mortalité dans un contexte bayésien Patenaude, Valérie 04 1900 (has links) Dans ce mémoire, nous cherchons à modéliser des tables à deux entrées monotones en lignes et/ou en colonnes, pour une éventuelle application sur les tables de mortalité. Nous adoptons une approche bayésienne non paramétrique et représentons la forme fonctionnelle des données par splines bidimensionnelles. L’objectif consiste à condenser une table de mortalité, c’est-à-dire de réduire l’espace d’entreposage de la table en minimisant la perte d’information. De même, nous désirons étudier le temps nécessaire pour reconstituer la table. L’approximation doit conserver les mêmes propriétés que la table de référence, en particulier la monotonie des données. Nous travaillons avec une base de fonctions splines monotones afin d’imposer plus facilement la monotonie au modèle. En effet, la structure flexible des splines et leurs dérivées faciles à manipuler favorisent l’imposition de contraintes sur le modèle désiré. Après un rappel sur la modélisation unidimensionnelle de fonctions monotones, nous généralisons l’approche au cas bidimensionnel. Nous décrivons l’intégration des contraintes de monotonie dans le modèle a priori sous l’approche hiérarchique bayésienne. Ensuite, nous indiquons comment obtenir un estimateur a posteriori à l’aide des méthodes de Monte Carlo par chaînes de Markov. Finalement, nous étudions le comportement de notre estimateur en modélisant une table de la loi normale ainsi qu’une table t de distribution de Student. L’estimation de nos données d’intérêt, soit la table de mortalité, s’ensuit afin d’évaluer l’amélioration de leur accessibilité. / This master’s thesis is about the estimation of bivariate tables which are monotone within the rows and/or the columns, with a special interest in the approximation of life tables. This problem is approached through a nonparametric Bayesian regression model, in particular linear combinations of regression splines. By condensing a life table, our goal is to reduce its storage space without losing the entries’ accuracy. We will also study the reconstruction time of the table with our estimators. The properties of the reference table, specifically its monotonicity, must be preserved in the estimation. We are working with a monotone spline basis since splines are flexible and their derivatives can easily be manipulated. Those properties enable the imposition of constraints of monotonicity on our model. A brief review on univariate approximations of monotone functions is then extended to bivariate estimations. We use hierarchical Bayesian modeling to include the constraints in the prior distributions. We then explain the Markov chain Monte Carlo algorithm to obtain a posterior estimator. Finally, we study the estimator’s behaviour by applying our model on the Standard Normal table and the Student’s t table. We estimate our data of interest, the life table, to establish the improvement in data accessibility. Fonctions monotones bidimensionnelles Monotone bivariate functions Tables de mortalité Life tables Splines monotones Monotone splines Contraintes de monotonie Constraints of monotonicity Markov chain Monte Carlo techniques
12	Détection et classification de cibles multispectrales dans l'infrarouge MAIRE, Florian 14 February 2014 (has links) (PDF) Les dispositifs de protection de sites sensibles doivent permettre de détecter des menaces potentielles suffisamment à l'avance pour pouvoir mettre en place une stratégie de défense. Dans cette optique, les méthodes de détection et de reconnaissance d'aéronefs se basant sur des images infrarouge multispectrales doivent être adaptées à des images faiblement résolues et être robustes à la variabilité spectrale et spatiale des cibles. Nous mettons au point dans cette thèse, des méthodes statistiques de détection et de reconnaissance d'aéronefs satisfaisant ces contraintes. Tout d'abord, nous spéciﬁons une méthode de détection d'anomalies pour des images multispectrales, combinant un calcul de vraisemblance spectrale avec une étude sur les ensembles de niveaux de la transformée de Mahalanobis de l'image. Cette méthode ne nécessite aucune information a priori sur les aéronefs et nous permet d'identiﬁer les images contenant des cibles. Ces images sont ensuite considérées comme des réalisations d'un modèle statistique d'observations ﬂuctuant spectralement et spatialement autour de formes caractéristiques inconnues. L'estimation des paramètres de ce modèle est réalisée par une nouvelle méthodologie d'apprentissage séquentiel non supervisé pour des modèles à données manquantes que nous avons développée. La mise au point de ce modèle nous permet in ﬁne de proposer une méthode de reconnaissance de cibles basée sur l'estimateur du maximum de vraisemblance a posteriori. Les résultats encourageants, tant en détection qu'en classiﬁcation, justiﬁent l'intérêt du développement de dispositifs permettant l'acquisition d'images multispectrales. Ces méthodes nous ont également permis d'identiﬁer les regroupements de bandes spectrales optimales pour la détection et la reconnaissance d'aéronefs faiblement résolus en infrarouge [SPI:OTHER] Engineering Sciences/Other Reconnaissance de forme Modèles à prototype déformable Apprentissage séquentiel Algorithmes expectation-maximization Détection d'anomalies Signature infrarouge Imagerie multispectrale
13	Improving sampling, optimization and feature extraction in Boltzmann machines Desjardins, Guillaume 12 1900 (has links) L’apprentissage supervisé de réseaux hiérarchiques à grande échelle connaît présentement un succès fulgurant. Malgré cette effervescence, l’apprentissage non-supervisé représente toujours, selon plusieurs chercheurs, un élément clé de l’Intelligence Artificielle, où les agents doivent apprendre à partir d’un nombre potentiellement limité de données. Cette thèse s’inscrit dans cette pensée et aborde divers sujets de recherche liés au problème d’estimation de densité par l’entremise des machines de Boltzmann (BM), modèles graphiques probabilistes au coeur de l’apprentissage profond. Nos contributions touchent les domaines de l’échantillonnage, l’estimation de fonctions de partition, l’optimisation ainsi que l’apprentissage de représentations invariantes. Cette thèse débute par l’exposition d’un nouvel algorithme d'échantillonnage adaptatif, qui ajuste (de fa ̧con automatique) la température des chaînes de Markov sous simulation, afin de maintenir une vitesse de convergence élevée tout au long de l’apprentissage. Lorsqu’utilisé dans le contexte de l’apprentissage par maximum de vraisemblance stochastique (SML), notre algorithme engendre une robustesse accrue face à la sélection du taux d’apprentissage, ainsi qu’une meilleure vitesse de convergence. Nos résultats sont présent ́es dans le domaine des BMs, mais la méthode est générale et applicable à l’apprentissage de tout modèle probabiliste exploitant l’échantillonnage par chaînes de Markov. Tandis que le gradient du maximum de vraisemblance peut-être approximé par échantillonnage, l’évaluation de la log-vraisemblance nécessite un estimé de la fonction de partition. Contrairement aux approches traditionnelles qui considèrent un modèle donné comme une boîte noire, nous proposons plutôt d’exploiter la dynamique de l’apprentissage en estimant les changements successifs de log-partition encourus à chaque mise à jour des paramètres. Le problème d’estimation est reformulé comme un problème d’inférence similaire au filtre de Kalman, mais sur un graphe bi-dimensionnel, où les dimensions correspondent aux axes du temps et au paramètre de température. Sur le thème de l’optimisation, nous présentons également un algorithme permettant d’appliquer, de manière efficace, le gradient naturel à des machines de Boltzmann comportant des milliers d’unités. Jusqu’à présent, son adoption était limitée par son haut coût computationel ainsi que sa demande en mémoire. Notre algorithme, Metric-Free Natural Gradient (MFNG), permet d’éviter le calcul explicite de la matrice d’information de Fisher (et son inverse) en exploitant un solveur linéaire combiné à un produit matrice-vecteur efficace. L’algorithme est prometteur: en terme du nombre d’évaluations de fonctions, MFNG converge plus rapidement que SML. Son implémentation demeure malheureusement inefficace en temps de calcul. Ces travaux explorent également les mécanismes sous-jacents à l’apprentissage de représentations invariantes. À cette fin, nous utilisons la famille de machines de Boltzmann restreintes “spike & slab” (ssRBM), que nous modifions afin de pouvoir modéliser des distributions binaires et parcimonieuses. Les variables latentes binaires de la ssRBM peuvent être rendues invariantes à un sous-espace vectoriel, en associant à chacune d’elles, un vecteur de variables latentes continues (dénommées “slabs”). Ceci se traduit par une invariance accrue au niveau de la représentation et un meilleur taux de classification lorsque peu de données étiquetées sont disponibles. Nous terminons cette thèse sur un sujet ambitieux: l’apprentissage de représentations pouvant séparer les facteurs de variations présents dans le signal d’entrée. Nous proposons une solution à base de ssRBM bilinéaire (avec deux groupes de facteurs latents) et formulons le problème comme l’un de “pooling” dans des sous-espaces vectoriels complémentaires. / Despite the current widescale success of deep learning in training large scale hierarchical models through supervised learning, unsupervised learning promises to play a crucial role towards solving general Artificial Intelligence, where agents are expected to learn with little to no supervision. The work presented in this thesis tackles the problem of unsupervised feature learning and density estimation, using a model family at the heart of the deep learning phenomenon: the Boltzmann Machine (BM). We present contributions in the areas of sampling, partition function estimation, optimization and the more general topic of invariant feature learning. With regards to sampling, we present a novel adaptive parallel tempering method which dynamically adjusts the temperatures under simulation to maintain good mixing in the presence of complex multi-modal distributions. When used in the context of stochastic maximum likelihood (SML) training, the improved ergodicity of our sampler translates to increased robustness to learning rates and faster per epoch convergence. Though our application is limited to BM, our method is general and is applicable to sampling from arbitrary probabilistic models using Markov Chain Monte Carlo (MCMC) techniques. While SML gradients can be estimated via sampling, computing data likelihoods requires an estimate of the partition function. Contrary to previous approaches which consider the model as a black box, we provide an efficient algorithm which instead tracks the change in the log partition function incurred by successive parameter updates. Our algorithm frames this estimation problem as one of filtering performed over a 2D lattice, with one dimension representing time and the other temperature. On the topic of optimization, our thesis presents a novel algorithm for applying the natural gradient to large scale Boltzmann Machines. Up until now, its application had been constrained by the computational and memory requirements of computing the Fisher Information Matrix (FIM), which is square in the number of parameters. The Metric-Free Natural Gradient algorithm (MFNG) avoids computing the FIM altogether by combining a linear solver with an efficient matrix-vector operation. The method shows promise in that the resulting updates yield faster per-epoch convergence, despite being slower in terms of wall clock time. Finally, we explore how invariant features can be learnt through modifications to the BM energy function. We study the problem in the context of the spike & slab Restricted Boltzmann Machine (ssRBM), which we extend to handle both binary and sparse input distributions. By associating each spike with several slab variables, latent variables can be made invariant to a rich, high dimensional subspace resulting in increased invariance in the learnt representation. When using the expected model posterior as input to a classifier, increased invariance translates to improved classification accuracy in the low-label data regime. We conclude by showing a connection between invariance and the more powerful concept of disentangling factors of variation. While invariance can be achieved by pooling over subspaces, disentangling can be achieved by learning multiple complementary views of the same subspace. In particular, we show how this can be achieved using third-order BMs featuring multiplicative interactions between pairs of random variables. Réseaux de neurones Apprentissage profond Apprentissage non-supervisé Apprentissage de représentations Machines de Boltzmann Échantillonnage Gradient naturel Modèles bilinéaires Fonction de partition Neural networks Deep learning Unsupervised learning Feature learning Boltzmann machines Markov chain Monte Carlo Parallel tempering Natural gradient Bilinear models Partition function
14	Modèle bayésien non paramétrique pour la segmentation jointe d'un ensemble d'images avec des classes partagées / Bayesian nonparametric model for joint segmentation of a set of images with shared classes Sodjo, Jessica 18 September 2018 (has links) Ce travail porte sur la segmentation jointe d’un ensemble d’images dans un cadre bayésien.Le modèle proposé combine le processus de Dirichlet hiérarchique (HDP) et le champ de Potts.Ainsi, pour un groupe d’images, chacune est divisée en régions homogènes et les régions similaires entre images sont regroupées en classes. D’une part, grâce au HDP, il n’est pas nécessaire de définir a priori le nombre de régions par image et le nombre de classes, communes ou non.D’autre part, le champ de Potts assure une homogénéité spatiale. Les lois a priori et a posteriori en découlant sont complexes rendant impossible le calcul analytique d’estimateurs. Un algorithme de Gibbs est alors proposé pour générer des échantillons de la loi a posteriori. De plus,un algorithme de Swendsen-Wang généralisé est développé pour une meilleure exploration dela loi a posteriori. Enfin, un algorithme de Monte Carlo séquentiel a été défini pour l’estimation des hyperparamètres du modèle.Ces méthodes ont été évaluées sur des images-test et sur des images naturelles. Le choix de la meilleure partition se fait par minimisation d’un critère indépendant de la numérotation. Les performances de l’algorithme sont évaluées via des métriques connues en statistiques mais peu utilisées en segmentation d’image. / This work concerns the joint segmentation of a set images in a Bayesian framework. The proposed model combines the hierarchical Dirichlet process (HDP) and the Potts random field. Hence, for a set of images, each is divided into homogeneous regions and similar regions between images are grouped into classes. On the one hand, thanks to the HDP, it is not necessary to define a priori the number of regions per image and the number of classes, common or not.On the other hand, the Potts field ensures a spatial consistency. The arising a priori and a posteriori distributions are complex and makes it impossible to compute analytically estimators. A Gibbs algorithm is then proposed to generate samples of the distribution a posteriori. Moreover,a generalized Swendsen-Wang algorithm is developed for a better exploration of the a posteriori distribution. Finally, a sequential Monte Carlo sampler is defined for the estimation of the hyperparameters of the model.These methods have been evaluated on toy examples and natural images. The choice of the best partition is done by minimization of a numbering free criterion. The performance are assessed by metrics well-known in statistics but unused in image segmentation. Inférence bayésienne Monte Carlo séquentiel Bayésien non paramétrique Processus de Dirichlet hiérarchique Champ de Potts Algorithme de Swendsen-Wang Segmentation Image Bayesian inference Markov chain Monte Carlo Sequential Monte Carlo Non parametric Bayesian Hierarchical Dirichlet process Potts field Swendsen-Wang algorithm Segmentation Image
15	Approche stochastique de l'analyse du « residual moveout » pour la quantification de l'incertitude dans l'imagerie sismique / A stochastic approach to uncertainty quantification in residual moveout analysis Tamatoro, Johng-Ay 09 April 2014 (has links) Le principale objectif de l'imagerie sismique pétrolière telle qu'elle est réalisée de nos jours est de fournir une image représentative des quelques premiers kilomètres du sous-sol. Cette image permettra la localisation des structures géologiques formant les réservoirs où sont piégées les ressources en hydrocarbures. Pour pouvoir caractériser ces réservoirs et permettre la production des hydrocarbures, le géophysicien utilise la migration-profondeur qui est un outil d'imagerie sismique qui sert à convertir des données-temps enregistrées lors des campagnes d'acquisition sismique en des images-profondeur qui seront exploitées par l'ingénieur-réservoir avec l'aide de l'interprète sismique et du géologue. Lors de la migration profondeur, les évènements sismiques (réflecteurs,…) sont replacés à leurs positions spatiales correctes. Une migration-profondeur pertinente requiert une évaluation précise modèle de vitesse. La précision du modèle de vitesse utilisé pour une migration est jugée au travers l'alignement horizontal des évènements présents sur les Common Image Gather (CIG). Les évènements non horizontaux (Residual Move Out) présents sur les CIG sont dus au ratio du modèle de vitesse de migration par la vitesse effective du milieu. L'analyse du Residual Move Out (RMO) a pour but d'évaluer ce ratio pour juger de la pertinence du modèle de vitesse et permettre sa mise à jour. Les CIG qui servent de données pour l'analyse du RMO sont solutions de problèmes inverses mal posés, et sont corrompues par du bruit. Une analyse de l'incertitude s'avère nécessaire pour améliorer l'évaluation des résultats obtenus. Le manque d'outils d'analyse de l'incertitude dans l'analyse du RMO en fait sa faiblesse. L'analyse et la quantification de l'incertitude pourrait aider à la prise de décisions qui auront des impacts socio-économiques importantes. Ce travail de thèse a pour but de contribuer à l'analyse et à la quantification de l'incertitude dans l'analyse des paramètres calculés pendant le traitement des données sismiques et particulièrement dans l'analyse du RMO. Pour atteindre ces objectifs plusieurs étapes ont été nécessaires. Elles sont entre autres :- L’appropriation des différents concepts géophysiques nécessaires à la compréhension du problème (organisation des données de sismique réflexion, outils mathématiques et méthodologiques utilisés);- Présentations des méthodes et outils pour l'analyse classique du RMO;- Interprétation statistique de l’analyse classique;- Proposition d’une approche stochastique;Cette approche stochastique consiste en un modèle statistique hiérarchique dont les paramètres sont :- la variance traduisant le niveau de bruit dans les données estimée par une méthode basée sur les ondelettes, - une fonction qui traduit la cohérence des amplitudes le long des évènements estimée par des méthodes de lissages de données,- le ratio qui est considéré comme une variable aléatoire et non comme un paramètre fixe inconnue comme c'est le cas dans l'approche classique de l'analyse du RMO. Il est estimé par des méthodes de simulations de Monte Carlo par Chaîne de Markov.L'approche proposée dans cette thèse permet d'obtenir autant de cartes de valeurs du paramètre qu'on le désire par le biais des quantiles. La méthodologie proposée est validée par l'application à des données synthétiques et à des données réelles. Une étude de sensibilité de l'estimation du paramètre a été réalisée. L'utilisation de l'incertitude de ce paramètre pour quantifier l'incertitude des positions spatiales des réflecteurs est présentée dans ce travail de thèse. / The main goal of the seismic imaging for oil exploration and production as it is done nowadays is to provide an image of the first kilometers of the subsurface to allow the localization and an accurate estimation of hydrocarbon resources. The reservoirs where these hydrocarbons are trapped are structures which have a more or less complex geology. To characterize these reservoirs and allow the production of hydrocarbons, the geophysicist uses the depth migration which is a seismic imaging tool which serves to convert time data recorded during seismic surveys into depth images which will be exploited by the reservoir engineer with the help of the seismic interpreter and the geologist. During the depth migration, seismic events (reflectors, diffractions, faults …) are moved to their correct locations in space. Relevant depth migration requires an accurate knowledge of vertical and horizontal seismic velocity variations (velocity model). Usually the so-called Common-Image-Gathers (CIGs) serve as a tool to verify correctness of the velocity model. Often the CIGs are computed in the surface offset (distance between shot point and receiver) domain and their flatness serve as criteria of the velocity model correctness. Residual moveout (RMO) of the events on CIGs due to the ratio of migration velocity model and effective velocity model indicates incorrectness of the velocity model and is used for the velocity model updating. The post-stacked images forming the CIGs which are used as data for the RMO analysis are the results of an inverse problem and are corrupt by noises. An uncertainty analysis is necessary to improve evaluation of the results. Dealing with the uncertainty is a major issue, which supposes to help in decisions that have important social and commercial implications. The goal of this thesis is to contribute to the uncertainty analysis and its quantification in the analysis of various parameters computed during the seismic processing and particularly in RMO analysis. To reach these goals several stages were necessary. We began by appropriating the various geophysical concepts necessary for the understanding of:- the organization of the seismic data ;- the various processing ;- the various mathematical and methodological tools which are used (chapters 2 and 3). In the chapter 4, we present different tools used for the conventional RMO analysis. In the fifth one, we give a statistical interpretation of the conventional RMO analysis and we propose a stochastic approach of this analysis. This approach consists in hierarchical statistical model where the parameters are: - the variance which express the noise level in the data ;- a functional parameter which express coherency of the amplitudes along events ; - the ratio which is assume to be a random variable and not an unknown fixed parameter as it is the case in conventional approach. The adjustment of data to the model done by using smoothing methods of data, combined with the using of the wavelets for the estimation of allow to compute the posterior distribution of given the data by the empirical Bayes methods. An estimation of the parameter is obtained by using Markov Chain Monte Carlo simulations of its posterior distribution. The various quantiles of these simulations provide different estimations of . The proposed methodology is validated in the sixth chapter by its application on synthetic data and real data. A sensitivity analysis of the estimation of the parameter was done. The using of the uncertainty of this parameter to quantify the uncertainty of the spatial positions of reflectors is presented in this thesis. Approche Bayésienne Approche stochastique Loi de probabilité Loi a posteriori Monte-Carlo par chaînes de Markov Metropolis - Hastings Analyse du « Residual moveout » Analyse de l’incertitude Semblance Données sismiques AVO Bayesian approach Stochastic approach Markov chain Monte Carlo Metropolis - Hastings Probability density function A posteriori distribution Residual Moveout Analysis Uncertainty analysis Semblance Seismic data AVO Common Image Gathers
16	Modèles bayésiens pour l’identification de représentations antiparcimonieuses et l’analyse en composantes principales bayésienne non paramétrique / Bayesian methods for anti-sparse coding and non parametric principal component analysis Elvira, Clément 10 November 2017 (has links) Cette thèse étudie deux modèles paramétriques et non paramétriques pour le changement de représentation. L'objectif des deux modèles diffère. Le premier cherche une représentation en plus grande dimension pour gagner en robustesse. L'objectif est de répartir uniformément l’information d’un signal sur toutes les composantes de sa représentation en plus grande dimension. La recherche d'un tel code s'exprime comme un problème inverse impliquant une régularisation de type norme infinie. Nous proposons une formulation bayésienne du problème impliquant une nouvelle loi de probabilité baptisée démocratique, qui pénalise les fortes amplitudes. Deux algorithmes MCMC proximaux sont présentés pour approcher des estimateurs bayésiens. La méthode non supervisée présentée est appelée BAC-1. Des expériences numériques illustrent les performances de l’approche pour la réduction de facteur de crête. Le second modèle identifie un sous-espace pertinent de dimension réduite à des fins de modélisation. Mais les méthodes probabilistes proposées nécessitent généralement de fixer à l'avance la dimension du sous-espace. Ce travail introduit BNP-PCA, une version bayésienne non paramétrique de l'analyse en composantes principales. La méthode couple une loi uniforme sur les bases orthonormales à un a priori non paramétrique de type buffet indien pour favoriser une utilisation parcimonieuse des composantes principales et aucun réglage n'est nécessaire. L'inférence est réalisée à l'aide des méthodes MCMC. L'estimation de la dimension du sous-espace et le comportement numérique de BNP-PCA sont étudiés. Nous montrons la flexibilité de BNP-PCA sur deux applications / This thesis proposes Bayesian parametric and nonparametric models for signal representation. The first model infers a higher dimensional representation of a signal for sake of robustness by enforcing the information to be spread uniformly. These so called anti-sparse representations are obtained by solving a linear inverse problem with an infinite-norm penalty. We propose in this thesis a Bayesian formulation of anti-sparse coding involving a new probability distribution, referred to as the democratic prior. A Gibbs and two proximal samplers are proposed to approximate Bayesian estimators. The algorithm is called BAC-1. Simulations on synthetic data illustrate the performances of the two proposed samplers and the results are compared with state-of-the art methods. The second model identifies a lower dimensional representation of a signal for modelisation and model selection. Principal component analysis is very popular to perform dimension reduction. The selection of the number of significant components is essential but often based on some practical heuristics depending on the application. Few works have proposed a probabilistic approach to infer the number of significant components. We propose a Bayesian nonparametric principal component analysis called BNP-PCA. The proposed model involves an Indian buffet process to promote a parsimonious use of principal components, which is assigned a prior distribution defined on the manifold of orthonormal basis. Inference is done using MCMC methods. The estimators of the latent dimension are theoretically and empirically studied. The relevance of the approach is assessed on two applications Antiparcimonie Loi démocratique Méthodes proximales Bayésien non paramétrique Variété de Stiefel Processus du buffet indien Estimation de sous-espaces Anti-sparse representation Democratic distribution Monte Carlo Markov chains methods Proximal methods Bayesian nonparametrics Stiefel manifold Indian buffet process Subpace estimation
17	Improving sampling, optimization and feature extraction in Boltzmann machines Desjardins, Guillaume 12 1900 (has links) L’apprentissage supervisé de réseaux hiérarchiques à grande échelle connaît présentement un succès fulgurant. Malgré cette effervescence, l’apprentissage non-supervisé représente toujours, selon plusieurs chercheurs, un élément clé de l’Intelligence Artificielle, où les agents doivent apprendre à partir d’un nombre potentiellement limité de données. Cette thèse s’inscrit dans cette pensée et aborde divers sujets de recherche liés au problème d’estimation de densité par l’entremise des machines de Boltzmann (BM), modèles graphiques probabilistes au coeur de l’apprentissage profond. Nos contributions touchent les domaines de l’échantillonnage, l’estimation de fonctions de partition, l’optimisation ainsi que l’apprentissage de représentations invariantes. Cette thèse débute par l’exposition d’un nouvel algorithme d'échantillonnage adaptatif, qui ajuste (de fa ̧con automatique) la température des chaînes de Markov sous simulation, afin de maintenir une vitesse de convergence élevée tout au long de l’apprentissage. Lorsqu’utilisé dans le contexte de l’apprentissage par maximum de vraisemblance stochastique (SML), notre algorithme engendre une robustesse accrue face à la sélection du taux d’apprentissage, ainsi qu’une meilleure vitesse de convergence. Nos résultats sont présent ́es dans le domaine des BMs, mais la méthode est générale et applicable à l’apprentissage de tout modèle probabiliste exploitant l’échantillonnage par chaînes de Markov. Tandis que le gradient du maximum de vraisemblance peut-être approximé par échantillonnage, l’évaluation de la log-vraisemblance nécessite un estimé de la fonction de partition. Contrairement aux approches traditionnelles qui considèrent un modèle donné comme une boîte noire, nous proposons plutôt d’exploiter la dynamique de l’apprentissage en estimant les changements successifs de log-partition encourus à chaque mise à jour des paramètres. Le problème d’estimation est reformulé comme un problème d’inférence similaire au filtre de Kalman, mais sur un graphe bi-dimensionnel, où les dimensions correspondent aux axes du temps et au paramètre de température. Sur le thème de l’optimisation, nous présentons également un algorithme permettant d’appliquer, de manière efficace, le gradient naturel à des machines de Boltzmann comportant des milliers d’unités. Jusqu’à présent, son adoption était limitée par son haut coût computationel ainsi que sa demande en mémoire. Notre algorithme, Metric-Free Natural Gradient (MFNG), permet d’éviter le calcul explicite de la matrice d’information de Fisher (et son inverse) en exploitant un solveur linéaire combiné à un produit matrice-vecteur efficace. L’algorithme est prometteur: en terme du nombre d’évaluations de fonctions, MFNG converge plus rapidement que SML. Son implémentation demeure malheureusement inefficace en temps de calcul. Ces travaux explorent également les mécanismes sous-jacents à l’apprentissage de représentations invariantes. À cette fin, nous utilisons la famille de machines de Boltzmann restreintes “spike & slab” (ssRBM), que nous modifions afin de pouvoir modéliser des distributions binaires et parcimonieuses. Les variables latentes binaires de la ssRBM peuvent être rendues invariantes à un sous-espace vectoriel, en associant à chacune d’elles, un vecteur de variables latentes continues (dénommées “slabs”). Ceci se traduit par une invariance accrue au niveau de la représentation et un meilleur taux de classification lorsque peu de données étiquetées sont disponibles. Nous terminons cette thèse sur un sujet ambitieux: l’apprentissage de représentations pouvant séparer les facteurs de variations présents dans le signal d’entrée. Nous proposons une solution à base de ssRBM bilinéaire (avec deux groupes de facteurs latents) et formulons le problème comme l’un de “pooling” dans des sous-espaces vectoriels complémentaires. / Despite the current widescale success of deep learning in training large scale hierarchical models through supervised learning, unsupervised learning promises to play a crucial role towards solving general Artificial Intelligence, where agents are expected to learn with little to no supervision. The work presented in this thesis tackles the problem of unsupervised feature learning and density estimation, using a model family at the heart of the deep learning phenomenon: the Boltzmann Machine (BM). We present contributions in the areas of sampling, partition function estimation, optimization and the more general topic of invariant feature learning. With regards to sampling, we present a novel adaptive parallel tempering method which dynamically adjusts the temperatures under simulation to maintain good mixing in the presence of complex multi-modal distributions. When used in the context of stochastic maximum likelihood (SML) training, the improved ergodicity of our sampler translates to increased robustness to learning rates and faster per epoch convergence. Though our application is limited to BM, our method is general and is applicable to sampling from arbitrary probabilistic models using Markov Chain Monte Carlo (MCMC) techniques. While SML gradients can be estimated via sampling, computing data likelihoods requires an estimate of the partition function. Contrary to previous approaches which consider the model as a black box, we provide an efficient algorithm which instead tracks the change in the log partition function incurred by successive parameter updates. Our algorithm frames this estimation problem as one of filtering performed over a 2D lattice, with one dimension representing time and the other temperature. On the topic of optimization, our thesis presents a novel algorithm for applying the natural gradient to large scale Boltzmann Machines. Up until now, its application had been constrained by the computational and memory requirements of computing the Fisher Information Matrix (FIM), which is square in the number of parameters. The Metric-Free Natural Gradient algorithm (MFNG) avoids computing the FIM altogether by combining a linear solver with an efficient matrix-vector operation. The method shows promise in that the resulting updates yield faster per-epoch convergence, despite being slower in terms of wall clock time. Finally, we explore how invariant features can be learnt through modifications to the BM energy function. We study the problem in the context of the spike & slab Restricted Boltzmann Machine (ssRBM), which we extend to handle both binary and sparse input distributions. By associating each spike with several slab variables, latent variables can be made invariant to a rich, high dimensional subspace resulting in increased invariance in the learnt representation. When using the expected model posterior as input to a classifier, increased invariance translates to improved classification accuracy in the low-label data regime. We conclude by showing a connection between invariance and the more powerful concept of disentangling factors of variation. While invariance can be achieved by pooling over subspaces, disentangling can be achieved by learning multiple complementary views of the same subspace. In particular, we show how this can be achieved using third-order BMs featuring multiplicative interactions between pairs of random variables. Réseaux de neurones Apprentissage profond Apprentissage non-supervisé Apprentissage de représentations Machines de Boltzmann Échantillonnage Gradient naturel Modèles bilinéaires Fonction de partition Neural networks Deep learning Unsupervised learning Feature learning Boltzmann machines Markov chain Monte Carlo Parallel tempering Natural gradient Bilinear models Partition function
18	Bayesian fusion of multi-band images : A powerful tool for super-resolution / Fusion Bayésienne des multi-bandes Images : Un outil puissant pour la Super-résolution Wei, Qi 24 September 2015 (has links) L’imagerie hyperspectrale (HS) consiste à acquérir une même scène dans plusieurs centaines de bandes spectrales contiguës (dimensions d'un cube de données), ce qui a conduit à trois types d'applications pertinentes, telles que la détection de cibles, la classification et le démélange spectral. Cependant, tandis que les capteurs hyperspectraux fournissent une information spectrale abondante, leur résolution spatiale est généralement plus limitée. Ainsi, la fusion d’une image HS avec d'autres images à haute résolution de la même scène, telles que les images multispectrales (MS) ou panchromatiques (PAN) est un problème intéressant. Le problème de fusionner une image HS de haute résolution spectrale mais de résolution spatiale limitée avec une image auxiliaire de haute résolution spatiale mais de résolution spectrale plus limitée (parfois qualifiée de fusion multi-résolution) a été exploré depuis de nombreuses années. D'un point de vue applicatif, ce problème est également important et est motivé par ceratins projets, comme par exemple le project Japonais HISIU, qui vise à fusionner des images MS et HS recalées acquises pour la même scène avec les mêmes conditions. Les techniques de fusion bayésienne permettent une interprétation intuitive du processus de fusion via la définition de la loi a posteriori de l’image à estimer (qui est de hautes résolutions spatiale et spectrale). Puisque le problème de fusion est généralement mal posé, l’inférence bayésienne offre un moyen pratique pour régulariser le problème en définissant une loi a priori adaptée à la scène d'intérêt. Les différents chapitres de cette thèse sont résumés ci-dessous. Le introduction présente le modèle général de fusion et les hypothèses statistiques utilisées pour les images multi-bandes observées, c’est-à-dire les images HS, MS ou PAN. Les images observées sont des versions dégradées de l'image de référence (à hautes résolutions spatiale et spectrale) qui résultent par exemple d’un flou spatial et spectral et/ou d’un sous-échantillonnage liés aux caractéristiques des capteurs. Les propriétés statistiques des mesures sont alors obtenues directement à partir d’un modèle linéaire traduisant ces dégradations et des propriétés statistiques du bruit. Le chapitre 1 s’intéresse à une technique de fusion bayésienne pour les images multi-bandes de télédétection, à savoir pour les images HS, MS et PAN. Tout d'abord, le problème de fusion est formulé dans un cadre d'estimation bayésienne. Une loi a priori Gaussienne exploitant la géométrie du problème est définie et un algorithme d’estimation Bayésienne permettant d’estimer l’image de référence est étudié. Pour obtenir des estimateurs Bayésiens liés à la distribution postérieure résultant, deux algorithmes basés sur échantillonnage de Monte Carlo et l'optimisation stratégie ont été développés. Le chapitre 2 propose une approche variationnelle pour la fusion d’images HS et MS. Le problème de fusion est formulé comme un problème inverse dont la solution est l'image d’intérêt qui est supposée vivre dans un espace de dimension résuite. Un terme de régularisation imposant des contraintes de parcimonie est défini avec soin. Ce terme traduit le fait que les patches de l'image cible sont bien représentés par une combinaison linéaire d’atomes appartenant à un dictionnaire approprié. Les atomes de ce dictionnaire et le support des coefficients des décompositions des patches sur ces atomes sont appris à l’aide de l’image de haute résolution spatiale. Puis, conditionnellement à ces dictionnaires et à ces supports, le problème de fusion est résolu à l’aide d’un algorithme d’optimisation alternée (utilisant l’algorithme ADMM) qui estime de manière itérative l’image d’intérêt et les coefficients de décomposition. / Hyperspectral (HS) imaging, which consists of acquiring a same scene in several hundreds of contiguous spectral bands (a three dimensional data cube), has opened a new range of relevant applications, such as target detection [MS02], classification [C.-03] and spectral unmixing [BDPD+12]. However, while HS sensors provide abundant spectral information, their spatial resolution is generally more limited. Thus, fusing the HS image with other highly resolved images of the same scene, such as multispectral (MS) or panchromatic (PAN) images is an interesting problem. The problem of fusing a high spectral and low spatial resolution image with an auxiliary image of higher spatial but lower spectral resolution, also known as multi-resolution image fusion, has been explored for many years [AMV+11]. From an application point of view, this problem is also important as motivated by recent national programs, e.g., the Japanese next-generation space-borne hyperspectral image suite (HISUI), which fuses co-registered MS and HS images acquired over the same scene under the same conditions [YI13]. Bayesian fusion allows for an intuitive interpretation of the fusion process via the posterior distribution. Since the fusion problem is usually ill-posed, the Bayesian methodology offers a convenient way to regularize the problem by defining appropriate prior distribution for the scene of interest. The aim of this thesis is to study new multi-band image fusion algorithms to enhance the resolution of hyperspectral image. In the first chapter, a hierarchical Bayesian framework is proposed for multi-band image fusion by incorporating forward model, statistical assumptions and Gaussian prior for the target image to be restored. To derive Bayesian estimators associated with the resulting posterior distribution, two algorithms based on Monte Carlo sampling and optimization strategy have been developed. In the second chapter, a sparse regularization using dictionaries learned from the observed images is introduced as an alternative of the naive Gaussian prior proposed in Chapter 1. instead of Gaussian prior is introduced to regularize the ill-posed problem. Identifying the supports jointly with the dictionaries circumvented the difficulty inherent to sparse coding. To minimize the target function, an alternate optimization algorithm has been designed, which accelerates the fusion process magnificently comparing with the simulation-based method. In the third chapter, by exploiting intrinsic properties of the blurring and downsampling matrices, a much more efficient fusion method is proposed thanks to a closed-form solution for the Sylvester matrix equation associated with maximizing the likelihood. The proposed solution can be embedded into an alternating direction method of multipliers or a block coordinate descent method to incorporate different priors or hyper-priors for the fusion problem, allowing for Bayesian estimators. In the last chapter, a joint multi-band image fusion and unmixing scheme is proposed by combining the well admitted linear spectral mixture model and the forward model. The joint fusion and unmixing problem is solved in an alternating optimization framework, mainly consisting of solving a Sylvester equation and projecting onto a simplex resulting from the non-negativity and sum-to-one constraints. The simulation results conducted on synthetic and semi-synthetic images illustrate the advantages of the developed Bayesian estimators, both qualitatively and quantitatively. Imagerie hyperspectrale Fusion d'images Démélange spectral Problèmes inverses Inférence Bayésienne Optimisation Représentation parcimonieuse Equation de Sylvester Hyperspectral image Image fusion Spectral unmixing Inverse problems Bayesian inference Markov Chain Monte Carlo methods Optimization Sparse representation Sylvester equation
19	Sélection de modèles robuste : régression linéaire et algorithme à sauts réversibles Gagnon, Philippe 10 1900 (has links) No description available. analyse en composantes principales inférence bayésienne robustesse valeurs aberrantes Bayesian inference Markov chain Monte Carlo methods Outliers Principal component analysis Random walk Metropolis algorithm Robustness Super heavy-tailed distributions
20	Détection et classification de cibles multispectrales dans l'infrarouge / Detection and classiﬁcation of multispectral infrared targets Maire, Florian 14 February 2014 (has links) Les dispositifs de protection de sites sensibles doivent permettre de détecter des menaces potentielles suffisamment à l’avance pour pouvoir mettre en place une stratégie de défense. Dans cette optique, les méthodes de détection et de reconnaissance d’aéronefs se basant sur des images infrarouge multispectrales doivent être adaptées à des images faiblement résolues et être robustes à la variabilité spectrale et spatiale des cibles. Nous mettons au point dans cette thèse, des méthodes statistiques de détection et de reconnaissance d’aéronefs satisfaisant ces contraintes. Tout d’abord, nous spéciﬁons une méthode de détection d’anomalies pour des images multispectrales, combinant un calcul de vraisemblance spectrale avec une étude sur les ensembles de niveaux de la transformée de Mahalanobis de l’image. Cette méthode ne nécessite aucune information a priori sur les aéronefs et nous permet d’identiﬁer les images contenant des cibles. Ces images sont ensuite considérées comme des réalisations d’un modèle statistique d’observations ﬂuctuant spectralement et spatialement autour de formes caractéristiques inconnues. L’estimation des paramètres de ce modèle est réalisée par une nouvelle méthodologie d’apprentissage séquentiel non supervisé pour des modèles à données manquantes que nous avons développée. La mise au point de ce modèle nous permet in ﬁne de proposer une méthode de reconnaissance de cibles basée sur l’estimateur du maximum de vraisemblance a posteriori. Les résultats encourageants, tant en détection qu’en classiﬁcation, justiﬁent l’intérêt du développement de dispositifs permettant l’acquisition d’images multispectrales. Ces méthodes nous ont également permis d’identiﬁer les regroupements de bandes spectrales optimales pour la détection et la reconnaissance d’aéronefs faiblement résolus en infrarouge / Surveillance systems should be able to detect potential threats far ahead in order to put forward a defence strategy. In this context, detection and recognition methods making use of multispectral infrared images should cope with low resolution signals and handle both spectral and spatial variability of the targets. We introduce in this PhD thesis a novel statistical methodology to perform aircraft detection and classiﬁcation which take into account these constraints. We ﬁrst propose an anomaly detection method designed for multispectral images, which combines a spectral likelihood measure and a level set study of the image Mahalanobis transform. This technique allows to identify images which feature an anomaly without any prior knowledge on the target. In a second time, these images are used as realizations of a statistical model in which the observations are described as random spectral and spatial deformation of prototype shapes. The model inference, and in particular the prototype shape estimation, is achieved through a novel unsupervised sequential learning algorithm designed for missing data models. This model allows to propose a classiﬁcation algorithm based on maximum a posteriori probability Promising results in detection as well as in classiﬁcation, justify the growing interest surrounding the development of multispectral imaging devices. These methods have also allowed us to identify the optimal infrared spectral band regroupments regarding the low resolution aircraft IRS detection and classiﬁcation Reconnaissance de forme Modèles à prototype déformable Apprentissage séquentiel Algorithmes expectation-maximization Détection d'anomalies Signature infrarouge Imagerie multispectrale Shape recognition Deformable template models Sequential inference Markov chain Monte Carlo methods Expectation-maximization algorithm Anomaly detection Infrared signature Multispectral imagery

Search results