131

Few applications of numerical optimization in inference and learning

Kannan, Hariprasad 28 September 2018 (has links)
Numerical optimization and machine learning have had a fruitful relationship, from the perspective of both theory and application. In this thesis, we present an application-oriented take on some inference and learning problems. Linear programming relaxations are central to maximum a posteriori (MAP) inference in discrete Markov Random Fields (MRFs). In particular, inference in higher-order MRFs presents challenges in terms of efficiency, scalability and solution quality. In this thesis, we study the benefit of using Newton methods to efficiently optimize the Lagrangian dual of a smooth version of the problem. We investigate their ability to achieve superior convergence behavior and to better handle the ill-conditioned nature of the formulation, as compared to first order methods. We show that it is indeed possible to obtain an efficient trust region Newton method, which uses the true Hessian, for a broad range of MAP inference problems.
Given the specific opportunities and challenges in the MAP inference formulation, we present details concerning (i) efficient computation of the Hessian and Hessian-vector products, (ii) a strategy to damp the Newton step that aids efficient and correct optimization, and (iii) steps to improve the efficiency of the conjugate gradient method through a truncation rule and a pre-conditioner. We also demonstrate through numerical experiments how a quasi-Newton method can be a good choice for MAP inference in large graphs. MAP inference based on a smooth formulation could greatly benefit from efficient sum-product computation, which is required for computing the gradient and the Hessian. We show a way to perform sum-product computation for trees with sparse clique potentials; this result could readily be used by other algorithms as well. We show results demonstrating the usefulness of our approach on higher-order MRFs. Then, we discuss potential research topics regarding tightening the LP relaxation and parallel algorithms for MAP inference. Unsupervised learning is an important topic in machine learning, and it could potentially help high-dimensional problems like inference in graphical models. We show a general framework for unsupervised learning based on optimal transport and sparse regularization. Optimal transport presents interesting challenges from an optimization point of view with its simplex constraints on the rows and columns of the transport plan. We show one way to formulate efficient optimization problems inspired by optimal transport, by imposing only one set of the simplex constraints and by imposing structure on the transport plan through sparse regularization. We show how unsupervised learning algorithms like exemplar clustering, center-based clustering and kernel PCA fit into this framework based on different forms of regularization. In particular, we demonstrate a promising approach to address the pre-image problem in kernel PCA.
Several methods have been proposed over the years; they generally assume certain types of kernels, have too many hyper-parameters, or make restrictive approximations of the underlying geometry. We present a more general method, with only one hyper-parameter to tune and with some interesting geometric properties. From an optimization point of view, we show how to compute the gradient of a smooth version of the Schatten p-norm and how it can be used within a majorization-minimization scheme. Finally, we present results from our various experiments.
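The abstract's claim that Newton methods better handle ill-conditioning than first-order methods can be illustrated on a toy quadratic. This is only a sketch under simplified assumptions: the thesis works with a trust-region scheme on the smoothed MAP dual, not with the toy matrix and step sizes chosen here for illustration.

```python
import numpy as np

# Ill-conditioned quadratic: f(x) = 0.5 * x^T A x, condition number 1e4.
A = np.diag([1.0, 1e4])

def grad(x):
    return A @ x

def newton(x, steps):
    H_inv = np.linalg.inv(A)          # Hessian of a quadratic is constant
    for _ in range(steps):
        x = x - H_inv @ grad(x)
    return x

def gradient_descent(x, steps, lr=1.0 / 1e4):   # lr <= 1/L for stability
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

x0 = np.array([1.0, 1.0])
xn = newton(x0.copy(), 1)             # one Newton step solves a quadratic exactly
xg = gradient_descent(x0.copy(), 100) # first-order method crawls along the flat direction
print(np.linalg.norm(xn), np.linalg.norm(xg))
```

A single Newton step lands on the minimizer, while gradient descent, forced to use a tiny step size by the large curvature, barely moves in the flat direction even after 100 iterations.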
132

Nonconvex Alternating Direction Optimization for Graphs: Inference and Learning

Lê-Huu, Dien Khuê 04 February 2019 (has links)
This thesis presents our contributions to inference and learning of graph-based models in computer vision. First, we propose a novel class of decomposition algorithms for solving graph and hypergraph matching based on the nonconvex alternating direction method of multipliers (ADMM). These algorithms are computationally efficient and highly parallelizable. Furthermore, they are also very general and can be applied to arbitrary energy functions as well as arbitrary assignment constraints. Experiments show that they outperform existing state-of-the-art methods on popular benchmarks. Second, we propose a nonconvex continuous relaxation of maximum a posteriori (MAP) inference in discrete Markov random fields (MRFs). We show that this relaxation is tight for arbitrary MRFs. This allows us to apply continuous optimization techniques to solve the original discrete problem without loss in accuracy after rounding. We study two popular gradient-based methods, and further propose a more effective solution using nonconvex ADMM. Experiments on different real-world problems demonstrate that the proposed ADMM compares favorably with state-of-the-art algorithms in different settings. Finally, we propose a method for learning the parameters of these graph-based models from training data, based on nonconvex ADMM. This method consists of viewing ADMM iterations as a sequence of differentiable operations, which allows efficient computation of the gradient of the training loss with respect to the model parameters, enabling efficient training using stochastic gradient descent. In the end we obtain a unified framework for inference and learning with nonconvex ADMM. Thanks to its flexibility, this framework also allows training jointly end-to-end a graph-based model with another model, such as a neural network, thus combining the strengths of both. We present experiments on a popular semantic segmentation dataset, demonstrating the effectiveness of our method.
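A minimal sketch of a nonconvex ADMM splitting of the kind described above, applied to a toy problem: maximizing a quadratic over the probability simplex. The matrix, penalty parameter, and iteration count below are illustrative choices, not the thesis's matching formulation; the point is the three-step pattern (smooth subproblem, projection, dual update).

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto {x : x >= 0, sum(x) = 1} (sort-based algorithm)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > (css - 1))[0][-1]
    theta = (css[rho] - 1) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def admm_toy(M, rho=10.0, iters=300):
    """Nonconvex ADMM sketch for min -0.5 x^T M x s.t. x in simplex (split x = z)."""
    d = M.shape[0]
    z = np.full(d, 1.0 / d)
    u = np.zeros(d)
    A = rho * np.eye(d) - M           # x-update system; requires rho > lambda_max(M)
    for _ in range(iters):
        x = np.linalg.solve(A, rho * (z - u))   # x-update: smooth (nonconvex) subproblem
        z = project_simplex(x + u)              # z-update: projection onto constraints
        u = u + x - z                           # dual (scaled multiplier) update
    return z

z_star = admm_toy(np.diag([1.0, 2.0, 3.0]))
print(z_star)
```

For this diagonal objective the iterates concentrate on the simplex vertex with the largest diagonal entry, the global optimum here, while every step stays cheap and, as the abstract notes for the full algorithm, embarrassingly parallel across subproblems.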
133

An Experimental Evaluation of Probabilistic Deep Networks for Real-time Traffic Scene Representation using Graphical Processing Units

El-Shaer, Mennat Allah 03 September 2019 (has links)
No description available.
134

Probabilistic Graphical Models: an Application in Synchronization and Localization

Goodarzi, Meysam 16 June 2023 (has links)
Mobile User (MU) localization in ultra-dense networks often requires, on one hand, the Access Points (APs) to be synchronized among each other and, on the other hand, MU-AP synchronization. In this work, we firstly address the former, which eventually provides a basis for the latter, i.e., for joint MU synchronization and localization (sync&loc). In particular, this work first focuses on tackling the time synchronization problem in 5G networks by adopting a hybrid Bayesian approach for clock offset and skew estimation. Specifically, we investigate and demonstrate the substantial benefit of Belief Propagation (BP) running on Factor Graphs (FGs) in achieving precise network-wide synchronization. Moreover, we take advantage of Bayesian Recursive Filtering (BRF) to mitigate the time-stamping error in pairwise synchronization. Finally, we reveal the merit of hybrid synchronization by dividing a large-scale network into common and local synchronization domains, thereby being able to apply the most suitable synchronization algorithm (BP- or BRF-based) to each domain. Secondly, we propose a Deep Neural Network (DNN)-assisted Particle Filter-based (DePF) approach to address the joint MU sync&loc problem. In particular, DePF deploys an asymmetric time-stamp exchange mechanism between the MUs and the APs, which provides information about the MUs' clock offset, skew, and AP-MU distance. In addition, to estimate the Angle of Arrival (AoA) of the received synchronization packet, DePF draws on the Multiple Signal Classification (MUSIC) algorithm, fed by the Channel Impulse Response (CIR) experienced by the sync packets. The CIR is also leveraged to determine the link condition, i.e., Line-of-Sight (LoS) or non-LoS (NLoS). Finally, DePF capitalizes on particle Gaussian mixtures, which allow for a hybrid particle-based and parametric BRF fusion of the aforementioned pieces of information to jointly estimate the position and clock parameters of the MUs.
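The Bayesian recursive filtering step for clock offset and skew can be sketched as a standard two-state Kalman filter over the state [offset, skew]. This is a simplified stand-in for the thesis's hybrid BP/BRF machinery; all constants below (drift rate, noise levels, number of time stamps) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, n = 1.0, 300
true_skew, true_offset = 2e-3, 0.5   # illustrative clock drift and initial offset

# Simulate noisy offset measurements of a drifting clock.
times = np.arange(n) * dt
offsets = true_offset + true_skew * times
meas = offsets + rng.normal(0.0, 0.01, size=n)

# Kalman filter: state x = [offset, skew], offset grows by skew*dt each step.
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
H = np.array([[1.0, 0.0]])              # we only observe the offset
Q = 1e-10 * np.eye(2)                   # tiny process noise: nearly constant skew
R = np.array([[1e-4]])                  # measurement noise variance (0.01^2)
x = np.zeros(2)
P = np.eye(2)
for z in meas:
    x = F @ x                           # predict
    P = F @ P @ F.T + Q
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ (np.array([z]) - H @ x) # update with the new time stamp
    P = (np.eye(2) - K @ H) @ P

print("estimated skew:", x[1])
```

Despite 10 ms-level time-stamping noise on each measurement, the recursive filter pins down the skew to a small fraction of its true value, which is the error-mitigation role BRF plays in the pairwise synchronization described above.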
135

A Machine Learning Approach to the analysis of mortality in patients with cardiovascular diseases

Aldamiz Orcajo, Juan Miguel January 2021 (has links)
Cardiovascular diseases (CVDs) are the main cause of mortality worldwide, accounting for a third of deaths globally. Consequently, early detection and identification of the underlying factors of these pathologies can play a critical role in successful treatment. Many researchers have applied machine learning (ML) to mortality risk estimation in CVDs. However, this is difficult due to their complex and multifactorial nature and the lack of large, unbiased data collections. This thesis presents statistical analysis results and a binary classification model for CVD mortality prediction based on the ESCARVAL-RISK study, a large cohort study (54,678 patients) running from January 2008 until December 2012. The study data exhibit highly imbalanced classes, which can lead to classification models with low specificity and sensitivity. This work proposes several ways to balance the classes, including hyperparameter optimization and sampling techniques, tested over 15 different classification algorithms. While the specificity is low, the proposed approach using SHapley Additive exPlanations (SHAP) identifies factors that may be optimal targets for intensified preventive interventions.
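One of the standard sampling techniques for imbalanced classes, random oversampling of the minority class, can be sketched in a few lines. This is a generic illustration, not the thesis's actual pipeline; the toy labels below are made up.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples (with replacement) until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):       # top up each class to the majority count
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y

# Toy example: 8 survivors (0) vs 2 deaths (1), mimicking heavy class imbalance.
xs, ys = random_oversample(list(range(10)), [0] * 8 + [1] * 2)
print(Counter(ys))
```

Balancing must be applied to the training split only; oversampling before the train/test split would leak duplicated minority samples into the evaluation set and inflate sensitivity estimates.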
136

Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social Systems

Anantharam, Pramod 31 May 2016 (has links)
No description available.
137

Probabilistic models for quality control in environmental sensor networks

Dereszynski, Ethan W. 04 June 2012 (has links)
Networks of distributed, remote sensors are providing ecological scientists with a view of our environment that is unprecedented in detail. However, these networks are subject to harsh conditions, which lead to malfunctions in individual sensors and failures in network communications. This behavior manifests as corrupt or missing measurements in the data. Consequently, before the data can be used in ecological models, future experiments, or even policy decisions, it must be quality controlled (QC'd) to flag affected measurements and impute corrected values. This dissertation describes a probabilistic modeling approach for real-time automated QC that exploits the spatial and temporal correlations in the data to distinguish sensor failures from valid observations. The model adapts to a site by learning a Bayesian network structure that captures spatial relationships among sensors, and then extends this structure to a dynamic Bayesian network to incorporate temporal correlations. The final QC model contains both discrete and continuous variables, which makes inference intractable for large sensor networks. Consequently, we examine the performance of three approximate methods for inference in this probabilistic framework. Two of these algorithms represent contemporary approaches to inference in hybrid models, while the third is a greedy search-based method of our own design. We demonstrate the results of these algorithms on synthetic datasets and real environmental sensor data gathered from an ecological sensor network located in western Oregon. Our results suggest that we can improve performance over smaller networks that use exhaustive asynchronic inference by including additional sensors and applying approximate algorithms. / Graduation date: 2013
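A much-simplified stand-in for the temporal part of such a QC model flags measurements whose change from the previous reading is implausibly large relative to the sensor's normal dynamics. The dissertation learns a dynamic Bayesian network; this sketch only illustrates the flag-on-residual idea, with a made-up sinusoidal "temperature" series and an injected fault.

```python
import numpy as np

def qc_flags(series, k=4.0):
    """Flag points whose first difference exceeds k robust standard deviations."""
    diffs = np.diff(series)
    mad = np.median(np.abs(diffs - np.median(diffs)))  # robust scale estimate
    sigma = 1.4826 * mad                               # MAD -> std for Gaussian data
    flags = np.zeros(len(series), dtype=bool)
    flags[1:] = np.abs(diffs) > k * sigma              # flag implausible jumps
    return flags

# Smooth seasonal-looking signal with one corrupted reading at index 100.
t = np.linspace(0.0, 6.0, 200)
series = np.sin(t)
series[100] += 5.0          # simulated sensor fault (spike)
flags = qc_flags(series)
print(np.nonzero(flags)[0])
```

The robust (median-based) scale estimate keeps the fault itself from inflating the threshold; note the spike triggers two flags, one for the jump up and one for the jump back down, which is why real QC models reason over states rather than raw residuals.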
138

Probabilistic models in noisy environments : and their application to a visual prosthesis for the blind

Archambeau, Cédric 26 September 2005 (has links)
In recent years, probabilistic models have become fundamental techniques in machine learning. They are successfully applied in various engineering problems, such as robotics, biometrics, brain-computer interfaces or artificial vision, and will gain in importance in the near future. This work deals with the difficult, but common, situation where the data is either very noisy or scarce compared to the complexity of the process to model. We focus on latent variable models, which can be formalized as probabilistic graphical models and learned by the expectation-maximization algorithm or its variants (e.g., variational Bayes).

After having carefully studied a non-exhaustive list of multivariate kernel density estimators, we established that in most applications locally adaptive estimators should be preferred. Unfortunately, these methods are usually sensitive to outliers and often have too many parameters to set. Therefore, we focus on finite mixture models, which do not suffer from these drawbacks provided some structural modifications.

Two questions are central in this dissertation: (i) how to make mixture models robust to noise, i.e., deal efficiently with outliers, and (ii) how to exploit side-channel information, i.e., additional information intrinsic to the data. To tackle the first question, we extend the training algorithms of the popular Gaussian mixture models to Student-t mixture models. The Student-t distribution can be viewed as a heavy-tailed alternative to the Gaussian distribution, the robustness being tuned by an extra parameter, the degrees of freedom. Furthermore, we introduce a new variational Bayesian algorithm for learning Bayesian Student-t mixture models. This algorithm leads to very robust density estimation and clustering. To address the second question, we introduce manifold-constrained mixture models. This new technique exploits the information that the data lives on a manifold of lower dimension than the dimension of the feature space. Taking the implicit geometrical data arrangement into account results in better generalization on unseen data.

Finally, we show that the latent variable framework used for learning mixture models can be extended to construct probabilistic regularization networks, such as the Relevance Vector Machines. Subsequently, we make use of these methods in the context of an optic nerve visual prosthesis to restore partial vision to blind people whose optic nerve is still functional. Although visual sensations can be induced electrically in the blind's visual field, the coding scheme of the visual information along the visual pathways is poorly known. Therefore, we use probabilistic models to link the stimulation parameters to the features of the visual perceptions. Both black-box and grey-box models are considered. The grey-box models take advantage of the known neurophysiological information and are more instructive to medical doctors and psychologists.
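The robustness argument for Student-t mixtures rests on the heavier tails of the Student-t distribution: an outlier is far less "surprising" under a t component than under a Gaussian, so it pulls the fitted parameters much less. The effect is easy to quantify by comparing log-densities at an outlier (a standalone illustration using the textbook density formulas, not code from the thesis):

```python
import math

def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))

def student_t_logpdf(x, nu=3.0, mu=0.0, sigma=1.0):
    # Location-scale Student-t with nu degrees of freedom.
    z = (x - mu) / sigma
    c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
         - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    return c - (nu + 1) / 2 * math.log(1 + z * z / nu)

# At a 6-sigma outlier the Gaussian log-density collapses (~ -18.9),
# while the t with nu=3 stays moderate (~ -6.1).
print(gaussian_logpdf(6.0), student_t_logpdf(6.0))
```

As the degrees of freedom grow, the t density approaches the Gaussian, which is exactly why treating the degrees of freedom as a tunable parameter lets the mixture interpolate between robust and classical behavior.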
139

Data Mining Meets HCI: Making Sense of Large Graphs

Chau, Duen Horng 01 July 2012 (has links)
We have entered the age of big data. Massive datasets are now common in science, government and enterprises. Yet, making sense of these data remains a fundamental challenge. Where do we start our analysis? Where do we go next? How do we visualize our findings? We answer these questions by bridging Data Mining and Human-Computer Interaction (HCI) to create tools for making sense of graphs with billions of nodes and edges, focusing on: (1) Attention Routing: we introduce this idea, based on anomaly detection, that automatically draws people's attention to interesting areas of the graph to start their analyses. We present examples: Polonium unearths malware from 37 billion machine-file relationships; NetProbe fingers bad guys who commit auction fraud. (2) Mixed-Initiative Sensemaking: we present two examples that combine machine inference and visualization to help users locate next areas of interest: Apolo guides users to explore large graphs by learning from a few examples of user interest; Graphite finds interesting subgraphs based on only fuzzy descriptions drawn graphically. (3) Scaling Up: we show how to enable interactive analytics of large graphs by leveraging Hadoop, staging of operations, and approximate computation. This thesis contributes to data mining, HCI, and importantly their intersection, including: interactive systems and algorithms that scale; theories that unify graph mining approaches; and paradigms that overcome fundamental challenges in visual analytics. Our work is making an impact on academia and society: Polonium protects 120 million people worldwide from malware; NetProbe made headlines on CNN, WSJ and USA Today; Pegasus won an open-source software award; Apolo helps DARPA detect insider threats and prevent exfiltration. We hope our Big Data Mantra "Machine for Attention Routing, Human for Interaction" will inspire more innovations at the crossroad of data mining and HCI.
140

Learning with Sparsity: Structures, Optimization and Applications

Chen, Xi 01 July 2013 (has links)
The development of modern information technology has enabled collecting data of unprecedented size and complexity. Examples include web text data, microarray & proteomics data, and data from scientific domains (e.g., meteorology). To learn from these high-dimensional and complex data, traditional machine learning techniques often suffer from the curse of dimensionality and unaffordable computational cost. However, learning from large-scale high-dimensional data promises big payoffs in text mining, gene analysis, and numerous other consequential tasks. Recently developed sparse learning techniques provide us with a suite of tools for understanding and exploring high-dimensional data from many areas in science and engineering. By exploiting sparsity, we can learn a parsimonious and compact model which is more interpretable and computationally tractable at application time. When it is known that the underlying model is indeed sparse, sparse learning methods can provide us a more consistent model and much improved prediction performance. However, the existing methods are still insufficient for modeling complex or dynamic structures of the data, such as those evidenced in pathways of genomic data, gene regulatory networks, and synonyms in text data. This thesis develops structured sparse learning methods along with scalable optimization algorithms to explore and predict high-dimensional data with complex structures. In particular, we address three aspects of structured sparse learning: (1) efficient and scalable optimization methods with fast convergence guarantees for a wide spectrum of high-dimensional learning tasks, including single- or multi-task structured regression, canonical correlation analysis, and online sparse learning; (2) learning dynamic structures of different types of undirected graphical models, e.g., conditional Gaussian or conditional forest graphical models; (3) demonstrating the usefulness of the proposed methods in various applications, e.g., computational genomics and spatio-temporal climatological data. In addition, we design specialized sparse learning methods for text mining applications, including ranking and latent semantic analysis. In the last part of the thesis, we also present future directions for high-dimensional structured sparse learning from both computational and statistical aspects.
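A canonical building block of the sparse learning methods described above is the proximal-gradient (ISTA) solver for the lasso, which alternates a gradient step on the smooth loss with a soft-thresholding step for the L1 penalty. The sketch below uses synthetic data and illustrative parameters; structured variants replace the soft-threshold with the proximal operator of the structured penalty.

```python
import numpy as np

def ista(X, y, lam=0.1, lr=None, iters=500):
    """Proximal gradient (ISTA) for the lasso: min 0.5*||Xw - y||^2 + lam*||w||_1."""
    n, d = X.shape
    if lr is None:
        lr = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
    w = np.zeros(d)
    for _ in range(iters):
        g = X.T @ (X @ w - y)                  # gradient of the smooth squared loss
        w = w - lr * g
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-thresholding (prox of L1)
    return w

# Synthetic sparse regression: only the first 3 of 20 coefficients are nonzero.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.01 * rng.normal(size=100)
w_hat = ista(X, y, lam=0.5)
print(w_hat[:5])
```

The recovered model is parsimonious in exactly the sense the abstract describes: the irrelevant coefficients are driven exactly to zero rather than merely shrunk, making the model both interpretable and cheap to apply.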
