Global ETD Search

61	Contributions to Unsupervised and Semi-Supervised Learning Pal, David 21 May 2009 This thesis studies two problems in theoretical machine learning. The first part of the thesis investigates the statistical stability of clustering algorithms. In the second part, we study the relative advantage of having unlabeled data in classification problems. Clustering stability was proposed and used as a model selection method in clustering tasks. The main idea of the method is that from a given data set two independent samples are taken. Each sample individually is clustered with the same clustering algorithm, with the same setting of its parameters. If the two resulting clusterings turn out to be close in some metric, it is concluded that the clustering algorithm and the setting of its parameters match the data set, and that clusterings obtained are meaningful. We study asymptotic properties of this method for certain types of cost minimizing clustering algorithms and relate their asymptotic stability to the number of optimal solutions of the underlying optimization problem. In classification problems, it is often expensive to obtain labeled data, but on the other hand, unlabeled data are often plentiful and cheap. We study how the access to unlabeled data can decrease the amount of labeled data needed in the worst-case sense. We propose an extension of the probably approximately correct (PAC) model in which this question can be naturally studied. We show that for certain basic tasks the access to unlabeled data might, at best, halve the amount of labeled data needed. machine learning statistics unsupervised learning semi-supervised learning learning theory Computer Science
62	MACHINE VISION FOR AUTOMATIC VISUAL INSPECTION OF WOODEN RAILWAY SLEEPERS USING UNSUPERVISED NEURAL NETWORKS Manne, Mihira January 2009 The motivation for this thesis work is the need for improving reliability of equipment and quality of service to railway passengers as well as a requirement for cost-effective and efficient condition maintenance management for rail transportation. This thesis work develops a fusion of various machine vision analysis methods to achieve high performance in automation of wooden rail track inspection. Condition monitoring in rail transport is done manually by human operators, who rely on inference systems and assumptions to develop conclusions. The use of condition monitoring allows maintenance to be scheduled, or other actions to be taken to avoid the consequences of failure, before the failure occurs. Manual or automated condition monitoring of materials in fields of public transportation like railway, aerial navigation, traffic safety, etc., where safety is of primary importance, needs non-destructive testing (NDT). In general, wooden railway sleeper inspection is done manually by a human operator, by moving along the rail sleeper and gathering information by visual and sound analysis for examining the presence of cracks. Human inspectors working on lines visually inspect wooden rails to judge the quality of rail sleeper. In this project work the machine vision system is developed based on the manual visual analysis system, which uses digital cameras and image processing software to perform similar manual inspections. Manual inspection requires much effort, is sometimes error prone, and discrimination is difficult even for a human operator because of frequent changes in the inspected material.
The machine vision system developed classifies the condition of material by examining individual pixels of images, processing them and attempting to develop conclusions with the assistance of knowledge bases and features. A pattern recognition approach is developed based on the methodological knowledge from the manual procedure. The pattern recognition approach for this thesis work was developed and achieved by a non-destructive testing method to identify the flaws in manually done condition monitoring of sleepers. In this method, a test vehicle is designed to capture sleeper images similar to visual inspection by a human operator, and the raw data for the pattern recognition approach is provided from the captured images of the wooden sleepers. The data from the NDT method were further processed and appropriate features were extracted. The data are collected by the NDT method in order to achieve accurate and reliable classification results. A key idea is to use an unsupervised classifier based on the features extracted from the method to discriminate the condition of wooden sleepers into either good or bad. A self-organising map is used as the classifier for the wooden sleeper classification. In order to achieve greater integration, the data collected by the machine vision system was made to interface with one another by a strategy called fusion. Data fusion was examined at two levels, namely sensor-level fusion and feature-level fusion. As the goal was to reduce human error in classifying rail sleepers as good or bad, the results obtained by feature-level fusion, compared with those of the actual classification, were satisfactory.
63	Learning from Partially Labeled Data: Unsupervised and Semi-supervised Learning on Graphs and Learning with Distribution Shifting Huang, Jiayuan January 2007 This thesis focuses on two fundamental machine learning problems: unsupervised learning, where no label information is available, and semi-supervised learning, where a small number of labels is given in addition to unlabeled data. These problems arise in many real world applications, such as Web analysis and bioinformatics, where a large amount of data is available, but no or only a small amount of labeled data exists. Obtaining classification labels in these domains is usually quite difficult because it involves either manual labeling or physical experimentation. This thesis approaches these problems from two perspectives: graph based and distribution based. First, I investigate a series of graph based learning algorithms that are able to exploit information embedded in different types of graph structures. These algorithms allow label information to be shared between nodes in the graph, ultimately communicating information globally to yield effective unsupervised and semi-supervised learning. In particular, I extend existing graph based learning algorithms, currently based on undirected graphs, to more general graph types, including directed graphs, hypergraphs and complex networks. These richer graph representations allow one to more naturally capture the intrinsic data relationships that exist, for example, in Web data, relational data, bioinformatics and social networks. For each of these generalized graph structures I show how information propagation can be characterized by distinct random walk models, and then use this characterization to develop new unsupervised and semi-supervised learning algorithms. Second, I investigate a more statistically oriented approach that explicitly models a learning scenario where the training and test examples come from different distributions.
This is a difficult situation for standard statistical learning approaches, since they typically incorporate an assumption that the distributions for training and test sets are similar, if not identical. To achieve good performance in this scenario, I utilize unlabeled data to correct the bias between the training and test distributions. A key idea is to produce resampling weights for bias correction by working directly in a feature space and bypassing the problem of explicit density estimation. The technique can be easily applied to many different supervised learning algorithms, automatically adapting their behavior to cope with distribution shifting between training and test data. unsupervised learning semi-supervised learning graph based learning distribution shifting Computer Science
65	Robust clustering algorithms Gupta, Pramod 05 April 2011 One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have long been used across many different fields ranging from computational biology to social sciences to computer vision, in part because they are simple and their output is easy to interpret. However, many of these algorithms lack any performance guarantees when the data is noisy, incomplete or has outliers, which is the case for most real world data. It is well known that standard linkage algorithms perform extremely poorly in the presence of noise. In this work we propose two new robust algorithms for bottom-up agglomerative clustering and give formal theoretical guarantees for their robustness. We show that our algorithms can be used to cluster accurately in cases where the data satisfies a number of natural properties and where the traditional agglomerative algorithms fail. We also extend our algorithms to an inductive setting with similar guarantees, in which we randomly choose a small subset of points from a much larger instance space, generate a hierarchy over this sample, and then insert the rest of the points into it to generate a hierarchy over the entire instance space. We then do a systematic experimental analysis of various linkage algorithms, compare their performance on a variety of real world data sets, and show that our algorithms do much better at handling various forms of noise than other hierarchical algorithms. Robust algorithms Hierarchical clustering Unsupervised learning Clustering Machine learning Cluster analysis Cluster analysis Computer programs Algorithms
66	Energy storage-aware prediction/control for mobile systems with unstructured loads LeSage, Jonathan Robert, 1985- 26 September 2013 Mobile systems, such as ground robots and electric vehicles, inherently operate in stochastic environments where load demands are largely unknown. Onboard energy storage, most commonly an electrochemical battery system, can significantly constrain operation. As such, mission planning and control of mobile systems can benefit from a priori knowledge about battery dynamics and constraints, especially the rate-capacity and recovery effects. To help overcome overly conservative predictions common with most existing battery remaining run-time algorithms, a prediction scheme was proposed. For characterization of a priori unknown power loads, an unsupervised Gaussian mixture routine identifies/clusters the measured power loads, and a jump-Markov chain characterizes the load transients. With the jump-Markov load forecasts, a model-based particle filter scheme predicts battery remaining run-time. Monte Carlo simulation studies demonstrate the marked improvement of the proposed technique. It was found that the increase in computational complexity from using a particle filter was justified for power load transient jumps greater than 13.4% of total system power. A multivariable reliability method was developed to assess the feasibility of a planned mission. The probability of mission completion is computed as the reliability integral of mission time exceeding the battery run-time. Because these random variables are inherently dependent, a bivariate characterization was necessary and a method is presented for online estimation of the process correlation via Bayesian updating. Finally, to abate transient shutdown of mobile systems, a model predictive control scheme is proposed that enforces battery terminal voltage constraints under stochastic loading conditions.
A Monte Carlo simulation study of a small ground vehicle indicated significant improvement in both time and distance traveled as a result. For evaluation of the proposed methodologies, a laboratory terrain environment was designed and constructed for repeated mobile system discharge studies. The test environment consists of three distinct terrains. For each discharge study, a small unmanned ground vehicle traversed the stochastic terrain environment until battery exhaustion. Results from field tests with a Packbot ground vehicle in generic desert terrain were also used. Evaluation of the proposed prediction algorithms using the experimental studies, via relative accuracy and α-λ prognostic metrics, indicated significant gains over existing methods. Energy storage Mobile systems Ground robotics Model-based prediction Model-predictive control Particle filter Unsupervised learning
67	Human Rationality: Observing or Inferring Reality Henriksson, Maria P. January 2015 This thesis investigates the boundary of human rationality and how psychological processes interact with underlying regularities in the environment and affect beliefs and achievement. Two common modes in everyday experiential learning, supervised and unsupervised learning, were hypothesized to tap different ecological and epistemological approaches to human adaptation: the Brunswikian and the Gibsonian approach. In addition, they were expected to be differentially effective for achievement depending on underlying regularities in the task environment. The first approach assumes that people use top-down processes and learn from hypothesis testing and external feedback, while the latter assumes that people are receptive to environmental stimuli and learn from bottom-up processes, without mediating inferences and support from external feedback, only exploratory observations and actions. Study I investigates selective supervised learning and shows that biased beliefs arise when people store inferences about category members when information is partially absent. This constructivist coding of pseudo-exemplars in memory yields a conservative bias in the relative frequency of targeted category members when the information is constrained by the decision maker’s own selective sampling behavior, suggesting that niche picking and risk aversion contribute to conservatism or inertia in human belief systems. However, a liberal bias in the relative frequency of targeted category members is more likely when information is constrained by the external environment. This result suggests that highly exaggerated beliefs and risky behaviors may be more likely in environments where information is systematically manipulated, for example when positive examples are highlighted to convey a favorable image while negative examples are systematically withheld from the public eye.
Study II provides support that the learning modes engage different processes. Supervised learning is more accurate in less complex linear task environments, while unsupervised learning is more accurate in complex nonlinear task environments. Study III provides further support for abstraction based on hypothesis testing in supervised learning, and abstraction based on receptive bottom-up processes in unsupervised learning that aimed to form ideal prototypes as highly valid reference points stored in memory. The studies support previous proposals that integrating the Brunswikian and the Gibsonian approach can broaden the scope of psychological research and scientific inquiry. supervised learning unsupervised learning adaptation niche picking prototypes rules exemplar memory.
68	New tools for unsupervised learning Xiao, Ying 12 January 2015 In an unsupervised learning problem, one is given an unlabelled dataset and hopes to find some hidden structure; the prototypical example is clustering similar data. Such problems often arise in machine learning and statistics, but also in signal processing, theoretical computer science, and any number of quantitative scientific fields. The distinguishing feature of unsupervised learning is that there are no privileged variables or labels which are particularly informative, and thus the greatest challenge is often to differentiate between what is relevant or irrelevant in any particular dataset or problem. In the course of this thesis, we study a number of problems which span the breadth of unsupervised learning. We make progress in Gaussian mixtures, independent component analysis (where we solve the open problem of underdetermined ICA), and we formulate and solve a feature selection/dimension reduction model. Throughout, our goal is to give finite sample complexity bounds for our algorithms -- these are essentially the strongest type of quantitative bound that one can prove for such algorithms. Some of our algorithmic techniques turn out to be very efficient in practice as well. Our major technical tool is tensor spectral decomposition: tensors are generalisations of matrices, and often allow access to the "fine structure" of data. Thus, they are often the right tools for unravelling the hidden structure in an unsupervised learning setting. However, naive generalisations of matrix algorithms to tensors run into NP-hardness results almost immediately, and thus to solve our problems, we are obliged to develop two new tensor decompositions (with robust analyses) from scratch. Both of these decompositions are polynomial time, and can be viewed as efficient generalisations of PCA extended to tensors.
Tensor Spectral decomposition Unsupervised learning Independent component analysis Fourier transform Gaussian mixture model Feature selection
69	Efficient deterministic approximate Bayesian inference for Gaussian process models Bui, Thang Duc January 2018 Gaussian processes are powerful nonparametric distributions over continuous functions that have become a standard tool in modern probabilistic machine learning. However, the applicability of Gaussian processes in the large-data regime and in hierarchical probabilistic models is severely limited by analytic and computational intractabilities. It is, therefore, important to develop practical approximate inference and learning algorithms that can address these challenges. To this end, this dissertation provides a comprehensive and unifying perspective of pseudo-point based deterministic approximate Bayesian learning for a wide variety of Gaussian process models, which connects previously disparate literature, greatly extends them and allows new state-of-the-art approximations to emerge. We start by building a posterior approximation framework based on Power-Expectation Propagation for Gaussian process regression and classification. This framework relies on a structured approximate Gaussian process posterior based on a small number of pseudo-points, which is judiciously chosen to summarise the actual data and enable tractable and efficient inference and hyperparameter learning. Many existing sparse approximations are recovered as special cases of this framework, and can now be understood as performing approximate posterior inference using a common approximate posterior. Critically, extensive empirical evidence suggests that new approximation methods arising from this unifying perspective outperform existing approaches in many real-world regression and classification tasks. We explore the extensions of this framework to Gaussian process state space models, Gaussian process latent variable models and deep Gaussian processes, which also unify many recently developed approximation schemes for these models.
Several mean-field and structured approximate posterior families for the hidden variables in these models are studied. We also discuss several methods for approximate uncertainty propagation in recurrent and deep architectures based on Gaussian projection, linearisation, and simple Monte Carlo. The benefits of the unified inference and learning frameworks for these models are illustrated in a variety of real-world state-space modelling and regression tasks.
70	Approximate inference: new visions Li, Yingzhen January 2018 Nowadays machine learning (especially deep learning) techniques are being incorporated into many intelligent systems affecting the quality of human life. The ultimate purpose of these systems is to perform automated decision making, and in order to achieve this, predictive systems need to return estimates of their confidence. Powered by the rules of probability, Bayesian inference is the gold standard method to perform coherent reasoning under uncertainty. It is generally believed that intelligent systems following the Bayesian approach can better incorporate uncertainty information for reliable decision making, and be less vulnerable to attacks such as data poisoning. Critically, the success of Bayesian methods in practice, including the recent resurgence of Bayesian deep learning, relies on fast and accurate approximate Bayesian inference applied to probabilistic models. These approximate inference methods perform (approximate) Bayesian reasoning at a relatively low cost in terms of time and memory, thus allowing the principles of Bayesian modelling to be applied to many practical settings. However, more work needs to be done to scale approximate Bayesian inference methods to big systems such as deep neural networks and large-scale datasets such as ImageNet. In this thesis we develop new algorithms towards addressing the open challenges in approximate inference. In the first part of the thesis we develop two new approximate inference algorithms, by drawing inspiration from the well known expectation propagation and message passing algorithms. Both approaches provide a unifying view of existing variational methods from different algorithmic perspectives. We also demonstrate that they lead to better calibrated inference results for complex models such as neural network classifiers and deep generative models, and scale to large datasets containing hundreds of thousands of data points.
In the second theme of the thesis we propose a new research direction for approximate inference: developing algorithms for fitting posterior approximations of arbitrary form, by rethinking the fundamental principles of Bayesian computation and the necessity of algorithmic constraints in traditional inference schemes. We specify four algorithmic options for the development of such new generation approximate inference methods, with one of them further investigated and applied to Bayesian deep learning tasks.

Search results