Global ETD Search

31	Supervision Beyond Manual Annotations for Learning Visual Representations Doersch, Carl 01 April 2016 For both humans and machines, understanding the visual world requires relating new percepts to past experience. We argue that a good visual representation for an image should encode what makes it similar to other images, enabling the recall of associated experiences. Current machine implementations of visual representations can capture some aspects of similarity, but fall far short of human ability overall. Even if one explicitly labels objects in millions of images to tell the computer what should be considered similar (a very expensive procedure), the labels still do not capture everything that might be relevant. This thesis shows that one can often train a representation which captures similarity beyond what is labeled in a given dataset. That means we can begin with a dataset that has uninteresting labels, or no labels at all, and still build a useful representation. To do this, we propose using pretext tasks: tasks that are not useful in and of themselves, but serve as an excuse to learn a more general-purpose representation. The labels for a pretext task can be inexpensive or even free. Furthermore, since this approach assumes training labels differ from the desired outputs, it can handle output spaces where the correct answer is ambiguous, and therefore impossible to annotate by hand. The thesis explores two broad classes of supervision. The first is weak image-level supervision, which is exploited to train mid-level discriminative patch classifiers. For example, given a dataset of street-level imagery labeled only with GPS coordinates, patch classifiers are trained to differentiate one specific geographical region (e.g. the city of Paris) from others. The resulting classifiers each automatically collect and associate a set of patches which all depict the same distinctive architectural element. In this way, we can learn to detect elements like balconies, signs, and lamps without annotations. The second type of supervision requires no information about images other than the pixels themselves. Instead, the algorithm is trained to predict the context around image patches. The context serves as a sort of weak label: to predict well, the algorithm must associate similar-looking patches which also have similar contexts. After training, the feature representation learned using this within-image context indeed captures visual similarity across images, which ultimately makes it useful for real tasks like object detection and geometry estimation.
Keywords: pretext tasks, self-supervised learning, computer vision, unsupervised learning, weakly-supervised learning, context
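To illustrate how within-image context can act as a free label in entry 31, the following minimal Python sketch samples two patches from one image and uses their relative position as the training target. It is one possible pretext task and not necessarily the formulation used in the thesis; the function names, the eight-neighbour layout, and the plain nested-list image are assumptions made for illustration.

```python
import random

# Offsets (row, col) of the eight neighbours of a centre patch, in reading order.
# The index into this list is the (free) label of a training example.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                     (0, -1),           (0, 1),
                     (1, -1),  (1, 0),  (1, 1)]


def crop(image, top, left, size):
    """Return a size x size patch from a 2D list of pixel values."""
    return [row[left:left + size] for row in image[top:top + size]]


def sample_context_example(image, patch_size, rng):
    """Sample (centre_patch, neighbour_patch, label) from a single image.

    The label is derived from the spatial layout of the pixels alone,
    so no manual annotation is needed. Hypothetical helper, for illustration.
    """
    height, width = len(image), len(image[0])
    if height < 3 * patch_size or width < 3 * patch_size:
        raise ValueError("image must fit a 3 x 3 grid of patches")
    # Choose a centre patch that leaves room for every neighbour.
    top = rng.randrange(patch_size, height - 2 * patch_size + 1)
    left = rng.randrange(patch_size, width - 2 * patch_size + 1)
    label = rng.randrange(len(NEIGHBOUR_OFFSETS))
    d_row, d_col = NEIGHBOUR_OFFSETS[label]
    centre = crop(image, top, left, patch_size)
    neighbour = crop(image, top + d_row * patch_size,
                     left + d_col * patch_size, patch_size)
    return centre, neighbour, label


# Usage: a 9 x 9 toy "image" whose pixel value encodes its position.
toy_image = [[r * 9 + c for c in range(9)] for r in range(9)]
centre, neighbour, label = sample_context_example(toy_image, 3, random.Random(0))
```

A classifier trained to recover `label` from the two patches never sees a human annotation, which is the sense in which the supervision is free.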
32	Semantic Mapping using Virtual Sensors and Fusion of Aerial Images with Sensor Data from a Ground Vehicle Persson, Martin January 2008 In this thesis, semantic mapping is understood to be the process of putting a tag or label on objects or regions in a map. This label should be interpretable by and have a meaning for a human. The use of semantic information has several application areas in mobile robotics. The largest area is in human-robot interaction where the semantics is necessary for a common understanding between robot and human of the operational environment. Other areas include localization through connection of human spatial concepts to particular locations, improving 3D models of indoor and outdoor environments, and model validation. This thesis investigates the extraction of semantic information for mobile robots in outdoor environments and the use of semantic information to link ground-level occupancy maps and aerial images. The thesis concentrates on three related issues: i) recognition of human spatial concepts in a scene, ii) the ability to incorporate semantic knowledge in a map, and iii) the ability to connect information collected by a mobile robot with information extracted from an aerial image. The first issue deals with a vision-based virtual sensor for classification of views (images). The images are fed into a set of learned virtual sensors, where each virtual sensor is trained for classification of a particular type of human spatial concept. The virtual sensors are evaluated with images from both ordinary cameras and an omni-directional camera, showing robust properties that can cope with variations such as changing season. In the second part a probabilistic semantic map is computed based on an occupancy grid map and the output from a virtual sensor. A local semantic map is built around the robot for each position where images have been acquired. This map is a grid map augmented with semantic information in the form of probabilities that the occupied grid cells belong to a particular class. The local maps are fused into a global probabilistic semantic map covering the area along the trajectory of the mobile robot. In the third part information extracted from an aerial image is used to improve the mapping process. Region and object boundaries taken from the probabilistic semantic map are used to initialize segmentation of the aerial image. Algorithms for both local segmentation related to the borders and global segmentation of the entire aerial image, exemplified with the two classes ground and buildings, are presented. Ground-level semantic information allows focusing of the segmentation of the aerial image to desired classes and generation of a semantic map that covers a larger area than can be built using only the onboard sensors.
Keywords: semantic mapping, aerial image, mobile robot, supervised learning, semi-supervised learning, TECHNOLOGY (TEKNIKVETENSKAP)
33	Domain knowledge, uncertainty, and parameter constraints Mao, Yi 24 August 2010 No description available.
Keywords: Sentiment analysis, Constrained optimization, Empirical Bayes, Supervised learning, Supervised learning (Machine learning), Machine learning, Artificial intelligence
35	Textová klasifikace s limitovanými trénovacími daty / Text classification with limited training data Laitoch, Petr January 2021 The aim of this thesis is to minimize the manual work needed to create training data for text classification tasks. Various research areas, including weak supervision, interactive learning and transfer learning, explore how to minimize the effort of creating training data. We combine ideas from the available literature to design a comprehensive text classification framework that employs keyword-based labeling instead of traditional text annotation. Keyword-based labeling aims to label texts based on keywords contained in the texts that are highly correlated with individual classification labels. As noted repeatedly in previous work, coming up with many new keywords is challenging for humans. To address this issue, we propose an interactive keyword labeler that uses word similarity to guide a user in keyword labeling. To verify the effectiveness of our novel approach, we implement a minimum viable prototype of the designed framework and use it to perform a user study on a restaurant review multi-label classification problem.
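The keyword-based labeling idea in entry 35 can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the thesis's framework: the function name, the tokenizer, and the example keyword lists are hypothetical, and the interactive, similarity-guided keyword suggestion described in the abstract is not shown.

```python
import re


def label_by_keywords(text, keywords_by_label):
    """Return the sorted list of labels whose keywords occur in the text.

    A text can receive several labels (multi-label) or none at all.
    Hypothetical helper, for illustration only.
    """
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(label for label, keywords in keywords_by_label.items()
                  if tokens & set(keywords))


# Usage with made-up keyword lists for restaurant reviews.
keywords = {
    "food": {"pizza", "tasty", "bland"},
    "service": {"waiter", "rude", "friendly"},
}
labels = label_by_keywords("The pizza was tasty but the waiter was rude.", keywords)
unlabeled = label_by_keywords("Nice view from the terrace.", keywords)
```

Texts that match no keyword stay unlabeled, which is why extending the keyword lists matters so much for coverage.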
36	Exploration of Semi-supervised Learning for Convolutional Neural Networks Sheffler, Nicholas 01 March 2023 Training a neural network requires a large amount of labeled data that has to be created either by human annotation or by very specifically designed methods. Currently, a vast amount of unlabeled data sits neglected on servers, hard drives, websites, etc. These untapped data sources serve as the inspiration for this thesis. The goal of this thesis is to explore and test various methods of semi-supervised learning (SSL) for convolutional neural networks (CNN). These methods are analyzed and evaluated based on their accuracy on a test set of data. Since this particular neural network will be used to offer paths for an autonomous robot, it is important for the networks to be lightweight in scale. This thesis therefore takes an assortment of smaller neural networks and runs them through a variety of semi-supervised training methods. The first method is to have a teacher model, trained on properly labeled data, create labels for unlabeled data and add these to the training set for the next student model. From this base method, a few variations were tried in the hope of getting a significant improvement. The first variation tested is the effect of running this teacher and student cycle for more than one iteration. After that, the effects of using the confidence values that the models produced were explored, in one test by including only data with confidence above a certain value and in another by relabeling data below a confidence threshold. The last variation explored was to run two teacher models concurrently and have the combination of those two models decide on the proper label for the unlabeled data. Through exploration and testing, these methods are evaluated in the results section as to which one produces the best results for SSL.
Keywords: Self-Supervised Learning, Convolutional Neural Networks, Deep Learning, Artificial Intelligence, Noisy Student Training, Supervised Learning
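The confidence-based filtering step described in entry 36 can be illustrated with a short Python sketch. The function names and the data layout are assumptions; in particular, the abstract does not say how the two teacher models are combined, so averaging their class probabilities is only one plausible choice, shown here as an assumption.

```python
def select_pseudo_labels(predictions, threshold):
    """Keep teacher predictions whose top class probability meets the threshold.

    predictions: list of (example, class_probabilities) pairs from a teacher.
    Returns a list of (example, predicted_class_index) pairs that could be
    added to the student's training set. Hypothetical helper.
    """
    selected = []
    for example, probs in predictions:
        best_class = max(range(len(probs)), key=lambda i: probs[i])
        if probs[best_class] >= threshold:
            selected.append((example, best_class))
    return selected


def average_probabilities(probs_a, probs_b):
    """Combine two teachers by averaging their class probabilities (assumed rule)."""
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]


# Usage with made-up teacher outputs for three unlabeled examples.
teacher_output = [("img_a", [0.9, 0.1]), ("img_b", [0.55, 0.45]), ("img_c", [0.2, 0.8])]
confident = select_pseudo_labels(teacher_output, threshold=0.8)
combined = average_probabilities([0.9, 0.1], [0.5, 0.5])
```

Raising the threshold trades pseudo-label quantity for quality: fewer examples reach the student, but fewer of them carry a wrong label.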
37	Knowledge transfer and retention in deep neural networks Fini, Enrico 17 April 2023 This thesis addresses the crucial problem of knowledge transfer and retention in deep neural networks. The ability to transfer knowledge from previously learned tasks and retain it for future use is essential for machine learning models to continually adapt to new tasks and improve their overall performance. In principle, knowledge can be transferred between any type of task, but we believe it to be particularly challenging in the field of computer vision, where the size and diversity of visual data often result in high compute requirements and the need for large, complex models. Hence, we analyze knowledge transfer and retention between unsupervised and supervised visual tasks, which form the main focus of this thesis. We categorize our efforts into several knowledge transfer and retention paradigms, and we tackle them with several contributions for the scientific community. The thesis proposes settings and methods based on knowledge distillation and self-supervised learning techniques. In particular, we devise two novel continual learning settings and seven new methods for knowledge transfer and retention, setting a new state of the art in a wide range of tasks. In conclusion, this thesis provides a valuable contribution to the field of computer vision and machine learning and sets a foundation for future work in this area.
38	Active learning via Transduction in Regression Forests Hansson, Kim, Hörlin, Erik January 2015 Context. The amount of training data required to build accurate models is a common problem in machine learning. Active learning is a technique that tries to reduce the amount of required training data by making active choices of which training data holds the greatest value. Objectives. This thesis aims to design, implement and evaluate the Random Forests algorithm combined with active learning that is suitable for predictive tasks with real-value data outcomes where the amount of training data is small. Machine learning algorithms traditionally require large amounts of training data to create a general model, and training data is in many cases sparse and expensive or difficult to create. Methods. The research methods used for this thesis are implementation and scientific experiment. An approach to active learning was implemented based on previous work for classification type problems. The approach uses the Mahalanobis distance to perform active learning via transduction. Evaluation was done using several data sets where the decrease in prediction error was measured over several iterations. The results of the evaluation were then analyzed using nonparametric statistical testing. Results. The statistical analysis of the evaluation results failed to detect a difference between our approach and a non-active-learning approach, even though the proposed algorithm showed irregular performance. The evaluation of our tree-based traversal method and the evaluation of the Mahalanobis distance for transduction both showed that these methods performed better than Euclidean distance and complete graph traversal. Conclusions. We conclude that the proposed solution did not decrease the amount of required training data on a significant level. However, the approach has potential and future work could lead to a working active learning solution. Further work is needed on key areas of the implementation, such as the choice of instances for active learning through transduction uncertainty as well as the choice of method for going from transduction model to induction model.
Keywords: Active learning, Regression, Random Forests, Semi-supervised learning, Transduction
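Entry 38 compares the Mahalanobis distance with the Euclidean distance. As background, the Mahalanobis distance of a point x from a sample with mean μ and covariance Σ is sqrt((x - μ)ᵀ Σ⁻¹ (x - μ)), so it rescales each direction by how much the data vary along it. The Python sketch below computes it for two-dimensional data using the closed-form inverse of a 2 x 2 matrix; the function name and the toy data are assumptions, and this is not the thesis's implementation.

```python
import math


def mahalanobis_2d(point, samples):
    """Mahalanobis distance of a 2D point from the distribution of samples.

    Uses the sample mean and the unbiased sample covariance (divisor n - 1).
    Hypothetical helper, restricted to two dimensions for brevity.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mean_x) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - mean_y) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean_x, point[1] - mean_y
    # Quadratic form with the inverse covariance [[syy, -sxy], [-sxy, sxx]] / det.
    squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(squared)


# Usage: the data spread twice as far along x as along y.
data = [(0, 0), (2, 0), (0, 1), (2, 1)]
far_in_x = mahalanobis_2d((3, 0.5), data)   # Euclidean distance 2 from the mean
far_in_y = mahalanobis_2d((1, 1.5), data)   # Euclidean distance 1 from the mean
```

In this toy data set the two query points lie at different Euclidean distances from the mean but at the same Mahalanobis distance, because the second point moves along the direction in which the data vary less.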
39	Scalable semi-supervised grammar induction using cross-linguistically parameterized syntactic prototypes Boonkwan, Prachya January 2014 This thesis is about the task of unsupervised parser induction: automatically learning grammars and parsing models from raw text. We endeavor to induce such parsers by observing sequences of terminal symbols. We focus on overcoming the problem of frequent collocation, which is a major source of error in grammar induction. For example, since a verb and a determiner tend to co-occur in a verb phrase, the probability of attaching the determiner to the verb is sometimes higher than that of attaching the core noun to the verb, resulting in the erroneous attachment *((Verb Det) Noun) instead of (Verb (Det Noun)). Although frequent collocation is at the heart of grammar induction, it is precariously capable of distorting the grammar distribution. Natural language grammars follow a Zipfian (power law) distribution, where the frequency of any grammar rule is inversely proportional to its rank in the frequency table. We believe that covering the most frequent grammar rules in grammar induction will have a strong impact on accuracy. We propose an efficient approach to grammar induction guided by cross-linguistic language parameters. Our language parameters consist of 33 parameters of frequent basic word orders, which are easy to elicit from grammar compendiums or short interviews with naïve language informants. These parameters are designed to capture frequent word orders in the Zipfian distribution of natural language grammars, while the rest of the grammar, including exceptions, can be automatically induced from unlabeled data. The language parameters shrink the search space of the grammar induction problem by exploiting both word order information and predefined attachment directions. The contribution of this thesis is three-fold. (1) We show that the language parameters are adequately generalizable cross-linguistically, as our grammar induction experiments will be carried out on 14 languages on top of a simple unsupervised grammar induction system. (2) Our specification of language parameters improves the accuracy of unsupervised parsing even when the parser is exposed to much less frequent linguistic phenomena in longer sentences when the accuracy decreases within 10%. (3) We investigate the prevalent factors of errors in grammar induction which will provide room for accuracy improvement. The proposed language parameters efficiently cope with the most frequent grammar rules in natural languages. With only 10 man-hours for preparing syntactic prototypes, it improves the accuracy of directed dependency recovery over the state-of-the-art Gillenwater et al.’s (2010) completely unsupervised parser in: (1) Chinese by 30.32%, (2) Swedish by 28.96%, (3) Portuguese by 37.64%, (4) Dutch by 15.17%, (5) German by 14.21%, (6) Spanish by 13.53%, (7) Japanese by 13.13%, (8) English by 12.41%, (9) Czech by 9.16%, (10) Slovene by 7.24%, (11) Turkish by 6.72%, and (12) Bulgarian by 5.96%. It is noted that although the directed dependency accuracies of some languages are below 60%, their TEDEVAL scores are still satisfactory (approximately 80%). This suggests that our parsed trees are, in fact, closely related to the gold-standard trees despite the discrepancy of annotation schemes. We perform an error analysis of over- and under-generation. We found three prevalent problems that cause errors in the experiments: (1) PP attachment, (2) discrepancies of dependency annotation schemes, and (3) rich morphology. The methods presented in this thesis were originally presented in Boonkwan and Steedman (2011). The thesis presents a great deal more detail in the design of crosslinguistic language parameters, the algorithm of lexicon inventory construction, experiment results, and error analysis.
006.3
40	Incremental semi-supervised learning for anomalous trajectory detection Sillito, Rowland R. January 2010 The acquisition of a scene-specific normal behaviour model underlies many existing approaches to the problem of automated video surveillance. Since it is unrealistic to acquire a comprehensive set of labelled behaviours for every surveyed scenario, modelling normal behaviour typically corresponds to modelling the distribution of a large collection of unlabelled examples. In general, however, it would be desirable to be able to filter an unlabelled dataset to remove potentially anomalous examples. This thesis proposes a simple semi-supervised learning framework that could allow a human operator to efficiently filter the examples used to construct a normal behaviour model by providing occasional feedback. Specifically, the classification output of the model under construction is used to filter the incoming sequence of unlabelled examples so that human approval is requested before incorporating any example classified as anomalous, while all other examples are automatically used for training. A key component of the proposed framework is an incremental one-class learning algorithm which can be trained on a sequence of normal examples while allowing new examples to be classified at any stage during training. The proposed algorithm represents an initial set of training examples with a kernel density estimate, before using merging operations to incrementally construct a Gaussian mixture model while minimising an information-theoretic cost function. This algorithm is shown to outperform an existing state-of-the-art approach without requiring off-line model selection. Throughout this thesis behaviours are considered in terms of whole motion trajectories: in order to apply the proposed algorithm, trajectories must be encoded with fixed length vectors. To determine an appropriate encoding strategy, an empirical comparison is conducted to determine the relative class-separability afforded by several different trajectory representations for a range of datasets. The results obtained suggest that the choice of representation makes a small but consistent difference to class separability, indicating that cubic B-Spline control points (fitted using least-squares regression) provide a good choice for use in subsequent experiments. The proposed semi-supervised learning framework is tested on three different real trajectory datasets. In all cases the rate of human intervention requests drops steadily, reaching a usefully low level of 1% in one case. A further experiment indicates that once a sufficient number of interventions has been provided, a high level of classification performance can be achieved even if subsequent requests are ignored. The automatic incorporation of unlabelled data is shown to improve classification performance in all cases, while a high level of classification performance is maintained even when unlabelled data containing a high proportion of anomalous examples is presented.
004.33
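The filtering loop described in entry 40 (train automatically on examples the model considers normal, and ask a human before incorporating anything classified as anomalous) can be sketched as follows. Note the substitution: the one-class model here is a single running Gaussian over scalar values, chosen only to keep the example short, and not the kernel density estimate and Gaussian mixture algorithm proposed in the thesis. All class, function, and parameter names are hypothetical.

```python
import math


class RunningGaussian:
    """Stand-in one-class model: running mean and variance (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def is_anomalous(self, value, max_sigmas=3.0):
        if self.count < 2:
            return True  # too little data to trust the model, so always ask
        std = math.sqrt(self.m2 / (self.count - 1))
        return abs(value - self.mean) > max_sigmas * std


def filter_stream(stream, model, operator_approves):
    """Train on a stream, requesting human approval only for flagged examples.

    Returns the number of approval requests made to the operator.
    """
    requests = 0
    for value in stream:
        if model.is_anomalous(value):
            requests += 1
            if not operator_approves(value):
                continue  # rejected by the operator: never used for training
        model.update(value)
    return requests


# Usage: a simulated operator who rejects implausibly large values.
model = RunningGaussian()
requests = filter_stream([1.0, 1.2, 0.8, 1.1, 0.9, 50.0, 1.0], model,
                         operator_approves=lambda value: value < 10)
```

In this toy run the operator is consulted for the first two examples, while the model is still untrained, and once more for the outlier; every other example is incorporated without human involvement.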
