61 |
Truncated Signed Distance Fields Applied To Robotics / Canelhas, Daniel Ricão, January 2017
This thesis is concerned with topics related to dense mapping of large-scale three-dimensional spaces. In particular, the motivating scenario of this work is one in which a mobile robot with limited computational resources explores an unknown environment using a depth camera. To this end, low-level topics such as sensor noise, map representation, interpolation, bit-rates, and compression are investigated, and their impacts on more complex tasks, such as feature detection and description, camera tracking, and mapping, are evaluated thoroughly. A central idea of this thesis is the use of truncated signed distance fields (TSDF) as a map representation, and a comprehensive yet accessible treatise on this subject is the first major contribution of this dissertation. The TSDF is a voxel-based representation of 3D space that enables dense mapping with high surface quality and robustness to sensor noise, making it a good candidate for use in grasping, manipulation, and collision avoidance scenarios. The second main contribution of this thesis deals with the way in which information can be efficiently encoded in TSDF maps. The redundancy with which voxels represent continuous surfaces and empty space is one of the main impediments to applying TSDF representations to large-scale mapping. This thesis proposes two algorithms for enabling large-scale 3D tracking and mapping: a fast on-the-fly compression method based on unsupervised learning, and a parallel algorithm for lifting a sparse scene-graph representation from the dense 3D map. The third major contribution of this work consists of thorough evaluations of the impacts of low-level choices on higher-level tasks. Examples include the relationship between gradient estimation methods and feature detector repeatability, and the influence of voxel bit-rate, interpolation strategy, and compression ratio on camera tracking performance.
Each evaluation thus leads to a better understanding of the trade-offs involved, which translate to direct recommendations for future applications, depending on their particular resource constraints.
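The TSDF fusion rule described above can be sketched in a few lines. The snippet below is a minimal illustration, not code from the thesis: it integrates one depth measurement into a single column of voxels using the standard truncated, weighted running average; the truncation distance and voxel layout are invented for the example.

```python
import numpy as np

def tsdf_update(tsdf, weights, depth, voxel_z, trunc=0.1):
    """Fuse one depth measurement into a 1-D column of voxels.

    tsdf, weights : per-voxel signed distance and integration weight
    depth         : measured surface depth along this ray
    voxel_z       : depth of each voxel centre along the ray
    trunc         : truncation distance (a hypothetical value)
    """
    # Signed distance from each voxel to the measured surface,
    # clamped to the truncation band [-trunc, trunc].
    sdf = np.clip(depth - voxel_z, -trunc, trunc)
    # Only voxels in front of the surface, or just behind it, are updated.
    mask = (depth - voxel_z) > -trunc
    # Weighted running average: the standard TSDF fusion rule.
    new_w = weights[mask] + 1.0
    tsdf[mask] = (tsdf[mask] * weights[mask] + sdf[mask]) / new_w
    weights[mask] = new_w
    return tsdf, weights

voxel_z = np.linspace(0.0, 1.0, 11)   # voxel centres along one camera ray
tsdf = np.zeros_like(voxel_z)
weights = np.zeros_like(voxel_z)
tsdf, weights = tsdf_update(tsdf, weights, depth=0.5, voxel_z=voxel_z)
# The zero crossing of tsdf now sits at the measured surface (depth 0.5).
```

Repeated calls with noisy depth readings average out sensor noise, which is one reason the representation is robust.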
|
62 |
Object Instance Detection and Dynamics Modeling in a Long-Term Mobile Robot Context / Bore, Nils, January 2017
In recent years, simple service robots such as autonomous vacuum cleaners and lawn mowers have become commercially available and increasingly common. The next generation of service robots should perform more advanced tasks, such as cleaning up objects. Robots then need to learn to robustly navigate and manipulate cluttered environments, such as an untidy living room. In this thesis, we focus on representations for tasks such as general cleaning and fetching of objects. We discuss the requirements of these specific tasks, and argue that solving them would be generally useful because of their object-centric nature. Our approach to understanding environments on a fine-grained level rests on two fundamental insights. First, many of today's robot map representations are limited to the spatial domain, and ignore that there is a time axis constraining how much an environment may change during a given period. We argue that it is of critical importance to also consider the temporal domain. By studying the motion of individual objects, we can enable tasks such as general cleaning and object fetching. The second insight is that mobile robots are becoming more robust, and can therefore collect large amounts of data from their environments. With more data, unsupervised learning of models becomes feasible, allowing the robot to adapt to changes in the environment, and to scenarios that the designer could not foresee. We view these capabilities as vital for robots to become truly autonomous. The combination of unsupervised learning and dynamics modeling creates an interesting symbiosis: the dynamics vary between different environments and between the objects in one environment, and learning can capture these variations. A major difficulty when modeling environment dynamics is that the whole environment cannot be observed at one time, since the robot is moving between different places.
This also means that objects may move long distances between two observations. We demonstrate how this can be dealt with in a principled manner by modeling several modes of object movement. The resulting system is fully probabilistic, and can keep track of all the objects in the robot's surroundings. We also demonstrate methods for detection and learning of objects and structures in the static parts of the maps. Using the complete system, we can represent and learn many aspects of the full environment. In real-world experiments, we demonstrate that our system can keep track of varied objects in large and highly dynamic environments.
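As a toy illustration of learning several modes of object movement from repeated observations, the sketch below estimates mode probabilities by simple counting; the three modes and their frequencies are our invention, not the thesis's actual probabilistic model.

```python
import random
from collections import Counter

# Hypothetical movement modes an object may exhibit between two visits to a
# room: it stays put, moves locally, or "jumps" (moved while unobserved).
# The mode labels and their frequencies are invented for this sketch.
random.seed(0)
true_modes = ["stay"] * 70 + ["local_move"] * 20 + ["jump"] * 10
observations = [random.choice(true_modes) for _ in range(1000)]

# Unsupervised estimation of mode probabilities by counting.
counts = Counter(observations)
probs = {mode: counts[mode] / len(observations) for mode in counts}
```

With enough observations the estimates approach the underlying frequencies, and an environment-specific dynamics model emerges without supervision.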
|
63 |
Automatic induction of verb classes using clustering / Sun, Lin, January 2013
Verb classifications have attracted a great deal of interest in both linguistics and natural language processing (NLP). They have proved useful for important tasks and applications, including computational lexicography, parsing, word sense disambiguation, semantic role labelling, information extraction, question-answering, and machine translation (Swier and Stevenson, 2004; Dang, 2004; Shi and Mihalcea, 2005; Kipper et al., 2008; Zapirain et al., 2008; Rios et al., 2011). Particularly useful are classes which capture generalizations about a range of linguistic properties (e.g. lexical, (morpho-)syntactic, semantic), such as those proposed by Beth Levin (1993). However, full exploitation of such classes in real-world tasks has been limited because no comprehensive or domain-specific lexical classification is available. This thesis investigates how Levin-style lexical semantic classes could be learned automatically from corpus data. Automatic acquisition is cost-effective when it involves either no or minimal supervision, and it can be applied to any domain of interest where adequate corpus data is available. We improve on earlier work on automatic verb clustering, introducing new features and new clustering methods to improve accuracy and coverage. We evaluate our methods and features on well-established cross-domain datasets in English, on a specific domain of English (the biomedical domain), and on another language (French), reporting promising results. Finally, our task-based evaluation demonstrates that the automatically acquired lexical classes enable new approaches to some NLP tasks (e.g. metaphor identification) and help to improve the accuracy of existing ones (e.g. argumentative zoning).
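A minimal sketch of the clustering setup such work rests on: verbs are represented by feature vectors (here, invented subcategorisation-frame frequencies) and grouped by similarity. The thesis explores far richer features and clustering methods; this shows only the basic idea, with a hand-rolled k-means.

```python
import numpy as np

# Hypothetical verb-by-feature matrix: rows are verbs, columns are relative
# frequencies of subcategorisation frames (NP, NP-PP, S-comp).  The numbers
# are invented purely for illustration.
verbs = ["give", "send", "think", "believe"]
X = np.array([
    [0.2, 0.7, 0.1],   # give : mostly NP-PP frames
    [0.3, 0.6, 0.1],   # send : frame profile similar to "give"
    [0.1, 0.1, 0.8],   # think : mostly sentential complements
    [0.2, 0.1, 0.7],   # believe
])

def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each verb to its nearest centre (Euclidean distance).
        dists = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        # Move each centre to the mean of its assigned verbs.
        for j in range(k):
            if (labels == j).any():
                centres[j] = X[labels == j].mean(0)
    return labels

labels = kmeans(X, k=2)
# "give"/"send" land in one cluster, "think"/"believe" in the other,
# mirroring a Levin-style split between dative and cognition verbs.
```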
|
64 |
Detecção de novidade com aplicação a fluxos contínuos de dados / Novelty detection with application to data streams / Eduardo Jaques Spinosa, 20 February 2008
In this work, novelty detection is treated as the problem of identifying emerging concepts in data that may be presented in a continuous flow. Considering the intrinsic relationship between time and novelty and the challenges imposed by data streams, a novel approach is proposed. OLINDDA (OnLIne Novelty and Drift Detection Algorithm) goes beyond one-class classification and focuses on the unsupervised continuous learning of novel concepts. Having learned an initial description of a normal concept, it proceeds to the analysis of new data, treating them as a continuous flow in which novel concepts may appear at any time. Through the use of clustering techniques, OLINDDA may employ several validation criteria to evaluate clusters in terms of their cohesiveness and representativeness. Clusters considered valid produce concepts that may be merged, and whose knowledge is continuously incorporated. The technique is evaluated experimentally with artificial and real data. The one-class classification module is compared to other novelty detection techniques, and the approach as a whole is analyzed from various aspects through the temporal evolution of several metrics. The results reinforce the importance of the continuous detection of novel concepts, as well as the difficulties and challenges of the unsupervised learning of novel concepts in data streams.
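A rough sketch of the one-class-plus-clustering idea, with our own simplifications: a description of the normal concept is learned from healthy data, points falling outside it become novelty candidates, and a cohesive group of candidates is promoted to a novel concept. The data, threshold, and validation criterion below are illustrative, not OLINDDA's actual ones.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Normal" concept learned offline: a single blob standing in for the
# initial one-class description (data and thresholds are illustrative).
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
centre = normal.mean(axis=0)
radius = np.linalg.norm(normal - centre, axis=1).max()   # decision boundary

def is_novelty_candidate(x):
    """Points falling outside the normal description are candidates."""
    return np.linalg.norm(x - centre) > radius

# A stream mixing normal points with an emerging concept far away.
stream = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
                    rng.normal(8.0, 0.5, size=(50, 2))])
candidates = np.array([x for x in stream if is_novelty_candidate(x)])

# Validation: a sufficiently large and cohesive group of candidates is
# promoted to a novel concept and incorporated into the model.
cohesion = np.linalg.norm(candidates - candidates.mean(axis=0), axis=1).mean()
novel_concept_found = bool(len(candidates) >= 10 and cohesion < radius)
```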
|
65 |
Técnicas de aprendizado não supervisionado baseadas no algoritmo da caminhada do turista / Unsupervised learning techniques based on the tourist walk algorithm / Carlos Humberto Porto Filho, 07 November 2017
In the last decades, the amount of data stored in digital format has grown exponentially, leading to an increasing need for computational tools that help generate knowledge from these data. The field of Machine Learning provides several techniques capable of identifying patterns in these data sets. Among these techniques, this work highlights Unsupervised Machine Learning, where the objective is to classify entities into mutually exclusive clusters based on the similarity between instances. The clusters are not predefined, hence the unsupervised element. Organizing data into clusters that make sense is one of the most fundamental ways of understanding and learning. Cluster analysis is the study of methods for clustering, and is divided into hierarchical and partitional approaches. A hierarchical clustering is a nested sequence of partitions, whereas a partitional clustering produces a single partition. Here we are interested in techniques based on a deterministic, partially self-avoiding walk known as the tourist walk. Based on the hypothesis that it is possible to use the tourist walk as an unsupervised machine learning technique, we implemented a hierarchical algorithm based on the tourist walk proposed by Campiteli et al. (2006). We evaluated this algorithm on different sets of medical images and compared it with traditional hierarchical techniques. We also propose a new partitional clustering algorithm based on the tourist walk, called Tourist Walk Partitional Clustering (TWPC). The results showed that the hierarchical technique based on the tourist walk is able to identify clusters in sets of medical images through a tree that does not impose a binary structure, has a smaller number of hierarchies, and is invariant to scale transformations, resulting in a more organized structure. Even though the tree is based not directly on the distances between the data but on a ranking of neighbors, it still preserves a correlation between its cophenetic distances and the actual distances between the data. The proposed partitional clustering method, TWPC, was able to find, in an efficient way, clusters of arbitrary shape with inter-cluster and intra-cluster variations. In addition, the algorithm has the following advantages: it is deterministic; it operates through local interactions, without requiring a priori knowledge of all the items in the set; it incorporates the concepts of noise and outliers; and it works with a ranking of neighbors, which can be built from any distance measure.
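The walk underlying both algorithms can be sketched directly from its definition: a deterministic, partially self-avoiding walk that always moves to the nearest neighbour not visited in the last mu steps. The toy data below show the walk staying trapped inside a tight group of points, which is what makes it usable for clustering; the termination rule here is our simplification.

```python
import numpy as np

def tourist_walk(points, start, mu=1):
    """Deterministic, partially self-avoiding walk: from the current point,
    always move to the nearest point not visited in the last mu steps.
    Returns the trajectory up to the first repeated state (the attractor)."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    path = [start]
    seen = set()
    while True:
        memory = set(path[-mu:])            # recently visited, forbidden
        ranking = np.argsort(d[path[-1]])   # neighbour ranking by distance
        nxt = next(int(i) for i in ranking if int(i) not in memory)
        state = (nxt, tuple(path[-mu:]))
        if state in seen:                   # walk has entered a cycle
            return path
        seen.add(state)
        path.append(nxt)

points = np.array([[0.0, 0.0], [0.1, 0.0], [0.05, 0.1],   # tight group A
                   [5.0, 5.0], [5.1, 5.0]])               # distant group B
path = tourist_walk(points, start=0, mu=1)
# The walk never leaves group A: trajectories of this kind expose clusters.
```

Because the walk depends only on the neighbour ranking, not the raw distances, any clustering built on it inherits the scale invariance noted in the results above.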
|
66 |
Perceptual learning in speech reveals pathways of processing / Munson, Cheyenne Michele, 01 December 2011
Listeners use perceptual learning to rapidly adapt to manipulated speech input. Examination of this learning process can reveal the pathways used during speech perception. By assessing generalization of perceptually learned categorization boundaries, others have used perceptual learning to help determine whether abstract units are necessary for listeners and models of speech perception. Here we extend this approach to address the inverse issue of specificity. In these experiments we have sought to discover the levels of specificity at which listeners can learn variation in phonetic contrasts. We find that (1) listeners are able to learn multiple voicing boundaries for different pairs of phonemic contrasts relying on the same feature contrast; (2) listeners generalize voicing boundaries to untrained continua with the same onset as the trained continua, but generalization to continua with different onsets depends on previous experience with other continua sharing that onset; (3) listeners can learn different voicing boundaries for continua with the same CV onset, which suggests that boundaries are lexically specific; (4) listeners can learn different voicing boundaries for multiple talkers even when they are not given instructions about talkers and their task does not require talker identification; and (5) listeners retain talker-specific boundaries after training on a new boundary for a second talker, but generalize boundaries across talkers when they have no previous experience with a talker. These results were obtained using a new paradigm for unsupervised perceptual learning in speech. They suggest that models of speech perception must be highly flexible in order to accommodate both specificity and generalization of perceptually learned categorization boundaries.
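As a hedged sketch of how a voicing boundary might be estimated from unlabelled stimuli, the snippet below fits two category means to invented voice-onset-time values and places the boundary at their midpoint; the values and the two-means method are illustrative only, not the experimental paradigm itself.

```python
import numpy as np

rng = np.random.default_rng(0)
# Invented voice-onset-time (VOT) values in ms: a voiced category near
# 10 ms and a voiceless category near 50 ms.
vot = np.concatenate([rng.normal(10.0, 3.0, 100), rng.normal(50.0, 3.0, 100)])

# Unsupervised boundary estimate: two-means in one dimension, with the
# category boundary placed at the midpoint of the learned means.
means = np.array([vot.min(), vot.max()])
for _ in range(20):
    assign = np.abs(vot[:, None] - means[None, :]).argmin(axis=1)
    means = np.array([vot[assign == j].mean() for j in (0, 1)])
boundary = means.mean()   # implied voicing boundary along the continuum
```

Shifting the distribution of exposures shifts the learned means, and with them the boundary, which is the basic logic of the perceptual learning manipulations above.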
|
67 |
Identifying Crime Hotspot: Evaluating the suitability of Supervised and Unsupervised Machine learning / Hussein, Abdul Aziz, 05 October 2021
No description available.
|
68 |
Understanding Traffic Cruising Causation: Via Parking Data Enhancement / Jasarevic, Mirza, January 2021
Background. Some computer scientists have recently pointed out that it may be more effective for the computer science community to focus more on data preparation for performance improvements, rather than exclusively comparing modeling techniques. To test how useful this shift in focus is, this paper chooses a particular data extraction technique to examine the differences in data model performance. Objectives. Five recent (2016-2020) studies concerning the modeling of parking congestion have used a rationalized approach to feature extraction rather than a measured one, their main focus being the selection of modeling techniques for the best performance. Instead, this study picks a feature common to them all and attempts to improve it, then compares the result against the performance of the feature in the state it had in the related studies. Rather than trying several modeling techniques, weights are applied to the selected features and altered. Time-series parking data were chosen specifically, as the opportunity appeared in that sector. Apart from this, the reusability of the data is also gauged. Methods. An experimental case study is designed in three parts. The first tests the importance of weighted-sum configurations relative to drivers' expectations. The second analyzes how much data can be recycled from the real data, and whether spatial or temporal comparisons are better for the synthesis of parking data. The third part compares the performance of the best configuration against the default configuration, using the k-means clustering algorithm and dynamic time warping distance. Results. The experimental results show performance improvements on all levels, and increasing improvement as the sample sizes grow: up to 9% average improvement per category, and 6.2% for the entire city. The popularity of a parking lot turned out to be as important as its occupancy rate (50% importance each), while volatility was obstructive.
A few months of data were recyclable, and a few small parking lots could replace each other's datasets. Temporal aspects turned out to be better for parking data simulations than spatial aspects. Conclusions. The results support the data scientists' belief that quality and quantity improvements of data are more important than creating new types of models. The score can be used as a better metric for parking congestion rates, for both drivers and managers. It can be employed in the public sphere, on the condition that higher-quality, richer data are provided.
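The distance at the heart of the third part can be sketched with a plain dynamic-programming implementation of dynamic time warping, together with an equal-weight score of the kind reported in the results; the occupancy profiles below are invented for illustration.

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two 1-D series (plain DP)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# Hypothetical daily occupancy-rate profiles (values are invented).
lot_a = np.array([0.1, 0.5, 0.9, 0.8, 0.3])
lot_b = np.array([0.1, 0.1, 0.5, 0.9, 0.8])   # same shape, shifted in time
lot_c = np.array([0.9, 0.8, 0.2, 0.1, 0.1])   # opposite pattern

# Equal-weight congestion score over popularity and occupancy, mirroring
# the 50%/50% importance reported above.
def score(popularity, occupancy, w_pop=0.5, w_occ=0.5):
    return w_pop * popularity + w_occ * occupancy
```

DTW tolerates the time shift between lot_a and lot_b, so similar lots end up close; a k-means run under this distance then groups profiles by shape rather than by exact timing.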
|
69 |
Latent analysis of unsupervised latent variable models in fault diagnostics of rotating machinery under stationary and time-varying operating conditions / Balshaw, Ryan, January 2020
Vibration-based condition monitoring is a crucial element in ensuring asset longevity and avoiding unexpected financial losses. Currently, data-driven methodologies often require significant investment in data acquisition and a large amount of operational data for both healthy and unhealthy cases. The acquisition of unhealthy fault data is often financially infeasible, and the result is that most methods detailed in the literature are not suitable for critical industrial applications.
In this work, unsupervised latent variable models remove the requirement for asset fault data. These models operate by learning a representation of healthy data and utilise health indicators to track deviation from this representation. A variety of latent variable models are compared, namely Principal Component Analysis, Variational Auto-Encoders, and Generative Adversarial Network-based methods. This research investigates the relationship between time-series data and latent variable model design under a sensible notion of data interpretation, examines the influence of model complexity on performance across different datasets, and shows that the latent manifold, when untangled and traversed in a sensible manner, is indicative of damage.
Three latent health indicators are proposed in this work and utilised in conjunction with a proposed temporal preservation approach. Performance is compared across the different models. It was found that these latent health indicators can augment standard health indicators and benefit model performance. This allows the performance of different latent variable models to be compared, an approach that has not been realised in previous work, as the interpretation of the latent manifold and its response to anomalous instances had not been explored. If all aspects of a latent variable model are systematically investigated and compared, different models can be analysed on a consistent platform.
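As a sketch of the simplest latent variable model in the comparison, the snippet below builds a PCA-based health indicator: the model is fit on healthy data only, and reconstruction error is tracked as the indicator. The synthetic data and the 99th-percentile alarm threshold are our assumptions, not choices from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "healthy" vibration features: low-dimensional structure plus
# sensor noise (sizes and scales are our assumptions).
healthy = (rng.normal(size=(500, 2)) @ rng.normal(size=(2, 8))
           + 0.05 * rng.normal(size=(500, 8)))
mean = healthy.mean(axis=0)
# Principal directions learned from the healthy data only (SVD of the
# centred matrix); no fault data are required.
_, _, Vt = np.linalg.svd(healthy - mean, full_matrices=False)
W = Vt[:2].T                       # keep a 2-D latent space

def health_indicator(x):
    """Reconstruction error: distance between x and its PCA reconstruction."""
    z = (x - mean) @ W             # encode into the latent space
    x_hat = z @ W.T + mean         # decode back to the input space
    return np.linalg.norm(x - x_hat, axis=-1)

# Alarm threshold from healthy data alone (99th percentile: our choice).
threshold = np.percentile(health_indicator(healthy), 99)
faulty = healthy[:50] + rng.normal(scale=1.0, size=(50, 8))  # simulated damage
alarm_rate = float((health_indicator(faulty) > threshold).mean())
```

The deep models in the comparison replace the linear encode/decode pair with learned non-linear mappings, but the healthy-only training and deviation-tracking logic is the same.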
In the model analysis step, a latent variable model is used to evaluate the available data, such that the health indicators used to infer the health state of an asset are available for analysis and comparison. The datasets investigated in this work cover both stationary and time-varying operating conditions. The objective was to determine whether deep learning is comparable to, or on par with, state-of-the-art signal processing techniques. The results showed that damage is detectable in both the input space and the latent space, and can be trended to identify clear points of condition deviation. This highlights that both spaces are indicative of damage when analysed in a sensible manner. A key takeaway from this work is that for data containing impulsive components that manifest naturally, rather than due to the presence of a fault, the anomaly detection procedure may be limited by the Gaussianity assumptions inherent in the model formulations.
This work illustrates how the latent manifold is useful for the detection of anomalous instances, how one must consider a variety of latent-variable model types and how subtle changes to data processing can benefit model performance analysis substantially. For vibration-based condition monitoring, latent variable models offer significant improvements in fault diagnostics and reduce the requirement for expert knowledge. This can ultimately improve asset longevity and the investment required from businesses in asset maintenance. / Dissertation (MEng (Mechanical Engineering))--University of Pretoria, 2020. / Eskom Power Plant Engineering Institute (EPPEI) / UP Postgraduate Bursary / Mechanical and Aeronautical Engineering / MEng (Mechanical Engineering) / Unrestricted
|
70 |
Vliv selekce příznaků metodou HFS na shlukovou analýzu / Effect of HFS Based Feature Selection on Cluster Analysis / Malásek, Jan, January 2015
This master's thesis is focused on cluster analysis. Clustering has its roots in many areas, including data mining, statistics, biology, and machine learning. The aim of this thesis is to present a survey of cluster analysis methods, of methods for determining the number of clusters, and a short survey of feature selection methods for unsupervised learning. An important part of this thesis is a software implementation for comparing different cluster analysis methods, focused on finding the optimal number of clusters and sorting data points into correct classes. The program also includes an implementation of the HFS feature selection method. Experimental validation of the methods was carried out in the Matlab environment. The final part of the thesis compares the success of the clustering methods on data with known output classes, and assesses the contribution of the HFS feature selection method to the quality of cluster analysis in the unsupervised setting.
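One common way to determine the number of clusters, of the kind such a comparison covers, is to score candidate partitions with the mean silhouette and keep the best-scoring k; the sketch below is illustrative and not taken from the thesis or its HFS implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Three well-separated synthetic blobs (illustrative data only).
X = np.vstack([rng.normal(c, 0.2, size=(30, 2))
               for c in ((0, 0), (4, 0), (0, 4))])

def init_centres(X, k):
    # Deterministic "maximin" seeding: start from X[0], then repeatedly
    # take the point farthest from all centres chosen so far.
    centres = [X[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centres], axis=0)
        centres.append(X[dist.argmax()])
    return np.array(centres)

def kmeans(X, k, iters=25):
    centres = init_centres(X, k)
    for _ in range(iters):
        labels = ((X[:, None] - centres[None]) ** 2).sum(-1).argmin(1)
        centres = np.array([X[labels == j].mean(0) if (labels == j).any()
                            else centres[j] for j in range(k)])
    return labels

def mean_silhouette(X, labels):
    d = np.linalg.norm(X[:, None] - X[None], axis=-1)
    n = len(X)
    scores = []
    for i in range(n):
        own = labels == labels[i]
        if own.sum() == 1:                  # silhouette of a singleton is 0
            scores.append(0.0)
            continue
        a = d[i, own & (np.arange(n) != i)].mean()          # cohesion
        b = min(d[i, labels == j].mean()
                for j in set(labels) if j != labels[i])     # separation
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

silhouettes = {k: mean_silhouette(X, kmeans(X, k)) for k in (2, 3, 4, 5)}
best_k = max(silhouettes, key=silhouettes.get)   # the true blob count wins
```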
|