601 |
Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features (January 2018)
abstract: Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. This thesis presents a novel algorithm, Deep Temporal Clustering (DTC), that naturally integrates dimensionality reduction and temporal clustering into a single, fully unsupervised, end-to-end learning framework. The algorithm uses an autoencoder for temporal dimensionality reduction and a novel temporal clustering layer for cluster assignment, then jointly optimizes the clustering objective and the dimensionality reduction objective. Depending on the requirements of the application, the temporal clustering layer can be customized with any temporal similarity metric. Several similarity metrics and state-of-the-art algorithms are considered and compared. To gain insight into the temporal features the network has learned for clustering, a visualization method is applied that generates a region-of-interest heatmap for the time series. The viability of the algorithm is demonstrated on time series data from diverse domains, ranging from earthquakes to spacecraft sensor data. In each case, the proposed algorithm outperforms traditional methods; the superior performance is attributed to the fully integrated temporal dimensionality reduction and clustering criterion. / Dissertation/Thesis / Master's Thesis Computer Engineering 2018
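The clustering objective jointly optimized with the reconstruction loss can be illustrated with a DEC-style soft-assignment step: a Student's-t kernel over latent distances produces soft cluster memberships, which are sharpened into a target distribution and matched via KL divergence. This is a minimal NumPy sketch under that assumption; the thesis's actual temporal clustering layer and similarity metrics may differ.

```python
import numpy as np

def soft_assign(z, mu):
    """Soft cluster assignment: Student's-t kernel over latent distances."""
    d2 = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(-1)  # (n, k) squared distances
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)               # each row sums to 1

def target_distribution(q):
    """Sharpened targets that emphasise confident assignments."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def clustering_loss(q, p):
    """KL(p || q): the clustering term minimised jointly with reconstruction."""
    return float((p * np.log(p / q)).sum())

rng = np.random.default_rng(0)
z = rng.normal(size=(6, 2))     # latent codes from a (hypothetical) encoder
mu = rng.normal(size=(2, 2))    # cluster centroids
q = soft_assign(z, mu)
p = target_distribution(q)
loss = clustering_loss(q, p)
```

In the full framework this loss would be summed with the autoencoder's reconstruction loss and backpropagated through both the clustering layer and the encoder.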
|
602 |
Semi-automatic Training Data Generation for Cell Segmentation Network Using an Intermediary Curator Net. Ramnerö, David (January 2017)
In this work we create an image analysis pipeline to segment cells from microscopy image data. A portion of the segmented images is manually curated, and this curated data is used to train a Curator network that filters the whole dataset. The filtered data is then used to train a separate segmentation network to improve the cell segmentation. This technique can easily be applied to other types of microscopy object segmentation.
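The filtering step described above reduces to keeping only those automatic segmentations that the curator model rates above a confidence threshold. A minimal sketch (the threshold value and scores are illustrative, not taken from the thesis):

```python
def curate(images, scores, threshold=0.9):
    """Keep only segmentations the curator network rates at or above the threshold."""
    return [img for img, s in zip(images, scores) if s >= threshold]

dataset = ["cell_001", "cell_002", "cell_003", "cell_004"]
curator_scores = [0.95, 0.42, 0.91, 0.88]   # hypothetical curator outputs
training_set = curate(dataset, curator_scores)
```

The surviving subset then serves as training data for the final segmentation network.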
|
603 |
A deep learning approach for action classification in American football video sequences. Westerberg, Jacob (January 2017)
Artificial intelligence is a constant topic of conversation, with a field of research pushed forward by some of the world's largest companies and universities. Deep learning is a branch of machine learning within artificial intelligence, based on learning representations of data such as images and text by processing the data through deep neural networks. Sports are competitive businesses that have become more data driven over the years, and statistics play a big role in the development of practitioners and tactics. Because statistics are obtained manually, sports organizations maintain large statistics teams; teaching a machine to recognize patterns and actions with deep learning would save a lot of time. In this thesis a deep learning approach is used to examine how well the actions pass and run can be classified in American football games. A deep learning architecture is first developed and trained on a public video dataset, and then trained to classify run and pass plays on a new American football dataset called the All-22 dataset. The results, together with earlier research, show that deep learning has the potential to automate sports statistics but is not yet ready to take over the role of statistics teams. Further research, larger and more task-specific datasets, and more complex architectures are required to improve the performance of this type of deep learning based video recognition.
|
604 |
Classification Performance of Convolutional Neural Networks. Mattsson, Niklas (January 2016)
The purpose of this thesis is to determine the performance of convolutional neural networks in classifications per millisecond (not training time or accuracy) on the GTX 960 and the Tegra X1. This is done by varying the parameters of the convolutional neural networks and using the function profiler of the Python framework Theano to measure the time taken by different networks. The results show that increasing any parameter of a convolutional neural network also increases the time required to classify an image. The parameters do not penalize the network equally, however: convolutional layers and their depth have a far bigger negative impact on the network's performance than fully connected layers and the number of neurons in them. Additionally, the time needed to train a network does not appear to correlate with the time needed for classification.
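The thesis measures inference latency with Theano's profiler; the same "classifications per millisecond" metric can be sketched generically with a timing harness. The `dummy_classify` stand-in below is an assumption, substituting for a trained network's forward pass:

```python
import time

def classifications_per_ms(classify, inputs, repeats=5):
    """Best-of-N timing: classifications per millisecond over the input batch."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for x in inputs:
            classify(x)
        best = min(best, time.perf_counter() - start)
    return len(inputs) / (best * 1000.0)

# Hypothetical stand-in for a trained network's forward pass.
dummy_classify = lambda x: max(range(3), key=lambda c: (x * (c + 1)) % 7)
rate = classifications_per_ms(dummy_classify, list(range(100)))
```

Taking the best of several repeats reduces noise from other processes, which matters when comparing small per-image latencies across hardware.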
|
605 |
Fast recursive biomedical event extraction / Extraction rapide et récursive des événements biomédicaux. Liu, Xiao (25 September 2014)
The Internet, along with modern media for communication, information and entertainment, has brought a massive increase in the quantity of digital data. Automatically processing and understanding these data enables the creation of large knowledge bases, more efficient search, social media research, etc. Natural language processing research concerns the design and development of algorithms that allow computers to process natural language in text, audio, images or video automatically for specific tasks. Due to the complexity of human language, natural language processing of text can be divided into four levels: morphology, syntax, semantics and pragmatics. Current natural language processing technologies have achieved great success in tasks at the first two levels, leading to many commercial applications such as search. However, advanced structured search engines require computers to understand language more deeply than at the morphological and syntactic levels. Information extraction aims to extract meaningful structural information from unannotated or semi-annotated resources, to enable advanced search and to automatically create knowledge bases for further use. This thesis studies the problem of information extraction in the specific domain of biomedical event extraction.
We propose an efficient solution that is a trade-off between the two main families of methods proposed in previous work. The solution reaches a good balance between performance and speed, making it suitable for processing large-scale data: it achieves performance competitive with the best models at a much lower computational complexity. While designing this model, we also studied the effects of the different classifiers usually proposed to solve the multi-class classification problem, and tested two simple methods for integrating word vector representations learned by deep learning into our model. Even though the different classifiers and the integration of word vectors do not greatly improve performance, we believe these research directions carry promising potential for improving information extraction.
|
606 |
Application of the German Traffic Sign Recognition Benchmark on the VGG16 network using transfer learning and bottleneck features in Keras. Persson, Siri (January 2018)
Convolutional Neural Networks (CNNs) are successful tools for image classification. CNNs are inspired by the animal visual cortex, using a connectivity pattern similar to that between neurons. The purpose of this thesis is to create a classifier, using transfer learning, that classifies images of traffic signs from the German Traffic Sign Recognition Benchmark (GTSRB) with good accuracy, and to improve the performance further by tuning the hyperparameters. The pre-trained CNN used is the VGG16 network from the paper "Very deep convolutional networks for large-scale image recognition". The results show that the VGG16 network reached an accuracy of 74.5% for the hyperparameter set with a learning rate of 1e-6, a batch size of 15 and a dropout rate of 0.3. The conclusion is that transfer learning using bottleneck features is a good tool for building a classifier when only a small amount of training data is available, and that the results could probably be improved further by using more real data or data augmentation for both training and testing, and by tuning more of the network's hyperparameters.
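In the bottleneck-feature approach, the frozen VGG16 convolutional base is run once over the dataset and only a small classification head is trained on the extracted activations. The sketch below stands in for that head with a plain NumPy softmax classifier trained on synthetic features; the feature dimensions, class counts, and data are illustrative assumptions, not the thesis's setup:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(features, labels, classes, lr=0.1, epochs=200):
    """Softmax-regression head on frozen bottleneck features (stand-in for the dense layers)."""
    n, d = features.shape
    W = np.zeros((d, classes))
    onehot = np.eye(classes)[labels]
    for _ in range(epochs):
        p = softmax(features @ W)
        W -= lr * features.T @ (p - onehot) / n   # gradient of mean cross-entropy
    return W

rng = np.random.default_rng(1)
# Hypothetical bottleneck features for 3 sign classes, 30 samples each.
labels = np.repeat(np.arange(3), 30)
feats = rng.normal(size=(90, 8)) + np.eye(3)[labels] @ (rng.normal(size=(3, 8)) * 3)
W = train_head(feats, labels, classes=3)
acc = float((softmax(feats @ W).argmax(1) == labels).mean())
```

Because the convolutional base is never updated, training is cheap even on small datasets, which is exactly why the technique suits the limited-data setting the thesis describes.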
|
607 |
Neural Networks for Semantic Segmentation in the Food Packaging Industry. Carlsson, Mattias (January 2018)
Industrial applications of computer vision often utilize traditional image processing techniques, whereas state-of-the-art methods in most image processing challenges are almost exclusively based on convolutional neural networks (CNNs). There is thus large potential for improving the performance of many machine vision applications by incorporating CNNs. One such application is the classification of juice boxes with straws, where the baseline solution uses classical image processing techniques on depth images to reject or accept juice boxes. This thesis aims to investigate how CNNs perform on the task of semantic segmentation (pixel-wise classification) of these images, and whether the result can be used to increase classification performance. A drawback of CNNs is that they usually require large amounts of labelled training data to generalize and learn anything useful. As labelled data is hard to come by, two ways to obtain cheap data are investigated: synthetic data generation, and automatic labelling using the baseline solution. The implemented network performs well on semantic segmentation, even when trained on synthetic data only, though performance increases with the ratio of real (automatically labelled) to synthetic images. The classification task is very sensitive to small errors in the semantic segmentation, and the results are therefore not as good as the baseline solution. It is suspected that the drop in performance between validation and test data is due to a domain shift between the data sets, e.g. variations in data collection and in straw and box type; fine-tuning to the target domain could therefore increase performance. When trained on synthetic data only, the domain shift is even larger and the classification performance is next to useless. The results could likely be improved by more advanced data generation, e.g. a generative adversarial network (GAN), or by more rigorous modelling of the data.
|
608 |
A Deep Reinforcement Learning Framework where Agents Learn a Basic form of Social Movement. Ekstedt, Erik (January 2018)
For social robots to move and behave appropriately in dynamic and complex social contexts, they need to be flexible in their movement behaviors. The natural complexity of social interaction makes this a difficult property to encode programmatically. Instead of programming these behaviors by hand, it could be preferable to have the system learn them. In this project a framework is created in which an agent, through deep reinforcement learning, can learn how to mimic poses, here defined as the most basic case of social movement. The framework aims to be as agent-agnostic as possible and suitable for both real-life robots and virtual agents, through an approach called "dancer in the mirror". The framework uses the learning algorithm Proximal Policy Optimization (PPO) and, as a proof of concept, trains agents both in a virtual environment for the humanoid robot Pepper and for virtual agents in a physics simulation environment. The framework is meant to be a simple starting point that can be extended to incorporate increasingly complex tasks. This project shows that the framework is functional for agents learning to mimic poses in a simplified environment.
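In a pose-mimicking task like the "dancer in the mirror" setup, the reward signal driving the reinforcement learner can be as simple as a function that peaks when the agent's joint configuration matches the target's. The shaping below (an exponential of the squared joint-angle error) is a hypothetical sketch, not the thesis's actual reward:

```python
import math

def pose_reward(agent_joints, target_joints, scale=1.0):
    """Reward shrinks with distance between agent and target poses (hypothetical shaping)."""
    err = sum((a - t) ** 2 for a, t in zip(agent_joints, target_joints))
    return math.exp(-scale * err)   # 1.0 for a perfect match, approaching 0 as poses diverge

perfect = pose_reward([0.1, -0.5, 1.2], [0.1, -0.5, 1.2])
off = pose_reward([0.0, 0.0, 0.0], [0.1, -0.5, 1.2])
```

A bounded, smooth reward of this form works with any joint parameterization, which keeps the formulation agent-agnostic across Pepper and the simulated agents.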
|
609 |
Självkörande fordon : och hur de kan komma att påverka vägtransporter / Self-driving vehicles : and how they might affect road transportation. Pesonen, Mikael (January 2017)
This study was conducted by reviewing relevant theoretical material available up to the beginning of the second quarter of 2017. The theoretical material was tested against experts in the field through interviews, and finally the combined theoretical and empirical material was interpreted by the author, who also conveys his own thoughts on self-driving vehicles. The study focuses on self-driving road vehicles such as cars and trucks, and on the consequences self-driving vehicles may have in a future scenario for the movement of materials, goods and people within and between communities. Aspects highlighted include what infrastructure changes might occur and how the environment may be affected. One contribution of this study is an assessment of when fully self-driving vehicles are likely to become available and when they will have become established. The study shows that there are potentially large gains from introducing self-driving vehicles on public roads, including socio-economic benefits and commercial aspects such as higher road safety and lower fuel consumption through platooning, and thereby also lower emissions. Self-driving vehicles should also enable a smoother traffic flow as traffic accidents become fewer and wave-like patterns in traffic are reduced with increasingly automated driving. This should also raise road capacity, since more vehicles per unit of time can pass through an area, and thereby increase productivity. The environmental goals of the UN, the EU and Sweden should be able to drive the transition from fossil-fuel vehicles to electric vehicles through various incentives, and as more people switch to electric vehicles (within the time frame of the set environmental goals), it is also likely that these electric vehicles will be equipped with self-driving functionality, not least for safety reasons; the sale of electric vehicles should therefore also benefit the sale of self-driving vehicles or self-driving systems.
There are also risks with self-driving vehicles, since parts of, or entire, self-driving systems could potentially be manipulated, disrupted or disabled. There are questions about which ethical choices an AI makes, and about the data collected and sent from the vehicles to the cloud. Self-driving vehicles may also entail an infringement of personal privacy, since there may be video cameras in the vehicles, or because sensitive information collected by the vehicles could be exposed and result in stalking, higher insurance premiums or some other unwelcome event. Humans have a limited capacity for rational decision-making ("administrative man"), since a person evaluates only some of the possible alternatives and some of their consequences, and tends to choose the first satisfactory alternative that appears. This can create certain obstacles to, among other things, the implementation of self-driving vehicles. The obstacles identified against self-driving vehicles are primarily laws and regulations, such as the question of liability in an accident and the lack of unified laws and regulations adapted to the new technology. Acceptance of the new technology can also be an obstacle, as can uncertainty about the future labour market, since jobs may disappear and be replaced by machines.
|
610 |
Deep Convolutional Neural Networks For Detecting Cellular Changes Due To Malignancy. Wieslander, Håkan; Forslid, Gustav (January 2017)
Discovering cancer at an early stage is an effective way to increase the chance of survival. However, since most screening processes are done manually, they are time-inefficient and thus costly. One way of automating the screening process could be to classify cells using Convolutional Neural Networks, which have been proven to produce high accuracy in image classification tasks. This thesis investigates whether Convolutional Neural Networks can be used as a tool to detect cellular changes due to malignancy in the oral cavity and uterine cervix. Two datasets containing oral cells and two datasets containing cervical cells were used, with the cells divided into normal and abnormal for binary classification. The performance was evaluated for two different network architectures, ResNet and VGG. For the oral datasets the accuracy varied between 78-82% correctly classified cells depending on the dataset and network; for the cervical datasets it varied between 84-86%. These results indicate a high potential for classifying abnormalities in oral and cervical cells. ResNet was shown to be the preferable network, with a higher accuracy and a smaller standard deviation.
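The reported metrics, accuracy per dataset and the standard deviation used to compare ResNet and VGG, reduce to simple computations over predictions and repeated runs. A minimal sketch with hypothetical predictions and run accuracies (the numbers below are illustrative, not the thesis's results):

```python
import statistics

def accuracy(preds, labels):
    """Fraction of correctly classified cells."""
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

# Hypothetical predictions for a binary normal (0) / abnormal (1) split.
preds  = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
labels = [1, 0, 0, 1, 0, 0, 1, 1, 1, 1]
acc = accuracy(preds, labels)

run_accs = [0.81, 0.79, 0.82, 0.78]   # hypothetical accuracies over repeated runs
spread = statistics.stdev(run_accs)   # lower spread means a more stable network
```

Comparing architectures on both the mean and the spread of repeated runs, as the thesis does, guards against preferring a network that merely got lucky on one split.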
|