  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Human Activity Recognition : Deep learning techniques for an upper-body exercise classification system

Nardi, Paolo January 2019 (has links)
Most research on Machine Learning models in the field of Human Activity Recognition focuses mainly on the classification of daily human activities and aerobic exercises. In this study, we focus on the use of 1 accelerometer and 2 gyroscope sensors to build a Deep Learning classifier to recognise 5 different strength exercises, as well as a null class. The strength exercises tested in this research are as follows: bench press, bent row, deadlift, lateral raises and overhead press. The null class contains recordings of daily activities, such as sitting or walking around the house. The model used in this paper creates consecutive overlapping fixed-length sliding windows for each exercise, which are processed separately and act as the input to a Deep Convolutional Neural Network. In this study we compare different sliding-window lengths and overlap percentages (step sizes) to obtain the optimal combination of window length and overlap. Furthermore, we compare the accuracy of 1D and 2D Convolutional Neural Networks. Cross-validation is used to check the overall accuracy of the classifiers; the database used in this paper contains 5 exercises performed by 3 different users, plus a null class. Overall, the models were found to perform accurately for windows of 0.5 seconds or longer, and they provide a solid foundation for building a more robust, fully integrated model that can recognise a wider variety of exercises.
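The thesis does not spell out its segmentation code; as a minimal sketch of the overlapping fixed-length sliding-window idea it describes (the 50 Hz sampling rate, 9-channel layout and 50 % overlap below are illustrative assumptions, not values from the study):

```python
import numpy as np

def sliding_windows(signal, window_len, step):
    """Cut a (time, channels) signal into consecutive overlapping
    fixed-length windows, the input format for the CNN."""
    windows = [signal[start:start + window_len]
               for start in range(0, len(signal) - window_len + 1, step)]
    return np.stack(windows)

# Hypothetical 50 Hz stream: a 0.5 s window is 25 samples; ~50 % overlap -> step 12.
# 9 channels stand in for 1 accelerometer + 2 gyroscopes, 3 axes each.
stream = np.random.randn(500, 9)
w = sliding_windows(stream, window_len=25, step=12)
print(w.shape)  # (40, 25, 9): 40 windows of 25 samples x 9 channels
```

Varying `window_len` and `step` reproduces the window-length / overlap grid search the abstract describes.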
32

Spatially resolved electronic structure of low-dimensional materials and data analysis

Peng, Han January 2018 (has links)
Two-dimensional (2D) materials attract tremendous research effort thanks to their interesting fundamental physics and potential applications. The versatile properties of 2D materials can be further tailored by tuning the electronic structure through the layer-stacking arrangement, whose main adjustable parameters are the thickness and the in-plane twist angle between layers. Angle-Resolved Photoemission Spectroscopy (ARPES) has become a canonical tool for studying the electronic structure of crystalline materials, and the recent development of ARPES with sub-micrometre spatial resolution (micro-ARPES) has made it possible to study the electronic structure of materials with mesoscopic domains. In this thesis, we use micro-ARPES to investigate the spatially resolved electronic structure of a series of few-layer materials. 1. We explore the electronic structure of domains with different numbers of layers in few-layer graphene on a copper substrate. We observe a layer-dependent substrate doping effect in which the Fermi surface of graphene shifts as the number of layers increases, which we explain with a multilayer effective-capacitor model. 2. We systematically study how the energy bands of twisted few-layer graphene evolve over a wide range of twist angles (from 5° to 31°). We directly observe van Hove singularities (vHSs) in twisted bilayer graphene, tunable in energy over a range of more than 2 eV. In addition, we observe the formation of multiple vHSs (at different binding energies) in trilayer graphene. The large tuning range of vHS binding energy in twisted few-layer graphene provides a promising material base for optoelectronic applications with broad-band wavelength selectivity. 3. To better extract energy-band features from ARPES data, we propose a new method based on a convolutional neural network (CNN) that achieves comparable or better results than traditional derivative-based methods.
Besides the ARPES studies, this thesis also includes a study of the surface reconstruction of the layered material Bi2O2Se through the analysis of Scanning Tunnelling Microscopy (STM) images. To explain the origin of the observed pattern, we propose a tile model that reproduces the statistics of the experiment.
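For context on the derivative-based band-extraction baselines the CNN method is compared against, one classic approach takes a curvature (second-derivative) map of the spectrum; a hedged sketch on a synthetic energy-momentum map (the band shape and broadening below are invented for illustration):

```python
import numpy as np

def second_derivative_map(spectrum, axis=0):
    """Band sharpening: the negated second derivative of ARPES intensity
    along the energy axis peaks where a band sits."""
    return -np.gradient(np.gradient(spectrum, axis=axis), axis=axis)

# Synthetic map: one parabolic band broadened by a Gaussian lineshape.
E = np.linspace(-1, 1, 200)[:, None]        # energy axis (rows)
k = np.linspace(-1, 1, 100)[None, :]        # momentum axis (columns)
band = np.exp(-((E - 0.3 * k ** 2) ** 2) / 0.01)
sharp = second_derivative_map(band)
peak_row = int(np.argmax(sharp[:, 50]))     # band position at k ~ 0
print(peak_row)
```

The sharpened map localises the band centre; the thesis's point is that a trained CNN can do this more robustly on noisy data.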
33

Multi-dialect Arabic broadcast speech recognition

Ali, Ahmed Mohamed Abdel Maksoud January 2018 (has links)
Dialectal Arabic speech research suffers from a lack of labelled resources and standardised orthography. There are three main challenges in dialectal Arabic speech recognition: (i) finding labelled dialectal Arabic speech data, (ii) training robust dialectal speech recognition models from limited labelled data, and (iii) evaluating speech recognition for dialects with no orthographic rules. This thesis makes the following three contributions. Arabic Dialect Identification: We mainly deal with Arabic speech without prior knowledge of the spoken dialect. Arabic dialects can be sufficiently diverse that one can argue they are different languages rather than dialects of the same language. We make two contributions here. First, we use crowdsourcing to annotate a multi-dialectal speech corpus collected from the Al Jazeera TV channel. We obtained utterance-level dialect labels for 57 hours of high-quality speech, drawn from almost 1,000 hours and consisting of four major varieties of dialectal Arabic (DA): Egyptian, Levantine, Gulf (Arabian Peninsula) and North African (Moroccan). Second, we build an Arabic dialect identification (ADI) system. We explore two main groups of features, namely acoustic features and linguistic features. For the linguistic features, we look at a wide range of units: words, characters and phonemes. With respect to acoustic features, we look at raw features such as mel-frequency cepstral coefficients combined with shifted delta cepstra (MFCC-SDC), bottleneck features, and the i-vector as a latent variable. We study both generative and discriminative classifiers, in addition to deep learning approaches, namely the deep neural network (DNN) and the convolutional neural network (CNN). We propose Arabic dialect identification as a five-class challenge comprising the four dialects mentioned above as well as Modern Standard Arabic.
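Character n-grams are among the linguistic features mentioned above. As a toy sketch of how character-level statistics can separate dialects (the transliterated snippets and the nearest-profile scoring are invented for illustration and are far simpler than the thesis's classifiers):

```python
from collections import Counter
import math

def char_ngrams(text, n=3):
    """Character trigram counts, one simple linguistic feature type."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    dot = sum(c * b[g] for g, c in a.items() if g in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented transliterated snippets standing in for per-dialect training text.
profiles = {
    "EGY": char_ngrams("ezayak eh dah keda awi"),
    "GLF": char_ngrams("shlonak wesh hatha wayed"),
}
utterance = char_ngrams("eh dah keda")
best = max(profiles, key=lambda d: cosine(utterance, profiles[d]))
print(best)  # EGY
```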
Arabic Speech Recognition: We introduce our effort in building Arabic automatic speech recognition (ASR) and in creating an open research community to advance it. This part has two main goals. First, creating a framework for Arabic ASR that is publicly available for research. We describe our effort in organising two multi-genre broadcast (MGB) challenges: MGB-2 focuses on broadcast news, using more than 1,200 hours of speech and 130M words of text collected from the broadcast domain, while MGB-3 focuses on dialectal multi-genre data with limited non-orthographic speech collected from YouTube, with special attention paid to transfer learning. Second, building a robust Arabic ASR system and reporting a competitive word error rate (WER) to serve as a benchmark for advancing the state of the art in Arabic ASR. Our overall system is a combination of five acoustic models (AM): unidirectional long short-term memory (LSTM), bidirectional LSTM (BLSTM), time-delay neural network (TDNN), TDNN layers followed by LSTM layers (TDNN-LSTM), and TDNN layers followed by BLSTM layers (TDNN-BLSTM). The AMs are purely sequence-trained neural networks using lattice-free maximum mutual information (LF-MMI). The generated lattices are rescored using a four-gram language model (LM) and a recurrent neural network with maximum entropy (RNNME) LM. Our official WER is 13%, the lowest reported on this task. Evaluation: The third part of the thesis addresses our effort in evaluating dialectal speech with no orthographic rules. Our methods learn from multiple transcribers and align the speech hypotheses to overcome the non-orthographic aspects. Our multi-reference WER (MR-WER) approach is similar to the BLEU score used in machine translation (MT). We have also automated this process by learning different spelling variants from Twitter data.
We automatically mine a huge collection of tweets in an unsupervised fashion to build more than 11M n-to-m lexical pairs, and we propose a new evaluation metric: dialectal WER (WERd). Finally, we estimate the word error rate (e-WER) with no reference transcription, using decoding and language features. We show that our word error rate estimation is robust in many scenarios, with and without the decoding features.
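The thesis's MR-WER involves aligning hypotheses across transcribers; a much-simplified sketch of the core intuition, scoring a hypothesis against whichever reference spelling variant is closest (the utterances are invented, and real MR-WER is more elaborate than a plain minimum):

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance with a rolling DP row."""
    dp = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        prev, dp[0] = dp[0], i
        for j, r in enumerate(ref, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (h != r))
    return dp[-1]

def multi_reference_wer(hyp, refs):
    """Score the hypothesis against the closest of several transcriptions."""
    hyp = hyp.split()
    return min(edit_distance(hyp, r.split()) / len(r.split()) for r in refs)

# Two spelling variants of the same dialectal utterance (invented example).
refs = ["shlonak ya sadiq", "shlonk ya sadeeq"]
wer = multi_reference_wer("shlonk ya sadiq", refs)
print(round(wer, 3))  # 0.333: one substitution against either reference
```

A conventional single-reference WER would penalise legitimate spelling variants that MR-WER and WERd are designed to forgive.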
34

Real-time localization of balls and hands in videos of juggling using a convolutional neural network

Åkerlund, Rasmus January 2019 (has links)
Juggling can be both a recreational activity that provides a wide variety of challenges to participants and an art form that can be performed on stage. Non-learning-based computer vision techniques, depth sensors, and accelerometers have been used in the past to augment these activities, but these solutions either require specialized hardware or only work in a very limited set of environments. In this project, a 54 000-frame annotated video dataset of juggling was created, and a convolutional neural network was successfully trained that could locate the balls and hands with high accuracy in a variety of environments. The network was sufficiently light-weight to provide real-time inference on CPUs. In addition, the locations of the balls and hands were recorded for thirty-six common juggling patterns, and small neural networks were trained that could categorize them almost perfectly. By building on the publicly available code, models and datasets that this project has produced, jugglers will be able to create interactive juggling games for beginners and novel audio-visual enhancements for live performances.
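The abstract does not say how the recorded ball locations are fed to the pattern classifiers; one plausible preprocessing step (purely an assumption for illustration) is to resample each tracked trajectory to a fixed-length, position-normalised vector:

```python
import numpy as np

def trajectory_features(positions, n_points=32):
    """Resample a variable-length (t, 2) ball track to a fixed-length,
    translation-normalised vector a small classifier could consume."""
    positions = np.asarray(positions, dtype=float)
    positions -= positions.mean(axis=0)          # remove absolute position
    t_old = np.linspace(0.0, 1.0, len(positions))
    t_new = np.linspace(0.0, 1.0, n_points)
    x = np.interp(t_new, t_old, positions[:, 0])
    y = np.interp(t_new, t_old, positions[:, 1])
    return np.concatenate([x, y])

track = [(i, ((i % 10) - 5) ** 2) for i in range(50)]   # toy repeating arcs
feat = trajectory_features(track)
print(feat.shape)  # (64,)
```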
35

Deterministic and Flexible Parallel Latent Feature Models Learning Framework for Probabilistic Knowledge Graph

Guan, Xiao January 2018 (has links)
Knowledge graphs are a rising topic in the field of Artificial Intelligence. As the current trend in knowledge representation, knowledge graph research makes use of the large knowledge bases freely available on the internet, and knowledge graphs allow inspection, analysis and reasoning over all the knowledge they contain. To realise the ambitious idea of modelling the knowledge of the world, different theories and implementations have emerged; today, we can draw on freely available information from Wikipedia and Wikidata. This thesis investigates and formulates a theory of learning from knowledge graphs. It studies probabilistic knowledge graphs, focusing on the branch known as latent feature models, which aim to predict possible relationships between connected entities and relations; many models exist for this task. The evaluation metrics and the training process are described in detail and improved in this thesis work, and their efficiency and correctness allow more complex models to be built with confidence. The thesis also discusses remaining problems in these findings and proposes future work.
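The abstract does not name a specific latent feature model; as one common example of the family, a DistMult-style scorer assigns each entity and relation a latent vector and reads the triple score as a probability (the embeddings below are random stand-ins for learned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 5, 2, 8

# Each entity and relation gets a latent embedding vector (normally learned).
E = rng.normal(size=(n_entities, dim))
R = rng.normal(size=(n_relations, dim))

def triple_probability(h, r, t):
    """DistMult-style score sigma(sum_d e_h[d] * w_r[d] * e_t[d]),
    read as the probability that the triple (h, r, t) holds."""
    s = float(np.sum(E[h] * R[r] * E[t]))
    return 1.0 / (1.0 + np.exp(-s))

p = triple_probability(0, 1, 3)
print(0.0 < p < 1.0)  # True: scores map to valid probabilities
```

Training fits `E` and `R` so that observed triples score high and corrupted ones score low; link prediction then ranks candidate triples by this probability.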
36

Learning Phantom Dose Distribution using Regression Artificial Neural Networks

Åkesson, Mattias January 2019 (has links)
Before radiation treatment of a cancer patient can be carried out, the treatment planning system (TPS) needs to undergo quality assurance (QA). The QA includes a pre-treatment QA (PT-QA) on a synthetic phantom body. During the PT-QA, data is collected from the phantom detectors, a set of monitors (transmission detectors), and the angular state of the machine. The goal of this thesis project is to investigate whether it is possible to predict the radiation dose distribution on the phantom body from the transmission-detector data and the angular state of the machine. The motivation is that an accurate prediction model could remove the PT-QA step from most patient treatments. The main prediction difficulties lie in reducing the noise contaminating the transmission-detector signals and in correctly mapping the transmission data onto the phantom. The task is solved with an artificial neural network (ANN) that uses a U-Net architecture to reduce the noise, combined with a novel model that maps the transmission values onto the phantom based on the angular state. The results show a median relative dose deviation of ~1%.
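The reported figure of merit, the median relative dose deviation, can be sketched directly (the detector readings below are invented; the thesis does not publish its raw values):

```python
import numpy as np

def median_relative_dose_deviation(predicted, measured):
    """Median over detectors of |predicted - measured| / measured."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.median(np.abs(predicted - measured) / measured))

measured = np.array([2.00, 1.50, 0.80, 1.20])    # toy detector doses
predicted = np.array([2.02, 1.49, 0.79, 1.21])   # toy model output
dev = median_relative_dose_deviation(predicted, measured)
print(round(dev, 4))  # 0.0092, i.e. just under 1 %
```

Using the median rather than the mean makes the metric robust to a few detectors with outlying errors.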
37

ESTIMATION OF DEPTH FROM DEFOCUS BLUR IN VIRTUAL ENVIRONMENTS COMPARING GRAPH CUTS AND CONVOLUTIONAL NEURAL NETWORK

Prodipto Chowdhury (5931032) 17 January 2019 (has links)
Depth estimation is one of the most important problems in computer vision. It has attracted a lot of attention because it has applications in many areas, such as robotics, VR and AR, and self-driving cars. Using the defocus blur of a camera lens is one method of depth estimation. In this thesis, we have studied this technique in virtual environments, creating virtual datasets for the purpose. We apply graph cuts and a convolutional neural network (DfD-Net) to estimate depth from defocus blur using a natural (Middlebury) and a virtual (Maya) dataset. Graph cuts showed similar performance for both the natural and virtual datasets in terms of NMAE and NRMSE. However, with regard to SSIM, graph cuts performed 4% better on Middlebury than on Maya. We trained the DfD-Net on the natural dataset, on the virtual dataset, and on the combination of both; the network trained on the virtual dataset performed best on both datasets. Comparing the two approaches, graph cuts performs 7% better than DfD-Net in terms of SSIM on Middlebury images, while DfD-Net outperforms graph cuts by 2% on Maya images. With regard to NRMSE, graph cuts and DfD-Net show similar performance on Maya images; on Middlebury images, graph cuts is 1.8% better. The algorithms show no difference in performance in terms of NMAE. Finally, DfD-Net generates depth maps 500 times faster than graph cuts for Maya images and 200 times faster for Middlebury images.
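The abstract compares methods by NMAE and NRMSE without defining the normalisation; a common convention (an assumption here, not confirmed by the thesis) divides by the range of the ground-truth depth map:

```python
import numpy as np

def nmae(pred, true):
    """Mean absolute error, normalised by the ground-truth depth range."""
    return float(np.mean(np.abs(pred - true)) / (true.max() - true.min()))

def nrmse(pred, true):
    """Root-mean-square error with the same normalisation."""
    return float(np.sqrt(np.mean((pred - true) ** 2)) / (true.max() - true.min()))

true = np.array([[1.0, 2.0], [3.0, 4.0]])   # toy ground-truth depth map
pred = np.array([[1.1, 2.0], [2.9, 4.2]])   # toy estimated depth map
print(round(nmae(pred, true), 4), round(nrmse(pred, true), 4))  # 0.0333 0.0408
```

NRMSE penalises large individual errors more heavily than NMAE, which is why the two metrics can rank the same pair of algorithms differently.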
38

Low-Cost and Scalable Visual Drone Detection System Based on Distributed Convolutional Neural Network

Hyun Hwang (5930672) 20 December 2018 (has links)
Recently, with the advancement of drone technology, more and more hobby drones are being manufactured and sold across the world. However, these drones can be repurposed for illicit activities such as hostile-load delivery. At the moment, not many systems are readily available for detecting and intercepting hostile drones. Although researchers at Purdue University have built a working prototype of a drone interceptor system, it was not ready for the general public because of its proof-of-concept nature and the high price of the military-grade RADAR it uses. It is essential to substitute such high-cost elements with low-cost ones to make a drone interception system affordable enough for large-scale deployment.

This study aims to provide an affordable alternative that substitutes the expensive, high-precision RADAR system with a Convolutional Neural Network based drone detection system built from multiple low-cost single-board computers. The experiment will assess the feasibility of the proposed system and evaluate the accuracy of drone detection in a controlled environment.
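The abstract does not describe how the work is distributed across the single-board computers; one plausible scheme (purely an assumption for illustration) partitions each camera frame into crops so that every board runs the CNN detector on a small region:

```python
import numpy as np

def split_into_tiles(frame, rows, cols):
    """Partition a frame into rows x cols crops, one per worker board,
    so each board only runs the CNN detector on its own region."""
    h, w = frame.shape[:2]
    tiles = []
    for r in range(rows):
        for c in range(cols):
            crop = frame[r * h // rows:(r + 1) * h // rows,
                         c * w // cols:(c + 1) * w // cols]
            tiles.append(((r, c), crop))
    return tiles

frame = np.zeros((480, 640, 3), dtype=np.uint8)   # one camera frame
tiles = split_into_tiles(frame, rows=2, cols=2)    # four single-board workers
print(len(tiles), tiles[0][1].shape)  # 4 (240, 320, 3)
```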
39

Metadata Validation Using a Convolutional Neural Network : Detection and Prediction of Fashion Products

Nilsson Harnert, Henrik January 2019 (has links)
In the e-commerce industry, importing data from third-party clothing brands requires validation of this data. If the validation is done manually, it is a tedious and time-consuming task. Part of this task can be replaced or assisted by using computer vision to automatically find clothing types, such as T-shirts and pants, in imported images. Once the clothing type is detected, it is possible to recommend, with a certain accuracy, the products most likely to correspond to the imported data. This was done alongside a prototype interface that can be used to start training, to find clothing types in an image, and to mask product annotations. Annotations are areas describing different clothing types and are used to train an object-detection model. A model for finding clothing types is trained with the Mask R-CNN object detector and achieves an accuracy of 0.49 mAP. A detection takes just over one second on an Nvidia GTX 1070 8 GB graphics card. Recommending one or several products based on a detection takes 0.5 seconds, using the k-nearest-neighbours algorithm. When predictions are made on the same products that were used to build the prediction model, almost perfect accuracy is achieved, while images of other products give considerably worse results.
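The k-nearest-neighbours recommendation step can be sketched as follows; the 16-dimensional feature vectors and product ids are invented stand-ins for whatever descriptors the thesis extracts from its detections:

```python
import numpy as np

def knn_recommend(query, catalogue, k=3):
    """Return ids of the k catalogue products whose feature vectors lie
    closest (Euclidean) to the detected garment's feature vector."""
    ids = list(catalogue)
    feats = np.stack([catalogue[i] for i in ids])
    dists = np.linalg.norm(feats - query, axis=1)
    return [ids[i] for i in np.argsort(dists)[:k]]

rng = np.random.default_rng(1)
catalogue = {f"prod_{i}": rng.normal(size=16) for i in range(20)}
query = catalogue["prod_7"] + rng.normal(scale=0.01, size=16)  # near-duplicate
print(knn_recommend(query, catalogue, k=1))  # ['prod_7']
```

This also illustrates the abstract's closing observation: a query built from a catalogued product matches almost perfectly, whereas a genuinely new image has no near neighbour and matches far less reliably.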
40

Human Activity Recognition Based on Transfer Learning

Pang, Jinyong 06 July 2018 (has links)
Human activity recognition (HAR) based on time-series data is the problem of classifying various patterns of activity. Its wide range of applications in health care carries huge commercial benefit. With the increasing spread of smart devices, people have a strong desire for services and products customised to their individual characteristics. Deep learning models can handle HAR tasks with satisfactory results; however, training a deep learning model consumes a great deal of time and computational resources, so developing a HAR system efficiently remains a challenging task. In this study, we develop a solid HAR system using a Convolutional Neural Network based on transfer learning, which can eliminate those barriers.
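The abstract does not detail its transfer-learning setup; a minimal NumPy illustration of the core idea it names (reuse a frozen feature extractor from a source task and train only a new classification head on the target data) follows, with all data, shapes and labels invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" feature extractor: weights reused as-is from a source task.
W_frozen = rng.normal(size=(9, 32))
def features(x):
    return np.tanh(x @ W_frozen)     # frozen layer, never updated below

# Target task: only a small softmax head is trained on the new data.
X = rng.normal(size=(200, 9))        # toy sensor windows (9 channels)
F = features(X)
y = (F[:, 0] > 0).astype(int)        # toy labels, separable in feature space
W_head = np.zeros((32, 2))
for _ in range(300):                 # plain gradient descent on cross-entropy
    logits = F @ W_head
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1     # softmax gradient: p - one_hot(y)
    W_head -= 0.1 * F.T @ p / len(y)

acc = float(np.mean((F @ W_head).argmax(axis=1) == y))
print(acc > 0.9)
```

Because only the small head is optimised, the expensive part of training is skipped, which is precisely the cost saving the abstract claims for transfer learning.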
