491

Exploration of Energy Efficient Hardware and Algorithms for Deep Learning

Syed Sarwar (6634835) 14 May 2019 (has links)
Deep Neural Networks (DNNs) have emerged as the state-of-the-art technique in a wide range of machine learning tasks for analytics and computer vision in the next generation of embedded (mobile, IoT, wearable) devices. Despite their success, they suffer from high energy requirements in both inference and training. In recent years, the inherent error resiliency of DNNs has been exploited by introducing approximations at either the algorithmic or the hardware level (individually) to obtain energy savings while incurring tolerable accuracy degradation. We perform a comprehensive analysis to determine the effectiveness of cross-layer approximations for the energy-efficient realization of large-scale DNNs. Our experiments on recognition benchmarks show that cross-layer approximation provides substantial improvements in energy efficiency for different accuracy/quality requirements. Furthermore, we propose a synergistic framework for combining these approximation techniques.

To reduce the training complexity of Deep Convolutional Neural Networks (DCNNs), we replace certain weight kernels of the convolutional layers with Gabor filters. The convolutional layers use the Gabor filters, which extract intrinsic features, as fixed weight kernels alongside regular trainable weight kernels. This combination creates a balanced system that gives better training performance in terms of energy and time, compared to a standalone DCNN (without any Gabor kernels), in exchange for tolerable accuracy degradation. We also explore an efficient training methodology for incrementally growing a DCNN to allow new classes to be learned while sharing part of the base network. Our approach is an end-to-end learning framework in which we focus on reducing the incremental training complexity while achieving accuracy close to the upper bound without using any of the old training samples. Finally, we explore spiking neural networks for energy efficiency. Training deep spiking neural networks from direct spike inputs is difficult, since their temporal dynamics are not well suited to the standard supervised training algorithms used for DNNs. We propose a spike-based backpropagation training methodology for state-of-the-art deep Spiking Neural Network (SNN) architectures. This methodology enables real-time training of deep SNNs while achieving comparable inference accuracies on standard image recognition tasks.
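The fixed Gabor kernels described above can be sketched concretely. The snippet below generates the real part of a standard 2D Gabor filter in pure Python; the function name, parameter names, and default values are illustrative assumptions, not taken from the thesis.

```python
import math

def gabor_kernel(size, theta, sigma=2.0, lam=4.0, gamma=0.5):
    """Generate a fixed 2D Gabor kernel (real part) of shape size x size.

    theta: orientation of the filter in radians.
    sigma, lam, gamma: envelope width, carrier wavelength, and aspect
    ratio (illustrative defaults, not values from the thesis).
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope modulated by a cosine carrier.
            env = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(env * math.cos(2 * math.pi * xr / lam))
        kernel.append(row)
    return kernel

# A small bank of kernels at different orientations; such a bank could be
# installed as frozen (non-trainable) weights in an early convolutional
# layer, alongside regular trainable kernels.
bank = [gabor_kernel(5, k * math.pi / 4) for k in range(4)]
```

Because these kernels are computed analytically rather than learned, they receive no gradient updates, which is where the training energy and time savings come from.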
492

Weight parameterizations in deep neural networks

Zagoruyko, Sergey 07 September 2018 (has links)
Multilayer neural networks were first proposed more than three decades ago, and various architectures and parameterizations have been explored since. Recently, graphics processing units have enabled very efficient neural network training and made it possible to train much larger networks on larger datasets, dramatically improving performance on various supervised learning tasks. However, generalization is still far from human level, and it is difficult to understand what the decisions made are based on. To improve generalization and understanding, we revisit the problems of weight parameterization in deep neural networks. We identify what are, in our view, the most important problems in modern architectures: network depth, parameter efficiency, and learning multiple tasks at the same time, and we try to address them in this thesis. We start with one of the core problems of computer vision, patch matching, and propose to use convolutional neural networks of various architectures to solve it, instead of hand-crafted descriptors. Then, we address the task of object detection, where a network should simultaneously learn to predict both the class of the object and its location. In both tasks we find that the number of parameters in the network is the major factor determining its performance, and we explore this phenomenon in residual networks. Our findings show that their original motivation, training deeper networks for better representations, does not fully hold: wider networks with fewer layers can be as effective as deeper ones with the same number of parameters. Overall, we present an extensive study of architectures and weight parameterizations, and of ways of transferring knowledge between them.
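The trade-off between width and depth at a fixed parameter budget can be illustrated with a back-of-the-envelope count for 3x3 convolutions (a sketch of the general argument, not the exact architectures studied in the thesis):

```python
def conv_params(c_in, c_out, k=3):
    # Weight count of a k x k convolution, ignoring biases.
    return k * k * c_in * c_out

def stack_params(width, depth, k=3):
    # A plain stack of `depth` conv layers, all at the same channel width.
    return sum(conv_params(width, width, k) for _ in range(depth))

deep_narrow = stack_params(width=64, depth=16)   # deeper, thinner stack
shallow_wide = stack_params(width=128, depth=4)  # shallower, 2x wider stack
# 9*64*64*16 == 9*128*128*4: the two parameter budgets coincide.
```

Doubling the width quadruples the per-layer cost, so a stack one quarter as deep reaches the same budget; the empirical point above is that the wider, shallower network can be just as effective.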
493

Learning Deep Representations: Toward a better understanding of the deep learning paradigm

Arnold, Ludovic 25 June 2013 (has links)
Since 2006, deep learning algorithms, which rely on deep architectures with several layers of increasingly complex representations, have been able to outperform state-of-the-art methods in several settings. Deep architectures can be very efficient in terms of the number of parameters required to represent complex operations, which makes them very appealing for achieving good generalization with small amounts of data. Although training deep architectures has traditionally been considered a difficult problem, a successful approach has been to employ an unsupervised layer-wise pre-training step to initialize deep supervised models. First, unsupervised learning has many benefits with respect to generalization because it relies only on unlabeled data, which is easily found. Second, the possibility of learning representations layer by layer, instead of all layers at once, further improves generalization and reduces computational time. However, deep learning is a very recent approach and still poses many theoretical and practical questions concerning the consistency of layer-wise learning with many layers, and difficulties such as evaluating performance, performing model selection, and optimizing layers. In this thesis we first discuss the limitations of the current variational justification for layer-wise learning, which does not generalize well to many layers. We ask whether a layer-wise method can ever be truly consistent, i.e. capable of finding an optimal deep model by training one layer at a time without knowledge of the upper layers. We find that layer-wise learning can in fact be consistent and can lead to optimal deep generative models. To do this, we introduce the Best Latent Marginal (BLM) upper bound, a new criterion which represents the maximum log-likelihood of a deep generative model where the upper layers are unspecified. We prove that maximizing this criterion for each layer leads to an optimal deep architecture, provided the rest of the training goes well. Although this criterion cannot be computed exactly, we show that it can be maximized effectively by auto-encoders when the encoder part of the model is allowed to be as rich as possible. This gives a new justification for stacking models trained to reproduce their input, and yields better results than the state-of-the-art variational approach. Additionally, we give a tractable approximation of the BLM upper bound and show that it can accurately estimate the final log-likelihood of models. Taking advantage of these theoretical advances, we propose a new method for performing layer-wise model selection in deep architectures, and a new criterion to assess whether adding more layers is warranted. As for the difficulty of training layers, we also study the impact of metrics and parametrization on the gradient descent procedure commonly used for log-likelihood maximization. We show that gradient descent is implicitly linked to the metric of the underlying space, and that the Euclidean metric may often be an unsuitable choice, as it introduces a dependence on parametrization and can lead to a breach of symmetry. To mitigate this problem, we study the benefits of the natural gradient and show that it can restore symmetry, regrettably at a high computational cost. We thus propose that a centered parametrization may alleviate the problem with almost no computational overhead.
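The parametrization dependence of plain (Euclidean) gradient descent can be seen already in one dimension: rescaling the parameter changes where a single gradient step lands, even though the loss is the same function. The toy example below illustrates the general point only; it is not an experiment from the thesis.

```python
def gd_step_w(w, lr=0.1):
    # One gradient step on L(w) = (w - 2)^2, with dL/dw = 2 (w - 2).
    return w - lr * 2 * (w - 2)

def gd_step_theta(theta, a, lr=0.1):
    # Same loss through the reparametrization w = a * theta,
    # so dL/dtheta = 2 a (a * theta - 2).
    return theta - lr * 2 * a * (a * theta - 2)

w0 = 0.0
w_direct = gd_step_w(w0)                    # step taken directly in w
w_via_theta = 2 * gd_step_theta(w0 / 2, 2)  # same start, rescaled by a = 2
# The two updates land at different points in w-space (0.4 vs 1.6):
# the Euclidean gradient step depends on the parametrization.
```

The natural gradient removes this dependence by measuring step lengths in the metric induced by the model rather than by the raw parameters, at the computational cost discussed above.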
494

Modeling the structure of amorphous silicon using deep learning algorithms

Comin, Massimiliano 08 1900 (has links)
No description available.
495

Authentication Using Deep Learning on User Generated Mouse Movement Images

Enström, Olof January 2019 (has links)
Continuous authentication using behavioral biometrics can provide an additional layer of protection against online account hijacking and fraud. Mouse dynamics classification is the concept of determining the authenticity of a user by applying machine learning algorithms to mouse movement data. This thesis investigates the viability of state-of-the-art deep learning technologies for mouse dynamics classification by designing convolutional neural network classifiers that take mouse movement images as input. For purposes of comparison, classifiers using the random forest algorithm and engineered features inspired by related works are implemented and tested on the same data set as the neural network classifier. A technique for lowering bias toward the on-screen location of mouse movement images, here called 'centering', is introduced, although its effectiveness is questionable and requires further research. Centering is used for the deep learning-based classification methods alongside images not using the technique. The neural network classifiers yielded single-action classification accuracies of 66% with centering and 78% without. The random forest classifiers achieved an average accuracy of 79% for single-action classification, which is very close to the results of other studies using similar methods. In addition to single-action classification, set-based classification is performed. This is the method most suitable for implementation in an actual authentication system, as its accuracy is much higher. The neural network and random forest classifiers have different strengths. The neural network is proficient at classifying mouse actions that are similar in length, location, and curvature. The random forest classifiers seem to be more consistent in these regards, although their accuracy deteriorates for especially long actions. As the different classification methods in this study have different strengths and weaknesses, a composite classification experiment was conducted in which the output was determined by the less ambiguous output of the two models. This composite classification had an accuracy of 83%, meaning it outperformed both individual models.
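One way to read "the least ambiguous output of the two models" is a margin rule: follow whichever model's score is furthest from the decision threshold. The abstract does not spell out the exact combination rule, so the function below is a hypothetical sketch, and the score values are made up.

```python
def composite_predict(score_nn, score_rf, threshold=0.5):
    """Combine two genuine-user scores in [0, 1] by following the model
    whose score lies furthest from the threshold, i.e. the least
    ambiguous of the two. Returns True for 'accept'."""
    margin_nn = abs(score_nn - threshold)
    margin_rf = abs(score_rf - threshold)
    chosen = score_nn if margin_nn >= margin_rf else score_rf
    return chosen >= threshold

# Example: the network is unsure (0.55) but the forest is confident
# the user is an impostor (0.10), so the composite rejects.
```

Such a rule lets a confident model override an uncertain one, which is one plausible way the composite could beat both individual models.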
496

Travel time estimation for emergency services

Pereira, Iman, Ren, Guangan January 2019 (has links)
Emergency services fulfil a vital function in society: besides saving lives, a functioning emergency service system provides the inhabitants of any given society with a sense of security. Because of the delicate nature of the services provided, there is always an interest in improving the performance of the system, and a variety of models can be used as decision-making support. An important component in many of these models is the travel time of an emergency vehicle. This study focuses on travel time estimation for the emergency services and on how it can be estimated using a neural network, referred to in this report as a deep learning process (DLP). The data used are map-matched GPS points collected by the emergency services in two Swedish counties, Östergötland and Västergötland. The map-matched data were then matched with NVDB, the national road database, adding an extra layer of information, such as road link geometry, number of roundabouts, etc. To find the most important features to use as input to the developed model, Pearson and Spearman correlation tests were performed. Even if these two tests do not capture all possible relations between features, they still give an indication of which features can be included. The deep learning process developed in this study uses route length, average weighted speed limit, resource category, and road width. It is trained with 75% of the data, leaving the remaining 25% for testing the model. The DLP gives a mean absolute error of 51.39 seconds when trained and 59.21 seconds when presented with new data. This compares to a simpler model, which calculates the travel time by dividing the route length by the weighted average speed limit and gives a mean absolute error of 227.48 seconds. According to the error metrics used to evaluate the models, the DLP performs better than the current model. However, there is a dimension of complexity to the DLP which makes it something of a black box, where something goes in and an estimated travel time comes out. If the aim is to have a more comprehensible model, then the current model has its benefits over a DLP. However, the potential of a DLP is intriguing, and with a more in-depth analysis of features and how to classify them, in combination with more data, there may be room for developing more complex DLPs.
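The simpler comparison model above, together with the mean absolute error metric, can be written out directly. The route lengths, speed limits, and observed times below are invented for illustration; the 227.48- and 59.21-second figures in the abstract come from the real data set.

```python
def baseline_travel_time(route_length_m, avg_speed_limit_kmh):
    # Convert km/h to m/s, then time = distance / speed, in seconds.
    return route_length_m / (avg_speed_limit_kmh / 3.6)

def mean_absolute_error(predicted, observed):
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)

# Hypothetical routes: (length in metres, weighted speed limit in km/h).
routes = [(5000, 60), (12000, 90)]
pred = [baseline_travel_time(length, v) for length, v in routes]
obs = [360.0, 500.0]  # observed travel times in seconds (made up)
mae = mean_absolute_error(pred, obs)
```

The baseline systematically ignores congestion, stops, and driving behavior, which is why a learned model with more features can cut the error so sharply.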
497

Synthesis of Thoracic Computer Tomography Images using Generative Adversarial Networks

Hagvall Hörnstedt, Julia January 2019 (has links)
The use of machine learning algorithms to enhance and facilitate medical diagnosis and analysis is a promising and important area, which could reduce the workload of clinicians substantially. For machine learning algorithms to learn a certain task, large amounts of data need to be available. Data sets for medical image analysis are rarely public due to restrictions concerning the sharing of patient data. The production of synthetic images could act as an anonymization tool, enabling the distribution of medical images and facilitating the training of machine learning algorithms that could be used in practice. This thesis investigates the use of Generative Adversarial Networks (GANs) for the synthesis of new thoracic computer tomography (CT) images with no connection to real patients. It also examines the usefulness of the images by comparing the quantitative performance of a segmentation network trained with the synthetic images to that of the same segmentation network trained with real thoracic CT images. The synthetic thoracic CT images were generated using CycleGAN for image-to-image translation between label map ground truth images and thoracic CT images. The synthetic images were evaluated using different set-ups of synthetic and real images for training the segmentation network. All set-ups were evaluated in terms of sensitivity, accuracy, Dice and F2-score and compared to the same metrics for a segmentation network trained with 344 real images. The thesis shows that it was possible to generate synthetic thoracic CT images using a GAN. However, within the scope of this thesis, a segmentation network trained with synthetic data could not match the quantitative performance of one trained with the same number of real images. It was, however, possible to match the quantitative performance of a segmentation network trained on real images by training with a combination of real and synthetic images, where the majority of the images were synthetic and a minority were real. Using a combination of 59 real images and 590 synthetic images, performance equal to that of a segmentation network trained with 344 real images was achieved with regard to sensitivity, Dice and F2-score. Equal quantitative performance could thus be achieved using fewer real images together with an abundance of synthetic images, created at close to no cost, indicating the usefulness of synthetically generated images.
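The Dice and F2-score used to compare the segmentation networks have standard definitions for binary masks; F-beta weights recall beta^2 times as heavily as precision, which is why F2 tracks sensitivity closely. The functions below are a generic sketch of those standard definitions, not code from the thesis.

```python
def dice_score(pred, truth):
    # pred, truth: flat lists of 0/1 pixel labels.
    tp = sum(p and t for p, t in zip(pred, truth))
    return 2 * tp / (sum(pred) + sum(truth))

def f_beta(pred, truth, beta=2.0):
    # F-beta from pixel-wise true/false positives and false negatives.
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

Dice coincides with F1, so reporting both Dice and F2 gives a balanced view and a recall-weighted view of the same confusion counts.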
498

Active Stereo Reconstruction using Deep Learning

Kihlström, Helena January 2019 (has links)
Depth estimation using stereo images is an important task in many computer vision applications. A stereo camera contains two image sensors that observe the scene from slightly different viewpoints, making it possible to find the depth of the scene. An active stereo camera also uses a laser projector that projects a pattern into the scene. The advantage of the laser pattern is the additional texture that gives better depth estimations in dark and textureless areas.  Recently, deep learning methods have provided new solutions producing state-of-the-art performance in stereo reconstruction. The aim of this project was to investigate the behavior of a deep learning model for active stereo reconstruction, when using data from different cameras. The model is self-supervised, which solves the problem of having enough ground truth data for training the model. It instead uses the known relationship between the left and right images to let the model learn the best estimation. The model was separately trained on datasets from three different active stereo cameras. The three trained models were then compared using evaluation images from all three cameras. The results showed that the model did not always perform better on images from the camera that was used for collecting the training data. However, when comparing the results of different models using the same test images, the model that was trained on images from the camera used for testing gave better results in most cases.
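The self-supervision mentioned above commonly works by warping one view with the predicted disparity and penalizing the photometric difference to the other view, so no ground-truth depth is needed. The 1-D sketch below assumes this standard reconstruction loss; the model in the thesis may differ in its details.

```python
def warp_row(right_row, disparities):
    # Reconstruct the left scanline by sampling the right one at x - d.
    out = []
    for x, d in enumerate(disparities):
        src = x - d
        out.append(right_row[src] if 0 <= src < len(right_row) else 0.0)
    return out

def photometric_loss(left_row, right_row, disparities):
    # Mean L1 difference between the left scanline and its reconstruction.
    recon = warp_row(right_row, disparities)
    return sum(abs(l - r) for l, r in zip(left_row, recon)) / len(left_row)

# With the correct disparity of 1 pixel, the right scanline reconstructs
# the left one (up to the occluded border) and the loss vanishes.
loss = photometric_loss([0.0, 0.0, 1.0, 2.0], [0.0, 1.0, 2.0, 3.0], [1, 1, 1, 1])
```

Minimizing this loss over predicted disparities is what lets the network learn depth from stereo pairs alone, which is why no labeled training data is required.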
499

End-to-End Road Lane Detection and Estimation using Deep Learning

Vigren, Malcolm, Eriksson, Linus January 2019 (has links)
The interest for autonomous driving assistance, and ultimately self-driving cars, has increased vastly over the last decade. Automotive safety continues to be a priority for manufacturers, politicians and the public alike. Visual systems aiding the driver have lately been boosted by advances in computer vision and machine learning. In this thesis, we evaluate the concept of an end-to-end machine learning solution for detecting and classifying road lane markings, and compare it to a more classical semantic segmentation solution. The analysis is based on the frame-by-frame scenario, and shows that our proposed end-to-end system has clear advantages when it comes to detecting the existence of lanes and producing a consistent, lane-like output, especially in adverse conditions such as weak lane markings. Our proposed method allows the system to predict its own confidence, thereby allowing the system to suppress its own output when it is not deemed safe enough. The thesis finishes with proposed future work needed to achieve optimal performance and create a system ready for deployment in an active safety product.
500

Evaluation of Multiple Object Tracking in Surveillance Video

Nyström, Axel January 2019 (has links)
Multiple object tracking is the process of assigning unique and consistent identities to objects throughout a video sequence. A popular approach to multiple object tracking, and object tracking in general, is a method called tracking-by-detection. Tracking-by-detection is a two-stage procedure: an object detection algorithm first detects objects in a frame, and these objects are then associated with already tracked objects by a tracking algorithm. One of the main concerns of this thesis is to investigate how different object detection algorithms perform on surveillance video supplied by the National Forensic Centre. The thesis then goes on to explore how the stand-alone performance of the object detection algorithm correlates with the overall performance of a tracking-by-detection system. Finally, the thesis investigates how the use of visual descriptors in the tracking stage of a tracking-by-detection system affects performance. Results presented in this thesis suggest that the capacity of the object detection algorithm is highly indicative of the overall performance of the tracking-by-detection system. Further, this thesis also shows how the use of visual descriptors in the tracking stage can reduce the number of identity switches and thereby increase the performance of the whole system.
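The second, association stage of tracking-by-detection can be sketched as a greedy matching of new detections to existing tracks; using a visual-descriptor distance here, rather than position alone, is what helps reduce identity switches. The cost function, threshold, and data layout below are illustrative assumptions, not the system evaluated in the thesis.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def associate(tracks, detections, max_dist=1.0):
    """Greedily match detections to tracks by descriptor distance.

    tracks, detections: dicts mapping id -> descriptor vector.
    Returns a dict of detection id -> matched track id; detections or
    tracks with no match within max_dist are left out.
    """
    pairs = sorted(
        (euclidean(d, t), did, tid)
        for did, d in detections.items()
        for tid, t in tracks.items()
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for dist, did, tid in pairs:
        if dist > max_dist:
            break  # remaining pairs are even further apart
        if tid in used_tracks or did in used_dets:
            continue
        matches[did] = tid
        used_tracks.add(tid)
        used_dets.add(did)
    return matches
```

An identity switch occurs when a detection of one person is matched to another person's track; a discriminative descriptor widens the distance gap between correct and incorrect pairs, making such mismatches rarer.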
