About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Approximate Neural Networks for Speech Applications in Resource-Constrained Environments

January 2016 (has links)
abstract: Speech recognition and keyword detection are becoming increasingly popular applications for mobile systems. While deep neural network (DNN) implementations of these systems have very good performance, they also have large memory and compute resource requirements, making their implementation on a mobile device quite challenging. In this thesis, techniques to reduce the memory and computation cost of keyword detection and speech recognition networks (or DNNs) are presented. The first technique is based on representing all weights and biases with a small number of bits and mapping all nodal computations into fixed-point ones, with minimal degradation in accuracy. Experiments conducted on the Resource Management (RM) database show that for the keyword detection neural network, representing the weights with 5 bits results in a 6-fold reduction in memory compared to a floating-point implementation, with very little loss in performance. Similarly, for the speech recognition neural network, representing the weights with 6 bits results in a 5-fold reduction in memory while maintaining an error rate similar to a floating-point implementation. Additional memory reduction is achieved by a technique called weight pruning, in which the weights are classified as sensitive or insensitive and the sensitive weights are represented with higher precision. A combination of these two techniques reduces the memory footprint by 81% and 84% for the speech recognition and keyword detection networks, respectively. Further reduction in memory size is achieved by judiciously dropping connections for large blocks of weights. The corresponding technique, termed coarse-grain sparsification, introduces hardware-aware sparsity during DNN training, which leads to efficient weight memory compression and a significant reduction in the number of computations during classification without loss of accuracy. Keyword detection and speech recognition DNNs trained with 75% of the weights dropped and classified with 5-6 bit weight precision effectively reduced the weight memory requirement by ~95% compared to a fully-connected network with double precision, while showing similar keyword detection accuracy and word error rate. / Dissertation/Thesis / Masters Thesis Computer Science 2016
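The uniform fixed-point scheme the abstract describes can be pictured with a short sketch. The function below is a minimal stand-in, assuming per-tensor symmetric quantization with one sign bit; the thesis's actual quantizer and rounding policy may differ.

```python
import numpy as np

def quantize_weights(w, num_bits=5):
    """Uniformly quantize a weight tensor to signed fixed-point with
    `num_bits` bits (sign bit included); returns integers plus a scale."""
    qmax = 2 ** (num_bits - 1) - 1        # e.g. 15 positive levels for 5 bits
    scale = np.max(np.abs(w)) / qmax      # per-tensor scale (an assumption)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

# 5-bit weights, as reported for the keyword-detection network
w = np.random.randn(256, 128).astype(np.float32)
q, s = quantize_weights(w, num_bits=5)
w_hat = q * s                             # dequantized values used at inference
print("max abs quantization error:", np.max(np.abs(w - w_hat)))
```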
32

Study of Knowledge Transfer Techniques For Deep Learning on Edge Devices

January 2018 (has links)
abstract: With the emergence of the edge computing paradigm, many applications such as image recognition and augmented reality need to perform machine learning (ML) and artificial intelligence (AI) tasks on edge devices. Most AI and ML models are large and computationally heavy, whereas edge devices are usually equipped with limited computational and storage resources. Such models can be compressed and reduced in order to be placed on edge devices, but they may lose their capability and may not generalize and perform as well as large models. Recent works have used knowledge transfer techniques to transfer information from a large network (termed the teacher) to a small one (termed the student) in order to improve the performance of the latter. This approach seems promising for learning on edge devices, but a thorough investigation of its effectiveness is lacking. The purpose of this work is to provide an extensive study of the performance (in terms of both accuracy and convergence speed) of knowledge transfer, considering different student-teacher architectures, datasets, and different techniques for transferring knowledge from teacher to student. A good performance improvement is obtained by transferring knowledge from both the intermediate layers and the last layer of the teacher to a shallower student, but other architectures and transfer techniques do not fare so well, and some of them even have a negative impact on performance. For example, a smaller and shorter network trained with knowledge transfer on Caltech 101 achieved a significant accuracy improvement of 7.36% and converged 16 times faster compared to the same network trained without knowledge transfer. On the other hand, a smaller network that is thinner than the teacher network performed worse, with an accuracy drop of 9.48% on Caltech 101, even with knowledge transfer. / Dissertation/Thesis / Masters Thesis Computer Science 2018
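A common way to implement the last-layer transfer this study examines is Hinton-style distillation, where the student matches the teacher's temperature-softened outputs. The sketch below is the generic formulation, not necessarily the thesis's exact setup; the temperature T and mixing weight alpha are illustrative choices.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend a soft KL term against the teacher's softened distribution
    with the usual hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                              # T^2 keeps gradient scale comparable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with a 10-class problem
teacher_logits = torch.randn(8, 10)          # would come from the frozen teacher
student_logits = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```

Intermediate-layer transfer, which the abstract reports as most effective when combined with the last layer, would add further regression terms between matched teacher and student feature maps.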
33

Hluboké neuronové sítě pro předpovídání prodejů / Deep Neural Networks for Sales Forecasting

Tyrpáková, Natália January 2016 (has links)
Sales forecasting is an essential part of supply chain management. In the retail business, accurate sales forecasts lead to significant cost reductions. Statistical methods that are commonly used for sales forecasting often overlook important aspects unique to sales time series, which lowers forecast accuracy. In this thesis we explore whether it is possible to improve short-term sales forecasting by employing deep neural networks. The thesis analyzes the performance of various traditional deep neural network designs and proposes a novel architecture. It also explores several data preprocessing methods, both traditional and non-traditional, which turn out to be a crucial part of sales forecasting with deep neural networks. The best deep neural network methods we found are then compared to other forecasting methods such as traditional neural networks and exponential smoothing.
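The preprocessing step the abstract emphasizes typically includes turning the raw series into supervised windows and stabilizing its variance. The sketch below shows one conventional way to do this; the 28-day lookback, 7-day horizon, and log1p transform are assumptions for illustration, not the thesis's choices.

```python
import numpy as np

def make_windows(series, lookback=28, horizon=7):
    """Build supervised pairs for short-term forecasting: `lookback` past
    values as input, the next `horizon` values as target."""
    X, y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t : t + lookback])
        y.append(series[t + lookback : t + lookback + horizon])
    return np.array(X), np.array(y)

daily_sales = np.random.poisson(lam=20, size=365).astype(float)  # mock data
daily_sales = np.log1p(daily_sales)     # variance-stabilizing preprocessing
X, y = make_windows(daily_sales)
print(X.shape, y.shape)                 # (331, 28) (331, 7)
```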
34

Applications of Tropical Geometry in Deep Neural Networks

Alfarra, Motasem 04 1900 (has links)
This thesis tackles the problem of understanding deep neural networks with piecewise linear activation functions. We leverage tropical geometry, a relatively new field in algebraic geometry, to characterize the decision boundaries of a single-hidden-layer neural network. This characterization is leveraged to understand and reformulate three interesting applications related to deep neural networks. First, we give a geometrical demonstration of the behaviour of the lottery ticket hypothesis. Moreover, we deploy the geometrical characterization of the decision boundaries to reformulate the network pruning problem. This new formulation aims to prune network parameters that do not contribute to the geometrical representation of the decision boundaries. In addition, we propose a dual view of adversarial attacks that tackles both designing perturbations to the input image and the equivalent perturbations to the decision boundaries.
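The object being characterized is the piecewise linear structure of f(x) = c^T max(Wx + b, 0): each ReLU activation pattern selects one linear piece, and the decision boundary {f = 0} is glued from pieces of the corresponding hyperplanes, which is what the tropical view organizes algebraically. The toy sketch below only enumerates those pieces; it is not the thesis's tropical construction.

```python
import itertools
import numpy as np

# Toy single-hidden-layer ReLU network on R^2: f(x) = c^T max(Wx + b, 0)
rng = np.random.default_rng(0)
W, b, c = rng.normal(size=(4, 2)), rng.normal(size=4), rng.normal(size=4)

def f(x):
    return c @ np.maximum(W @ x + b, 0.0)

# At any x, the activation pattern s fixes the linear piece f_s(x) = (c*s)^T(Wx + b)
x = rng.normal(size=2)
s = (W @ x + b > 0).astype(float)
assert np.isclose(f(x), (c * s) @ (W @ x + b))

# Enumerate all 2^4 candidate pieces; the decision boundary {f = 0} is a
# union of segments of the lines a_eff . x + b_eff = 0
for s in itertools.product([0.0, 1.0], repeat=4):
    s = np.asarray(s)
    a_eff, b_eff = (c * s) @ W, (c * s) @ b
    print(s, a_eff, b_eff)
```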
35

Informatics Approaches for Understanding Human Facial Attractiveness Perception and Visual Attention / 人間の顔の魅力知覚と視覚的注意の情報学的アプローチによる解明

Tong, Song 24 May 2021 (has links)
Kyoto University / New-system doctoral program / Doctor of Informatics / Degree No. Kō 23398 / Jōhaku No. 767 / Call number 新制||情||131 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / Examining committee: (Chair) Prof. 熊田 孝恒, Prof. 西田 眞也, Prof. 齋木 潤, Assoc. Prof. 延原 章平 / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
36

Rozpoznávání pojmenovaných entit v biomedicínské doméně / Named entity recognition in the biomedical domain

Williams, Shadasha January 2021 (has links)
Named entity recognition (NER) is an information extraction task that attempts to recognize and extract particular entities from a text. One issue with NER is that its models are domain-specific. The goal of this thesis is to focus on entities strictly from the biomedical domain. Another issue with NER comes from the synonymous terms that may be linked to one entity, which leads to the problem of entity disambiguation. Due to the popularity of neural networks and their success in NLP tasks, the work should use a neural network architecture for the task of named entity disambiguation, as described in the paper by Eshel et al. [1]. One of the subtasks of the thesis is to map the words and entities to a vector space using word embeddings, which attempt to capture textual context similarity and coherence [2]. The main output of the thesis will be a model that attempts to disambiguate entities of the biomedical domain, using scientific journals (PubMed and Embase) as the documents of interest.
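The embedding-based disambiguation step can be pictured as scoring each candidate entity against the mention's context vector. The sketch below is a deliberately simplified stand-in, assuming pre-computed embeddings and cosine similarity; the model of Eshel et al. uses a trained neural attention architecture rather than raw similarity.

```python
import numpy as np

def disambiguate(context_vec, candidate_vecs):
    """Return the index of the candidate entity embedding most similar
    (by cosine) to the mention's context embedding."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
    scores = [cos(context_vec, v) for v in candidate_vecs]
    return int(np.argmax(scores)), scores

# Hypothetical 4-dim embeddings: a mention of "cold" in a clinical note
context = np.array([0.9, 0.1, 0.2, 0.0])
candidates = [
    np.array([0.8, 0.2, 0.1, 0.1]),   # sense 1: common cold (disorder)
    np.array([0.0, 0.9, 0.1, 0.8]),   # sense 2: cold temperature (finding)
]
best, scores = disambiguate(context, candidates)
print(best, scores)                    # picks sense 1 here
```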
37

Human Understandable Interpretation of Deep Neural Networks Decisions Using Generative Models

Alabdallah, Abdallah January 2019 (has links)
Deep Neural Networks have long been considered black-box systems, and their lack of interpretability is a concern when they are applied in safety-critical systems. In this work, a novel approach to interpreting the decisions of DNNs is proposed. The approach depends on exploiting generative models and the interpretability of their latent space. Three methods for ranking features are explored: two depend on sensitivity analysis, and the third depends on a Random Forest model. The Random Forest model was the most successful at ranking the features, given its accuracy and inherent interpretability.
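The Random Forest ranking can be sketched directly with scikit-learn: fit a forest that predicts the DNN's decision from the generative model's latent codes, then read off impurity-based importances. The data here is synthetic and the relevant dimensions are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Z: latent codes of inputs under the generative model (n samples x d dims);
# y: the DNN's decision on the corresponding inputs. Both are mocked here.
rng = np.random.default_rng(1)
Z = rng.normal(size=(500, 16))
y = (Z[:, 3] + 0.5 * Z[:, 7] > 0).astype(int)   # toy: dims 3 and 7 matter

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Z, y)
ranking = np.argsort(rf.feature_importances_)[::-1]
print("latent dimensions ranked by importance:", ranking[:5])
```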
38

Inferential GANs and Deep Feature Selection with Applications

Yao Chen (8892395) 15 June 2020 (has links)
Deep neural networks (DNNs) have become popular due to their predictive power and flexibility in model fitting. In unsupervised learning, variational autoencoders (VAEs) and generative adversarial networks (GANs) are the two most popular and successful generative models. How to provide a unifying framework combining the best of VAEs and GANs in a principled way is a challenging task. In supervised learning, the demand for high-dimensional data analysis has grown significantly, especially in applications to social networking, bioinformatics, and neuroscience. How to simultaneously approximate the true underlying nonlinear system and identify relevant features from high-dimensional data (typically with the sample size smaller than the dimension, a.k.a. small-n-large-p) is another challenging task.

In this dissertation, we provide satisfactory answers to these two challenges. In addition, we illustrate some promising applications of modern machine learning methods.

In the first chapter, we introduce a novel inferential Wasserstein GAN (iWGAN) model, a principled framework that fuses auto-encoders and WGANs. GANs have been impactful on many problems and applications but suffer from unstable training. The Wasserstein GAN (WGAN) leverages the Wasserstein distance to avoid the caveats of the minimax two-player training of GANs but has other defects, such as mode collapse and the lack of a metric to detect convergence. The iWGAN model jointly learns an encoder network and a generator network motivated by the iterative primal-dual optimization process. The encoder network maps the observed samples to the latent space and the generator network maps samples from the latent space to the data space. We establish the generalization error bound of iWGANs to theoretically justify their performance. We further provide a rigorous probabilistic interpretation of our model under the framework of maximum likelihood estimation. The iWGAN, with a clear stopping criterion, has many advantages over other autoencoder GANs. Empirical experiments show that the iWGAN greatly mitigates the symptom of mode collapse, speeds up convergence, and is able to provide a quality-check measurement for each individual sample. We illustrate the ability of iWGANs by obtaining competitive and stable performance against the state-of-the-art on benchmark datasets.

In the second chapter, we present a general framework for high-dimensional nonlinear variable selection using deep neural networks under the framework of supervised learning. The network architecture includes both a selection layer and approximation layers. The problem can be cast as a sparsity-constrained optimization, with a sparse parameter in the selection layer and other parameters in the approximation layers. This problem is challenging due to the sparsity constraint and the nonconvex optimization. We propose a novel algorithm, called Deep Feature Selection, to estimate both the sparse parameter and the other parameters. Theoretically, we establish algorithm convergence and selection consistency when the objective function has a Generalized Stable Restricted Hessian. This result provides theoretical justification for our method and generalizes known results for high-dimensional linear variable selection. Simulations and real data analysis are conducted to demonstrate the superior performance of our method.

In the third chapter, we develop a novel methodology to classify electrocardiograms (ECGs) as normal, atrial fibrillation, or other cardiac dysrhythmias as defined by the PhysioNet Challenge 2017. More specifically, we use piecewise linear splines for the feature selection and a gradient boosting algorithm for the classifier. In the algorithm, the ECG waveform is fitted by a piecewise linear spline, and morphological features related to the spline coefficients are extracted. XGBoost is used to classify the morphological coefficients and heart rate variability features. The performance of the algorithm was evaluated on the PhysioNet Challenge database (3658 ECGs classified by experts). Our algorithm achieves an average F1 score of 81% for a 10-fold cross-validation and also achieved an 81% F1 score on the independent testing set. This score is similar to the 9th-highest score (81%) in the official phase of the PhysioNet Challenge 2017.

In the fourth chapter, we introduce a novel region-selection penalty in the framework of image-on-scalar regression to impose sparsity of pixel values and extract active regions simultaneously. This method helps identify regions of interest (ROIs) associated with certain diseases, which has a great impact on public health. Our penalty combines the Smoothly Clipped Absolute Deviation (SCAD) regularization, enforcing sparsity, and the SCAD of total variation (TV) regularization, enforcing spatial contiguity, into one group, which segments contiguous spatial regions against a zero-valued background. An efficient algorithm is based on the alternating direction method of multipliers (ADMM), which decomposes the nonconvex problem into two iterative optimization problems with explicit solutions. Another virtue of the proposed method is a divide-and-conquer learning algorithm, which allows scaling to large images. Several examples are presented and the experimental results are compared with other state-of-the-art approaches.
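The selection-layer idea of the second chapter can be sketched as a diagonal gating layer in front of ordinary approximation layers. Note this is a loose sketch: the dissertation enforces an explicit sparsity constraint with a dedicated algorithm, whereas the stand-in below just adds an L1 penalty on the selection weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectionNet(nn.Module):
    """A selection layer (one multiplicative weight per input feature)
    followed by approximation layers."""
    def __init__(self, p, hidden=64):
        super().__init__()
        self.select = nn.Parameter(torch.ones(p))       # sparse parameter
        self.body = nn.Sequential(                      # approximation layers
            nn.Linear(p, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x):
        return self.body(x * self.select)

net = SelectionNet(p=100)
x, y = torch.randn(32, 100), torch.randn(32, 1)
loss = F.mse_loss(net(x), y) + 1e-3 * net.select.abs().sum()  # L1 stand-in
loss.backward()          # gradients reach both the gate and the body
```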
39

Porovnání hlubokých neuronových sítí a standardních metod pro detekci dopravního značení / Comparison of deep learning and classical methods for traffic signs detection

Geiger, Petr January 2019 (has links)
The goal of this thesis is to explore and evaluate classic and deep neural network computer vision methods in the task of detecting the position of a level crossing barrier. This thesis builds on an initial detection algorithm using a Stable Wave Detector. The initial algorithm is optimized in both performance and quality of results; both are crucial, because the best method should be suitable as a component of a real-time level crossing safety system. Then another approach is implemented using deep neural networks and optimized in the same manner. Throughout the work, several datasets are created for both training and testing of the algorithms. Both approaches are finally evaluated on the same test datasets and the results are compared.
40

RESOURCE MANAGEMENT IN EDGE COMPUTING FOR INTERNET OF THINGS APPLICATIONS

Galanis, Ioannis 01 December 2020 (has links)
The Internet of Things (IoT) computing paradigm has connected smart objects ("things") and has brought new services to the proximity of the user. Edge Computing (EC), a natural evolution of the traditional IoT, has been proposed to deal with the ever-increasing (i) number of IoT devices and (ii) amount of data traffic produced by the IoT endpoints. EC promises to significantly reduce the unwanted latency imposed by multi-hop communication delays and suggests that, instead of uploading all the data to the remote cloud for further processing, it is beneficial to perform computation at the "edge" of the network, close to where the data is produced. However, bringing computation to the edge level has created numerous challenges, as edge devices struggle to keep up with growing application requirements (e.g., neural networks or video-based analytics). In this thesis, we adopt the EC paradigm and aim to address the open challenges. Our goal is to bridge the performance gap between the increasing requirements of IoT applications and the capabilities of IoT platforms, and to provide latency- and energy-efficient computation at the edge level.

Our first step is to study the performance of IoT applications that are based on Deep Neural Networks (DNNs). The exploding need to deploy DNN-based applications on resource-constrained edge devices has created several challenges, mainly due to the complex nature of DNNs. DNNs are becoming deeper and wider in order to fulfill users' expectations of high accuracy, while they also become power hungry. For instance, executing a DNN on an edge device can drain the battery within minutes. Our solution for making DNNs more energy- and inference-friendly is a hardware-aware method that re-designs a given DNN architecture. Instead of proxy metrics, we measure DNN performance on real edge devices and capture their energy and inference time. Our method manages to find alternative DNN architectures that consume up to 78.82% less energy and are up to 35.71% faster than the reference networks.

In order to achieve end-to-end optimal performance, we also need to manage the resources of the edge device that will execute a DNN-based application. Due to their unique characteristics, we distinguish edge devices into two categories: (i) neuromorphic platforms designed to execute Spiking Neural Networks (SNNs), and (ii) general-purpose edge devices suitable for hosting a DNN. For the first category, we train a traditional DNN and then convert it to a spiking representation. We target the SpiNNaker neuromorphic platform and develop a novel technique that efficiently configures the platform-dependent parameters in order to achieve the highest possible SNN accuracy. Experimental results show that our technique is 2.5× faster than an exhaustive approach and can reach up to 0.8% higher accuracy compared to a CPU-based simulation method. Regarding general-purpose edge devices, we show that a DNN-unaware platform can result in sub-optimal DNN performance in terms of power and inference time. Our approach configures the frequency of the device components (GPU, CPU, memory) and achieves an average of 33.4% (up to 66.3%) inference time improvement and an average of 42.8% (up to 61.5%) power savings compared to the predefined configuration of an edge device.

The last part of this thesis is the offloading optimization between the edge devices and the gateway. The offloaded tasks create contention effects on the gateway, which can lead to application slowdown. Our proposed solution configures (i) the number of application stages that are executed on each edge device, and (ii) the achieved utility in terms of Quality of Service (QoS) on each edge device. Our technique manages to (i) maximize the overall QoS, and (ii) simultaneously satisfy network constraints (bandwidth) and user expectations (execution time). For multi-gateway deployments, we tackle the problem of unequal workload distribution. In particular, we propose a workload-aware management scheme that performs intra- and inter-gateway optimizations. The intra-gateway mechanism provides a balanced execution environment for the applications and achieves up to 95% improvement in performance deviation compared to un-optimized systems. The inter-gateway method balances the workload among multiple gateways and is able to achieve a global performance threshold.
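The frequency-configuration step for general-purpose edge devices can be pictured as a constrained search over (CPU, GPU, memory) settings scored by measured latency and power. Everything below is a mock: the frequency values and the analytic cost model stand in for real sysfs knobs and on-device power and latency measurements.

```python
import itertools

# Illustrative frequency settings (GHz); a real device exposes discrete
# operating points through sysfs or vendor tools.
CPU_FREQS = [0.35, 1.42, 2.27]
GPU_FREQS = [0.11, 0.67, 1.30]
MEM_FREQS = [0.20, 1.60]

def measure(cpu, gpu, mem):
    """Mock cost model in place of running the DNN and reading sensors."""
    latency = 1.0 / (0.2 * cpu + 0.6 * gpu + 0.2 * mem)       # seconds
    power = 0.5 + 0.8 * cpu + 1.5 * gpu + 0.4 * mem           # watts
    return latency, power

# Pick the lowest-energy configuration that meets a 1 s latency budget,
# mirroring the idea of DNN-aware platform tuning.
feasible = [c for c in itertools.product(CPU_FREQS, GPU_FREQS, MEM_FREQS)
            if measure(*c)[0] <= 1.0]
best = min(feasible, key=lambda c: measure(*c)[0] * measure(*c)[1])  # joules
print("chosen (cpu, gpu, mem) in GHz:", best)
```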
