Global ETD Search

31	VOICE COMMAND RECOGNITION WITH DEEP NEURAL NETWORK ON EDGE DEVICES Md Naim Miah (11185971) 26 July 2021 Interconnected devices are becoming attractive solutions for integrating physical parameters and making them more accessible for further analysis. Edge devices, located at the end of the physical world, measure and transfer data to a remote server using either wired or wireless communication. The exploding number of sensors used in the Internet of Things (IoT), medical fields, and industry demands huge bandwidth and computational capability in the cloud, where the data are processed by Artificial Neural Networks (ANNs), especially when audio, video, and images arrive from hundreds of edge devices. Additionally, continuous transmission of information to the remote server not only hampers privacy but also increases latency and consumes more power. Deep Neural Networks (DNNs) are proving very effective for cognitive tasks such as speech recognition and object detection, and are attracting researchers to apply them on edge devices. Microcontrollers and single-board computers are the most commonly used types of edge devices. These have advanced significantly over the years and are now capable of performing more sophisticated computations, making them a reasonable choice for implementing DNNs. In this thesis, a DNN model is trained and implemented for Keyword Spotting (KWS) on two types of edge devices: a bare-metal embedded device (microcontroller) and a robot car. The unnecessary components and noise of the audio samples are removed, and speech features are extracted using Mel-Frequency Cepstral Coefficients (MFCC). On the bare-metal microcontroller platform, these features are extracted efficiently using a Digital Signal Processing (DSP) library, which makes the calculation much faster. A Depthwise Separable Convolutional Neural Network (DSCNN) based model is proposed and trained to an accuracy of about 91% with only 721 thousand trainable parameters. 
After implementing the DNN on the microcontroller, the converted model takes only 11.52 Kbyte (2.16%) of RAM and 169.63 Kbyte (8.48%) of Flash on the test device. It needs to perform 287,673 Multiply-and-Accumulate (MACC) operations and takes about 7 ms to execute the model. The trained model is also implemented on a robot car, the Jetbot, to build a voice-controlled robotic vehicle. This robot accepts a few selected voice commands, such as “go” and “stop”, and executes them with reasonable accuracy. The Jetbot takes about 15 ms to execute the KWS. Thus, this study demonstrates the implementation of neural network based KWS on two different types of edge devices: a bare-metal embedded device without any Operating System (OS) and a robot car running an embedded Linux OS. It also shows the feasibility of bare-metal offline KWS implementation for autonomous systems, particularly autonomous vehicles. Edge Device Deep Neural Network (DNN) Microcontroller Speech Command Recognition Jetson Nano
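The small parameter budget reported above is typical of depthwise separable convolutions. The following is a minimal sketch, not the thesis's model: it compares the trainable parameter count of a standard convolution with that of a depthwise separable one. The layer shape (3x3 kernel, 64 input and 64 output channels) and the function names are assumptions chosen for illustration only.

```python
def standard_conv_params(k_h, k_w, c_in, c_out, bias=True):
    """Trainable parameters of a standard 2D convolution layer."""
    return k_h * k_w * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(k_h, k_w, c_in, c_out, bias=True):
    """Trainable parameters of a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = k_h * k_w * c_in + (c_in if bias else 0)   # one k_h x k_w filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)      # 1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, not taken from the thesis.
std = standard_conv_params(3, 3, 64, 64)
dsc = depthwise_separable_params(3, 3, 64, 64)
print(std, dsc, round(std / dsc, 2))
```

For this assumed layer the separable form needs roughly an eighth of the parameters, which is why such layers suit devices with a few hundred kilobytes of memory.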
32	Search for Stop using Machine Learning : A Bachelors Project in Physics Gautam, Daniel January 2021 In this thesis, the application of machine learning algorithms as a tool in the search for the top squark is studied. Two neural network models are trained with simulated stop events as signal against dileptonic and semi-leptonic top pair production events as background. There is a substantial class imbalance between the number of signal and background samples that are used. The performance of the neural network models is compared to the performance of a cut and count method. Neither model outperforms the standard cut and count method. Particle physics ATLAS Supersymmetry Machine learning Deep neural network Stop Top squark Binary classification Subatomic Physics Subatomär fysik
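As a rough illustration of the baseline mentioned above, a cut and count analysis applies fixed selection thresholds and counts the surviving signal and background events. The sketch below is not the thesis's analysis: the feature names (`met`, `mt2`), the thresholds, the toy events, and the use of s/sqrt(b) as a figure of merit are all assumptions made for the example.

```python
import math


def cut_and_count(events, cuts):
    """Count events whose features all exceed the given lower-bound cuts."""
    return sum(
        1
        for event in events
        if all(event[name] > threshold for name, threshold in cuts.items())
    )


# Toy events with hypothetical features; values are invented for illustration.
signal = [
    {"met": 320.0, "mt2": 140.0},
    {"met": 180.0, "mt2": 90.0},
    {"met": 410.0, "mt2": 160.0},
]
background = [
    {"met": 120.0, "mt2": 60.0},
    {"met": 260.0, "mt2": 115.0},
    {"met": 90.0, "mt2": 40.0},
    {"met": 150.0, "mt2": 130.0},
]
cuts = {"met": 200.0, "mt2": 100.0}

s = cut_and_count(signal, cuts)
b = cut_and_count(background, cuts)
# s / sqrt(b) is a common approximate significance, used here as an assumption.
significance = s / math.sqrt(b) if b > 0 else float("inf")
print(s, b, significance)
```

A neural network classifier replaces the fixed thresholds with a learned decision function, so comparing the two on the same figure of merit is the natural benchmark.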
33	Evaluation of 3D motion capture data from a deep neural network combined with a biomechanical model Rydén, Anna, Martinsson, Amanda January 2021 Motion capture has in recent years attracted growing interest in many fields, from the game industry to sports analysis. The need for reflective markers and expensive multi-camera systems limits adoption, since they are costly and time-consuming. One solution could be a deep neural network trained to extract 3D joint estimations from a 2D video captured with a smartphone. This master's thesis project has investigated the accuracy of a trained convolutional neural network, MargiPose, that estimates 25 joint positions in 3D from a 2D video, against a gold standard, a multi-camera Vicon system. The project has also investigated whether the data from the deep neural network can be connected to a biomechanical modelling software, AnyBody, for further analysis. The final intention of this project was to analyze how accurate such a combination could be in golf swing analysis. The accuracy of the deep neural network has been evaluated with three parameters: marker position, angular velocity, and kinetic energy for different segments of the human body. MargiPose delivers results with high accuracy (Mean Per Joint Position Error (MPJPE) = 1.52 cm) for a simpler movement, but for a more advanced motion such as a golf swing, MargiPose achieves lower accuracy in marker distance (MPJPE = 3.47 cm). The mean difference in angular velocity shows that MargiPose has difficulty following segments that are occluded or have greater motion, such as the wrists in a golf swing, which both move fast and are occluded by other body segments. The conclusion of this research is that it is possible to connect data from a trained CNN with a biomechanical modelling software. Whether the accuracy of the network is sufficient depends heavily on the intended use of the data. 
For the purpose of golf swing analysis, this could be a great and cost-effective solution which could enable motion analysis for professionals but also for interested beginners. MargiPose shows a high accuracy when evaluating simple movements. However, when using it with the intention of analyzing a golf swing in a biomechanical modelling software, the outcome might be beyond the bounds of reliable results. Human pose estimation motion capture deep neural network CNN MargiPose biomechanical modelling AnyBody modelling system Other Medical Engineering Annan medicinteknik
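The MPJPE metric quoted above is the mean Euclidean distance between estimated and reference joint positions, averaged over joints and frames. The following is a minimal sketch of that definition; the function name and the toy coordinates are illustrative assumptions and do not come from the thesis.

```python
import math


def mpjpe(predicted, reference):
    """Mean Per Joint Position Error over frames and joints.

    Both arguments are lists of frames; each frame is a list of (x, y, z)
    joint positions in the same unit (e.g. centimetres).
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and have the same number of frames")
    total = 0.0
    count = 0
    for pred_frame, ref_frame in zip(predicted, reference):
        if len(pred_frame) != len(ref_frame):
            raise ValueError("frames must contain the same number of joints")
        for pred_joint, ref_joint in zip(pred_frame, ref_frame):
            total += math.dist(pred_joint, ref_joint)  # Euclidean distance in 3D
            count += 1
    return total / count


# Toy data: one frame with two joints (hypothetical values).
estimated = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
gold = [[(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]]
print(mpjpe(estimated, gold))
```

Because the metric averages over all joints, a low overall MPJPE can still hide large errors on fast or occluded joints such as the wrists.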
34	The Subcellular Localization and Protein-protein Interactions of Barley Mixed-Linkage-(1->3),(1->4)-ß-D-Glucan Synthase CSLF6 and CSLH1 Zhou, Yadi January 2018 No description available. Biochemistry Bioinformatics subcellular localization protein-protein interaction barley CSLF6 CSLH1 gene co-expression network deep neural network
35	Application of Deep Learning in Deep Space Wireless Signal Identification for Intelligent Channel Sensing Kabir, Md Faisal January 2020 No description available. Electrical Engineering Modulation Classification NASA SCaN Testbed Deep Space Wireless Signal Deep Learning Convolutional Neural Network Deep Neural Network
36	Strategies for Sparsity-based Time-Frequency Analyses Zhang, Shuimei, 0000-0001-8477-5417 January 2021 Nonstationary signals are widely observed in many real-world applications, e.g., radar, sonar, radio astronomy, communication, acoustics, and vibration applications. Joint time-frequency (TF) domain representations provide a time-varying spectrum for their analysis, discrimination, and classification. Nonstationary signals commonly exhibit sparse occupancy in the TF domain. In this dissertation, we incorporate such sparsity to enable robust TF analysis in impaired observing environments. In practice, missing data samples frequently occur during signal reception for various reasons, e.g., propagation fading, measurement obstruction, removal of impulsive noise or narrowband interference, and intentional undersampling. Missing data samples in the time domain translate into missing entries in the instantaneous autocorrelation function (IAF) and induce artifacts in the TF representation (TFR). Compared to random missing samples, a more realistic and more challenging problem is the existence of burst missing data samples. Unlike random missing samples, which cause artifacts that spread uniformly over the entire TF domain, burst missing samples cause artifacts that are highly localized around the true instantaneous frequencies, making TF analysis extremely challenging and rendering many existing methods ineffective. In this dissertation, our objective is to develop novel signal processing techniques that offer effective TF analysis capability in the presence of burst missing samples. We propose two mutually related methods that recover missing entries in the IAF and reconstruct high-fidelity TFRs, which approach full-data results with negligible performance loss. 
In the first method, an IAF slice corresponding to the time or lag is converted to a Hankel matrix, and its missing entries are recovered via atomic norm minimization. The second method generalizes this approach to reduce the effects of TF crossterms. It considers an IAF patch, which is reformulated as a low-rank block Hankel matrix, and the annihilating filter-based approach is used to interpolate the IAF and recover the missing entries. Both methods are insensitive to signal magnitude differences. Furthermore, we develop a novel machine learning-based approach that offers crossterm-free TFRs with effective autoterm preservation. The superiority and usefulness of the proposed methods are demonstrated using simulated and real-world signals. / Electrical and Computer Engineering Electrical engineering Burst missing samples Crossterm mitigation Deep neural network Low-rank structured matrix completion Nonstationary signal Time-frequency analysis
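To make the Hankel structure mentioned above concrete, the sketch below builds a lag slice of the IAF, arranges it as a Hankel matrix, and checks that the matrix is rank one for a single complex sinusoid. It assumes the common discrete form R(t, tau) = x[t + tau] * conj(x[t - tau]) and models a missing sample as a zero. It illustrates only the structure being exploited; it does not implement atomic norm minimization or the annihilating filter approach, and all names are hypothetical.

```python
import cmath


def iaf_slice(x, t, max_lag):
    """Lag slice R(t, tau) = x[t + tau] * conj(x[t - tau]) for tau = 0..max_lag."""
    return [x[t + tau] * x[t - tau].conjugate() for tau in range(max_lag + 1)]


def hankel(seq, rows):
    """Hankel matrix H[i][j] = seq[i + j] with the given number of rows."""
    cols = len(seq) - rows + 1
    return [[seq[i + j] for j in range(cols)] for i in range(rows)]


def is_rank_one(matrix, tol=1e-9):
    """True if every entry satisfies H[i][j] * H[0][0] == H[i][0] * H[0][j]."""
    return all(
        abs(matrix[i][j] * matrix[0][0] - matrix[i][0] * matrix[0][j]) < tol
        for i in range(len(matrix))
        for j in range(len(matrix[0]))
    )


# Single complex sinusoid with an assumed normalized frequency of 0.1.
x = [cmath.exp(2j * cmath.pi * 0.1 * k) for k in range(17)]
lag_slice = iaf_slice(x, t=8, max_lag=8)
H = hankel(lag_slice, rows=5)

# Model one missing sample as a zero; it wipes out the matching IAF entry.
x_missing = list(x)
x_missing[10] = 0
lag_slice_missing = iaf_slice(x_missing, t=8, max_lag=8)
H_missing = hankel(lag_slice_missing, rows=5)

print(is_rank_one(H), is_rank_one(H_missing))
```

The missing sample destroys the low-rank structure, which is the property that structured matrix completion methods try to restore when filling in the missing IAF entries.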
37	IN-MEMORY COMPUTING WITH CMOS AND EMERGING MEMORY TECHNOLOGIES Shubham Jain (7464389) 17 October 2019 Modern computing workloads such as machine learning and data analytics perform simple computations on large amounts of data. Traditional von Neumann computing systems, which consist of separate processor and memory subsystems, are inefficient in realizing modern computing workloads due to frequent data transfers between these subsystems that incur significant time and energy costs. In-memory computing embeds computational capabilities within the memory subsystem to alleviate the fundamental processor-memory bottleneck, thereby achieving substantial system-level performance and energy benefits. In this dissertation, we explore a new generation of in-memory computing architectures that are enabled by emerging memory technologies and new CMOS-based memory cells. The proposed designs realize Boolean and non-Boolean computations natively within memory arrays.
For Boolean computing, we leverage the unique characteristics of emerging memories that allow multiple word lines within an array to be simultaneously enabled, opening up the possibility of directly sensing functions of the values stored in multiple rows using a single access. We propose Spin-Transfer Torque Compute-in-Memory (STT-CiM), a design for in-memory computing with modifications to peripheral circuits that leverage this principle to perform logic, arithmetic, and complex vector operations. We address the challenge of reliable in-memory computing under process variations by utilizing error detecting and correcting codes to control errors during CiM operations. We demonstrate how STT-CiM can be integrated within a general-purpose computing system and propose architectural enhancements to processor instruction sets and on-chip buses for in-memory computing.
For non-Boolean computing, we explore crossbar arrays of resistive memory elements, which are known to compactly and efficiently realize a key primitive operation involved in machine learning algorithms, i.e., vector-matrix multiplication. We highlight a key challenge involved in this approach: the actual function computed by a resistive crossbar can deviate substantially from the desired vector-matrix multiplication operation due to a range of device- and circuit-level non-idealities. It is essential to evaluate the impact of the errors introduced by these non-idealities at the application level. There has been no study of the impact of non-idealities on the accuracy of large-scale workloads (e.g., Deep Neural Networks [DNNs] with millions of neurons and billions of synaptic connections), in part because existing device and circuit models are too slow to use in application-level evaluation. We propose a Fast Crossbar Model (FCM) to accurately capture the errors arising due to crossbar non-idealities while being four to five orders of magnitude faster than circuit simulation. We also develop RxNN, a software framework to evaluate DNN inference on resistive crossbar systems. Using RxNN, we evaluate a suite of large-scale DNNs developed for the ImageNet Challenge (ILSVRC). Our evaluations reveal that the errors due to resistive crossbar non-idealities can degrade the overall accuracy of DNNs considerably, motivating the need for compensation techniques. Subsequently, we propose CxDNN, a hardware-software methodology that enables the realization of large-scale DNNs on crossbar systems with minimal degradation in accuracy by compensating for errors due to non-idealities. CxDNN comprises (i) an optimized mapping technique to convert floating-point weights and activations to crossbar conductances and input voltages, (ii) a fast re-training method to recover accuracy loss due to this conversion, and (iii) low-overhead compensation hardware to mitigate dynamic and hardware-instance-specific errors. Unlike previous efforts that are limited to small networks and require the training and deployment of hardware-instance-specific models, CxDNN presents a scalable compensation methodology that can address large DNNs (e.g., ResNet-50 on ImageNet), and enables a common model to be trained and deployed on many devices.
For non-Boolean computing, we also propose TiM-DNN, a programmable hardware accelerator that is specifically designed to execute ternary DNNs. TiM-DNN supports various ternary representations including unweighted (-1,0,1), symmetric weighted (-a,0,a), and asymmetric weighted (-a,0,b) ternary systems. TiM-DNN is an in-memory accelerator designed using TiM tiles, specialized memory arrays that perform massively parallel signed vector-matrix multiplications on ternary values per access. TiM tiles are in turn composed of Ternary Processing Cells (TPCs), new CMOS-based memory cells that function as both ternary storage units and signed scalar multiplication units. We evaluate an implementation of TiM-DNN in 32 nm technology using an architectural simulator calibrated with SPICE simulation and RTL synthesis. TiM-DNN achieves a peak performance of 114 TOPs/s, consumes 0.9 W power, and occupies 1.96 mm² chip area, representing a 300X improvement in TOPS/W compared to a state-of-the-art NVIDIA Tesla V100 GPU. In comparison to popular quantized DNN accelerators, TiM-DNN achieves 55.2X-240X and 160X-291X improvement in TOPS/W and TOPS/mm², respectively.
In summary, the dissertation proposes new in-memory computing architectures and addresses the need for scalable modeling frameworks and compensation techniques for resistive crossbar based in-memory computing fabrics. Our evaluations show that in-memory computing architectures are promising for realizing modern machine learning and data analytics workloads, and can attain orders-of-magnitude improvement in system-level energy and performance over traditional von Neumann computing systems. Computer Engineering In-memory Computing Processing-in-Memory Resistive Crossbar Deep Neural Network Emerging memories STT-MRAM Spintronics
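The ternary vector-matrix multiplication at the heart of this entry can be sketched in software. The example below is a functional illustration only, not a model of the TiM tile hardware: weights are stored as codes in {-1, 0, +1} and decoded with an asymmetric weighting (-a, 0, b), and the efficiency figures are simple quotients of the peak numbers quoted in the abstract. The function name, the scale values, and the toy matrices are assumptions.

```python
def ternary_vmm(inputs, weight_codes, neg_scale=1.0, pos_scale=1.0):
    """Multiply a ternary input vector by a matrix of ternary weight codes.

    inputs:       list of values in {-1, 0, +1}
    weight_codes: rows x cols list of codes in {-1, 0, +1}
    A code of -1 decodes to -neg_scale and +1 decodes to +pos_scale, which
    covers the unweighted, symmetric, and asymmetric ternary systems.
    """
    decode = {-1: -neg_scale, 0: 0.0, 1: pos_scale}
    cols = len(weight_codes[0])
    return [
        sum(x * decode[row[j]] for x, row in zip(inputs, weight_codes))
        for j in range(cols)
    ]


# Toy example with assumed scales a = 0.5 and b = 2.0.
inputs = [1, -1, 0, 1]
weight_codes = [
    [1, -1],
    [0, 1],
    [-1, -1],
    [1, 0],
]
outputs = ternary_vmm(inputs, weight_codes, neg_scale=0.5, pos_scale=2.0)
print(outputs)

# Efficiency implied by the peak figures quoted in the abstract.
peak_tops, power_w, area_mm2 = 114.0, 0.9, 1.96
tops_per_watt = peak_tops / power_w
tops_per_mm2 = peak_tops / area_mm2
print(round(tops_per_watt, 1), round(tops_per_mm2, 1))
```

Because every product involves only the values -a, 0, or b, each multiplication reduces to a sign selection, which is what makes a per-cell scalar multiplier practical.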
38	HIGH SPEED IMAGING VIA ADVANCED MODELING Soumendu Majee (10942896) 04 August 2021 There is an increasing need to accurately image objects at a high temporal resolution for different applications in order to analyze the underlying physical, chemical, or biological processes. In this thesis, we use advanced models exploiting the image structure and the measurement process in order to achieve an improved temporal resolution. The thesis is divided into three chapters, each corresponding to a different imaging application.
In the first chapter, we propose a novel method to localize neurons in fluorescence microscopy images. Accurate localization of neurons enables us to scan only the neuron locations instead of the full brain volume and thus improve the temporal resolution of neuron activity monitoring. We formulate the neuron localization problem as an inverse problem where we reconstruct an image that encodes the location of the neuron centers. The sparsity of the neuron centers serves as a prior model, while the forward model comprises shape models estimated from training data.
In the second chapter, we introduce multi-slice fusion, a novel framework to incorporate advanced prior models for inverse problems spanning many dimensions, such as 4D computed tomography (CT) reconstruction. State-of-the-art 4D reconstruction methods use model-based iterative reconstruction (MBIR), but these depend critically on the quality of the prior modeling. Incorporating deep convolutional neural networks (CNNs) in the 4D reconstruction problem is difficult due to computational difficulties and the lack of high-dimensional training data. Multi-Slice Fusion integrates the tomographic forward model with multiple low-dimensional CNN denoisers along different planes to produce a 4D regularized reconstruction. The improved regularization in multi-slice fusion allows each time-frame to be reconstructed from fewer measurements, resulting in an improved temporal resolution in the reconstruction. Experimental results on sparse-view and limited-angle CT data demonstrate that Multi-Slice Fusion can substantially improve the quality of reconstructions relative to traditional methods, while also being practical to implement and train.
In the final chapter, we introduce CodEx, a synergistic combination of coded acquisition and a non-convex Bayesian reconstruction for improving acquisition speed in computed tomography (CT). In an ideal "step-and-shoot" tomographic acquisition, the object is rotated to each desired angle, and the view is taken. However, step-and-shoot acquisition is slow and can waste photons, so in practice the object typically rotates continuously in time, leading to views that are blurry. This blur can then result in reconstructions with severe motion artifacts. CodEx works by encoding the acquisition with a known binary code that the reconstruction algorithm then inverts. The CodEx reconstruction method uses the alternating direction method of multipliers (ADMM) to split the inverse problem into iterative deblurring and reconstruction sub-problems, making reconstruction practical. CodEx allows for fast data acquisition, leading to good temporal resolution in the reconstruction. Computer Engineering Computational Imaging Computed Tomography Inverse Problems Deep Neural Network (DNN) Dynamic Reconstruction Plug-and-play Priors High Speed Imaging Coded exposure
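One way to picture the coded acquisition described in the final chapter is as a gated sum of the views seen at successive angular sub-steps during continuous rotation. The sketch below is a simplified forward model under that assumption; the abstract does not give the actual CodEx measurement model, and the function name, the code pattern, and the toy detector readings are hypothetical. It does not implement the ADMM-based reconstruction.

```python
def coded_view(sub_views, code):
    """Sum the sub-views whose code bit is 1, element-wise over detector pixels.

    sub_views: list of detector readings, one list of pixel values per
               angular sub-step within a single exposure
    code:      list of 0/1 bits, one per sub-step
    """
    if len(sub_views) != len(code):
        raise ValueError("need exactly one code bit per sub-view")
    n_pixels = len(sub_views[0])
    return [
        sum(view[p] for view, bit in zip(sub_views, code) if bit)
        for p in range(n_pixels)
    ]


# Toy exposure: four angular sub-steps, two detector pixels (invented values).
sub_views = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [7.0, 8.0],
]
coded = coded_view(sub_views, [1, 0, 1, 1])       # exposure gated by a binary code
continuous = coded_view(sub_views, [1, 1, 1, 1])  # plain motion blur: all sub-steps summed
print(coded, continuous)
```

Under this reading, an all-ones code reproduces ordinary motion blur, while a known non-trivial code gives the reconstruction algorithm a blur pattern that is easier to invert.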
39	Zobrazení a analýza aktivit neuronové sítě ve skrytých vrstvách / Activity of Neural Network in Hidden Layers - Visualisation and Analysis Fábry, Marko January 2016 The goal of this work was to create a system capable of visualising the activation function values produced by neurons in the hidden layers of neural networks used for speech recognition. The work also describes experiments comparing visualisation methods, visualisations of neural networks with different architectures, and neural networks trained with different types of input data. The visualisation system implemented in this work is based on previous work by Mr. Khe Chai Sim and is extended with new methods of data normalization. The Kaldi toolkit was used to prepare the neural network training data. The CNTK framework was used for neural network training. The core of this work, the visualisation system, was implemented in the Python scripting language.
40	Identifikace osob pomocí hlubokých neuronových sítí / Deep Neural Networks for Person Identification Duban, Michal January 2016 This master's thesis deals with the design and implementation of convolutional neural networks used in person re-identification. The implemented convolutional neural networks were tested on two datasets, CUHK01 and CUHK03. Results comparable with state-of-the-art methods were achieved on these datasets. The designed networks were implemented in the Caffe framework.
