71

Machine Learning on Acoustic Signals Applied to High-Speed Bridge Deck Defect Detection

Chou, Yao 06 December 2019 (has links)
Machine learning techniques are being applied to many data-intensive problems because, given appropriate training, they can classify complex data accurately. The performance of machine learning can often exceed that of traditional techniques because machine learning can take advantage of higher dimensionality than traditional algorithms. In this work, acoustic data sets collected with a rapid scanning technique on concrete bridge decks provided an opportunity both to apply machine learning algorithms to improve detection performance and to investigate how data augmentation can aid the training of neural networks. Early detection and repair can enhance safety and performance as well as reduce long-term maintenance costs of concrete bridges. Inspecting for non-visible internal cracking (delaminations) in concrete bridges requires a rapid inspection method. A six-channel acoustic impact-echo sounding apparatus is used to generate large acoustic data sets on concrete bridge decks at high speeds. A machine learning data processing architecture is described that accurately detects and maps delaminations based on the acoustic responses. The machine learning approach achieves accurate results at speeds between 25 and 45 km/h across a bridge deck and successfully demonstrates the use of neural networks to analyze this type of acoustic data. To obtain excellent performance, model training generally requires large data sets; however, in many cases of interest, such as bridge deck defect detection, acquiring enough training data can be difficult. Data augmentation can be used to increase the effective size of the training data set. Acoustic signal data augmentation is demonstrated in conjunction with a machine learning model for acoustic defect detection on bridge decks. Four different augmentation methods are applied to the data using two different augmentation strategies. This work demonstrates that a "goldilocks" data augmentation approach can increase machine learning performance when only a limited data set is available. The major technical contributions of this work are the application of machine learning to acoustic data sets relevant to bridge deck inspection, the solution of an important problem in the field of nondestructive evaluation, and a more generalized approach to data augmentation of limited acoustic data sets that expands the classes of acoustic problems machine learning can successfully address.
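The abstract does not name the four augmentation methods. As an illustration only, the sketch below applies four augmentations commonly used on 1-D acoustic signals (noise injection, time shifting, amplitude scaling, and time stretching) to grow a training set; all function names and parameters are assumptions, not the thesis's implementation.

    # Illustrative acoustic augmentation for 1-D impact-echo waveforms (NumPy).
    # The four methods below are common choices, assumed for illustration only.
    import numpy as np

    def add_noise(x, snr_db=20.0):
        """Inject white Gaussian noise at a given signal-to-noise ratio."""
        noise_power = np.mean(x ** 2) / (10 ** (snr_db / 10))
        return x + np.random.normal(0.0, np.sqrt(noise_power), x.shape)

    def time_shift(x, max_frac=0.1):
        """Circularly shift the waveform by up to max_frac of its length."""
        n = max(1, int(len(x) * max_frac))
        return np.roll(x, np.random.randint(-n, n + 1))

    def amplitude_scale(x, low=0.8, high=1.2):
        """Randomly rescale amplitude, mimicking impact-strength variation."""
        return x * np.random.uniform(low, high)

    def time_stretch(x, low=0.9, high=1.1):
        """Resample the waveform at a randomly stretched rate (same length)."""
        idx = np.clip(np.arange(len(x)) * np.random.uniform(low, high),
                      0, len(x) - 1)
        return np.interp(idx, np.arange(len(x)), x)

    def augment(batch):
        """Apply each augmentation to every waveform, growing the set 5x."""
        out = list(batch)
        for fn in (add_noise, time_shift, amplitude_scale, time_stretch):
            out.extend(fn(x) for x in batch)
        return np.array(out)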
72

COMPRESSED MOBILENET V3: AN EFFICIENT CNN FOR RESOURCE CONSTRAINED PLATFORMS

Kavyashree Pras Shalini Pradeep Prasad (10662020) 10 May 2021 (has links)
Computer vision is a mathematical tool formulated to extend human vision to machines. It can perform tasks such as object classification, object tracking, motion estimation, and image segmentation, which find use in many applications, including robotics, self-driving cars, augmented reality, and mobile applications. As opposed to the traditional technique of incorporating handcrafted features to understand images, convolutional neural networks (CNNs) now perform the same function, and computer vision applications use them widely due to their stellar performance in interpreting images. Over the years there have been numerous advancements in machine learning, particularly in CNNs, but as their accuracy improved, model size and complexity increased as well, making deployment in restricted environments a challenge. Many researchers have proposed techniques to reduce the size of a CNN while retaining its accuracy, including network quantization, pruning, low-rank and sparse decomposition, and knowledge distillation; other methods develop efficient models from scratch. This thesis achieves a similar goal by applying design space exploration techniques to the latest variant of MobileNets, MobileNet V3. Using Depthwise-Pointwise-Depthwise (DPD) blocks, an increased number of expansion filters in some layers, and the mish activation function, MobileNet V3 is reduced to 84.96% of its original size and made 0.2% more accurate. Furthermore, it is deployed on an NXP i.MX RT1060 for image classification on the CIFAR-10 dataset.
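A minimal sketch of what a Depthwise-Pointwise-Depthwise (DPD) block with the mish activation might look like, assuming a PyTorch implementation; the channel counts, kernel sizes, and batch-norm placement are illustrative assumptions, not the thesis's exact configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Mish(nn.Module):
        def forward(self, x):
            # mish(x) = x * tanh(softplus(x))
            return x * torch.tanh(F.softplus(x))

    class DPDBlock(nn.Module):
        """Assumed structure: depthwise 3x3 -> pointwise 1x1 -> depthwise 3x3."""
        def __init__(self, in_ch, expand_ch):
            super().__init__()
            self.block = nn.Sequential(
                nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False),
                nn.BatchNorm2d(in_ch), Mish(),
                nn.Conv2d(in_ch, expand_ch, 1, bias=False),  # expansion filters
                nn.BatchNorm2d(expand_ch), Mish(),
                nn.Conv2d(expand_ch, expand_ch, 3, padding=1,
                          groups=expand_ch, bias=False),
                nn.BatchNorm2d(expand_ch), Mish(),
            )

        def forward(self, x):
            return self.block(x)

    # e.g. a CIFAR-10-sized input: 16 channels in, expanded to 64
    y = DPDBlock(16, 64)(torch.randn(1, 16, 32, 32))  # -> (1, 64, 32, 32)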
73

Sentiment Analysis of YouTube Public Videos based on their Comments

Kvedaraite, Indre January 2021 (has links)
With the rise of social media and publicly available data, opinion mining is more accessible than ever. It is valuable for content creators, companies, and advertisers to gain insights into what users think and feel. This work examines comments on YouTube videos and builds a deep learning classifier to automatically determine their sentiment. Four Long Short-Term Memory (LSTM)-based models are trained and evaluated. Experiments are performed to determine which deep learning model performs best in terms of accuracy, recall, precision, F1 score, and ROC curve on a labelled YouTube comment dataset. The results indicate that a BiLSTM-based model has the best overall performance, with an accuracy of 89%. Furthermore, the four LSTM-based models are evaluated on an IMDB movie review dataset, achieving an average accuracy of 87%, which shows that the models can predict the sentiment of different textual data. Finally, a statistical analysis of the YouTube videos reveals that videos with positive sentiment have a statistically higher number of upvotes and views, while the number of downvotes is not significantly higher for videos with negative sentiment.
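A minimal sketch of a BiLSTM classifier of the kind evaluated here, assuming PyTorch; the vocabulary size, layer dimensions, and binary positive/negative output are illustrative assumptions.

    import torch
    import torch.nn as nn

    class BiLSTMSentiment(nn.Module):
        def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                                bidirectional=True)
            self.fc = nn.Linear(2 * hidden_dim, 1)  # forward + backward states

        def forward(self, token_ids):
            emb = self.embed(token_ids)           # (batch, seq, embed_dim)
            out, _ = self.lstm(emb)               # (batch, seq, 2*hidden_dim)
            return torch.sigmoid(self.fc(out[:, -1, :]))  # P(positive)

    # e.g. a batch of 8 comments, each padded/truncated to 50 token ids
    probs = BiLSTMSentiment()(torch.randint(1, 20000, (8, 50)))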
74

Efficient image based localization using machine learning techniques

Elmougi, Ahmed 23 April 2021 (has links)
Localization is critical for the self-awareness of any autonomous system and is an important part of the autonomous system stack, which consists of many phases including sensing, perceiving, planning, and control. In the sensing phase, data from on-board sensors are collected, preprocessed, and passed to the next phase. The perceiving phase is responsible for self-awareness (localization) and situational awareness, which includes multi-object detection and scene understanding. Once the autonomous system is aware of where it is and what is around it, it can use this knowledge to plan the path it will take and issue control commands to pursue that path. In this work, we focus on the localization part of the autonomous stack using camera images. We deal with the localization problem from different perspectives, including single images and videos. Starting with single-image pose estimation, our approach is to propose systems that not only have good localization accuracy but also low space and time complexity. First, we propose SurfCNN, a low-cost indoor localization system that uses SURF descriptors instead of the original images to reduce the complexity of training convolutional neural networks (CNNs) for indoor localization. Given a single input image, the strongest SURF feature descriptors are used as input to 5 convolutional layers to find its absolute position and orientation in an arbitrary reference frame. The proposed system achieves performance comparable to the state of the art using only 300 features, without the need for the full image or complex neural network architectures. Next, we propose SURF-LSTM, an extension of the idea of using SURF descriptors instead of the original images. However, instead of the CNN used in SurfCNN, we use a long short-term memory (LSTM) network, a type of recurrent neural network (RNN), to extract the sequential relations between SURF descriptors. Using SURF-LSTM, we need only 50 features to reach comparable or better results than SurfCNN, which needs 300 features, and other works that use full images with large neural networks. In the next research phase, instead of using SURF descriptors as image features to reduce training complexity, we study the effect of using features extracted from CNN models pretrained on other image tasks, such as image classification, without further training or fine-tuning. To learn the pose from pretrained features, graph neural networks (GNNs) are adopted to solve the single-image localization problem (Pose-GNN) by using these feature representations either as node features in a graph (image as a node) or converted into a graph (image as a graph). The proposed models outperform state-of-the-art methods on indoor localization datasets and have comparable performance for outdoor scenes. In the final stage of the single-image pose estimation research, we study whether we can achieve good localization results without training a complex neural network. We propose Linear-PoseNet, which achieves results similar to other neural-network-based methods by training a single linear regression layer on image features from a pretrained ResNet50 in less than one second on a CPU. Moreover, for outdoor scenes, we propose Dense-PoseNet, which has only 3 fully connected layers trained in a few minutes and reaches performance comparable to other, more complex methods.
The second localization perspective is to find the relative poses between images in a video instead of absolute poses. We extend the idea used in the SurfCNN and SURF-LSTM systems and use SURF descriptors as the feature representation of the images in the video. Two systems are proposed to find the relative poses between images in the video, using a 3D-CNN and a 2D-CNN combined with an RNN (2DCNN-RNN). We show that the 3D-CNN outperforms the CNN-RNN combination for relative pose estimation. / Graduate
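The Linear-PoseNet result lends itself to a compact sketch: freeze a pretrained ResNet50, take its pooled features, and train only a single linear layer to regress the pose. The 7-D pose target (3-D position plus an orientation quaternion) and the training details below are illustrative assumptions.

    import torch
    import torch.nn as nn
    from torchvision import models

    backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    backbone.fc = nn.Identity()   # expose the 2048-D pooled features
    backbone.eval()               # frozen: no fine-tuning

    with torch.no_grad():
        feats = backbone(torch.randn(16, 3, 224, 224))  # stand-in images

    # The only trained component: one linear layer mapping features -> pose.
    pose_head = nn.Linear(2048, 7)           # xyz + quaternion (assumed)
    opt = torch.optim.Adam(pose_head.parameters(), lr=1e-3)
    target = torch.randn(16, 7)              # placeholder ground-truth poses
    for _ in range(100):
        opt.zero_grad()
        loss = nn.functional.mse_loss(pose_head(feats), target)
        loss.backward()
        opt.step()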
75

Thor: A Deep Learning Approach for Face Mask Detection to Prevent the COVID-19 Pandemic

Snyder, Shay E., Husari, Ghaith 10 March 2021 (has links)
With the rapid worldwide spread of the coronavirus disease COVID-19, wearing face masks in public has become a necessity to mitigate transmission during this or future pandemics. However, in the absence of on-ground automated prevention measures, depending on humans to enforce face-mask-wearing policies in universities and other organizational buildings is a very costly and time-consuming measure. Without addressing this challenge, mitigating highly transmissible airborne diseases will be impractical, and the time to react will continue to increase. Considering the high personnel traffic in buildings and the effectiveness of the countermeasure, that is, detecting unmasked personnel and offering them surgical masks, our aim in this paper is to develop automated detection of unmasked personnel in public spaces so that a surgical mask can be provided promptly to remedy the situation. Our approach consists of three key components. The first component utilizes a deep learning architecture that integrates deep residual learning (ResNet-50) with a Feature Pyramid Network (FPN) to detect human subjects in videos (or a video feed). The second component utilizes Multi-Task Convolutional Neural Networks (MT-CNN) to detect and extract human faces from these videos. For the third component, we construct and train a convolutional neural network classifier to distinguish masked from unmasked human subjects. Our techniques were implemented in a mobile robot, Thor, and evaluated using a dataset of videos collected by the robot from public spaces of an educational institute in the U.S. The evaluation results show that Thor is very accurate, achieving an F1 score of 87.7% with a recall of 99.2% in a variety of situations, a reasonable accuracy given the challenging dataset and the problem domain.
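A minimal sketch of the three-stage pipeline, with torchvision's Faster R-CNN (ResNet-50 with an FPN backbone) standing in for the paper's human detector, facenet_pytorch's MTCNN standing in for MT-CNN, and an untrained placeholder for the paper's trained mask classifier; all thresholds and the class convention are assumptions.

    import torch
    import torch.nn as nn
    import torchvision
    from torchvision.transforms import functional as TF
    from facenet_pytorch import MTCNN
    from PIL import Image

    detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights="DEFAULT").eval()        # stage 1: find human subjects
    face_finder = MTCNN(keep_all=True)   # stage 2: find faces
    mask_net = nn.Sequential(            # stage 3: untrained placeholder CNN
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(32, 2),
    ).eval()

    def find_unmasked(frame: Image.Image):
        """Return face boxes predicted as unmasked in one video frame."""
        with torch.no_grad():
            det = detector([TF.to_tensor(frame)])[0]
            # COCO label 1 == person; skip frames with no confident person
            if not ((det["labels"] == 1) & (det["scores"] > 0.8)).any():
                return []
            boxes, _ = face_finder.detect(frame)
            unmasked = []
            for box in (boxes if boxes is not None else []):
                face = frame.crop(tuple(int(v) for v in box))
                x = TF.to_tensor(face.resize((64, 64))).unsqueeze(0)
                if mask_net(x).argmax(1).item() == 1:  # class 1 := unmasked
                    unmasked.append(box)
            return unmasked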
76

Detekcija bolesti biljaka tehnikama dubokog učenja / Plant disease detection using deep learning techniques

Arsenović Marko 07 October 2020 (has links)
The research presented in this thesis was aimed at developing a novel method, based on deep convolutional neural networks, for detecting plant diseases from images of leaves. The experimental part of the thesis reviews the approaches to automated plant disease detection available in the literature to date, as well as the limitations of the resulting models when they are used under natural conditions. The thesis introduces a new dataset of leaf images, currently the largest by number of images among publicly available datasets, experimentally confirms new GAN-based augmentation approaches on leaf images, and proposes a new specialized two-phase deep neural network method as a potential answer to the shortcomings of existing solutions, providing the possibility of practical use of the newly developed model.
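A minimal sketch of the GAN-based augmentation idea: a DCGAN-style generator that, once trained adversarially on real leaf images, synthesizes additional training images. The layer sizes and 32x32 output are illustrative; the thesis's GAN architecture and training procedure are not reproduced here.

    import torch
    import torch.nn as nn

    class LeafGenerator(nn.Module):
        """DCGAN-style generator: latent vector -> 32x32 RGB leaf image."""
        def __init__(self, z_dim=100):
            super().__init__()
            self.net = nn.Sequential(
                nn.ConvTranspose2d(z_dim, 256, 4, 1, 0),  # 1x1 -> 4x4
                nn.BatchNorm2d(256), nn.ReLU(),
                nn.ConvTranspose2d(256, 128, 4, 2, 1),    # 4x4 -> 8x8
                nn.BatchNorm2d(128), nn.ReLU(),
                nn.ConvTranspose2d(128, 64, 4, 2, 1),     # 8x8 -> 16x16
                nn.BatchNorm2d(64), nn.ReLU(),
                nn.ConvTranspose2d(64, 3, 4, 2, 1),       # 16x16 -> 32x32
                nn.Tanh(),
            )

        def forward(self, z):
            return self.net(z)

    # After adversarial training, synthetic leaves would extend the dataset:
    fake_leaves = LeafGenerator()(torch.randn(64, 100, 1, 1))  # (64, 3, 32, 32)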
77

Advancing Video Compression With Error Resilience And Content Analysis

Di Chen (9234905) 13 August 2020 (has links)
In this thesis, two aspects of video coding improvement are discussed: error resilience and coding efficiency.

With the increasing amount of video being created and consumed, better video compression tools are needed to provide reliable and fast transmission. Many popular video coding standards, such as VPx and H.26x, achieve compression by exploiting spatial and temporal dependencies in the source video signal. This makes the encoded bitstream vulnerable to errors during transmission. In this thesis, we investigate error resilient video coding for VP9 bitstreams using error resilience packets. An error resilience packet consists of encoded keyframe contents and the prediction signals for each non-keyframe. Experimental results show that the proposed method is effective under typical packet loss conditions.

In the second part of the thesis, we first present an automatic stillness feature detection method for groups of pictures. The encoder adaptively chooses the coding structure for each group of pictures based on its stillness feature to optimize coding efficiency.

Secondly, a content-based video coding method is proposed. Modern video codecs, including the newly developed AOM/AV1, use hybrid coding techniques to remove spatial and temporal redundancy. However, efficient exploitation of statistical dependencies measured by a mean squared error (MSE) does not always produce the best psychovisual result. One interesting approach is to encode only visually relevant information and use a different coding method for "perceptually insignificant" regions in the frame. In this thesis, we introduce a texture analyzer, applied before encoding the input sequences, that identifies detail-irrelevant texture regions in the frame using convolutional neural networks. The texture region is then reconstructed based on one set of motion parameters. We show that for many standard test sets the proposed method achieves significant data rate reductions.
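The texture-analyzer idea can be sketched as block-wise classification: split each frame into fixed-size blocks and let a small CNN mark the detail-irrelevant texture blocks that the encoder may then reconstruct from one set of motion parameters. The network below is an untrained placeholder and the 64x64 block size is an assumption.

    import torch
    import torch.nn as nn

    BLOCK = 64
    texture_net = nn.Sequential(   # placeholder for the trained analyzer
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(32, 2),   # class 1 := texture region
    ).eval()

    def texture_mask(frame):
        """Return a (rows, cols) boolean grid of texture blocks for a frame."""
        c, h, w = frame.shape
        rows, cols = h // BLOCK, w // BLOCK
        blocks = (frame[:, :rows * BLOCK, :cols * BLOCK]
                  .unfold(1, BLOCK, BLOCK).unfold(2, BLOCK, BLOCK)
                  .permute(1, 2, 0, 3, 4).reshape(-1, c, BLOCK, BLOCK))
        with torch.no_grad():
            pred = texture_net(blocks).argmax(1)
        return (pred == 1).reshape(rows, cols)

    mask = texture_mask(torch.rand(3, 720, 1280))  # 11 x 20 block grid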
78

Compressed MobileNet V3: An efficient CNN for resource constrained platforms

Prasad, S. P. Kavyashree 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Computer vision is a mathematical tool formulated to extend human vision to machines. It can perform tasks such as object classification, object tracking, motion estimation, and image segmentation, which find use in many applications, including robotics, self-driving cars, augmented reality, and mobile applications. As opposed to the traditional technique of incorporating handcrafted features to understand images, convolutional neural networks (CNNs) now perform the same function, and computer vision applications use them widely due to their stellar performance in interpreting images. Over the years there have been numerous advancements in machine learning, particularly in CNNs, but as their accuracy improved, model size and complexity increased as well, making deployment in restricted environments a challenge. Many researchers have proposed techniques to reduce the size of a CNN while retaining its accuracy, including network quantization, pruning, low-rank and sparse decomposition, and knowledge distillation; other methods develop efficient models from scratch. This thesis achieves a similar goal by applying design space exploration techniques to the latest variant of MobileNets, MobileNet V3. Using DPD blocks, an increased number of expansion filters in some layers, and the mish activation function, MobileNet V3 is reduced to 84.96% of its original size and made 0.2% more accurate. Furthermore, it is deployed on an NXP i.MX RT1060 for image classification on the CIFAR-10 dataset.
79

Design Space Exploration of Convolutional Neural Networks for Image Classification

Shah, Prasham 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Computer vision is a domain that pursues the goal of making technology as efficient as human vision. After decades of research toward that goal, researchers have developed algorithms that work efficiently on resource-constrained hardware, such as mobile or embedded devices, for computer vision applications. Thanks to these constant efforts, such devices have become capable of tasks like image classification, object detection, object recognition, semantic segmentation, and many other applications, and autonomous systems such as self-driving cars, drones, and UAVs are being successfully developed because of these advances in AI. Deep learning, a specific domain of machine learning within AI, focuses on developing algorithms for such applications. It deals with tasks like extracting features from raw image data, replacing pipelines of specialized models with single end-to-end models, and making models usable for multiple tasks with superior performance. A major focus is on techniques to detect and extract features that provide better context for inference about an image or video stream; the multiple deep layers of a CNN learn and automatically extract a deep hierarchy of rich features from images. CNNs are the backbone of computer vision. They are the focus of attention for deep learning models because they were specifically designed for image data: complicated, but very effective in extracting features from an image or a video stream. After AlexNet won the ILSVRC in 2012, there was a drastic increase in research related to CNNs, and many state-of-the-art architectures were introduced, including VGG Net, GoogleNet, ResNet, Inception-v4, Inception-ResNet-v2, ShuffleNet, Xception, MobileNet, MobileNetV2, SqueezeNet, and SqueezeNext. The trend in this research has been to add layers to CNNs to make them more effective, but with that the model size increased as well. This problem was addressed by new algorithms that decreased model size, and as a result we now have CNN models that run on mobile devices. These mobile models are compact and have low latency, which in turn reduces the computational cost of the embedded system. This thesis follows a similar idea: it proposes two new CNN architectures, A-MnasNet and R-MnasNet, derived from MnasNet by design space exploration. These architectures outperform MnasNet in terms of model size and accuracy. They were trained and tested on the CIFAR-10 dataset and implemented on NXP Bluebox 2.0, an autonomous driving platform, for image classification.
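One step of such a design space exploration can be sketched as enumerating candidate architectures and ranking them by model size; torchvision's stock MnasNet width variants stand in here for the A-/R-MnasNet candidates, and the accuracy side of the trade-off (training and testing each candidate on CIFAR-10) is deliberately left out.

    from torchvision import models

    # Stock MnasNet width variants as stand-in candidates (10 CIFAR classes).
    candidates = {
        "mnasnet0_5": models.mnasnet0_5(num_classes=10),
        "mnasnet0_75": models.mnasnet0_75(num_classes=10),
        "mnasnet1_0": models.mnasnet1_0(num_classes=10),
    }

    def model_size_mb(net):
        """Parameter footprint in MB, assuming 32-bit floats."""
        return sum(p.numel() for p in net.parameters()) * 4 / 2**20

    for name, net in candidates.items():
        # In a real exploration each candidate would also be trained and
        # evaluated on CIFAR-10 to pair accuracy with this size figure.
        print(f"{name}: {model_size_mb(net):.2f} MB")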
