281

Collaborative Path Planning and Control for Ground Agents Via Photography Collected by Unmanned Aerial Vehicles

Wood, Sami Warren 24 June 2022 (has links)
Natural disasters damage infrastructure and create significant obstacles to humanitarian aid efforts. Roads may become unusable, hindering or halting efforts to provide food, water, shelter, and life-saving emergency care. Finding a safe route during a disaster is especially difficult because as the disaster unfolds, the usability of roads and other infrastructure can change quickly, rendering most navigation services useless. With the proliferation of cheap cameras and unmanned aerial vehicles (UAVs), the rapid collection of aerial data after a natural disaster has become increasingly common. This data can be used to quickly appraise the damage to critical infrastructure, which can help solve navigational and logistical problems that may arise after the disaster. This work focuses on a framework in which a UAV is paired with an unmanned ground vehicle (UGV). The UAV follows the UGV with a downward-facing camera and helps the ground vehicle navigate the flooded environment. This work makes several contributions: a simulation environment is created to allow for automated data collection in hypothetical disaster scenarios. The simulation environment uses real-world satellite and elevation data to emulate natural disasters such as floods. The environment partially simulates the dynamics of the UAV and UGV, allowing agents to explore during hypothetical disasters. Several semantic image segmentation models are tested for efficacy in identifying obstacles and creating cost maps for navigation within the environment, as seen by the UAV. A deep homography model incorporates temporal relations across video frames to stitch cost maps together. A weighted version of a navigation algorithm is presented to plan a path through the environment. The synthesis of these modules leads to a novel framework wherein a UAV may guide a UGV safely through a disaster area. / Master of Science / Damage to infrastructure after a natural disaster can make navigation a major challenge.
Imagine a hurricane has hit someone's house; they are hurt and need to go to the hospital. Using a traditional GPS navigation system or even their memory may not work as many roads could be impassible. However, if the GPS could be quickly updated as to which roads were not flooded, it could still be used to navigate and avoid hazards. While the system presented is designed to work with a self-driving vehicle, it could easily be extended to give directions to a human. The goal of this work is to provide a system that could be used as a replacement for a GPS based on aerial photography. The advantage of this system is that flooded or damaged infrastructure can be identified and avoided in real-time. The system could even identify other possible routes by using photography, such as driving across a field to reach higher ground. Like a GPS, the system works automatically, tracking a user's position and suggesting turns, aiding navigation. A contribution of this work is a simulation of the environment designed in a video game engine. The game engine creates a video game world that can be flooded and used to test the new navigation system. The video game environment is used to train an artificial intelligence computer model to identify hazards and create routes that would avoid them. The system could be used in a real-world disaster following training in a video game world.
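The weighted navigation step described above can be illustrated with a Dijkstra-style search over a 2D cost map, where each cell's traversal cost might come from the segmentation output (a minimal sketch, not the thesis implementation; all names are illustrative):

```python
import heapq

def plan_path(cost_map, start, goal):
    """Dijkstra-style weighted search over a 2D cost grid.

    cost_map[r][c] is the cost of entering a cell (e.g. low for clear
    road, high for flooded terrain, as estimated from aerial imagery).
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(cost_map), len(cost_map[0])
    frontier = [(0, start)]
    came_from = {start: None}
    cost_so_far = {start: 0}
    while frontier:
        _, current = heapq.heappop(frontier)
        if current == goal:
            # Reconstruct the path by walking back through predecessors.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost_so_far[current] + cost_map[nr][nc]
                if (nr, nc) not in cost_so_far or new_cost < cost_so_far[(nr, nc)]:
                    cost_so_far[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    return None
```

With a high-cost cell marking a flooded area, the returned path routes around it rather than through it.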
282

Exploring Accumulated Gradient-Based Quantization and Compression for Deep Neural Networks

Gaopande, Meghana Laxmidhar 29 May 2020 (has links)
The growing complexity of neural networks makes their deployment on resource-constrained embedded or mobile devices challenging. With millions of weights and biases, modern deep neural networks can be computationally intensive, with large memory, power and computational requirements. In this thesis, we devise and explore three quantization methods (post-training, in-training and combined quantization) that quantize 32-bit floating-point weights and biases to lower bit width fixed-point parameters while also achieving significant pruning, leading to model compression. We use the total accumulated absolute gradient over the training process as the indicator of importance of a parameter to the network. The most important parameters are quantized by the smallest amount. The post-training quantization method sorts and clusters the accumulated gradients of the full parameter set and subsequently assigns a bit width to each cluster. The in-training quantization method sorts and divides the accumulated gradients into two groups after each training epoch. The larger group consisting of the lowest accumulated gradients is quantized. The combined quantization method performs in-training quantization followed by post-training quantization. We assume storage of the quantized parameters using compressed sparse row format for sparse matrix storage. On LeNet-300-100 (MNIST dataset), LeNet-5 (MNIST dataset), AlexNet (CIFAR-10 dataset) and VGG-16 (CIFAR-10 dataset), post-training quantization achieves 7.62x, 10.87x, 6.39x and 12.43x compression, in-training quantization achieves 22.08x, 21.05x, 7.95x and 12.71x compression and combined quantization achieves 57.22x, 50.19x, 13.15x and 13.53x compression, respectively. Our methods quantize at the cost of accuracy, and we present our work in the light of the accuracy-compression trade-off. / Master of Science / Neural networks are being employed in many different real-world applications. 
By learning the complex relationship between the input data and ground-truth output data during the training process, neural networks can predict outputs on new input data obtained in real time. To do so, a typical deep neural network often needs millions of numerical parameters, stored in memory. In this research, we explore techniques for reducing the storage requirements for neural network parameters. We propose software methods that convert 32-bit neural network parameters to values that can be stored using fewer bits. Our methods also convert a majority of numerical parameters to zero. Using special storage methods that only require storage of non-zero parameters, we gain significant compression benefits. On typical benchmarks like LeNet-300-100 (MNIST dataset), LeNet-5 (MNIST dataset), AlexNet (CIFAR-10 dataset) and VGG-16 (CIFAR-10 dataset), our methods can achieve up to 57.22x, 50.19x, 13.15x and 13.53x compression respectively. Storage benefits are achieved at the cost of classification accuracy, and we present our work in the light of the accuracy-compression trade-off.
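The in-training step described above — sorting parameters by total accumulated absolute gradient and quantizing the least important group — can be sketched as follows (a simplified illustration on a flat parameter list; the actual methods operate on full networks and cluster bit widths):

```python
def in_training_quantize(params, acc_grads, quantize_frac=0.5, bits=8):
    """Sketch of one in-training quantization pass.

    Parameters with the lowest total accumulated absolute gradient are
    treated as least important: they are rounded to a low-bit fixed-point
    grid. Values near zero round to exactly zero, which enables sparse
    (e.g. compressed sparse row) storage and hence compression.
    """
    # Indices ordered from least to most important.
    order = sorted(range(len(params)), key=lambda i: acc_grads[i])
    n_quant = int(len(params) * quantize_frac)
    scale = (1 << (bits - 1)) - 1          # e.g. 127 for 8-bit signed
    max_abs = max(abs(p) for p in params) or 1.0
    out = list(params)
    for i in order[:n_quant]:
        out[i] = round(params[i] / max_abs * scale) / scale * max_abs
    return out
```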
283

Towards a Resource Efficient Framework for Distributed Deep Learning Applications

Han, Jingoo 24 August 2022 (has links)
Distributed deep learning has achieved tremendous success in solving scientific problems in research and discovery over the past years. Deep learning training is challenging because it requires training on massive, large-scale datasets, especially with graphics processing units (GPUs) in the latest high-performance computing (HPC) supercomputing systems. HPC architectures exhibit performance trends in training throughput that differ from those reported in existing studies. Multiple GPUs and high-speed interconnects are used for distributed deep learning on HPC systems. Extant distributed deep learning systems are designed for non-HPC systems without considering efficiency, leading to under-utilization of expensive HPC hardware. In addition, increasing resource heterogeneity has a negative effect on resource efficiency in distributed deep learning methods, including federated learning. It is therefore important to address the growing demand for both high performance and high resource efficiency in distributed deep learning systems, including the latest HPC systems and federated learning systems. In this dissertation, we explore and design novel methods and frameworks to improve the resource efficiency of distributed deep learning training. We address five important topics: performance analysis of deep learning on supercomputers, GPU-aware deep learning job scheduling, topology-aware virtual GPU training, heterogeneity-aware adaptive scheduling, and a token-based incentive algorithm. In the first part (Chapter 3), we analyze the performance trends of distributed deep learning on the latest HPC systems, such as the Summitdev supercomputer at Oak Ridge National Laboratory. We provide insights through a comprehensive performance study of how deep learning workloads affect the performance of HPC systems with large-scale parallel processing capabilities.
In the second part (Chapter 4), we design and develop MARBLE, a novel deep learning job scheduler that accounts for the non-linear scalability of GPUs within a single node and improves GPU utilization by sharing GPUs across multiple deep learning training workloads. The third part of this dissertation (Chapter 5) proposes TOPAZ, a topology-aware virtual GPU training system designed specifically for distributed deep learning on recent HPC systems. In the fourth part (Chapter 6), we explore a holistic federated learning scheduler that employs a heterogeneity-aware adaptive selection method to improve resource efficiency and accuracy, coupled with resource usage profiling and accuracy monitoring to achieve multiple goals. In the fifth part of this dissertation (Chapter 7), we focus on providing incentives to participants in proportion to their contribution to the performance of the final federated model, with tokens serving as the means of payment for participants' services and the training infrastructure. / Doctor of Philosophy / Distributed deep learning is widely used for solving critical scientific problems with massive datasets. However, to accelerate scientific discovery, resource efficiency is also important for deployment on real-world systems, such as high-performance computing (HPC) systems. Deploying existing deep learning applications on these distributed systems may lead to underutilization of HPC hardware resources. In addition, extreme resource heterogeneity has negative effects on distributed deep learning training.
However, much of the prior work has not focused on the specific challenges of optimizing resource utilization in distributed deep learning, including HPC systems and heterogeneous federated systems. This dissertation addresses the challenges of improving the resource efficiency of distributed deep learning applications through performance analysis of deep learning on supercomputers, GPU-aware deep learning job scheduling, topology-aware virtual GPU training, and heterogeneity-aware adaptive federated learning scheduling and incentive algorithms.
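A token-based incentive of the kind described in the fifth part could, in its simplest form, distribute a fixed token pool in proportion to each participant's measured contribution (an illustrative sketch only; the dissertation's actual incentive algorithm is not specified here and all names are assumptions):

```python
def allocate_tokens(contributions, token_pool):
    """Distribute a fixed pool of tokens proportionally to each
    participant's measured contribution (e.g. the accuracy gain of
    the global federated model attributable to their updates).

    Falls back to an equal split when no contribution was measured.
    """
    total = sum(contributions.values())
    if total == 0:
        share = token_pool / len(contributions)
        return {p: share for p in contributions}
    return {p: token_pool * c / total for p, c in contributions.items()}
```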
284

A Comparison of Image Classification with Different Activation Functions in Balanced and Unbalanced Datasets

Zhang, Moqi 04 June 2021 (has links)
When the novel coronavirus (COVID-19) outbreak began to ring alarm bells worldwide, rapid, efficient diagnosis was critical to the emergency response. The limited ability of medical systems and the increasing number of daily cases pushed researchers to investigate automated models. The use of deep neural networks to help doctors make the correct diagnosis has dramatically reduced the pressure on the healthcare system. Improving diagnosis networks depends not only on the network structure design but also on activation function performance. To identify an optimal activation function, this study investigates the correlation between activation function selection and image classification performance in balanced or imbalanced datasets. Our analysis evaluates various network architectures for both commonly used and novel datasets and presents a comprehensive analysis of ten widely used activation functions. The experimental results show that the swish and softplus functions enhance the classification ability of state-of-the-art networks. Finally, this thesis compares neural networks using the ten activation functions, analyzes their pros and cons, and puts forward detailed suggestions on choosing appropriate activation functions in future work. / Master of Science / When the novel coronavirus (COVID-19) outbreak began to ring alarm bells worldwide, rapid, efficient diagnosis was critical to the emergency response. Manual diagnosis of chest X-rays by radiologists is time-consuming and costly. Compared with traditional diagnostic technology, an artificial intelligence medical system can simultaneously analyze and diagnose hundreds of medical images and quickly return high-precision results. Machines are brilliant at learning new things and never sleep. Suppose machines can be used to replace human beings in some positions.
In that case, they can significantly relieve the pressure on the medical system and free medical practitioners to concentrate on the research of new technologies. A critical decision unit of an intelligent diagnosis system is the activation function. This work therefore provides an in-depth evaluation and comparison of traditional, widely used activation functions against emerging ones, which helps improve the accuracy of the most advanced diagnostic models on the COVID-19 image dataset. The results of this study also summarize the pros and cons of the various activation functions and provide suggestions for future work.
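The two activation functions the thesis highlights, swish and softplus, have simple closed forms (standard definitions, shown here for reference):

```python
import math

def swish(x, beta=1.0):
    # swish(x) = x * sigmoid(beta * x); smooth and non-monotonic near zero.
    return x / (1.0 + math.exp(-beta * x))

def softplus(x):
    # softplus(x) = ln(1 + e^x); a smooth approximation of ReLU.
    return math.log1p(math.exp(x))
```

Both are differentiable everywhere, unlike ReLU, which is one reason they can behave differently during training on imbalanced data.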
285

Capsule Networks: Framework and Application to Disentanglement for Generative Models

Moghimi, Zahra 30 June 2021 (has links)
Generative models are one of the most prominent components of unsupervised learning models that have a plethora of applications in various domains such as image-to-image translation, video prediction, and generating synthetic data where accessing real data is expensive, unethical, or compromising privacy. One of the main challenges in designing a generative model is creating a disentangled representation of generative factors which gives control over various characteristics of the generated data. Since the architecture of variational autoencoders is centered around latent variables and their objective function directly governs the generative factors, they are the perfect choice for creating a more disentangled representation. However, these architectures generate samples that are blurry and of lower quality compared to other state-of-the-art generative models such as generative adversarial networks. Thus, we attempt to increase the disentanglement of latent variables in variational autoencoders without compromising the generated image quality. In this thesis, a novel generative model based on capsule networks and a variational autoencoder is proposed. Motivated by the concept of capsule neural networks and their vectorized output, these structures are employed to create a disentangled representation of latent features in variational autoencoders. In particular, the proposed structure, called CapsuleVAE, utilizes a capsule encoder whose vector outputs can translate to latent variables in a meaningful way. It is shown that CapsuleVAE generates results that are sharper and more diverse based on FID score and a metric inspired by the inception score. Furthermore, two different methods for training CapsuleVAE are proposed, and the generated results are investigated. In the first method, an objective function with regularization is proposed, and the optimal regularization hyperparameter is derived. 
In the second method, called sequential optimization, a novel training technique for CapsuleVAE is proposed and the results are compared to the first method. Moreover, a novel metric for measuring disentanglement in latent variables is introduced. Based on this metric, it is shown that the proposed CapsuleVAE creates more disentangled representations. In summary, our proposed generative model enhances the disentanglement of latent variables, which helps the model generalize well to new tasks and gives more control over the generated data. Our model also increases the generated image quality, which addresses a common disadvantage of variational autoencoders. / Master of Science / Generative models are algorithms that, given a large enough initial dataset, create data points (such as images) similar to the initial dataset from random input numbers. These algorithms have various applications in different fields, such as generating synthetic healthcare data, wireless systems data generation in extreme or rare conditions, generating high-resolution, colorful images from grey-scale photos or sketches, and in general, generating synthetic data for applications where obtaining real data is expensive, inaccessible, unethical, or compromising privacy. Some generative models create a representation for the data and divide it into several "generative factors". Researchers have shown that a better data representation is one where the generative factors are "disentangled", meaning that each generative factor is responsible for only one particular feature in the generated data. Unfortunately, creating a model with disentangled generative factors sacrifices the image quality. In this work, we design a generative model that enhances the disentanglement of generative factors without compromising the quality of the generated images.
In order to design a generative model with more disentangled generative factors, we employ capsule networks in the architecture of the generative model. Capsule networks are algorithms that classify the inputted information into different categories. We show that by using capsule networks, our designed generative model achieves higher performance in the quality of the generated images and creates a more disentangled representation of generative factors.
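The vectorized capsule outputs mentioned above rely on a "squash" nonlinearity (introduced by Sabour et al. for capsule networks) that preserves a vector's direction while compressing its length into [0, 1), so the length can act as an activation probability; a minimal sketch:

```python
import math

def squash(vector, eps=1e-9):
    """Capsule-network squash nonlinearity (Sabour et al.):
    squash(s) = (||s||^2 / (1 + ||s||^2)) * (s / ||s||).
    Keeps the vector's orientation; maps its length into [0, 1).
    """
    sq_norm = sum(v * v for v in vector)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * v for v in vector]
```

A long input vector squashes to a length just under 1, while a short one shrinks toward 0, which is what lets capsule lengths encode feature presence.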
286

Attention-based LSTM network for rumor veracity estimation of tweets

Singh, J.P., Kumar, A., Rana, Nripendra P., Dwivedi, Y.K. 12 August 2020 (has links)
Twitter has become a fertile place for rumors, as information can spread to a large number of people immediately. Rumors can mislead public opinion, weaken social order, decrease the legitimacy of government, and pose a significant threat to social stability. Timely detection and debunking of rumors are therefore urgently needed. In this work, we propose an Attention-based Long Short-Term Memory (LSTM) network that uses tweet text along with thirteen different linguistic and user features to distinguish rumor from non-rumor tweets. The performance of the proposed Attention-based LSTM model is compared with several conventional machine learning and deep learning models. The proposed Attention-based LSTM model achieved an F1-score of 0.88 in classifying rumor and non-rumor tweets, which is better than the state-of-the-art results. The proposed system can reduce the impact of rumors on society, limit losses of life and money, and build firm user trust in social media platforms.
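Attention over LSTM hidden states typically scores each time step, normalizes the scores with a softmax, and pools the states into a weighted context vector; a minimal sketch (illustrative dot-product scoring against a learned context vector, not necessarily the paper's exact formulation):

```python
import math

def attention_pool(hidden_states, weights):
    """Attention pooling over a sequence of LSTM hidden states.

    Each time step's hidden state is scored with a dot product against
    a (learned) context vector `weights`, scores are normalized with a
    softmax, and the states are combined as a weighted sum.
    Returns (context_vector, attention_weights).
    """
    scores = [sum(h * w for h, w in zip(state, weights))
              for state in hidden_states]
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [sum(a * state[d] for a, state in zip(alphas, hidden_states))
               for d in range(dim)]
    return context, alphas
```

The attention weights `alphas` also make the model partially interpretable: they show which tokens of a tweet drove the rumor/non-rumor decision.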
287

Camera-based Recovery of Cardiovascular Signals from Unconstrained Face Videos Using an Attention Network

Deshpande, Yogesh Rajan 22 June 2023 (has links)
This work addresses the problem of recovering the morphology of blood volume pulse (BVP) information from a video of a person's face. Video-based remote plethysmography methods have shown promising results in estimating vital signs such as heart rate and breathing rate. However, recovering the instantaneous pulse rate signals is still a challenge for the community. This is due to the fact that most of the previous methods concentrate on capturing the temporal average of the cardiovascular signals. In contrast, we present an approach in which BVP signals are extracted with a focus on the recovery of the signal morphology as a generalized form for the computation of physiological metrics. We also place emphasis on allowing natural movements by the subject. Furthermore, our system is capable of extracting individual BVP instances with sufficient signal detail to facilitate candidate re-identification. These improvements have resulted in part from the incorporation of a robust skin-detection module into the overall imaging-based photoplethysmography (iPPG) framework. We present extensive experimental results using the challenging UBFC-Phys dataset and the well-known COHFACE dataset. The source code is available at https://github.com/yogeshd21/CVPM-2023-iPPG-Paper. / Master of Science / In this work we study and recover human health-related metrics and the physiological signals at the core of deriving such metrics. A well-known form of physiological signal is the ECG (electrocardiogram); in our research we work with BVP (blood volume pulse) signals. We propose a deep-learning-based model for non-invasive retrieval of human physiological signals from face videos. Most state-of-the-art models try to recover averaged cardiac-pulse-based metrics such as heart rate and breathing rate without focusing on the details of the recovered physiological signal.
Physiological signals like BVP carry details such as the systolic peak, diastolic peak, and dicrotic notch, and these signals also have applications in domains such as mental health studies and emotional stimuli studies. Hence, this work focuses on retrieving the morphology of such physiological signals and presents both quantitative and qualitative results. An efficient attention-based deep learning model is presented, and the scope of re-identification using the retrieved signals is also explored. Along with key components such as a skin-detection module, our proposed architecture shows better performance than state-of-the-art models on two very challenging datasets, UBFC-Phys and COHFACE. The source code is available at https://github.com/yogeshd21/CVPM-2023-iPPG-Paper.
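One common way to derive a heart-rate metric from a recovered BVP trace is to take the dominant spectral peak within a plausible cardiac band (a generic illustration of the metric-computation step, not the paper's method):

```python
import numpy as np

def estimate_heart_rate(bvp, fs, lo=0.7, hi=4.0):
    """Estimate heart rate (in BPM) from a BVP trace.

    Takes the dominant peak of the power spectrum within a plausible
    cardiac band (0.7-4 Hz, i.e. 42-240 BPM). `fs` is the sampling
    rate in Hz (e.g. the video frame rate for iPPG).
    """
    bvp = np.asarray(bvp, dtype=float)
    bvp = bvp - bvp.mean()                       # remove DC component
    freqs = np.fft.rfftfreq(len(bvp), d=1.0 / fs)
    power = np.abs(np.fft.rfft(bvp)) ** 2
    band = (freqs >= lo) & (freqs <= hi)
    return 60.0 * freqs[band][np.argmax(power[band])]
```

Note that this recovers only the average rate over the window; recovering the signal morphology (peaks and notch), as this thesis does, requires much more than a spectral peak.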
288

Grounding deep models of visual data

Bargal, Sarah Adel 21 February 2019 (has links)
Deep models are state-of-the-art for many computer vision tasks including object classification, action recognition, and captioning. As Artificial Intelligence systems that utilize deep models are becoming ubiquitous, it is also becoming crucial to explain why they make certain decisions: Grounding model decisions. In this thesis, we study: 1) Improving Model Classification. We show that by utilizing web action images along with videos in training for action recognition, significant performance boosts of convolutional models can be achieved. Without explicit grounding, labeled web action images tend to contain discriminative action poses, which highlight discriminative portions of a video’s temporal progression. 2) Spatial Grounding. We visualize spatial evidence of deep model predictions using a discriminative top-down attention mechanism, called Excitation Backprop. We show how such visualizations are equally informative for correct and incorrect model predictions, and highlight the shift of focus when different training strategies are adopted. 3) Spatial Grounding for Improving Model Classification at Training Time. We propose a guided dropout regularizer for deep networks based on the evidence of a network prediction. This approach penalizes neurons that are most relevant for model prediction. By dropping such high-saliency neurons, the network is forced to learn alternative paths in order to maintain loss minimization. We demonstrate better generalization ability, an increased utilization of network neurons, and a higher resilience to network compression. 4) Spatial Grounding for Improving Model Classification at Test Time. We propose Guided Zoom, an approach that utilizes spatial grounding to make more informed predictions at test time. Guided Zoom compares the evidence used to make a preliminary decision with the evidence of correctly classified training examples to ensure evidence-prediction consistency, and otherwise refines the prediction.
We demonstrate accuracy gains for fine-grained classification. 5) Spatiotemporal Grounding. We devise a formulation that simultaneously grounds evidence in space and time, in a single pass, using top-down saliency. We visualize the spatiotemporal cues that contribute to a deep recurrent neural network’s classification/captioning output. Based on these spatiotemporal cues, we are able to localize segments within a video that correspond with a specific action, or phrase from a caption, without explicitly optimizing/training for these tasks.
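The guided dropout idea in contribution 3 — dropping the most salient neurons rather than random ones — can be sketched as follows (an illustrative version on a flat activation vector; the thesis applies it inside deep networks during training):

```python
def guided_dropout(activations, saliencies, drop_frac=0.2):
    """Saliency-guided dropout sketch.

    Instead of dropping neurons at random, zero out the fraction of
    neurons with the *highest* saliency (evidence for the current
    prediction), forcing the network to learn alternative evidence
    paths to keep minimizing the loss.
    """
    k = int(len(activations) * drop_frac)
    if k == 0:
        return list(activations)
    top_salient = set(sorted(range(len(saliencies)),
                             key=lambda i: saliencies[i],
                             reverse=True)[:k])
    return [0.0 if i in top_salient else a
            for i, a in enumerate(activations)]
```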
289

A study of shear behavior of reinforced concrete deep beams

Nguyen, Phu Trong, active 21st century 25 November 2014 (has links)
Reinforced concrete deep beams are vital structural members serving as load transferring elements. The behavior of reinforced concrete deep beams is complex. Nonlinear distribution of strain and stress must be considered. Prior to 1999, ACI 318 Codes included an empirical design equation for reinforced concrete deep beams. Since 2002, the strut and tie model and nonlinear analysis have been required. However, both methods have disadvantages of complexity or lack of transparency. The objective of this study is to produce a simple, reliable design equation for reinforced concrete deep beams. A nonlinear finite element program, ATENA, was used for analyzing and predicting the behavior of concrete and reinforced concrete structures. First, the applicability of ATENA was verified by developing computer models of simply supported and two span continuous deep beams based on Birrcher’s tests of simply supported deep beams. Tests by Rogowsky and Macgregor and by Ashour are the basis for the models of continuous two span deep beams. Those tests were selected because the researchers reported adequate details of the experimental program and specimen behavior. Then a series of simply supported and two span continuous deep beam models were developed based on the details and geometry of Birrcher's beams. The computer models were used to investigate the following parameters: the compressive strength of concrete, shear span to depth ratios, longitudinal reinforcement ratios, web reinforcement, effect of member depth, and loading conditions. Finally, a proposed design equation for the shear strength of reinforced concrete deep beams was derived based on the observed behavior of reinforced concrete deep beam tests, the results of the analytical study, and a plastic truss model. The proposed equations were in good agreement with test values and provide an alternate approach to current design procedures for deep beams.
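For context, a classic sectional baseline against which deep-beam shear equations are often compared is the simplified ACI 318 expression Vc = 2*lambda*sqrt(f'c)*bw*d (US customary units); the sketch below computes that baseline only, and is not the design equation proposed in this study:

```python
import math

def concrete_shear_capacity_psi(fc_psi, bw_in, d_in, lam=1.0):
    """Concrete contribution to sectional shear strength per the
    classic simplified ACI 318 expression (lb, US customary units):

        Vc = 2 * lambda * sqrt(f'c) * bw * d

    fc_psi : concrete compressive strength f'c (psi)
    bw_in  : web width (in), d_in : effective depth (in)
    lam    : lightweight-concrete factor (1.0 for normal weight)

    Deep beams additionally require strut-and-tie or nonlinear checks;
    this is only the simple sectional baseline.
    """
    return 2.0 * lam * math.sqrt(fc_psi) * bw_in * d_in
```

For example, a normal-weight beam with f'c = 4000 psi, bw = 12 in, and d = 20 in gives Vc of roughly 30.4 kips.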
290

Deep Boltzmann machines as hierarchical generative models of perceptual inference in the cortex

Reichert, David Paul January 2012 (has links)
The mammalian neocortex is integral to all aspects of cognition, in particular perception across all sensory modalities. Whether computational principles can be identified that would explain why the cortex is so versatile and capable of adapting to various inputs is not clear. One well-known hypothesis is that the cortex implements a generative model, actively synthesising internal explanations of the sensory input. This ‘analysis by synthesis’ could be instantiated in the top-down connections in the hierarchy of cortical regions, and allow the cortex to evaluate its internal model and thus learn good representations of sensory input over time. Few computational models however exist that implement these principles. In this thesis, we investigate the deep Boltzmann machine (DBM) as a model of analysis by synthesis in the cortex, and demonstrate how three distinct perceptual phenomena can be interpreted in this light: visual hallucinations, bistable perception, and object-based attention. A common thread is that in all cases, the internally synthesised explanations go beyond, or deviate from, what is in the visual input. The DBM was recently introduced in machine learning, but combines several properties of interest for biological application. It constitutes a hierarchical generative model and carries both the semantics of a connectionist neural network and a probabilistic model. Thus, we can consider neuronal mechanisms but also (approximate) probabilistic inference, which has been proposed to underlie cortical processing, and contribute to the ongoing discussion concerning probabilistic or Bayesian models of cognition. 
Concretely, making use of the model’s capability to synthesise internal representations of sensory input, we model complex visual hallucinations resulting from loss of vision in Charles Bonnet syndrome. We demonstrate that homeostatic regulation of neuronal firing could be the underlying cause, reproduce various aspects of the syndrome, and examine a role for the neuromodulator acetylcholine. Next, we relate bistable perception to approximate, sampling-based probabilistic inference, and show how neuronal adaptation can be incorporated by providing a biological interpretation for a recently developed sampling algorithm. Finally, we explore how analysis by synthesis could be related to attentional feedback processing, employing the generative aspect of the DBM to implement a form of object-based attention. We thus present a model that uniquely combines several computational principles (sampling, neural processing, unsupervised learning) and is general enough to address a range of distinct perceptual phenomena. The connection to machine learning ensures theoretical grounding and practical evaluation of the underlying principles. Our results lend further credence to the hypothesis of a generative model in the brain, and promise fruitful interaction between neuroscience and Deep Learning approaches.
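The DBM is built from stacked restricted Boltzmann machine (RBM) layers, and its "synthesis" step amounts to block-Gibbs sampling: sample hidden units given the visible layer, then resample the visible units from the hiddens. A minimal single-layer sketch (illustrative shapes and names):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b_hid, b_vis):
    """One block-Gibbs sweep in an RBM, the building block of a DBM.

    v : visible activations (shape: n_vis)
    W : weight matrix (n_vis x n_hid); b_hid, b_vis : biases.
    Sampling the visibles back from the hiddens is the 'synthesis'
    half of analysis by synthesis: the model's internal explanation
    of its input. Returns (sampled visibles, visible probabilities).
    """
    p_h = sigmoid(v @ W + b_hid)                       # P(h=1 | v)
    h = (rng.random(p_h.shape) < p_h).astype(float)    # sample hiddens
    p_v = sigmoid(h @ W.T + b_vis)                     # P(v=1 | h)
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, p_v
```

In a full DBM, each layer is conditioned on both its neighbors, and repeated sweeps implement the approximate probabilistic inference the thesis relates to cortical processing.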
