31 |
Estimation of Predictive Uncertainty in the Supervised Segmentation of Magnetic Resonance Imaging (MRI) Diffusion Images Using Deep Ensemble Learning / ESTIMATING PREDICTIVE UNCERTAINTY IN DEEP LEARNING SEGMENTATION FOR DIFFUSION MRI
McCrindle, Brian, January 2021
With the desired deployment of Artificial Intelligence (AI), the question of whether AI can “communicate” why it has made its decisions is of particular importance. In this thesis, we utilize predictive entropy (PE) as a surrogate for predictive uncertainty and report it for various test-time conditions that alter the testing distribution. This is done to evaluate the potential for PE to indicate when users should trust or distrust model predictions under dataset shift or out-of-distribution (OOD) conditions, two scenarios that are prevalent in real-world settings. Specifically, we trained an ensemble of three 2D-UNet architectures to segment synthetically damaged regions in fractional anisotropy scalar maps, a widely used diffusion metric to indicate microstructural white-matter damage. Baseline ensemble statistics report that the true positive rate, false negative rate, false positive rate, true negative rate, Dice score, and precision are 0.91, 0.091, 0.23, 0.77, 0.85, and 0.80, respectively. Test-time PE was reported before and after the ensemble was exposed to increasing geometric distortions (OOD), adversarial examples (OOD), and decreasing signal-to-noise ratios (dataset shift). We observed that even though PE shows a strong negative correlation with model performance for increasing adversarial severity (ρAE = −1), this correlation is not seen under distortion or SNR conditions (ρD = −0.26, ρSNR = −0.30). However, the PE variability (PE-Std) between individual model predictions was shown to be a better indicator of uncertainty, as strong negative correlations between model performance and PE-Std were seen during geometric distortions and adversarial examples (ρD = −0.83, ρAE = −1). Unfortunately, PE fails to report large absolute uncertainties during these conditions, thus restricting the analysis to correlative relationships.
Finally, determining an uncertainty threshold between “certain” and “uncertain” model predictions was seen to be heavily dependent on model calibration. For augmentation conditions close to the training distribution, a single threshold could be hypothesized. However, caution must be taken if such a technique is clinically applied, as model miscalibration could nullify such a threshold for samples far from the distribution. To ensure that PE or PE-Std could be used more broadly for uncertainty estimation, further work must be completed. / Thesis / Master of Applied Science (MASc)
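As a rough illustration of the two quantities discussed in this abstract, the sketch below computes a predictive entropy and an entropy spread for a single pixel of a binary segmentation. It is only a minimal sketch under stated assumptions: the function names are hypothetical, PE is taken as the entropy of the ensemble-averaged foreground probability, and PE-Std is taken as the standard deviation of the per-member entropies. The thesis may define either quantity differently.

```python
import math
from statistics import mean, pstdev


def binary_entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a Bernoulli variable with P(foreground) = p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def ensemble_pe(member_probs):
    """Assumed PE: entropy of the ensemble-averaged foreground probability."""
    return binary_entropy(mean(member_probs))


def ensemble_pe_std(member_probs):
    """Assumed PE-Std: spread of the entropies of the individual members."""
    return pstdev([binary_entropy(p) for p in member_probs])


# Hypothetical foreground probabilities from a three-member ensemble.
agreeing = [0.9, 0.9, 0.9]
disagreeing = [0.1, 0.5, 0.9]

for name, probs in (("agreeing", agreeing), ("disagreeing", disagreeing)):
    print(name, round(ensemble_pe(probs), 3), round(ensemble_pe_std(probs), 3))
```

In this toy case the disagreeing ensemble has both a higher PE and a non-zero PE-Std, while the agreeing ensemble has a PE-Std of zero regardless of how confident its members are.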
|
32 |
Developing Deep Learning Tools in Earthquake Detection and Phase Picking
Mai, Hao, 31 August 2023
With the rapid growth of seismic data volumes, traditional automated processing methods, which have been in use for decades, face increasing challenges in handling these data, especially in noisy environments. Deep learning (DL) methods, due to their ability to handle large datasets and perform well in complex scenarios, offer promising solutions to these challenges. When I started my Ph.D. degree, although a sizeable number of researchers were beginning to explore the application of deep learning in seismology, almost no one was involved in the development of much-needed automated data annotation tools and deep learning training platforms for this field. In other rapidly evolving fields of artificial intelligence, such automated tools and platforms are often a prerequisite and critical to advancing the development of deep learning. Motivated by this gap, my Ph.D. research focuses on creating these essential tools and conducting critical investigations in the field of earthquake detection and phase picking using DL methods. The first research chapter introduces QuakeLabeler, an open-source Python toolbox that facilitates the efficient creation and management of seismic training datasets. This tool aims to address the laborious process of producing training labels in the vast amount of seismic data available today. Building on this foundational tool, the second research chapter presents Blockly Earthquake Transformer (BET), a deep learning platform that provides an interactive dashboard for efficient customization of deep learning phase pickers. BET aims to optimize the performance of seismic event detection and phase picking by allowing easy customization of model parameters and providing extensions for transfer learning and fine-tuning. The third and final research chapter investigates the performance of DL pickers by examining the effect of training data size and deployment settings on phase picking accuracy. 
This investigation provides insight into the optimal size of training datasets, the suitability of DL pickers for new target regions, and the impact of various factors on training and on model performance. Through the development of these tools and investigations, this thesis contributes to the application of DL in seismology, paving the way for more efficient seismic data processing, customizable model creation, and a better understanding of DL model performance in earthquake detection and phase-picking tasks.
|
33 |
Deep Learning on the Edge: Model Partitioning, Caching, and Compression
Fang, Yihao, January 2020
With the recent advancement in deep learning, there has been increasing interest in applying deep learning algorithms to mobile edge devices (e.g. wireless access points, mobile phones, and self-driving vehicles). Such devices are closer to end-users and data sources than cloud data centers, so deep learning on the edge has several merits: it 1) reduces communication overhead (e.g. latency), 2) preserves data privacy (e.g. by not leaking sensitive information to cloud service providers), and 3) promotes autonomy without the need for continuous network connectivity. However, it also comes with a trade-off: deep learning on the edge often results in lower prediction accuracy or longer inference time. How to optimize this trade-off has drawn a lot of attention in the machine learning and systems research communities. Those communities have explored three main directions to solve the problem: partitioning, caching, and compression.
Deep learning model partitioning works in distributed and parallel computing by leveraging computation units (e.g. edge nodes and end devices) of different capabilities to achieve the best of both worlds (accuracy and latency), but the inference time of partitioning is nevertheless lower bounded by the smallest of inference times on edge nodes (or end devices).
In contrast, model caching is not limited by such a lower bound. Studies of caching follow two trends: 1) caching the prediction results on the edge node or end device, and 2) caching a partition or a less complex model on the edge node or end device. Caching the prediction results usually compromises accuracy, since a mapping function (e.g. a hash function) from the inputs to the cached results often cannot match the complex function given by a full-size neural network. On the other hand, caching a model's partition does not sacrifice accuracy if we employ a proper partition selection policy.
Model compression reduces deep learning model size by, for example, pruning neural network edges or quantizing network parameters. A reduced model has a smaller size and fewer operations to compute on the edge nodes or end devices. However, compression usually sacrifices prediction accuracy in exchange for shorter inference time.
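The two compression techniques named above can be sketched on a plain list of weights. This is a generic illustration, not the method proposed in the thesis: the function names are hypothetical, pruning is done by magnitude, and quantization is a simple uniform 8-bit scheme.

```python
def prune_smallest(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    k = int(len(weights) * fraction)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(by_magnitude[:k])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


def quantize_uint8(weights):
    """Map float weights onto the integers 0..255 (uniform affine quantization)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255
    if scale == 0.0:  # all weights identical: any positive scale works
        scale = 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the 8-bit codes."""
    return [lo + scale * c for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, lo = quantize_uint8(weights)
restored = dequantize(codes, scale, lo)
# Rounding to the nearest code bounds the error by half a quantization step.
print(max(abs(w - r) for w, r in zip(weights, restored)) <= scale / 2 + 1e-9)
print(prune_smallest([0.9, -0.05, 0.4, 0.01], 0.5))
```

Both functions trade precision for size: pruning removes parameters outright, while quantization keeps every parameter but stores it in one byte instead of a 32-bit float.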
In this thesis, our contributions to partitioning, caching, and compression are covered with experiments on state-of-the-art deep learning models. In partitioning, we propose TeamNet based on competitive and selective learning schemes. Experiments using MNIST and CIFAR-10 datasets show that on Raspberry Pi and Jetson TX2 (with TensorFlow), TeamNet shortens neural network inference by as much as 53% without compromising predictive accuracy.
In caching, we propose CacheNet, which caches low-complexity models on end devices and high-complexity (or full) models on edge or cloud servers. Experiments using CIFAR-10 and FVG show that on Raspberry Pi, Jetson Nano, and Jetson TX2 (with TensorFlow Lite and NCNN), CacheNet is 58-217% faster than baseline approaches that run inference tasks on end devices or edge servers alone.
In compression, we propose the logographic subword model for compression in machine translation. Experiments demonstrate that in English-Chinese/Chinese-English translation tasks, the logographic subword model reduces training and inference time by 11-77% with Theano and Torch. We demonstrate that our approaches are promising for applying deep learning models on the mobile edge. / Thesis / Doctor of Philosophy (PhD) / Edge artificial intelligence (EI) has attracted much attention in recent years. EI is a new computing paradigm where artificial intelligence (e.g. deep learning) algorithms are distributed among edge nodes and end devices of computer networks. EI has many merits, such as shorter latency, better privacy, and autonomy. These advantages motivate us to contribute to EI by developing intelligent solutions including partitioning, caching, and compression.
|
34 |
Reevaluating the Ventral and Lateral Temporal Neural Pathways in Face Processing: Deep Learning Insights into Face Identity and Facial Expression Mechanisms
Schwartz, Emily, January 2024
Thesis advisor: Stefano Anzellotti / There has been much debate over how the functional organization of vision develops. Contemporary theories that are inspired by analyzing neural data with machine learning models have led to new insights in understanding brain organization. Given the evolutionary importance of face perception and the specialized mechanisms that have evolved to support evaluating it, examining faces offers a unique way to study a dedicated mechanism that shares much of its organization in ventral and lateral neural pathways with other social stimuli, and provides insight into a more general principle of the organization of social perception. According to a classical view of face perception (Bruce and Young, 1986; Haxby, Hoffman, and Gobbini, 2000), face identity and facial expression recognition are performed by separate neural substrates (ventral and lateral temporal face-selective regions, respectively). However, recent studies challenge this view, showing that expression valence can also be decoded from ventral regions (Skerry and Saxe, 2014; Li, Richardson, and Ghuman, 2019) and identity from lateral regions (Anzellotti and Caramazza, 2017). These recent findings have inspired the formulation of an alternative hypothesis. From a computational perspective, it may be possible to process face identity and facial expression jointly by disentangling information for the two properties. This hypothesis was tested using deep convolutional neural network (DCNN) models as a proof of principle. This is followed by an evaluation of the representational content of static face stimuli within ventral and lateral temporal face-selective regions using intracranial electroencephalography (iEEG). The work is then extended to investigate the representational content of dynamic faces within these regions using functional magnetic resonance imaging (fMRI).
The results reported here as well as the reviewed literature may help to support the reevaluation of the roles the ventral and lateral temporal neural pathways play in processing socially-relevant stimuli. / Thesis (PhD) — Boston College, 2024. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Psychology and Neuroscience.
|
35 |
A Naturalistic Driving Study for Lane Change Detection and Personalization
Lakhkar, Radhika Anandrao, 05 January 2023
Driver Assistance and Autonomous Driving features are becoming nearly ubiquitous in new vehicles. The intent of Driver Assistance features is to assist the driver in making safer decisions. The intent of Autonomous Driving features is to execute vehicle maneuvers, without human intervention, in a safe manner. The overall goal of Driver Assistance and Autonomous Driving features is to reduce accidents, injuries, and deaths while providing a comfortable driving experience. However, different drivers can react differently to advanced automated driving technology. It is therefore important to consider and improve the adaptability of these advances based on driver behavior.
In this thesis, a human-centric approach is adopted in order to provide an enriching driving experience. The thesis investigates the natural behavior of drivers when changing lanes in terms of preferences of vehicle kinematics parameters using a real-world driving dataset collected as part of the Second Strategic Highway Research Program (SHRP2). The SHRP2 Naturalistic Driving Study (NDS) set is mined for lane change events. This work develops a way to detect reliable lane changing instances from a huge NDS dataset with more than 5,400,000 data files. The lane changing instances are distinguished from noisy and erroneous data by using machine vision lane tracking system variables such as left lane marker probability and right lane marker probability. We have shown that detected lane changing instances can be validated using only vehicle kinematics data.
Kinematic vehicle parameters such as vehicle speed, lateral displacement, lateral acceleration, steering wheel angle, and lane change duration are then extracted and examined from time series data to characterize these lane-changing instances for a given driver. We have shown how these vehicle kinematic parameters change and exhibit patterns during lane change maneuvers for a specific driver. The thesis shows the limitations of analyzing vehicle kinematic parameters separately and develops a novel metric, the Lane Change Dynamic Score (LCDS), that shows the collective effect of these vehicle kinematic parameters. LCDS is used to classify each lane change and thereby different driving styles. / Master of Science / The current tendency of car manufacturers is to create vehicles that will offer the user the most comfortable ride possible. The user experience is given a lot of attention to ensure it is up to par. With technological advancements, we are moving closer to an era in which automobiles perform many functions autonomously. However, different drivers may react differently to highly automated driving technologies. Therefore, adapting to different driving styles is critical to increasing the acceptance of autonomous vehicle features. In this work, we examine lane changes, one of the more stressful driving maneuvers. The main subjects of this study, which is based on actual driving scenarios, are the analysis of various drivers' lane-changing behaviors and the value of personalization. To achieve this, we have provided an algorithm to identify occurrences of lane-changing from real driving trip data files. Following that, we investigated parameters such as lane change duration, vehicle speed, displacement, acceleration, and steering wheel angle when changing lanes. We have demonstrated the patterns and changes in these vehicle kinematic characteristics that occur when a particular driver performs lane change operations.
The thesis shows the limitations of analyzing vehicle kinematic parameters separately and develops a novel metric, the Lane Change Dynamic Score (LCDS), that shows the collective effect of these vehicle kinematic parameters. LCDS is used to classify each lane change and thereby different driving styles.
|
36 |
Deep Learning-Driven Modeling of Dynamic Acoustic Sensing in Biomimetic Soft Robotic Pinnae
Chakrabarti, Sounak, 02 October 2024
Bats possess remarkably sophisticated biosonar systems that seamlessly integrate the physical encoding of information through intricate ear motions with the neural extraction and processing of sensory information. While previous studies have endeavored to mimic the pinna (outer ear) dynamics of bats using fixed deformation patterns in biomimetic soft-robotic sonar heads, such physical approaches are inherently limited in their ability to comprehensively explore the vast actuation pattern space that may enable bats to adaptively sense across diverse environments and tasks. To overcome these limitations, this thesis presents the development of deep regression neural networks capable of predicting the beampattern (acoustic radiation pattern) of a soft-robotic pinna as a function of its actuator states. The pinna model geometry is derived from a tomographic scan of the right ear of the greater horseshoe bat (Rhinolophus ferrumequinum). Three virtual actuators are incorporated into this model to simulate a range of shape deformations. For each unique actuation pattern producing a distinct pinna shape conformation, the corresponding ultrasonic beampattern is numerically estimated using a frequency-domain boundary element method (BEM) simulation, providing ground truth data. Two neural network architectures, a multilayer perceptron (MLP) and a radial basis function network (RBFN) based on von Mises functions, were evaluated for their ability to accurately reproduce these numerical beampattern estimates as a function of the spherical coordinates azimuth and elevation. Both networks demonstrate comparably low errors in replicating the beampattern data. However, the MLP exhibits significantly higher computational efficiency, reducing training time by 7.4 seconds and inference time by 0.7 seconds compared to the RBFN.
The superior computational performance of deep neural network models in inferring biomimetic pinna beampatterns from actuator states enables an extensive exploration of the vast actuation pattern space to identify pinna actuation patterns optimally suited for specific biosonar sensing tasks. This simulation-based approach provides a powerful framework for elucidating the functional principles underlying the dynamic shape adaptations observed in bat biosonar systems. / Master of Science / The aim is to understand how bats can dynamically change the shape of their outer ears (pinnae) to optimally detect sounds in different environments and for different tasks. Previous studies tried to mimic bat ear motions using fixed deformation patterns in robotic ear models, but this approach is limited. Instead, this thesis uses deep learning neural networks to predict how changing the shape of a robotic bat pinna model affects its acoustic beampattern (how it radiates and receives sound). The pinna geometry is based on a 3D scan of a greater horseshoe bat ear, with three virtual "actuators" to deform the shape. For many different actuator patterns deforming the pinna, the resulting beampattern is calculated using computer simulations. Neural networks (a multilayer perceptron and a radial basis function network) are trained on this data to accurately predict the beampattern from the actuator states. The multilayer perceptron network is found to be significantly more computationally efficient for this task. This neural-network-based approach allows rapid exploration of the vast range of possible pinna actuations to identify optimal shapes for specific biosonar sensing tasks, shedding light on principles of dynamic ear shape control in bats.
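To make the idea of a radial basis function network "based on von Mises functions" more concrete, the sketch below evaluates a weighted sum of direction-dependent bumps on the sphere as a function of azimuth and elevation. Everything here is an assumption for illustration: the function names, the von Mises-Fisher-style kernel exp(κ(cos θ − 1)), the shared concentration κ, and the hand-picked centers and weights. The abstract does not specify the network's actual basis functions or parameters.

```python
import math


def unit_vector(azimuth, elevation):
    """Convert azimuth/elevation (radians) to a 3D unit vector."""
    ce = math.cos(elevation)
    return (ce * math.cos(azimuth), ce * math.sin(azimuth), math.sin(elevation))


def von_mises_basis(direction, center, kappa):
    """Bump that equals 1 at the center direction and decays with angular distance."""
    cos_angle = sum(a * b for a, b in zip(direction, center))
    return math.exp(kappa * (cos_angle - 1.0))


def rbfn_gain(azimuth, elevation, centers, weights, kappa):
    """Weighted sum of basis functions evaluated at one look direction."""
    d = unit_vector(azimuth, elevation)
    return sum(
        w * von_mises_basis(d, unit_vector(az, el), kappa)
        for (az, el), w in zip(centers, weights)
    )


# Hypothetical two-lobe pattern: a main lobe ahead and a weaker side lobe.
centers = [(0.0, 0.0), (math.pi / 2, 0.3)]
weights = [1.0, 0.4]
print(rbfn_gain(0.0, 0.0, centers, weights, kappa=8.0))
print(rbfn_gain(math.pi, 0.0, centers, weights, kappa=8.0))
```

In a trained network the centers, weights, and concentration would be fitted to the simulated beampatterns, and the weights would additionally depend on the actuator states; here they are fixed by hand only to show the shape of the computation.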
|
37 |
Robot Motions that Mitigate Uncertainty
Toubeh, Maymoonah, 23 October 2024
This dissertation addresses the challenge of robot decision making in the presence of uncertainty, specifically focusing on robot motion decisions in the context of deep learning-based perception uncertainty. The first part of this dissertation introduces a risk-aware framework for path planning and assignment of multiple robots and multiple demands in unknown environments. The second part introduces a risk-aware motion model for searching for a target object in an unknown environment. To illustrate practical application, consider a situation such as disaster response or search-and-rescue, where it is imperative for ground vehicles to swiftly reach critical locations. Afterward, an agent deployed at a specified location must navigate inside a building to find a target, whether it is an object or a person. In the first problem, the terrain information is only available as an aerial georeferenced image frame. Semantic segmentation of the aerial images is performed using Bayesian deep learning techniques, creating a cost map for the safe navigation of ground robots. The proposed framework also accounts for risk at a further level, using conditional value at risk (CVaR), for making risk-aware assignments between the source and goal. When the robot reaches its destination, the second problem addresses the object search task using a proposed machine learning-based intelligent motion model. A comparison of various motion models, including a simple greedy baseline, indicates that the proposed model yields more risk-aware and robust results. All in all, considering uncertainty in both systems leads to demonstrably safer decisions. / Doctor of Philosophy / Scientists need to demonstrate that robots are safe and reliable outside of controlled lab environments for real-world applications to be viable.
This dissertation addresses the challenge of robot decision-making in the face of uncertainty, specifically focusing on robot motion decisions in the context of deep learning-based perception uncertainty. Deep learning (DL) refers to using large hierarchical structures, often called neural networks, to approximate semantic information from input data.
The first part of this dissertation introduces a risk-aware framework for path planning and assignment of multiple robots and multiple demands in unknown environments. Path planning involves finding a route from the source to the goal, while assignment focuses on selecting source-goal paths to fulfill all demands. The second part introduces a risk-aware motion model for searching for a target object in an unknown environment. Being risk-aware in both cases means taking uncertainty into account. To illustrate practical application, consider a situation such as disaster response or search-and-rescue, where it is imperative for ground vehicles to swiftly reach critical locations. Afterward, an agent deployed at a specified location must navigate inside a building to find a target, whether it is an object or a person.
In this dissertation, deep learning is used to interpret image inputs for two distinct robot systems. The input to the first system is an aerial georeferenced image; the second is an indoor scene. After the images are interpreted by deep learning, they undergo further processing to extract information about uncertainty. The information about the image and the uncertainty is used for later processing. In the first case, we use both a traditional path planning method and a novel path assignment method to assign one path from each source to a demand location. In the second case, a motion model is developed using image data, uncertainty, and position in relation to the anticipated target. Several potential motion models are compared for analysis. All in all, considering uncertainty in both systems leads to demonstrably safer decisions.
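Conditional value at risk (CVaR), which the technical abstract names as the risk measure for assignment, can be illustrated with a small empirical estimate. This is a minimal sketch under assumptions: the function name, the sample-based tail estimator, and the example path costs are all hypothetical, and the dissertation's actual formulation may differ.

```python
import math


def cvar(costs, alpha):
    """Empirical CVaR: mean of the worst (1 - alpha) share of sampled costs."""
    ordered = sorted(costs, reverse=True)
    tail = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:tail]) / tail


# Hypothetical sampled traversal costs for two candidate paths.
path_costs = {
    "steady": [10, 10, 10, 10, 10],
    "risky": [5, 5, 5, 5, 30],
}

for name, costs in path_costs.items():
    print(name, sum(costs) / len(costs), cvar(costs, alpha=0.8))

# A risk-aware choice minimizes CVaR rather than the mean cost.
print(min(path_costs, key=lambda name: cvar(path_costs[name], alpha=0.8)))
```

Both example paths have the same mean cost, so a risk-neutral planner cannot tell them apart, whereas the CVaR comparison penalizes the path with the rare but expensive outcome.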
|
38 |
Integrating Multiple Modalities into Deep Learning Network
McNeil, Patrick, 01 January 2017
Deep learning networks in the literature traditionally used only a single input modality (or data stream). Integrating multiple modalities into deep learning networks with the goal of correlating extracted features was a major issue. Traditional methods involved treating each modality separately and then writing custom code to combine the extracted features. Current solutions for small numbers of modalities (three or fewer) showed there are multiple architectures for modality integration. With an increase in the number of modalities, the “curse of dimensionality” affects the performance of the system. The research showed that current methods for larger-scale integrations required separate, custom-created modules with another integration layer outside the deep learning network. These current solutions neither scale well nor provide good generalized performance. This research report studied architectures using multiple modalities and the creation of a scalable and efficient architecture.
|
39 |
A Deep Learning Approach To Target Recognition In Side-Scan Sonar Imagery
Unknown Date
Achieving automatic target recognition in autonomous underwater vehicles has been a daunting task, largely due to the noisy nature of sonar imagery and the lack of publicly available sonar data. Machine learning techniques have made great strides in tackling this challenge, although not much research has been done regarding deep learning techniques for side-scan sonar imagery. Here, a state-of-the-art deep learning object detection method is adapted for side-scan sonar imagery, with results supporting a simple yet robust method to detect objects/anomalies along the seabed. A systematic procedure was employed in transfer learning a pre-trained convolutional neural network in order to learn the pixel-intensity based features of seafloor anomalies in sonar images. Using this process, newly trained convolutional neural network models were produced using relatively small training datasets and tested to show reasonably accurate anomaly detection and classification with few to no false alarms. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
|
40 |
IMPROVING THE REALISM OF SYNTHETIC IMAGES THROUGH THE MIXTURE OF ADVERSARIAL AND PERCEPTUAL LOSSES
Atapattu, Charith Nisanka, 01 December 2018
This research describes a novel method for generating synthetic images with improved realism while preserving annotation information and eye gaze direction. Furthermore, it describes how perceptual loss can be utilized alongside basic features and techniques from adversarial networks to achieve better results.
|