101

Developing Deep Learning Tools in Earthquake Detection and Phase Picking

Mai, Hao 31 August 2023 (has links)
With the rapid growth of seismic data volumes, traditional automated processing methods, which have been in use for decades, face increasing challenges in handling these data, especially in noisy environments. Deep learning (DL) methods, due to their ability to handle large datasets and perform well in complex scenarios, offer promising solutions to these challenges. When I started my Ph.D., although a sizeable number of researchers were beginning to explore the application of deep learning in seismology, almost no one was developing the much-needed automated data annotation tools and deep learning training platforms for this field. In other rapidly evolving fields of artificial intelligence, such automated tools and platforms are often a prerequisite and critical to advancing the development of deep learning. Motivated by this gap, my Ph.D. research focuses on creating these essential tools and conducting critical investigations in the field of earthquake detection and phase picking using DL methods. The first research chapter introduces QuakeLabeler, an open-source Python toolbox that facilitates the efficient creation and management of seismic training datasets. This tool addresses the laborious process of producing training labels from the vast amount of seismic data available today. Building on this foundational tool, the second research chapter presents Blockly Earthquake Transformer (BET), a deep learning platform that provides an interactive dashboard for efficient customization of deep learning phase pickers. BET aims to optimize the performance of seismic event detection and phase picking by allowing easy customization of model parameters and providing extensions for transfer learning and fine-tuning. The third and final research chapter investigates the performance of DL pickers by examining the effect of training data size and deployment settings on phase-picking accuracy. This investigation provides insight into the optimal size of training datasets, the suitability of DL pickers for new target regions, and the impact of various factors on training and model performance. Through the development of these tools and investigations, this thesis contributes to the application of DL in seismology, paving the way for more efficient seismic data processing, customizable model creation, and a better understanding of DL model performance in earthquake detection and phase-picking tasks.
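To make the annotation task concrete: DL pickers are commonly trained on fixed-length waveform windows whose labels are smooth target functions centered on catalog pick times. The minimal sketch below illustrates that labeling step only; the function and variable names are illustrative assumptions, not QuakeLabeler's actual API.

```python
import numpy as np

def gaussian_pick_label(n_samples, pick_idx, width=20):
    """Gaussian target centered on a catalog pick, the label shape
    commonly used to train DL phase pickers."""
    t = np.arange(n_samples)
    return np.exp(-0.5 * ((t - pick_idx) / width) ** 2)

# Hypothetical 30 s window at 100 Hz with a P pick at 10.0 s and an S pick at 18.5 s
fs, n = 100, 3000
p_label = gaussian_pick_label(n, int(10.0 * fs))
s_label = gaussian_pick_label(n, int(18.5 * fs))
noise_label = np.clip(1.0 - p_label - s_label, 0.0, 1.0)  # residual "noise" class
```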
102

An ML-based Method for Efficient Network Utilization in Online Gaming Using 5G Network Slicing

Saleh, Peyman 18 July 2023 (has links)
Online play has become a ubiquitous aspect of modern-day video gaming. It has gained immense popularity due to its accessibility and immersive experience, resulting in millions of players worldwide participating in various online games. Depending on the type of gameplay, the players’ quality of experience (QoE) in online video gaming can be significantly affected by network factors such as bandwidth and latency. As such, providers of online gaming services are competing to offer the highest quality of experience to their users at reasonable prices. To achieve this objective, online game providers face two main challenges. Firstly, they must accurately estimate the network throughput capacity required to meet the servers’ demands and ensure that the QoE is not compromised. Secondly, they must be able to secure the required throughput with network providers, which, in the current conventional network infrastructure, is neither agile nor dynamic. Thus, online game providers have to prepay for extra network throughput capacity or choose a cost-effective capacity that may result in potential QoE losses during peak usage. To address these challenges, this thesis proposes a deep neural network-based model that utilizes a QoE-aware loss function for predicting future network throughput demand. The model can accurately estimate the network throughput capacity required to maintain QoE levels while minimizing the cost of network resources. By doing so, online game providers can achieve optimal network resource allocation and effectively meet servers’ demands. Furthermore, this thesis proposes a slice optimizer module that employs 5G network slicing and a machine learning model to optimize network slices in a cost-efficient manner that satisfies both the online game provider’s and the network provider’s requirements. This module can dynamically allocate network resources based on the game provider’s QoE requirements, the network provider’s resource availability, and the cost of network resources. As a result, online game providers can efficiently manage network resources, optimize network slicing, and effectively control the cost of network resources.
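The abstract does not spell out the loss function, but the core idea, penalizing under-provisioning (which degrades QoE) more than over-provisioning (which only wastes capacity), can be sketched as an asymmetric regression loss. The 4:1 weighting below is an illustrative assumption, not the thesis's value:

```python
import torch

def qoe_aware_loss(pred, demand, under_weight=4.0):
    """Asymmetric squared error: under-predicting throughput demand hurts
    player QoE, so it is weighted more heavily than over-provisioning."""
    err = demand - pred  # positive => under-provisioned slice
    w = torch.where(err > 0,
                    torch.full_like(err, under_weight),
                    torch.ones_like(err))
    return (w * err ** 2).mean()

pred = torch.tensor([80.0, 120.0])    # predicted Mbps per slice
demand = torch.tensor([100.0, 110.0]) # actual demand
loss = qoe_aware_loss(pred, demand)   # the under-provisioned slice dominates
```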
103

A SURVEY OF THROMBOSIS SPECIALISTS ON THE PRACTICAL MANAGEMENT OF EXTENSIVE DEEP VEIN THROMBOSIS AND A PROTOCOL FOR A RANDOMIZED TRIAL

Boonyawat, Kochawan January 2017 (has links)
BACKGROUND: Though direct oral anticoagulants (DOACs) have become a standard of care in the treatment of acute deep vein thrombosis (DVT), it is our observation that physicians tend to initiate heparin or low-molecular-weight heparin, hereafter called “heparin”, for the treatment of extensive DVT or phlegmasia cerulea dolens (PCD). This may reflect a perception that heparin relieves DVT-related symptoms more quickly than DOACs. Whether these assumptions are true has not been evaluated. METHODS: We conducted a survey of thrombosis specialists in North America to explore the practical management of anticoagulant therapy in patients with extensive DVT, and the underlying reasons for the selection of heparin over DOACs. A cross-sectional, web-based survey was distributed to thrombosis specialists who are members of four thrombosis societies. RESULTS: Eighty-nine respondents provided consent. Most respondents selected DOACs over heparin in a case scenario representing mild DVT-related symptoms and limited thrombus involvement (81% vs. 19%). Most respondents selected heparin over DOACs in a case scenario representing early-stage PCD (84% vs. 16.3%) or a patient with high bodyweight (72% vs. 28%). In a case scenario representing extensive DVT, 57.4% of the respondents selected heparin, whereas 42.6% selected DOACs. Among the respondents who selected heparin over DOACs, the major reason was that heparin might relieve DVT-related symptoms more quickly because of its anti-inflammatory effects. DISCUSSION: Severity of DVT-related symptoms, thrombus extent, and bodyweight play a role in the selection of anticoagulant therapy. Despite a lack of evidence as to which anticoagulant is superior, most thrombosis specialists selected heparin over DOACs in patients with severe DVT-related symptoms and extensive thrombus involvement. The observed variation in the selection of anticoagulant therapy for the treatment of extensive DVT also indicates that clinical trials in this patient population are needed. / Thesis / Master of Science (MSc)
104

Deep Learning on the Edge: Model Partitioning, Caching, and Compression

Fang, Yihao January 2020 (has links)
With the recent advancement in deep learning, there has been increasing interest in applying deep learning algorithms to mobile edge devices (e.g. wireless access points, mobile phones, and self-driving vehicles). Such devices are closer to end-users and data sources than cloud data centers, so deep learning on the edge offers several merits: 1) reduced communication overhead (e.g. latency), 2) preserved data privacy (e.g. not leaking sensitive information to cloud service providers), and 3) autonomy without the need for continuous network connectivity. However, it also comes with a trade-off: deep learning on the edge often results in lower prediction accuracy or longer inference time. How to optimize this trade-off has drawn much attention in the machine learning and systems research communities, which have explored three main directions: partitioning, caching, and compression. Deep learning model partitioning works in distributed and parallel computing by leveraging computation units (e.g. edge nodes and end devices) of different capabilities to achieve the best of both worlds (accuracy and latency), but the inference time of partitioning is nevertheless lower bounded by the smallest of the inference times on edge nodes (or end devices). In contrast, model caching is not limited by such a lower bound. There are two trends of studies in caching: 1) caching the prediction results on the edge node or end device, and 2) caching a partition or a less complex model on the edge node or end device. Caching the prediction results usually compromises accuracy, since a mapping function (e.g. a hash function) from the inputs to the cached results often cannot match a complex function given by a full-size neural network. On the other hand, caching a model's partition does not sacrifice accuracy, provided a proper partition selection policy is employed. Model compression reduces deep learning model size by, e.g., pruning neural network edges or quantizing network parameters. A reduced model has a smaller size and fewer operations to compute on the edge nodes or end device. However, compression usually sacrifices prediction accuracy in exchange for shorter inference time. In this thesis, our contributions to partitioning, caching, and compression are covered with experiments on state-of-the-art deep learning models. In partitioning, we propose TeamNet, based on competitive and selective learning schemes. Experiments using the MNIST and CIFAR-10 datasets show that on Raspberry Pi and Jetson TX2 (with TensorFlow), TeamNet shortens neural network inference by as much as 53% without compromising predictive accuracy. In caching, we propose CacheNet, which caches low-complexity models on end devices and high-complexity (or full) models on edge or cloud servers. Experiments using CIFAR-10 and FVG show that on Raspberry Pi, Jetson Nano, and Jetson TX2 (with TensorFlow Lite and NCNN), CacheNet is 58-217% faster than baseline approaches that run inference tasks on end devices or edge servers alone. In compression, we propose the logographic subword model for compression in machine translation. Experiments demonstrate that in English-Chinese/Chinese-English translation tasks, the logographic subword model reduces training and inference time by 11-77% with Theano and Torch. We demonstrate that our approaches are promising for applying deep learning models on the mobile edge.
/ Thesis / Doctor of Philosophy (PhD) / Edge artificial intelligence (EI) has attracted much attention in recent years. EI is a new computing paradigm where artificial intelligence (e.g. deep learning) algorithms are distributed among edge nodes and end devices of computer networks. There are many merits in EI such as shorter latency, better privacy, and autonomy. These advantages motivate us to contribute to EI by developing intelligent solutions including partitioning, caching, and compression.
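As a rough illustration of the caching direction described above, a common pattern is to answer on-device when a small cached model is confident and fall back to the full model otherwise. This is a generic sketch of that policy under stated assumptions, not CacheNet's actual selection logic:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def infer_with_cache(x, small_model, full_model, conf_threshold=0.8):
    """Confidence-gated cached inference: use the low-complexity on-device
    model when its top softmax probability clears the threshold, otherwise
    fall back to the full model on the edge/cloud server."""
    probs = softmax(small_model(x))   # cheap on-device inference
    if probs.max() >= conf_threshold:
        return int(probs.argmax()), "device"
    probs = softmax(full_model(x))    # costlier remote inference
    return int(probs.argmax()), "server"

# Toy stand-ins: any callables mapping an input to class logits would do
def small_model(x):  # e.g. a pruned on-device CNN
    return np.random.default_rng(0).normal(size=10)

def full_model(x):   # e.g. the full-size server model
    return np.random.default_rng(1).normal(size=10)

label, where = infer_with_cache(np.zeros(32), small_model, full_model)
```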
105

Sedimentological and Geochemical Characterization of Neoproterozoic Deep-Marine Levee Deposits

Cunningham, Celeste 20 September 2022 (has links)
Deep-marine levees are areally extensive features that border submarine channel systems. Compared to the adjacent channel, where episodes of erosion and bypass are commonplace, levees are mostly depositional features that experience little erosion; individual beds therefore have high preservation potential, and levees presumably provide a nearly continuous depositional record of transport events down deep-marine slopes. Nevertheless, despite their size, volumetric prominence, and interpretive significance, deep-marine levees have received much less research attention than the adjacent channels. Accordingly, the spatial and temporal evolution of levee stratigraphy is much less well understood, in part because of the typically recessive nature of levee deposits exposed in outcrop in the ancient sedimentary record, and insufficient seismic resolution in the modern. Also, although modern deep-marine levees have been shown to sequester a large proportion of the world’s total buried organic carbon, few studies have attempted to assess carbon deposition and preservation in ancient deep-marine levee deposits. In the Isaac Formation of the Windermere Supergroup (Neoproterozoic) of east-central British Columbia, Canada, well-exposed levee deposits display a systematic organization on several dimensional scales. Levee packages (decameter-scale) are interpreted to record cyclic changes in the granulometric makeup of sediment supplied to the system, whereas bedsets (centimeter- to meter-scale) are interpreted to represent systematic and recurring pulses or surges during a single flow event. Furthermore, physical and geochemical characterization of levee strata at Castle Creek has shown that the unique depositional processes in levees can result in the concentration and enrichment of sedimentary marine organic matter (OM), which occurs mostly in banded, mud-rich sandstones deposited under oxic conditions. Organic carbon occurs primarily as nano-scale coatings on clay particles and as uncommon sand-sized organomineralic aggregates and discrete sand-sized amorphous grains. The distribution of this OM in levee strata is controlled by a combination of primary productivity, sea level, and rates of continental runoff and detrital terrigenous influx, which collectively are principally controlled by climate. Understanding the stacking patterns, geochemistry, and organic content of ancient levee deposits is important for assessing sedimentation patterns, depositional processes, event frequency and magnitude, paleoenvironmental conditions, and the evolution of ancient ocean and climate systems.
106

End-to-end Optical Music Recognition Beyond Staff-Level Transcription

Ríos-Vila, Antonio 04 July 2024 (has links)
Optical Music Recognition (OMR) is a research field that studies how to computationally read the music notation in documents and store it in a structured digital format. Traditional OMR approaches are usually structured around a multi-stage process: (i) image preprocessing, which addresses issues related to scanning and paper quality; (ii) symbol segmentation and classification, where the individual elements in the image are detected and labeled; (iii) music notation reconstruction, a post-processing phase of the recognition process; and (iv) output encoding, where the recognized elements are stored in a suitable symbolic format. These systems achieve competitive recognition rates at the cost of relying on specific heuristics tailored to the cases they were designed for. Scalability therefore becomes a major limitation, since a new set of heuristics must be designed for each collection or notation type. Another drawback of these traditional approaches is the need for detailed labeling, often obtained manually: because each symbol is recognized individually, the exact position of every symbol is required, along with its corresponding musical label. The integration of Deep Learning (DL) into OMR has marked a turning point toward the adoption of holistic, or end-to-end, systems. These systems, grounded in artificial intelligence and deep neural networks, treat the segmentation and classification of music symbols as a unified process rather than splitting it into multiple discrete stages. This methodology allows feature extraction and classification to be learned simultaneously, eliminating the need to develop and tune task-specific procedures. The key to this approach lies in the use of datasets composed of score images and their corresponding transcriptions, removing the need to mark the exact position of each symbol. This advance significantly simplifies the music transcription process, since the features relevant for classification are learned directly from the data without detailed manual labeling of individual elements. The end-to-end processing paradigm has been the subject of recent research. These works, while proceeding under the premise that a dedicated preprocessing stage has already segmented the staves in the scores, focus on retrieving sequences of music symbols from staff images. In this setting, Convolutional Recurrent Neural Networks (CRNN) are the most popular solution: the convolutional component extracts meaningful features from the images, while the recurrent layers interpret these features as sequences of music symbols. Current OMR results have demonstrated high accuracy in transcribing music scores, even in the most complex cases. These advances make it possible to pursue more ambitious goals. One notable line of work is universal OMR.
A universal music transcription system is one capable of transcribing the content of any music document. This means that, regardless of the characteristics and notation of the document, the model can transcribe it into a suitable notation and generate its digital version. Universal OMR is an ideal model for several reasons. The first is practical: it eases the work of end users, who currently need specific tools for each type of music score. Producing a universal transcriber would merge these programs into generic tools covering the full spectrum of user needs, reducing the cost of processing and maintaining music documents. From a scientific standpoint, this technique would unlock the potential of machine learning models to read and interpret music documents from generic knowledge. Such an achievement would enable more complex tasks that require this information but go beyond it, such as detecting authorship patterns, estimating the difficulty of a score, or classifying works by period. However, the current state of the art in OMR cannot yet meet this goal, owing to a number of limitations. This thesis proposes work that advances the state of the art in OMR toward that goal. First, contributions are proposed to complete OMR systems, which currently cannot export their results in formats compatible with the most common musicological tools. Once a complete OMR system is obtained, work is proposed to address the problems of Aligned Music Notation & Lyrics Transcription and polyphony, relevant challenges that the literature has not yet addressed due to their difficulty. In this way, through adaptations of current systems, the state of the art on these topics is advanced. Finally, segmentation-free systems for transcribing full music pages are addressed, freeing OMR models from their sequential segment-then-transcribe structure. Specifically, the research focuses on the Sheet Music Transformer, a transcription model based on state-of-the-art technologies that obtains the transcription of a score directly from the image of its page. / This paper is part of the project I+D+i PID2020-118447RA-I00 (MultiScore), funded by MCIN/AEI/10.13039/501100011033. The first author is supported by grants ACIF/2021/356 and CIBEFP/2022/19 from the “Programa I+D+i de la Generalitat Valenciana”.
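The staff-level CRNN recipe described above is typically trained with a CTC objective, which is what removes the need for symbol positions: only the symbol sequence per staff image is required. A minimal sketch, with illustrative layer sizes and vocabulary rather than the thesis's exact architecture:

```python
import torch
import torch.nn as nn

class StaffCRNN(nn.Module):
    """Minimal CRNN for staff-level OMR: conv features -> BiLSTM -> per-column symbol logits."""
    def __init__(self, n_symbols, img_height=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # H/2, W/2
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # H/4, W/4
        )
        feat_dim = 64 * (img_height // 4)
        self.rnn = nn.LSTM(feat_dim, 128, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.fc = nn.Linear(256, n_symbols + 1)  # +1 for the CTC blank

    def forward(self, x):          # x: (B, 1, H, W) staff images
        f = self.conv(x)           # (B, 64, H/4, W/4)
        B, C, H, W = f.shape
        f = f.permute(0, 3, 1, 2).reshape(B, W, C * H)  # one feature vector per image column
        out, _ = self.rnn(f)
        return self.fc(out)        # (B, frames, n_symbols + 1)

# Training with CTC: no symbol positions needed, only the label sequence per staff.
model = StaffCRNN(n_symbols=100)
imgs = torch.rand(4, 1, 128, 1024)
logits = model(imgs).log_softmax(-1).permute(1, 0, 2)  # CTC expects (T, B, classes)
targets = torch.randint(1, 101, (4, 30))               # dummy symbol sequences
loss = nn.CTCLoss(blank=0)(
    logits, targets,
    torch.full((4,), logits.size(0), dtype=torch.long),  # input lengths
    torch.full((4,), 30, dtype=torch.long))              # target lengths
```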
107

Reevaluating the Ventral and Lateral Temporal Neural Pathways in Face Processing: Deep Learning Insights into Face Identity and Facial Expression Mechanisms

Schwartz, Emily January 2024 (has links)
Thesis advisor: Stefano Anzellotti / There has been much debate over how the functional organization of vision develops. Contemporary theories inspired by analyzing neural data with machine learning models have led to new insights into brain organization. Given the evolutionary importance of face perception and the specialized mechanisms that have evolved to support it, examining faces offers a unique way to study a dedicated mechanism that shares much of its organization in ventral and lateral neural pathways with other social stimuli, and to provide insight into a more general principle of the organization of social perception. According to a classical view of face perception (Bruce and Young, 1986; Haxby, Hoffman, and Gobbini, 2000), face identity and facial expression recognition are performed by separate neural substrates (ventral and lateral temporal face-selective regions, respectively). However, recent studies challenge this view, showing that expression valence can also be decoded from ventral regions (Skerry and Saxe, 2014; Li, Richardson, and Ghuman, 2019) and identity from lateral regions (Anzellotti and Caramazza, 2017). These recent findings have inspired the formulation of an alternative hypothesis: from a computational perspective, it may be possible to process face identity and facial expression jointly by disentangling information for the two properties. This hypothesis was tested using deep convolutional neural network (DCNN) models as a proof of principle. Subsequently, the representational content of static face stimuli within ventral and lateral temporal face-selective regions was evaluated using intracranial electroencephalography (iEEG). This was then extended to investigating the representational content of dynamic faces within these regions using functional magnetic resonance imaging (fMRI). The results reported here, as well as the reviewed literature, may help to support the reevaluation of the roles the ventral and lateral temporal neural pathways play in processing socially relevant stimuli. / Thesis (PhD) — Boston College, 2024. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Psychology and Neuroscience.
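One common way to probe the joint-processing hypothesis in silico is a shared convolutional trunk with separate identity and expression readouts: if both properties can be decoded above chance from the one shared representation, joint processing is computationally feasible. The sketch below is a toy illustration under that assumption, not the architecture used in the thesis:

```python
import torch
import torch.nn as nn

class JointFaceNet(nn.Module):
    """Shared convolutional trunk with two readout heads: a toy version of
    the hypothesis that one pathway can carry both identity and expression."""
    def __init__(self, n_identities, n_expressions):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.identity_head = nn.Linear(64, n_identities)
        self.expression_head = nn.Linear(64, n_expressions)

    def forward(self, x):
        shared = self.trunk(x)   # one representation...
        return self.identity_head(shared), self.expression_head(shared)  # ...two properties

model = JointFaceNet(n_identities=100, n_expressions=7)
id_logits, expr_logits = model(torch.rand(8, 3, 96, 96))
# Train with the sum of two cross-entropy losses; above-chance decoding of BOTH
# properties from the shared trunk is the proof-of-principle result.
```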
108

A Naturalistic Driving Study for Lane Change Detection and Personalization

Lakhkar, Radhika Anandrao 05 January 2023 (has links)
Driver Assistance and Autonomous Driving features are becoming nearly ubiquitous in new vehicles. The intent of Driver Assistance features is to help the driver make safer decisions; the intent of Autonomous Driving features is to execute vehicle maneuvers, without human intervention, in a safe manner. The overall goal of both is to reduce accidents, injuries, and deaths while providing a comfortable driving experience. However, different drivers can react differently to advanced automated driving technology, so it is important to consider and improve the adaptability of these advances based on driver behavior. In this thesis, a human-centric approach is adopted in order to provide an enriching driving experience. The thesis investigates the natural behavior of drivers when changing lanes, in terms of preferred vehicle kinematics parameters, using a real-world driving dataset collected as part of the Second Strategic Highway Research Program (SHRP2). The SHRP2 Naturalistic Driving Study (NDS) dataset is mined for lane change events. This work develops a way to detect reliable lane-changing instances from a huge NDS dataset with more than 5,400,000 data files. The lane-changing instances are distinguished from noisy and erroneous data by using machine vision lane tracking system variables such as the left and right lane marker probabilities. We show that detected lane-changing instances can be validated using only vehicle kinematics data. Kinematic vehicle parameters such as vehicle speed, lateral displacement, lateral acceleration, steering wheel angle, and lane change duration are then extracted and examined from time series data to characterize these lane-changing instances for a given driver. We show how these vehicle kinematic parameters change and exhibit patterns during lane change maneuvers for a specific driver. The thesis shows the limitations of analyzing vehicle kinematic parameters separately and develops a novel metric, the Lane Change Dynamic Score (LCDS), which captures the collective effect of these parameters. LCDS is used to classify each lane change and thereby distinguish different driving styles. / Master of Science / The current tendency of car manufacturers is to create vehicles that will offer the user the most comfortable ride possible. The user experience is given a lot of attention to ensure it is up to par. With technological advancements, we are moving closer to an era in which automobiles perform many functions autonomously. However, different drivers may react differently to highly automated driving technologies. Therefore, adapting to different driving styles is critical to increasing the acceptance of autonomous vehicle features. In this work, we examine one of the more stressful maneuvers: lane changes. The analysis of various drivers’ lane-changing behaviors and the value of personalization are the main subjects of this study, based on actual driving scenarios. To achieve this, we provide an algorithm to identify occurrences of lane changing from real driving trip data files. Following that, we investigate parameters such as lane change duration, vehicle speed, displacement, acceleration, and steering wheel angle when changing lanes. We demonstrate the patterns and changes in these vehicle kinematic characteristics that occur when a particular driver performs lane change operations.
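The marker-probability filtering step lends itself to a compact illustration: flag a candidate lane change only when the lateral offset crosses a lane boundary while the vision system is still tracking both markers reliably. Signal names and thresholds below are assumptions for illustration, not SHRP2's actual fields:

```python
import numpy as np

def detect_lane_changes(lat_offset, left_prob, right_prob,
                        lane_width=3.6, min_prob=0.7):
    """Flag candidate lane-change instants: the lateral offset from lane
    center exceeds half the lane width while the machine vision system still
    tracks both markers reliably (this filters noisy/erroneous segments)."""
    reliable = (left_prob > min_prob) & (right_prob > min_prob)
    crossed = np.abs(lat_offset) > lane_width / 2
    candidates = reliable & crossed
    # Keep only rising edges so one maneuver yields one detection
    onsets = np.flatnonzero(candidates[1:] & ~candidates[:-1]) + 1
    return onsets
```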
109

Deep Learning-Driven Modeling of Dynamic Acoustic Sensing in Biomimetic Soft Robotic Pinnae

Chakrabarti, Sounak 02 October 2024 (has links)
Bats possess remarkably sophisticated biosonar systems that seamlessly integrate the physical encoding of information through intricate ear motions with the neural extraction and processing of sensory information. While previous studies have endeavored to mimic the pinna (outer ear) dynamics of bats using fixed deformation patterns in biomimetic soft-robotic sonar heads, such physical approaches are inherently limited in their ability to comprehensively explore the vast actuation pattern space that may enable bats to adaptively sense across diverse environments and tasks. To overcome these limitations, this thesis presents the development of deep regression neural networks capable of predicting the beampattern (acoustic radiation pattern) of a soft-robotic pinna as a function of its actuator states. The pinna model geometry is derived from a tomographic scan of the right ear of the greater horseshoe bat (Rhinolophus ferrumequinum). Three virtual actuators are incorporated into this model to simulate a range of shape deformations. For each unique actuation pattern producing a distinct pinna shape conformation, the corresponding ultrasonic beampattern is numerically estimated using a frequency-domain boundary element method (BEM) simulation, providing ground-truth data. Two neural network architectures, a multilayer perceptron (MLP) and a radial basis function network (RBFN) based on von Mises functions, were evaluated for their ability to accurately reproduce these numerical beampattern estimates as a function of the spherical coordinates azimuth and elevation. Both networks demonstrate comparably low errors in replicating the beampattern data. However, the MLP exhibits significantly higher computational efficiency, reducing training time by 7.4 seconds and inference time by 0.7 seconds compared to the RBFN. The superior computational performance of deep neural network models in inferring biomimetic pinna beampatterns from actuator states enables an extensive exploration of the vast actuation pattern space to identify pinna actuation patterns optimally suited for specific biosonar sensing tasks. This simulation-based approach provides a powerful framework for elucidating the functional principles underlying the dynamic shape adaptations observed in bat biosonar systems. / Master of Science / The aim is to understand how bats can dynamically change the shape of their outer ears (pinnae) to optimally detect sounds in different environments and for different tasks. Previous studies tried to mimic bat ear motions using fixed deformation patterns in robotic ear models, but this approach is limited. Instead, this thesis uses deep learning neural networks to predict how changing the shape of a robotic bat pinna model affects its acoustic beampattern (how it radiates and receives sound). The pinna geometry is based on a 3D scan of a greater horseshoe bat ear, with three virtual "actuators" to deform the shape. For many different actuator patterns deforming the pinna, the resulting beampattern is calculated using computer simulations. Neural networks (a multilayer perceptron and a radial basis function network) are trained on this data to accurately predict the beampattern from the actuator states. The multilayer perceptron network is found to be significantly more computationally efficient for this task.
This neural-network-based approach allows rapid exploration of the vast range of possible pinna actuations to identify optimal shapes for specific biosonar sensing tasks, shedding light on the principles of dynamic ear-shape control in bats.
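The regression task itself is straightforward to outline: the network maps actuator states plus a query direction to a beampattern value. A minimal sketch of the MLP variant, with illustrative layer sizes and random stand-in data where the BEM simulations would supply ground truth:

```python
import torch
import torch.nn as nn

# Minimal MLP sketch: 3 actuator states plus a direction (azimuth, elevation)
# map to a beampattern gain. Layer sizes are illustrative assumptions.
mlp = nn.Sequential(
    nn.Linear(5, 128), nn.ReLU(),   # 3 actuator states + azimuth + elevation
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1),              # predicted gain for that direction
)

actuators = torch.rand(64, 3)                              # batch of shape states
directions = torch.rand(64, 2) * torch.tensor([360.0, 180.0])
x = torch.cat([actuators, directions], dim=1)
gain = mlp(x)                                              # (64, 1) beampattern samples
loss = nn.functional.mse_loss(gain, torch.rand(64, 1))     # vs. BEM ground truth
```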
110

Robot Motions that Mitigate Uncertainty

Toubeh, Maymoonah 23 October 2024 (has links)
This dissertation addresses the challenge of robot decision making in the presence of uncertainty, specifically focusing on robot motion decisions in the context of deep learning-based perception uncertainty. The first part of this dissertation introduces a risk-aware framework for path planning and assignment of multiple robots and multiple demands in unknown environments. The second part introduces a risk-aware motion model for searching for a target object in an unknown environment. To illustrate practical application, consider a situation such as disaster response or search-and-rescue, where it is imperative for ground vehicles to swiftly reach critical locations. Afterward, an agent deployed at a specified location must navigate inside a building to find a target, whether it is an object or a person. In the first problem, the terrain information is only available as an aerial georeferenced image frame. Semantic segmentation of the aerial images is performed using Bayesian deep learning techniques, creating a cost map for the safe navigation of ground robots. The proposed framework also accounts for risk at a further level, using conditional value at risk (CVaR) to make risk-aware assignments between sources and goals. When the robot reaches its destination, the second problem addresses the object search task using a proposed machine learning-based intelligent motion model. A comparison of various motion models, including a simple greedy baseline, indicates that the proposed model yields more risk-aware and robust results. All in all, considering uncertainty in both systems leads to demonstrably safer decisions. / Doctor of Philosophy / Scientists need to demonstrate that robots are safe and reliable outside of controlled lab environments for real-world applications to be viable. This dissertation addresses the challenge of robot decision-making in the face of uncertainty, specifically focusing on robot motion decisions in the context of deep learning-based perception uncertainty. Deep learning (DL) refers to using large hierarchical structures, often called neural networks, to approximate semantic information from input data. The first part of this dissertation introduces a risk-aware framework for path planning and assignment of multiple robots and multiple demands in unknown environments. Path planning involves finding a route from the source to the goal, while assignment focuses on selecting source-goal paths to fulfill all demands. The second part introduces a risk-aware motion model for searching for a target object in an unknown environment. Being risk-aware in both cases means taking uncertainty into account. To illustrate practical application, consider a situation such as disaster response or search-and-rescue, where it is imperative for ground vehicles to swiftly reach critical locations. Afterward, an agent deployed at a specified location must navigate inside a building to find a target, whether it is an object or a person. In this dissertation, deep learning is used to interpret image inputs for two distinct robot systems. The input to the first system is an aerial georeferenced image; the second is an indoor scene. After the images are interpreted by deep learning, they undergo further processing to extract uncertainty information, which, together with the image content, informs the decisions made downstream.
In the first case, we use both a traditional path planning method and a novel path assignment method to assign one path from each source to a demand location. In the second case, a motion model is developed using image data, uncertainty, and position in relation to the anticipated target. Several potential motion models are compared for analysis. All in all, considering uncertainty in both systems leads to demonstrably safer decisions.
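The risk measure named above has a compact definition that is easy to sketch: CVaR at level α is the expected cost over the worst (1 − α) fraction of outcomes. A minimal numpy illustration; the sample distributions and α are assumptions, not the dissertation's data:

```python
import numpy as np

def cvar(costs, alpha=0.9):
    """Conditional value at risk: the mean of the worst (1 - alpha) fraction
    of sampled costs. Assigning paths on CVaR rather than the mean makes the
    robot hedge against rare-but-catastrophic outcomes."""
    costs = np.sort(costs)
    tail = costs[int(np.ceil(alpha * len(costs))):]
    return tail.mean() if tail.size else costs[-1]

# Hypothetical cost samples for two candidate paths, e.g. obtained by passing
# Bayesian (Monte Carlo dropout) segmentations through the cost map
path_a = np.random.normal(10.0, 1.0, 1000)  # steady, predictable
path_b = np.random.normal(8.0, 4.0, 1000)   # cheaper on average, riskier
print(cvar(path_a), cvar(path_b))           # a risk-aware assignment may prefer path_a
```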
