1

Bilevel Optimization in the Deep Learning Era: Methods and Applications

Zhang, Lei 05 January 2024 (has links)
Neural networks, together with their associated optimization algorithms, have demonstrated remarkable efficacy and versatility across a wide array of tasks, including image recognition, speech recognition, object detection, and sentiment analysis. Their strength lies in their capability to autonomously learn intricate representations that map input data to corresponding output labels. Nevertheless, not all tasks can be neatly encapsulated within an end-to-end learning paradigm. The complexity and diversity of real-world challenges call for specialized architectures and optimization strategies tailored to the unique intricacies of specific tasks, ensuring more nuanced and effective solutions to the demands of diverse applications. Bilevel optimization is a distinctive form of optimization in which one problem is embedded, or nested, within another, and it remains highly relevant in the current era dominated by deep learning. A notable instance of its application in deep learning is hyperparameter optimization. In neural networks, the weights are trained automatically through backpropagation, but certain hyperparameters, such as the learning rate and the number of layers, must be set in advance and cannot be optimized through the chain rule employed in backpropagation. This underscores the importance of bilevel optimization for the intricate task of fine-tuning these hyperparameters to enhance the overall performance of deep learning models. The domain of deep learning thus presents fertile ground for further exploration and discoveries in optimization.
The untapped potential for refining hyperparameters and optimizing various aspects of neural network architectures highlights ongoing opportunities for advances and breakthroughs in this dynamic field. Within this thesis, we delve into significant bilevel optimization challenges and apply these techniques to pertinent real-world tasks. Since bilevel optimization entails two layers of optimization, we explore scenarios where neural networks appear in the upper level, the lower level, or both. More specifically, we systematically investigate four distinct tasks: optimizing neural networks towards optimizing neural networks, optimizing attractors towards optimizing neural networks, optimizing graph structures towards optimizing neural network performance, and optimizing architectures towards optimizing neural networks. For each task, we formulate the problem mathematically as a bilevel optimization and introduce more efficient optimization strategies. Furthermore, we carefully evaluate the performance and efficiency of our proposed techniques. Importantly, our methodologies and insights extend beyond bilevel optimization and apply broadly to various deep learning models. The contributions made in this thesis offer valuable perspectives and tools for advancing optimization techniques in the broader landscape of deep learning. / Doctor of Philosophy / Bilevel optimization proves to be a valuable technique across various applications. Mathematically, it entails optimizing an objective at the upper level while concurrently solving another optimization problem at the lower level. The key challenge lies in finding optimal solutions at both levels simultaneously, given the interdependence between the decisions made at each level. The complexity of bilevel optimization escalates when it is integrated with deep learning.
Firstly, deep learning models typically undergo iterative optimization, which makes it challenging to streamline the process within a bilevel optimization framework. Secondly, the bilevel setting introduces complexity that makes end-to-end optimization of deep learning models difficult. This thesis delves into the bilevel optimization problem through four distinct approaches that incorporate deep learning. These approaches represent different tasks spanning various domains of machine learning, including neural architecture search, graph structure learning, implicit models, and causal inference. Notably, the proposed methods not only address specific types of bilevel optimization problems but also offer theoretical guarantees. The insights and methodologies presented in this thesis have the potential to aid practitioners in solving problems involving high-order decisions.
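The nested structure described above can be sketched with a deliberately tiny example: a lower level fits a 1-D ridge model's weight by gradient descent on the regularized training loss, while the upper level chooses the regularization strength to minimize validation loss. The data, step sizes, and grid of candidate strengths below are hypothetical illustrations, not the thesis's actual method (which uses more efficient strategies than grid search).

```python
# Toy bilevel optimization: the upper level picks a regularization strength
# lam to minimize validation loss; the lower level fits weight w on the
# regularized training loss. Data and hyperparameters are made up.

train = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.2)]   # (x, y) training pairs
val   = [(1.5, 3.0), (0.5, 1.1)]               # (x, y) validation pairs

def inner_fit(lam, steps=500, lr=0.05):
    """Lower level: gradient descent on mean squared error + ridge penalty."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        grad += 2 * lam * w                    # gradient of lam * w**2
        w -= lr * grad
    return w

def val_loss(w):
    """Upper-level objective: unregularized validation loss."""
    return sum((w * x - y) ** 2 for x, y in val) / len(val)

# Upper level: a crude grid search over lam stands in for hypergradient methods.
best_lam = min([0.0, 0.01, 0.1, 1.0, 10.0],
               key=lambda lam: val_loss(inner_fit(lam)))
```

The point of the sketch is the dependency structure: the upper-level objective can only be evaluated after the lower-level problem has been solved for each candidate hyperparameter, which is exactly why the chain rule of ordinary backpropagation does not apply directly.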
2

DEMOCRATISING DEEP LEARNING IN MICROBIAL METABOLITES RESEARCH / DEMOCRATISING DEEP LEARNING IN NATURAL PRODUCTS RESEARCH

Dial, Keshav January 2023 (has links)
Deep learning models are dominating performance across a wide variety of tasks. From protein folding to computer vision to voice recognition, deep learning is changing the way we interact with data. The field of natural products, and more specifically genomic mining, has been slow to adopt these new technological innovations. As we are in the midst of a data explosion, this is not for lack of training data; rather, it is due to the lack of a blueprint demonstrating how to correctly integrate these models to maximise performance and inference. During my PhD, I showcase the use of large language models across a variety of data domains to improve common workflows in the field of natural product drug discovery. I improved natural product scaffold comparison by representing molecules as sentences. I developed a series of deep learning models to replace archaic technologies and create a more scalable genomic mining pipeline, decreasing running times by 8x. I integrated deep learning-based genomic and enzymatic inference into legacy tooling to improve the quality of short-read assemblies. I also demonstrate how intelligent querying of multi-omic datasets can facilitate the gene-signature prediction of encoded microbial metabolites. The models and workflows I developed are broad in scope, with the hope of blueprinting how these industry-standard tools can be applied across the entirety of natural product drug discovery. / Thesis / Doctor of Philosophy (PhD)
3

GNN-based End-to-end Delay Prediction in Software Defined Networking

Ge, Zhun 12 August 2022 (has links)
Computer networks have long been challenging for both scientific and industrial communities, which seek to understand and analyze network performance and to design efficient procedures for network operation. In software-defined networking (SDN), predicting latency (delay) is essential for enhancing performance, power consumption, and resource utilization while meeting stringent latency requirements. In this thesis, we present a graph-based formulation of the Abilene network and other topologies and apply a Graph Neural Network (GNN)-based model, the Spatial-Temporal Graph Convolutional Network (STGCN), to predict end-to-end packet delay on this formulation. The evaluation compares STGCN with other machine learning methods: Multiple Linear Regression (MLR), Extreme Gradient Boosting (XGBOOST), Random Forest (RF), and a Neural Network (NN). Datasets in use include Abilene, a 15-node scale-free network, the 24-node GEANT2 network, and a 50-node network. Notably, in the most complex network scenario, our GNN-based methodology achieves 97.0%, 95.9%, 96.1%, and 63.1% lower root mean square error (RMSE) than the baseline predictor, MLR, XGBOOST, and RF, respectively. All the experiments show that STGCN delivers good prediction performance with small and stable prediction errors. This thesis illustrates the feasibility and benefits of a GNN approach to predicting end-to-end delay in software-defined networks.
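The abstract reports model quality as a percentage reduction in RMSE relative to competing predictors. A minimal sketch of that comparison, with invented delay values rather than the thesis's data:

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
                     / len(y_true))

# Hypothetical end-to-end delay measurements (ms) and two predictors' outputs.
y_true     = [10.0, 12.0, 9.0, 15.0]
y_baseline = [13.0, 8.0, 12.0, 19.0]   # stand-in for a weaker predictor
y_gnn      = [10.2, 11.8, 9.3, 14.6]   # stand-in for the STGCN-style model

# Percentage reduction in RMSE, the metric quoted in the abstract.
reduction = 100 * (1 - rmse(y_true, y_gnn) / rmse(y_true, y_baseline))
```

With these made-up numbers the GNN-style predictions cut RMSE by a little over 90%, which is how a claim like "97.0% less RMSE" is arrived at on real data.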
4

Gamma-ray tracking using graph neural networks / Tracking av gamma-strålning med hjälp av neurala grafnätverk

Andersson, Mikael January 2021 (has links)
While there are existing methods of gamma-ray track reconstruction in specialized detectors such as AGATA, including backtracking and clustering, it is naturally of interest to diversify the portfolio of available tools to provide viable alternatives. In this study some possibilities found in the field of machine learning were investigated, more specifically within the field of graph neural networks. In this project an attempt was made to reconstruct gamma tracks in a germanium solid using data simulated in Geant4. The data consist of photon energies below the pair-production limit, so the relevant processes are photoelectric absorption and Compton scattering. The author turned to graph networks to utilize their edge and node structure for data of the variable input size found in this task. A graph neural network (GNN) was implemented and trained on a variety of gamma multiplicities and energies and was subsequently tested in terms of various accuracy parameters and generated energy spectra. In the end the best result involved an edge classifier trained on a large dataset containing 10^6 tracks bundled together into separate events to be resolved. The network was capable of recalling up to 95 percent of the connective edges at the selected threshold in the infinite-resolution case, with a peak-to-total ratio of 68 percent for a set of packed data with a model trained on simulated data including realistic uncertainties in both position and energy. / Although a number of methods exist for track reconstruction in specialized detectors such as AGATA, it is of natural interest to diversify and investigate new tools for the task. In this study some possibilities within machine learning were examined, more specifically in the area of graph neural networks. During the project, data of photons in a hollow, spherical germanium geometry were simulated in Geant4. The simulated data are limited to energies below pair production, so the data consist of interactions through the photoelectric effect and Compton scattering. The variable size of these data and their spread in the detector geometry suit a graph format with node and edge structure. A graph neural network (GNN) was implemented and trained on data with gammas of variable multiplicity and energy and was evaluated on a selection of performance parameters and on its ability to generate a usable spectrum. The final result involved an edge-classification model trained on data with 10^6 tracks merged into events. The network recalled 95 percent of positive edges for a chosen threshold in the infinite-resolution case, with a peak-to-total of 68 percent for packed data treated with uncertainties in energy and position corresponding to the finite-resolution case.
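The headline figure above — recalling 95 percent of connective edges at a chosen threshold — comes from scoring candidate edges and counting how many true edges clear the cut. A self-contained sketch with hypothetical edge scores and labels:

```python
def recall_at_threshold(scores, labels, thr):
    """Fraction of true edges (label 1) whose classifier score clears thr."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
    pos = sum(labels)
    return tp / pos

# Hypothetical edge scores from a GNN edge classifier and ground-truth labels
# (1 = the two interaction points belong to the same gamma track).
scores = [0.95, 0.80, 0.40, 0.10, 0.70, 0.05]
labels = [1,    1,    1,    0,    1,    0]

r = recall_at_threshold(scores, labels, thr=0.5)
```

Moving the threshold trades recall against false positives, which is why the abstract quotes recall "for the selected threshold" rather than a single model-wide number.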
5

Segmentace obrazových dat pomocí grafových neuronových sítí / Image segmentation using graph neural networks

Boszorád, Matej January 2020 (has links)
This diploma thesis describes and implements the design of a graph neural network used for 2D segmentation of neural structures. The first chapter briefly introduces the segmentation problem and divides segmentation techniques according to the principles of the methods they use; each category is summarized along with a description of one representative. The second chapter explains graph neural networks (GNNs for short). Here, the thesis classifies graph neural networks in general and describes in more detail recurrent graph neural networks (RGNNs for short) and graph autoencoders, both of which can be used for image segmentation. The specific image segmentation solution is based on the message-passing method in RGNNs, which can replace convolution masks in convolutional neural networks. RGNNs also provide a simpler multilayer perceptron topology. The second type of graph neural network characterized in the thesis is the graph autoencoder, which uses various methods to better encode graph vertices into Euclidean space. The last part of the thesis deals with the analysis of the problem, the proposal of a specific solution, and the evaluation of results. The purpose of the practical part of the work was the implementation of a GNN for image data segmentation. An advantage of using neural networks is the ability to solve different types of segmentation by changing the training data. An RGNN with message passing and node2vec were used as the GNN implementations for the segmentation problem. RGNN training was performed on graphics cards provided by the school and on Google Colaboratory. Training the RGNN using node2vec was very memory-intensive, and it was therefore necessary to train on a processor with more than 12 GB of operating memory. As part of the RGNN optimization, training was tested with various loss functions and with changes to the topology and learning parameters. A tree-structure method was developed to use node2vec to improve segmentation, but the results did not confirm an improvement for a small number of iterations. The best outcomes of the practical implementation were evaluated by comparison with the convolutional neural network U-Net. The results are comparable to those of the U-Net network, but further testing is needed to compare these neural networks. The outcome of the thesis is the use of RGNNs as a modern solution to the problem of image segmentation, providing a foundation for further research.
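The message-passing idea central to the RGNN above can be shown in a few lines: each node aggregates its neighbours' features and combines the result with its own state. The fixed averaging update below is an illustrative stand-in; a real RGNN learns the aggregation and update functions.

```python
# One round of message passing on a tiny graph. Node 0 is connected to
# nodes 1 and 2; features are scalars for readability.

adjacency = {0: [1, 2], 1: [0], 2: [0]}
features  = {0: 1.0, 1: 3.0, 2: 5.0}

def message_passing_step(adj, feats):
    """New feature = average of self state and the mean neighbour message."""
    new = {}
    for node, nbrs in adj.items():
        msg = sum(feats[n] for n in nbrs) / len(nbrs)   # aggregate messages
        new[node] = 0.5 * feats[node] + 0.5 * msg       # combine with self
    return new

updated = message_passing_step(adjacency, features)
```

Stacking several such rounds lets information propagate across the graph, which is what allows message passing to play the role that convolution masks play in a CNN.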
6

Approaching sustainable mobility utilizing graph neural networks

Gunnarsson, Robin, Åkermark, Alexander January 2021 (has links)
This report was written in collaboration with WirelessCar for the Master of Science thesis at Halmstad University. Many different parameters influence fuel consumption. The objective of the report is to evaluate whether graph neural networks are a practical model for predicting fuel consumption over geographic areas. The model uses a partitioning of the geographical locations of trip observations to capture their spatial information. The project also proposes a method to capture the non-stationary behavior of vehicles by defining a vehicle node as a separate entity. The model captures the different features in a dense-layer neural network and utilizes message passing to capture context about neighboring nodes. The model is compared to a baseline neural network with a network architecture similar to that of the graph neural network. The data are partitioned into areas with K-means and with a static grid partition, with and without terrain details. This partition is used to structure a homogeneous graph, which underperforms. The practical drawbacks of the initial homogeneous graph are inspected and addressed to develop a heterogeneous graph that outperforms the neural network baseline.
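The grid-partition step described above reduces to mapping each trip observation's coordinates to a cell index. A minimal sketch, with a hypothetical cell size and made-up coordinates (the thesis also uses K-means partitions, which this does not show):

```python
# Assign trip observations to grid cells by geographic coordinates: a
# simplified stand-in for the static grid partitioning described above.

def grid_cell(lat, lon, cell_size=0.5):
    """Map a (lat, lon) coordinate to a (row, col) grid-cell index."""
    return (int(lat // cell_size), int(lon // cell_size))

# Hypothetical trip start points (degrees).
trips = [(57.70, 11.97), (57.71, 11.98), (59.33, 18.06)]
cells = [grid_cell(lat, lon) for lat, lon in trips]
```

Observations that fall in the same cell become a single area node in the graph, so nearby trips share spatial context while distant ones do not.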
7

Machine Learning for Improvement of Ocean Data Resolution for Weather Forecasting and Climatological Research

Huda, Md Nurul 18 October 2023 (has links)
Severe weather events like hurricanes and tornadoes pose major risks globally, underscoring the critical need for accurate forecasts to mitigate impacts. While advanced computational capabilities and climate models have improved predictions, the lack of high-resolution initial conditions still limits forecast accuracy. Most storms arise in the Atlantic's "Hurricane Alley" region, which therefore needs robust in-situ ocean data plus atmospheric profiles to enable precise hurricane tracking and intensity forecasts. Examining satellite datasets reveals that radio occultation (RO) provides the most accurate atmospheric measurements at 5-25 km altitude. However, below 5 km, accuracy remains insufficient over oceans compared with land areas. Recent benchmark studies, e.g. Patil and Iiyama (2022) and Wei and Guan (2022), proposed deep learning models for sea surface temperature (SST) prediction, with very low errors ranging from 0.35°C to 0.75°C in the Tohoku region and root-mean-square errors increasing from 0.27°C to 0.53°C over the China seas, respectively. The approach we have developed remains unparalleled in its domain as of this date. This research is divided into two parts and aims to develop a data-driven, satellite-informed machine learning system that combines high-quality but sparse in-situ ocean data with more readily available low-quality satellite data. In the first part of the work, a novel data-driven, satellite-informed machine learning algorithm was implemented that combines high-quality/low-coverage in-situ point ocean data (e.g. ARGO floats) and low-quality/high-coverage satellite ocean data (e.g. HYCOM, MODIS-Aqua, G-COM) and generated high-resolution data with an RMSE of 0.58°C over the Atlantic Ocean. In the second part of the work, a novel GNN algorithm was implemented for the Gulf of Mexico and shown to successfully capture the complex interactions within the ocean and to mimic the paths of ARGO floats with an RMSE of 1.40°C.
/ Doctor of Philosophy / Severe storms like hurricanes and tornadoes are a major threat around the world. Accurate weather forecasts can help reduce their impacts. While climate models have improved predictions, lacking detailed initial conditions still limits forecast accuracy. The Atlantic's "Hurricane Alley" sees many storms form, needing good ocean and atmospheric data for precise hurricane tracking and strength forecasts. Studying satellite data shows radio occultation provides the most accurate 5-25 km high altitude measurements over oceans. But below 5 km accuracy remains insufficient versus over land. Recent research proposed using deep learning models for sea surface temperature prediction with low errors. Our approach remains unmatched in this area currently. This research has two parts. First, we developed a satellite-informed machine learning system combining limited high-quality ocean data with more available low-quality satellite data. This generated high resolution Atlantic Ocean data with an error of 0.58°C. Second, we implemented a new algorithm on the Gulf of Mexico, successfully modeling complex ocean interactions and hurricane paths with an error of 1.40°C. Overall, this research advances hurricane forecasting by combining different data sources through innovative machine learning techniques. More accurate predictions can help better prepare communities in hurricane-prone regions.
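One simple way to picture the fusion of sparse high-quality in-situ readings with dense low-quality satellite values is bias correction: estimate the satellite offset at collocated points and subtract it from the full satellite field. The numbers are hypothetical and a constant offset is far cruder than the learned models in the thesis; this only sketches the data-combination idea.

```python
# Sketch of combining High-Quality/Low-Coverage in-situ SST with
# Low-Quality/High-Coverage satellite SST via a constant bias estimate.
# Temperatures in degrees C; all values invented for illustration.

# (in-situ, satellite) pairs where both sources observe the same location.
collocated = [(26.1, 25.4), (27.0, 26.2), (25.5, 24.9)]
bias = sum(sat - situ for situ, sat in collocated) / len(collocated)

# Dense satellite-only field, corrected toward the in-situ reference.
satellite_field = [25.0, 26.5, 24.8]
corrected = [v - bias for v in satellite_field]
```

On real data the correction varies with location, season, and sensor, which is why a machine learning model that learns the mapping from satellite inputs to in-situ truth can do much better than a single offset.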
8

Radio and satellite tracking and detecting systems for maritime applications

Skoryk, Ivan 15 January 2015 (has links)
Submitted in fulfillment of the requirements of the Doctor of Technology Degree in Information Technology, Durban University of Technology, Durban, South Africa, 2014. / The work described in this thesis summarizes the author's contributions to the design, development and testing of embedded solutions for maritime radio and satellite tracking and detecting systems. In order to provide reliable ship tracking and detecting facilities, Conventional Maritime Radio Communications (CMRC) and Maritime Mobile Satellite Communications (MMSC) systems have to be integrated. In addition, Global Mobile Satellite Communications (GMSC), as a part of Global Communication Satellite Systems (GCSS), has to be integrated with Global Navigation Satellite Systems (GNSS) such as the US GPS or Russian GLONASS systems. The proposed local maritime radio VHF Communication, Navigation and Surveillance (CNS) systems and devices, such as the Radio Automatic Identification System (R-AIS) or VHF Data Link (VDL), Radio Automatic Dependent Surveillance - Broadcast (RADS-B) and GNSS Augmentation VDL-Broadcast (GAVDL-B), are introduced. New technology designs of global satellite CNS maritime equipment and systems, such as Global Ship Tracking (GST) as enhanced Long Range Identification and Tracking (LRIT), Satellite AIS (S-AIS), Satellite Data Link (SDL), Satellite Automatic Dependent Surveillance - Broadcast (SADS-B) and GNSS Augmentation SDL (GASDL), are discussed, and the benefits of these new technologies and solutions for improved Ship Traffic Control (STC) and Management (STM) are explored. Regional maritime CNS solutions via Stratospheric Communication Platforms (SCP) and the tracking of ships at sea via Space Synthetic Aperture Radar (SSAR), Inverse Synthetic Aperture Radar (ISAR) and Ground Synthetic Aperture Radar (GSAR) are described.
Special tracking systems for collision avoidance with enhanced safety and security at sea, including solutions for ships captured by pirates through the aid of MMSC, SCP and radars, are introduced, and the testing methodologies employed to qualify embedded hardware for this environment are presented. During a ship's voyage in good weather conditions, when the navigation devices on the bridge are in order, AIS, LRIT, anti-collision radar and other on-board equipment can be used effectively. However, in very bad weather conditions the surveillance radar and radio HF transceiver sometimes cannot work; only the GPS receiver and L/C-band satellite transceiver may remain operational, while the radio VHF transceiver will have extremely reduced coverage, which is not enough for safe navigation and collision avoidance. Therefore, during those critical circumstances, when the safety of navigation is paramount, the necessary question is not "Where am I?" but "Where are the nearby ships around me?". At this point, the newest techniques and equipment for enhanced STC and STM are needed, such as GST, S-AIS, SDL, SADS-B and GASDL. Terrorists exploit surprise in successful pirate actions worldwide, and security forces are generally unaware of the source of these attacks at sea. In today's information age, terror threats may originate with transnational organizations or exploit the territory of failed, weak or neutral states. Thus, countering piracy by eliminating the terrorists on land is the best solution; however, it might not be feasible, and even if successful it could require many years. The thesis also conducts a general overview of radio and Mobile Satellite Systems (MSS) for ship communication and tracking, including the space platform and orbital mechanics, horizon and geographic satellite coordinates, and the classification of spacecraft by Geostationary Earth Orbit (GEO) and non-GEO orbits.
9

RNN-based Graph Neural Network for Credit Loan Applications leveraging Rejected Customer Cases

Nilsson, Oskar, Lilje, Benjamin January 2023 (has links)
Machine learning plays a vital role in preventing financial losses within the banking industry, yet many state-of-the-art and industry-standard approaches in the field neglect rejected customer cases and the information they hold for detecting similar risk behavior. This thesis explores the possibility of including this information during training and of utilizing transactional history through an LSTM to improve the detection of defaults. The model is structured so that an encoder is first trained with or without rejected customers. Virtual distances are then calculated in the embedding space between the accepted customers. These distances are used to create a graph in which each node contains an LSTM network, and a GCN passes messages between connected nodes. The model is validated using two datasets: a public Taiwanese dataset and a private Swedish one provided by the collaborating company. The Taiwanese dataset comprised 8000 data points with a 50/50 label split; the Swedish dataset comprised 4644 with the same split. Multiple metrics were used to assess the impact of the rejected customers and of using time-series data instead of static features. For the encoder, reconstruction error was used to measure the difference in performance. When creating the edges, the homogeneity of the neighborhoods and whether a node had a majority of neighbors with its own label were determining factors; for the classifier, accuracy, F1-score, and the confusion matrix were used to compare results. The results show that the impact of rejected customers on predictive power is minor. Regarding the effect of using time-series information instead of static features, we saw results comparable to XGBoost on the Taiwanese dataset and improved predictive power on the Swedish dataset. The results also show that a well-defined virtual distance is critical to the classifier's performance.
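The "virtual distances in the embedding space" step above amounts to computing pairwise distances between customer embeddings and connecting each customer to its nearest neighbours. A tiny sketch with invented 2-D embeddings and k = 1 (the thesis's embeddings come from a trained encoder and the distance definition matters, as its results stress):

```python
import math

# Build graph edges from distances in an embedding space: connect each
# customer to its k nearest neighbours. Embeddings are hypothetical.

embeddings = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (5.0, 5.0)}

def knn_edges(emb, k=1):
    """Undirected edge set linking every node to its k nearest neighbours."""
    edges = set()
    for u, pu in emb.items():
        dists = sorted((math.dist(pu, pv), v)
                       for v, pv in emb.items() if v != u)
        for _, v in dists[:k]:
            edges.add(tuple(sorted((u, v))))   # store edges undirected
    return edges

edges = knn_edges(embeddings)
```

Because the GCN then passes messages along exactly these edges, a poorly chosen distance places dissimilar customers in the same neighborhood and degrades the classifier, which matches the thesis's finding that the virtual distance definition is critical.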
10

Deep Learning Framework for Trajectory Prediction and In-time Prognostics in the Terminal Airspace

Varun S Sudarsanan (13889826) 06 October 2022 (has links)
Terminal airspace around an airport is the biggest bottleneck for commercial operations in the National Airspace System (NAS). In order to prognosticate the safety status of the terminal airspace, effective prediction of the airspace evolution is necessary. While there are fixed procedural structures for managing operations at an airport, the confluence of a large number of aircraft and the complex interactions between pilots and air traffic controllers make it challenging to predict its evolution. Modeling the high-dimensional spatio-temporal interactions in the airspace under different environmental and infrastructural constraints is necessary for effective prediction of the future aircraft trajectories that characterize the airspace state at any given moment. A novel deep learning architecture using Graph Neural Networks is proposed to predict aircraft trajectories 10 minutes into the future and to estimate prognostic metrics for the airspace. The uncertainty in the future is quantified by predicting distributions of future trajectories instead of point estimates. The framework's viability for trajectory prediction and prognosis is demonstrated with terminal airspace data from Dallas Fort Worth International Airport (DFW).
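Predicting a distribution rather than a point estimate, as the abstract describes, means each forecast carries its own uncertainty, and it can be scored accordingly. A minimal sketch: treat one predicted coordinate as a Gaussian with mean and standard deviation, and score it with the negative log-likelihood. The numbers are illustrative, not the thesis's model outputs.

```python
import math

def gaussian_nll(mean, sigma, observed):
    """Negative log-likelihood of an observation under a Gaussian forecast."""
    return (0.5 * math.log(2 * math.pi * sigma ** 2)
            + (observed - mean) ** 2 / (2 * sigma ** 2))

# Two forecasts of the same future position: one sharp, one diffuse.
sharp   = gaussian_nll(mean=10.0, sigma=0.5, observed=10.2)
diffuse = gaussian_nll(mean=10.0, sigma=5.0, observed=10.2)
```

A sharp forecast scores better when it is right but is penalized heavily when it is wrong, so a model trained on this loss learns to widen its predicted distribution exactly where the airspace evolution is genuinely uncertain.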
