  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Robust Representation Learning for Out-of-Distribution Extrapolation in Relational Data

Yangze Zhou (18369795) 17 April 2024 (has links)
<p dir="ltr">Recent advancements in representation learning have significantly enhanced the analysis of relational data across various domains, including social networks, bioinformatics, and recommendation systems. In general, these methods assume that the training and test datasets come from the same distribution, an assumption that often fails in real-world scenarios due to evolving data, privacy constraints, and limited resources. The task of out-of-distribution (OOD) extrapolation emerges when the distribution of test data differs from that of the training data, presenting a significant, yet unresolved challenge within the field. This dissertation focuses on developing robust representations for effective OOD extrapolation, specifically targeting relational data types like graphs and sets. For successful OOD extrapolation, it is essential to first acquire a representation that is adequately expressive for tasks within the distribution. In the first work, we introduce Set Twister, a permutation-invariant set representation that generalizes and enhances the theoretical expressiveness of DeepSets, a simple and widely used permutation-invariant representation for set data, allowing it to capture higher-order dependencies. We showcase its implementation simplicity and computational efficiency, as well as its competitive performance compared with more complex state-of-the-art graph representations in several graph node classification tasks. Secondly, we address OOD scenarios in graph classification and link prediction tasks, particularly when faced with varying graph sizes. Under causal model assumptions, we derive approximately invariant graph representations that improve extrapolation in the OOD graph classification task.
Furthermore, we provide the first theoretical study of the capability of graph neural networks for inductive OOD link prediction and present a novel representation model that produces structural pairwise embeddings, maintaining predictive accuracy for OOD link prediction as the test graph size increases. Finally, we investigate the impact of environmental data as a confounder between input and target variables, proposing a novel approach utilizing an auxiliary dataset to mitigate distribution shifts. This comprehensive study not only advances our understanding of representation learning in OOD contexts but also highlights potential pathways for future research in enhancing model robustness across diverse applications.</p>
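For readers unfamiliar with the DeepSets representation that Set Twister generalizes, the idea can be sketched in a few lines: each set element is embedded by a shared function, the embeddings are sum-pooled (which makes the result independent of element order), and a readout function maps the pooled vector to the output. The `phi` and `rho` below are arbitrary toy stand-ins, not the networks used in the dissertation.

```python
# Minimal DeepSets-style permutation-invariant sketch. The per-element
# map phi and readout rho are hypothetical stand-ins for illustration.
def phi(x):
    # Per-element embedding: a toy 2-dimensional feature map.
    return (x, x * x)

def rho(pooled):
    # Readout: map the pooled embedding to a scalar output.
    return pooled[0] + 0.5 * pooled[1]

def deep_sets(elements):
    # Sum-pooling over element embeddings makes the result
    # invariant to any permutation of the input set.
    pooled = [sum(component) for component in zip(*(phi(x) for x in elements))]
    return rho(pooled)
```

Because the pooling is a sum, `deep_sets([1, 2, 3])` and `deep_sets([3, 1, 2])` are identical, which is the invariance property the abstract refers to.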
22

Functional distributional semantics : learning linguistically informed representations from a precisely annotated corpus

Emerson, Guy Edward Toh January 2018 (has links)
The aim of distributional semantics is to design computational techniques that can automatically learn the meanings of words from a body of text. The twin challenges are: how do we represent meaning, and how do we learn these representations? The current state of the art is to represent meanings as vectors, but vectors do not correspond to any traditional notion of meaning. In particular, there is no way to talk about 'truth', a crucial concept in logic and formal semantics. In this thesis, I develop a framework for distributional semantics which answers this challenge. The meaning of a word is not represented as a vector, but as a 'function', mapping entities (objects in the world) to probabilities of truth (the probability that the word is true of the entity). Such a function can be interpreted both in the machine learning sense of a classifier, and in the formal semantic sense of a truth-conditional function. This simultaneously allows both the use of machine learning techniques to exploit large datasets, and also the use of formal semantic techniques to manipulate the learnt representations. I define a probabilistic graphical model, which incorporates a probabilistic generalisation of model theory (allowing a strong connection with formal semantics), and which generates semantic dependency graphs (allowing it to be trained on a corpus). This graphical model provides a natural way to model logical inference, semantic composition, and context-dependent meanings, where Bayesian inference plays a crucial role. I demonstrate the feasibility of this approach by training a model on WikiWoods, a parsed version of the English Wikipedia, and evaluating it on three tasks. The results indicate that the model can learn information not captured by vector space models.
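The central idea, a word meaning represented not as a vector but as a function mapping entities to probabilities of truth, can be sketched as a simple probabilistic classifier. The entity features and weights below are invented purely for illustration; they are not the thesis's trained model.

```python
import math

# A word meaning as a truth-conditional classifier: it maps an entity's
# feature vector to the probability that the word is true of that entity.
# The weights and features are hypothetical, for illustration only.
def make_meaning(weights, bias):
    def probability_of_truth(entity_features):
        score = sum(w * f for w, f in zip(weights, entity_features)) + bias
        return 1.0 / (1.0 + math.exp(-score))  # logistic squashing to [0, 1]
    return probability_of_truth

# "red" as a function over hypothetical (redness, roundness) features.
red = make_meaning(weights=[4.0, 0.0], bias=-2.0)
```

A very red entity such as `(0.9, 0.1)` gets a high probability of the word "red" being true of it, while `(0.1, 0.9)` gets a low one, which is the classifier reading of a truth-conditional function described in the abstract.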
23

Monolith to microservices using deep learning-based community detection / Monolit till mikrotjänster med hjälp av djupinlärningsbaserad klusterdetektion

Bothin, Anton January 2023 (has links)
The microservice architecture is widely considered to be best practice. Yet, many companies still operate monolithic systems, largely because updating a system's architecture is a difficult process. The first step in this process is to identify microservices within a monolith. Here, artificial intelligence could be a useful tool for automating microservice identification. The aim of this thesis was to propose a deep learning-based model for the task of microservice identification and to compare it to previously proposed approaches, with the goal of helping companies move towards a microservice-based architecture. In particular, the thesis evaluated whether the more complex nature of newer deep learning-based techniques can be utilized to identify better microservices. The proposed model is based on overlapping community detection, where each identified community is considered a microservice candidate. The model was evaluated with respect to cohesion, modularity, and size. Results indicate that the proposed deep learning-based model performs similarly to other state-of-the-art approaches for microservice identification. The results suggest that deep learning helps in finding nontrivial relations within communities, which overall increases the quality of the identified microservices. From this it can be concluded that deep learning is a promising technique for microservice identification, and that further research is warranted.
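The thesis's model is deep learning-based, but the underlying notion of deriving microservice candidates from communities in a dependency graph can be illustrated with a far simpler baseline: group classes of a (hypothetical) call graph into connected components and treat each group as a candidate. Both the grouping rule and the example graph are stand-ins, not the thesis's method.

```python
# Simplistic baseline, for illustration only: each connected group of
# classes in a hypothetical call graph becomes one microservice candidate.
def microservice_candidates(call_graph):
    # call_graph: {class_name: set of classes it calls}
    neighbours = {c: set(callees) for c, callees in call_graph.items()}
    for caller, callees in call_graph.items():
        for callee in callees:  # treat call edges as undirected links
            neighbours.setdefault(callee, set()).add(caller)
    seen, candidates = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        candidates.append(sorted(component))
    return candidates
```

A real detector, like the overlapping community detection the thesis uses, would allow a class to belong to several candidates; connected components are only the simplest possible stand-in for that idea.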
24

Characterizing Structure of High Entropy Alloys (HEAs) Using Machine Learning

Reimer, Christoff 13 December 2023 (has links)
The irradiation of crystalline materials in environments such as nuclear reactors leads to the accumulation of micro- and nano-scale defects with a negative impact on material properties such as strength, corrosion resistance, and dimensional stability. Point defects in the crystal lattice, the vacancy and the self-interstitial, form the basis of this damage and are capable of migrating through the lattice to become part of defect clusters and sinks, or to annihilate. Recently, attention has been given to HEAs for fusion and fission components, as some materials of this class have shown resilience to irradiation-induced damage. The ability to predict defect diffusion and accelerate simulations of defect behaviour in HEAs using machine learning (ML) techniques is consequently a subject that has gathered significant interest. The goal of this work was to produce an unsupervised neural network capable of learning the interatomic dynamics within a specific HEA system from molecular dynamics (MD) data, in order to create a kinetic Monte Carlo (KMC)-type predictor of diffusion paths for common point defects such as the vacancy and self-interstitial. Self-interstitial defect states were identified and purified from MD datasets using graph isomorphism, and a proof-of-concept model for the HEA environment was used with several interaction setups to demonstrate the feasibility of training a graph convolutional network (GCN) to predict vacancy defect transition rates in the HEA crystalline environment.
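A KMC-type predictor of the kind the abstract describes repeatedly selects a defect transition with probability proportional to its rate and advances the clock by an exponentially distributed residence time. The sketch below shows one such step with invented transition rates; in the thesis the rates would come from the trained GCN rather than being hard-coded.

```python
import math
import random

# One kinetic-Monte-Carlo step: pick a defect transition with probability
# proportional to its rate, then draw an exponential residence time.
# The transition list and rates are hypothetical, for illustration only.
def kmc_step(transitions, rng):
    # transitions: list of (destination_site, rate) pairs
    total_rate = sum(rate for _, rate in transitions)
    pick = rng.random() * total_rate
    cumulative = 0.0
    chosen = transitions[-1][0]  # fallback guards against float rounding
    for destination, rate in transitions:
        cumulative += rate
        if pick <= cumulative:
            chosen = destination
            break
    dt = -math.log(rng.random()) / total_rate  # exponential waiting time
    return chosen, dt
```

Iterating `kmc_step` from site to site traces out a diffusion path, which is the object the thesis's predictor aims to produce for vacancies and self-interstitials.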
25

Reducing Power Consumption For Signal Computation in Radio Access Networks : Optimization With Linear Programming and Graph Attention Networks / Reducering av energiförbrukning för signalberäkning i radioaccessnätverk : Optimering med linjär programmering och graf uppmärksamhets nätverk

Nordberg, Martin January 2023 (has links)
There is ever-increasing usage of mobile data: global mobile data traffic, including fixed wireless access, reached 115 exabytes per month at the end of 2022 and is projected to grow to 453 exabytes per month by the end of 2028, according to Ericsson's 2022 mobile data traffic outlook report. To meet the increasing demand, the radio access networks (RANs) used for mobile communication are continuously being improved, with the current generation enabling greater virtualization of the network through the Cloud RAN (C-RAN) architecture. This facilitates the use of commercial off-the-shelf (COTS) servers in the network, replacing specialized hardware servers and making it easier to scale network capacity up or down with traffic demand. This thesis looks at how we can efficiently identify the servers needed to meet traffic demand in a network consisting of both COTS servers and specialized hardware servers while trying to reduce the energy consumption of the network. We model the problem as a network where the antennas and radio heads are connected to the core network through a C-RAN and a specialized hardware layer. The network is then represented using a graph where the nodes represent servers. Using this problem model as a base, we generate problem instances with varying topologies, server profiles, and traffic demands. To find out how the traffic should be passed through the network we test two different methods: a mixed integer linear programming (MILP) method focused on energy minimization, and a graph attention network (GAT) predictor combined with the energy-minimization MILP. To help evaluate the results we also create three other methods: a MILP model that tries to spread the traffic as evenly as possible, a random predictor combined with the energy-minimization MILP, and a greedy method.
Our results show that the energy-minimization MILP method can be used to create optimal solutions, but it suffers from slow computation times compared to the other methods. The GAT model shows promising results in predicting which servers should be included in a network, making it possible to reduce the problem size and solve it faster with MILP. The mean energy cost of the solutions created using the combined GAT/MILP method was 4% higher than using MILP alone, but the time gain was substantial for problems of similar size to those the GAT was trained on: the combined GAT/MILP method was 85% faster than using only MILP. For networks of almost double the size of those the GAT model was trained on, the solutions of the combined GAT/MILP method had a mean energy cost increase of 7% while still showing a strong speedup, being 93% faster than when only using MILP.
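The thesis's energy-minimization MILP is not reproduced here, but its objective can be conveyed on a toy instance: choose the cheapest subset of servers whose combined capacity covers the traffic demand. The brute-force search and the server data below are invented for illustration; a real MILP solver scales far beyond this enumeration.

```python
from itertools import combinations

# Toy stand-in for the energy-minimisation objective: enumerate server
# subsets and keep the cheapest whose capacity covers the demand.
# Server capacities and energy costs are invented for illustration.
def cheapest_cover(servers, demand):
    # servers: {name: (capacity, energy_cost)}
    best = None
    names = sorted(servers)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            capacity = sum(servers[s][0] for s in subset)
            cost = sum(servers[s][1] for s in subset)
            if capacity >= demand and (best is None or cost < best[0]):
                best = (cost, subset)
    return best  # (total energy cost, chosen servers) or None
```

Even in this toy form the trade-off the thesis studies is visible: two small COTS servers can beat one large specialized server on energy cost while still covering demand.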
26

Causal discovery in conditional stationary time-series data : Towards causal discovery in videos / Kausal upptäckt för villkorad stationär tidsseriedata : Mot kausal upptäckt i videor

Balsells Rodas, Carles January 2021 (has links)
Performing causal reasoning in a scene is an inherent mechanism in human cognition; however, the majority of approaches in the causality literature aiming at this task still consider constrained scenarios, such as simple physical systems or stationary time-series data. In this work we aim for causal discovery in videos of realistic scenarios. Our motivation for causal discovery comes from this task being core to human cognition. Moreover, we interpret the scene as a composition of time-series that interact along the sequence, and aim to model the non-stationary behaviors in a scene. We propose State-dependent Causal Inference (SDCI) for causal discovery in conditionally stationary time-series data. We formulate our problem of causal analysis by considering that the stationarity of the time-series is conditioned on a categorical variable, which we call the state. Results show that the proposed probabilistic implementation achieves outstanding results in identifying causal relations on simulated data. When the state is independent of the dynamics, our method maintains decent edge-type identification accuracy, achieving 74.87% test accuracy with a total of 8 states. Furthermore, our method correctly handles regimes where the state variable undergoes complex transitions and is dependent on the dynamics of the scene, achieving 79.21% accuracy in identifying the causal interactions. We consider this work to be an important contribution towards causal discovery in videos.
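The conditional-stationarity assumption, fixed dynamics given a categorical state, can be illustrated with a toy process whose coefficient switches with the state. The states and coefficients below are invented; the thesis works with far richer interacting time-series.

```python
# Toy conditionally stationary series: the dynamics coefficient is fixed
# *given* the categorical state, so behaviour switches with the state.
# States and coefficients here are invented for illustration.
def simulate(states, coefficients, x0=1.0):
    x, trajectory = x0, []
    for s in states:
        x = coefficients[s] * x  # dynamics conditioned on state s
        trajectory.append(x)
    return trajectory
```

Within any run of a single state the series behaves stationarily; state changes produce the non-stationary behaviour that SDCI is designed to handle.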
27

Deep Learning Framework for Trajectory Prediction and In-time Prognostics in the Terminal Airspace

Varun S Sudarsanan (13889826) 06 October 2022 (has links)
<p>Terminal airspace around an airport is the biggest bottleneck for commercial operations in the National Airspace System (NAS). In order to prognosticate the safety status of the terminal airspace, effective prediction of the airspace evolution is necessary. While there are fixed procedural structures for managing operations at an airport, the confluence of a large number of aircraft and the complex interactions between the pilots and air traffic controllers make it challenging to predict its evolution. Modeling the high-dimensional spatio-temporal interactions in the airspace given different environmental and infrastructural constraints is necessary for effective predictions of future aircraft trajectories that characterize the airspace state at any given moment. A novel deep learning architecture using Graph Neural Networks is proposed to predict trajectories of aircraft 10 minutes into the future and estimate prognostic metrics for the airspace. The uncertainty in the future is quantified by predicting distributions of future trajectories instead of point estimates. The framework’s viability for trajectory prediction and prognosis is demonstrated with terminal airspace data from Dallas Fort Worth International Airport (DFW).</p>
28

Improving the Performance of Clinical Prediction Tasks by Using Structured and Unstructured Data Combined with a Patient Network

Nouri Golmaei, Sara 08 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / With the increasing availability of Electronic Health Records (EHRs) and advances in deep learning techniques, developing deep predictive models that use EHR data to solve healthcare problems has gained momentum in recent years. The majority of clinical predictive models benefit from structured data in EHRs (e.g., lab measurements and medications). Still, learning clinical outcomes from all possible information sources is one of the main challenges when building predictive models. This work focuses mainly on two sources of information that have been underused by researchers: unstructured data (e.g., clinical notes) and patient networks. We propose a novel hybrid deep learning model, DeepNote-GNN, that integrates clinical notes information and patient network topological structure to improve 30-day hospital readmission prediction. DeepNote-GNN is a robust deep learning framework consisting of two modules: DeepNote and the patient network. DeepNote extracts deep representations of clinical notes using a feature aggregation unit on top of a state-of-the-art Natural Language Processing (NLP) technique, BERT. By exploiting these deep representations, a patient network is built, and a Graph Neural Network (GNN) is used to train the network for hospital readmission predictions. Performance evaluation on the MIMIC-III dataset demonstrates that DeepNote-GNN achieves superior results compared to the state-of-the-art baselines on the 30-day hospital readmission task. We extensively analyze the DeepNote-GNN model to illustrate the effectiveness and contribution of each of its components. The model analysis shows that the patient network contributes significantly to the overall performance, and that DeepNote-GNN is robust and can consistently perform well on the 30-day readmission prediction task.
To evaluate the generalization of the DeepNote and patient network modules to new prediction tasks, we create a multimodal model and train it on structured and unstructured data from the MIMIC-III dataset to predict patient mortality and Length of Stay (LOS). Our proposed multimodal model consists of four components: DeepNote, the patient network, DeepTemporal, and score aggregation. While DeepNote keeps its functionality and extracts representations of clinical notes, we build a DeepTemporal module using a fully connected layer stacked on top of a one-layer Gated Recurrent Unit (GRU) to extract deep representations of temporal signals. Independently of DeepTemporal, we extract feature vectors of temporal signals and use them to build a patient network. Finally, the DeepNote, DeepTemporal, and patient network scores are linearly aggregated to fit the multimodal model on downstream prediction tasks. Our results are very competitive with the baseline model. The multimodal model analysis reveals that unstructured text data are more helpful for prediction than temporal signals. Moreover, there is no limitation to applying a patient network to structured data. In comparison to the other modules, the patient network makes a more significant contribution to the prediction tasks. We believe that our efforts in this work have opened up a new study area that can be used to enhance the performance of clinical predictive models.
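The final score-aggregation component described above is a linear combination of the three module scores. A minimal sketch follows; the weights are hypothetical placeholders, since in the multimodal model they are fitted on the downstream task rather than fixed.

```python
# Linear aggregation of the DeepNote, DeepTemporal, and patient-network
# scores into one prediction score. The weights are hypothetical; in the
# described model they would be learned on the downstream task.
def aggregate(deepnote, deeptemporal, network, weights=(0.5, 0.2, 0.3)):
    scores = (deepnote, deeptemporal, network)
    return sum(w * s for w, s in zip(weights, scores))
```

Keeping the aggregation linear makes each module's contribution directly readable from its weight, which matches the component-wise analysis the abstract reports.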
29

CondBEHRT: A Conditional Probability Based Transformer for Modeling Medical Ontology

Lerjebo, Linus, Hägglund, Johannes January 2022 (has links)
In recent years the number of electronic healthcare records (EHRs) has increased rapidly. An EHR represents a systematized collection of patient health information in a digital format. EHR systems maintain the diagnoses, medications, procedures, and lab tests associated with patients each time they visit a hospital or care center. Since this information is available across multiple visits, EHRs can be used to increase the quality of care. This is especially useful when working with chronic diseases, because they tend to evolve. Many deep learning methods make use of these EHRs to solve different prediction tasks. Transformers have shown impressive results in many sequence-to-sequence tasks within natural language processing. This paper focuses on using transformers, specifically using a sequence of visits for prediction tasks. The model presented in this paper is called CondBEHRT. Compared to previous state-of-the-art models, CondBEHRT focuses on using as much of the available data as possible to understand the patient's trajectory. Based on all patients, the model learns the medical ontology between diagnoses, medications, and procedures. The results show that the inferred medical ontology can simulate reality quite well. Having the medical ontology also gives insights into the explainability of model decisions. We also compare the proposed model with state-of-the-art methods using two different use cases: predicting the codes given in the next visit, and predicting whether the patient will be readmitted within 30 days.
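A conditional-probability view of a medical ontology, in the spirit of (though far simpler than) what CondBEHRT learns, can be sketched by estimating P(medication | diagnosis) from visit-level co-occurrence counts. The visits below are fabricated for illustration.

```python
# Toy conditional-probability "ontology": estimate P(medication | diagnosis)
# from visit-level co-occurrence counts. The visit records are fabricated;
# CondBEHRT learns such relations with a transformer, not by counting.
def conditional_probability(visits, diagnosis, medication):
    with_dx = [v for v in visits if diagnosis in v["diagnoses"]]
    if not with_dx:
        return 0.0
    with_both = [v for v in with_dx if medication in v["medications"]]
    return len(with_both) / len(with_dx)
```

Even this counting baseline illustrates why an inferred ontology aids explainability: a high estimated P(medication | diagnosis) is a directly inspectable statement about how two codes relate.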
30

WEAKLY SUPERVISED CHARACTERIZATION OF DISCOURSES ON SOCIAL AND POLITICAL MOVEMENTS ON ONLINE MEDIA

Shamik Roy (16317636) 14 June 2023 (has links)
<p>Nowadays, an increasing number of people consume, share, and interact with information online. This results in posting and counter-posting on online media by different ideological groups on various polarized topics. Consequently, online media has become the primary platform for political and social influencers to interact directly with citizens and share their perspectives, views, and stances, with the goal of gaining support for their actions, bills, and legislation. Hence, understanding the perspectives and influencing strategies in online media texts is important for individuals to avoid misinformation and to improve trust between the general public and influencers or authoritative figures such as the government.</p> <p><br></p> <p>Automatically understanding the perspectives in online media is difficult because of two major challenges. First, there is no established grammar or mechanism for characterizing perspectives. Recent studies in Natural Language Processing (NLP) have leveraged resources from social science to explain perspectives. For example, Policy Framing and Moral Foundations Theory are used to understand how issues are framed and the moral appeals expressed in texts to gain support. However, these theories often fail to capture the nuances in perspectives and cannot generalize over all topics and events. The research in this dissertation is one of the first studies that adapts social science theories within Natural Language Processing for understanding perspectives to the extent that they can capture differences in ideologies or stances. The second key challenge in understanding perspectives in online media texts is that annotated data for building automatic perspective-detection methods that generalize over the large corpus of online media text on different topics is difficult to obtain.
To tackle this problem, in this dissertation we used weak sources of supervision, such as the social network interactions of users who produce and interact with the messages, weak human interaction, or artificial few-shot data generated by Large Language Models. </p> <p><br></p> <p>Our insight is that various tasks such as identifying perspectives, stances, and sentiments toward entities are interdependent when characterizing online media messages. As a result, we proposed approaches that model these interdependent problems jointly and solve them with structured prediction. Our research findings showed that the messaging choices and perspectives expressed on online media in response to various real-life events, and their prominence and contrast across ideological camps, can be efficiently captured using the methods we developed.</p>
