31
Engineering Coordination Cages With Generative AI / Konstruktion av Koordinationsburar med Generativ AI
Ahmad, Jin, January 2024
Deep learning methods applied to chemistry can speed up the discovery of novel compounds and facilitate the design of highly complex structures that are both valid and have important societal applications. Here, we present a pioneering exploration into the use of Generative Artificial Intelligence (GenAI) to design coordination cages within the field of supramolecular chemistry. Specifically, the study leverages GraphINVENT, a graph-based deep generative model, to facilitate the automated generation of tetrahedral coordination cages. Through a combination of computational tools and cheminformatics, the research aims to extend the capabilities of GenAI, traditionally applied in simpler chemical contexts, to the complex and nuanced arena of coordination cages. The approach involves a variety of training strategies, including initial pre-training on a large dataset (GDB-13) followed by transfer learning targeted at generating specific coordination cage structures. Data augmentation techniques were also applied to enrich training but did not yield successful outcomes. Several other strategies were employed, including focusing on single metal ion structures to enhance model familiarity with Fe-based cages and extending training datasets with diverse molecular examples from the ChEMBL database. Despite these strategies, the models struggled to capture the complex interactions required for successful cage generation, indicating potential limitations in both the diversity of the training datasets and the model's architectural capacity to handle the intricate chemistry of coordination cages. However, training on the organic ligands (linkers) yielded successful results, emphasizing the benefits of focusing on smaller building blocks. The lessons learned from this project are substantial. Firstly, the knowledge acquired about generative models and the complex world of supramolecular chemistry has provided a unique opportunity to understand the challenges and possibilities of applying GenAI to such a complicated field. Secondly, the results obtained in this project have highlighted the need for further refinement of data handling and model training techniques, paving the way for more advanced applications in the future. Finally, this project has not only deepened our understanding of the capabilities and limitations of GenAI for coordination cages, but also set a foundation for future research that could eventually lead to breakthroughs in designing novel cage structures. Further study could concentrate on learning from the linkers in future data-driven cage design projects.
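As background for the pre-train/transfer-learn strategy described in this abstract, here is a minimal PyTorch sketch of that general pattern. The model and datasets are random stand-ins, not GraphINVENT's actual graph-generation pipeline; only the broad-pretraining-then-low-rate-fine-tuning structure reflects the abstract.

```python
# Minimal sketch of the pre-train -> transfer-learn pattern described above.
# The model and synthetic "datasets" are illustrative stand-ins; GraphINVENT's
# real training operates on molecular graphs, not random tensors.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

def train(model, data, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# stand-ins for the large general dataset (GDB-13) and the small target set (linkers)
pretrain = [(torch.randn(16, 32), torch.randint(0, 10, (16,))) for _ in range(100)]
finetune = [(torch.randn(16, 32), torch.randint(0, 10, (16,))) for _ in range(10)]

train(model, pretrain, epochs=5, lr=1e-3)   # broad pre-training
train(model, finetune, epochs=20, lr=1e-4)  # transfer learning at a lower rate
```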
32
Overcoming the challenges in geometric deep learning via hybrid graph neural network architectures
Wenkel, Jan Frederik, 11 1900
Technological advances have enabled us to gather and store data from various modalities such as text, audio, image or video at unprecedented scale. Deep learning is the signature tool that allows us to understand and leverage such massive data collections, enabling us to engage in challenging new endeavors. The capabilities range from predictive tasks such as sentiment analysis in text, music and image classification, image segmentation or action recognition in video to generative tasks like generation of text, images, music and even entire videos. The success of deep learning in many applications is largely attributed to the ability of commonly used neural network architectures to leverage the intrinsic structure of the data. Image processing tasks, for example, gave rise to convolutional neural networks that rely on spatial organization of pixels, while time series analysis gave rise to recurrent neural networks that leverage temporal organization in their information processing via feedback loops and memory mechanisms.
While these modalities largely reside in relatively well-behaved and often highly regular domains like Euclidean spaces, further modalities that possess more abstract structure have recently attracted much attention. Data from social networks, search engines, small molecules or proteins is naturally represented by graphs, and so-called geometric deep learning (GDL) has made great strides towards generalizing the design of structure-aware neural networks to such non-Euclidean domains. Among the most promising methods are graph neural networks (GNNs), which generalize the design of convolutional neural networks in vision to the graph domain. Recent advances in GNN design have introduced increasingly powerful methods for various applications, such as social network analysis, molecular predictive modeling or molecule generation. However, graph representation learning is limited by several fundamental challenges that originate from the central GNN paradigm of message passing, that is, the repeated averaging of node-level information across node neighborhoods. As a result, local node-level representations either become too similar from excessive averaging, or otherwise the receptive fields of the models are so small that information cannot be shared between distant nodes, creating a complex trade-off between so-called oversmoothing and underreaching.
This dissertation presents a principled way of tackling these challenges by first deepening our understanding of the relevant data and identifying the structural properties that allow for effective graph representation learning. We consequently develop a theoretical framework rooted in graph signal processing that allows us to design powerful novel GNN architectures that provably leverage those properties while alleviating common challenges. We find that hybrid models that combine existing methods with novel GNN principles are particularly powerful. We provide theoretical guarantees that establish the expressive power of the proposed architectures and present an extensive empirical analysis that demonstrates the efficacy of these novel architectures in various applications such as social networks, biochemistry and combinatorial optimization.
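The message-passing paradigm and its oversmoothing failure mode, as described in this abstract, can be illustrated with plain mean aggregation (the hybrid architectures the thesis proposes are not shown here). A minimal NumPy sketch:

```python
# Sketch of the message-passing paradigm described above: repeated neighborhood
# averaging. Running many rounds shows node features converging ("oversmoothing").
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)       # toy 4-node graph
A_hat = A + np.eye(4)                            # add self-loops
P = A_hat / A_hat.sum(axis=1, keepdims=True)     # row-normalized averaging operator

X = np.random.rand(4, 3)                         # random node features
for layer in range(20):
    X = P @ X                                    # one round of mean aggregation
    spread = X.std(axis=0).mean()                # how different nodes still are
    if layer % 5 == 0:
        print(f"layer {layer:2d}: feature spread = {spread:.4f}")
# the spread shrinks toward 0: all node representations become nearly identical
```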
33
Software Fault Detection in Telecom Networks using Bi-level Federated Graph Neural Networks / Upptäckt av SW-fel i telekommunikationsnätverk med hjälp av federerade grafiska neurala nätverk på två nivåer
Bourgerie, Rémi, January 2023
The increasing complexity of telecom networks, induced by the recent development of 5G, is a challenge for detecting faults in the telecom network. In addition to the structural complexity of telecommunication systems, data accessibility has become an issue both in terms of privacy and access cost. We propose a method relying on bi-level Federated Graph Neural Networks to identify anomalies in the telecom network while ensuring reduced communication costs as well as data privacy. Our method considers telecom data as a bi-level graph, where the highest-level graph represents the interaction between sites, and each site is further expanded to its software (SW) performance behaviour graph. We developed and compared 4G/5G SW fault detection models under three settings: (1) Centralized Temporal Graph Neural Networks model: we propose a model to detect anomalies in 4G/5G telecom data. (2) Federated Temporal Graph Neural Networks model: we propose Federated Learning (FL) as a mechanism for privacy-aware training of models for fault detection. In contrast to centralized learning, locally trained models are aggregated on the server side and sent back to the clients without data leaking between the clients and the server, ensuring privacy-preserving collaborative training. (3) Personalized Federated Temporal Graph Neural Networks model: we propose a novel aggregation technique, referred to as FedGraph, leveraging both a graph and the similarities between sites to aggregate the models and produce models more personalized to each site's behaviour. We compare the benefits of the Federated Learning models (2) and (3) with centralized training (1) in terms of SW performance data modelling, anomaly detection, and communication cost. The evaluation includes both a scenario with normally functioning sites and a scenario where only a subset of sites exhibit faulty behaviour. The combination of SW execution graphs with GNNs has shown improved modelling performance and minor gains in centralized settings (1). In a normal network context, the FL models (2) and (3) perform comparably to centralized training (CL), with slight improvements observed when using the personalized strategy (3). However, in abnormal network scenarios, Federated Learning falls short of achieving detection performance comparable to centralized training. This is due to the unintended learning of abnormal site behaviour, particularly when employing the personalized model (3). These findings highlight the importance of carefully assessing and selecting suitable FL strategies for anomaly detection and model training on telecom network data.
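A sketch of similarity-weighted model aggregation in the spirit of the FedGraph strategy described above; the thesis's exact weighting scheme is not given in the abstract, so the computation below is illustrative:

```python
# Illustrative sketch of similarity-weighted federated aggregation: each client i
# receives a personalized average of all client models, weighted by a
# site-similarity matrix S. The actual FedGraph scheme may differ.
import numpy as np

def personalized_aggregate(client_weights, S):
    """client_weights: list of flat parameter vectors; S: similarity matrix."""
    W = np.stack(client_weights)             # (n_clients, n_params)
    S = S / S.sum(axis=1, keepdims=True)     # normalize rows to sum to 1
    return S @ W                             # row i = model sent back to client i

clients = [np.random.randn(5) for _ in range(3)]
S = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])              # e.g. derived from site behaviour
personalized = personalized_aggregate(clients, S)
print(personalized.shape)                    # (3, 5): one model per site
```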
34
SOLVING PREDICTION PROBLEMS FROM TEMPORAL EVENT DATA ON NETWORKS
Hao Sha (11048391), 06 August 2021
Many complex processes can be viewed as sequential events on a network. In this thesis, we study the interplay between a network and the event sequences on it. We first focus on predicting events on a known network. Examples include: modeling retweet cascades, forecasting earthquakes, and tracing the source of a pandemic. Specifically, given the network structure, we solve two types of problems - (1) forecasting future events based on the historical events, and (2) identifying the initial event(s) based on some later observations of the dynamics. The inverse problem of inferring the unknown network topology or links, based on the events, is also of great importance. Examples along this line include: constructing influence networks among Twitter users from their tweets, soliciting new members to join an event based on their participation history, and recommending positions for job seekers according to their work experience. Following this direction, we study two types of problems - (1) recovering influence networks, and (2) predicting links between a node and a group of nodes, from event sequences.
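One common formalism for sequential events on a network of the kind this thesis studies is a network Hawkes process, in which past events at neighbouring nodes raise a node's event intensity. The sketch below illustrates that generic formalism, not the thesis's specific models:

```python
# Generic network Hawkes intensity: lambda_i(t) = mu_i + sum over past events
# (t_k, j) of alpha * A[j, i] * exp(-beta * (t - t_k)). Illustrative only.
import numpy as np

def intensity(i, t, events, A, mu=0.1, alpha=0.5, beta=1.0):
    """events: list of (time, node) pairs; only events with time < t count."""
    lam = mu
    for t_k, j in events:
        if t_k < t:
            lam += alpha * A[j, i] * np.exp(-beta * (t - t_k))
    return lam

A = np.array([[0, 1], [1, 0]], dtype=float)   # two mutually exciting nodes
events = [(0.2, 0), (0.5, 1), (0.9, 0)]
print(intensity(1, 1.0, events, A))           # node 1's event rate at t = 1.0
```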
35
Aplikace metody učení bez učitele na hledání podobných grafů / Application of Unsupervised Learning Methods in Graph Similarity Search
Sabo, Jozef, January 2021
The goal of this master's thesis, carried out in cooperation with the company Avast, was to design a system that can extract knowledge from a database of graphs. The graphs used for data mining describe the behaviour of computer systems and are anonymously inserted into the company's database from the systems of users of the company's products. Each graph in the database can be assigned one of two labels: clean or malware (malicious). The task of the proposed unsupervised learning system is to find clusters in the graph database in which the classes of graphs do not mix. Clusters containing only one class of graphs can be interpreted as different types of clean or malware graphs, and they are a useful starting point for further analysis. To evaluate the quality of the clusters, a custom metric, named monochromaticity, was designed. It scores clusters by how much the clean and malware graphs mix within them. The best metric scores were obtained when vector representations of the graphs were produced by a deep learning model (a variational graph autoencoder with two relational graph convolution operators) and the parameter-free MeanShift method was used to cluster the vectors.
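The abstract does not give the exact definition of the monochromaticity metric, so the sketch below pairs MeanShift clustering with a simple label-purity proxy as one plausible reading; the random vectors stand in for embeddings from the variational graph autoencoder:

```python
# Illustrative pipeline: cluster graph embeddings with MeanShift and score how
# "monochromatic" (single-label) the clusters are. The metric here is a plain
# label-purity proxy; the thesis's monochromaticity metric may be defined
# differently, and real embeddings would come from a graph autoencoder.
import numpy as np
from sklearn.cluster import MeanShift

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                     # stand-in graph embeddings
labels = rng.integers(0, 2, size=200)              # 0 = clean, 1 = malware

clusters = MeanShift().fit_predict(X)

def purity(clusters, labels):
    total = 0
    for c in np.unique(clusters):
        member_labels = labels[clusters == c]
        total += np.bincount(member_labels).max()  # count of the majority label
    return total / len(labels)

print(f"cluster purity: {purity(clusters, labels):.3f}")
```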
36
Intersecting Graph Representation Learning and Cell Profiling: A Novel Approach to Analyzing Complex Biomedical Data
Chamyani, Nima, January 2023
In recent biomedical research, graph representation learning and cell profiling techniques have emerged as transformative tools for analyzing high-dimensional biological data. The integration of these methods, as investigated in this study, has facilitated an enhanced understanding of complex biological systems, consequently improving drug discovery. The research aimed to decipher connections between chemical structures and cellular phenotypes while incorporating other biological information, such as proteins and pathways, into the workflow. To achieve this, the efficacy of machine learning models was examined on classification and regression tasks. The newly proposed graph-level and bio-graph integrative predictors were compared with traditional models. Results demonstrated their potential, particularly in classification tasks. Moreover, the topology of the COVID-19 BioGraph was analyzed, revealing the complex interconnections between chemicals, proteins, and biological pathways. By combining network analysis, graph representation learning, and statistical methods, the study was able to predict active chemical combinations within inactive compounds, thereby exhibiting significant potential for further investigation. Graph-based generative models were also used for molecule generation, opening up further research avenues in finding lead compounds. In conclusion, this study underlines the potential of combining graph representation learning and cell profiling techniques in advancing biomedical research on drug repurposing and drug combinations. This integration provides a better understanding of complex biological systems, assists in identifying therapeutic targets, and contributes to optimizing molecule generation for drug discovery. Future investigations should optimize these models and validate the drug combination discovery approach. As these techniques continue to evolve, they hold the potential to significantly impact the future of drug screening, drug repurposing, and drug combinations.
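One simple way to "intersect" the two modalities mentioned here is feature-level integration: concatenating chemical-structure embeddings with cell-profile features and training a classifier. The sketch below is illustrative only, with random stand-in features rather than real embeddings or cell-profiling data:

```python
# Minimal sketch of feature-level integration of chemical-structure embeddings
# with cell-profiling features for activity classification; all features here
# are random stand-ins for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
graph_emb = rng.normal(size=(500, 32))   # e.g. from a graph representation model
cell_prof = rng.normal(size=(500, 64))   # e.g. morphological cell-profile features
y = rng.integers(0, 2, size=500)         # active / inactive label

X = np.hstack([graph_emb, cell_prof])    # simple feature-level integration
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.3f}")
```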
37
Learning to compare nodes in branch and bound with graph neural networks
Labassi, Abdel Ghani, 08 1900
In computer science, solving NP-hard problems in a reasonable time is of great importance, such as in supply chain optimization, scheduling, routing, multiple biological sequence alignment, inference in probabilistic graphical models, and even some problems in cryptography. In practice, we model many of them as a mixed integer linear optimization problem, which we solve using the branch and bound framework. An algorithm of this style divides a search space to explore it recursively (branch) and obtains optimality bounds by solving linear relaxations in such sub-spaces (bound). To specify an algorithm, one must set several parameters, such as how to explore search spaces, how to divide a search space once it has been explored, or how to tighten these linear relaxations. These policies can significantly influence resolution performance.
This work focuses on a novel method for deriving a search policy, that is, a rule for selecting the next sub-space to explore given a current partitioning, using deep machine learning. First, we collect data summarizing which subspaces contain the optimum, and which do not. By representing these sub-spaces as bipartite graphs encoding their characteristics, we train a graph neural network to determine the probability that a subspace contains the optimal solution by supervised learning. Such a design is particularly useful because the machine learning model can automatically adapt to problems of different sizes without modification. We show that our approach beats that of our competitors, consisting of simpler machine learning models trained on solver statistics, as well as the default policy of SCIP, a state-of-the-art open-source solver, on three NP-hard benchmarks: generalized independent set, fixed-charge multicommodity network flow, and maximum satisfiability problems.
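The learned node-selection policy amounts to ranking open subproblems by a model's predicted probability of containing the optimum. A minimal sketch with a priority queue follows; the stub scoring function stands in for the thesis's GNN over bipartite-graph encodings:

```python
# Sketch of a learned node-selection policy in branch and bound: open
# subproblems sit in a priority queue ordered by the model's predicted
# probability of containing the optimum. The scoring model here is a stub;
# in the thesis it is a GNN over a bipartite-graph encoding of the subproblem.
import heapq
import random

def score(subproblem):
    return random.random()        # stand-in for GNN(bipartite_graph(subproblem))

frontier = []
for sp in ["root-left", "root-right"]:
    heapq.heappush(frontier, (-score(sp), sp))   # max-heap via negated score

while frontier:
    neg_p, sp = heapq.heappop(frontier)
    print(f"exploring {sp} (predicted p = {-neg_p:.2f})")
    # branching would push child subproblems here, scored the same way
```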
38
[pt] REDES DE GRAFOS SEMÂNTICOS COM ATENÇÃO E DECOMPOSIÇÃO DE TENSORES PARA VISÃO COMPUTACIONAL E COMPUTAÇÃO GRÁFICA / [en] SEMANTIC GRAPH ATTENTION NETWORKS AND TENSOR DECOMPOSITIONS FOR COMPUTER VISION AND COMPUTER GRAPHICS
LUIZ JOSE SCHIRMER SILVA, 02 July 2021
[en] This thesis proposes new architectures for deep neural networks with attention enhancement and multilinear algebra methods to increase their performance. We also explore graph convolutions and their particularities. We focus here on problems related to real-time pose estimation. Pose estimation is a challenging problem in computer vision with many real applications in areas including augmented reality, virtual reality, computer animation, and 3D scene reconstruction. Usually, the problem to be addressed involves estimating the 2D and 3D human pose, i.e., the anatomical keypoints or body parts of persons in images or videos. Several papers propose approaches to achieve high accuracy using architectures based on conventional convolutional neural networks; however, mistakes caused by occlusion and motion blur are not uncommon, and those models are computationally too intensive for real-time applications. We explore different architectures to improve processing time, and, as a result, we propose two novel neural network models for 2D and 3D pose estimation. We also introduce a new architecture for graph attention networks called Semantic Graph Attention.
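For background on the attention mechanisms discussed here, the sketch below computes standard graph-attention coefficients in the style of GAT; the thesis's Semantic Graph Attention is a variant whose exact form the abstract does not give:

```python
# Sketch of standard graph-attention coefficients (GAT-style): a node attends
# over its neighbors via a softmax of LeakyReLU-activated scores. Shown as
# generic background, not the thesis's Semantic Graph Attention variant.
import numpy as np

def gat_attention(h, W, a, neighbors_of_i, i):
    """Attention weights of node i over its neighbors."""
    Wh = h @ W
    scores = []
    for j in neighbors_of_i:
        z = np.concatenate([Wh[i], Wh[j]])           # pair of transformed features
        s = a @ z                                    # raw attention score
        scores.append(np.where(s > 0, s, 0.2 * s))   # LeakyReLU
    scores = np.array(scores, dtype=float)
    e = np.exp(scores - scores.max())                # numerically stable softmax
    return e / e.sum()

h = np.random.rand(4, 8)                # 4 nodes, 8 features each
W = np.random.rand(8, 8)
a = np.random.rand(16)
print(gat_attention(h, W, a, neighbors_of_i=[1, 2, 3], i=0))
```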
39
Improving The Robustness of Artificial Neural Networks via Bayesian Approaches
Jun Zhuang (16456041), 30 August 2023
Artificial neural networks (ANNs) have achieved extraordinary performance in various domains in recent years. However, some studies reveal that ANNs may be vulnerable in three aspects: label scarcity, perturbations, and open-set emerging classes. Noisy labeling and self-supervised learning approaches address the label scarcity issues, but most of this work cannot handle perturbations. Adversarial training methods, topological denoising methods, and mechanism-design methods aim to mitigate the negative effects caused by perturbations. However, adversarial training methods can barely train a robust model under extensive label scarcity; topological denoising methods are not efficient on dynamic data structures; and mechanism-design methods often depend on heuristic exploration. Detection-based methods are devoted to identifying novel or anomalous instances for further downstream tasks. Nonetheless, such instances may belong to open-set new emerging classes. To meet the aforementioned challenges, we address the robustness issues of ANNs from two aspects. First, we propose a series of Bayesian label transition models to improve the robustness of Graph Neural Networks (GNNs) in the presence of label scarcity and perturbations in the graph domain. Second, we propose a new non-exhaustive learning model, named NE-GM-GAN, to handle both open-set problems and class-imbalance issues in network intrusion datasets. Extensive experiments with several datasets demonstrate that our proposed models can effectively improve the robustness of ANNs.
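As an illustration of the Bayesian ingredient behind "label transition", the sketch below estimates a label transition matrix from noisy-label counts with a Dirichlet prior; how the thesis couples such a matrix with GNN training on perturbed graphs is not shown:

```python
# Illustrative sketch: estimating a transition matrix T[i, j] = P(observed
# label j | true label i) from counts, with a Dirichlet prior on each row.
# This shows only the generic Bayesian step, not the thesis's full models.
import numpy as np

def posterior_transition(counts, alpha=1.0):
    """counts[i, j]: times true class i was observed as class j."""
    post = counts + alpha                            # Dirichlet(alpha) prior
    return post / post.sum(axis=1, keepdims=True)    # posterior mean per row

counts = np.array([[90, 10],
                   [25, 75]], dtype=float)   # noisy-label co-occurrence counts
print(posterior_transition(counts))
```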
40
Modelling Cyber Security of Networks as a Reinforcement Learning Problem using Graphs: An Application of Reinforcement Learning to the Meta Attack Language / Cybersäkerhet för datornätverk representerat som ett förstärkningsinlärningsproblem med grafer: Förstärkningsinlärning applicerat på Meta Attack Language
Berglund, Sandor, January 2022
ICT systems are part of the vital infrastructure in today's society. These systems are under constant threat, and efforts are continually being put forth by cyber security experts to protect them. By applying modern AI methods, these efforts can both be improved and relieved of the cost of expert work. This thesis examines whether a reinforcement learning (RL) algorithm can be applied to a cyber security modelling of ICT systems. The research question answered is: how well can an RL algorithm optimise the resource cost of successful cyber attacks, as represented by a cyber security model? The modelling language, called Meta Attack Language (MAL), is a meta-language for attack graphs that details the individual steps to be taken in a cyber attack. Previous work in Manuel Rickli's thesis presented a method of automatically generating attack graphs according to MAL, aimed at modelling industry-level computer networks. That method was used to generate different distributions of attack graphs on which deep Q-learning (DQN) agents were trained. The agents' results were then compared with a random agent and a greedy method based on the A∗ search algorithm. The results show that DQN achieves attack-step selection with higher performance than the uninformed choice of the random agent. However, DQN was unable to achieve higher performance than the A∗ method. This may be due to the simplicity of the attack graph generation or the fact that the A∗ method has access to the complete attack graph, amongst other factors. The thesis also raises questions about the general representation of MAL attack graphs as RL problems and how to apply RL algorithms to the resulting RL problem. The source code of this thesis is available at: https://github.com/KTH-SSAS/sandor-berglund-thesis.
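The DQN agents described above ultimately reduce to Q-value-based action selection over the attack steps currently reachable in the attack graph. A minimal epsilon-greedy sketch follows; the small Q-network is a stub standing in for the attention-based network trained in the thesis:

```python
# Sketch of the epsilon-greedy action selection at the core of a DQN agent:
# pick the attack step with the highest predicted Q-value among those
# currently reachable. The Q-network here is a stub, not the thesis's
# attention-based model, and the 8-dim step encoding is illustrative.
import random
import torch
from torch import nn

q_net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

def select_step(candidate_features, epsilon=0.1):
    """candidate_features: (n_candidates, 8) tensor of attack-step encodings."""
    if random.random() < epsilon:
        return random.randrange(len(candidate_features))   # explore
    with torch.no_grad():
        q = q_net(candidate_features).squeeze(-1)          # predicted Q-values
    return int(q.argmax())                                 # exploit best step

candidates = torch.randn(5, 8)    # 5 reachable attack steps, 8 features each
print(select_step(candidates))
```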