  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Models and Representation Learning Mechanisms for Graph Data

Susheel Suresh (14228138) 15 December 2022 (has links)
Graph representation learning (GRL) has been increasingly used to model and understand data from a wide variety of complex systems spanning social, technological, bio-chemical and physical domains. GRL consists of two main components: (1) a parametrized encoder that provides representations of graph data and (2) a learning process to train the encoder parameters. Designing flexible encoders that capture the underlying invariances and characteristics of graph data is crucial to the success of GRL. On the other hand, the learning process drives the quality of the encoder representations, and developing principled learning mechanisms is vital for a number of growing applications in self-supervised, transfer and federated learning settings. To this end, we propose a suite of models and learning algorithms for GRL which form the two main thrusts of this dissertation.
In Thrust I, we propose two novel encoders which build upon a widely popular GRL encoder class called graph neural networks (GNNs). First, we empirically study the prediction performance of current GNN-based encoders when applied to graphs with heterogeneous node mixing patterns, using our proposed notion of local assortativity. We find that GNN performance in node prediction tasks strongly correlates with our local assortativity metric, thereby introducing a limit. We propose to transform the input graph into a computation graph with proximity and structural information as distinct types of edges. We then propose a novel GNN-based encoder that operates on this computation graph and adaptively chooses between structure and proximity information. Empirically, adopting our transformation and encoder framework leads to improved node classification performance compared to baselines in real-world graphs that exhibit diverse mixing.
Secondly, we study the trade-off between expressivity and efficiency of GNNs when applied to temporal graphs for the task of link ranking. We develop an encoder that incorporates a labeling approach designed to allow for efficient inference over the candidate set jointly, while provably boosting expressivity. We also propose to optimize a list-wise loss for improved ranking. With extensive evaluation on real-world temporal graphs, we demonstrate its improved performance and efficiency compared to baselines.
In Thrust II, we propose two principled encoder learning mechanisms for challenging and realistic graph data settings. First, we consider a scenario where only limited or even no labelled data is available for GRL. Recent research has converged on graph contrastive learning (GCL), where GNNs are trained to maximize the correspondence between representations of the same graph in its different augmented forms. However, we find that GNNs trained by traditional GCL often risk capturing redundant graph features and thus may be brittle and provide sub-par performance in downstream tasks. We then propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during training by optimizing the adversarial graph augmentation strategies used in GCL. We pair AD-GCL with theoretical explanations and design a practical instantiation based on trainable edge-dropping graph augmentation. We experimentally validate AD-GCL by comparing with state-of-the-art GCL methods and achieve performance gains in semi-supervised, unsupervised and transfer learning settings using benchmark chemical and biological molecule datasets.
Secondly, we consider a scenario where graph data is siloed across clients for GRL. We focus on two unique challenges encountered when applying distributed training to GRL: (i) client task heterogeneity and (ii) label scarcity. We propose a novel learning framework called federated self-supervised graph learning (FedSGL), which first utilizes a self-supervised objective to train GNNs in a federated fashion across clients and then has each client fine-tune the obtained GNNs based on its local task and available labels. Our framework enables the federated GNN model to extract patterns from the common feature (attribute and graph topology) space without the need for labels and without being biased by heterogeneous local tasks. An extensive empirical study of FedSGL on both node and graph classification tasks yields fruitful insights into how the level of feature / task heterogeneity, the adopted federated algorithm and the level of label scarcity affect the clients' performance in their tasks.
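The edge-dropping augmentation mentioned in the abstract above can be illustrated with a minimal sketch. It shows only the sampling step that turns a graph into an augmented view, assuming one keep-probability per edge; the adversarial training that would learn those probabilities is not shown. The function name `drop_edges` and the probability values are hypothetical and do not come from the dissertation.

```python
import random


def drop_edges(edges, keep_prob, rng):
    """Return an augmented edge list; edge i survives with probability keep_prob[i]."""
    return [e for e, p in zip(edges, keep_prob) if rng.random() < p]


# A 4-cycle as a toy graph.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Hypothetical fixed probabilities; a trainable augmenter would output these.
keep_prob = [1.0, 0.0, 1.0, 0.0]
view = drop_edges(edges, keep_prob, random.Random(0))
print(view)
```

In a contrastive setup, two such views of the same graph would be encoded by the GNN and their representations pulled together.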
32

Reliable graph predictions : Conformal prediction for Graph Neural Networks

Bååw, Albin January 2022 (has links)
We have seen a rapid increase in the development of deep learning algorithms in recent decades. However, while these algorithms have unlocked new business areas and led to great development in many fields, they are usually limited to Euclidean data. Researchers are increasingly starting to find out that they can better represent the data used in many real-life applications as graphs. Examples include high-risk domains such as finding the side effects when combining medicines using a protein-protein network. In high-risk domains, there is a need for trust and transparency in the results returned by deep learning algorithms. In this work, we explore how we can quantify uncertainty in Graph Neural Network predictions using conventional methods for conformal prediction as well as novel methods exploiting graph connectivity information. We evaluate the methods on both static and dynamic graphs and find that neither of the novel methods offers any clear benefits over the conventional methods. However, we see indications that using the graph connectivity information can lead to more efficient conformal predictors and a lower prediction latency than the conventional methods on large data sets. We propose that future work extend the research on using the connectivity information, specifically the node embeddings, to boost the performance of conformal predictors on graphs. / De senaste årtiondena har vi sett en drastiskt ökad utveckling av djupinlärningsalgoritmer. Även fast dessa algoritmer har skapat nya potentiella affärsområden och har även lett till nya upptäckter i flera andra fält, är dessa algoritmer dessvärre oftast begränsade till Euklidisk data. Samtidigt ser vi att allt fler forskare har upptäckt att data i verklighetstrogna applikationer oftast är bättre representerade i form av grafer. Exempel inkluderar hög-risk domäner som läkemedelsutveckling, där man förutspår bieffekter från mediciner med hjälp av protein-protein nätverk. 
I hög-risk domäner finns det ett krav på tillit och att resultaten från djupinlärningsalgoritmer är transparenta. I den här tesen utforskar vi hur man kan kvantifiera osäkerheten i resultaten hos Neurala Nätverk för grafer (eng. Graph Neural Networks) med hjälp av konform prediktion (eng. Conformal Prediction). Vi testar både konventionella metoder för konform prediktion, samt originella metoder som utnyttjar strukturell information från grafen. Vi utvärderar metoderna både på statiska och dynamiska grafer, och vi kommer fram till att de originella metoderna varken är bättre eller sämre än de konventionella metoderna. Däremot finner vi indikationer på att användning av den strukturella informationen från grafen kan leda till effektivare prediktorer och till lägre svarstid än de konventionella metoderna när de används på stora grafer. Vi föreslår att framtida arbete i området utforskar vidare hur den strukturella informationen kan användas, och framförallt nod representationerna, kan användas för att öka prestandan i konforma prediktorer för grafer.
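As a point of reference for the conventional methods the abstract above refers to, the following is a minimal sketch of split conformal prediction for classification. It assumes a classifier that outputs class probabilities and uses `1 - p(true label)` as the nonconformity score. The calibration data, function names and choice of score are illustrative assumptions, not details taken from the thesis.

```python
import math


def conformal_quantile(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank into the sorted scores
    if rank > n:
        return float("inf")  # too few calibration points: every label is included
    return sorted(cal_scores)[rank - 1]


def prediction_set(probs, qhat):
    """Labels whose nonconformity score 1 - p does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= qhat]


# Hypothetical class probabilities of a node classifier on calibration nodes.
cal_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
cal_labels = [0, 1, 2, 1]
cal_scores = [1.0 - p[y] for p, y in zip(cal_probs, cal_labels)]

qhat = conformal_quantile(cal_scores, alpha=0.25)
test_set = prediction_set([0.5, 0.45, 0.05], qhat)
print(qhat, test_set)
```

Smaller prediction sets at the same coverage level correspond to what the abstract calls a more efficient conformal predictor.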
33

Information Extraction from Invoices using Graph Neural Networks / Utvinning av information från fakturor med hjälp av grafiska neurala nätverk

Tan, Tuoyuan January 2023 (has links)
Information Extraction is a sub-field of Natural Language Processing that aims to extract structured data from unstructured sources. With the progress in digitization, extracting key information like account number, gross amount, etc. from business invoices has become an interesting problem in both industry and academia. Such a process can greatly facilitate online payment, as users do not have to type in key information by themselves. In this project, we design and implement an extraction system that combines Machine Learning and Heuristic Rules to solve the problem. Invoices are transformed into a graph structure and then Graph Neural Networks are used to predict the role of each word appearing on invoices. Rule-based modules output the final extraction results based on aggregated information from predictions. Different variants of graph models are evaluated and the best system achieves a 90.93% correct rate. We also study how the number of stacked graph neural layers influences the performance of the system. The ablation study compares the importance of each extracted feature and results show that the combination of features from different sources, rather than any single feature, plays the key role in the classification. Further experiments reveal the respective contributions of Machine Learning and rule-based modules for each label. / Informationsutvinning är ett delområde inom språkteknologi som syftar till att utvinna strukturerade data från ostrukturerade källor. I takt med den ökande digitaliseringen blir det ett intressant problem för både industrin och akademin att extrahera nyckelinformation som t.ex. kontonummer, bruttobelopp och liknande från affärsfakturor. En sådan process kan i hög grad underlätta onlinebetalningar, eftersom användarna inte behöver skriva in nyckelinformation själva. I det här projektet utformar och implementerar vi ett extraktionssystem som kombinerar maskininlärning och heuristiska regler för att lösa problemet. 
Fakturor kommer att omvandlas till en grafstruktur och sedan används grafiska neurala nätverk för att förutsäga betydelsen av varje ord som förekommer på fakturan. Regelbaserade moduler producerar de slutliga utvinningsresultaten baserat på aggregerad information från förutsägelserna. Olika varianter av grafmodeller utvärderas och det bästa systemet uppnår 90,93 % korrekta resultat. Vi studerar också hur antalet neurala graflager påverkar systemets prestanda. I ablationsstudien jämförs betydelsen av varje extraherat särdrag och resultaten visar att kombinationen av särdrag från olika källor, snarare än något enskilt särdrag, spelar en nyckelroll i klassificeringen. Ytterligare experiment visar hur maskininlärning och regelbaserade moduler på olika sätt bidrar till resultatet.
34

Link Prediction Using Learnable Topology Augmentation / Länkprediktion med hjälp av en inlärningsbar topologiförstärkning

Leatherman, Tori January 2023 (has links)
Link prediction is a crucial task in many downstream applications of graph machine learning. Graph Neural Networks (GNNs) are a prominent approach for transductive link prediction, where the aim is to predict missing links or connections only within the existing nodes of a given graph. However, many real-life applications require inductive link prediction for newly-coming nodes with no connections to the original graph. Thus, recent approaches have adopted a Multilayer Perceptron (MLP) for inductive link prediction based solely on node features. In this work, we show that incorporating both connectivity structure and features for the new nodes provides better model expressiveness. To bring such expressiveness to inductive link prediction, we propose LEAP, an encoder that features LEArnable toPology augmentation of the original graph and enables message passing with the newly-coming nodes. To the best of our knowledge, this is the first attempt to provide structural contexts for the newly-coming nodes via learnable augmentation under inductive settings. Extensive experiments on four real-world homogeneous graphs demonstrate that LEAP significantly surpasses the state-of-the-art methods in terms of AUC and average precision. The improvements over homogeneous graphs are up to 22% and 17%, respectively. The code and datasets are available on GitHub*. / Att förutsäga länkar är en viktig uppgift i många efterföljande tillämpningar av maskininlärning av grafer. Graph Neural Networks (GNNs) är en framträdande metod för transduktiv länkförutsägelse, där målet är att förutsäga saknade länkar eller förbindelser endast inom de befintliga noderna i en given graf. I många verkliga tillämpningar krävs dock induktiv länkförutsägelse för nytillkomna noder utan kopplingar till den ursprungliga grafen. Därför har man på senare tid antagit en Multilayer Perceptron (MLP) för induktiv länkförutsägelse som enbart bygger på nodens egenskaper. 
I det här arbetet visar vi att om man införlivar både anslutningsstruktur och egenskaper för de nya noderna får man en bättre modelluttryck. För att ge induktiv länkförutsägelse en sådan uttrycksfullhet föreslår vi LEAP, en kodare som innehåller LEArnable toPology augmentation av den ursprungliga grafen och möjliggör meddelandeöverföring med de nytillkomna noderna. Såvitt vi vet är detta det första försöket att tillhandahålla strukturella sammanhang för de nytillkomna noderna genom en inlärningsbar ökning i induktiva inställningar. Omfattande experiment på fyra homogena grafer i den verkliga världen visar att LEAP avsevärt överträffar "state-of-the-art" metoderna när det gäller AUC och genomsnittlig precision. Förbättringarna jämfört med homogena grafer är upp till 22% och 17%. Koden och datamängderna finns tillgängliga på Github*.
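The two evaluation metrics named in the abstract above, AUC and average precision, can be computed from scored candidate links as in the sketch below. The scores and labels are made-up illustrative values, and the helper names are hypothetical. Both functions assume at least one positive and one negative example.

```python
def roc_auc(scores, labels):
    """Probability that a random true link outscores a random non-link (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values measured at the rank of each true link."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical link scores: label 1 marks a true link, 0 a sampled non-link.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
auc = roc_auc(scores, labels)
ap = average_precision(scores, labels)
print(auc, ap)
```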
35

The Applicability and Scalability of Graph Neural Networks on Combinatorial Optimization / Tillämpning och Skalbarhet av Grafiska Neurala Nätverk på Kombinatorisk Optimering

Hårderup, Peder January 2023 (has links)
This master's thesis investigates the application of Graph Neural Networks (GNNs) to address scalability challenges in combinatorial optimization, with a primary focus on the minimum Total Dominating Set Problem (TDP) and additionally the related Carrier Scheduling Problem (CSP) in Internet of Things networks. The research identifies the NP-hard nature of these problems as a fundamental challenge and addresses how to improve predictions on input graphs much larger than those seen during the training phase. Further, the thesis explores the instability in such scalability when leveraging GNNs for TDP and CSP. Two primary measures to counter this scalability problem are proposed and tested: incorporating node degree as an additional feature and modifying the attention mechanism in GNNs. Results indicate that these countermeasures show promise in addressing scalability issues in TDP, with node degree inclusion demonstrating overall performance improvements while the modified attention mechanism presents a nuanced outcome with some metrics improved at the cost of others. Application of these methods to CSP yields bleak results, evincing the challenges of scalability in more complex problem domains. The thesis contributes by detecting and addressing scalability challenges in combinatorial optimization using GNNs and provides insights for further research in refining methodologies for real-world applications. / Denna masteruppsats undersöker tillämpningen av Grafiska Neurala Nätverk (GNN) för att hantera utmaningar inom skalbarhet vid kombinatorisk optimering, med ett primärt fokus på minimum Total Dominating set Problem (TDP) samt även det relaterade Carrier Scheduling Problem (CSP) i nätverk inom Internet of Things. Studien identifierar den NP-svåra karaktären av dessa problem som en grundläggande utmaning och lyfter hur man kan förbättra prediktioner på indatagrafer av storlekar som är mycket större än vad man sett under träningsfasen. 
Vidare utforskar uppsatsen instabiliteten i sådan skalbarhet när man utnyttjar GNN för TDP och CSP. Två primära åtgärder mot detta skalbarhetsproblem föreslås och testas: inkorporering av nodgrad som ett extra attribut och modifiering av attention-mekanismer i GNN. Resultaten indikerar att dessa motåtgärder har potential för att angripa skalbarhetsproblem i TDP, där inkludering av nodgrad ger övergripande prestandaförbättringar medan den modifierade attention-mekanismen ger ett mer tvetydigt resultat med vissa mätvärden förbättrade på bekostnad av andra. Tillämpning av dessa metoder på CSP ger svaga resultat, vilket antyder om utmaningarna med skalbarhet i mer komplexa problemdomäner. Uppsatsen bidrar genom att upptäcka och adressera skalbarhetsutmaningar i kombinatorisk optimering med hjälp av GNN och ger insikter för vidare forskning i att förfina metoder för verkliga tillämpningar.
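The first countermeasure in the abstract above, adding node degree as an extra input feature, can be sketched as follows. The sketch assumes an undirected graph given as an edge list and a per-node feature list. The function name and data layout are hypothetical, and the thesis may normalise or encode the degree differently.

```python
def add_degree_feature(num_nodes, edges, features):
    """Append each node's degree to its feature vector (undirected edge list)."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [list(x) + [float(d)] for x, d in zip(features, degree)]


# Path graph 0 - 1 - 2 with one original feature per node.
edges = [(0, 1), (1, 2)]
features = [[1.0], [2.0], [3.0]]
augmented = add_degree_feature(3, edges, features)
print(augmented)
```

Because the degree is computed from the input graph itself, the same preprocessing applies unchanged to graphs larger than those seen in training.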
36

Engineering Coordination Cages With Generative AI / Konstruktion av Koordinationsburar med Generativ AI

Ahmad, Jin January 2024 (has links)
Deep learning methods applied to chemistry can speed the discovery of novel compounds and facilitate the design of highly complex structures that are both valid and have important societal applications. Here, we present a pioneering exploration into the use of Generative Artificial Intelligence (GenAI) to design coordination cages within the field of supramolecular chemistry. Specifically, the study leverages GraphINVENT, a graph-based deep generative model, to facilitate the automated generation of tetrahedral coordination cages. Through a combination of computational tools and cheminformatics, the research aims to extend the capabilities of GenAI, traditionally applied in simpler chemical contexts, to the complex and nuanced arena of coordination cages. The approach involves a variety of training strategies, including initial pre-training on a large dataset (GDB-13) followed by transfer learning targeted at generating specific coordination cage structures. Data augmentation techniques were also applied to enrich training but did not yield successful outcomes. Several other strategies were employed, including focusing on single metal ion structures to enhance model familiarity with Fe-based cages and extending training datasets with diverse molecular examples from the ChEMBL database. Despite these strategies, the models struggled to capture the complex interactions required for successful cage generation, indicating potential limitations with both the diversity of the training datasets and the model’s architectural capacity to handle the intricate chemistry of coordination cages. However, training on the organic ligands (linkers) yielded successful results, emphasizing the benefits of focusing on smaller building blocks. The lessons learned from this project are substantial. 
Firstly, the knowledge acquired about generative models and the complex world of supramolecular chemistry has provided a unique opportunity to understand the challenges and possibilities of applying GenAI to such a complicated field. The results obtained in this project have highlighted the need for further refinement of data handling and model training techniques, paving the way for more advanced applications in the future. Finally, this project has not only raised our understanding of the capabilities and limitations of GenAI in coordination cages, but also set a foundation for future research that could eventually lead to breakthroughs in designing novel cage structures. Further study could concentrate on learning from the linkers in future data-driven cage design projects. / Deep learning-metoder (djup lärande metoder) som tillämpas på kemi kan påskynda upptäckten av nya molekyler och underlätta utformningen av mycket komplexa strukturer som både är giltiga och har viktiga samhällstillämpningar. Här presenterar vi en banbrytande undersökning av användningen av generativ artificiell intelligens (GenAI) för att designa koordinationsburar inom supramolekylär kemi. Specifikt utnyttjar studien GraphINVENT, en grafbaserad djup generativ modell, för att underlätta den automatiska genereringen av tetraedriska koordinationsburar. Genom en kombination av beräkningsverktyg och kemiinformatik syftar forskningen till att utöka kapaciteten hos GenAI, som traditionellt tillämpas i enklare kemiska sammanhang, till den komplexa och nyanserade arenan för koordinationsburar. Metoden innebar inledande förträning på ett brett dataset (GDB-13) följt av transferinlärning inriktad på att generera specifika koordinationsburstrukturer. Dataförstärkningstekniker användes också för att berika träningen men gav inte några lyckade resultat. 
Flera strategier användes, inklusive fokusering på enstaka metalljonsystem för att förbättra modellens förtrogenhet med Fe-baserade burar och utöka träningsdataset med olika molekylära exempel från ChEMBL-databasen. Trots dessa strategier hade modellerna svårt att fånga de komplexa interaktioner som krävs för framgångsrik generering av burar, vilket indikerar potentiella begränsningar inom både mångfalden av träningsdataset och modellens arkitektoniska kapacitet att hantera den invecklade kemin i koordinationsburar. Däremot var träningen på de organiska liganderna (länkarna) framgångsrik, vilket betonar fördelarna med att fokusera på mindre byggstenar. Dock är fördelarna med detta projekt betydande. Den kunskap som förvärvats om hur generativa modeller fungerar och den komplexa världen av supramolekylär kemi har gett en unik möjlighet att förstå utmaningarna och möjligheterna med att tillämpa GenAI på ett så komplicerat område. Erfarenheterna har visat på behovet av ytterligare förfining av datahantering och modellträningstekniker, vilket banar väg för mer avancerade tillämpningar i framtiden. Det här projektet har inte bara ökat vår förståelse för GenAI:s möjligheter och begränsningar i koordinationsburar utan också lagt grunden för framtida forskning som i slutändan kan leda till banbrytande upptäckter i utformningen av nya burstrukturer. Ytterligare studier skulle kunna fokusera på att lära sig från länkarna för att hjälpa framtida datadrivna projekt för burdesign.
37

Software Fault Detection in Telecom Networks using Bi-level Federated Graph Neural Networks / Upptäckt av SW-fel i telekommunikationsnätverk med hjälp av federerade grafiska neurala nätverk på två nivåer

Bourgerie, Rémi January 2023 (has links)
The increasing complexity of telecom networks, induced by the recent development of 5G, is a challenge for detecting faults in the telecom network. In addition to the structural complexity of telecommunication systems, data accessibility has become an issue both in terms of privacy and access cost. We propose a method relying on bi-level Federated Graph Neural Networks to identify anomalies in the telecom network while ensuring reduced communication costs as well as data privacy. Our method considers telecom data as a bi-level graph, where the highest level graph represents the interaction between sites, and each site is further expanded to its software (SW) performance behaviour graph. We developed and compared 4G/5G SW Fault Detection models under 3 settings: (1) Centralized Temporal Graph Neural Networks model: we propose a model to detect anomalies in 4G/5G telecom data. (2) Federated Temporal Graph Neural Networks model: we propose Federated Learning (FL) as a mechanism for privacy-aware training of models for fault detection. (3) Personalized Federated Temporal Graph Neural Networks model: we propose a novel aggregation technique, referred to as FedGraph, leveraging both a graph and the similarities between sites for aggregating the models and proposing models more personalized to each site’s behaviour. We compare the benefits of Federated Learning (FL) models (2) and (3) with centralized training (1) in terms of SW performance data modelling, anomaly detection, and communication cost. The evaluation includes both a scenario with normal functioning sites and a scenario where only a subset of sites exhibit faulty behaviour. The combination of SW execution graphs with GNNs has shown improved modelling performance and minor gains in centralized settings (1). In a normal network context, FL models (2) and (3) perform comparably to centralized training (CL), with slight improvements observed when using the personalized strategy (3). 
However, in abnormal network scenarios, Federated Learning falls short of achieving comparable detection performance to centralized training. This is due to the unintended learning of abnormal site behaviour, particularly when employing the personalized model (3). These findings highlight the importance of carefully assessing and selecting suitable FL strategies for anomaly detection and model training on telecom network data. / Den ökande komplexiteten i telenäten, som är en följd av den senaste utvecklingen av 5G, är en utmaning när det gäller att upptäcka fel i telenäten. Förutom den strukturella komplexiteten i telekommunikationssystem har datatillgänglighet blivit ett problem både när det gäller integritet och åtkomstkostnader. Vi föreslår en metod som bygger på Federated Graph Neural Networks på två nivåer för att identifiera avvikelser i telenätet och samtidigt säkerställa minskade kommunikationskostnader samt dataintegritet. Vår metod betraktar telekomdata som en graf på två nivåer, där grafen på den högsta nivån representerar interaktionen mellan webbplatser, och varje webbplats utvidgas ytterligare till sin graf för programvarans (SW) prestandabeteende. Vi utvecklade och jämförde 4G/5G SW-feldetekteringsmodeller under 3 inställningar: (1) Central Temporal Graph Neural Networks-modell: vi föreslår en modell för att upptäcka avvikelser i 4G/5G-telekomdata. (2) Federated Temporal Graph Neural Networks-modell: vi föreslår Federated Learning (FL) som en mekanism för integritetsmedveten utbildning av modeller för feldetektering. I motsats till centraliserad inlärning aggregeras lokalt tränade modeller på serversidan och skickas tillbaka till klienterna utan att data läcker ut mellan klienterna och servern, vilket säkerställer integritetsskyddande samarbetsutbildning. 
(3) Personaliserad Federated Temporal Graph Neural Networks-modell: vi föreslår en ny aggregeringsteknik, kallad FedGraph, som utnyttjar både en graf och likheterna mellan webbplatser för att aggregera modellerna. Vi jämför fördelarna med modellerna Federated Learning (FL) (2) och (3) med centraliserad utbildning (1) när det gäller datamodellering av SW-prestanda, anomalidetektering och kommunikationskostnader. Utvärderingen omfattar både ett scenario med normalt fungerande anläggningar och ett scenario där endast en delmängd av anläggningarna uppvisar felaktigt beteende. Kombinationen av SW-exekveringsgrafer med GNN har visat förbättrad modelleringsprestanda och mindre vinster i centraliserade inställningar (1). I en normal nätverkskontext presterar FL-modellerna (2) och (3) jämförbart med centraliserad träning (CL), med små förbättringar observerade när den personliga strategin används (3). I onormala nätverksscenarier kan Federated Learning dock inte uppnå jämförbar detekteringsprestanda med centraliserad träning. Detta beror på oavsiktlig inlärning av onormalt beteende på webbplatsen, särskilt när man använder den personliga modellen (3). Dessa resultat belyser vikten av att noggrant bedöma och välja lämpliga FL-strategier för anomalidetektering och modellträning på telekomnätdata.
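For orientation, the server-side aggregation step in standard federated learning (setting 2 in the abstract above) is commonly a sample-weighted average of client parameters, often called FedAvg. The sketch below shows that baseline only. It is not the FedGraph aggregation proposed in the thesis, and the flat parameter lists and client sizes are illustrative assumptions.

```python
def fed_avg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(n * w[i] for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical sites with flattened model parameters and local sample counts.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]
global_weights = fed_avg(client_weights, client_sizes)
print(global_weights)
```

A personalized scheme would replace the single global average with one weighted average per site, for example with weights derived from site similarity.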
38

SOLVING PREDICTION PROBLEMS FROM TEMPORAL EVENT DATA ON NETWORKS

Hao Sha (11048391) 06 August 2021 (has links)
Many complex processes can be viewed as sequential events on a network. In this thesis, we study the interplay between a network and the event sequences on it. We first focus on predicting events on a known network. Examples include modeling retweet cascades, forecasting earthquakes, and tracing the source of a pandemic. Specifically, given the network structure, we solve two types of problems: (1) forecasting future events based on the historical events, and (2) identifying the initial event(s) based on some later observations of the dynamics. The inverse problem of inferring the unknown network topology or links, based on the events, is also of great importance. Examples along this line include constructing influence networks among Twitter users from their tweets, soliciting new members to join an event based on their participation history, and recommending positions for job seekers according to their work experience. Following this direction, we study two types of problems: (1) recovering influence networks, and (2) predicting links between a node and a group of nodes, from event sequences.
39

Aplikace metody učení bez učitele na hledání podobných grafů / Application of Unsupervised Learning Methods in Graph Similarity Search

Sabo, Jozef January 2021 (has links)
The goal of this master's thesis was to design, in cooperation with the company Avast, a system that can extract knowledge from a database of graphs. The graphs used for data mining describe the behaviour of computer systems and are inserted anonymously into the company's database from the systems of users of the company's products. Each graph in the database can be assigned one of two labels: clean or malware (malicious). The task of the proposed self-learning system is to find clusters of graphs in the graph database in which the classes of graphs do not mix. Graph clusters containing only one class of graphs can be interpreted as different types of clean or malware graphs and are a useful source for further analysis of the graphs. To evaluate the quality of the clusters, a custom metric, named monochromaticity, was designed. The metric evaluates the quality of the clusters based on how much clean and malware graphs are mixed in the clusters. The best results of the metric were obtained when vector representations of graphs were created by a deep learning model (a variational graph autoencoder with two relation graph convolution operators) and the parameterless method MeanShift was used for clustering over the vectors.
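The abstract above does not define monochromaticity, so the sketch below uses standard cluster purity as a stand-in to show how class mixing inside clusters can be scored. This is an assumption for illustration and should not be read as the thesis's metric. The cluster assignments and labels are made up.

```python
from collections import Counter


def cluster_purity(cluster_ids, labels):
    """Fraction of items that belong to the majority class of their own cluster."""
    clusters = {}
    for c, y in zip(cluster_ids, labels):
        clusters.setdefault(c, Counter())[y] += 1
    majority = sum(max(counts.values()) for counts in clusters.values())
    return majority / len(labels)


# Hypothetical clustering: cluster 0 is all clean, cluster 1 mixes both classes.
cluster_ids = [0, 0, 0, 1, 1, 1]
labels = ["clean", "clean", "clean", "malware", "malware", "clean"]
score = cluster_purity(cluster_ids, labels)
print(score)
```

A score of 1.0 means no cluster mixes clean and malware graphs.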
40

Intersecting Graph Representation Learning and Cell Profiling : A Novel Approach to Analyzing Complex Biomedical Data

Chamyani, Nima January 2023 (has links)
In recent biomedical research, graph representation learning and cell profiling techniques have emerged as transformative tools for analyzing high-dimensional biological data. The integration of these methods, as investigated in this study, has facilitated an enhanced understanding of complex biological systems, consequently improving drug discovery. The research aimed to decipher connections between chemical structures and cellular phenotypes while incorporating other biological information like proteins and pathways into the workflow. To achieve this, machine learning models' efficacy was examined for classification and regression tasks. The newly proposed graph-level and bio-graph integrative predictors were compared with traditional models. Results demonstrated their potential, particularly in classification tasks. Moreover, the topology of the COVID-19 BioGraph was analyzed, revealing the complex interconnections between chemicals, proteins, and biological pathways. By combining network analysis, graph representation learning, and statistical methods, the study was able to predict active chemical combinations within inactive compounds, thereby exhibiting significant potential for further investigations. Graph-based generative models were also used for molecule generation opening up further research avenues in finding lead compounds. In conclusion, this study underlines the potential of combining graph representation learning and cell profiling techniques in advancing biomedical research in drug repurposing and drug combination. This integration provides a better understanding of complex biological systems, assists in identifying therapeutic targets, and contributes to optimizing molecule generation for drug discovery. Future investigations should optimize these models and validate the drug combination discovery approach. 
As these techniques continue to evolve, they hold the potential to significantly impact the future of drug screening, drug repurposing, and drug combinations.
