  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
151

Single Sign-On : Risks and Opportunities of Using SSO (Single Sign-On) in a Complex System Environment with Focus on Overall Security Aspects

Cakir, Ece January 2013
The main concern of this thesis is to help design a secure and reliable network system, one that keeps growing in complexity because of its interfaces with multiple logging sub-systems, and to ensure the safety of the network environment for everyone involved. The parties involved in network systems continually need to develop new solutions to security problems and strive for secure access to the network so that they can do their jobs in a safe computing environment. The implementation and use of SSO (Single Sign-On), which offers a secure and reliable network in complex systems, is defined specifically with respect to the overall security aspects of enterprises. The information to be used inside and outside the organization was structured layer by layer according to organizational needs in order to define the sub-systems. The users in the enterprise were defined according to their role-based profiles. Structuring the information layer by layer was shown to improve the level of security by providing multiple authentication mechanisms. Before the SSO system was implemented, the necessary requirements were identified. Thereafter, user identity management and different authentication mechanisms were defined, together with the network protocols and standards, to ensure a safe exchange of information within and outside the organization. Market research was conducted on SSO solutions. Threat and risk analysis was conducted according to the ISO/IEC 27003:2010 standard. The degrees of threat and risk were evaluated by considering their consequences and likelihoods. These evaluations were followed by risk treatments. MoDAF (Ministry of Defence Architecture Framework) was used to show what kinds of resources, applications, and other system-related information are needed and exchanged in the network. Finally, some suggestions were made concerning the implementation of SSO solutions; they are presented in the discussion and analysis chapter.
152

TopFed: TCGA tailored federated query processing and linking to LOD

Saleem, Muhammad, Padmanabhuni, Shanmukha S., Ngonga Ngomo, Axel-Cyrille, Iqbal, Aftab, Almeida, Jonas S., Decker, Stefan, Deus, Helena F. January 2014
Methods: We address these issues by transforming the TCGA data into the Semantic Web standard Resource Description Framework (RDF), linking it to relevant datasets in the Linked Open Data (LOD) cloud, and proposing an efficient data distribution strategy to host the resulting 20.4 billion triples via several SPARQL endpoints. With the TCGA data distributed across multiple SPARQL endpoints, we enable biomedical scientists to query and retrieve information from these endpoints through a TCGA-tailored federated SPARQL query processing engine named TopFed. Results: We compare TopFed with the well-established federation engine FedX in terms of source selection and query execution time, using 10 different federated SPARQL queries with varying requirements. Our evaluation results show that TopFed selects on average less than half of the sources (with 100% recall), with a query execution time equal to one third of that of FedX. Conclusion: With TopFed, we aim to offer biomedical scientists a single point of access through which distributed TCGA data can be accessed in unison. We believe the proposed system can greatly help researchers in the biomedical domain to carry out their research effectively with TCGA, as the amount and diversity of data exceed the ability of local resources to handle its retrieval and parsing.
153

Federated Product Information Search and Semantic Product Comparisons on the Web

Walther, Maximilian Thilo 09 September 2011
Product information search has become one of the most important application areas of the Web. Especially for expensive technical products, consumers tend to carry out intensive research before the actual purchase in order to form a comprehensive view of the product of interest. Federated search backed by ontology-based product information representation shows great promise for easing this research process. The topic of this thesis is the development of a comprehensive technique for locating, extracting, and integrating information on arbitrary technical products in a largely unsupervised manner. The resulting homogeneous information sets allow a potential consumer to compare technical products effectively using an appropriate federated product information system.
Contents: 1. Introduction 1.1. Online Product Information Research 1.1.1. Current Online Product Information Research 1.1.2. Aspired Online Product Information Research 1.2. Federated Shopping Portals 1.3. Research Questions 1.4. Approach and Theses 1.4.1. Approach 1.4.2. Theses 1.4.3. Requirements 1.5. Goals and Non-Goals 1.5.1. Goals 1.5.2. Non-Goals 1.6. Contributions 1.7. Structure 2. Federated Information Systems 2.1. Information Access 2.1.1. Document Retrieval 2.1.2. Federated Search 2.1.3. Federated Ranking 2.2. Information Extraction 2.2.1. Information Extraction from Structured Sources 2.2.2. Information Extraction from Unstructured Sources 2.2.3. Information Extraction from Semi-structured Sources 2.3. Information Integration 2.3.1. Ontologies 2.3.2. Ontology Matching 2.4. Information Presentation 2.5. Product Information 2.5.1. Product Information Source Characteristics 2.5.2. Product Information Source Types 2.5.3. Product Information Integration Types 2.5.4. Product Information Types 2.6. Conclusions 3. A Federated Product Information System 3.1. Finding Basic Product Information 3.2. Enriching Product Information 3.3. Administrating Product Information 3.4. Displaying Product Information 3.5. Conclusions 4. Product Information Extraction from the Web 4.1. Vendor Product Information Search 4.1.1. Vendor Product Information Ranking 4.1.2. Vendor Product Information Extraction 4.2. Producer Product Information Search 4.2.1. Producer Product Document Retrieval 4.2.2. Producer Product Information Extraction 4.3. Third-Party Product Information Search 4.4. Conclusions 5. Product Information Integration for the Web 5.1. Product Representation 5.1.1. Domain Product Ontology 5.1.2. Application Product Ontology 5.1.3. Product Ontology Management 5.2. Product Categorization 5.3. Product Specifications Matching 5.3.1. General Procedure 5.3.2. Elementary Matchers 5.3.3. Evolutionary Matcher 5.3.4. Naïve Bayes Matcher 5.3.5. Result Selection 5.4. Product Specifications Normalization 5.4.1. Product Specifications Atomization 5.4.2. Product Specifications Value Normalization 5.5. Product Comparison 5.6. Conclusions 6. Evaluation 6.1. Implementation 6.1.1. Offers Service 6.1.2. Products Service 6.1.3. Snippets Service 6.1.4. Fedseeko 6.1.5. Fedseeko Browser Plugin 6.1.6. Fedseeko Mobile 6.1.7. Lessons Learned 6.2. Evaluation 6.2.1. Evaluation Measures 6.2.2. Gold Standard 6.2.3. Product Document Retrieval 6.2.4. Product Specifications Extraction 6.2.5. Product Specifications Matching 6.2.6. Comparison with Competitors 6.3. Conclusions 7. Conclusions and Future Work 7.1. Summary 7.2. Conclusions 7.3. Future Work A. Pseudo Code and Extraction Properties A.1. Pseudo Code A.2. Extraction Algorithm Properties A.2.1. Clustering Properties A.2.2. Purging Properties A.2.3. Dropping Properties B. Fedseeko Screenshots B.1. Offer Search B.2. Product Comparison
/ Product information search has developed into one of the most significant topics on the Web. Particularly for cost-intensive technical products, potential consumers carry out lengthy research before actually buying in order to gain a comprehensive overview of the product of interest. Federated search combined with ontology-based product information representation is one possible solution to this problem. This dissertation presents techniques that enable the automatic locating, extracting, and integrating of information for arbitrary technical products. The resulting homogeneous product information allows a potential consumer to compare related products effectively through a federated product information system.
154

Från internationellt samarbete till ett nytt svenskt ledningssystem / From International Cooperation to a New Swedish Command and Control System

Runesson, Johan January 2019
The operational environment has changed, which means that the Swedish Armed Forces' command and control system needs to be developed. Sweden's current command and control system does not address the new challenges, and the Armed Forces' tasks are becoming increasingly complex. Sweden's military-strategic concept is based on winning together and avoiding losing alone. The word "together" drives the development of international cooperation and community. In 2016, Sweden decided to join NATO's development of Federated Mission Networking (FMN). FMN aims to improve information exchange between NATO, NATO nations, and non-NATO entities. FMN is a framework that covers all constituent parts of a command and control system. The concept is built on the principles of agility, flexibility, and scalability. The purpose of the study is to shed light on a Swedish adoption of the FMN concept and to examine how it can contribute to an increased ability to handle the complexity of commanding joint operations. The study concludes that the FMN concept contributes to the command and control system's ability to create order through established routines and methods. It facilitates information sharing and increases the opportunities for coordination and collaboration. The concept contributes to increased interoperability across all constituent parts of the command and control system.
155

Personalized Federated Learning for mmWave Beam Prediction Using Non-IID Sub-6 GHz Channels / Personaliserad Federerad Inlärning för mmWave Beam Prediction Användning Icke-IID Sub-6 GHz-kanaler

Cheng, Yuan January 2022
While it is difficult for base stations to estimate millimeter wave (mmWave) channels and quickly find the optimal mmWave beam for user equipments (UEs), sub-6 GHz channels, which are usually easier to obtain and more robust to blockages, can be used to reduce the time before initial access and enhance the reliability of mmWave communication. Because the channel information is collected by a massive number of radio base stations and is sensitive with respect to privacy and security, Federated Learning (FL) is a good match for this use case. In practice, the channel vectors are usually non-independent and identically distributed (non-IID) because wireless communication environments vary greatly between different radio base stations and their UEs. To achieve satisfactory performance for all radio base stations rather than only the majority of them, a useful solution is to design personalized methods for each radio base station. In this thesis, we implement two personalized FL methods, 1) Finetuning FL Model on Private Dataset of Each Client and 2) Adaptive Expert Models for FL, to predict the optimal mmWave beamforming vector directly from non-IID sub-6 GHz channel vectors generated from DeepMIMO. According to our experimental results, Finetuning FL Model on Private Dataset of Each Client achieves higher average mmWave downlink spectral efficiency than the global FL model. Moreover, in terms of the average Top-1 and Top-3 classification accuracies, its performance improvement over the global FL model even exceeds the improvement of the global FL model over the purely local models.
/ Although it is difficult for a base station to estimate a millimeter wave (mmWave) channel and quickly find the best mmWave beam for a user equipment (UE), it can take advantage of sub-6 GHz channels, which are generally easier to obtain and more robust to blockage, to reduce the time before initial access and improve the reliability of mmWave communication. Given that channel information is collected by a large number of radio base stations and is sensitive with respect to privacy and security, federated learning (FL) is well suited to this use case. In practice, because the wireless communication environment varies greatly between different radio base stations and their UEs, the channel vectors usually follow a non-independent and identically distributed (non-IID) distribution. To achieve satisfactory performance for all radio base stations, not just most of them, a useful solution is to design a personalized approach for each radio base station. In this thesis, we implement two personalized FL methods, 1) fine-tuning the FL model on each client's private dataset and 2) adaptive expert models for FL, to predict the optimal mmWave beamforming vectors directly from non-IID sub-6 GHz channel vectors. According to our experimental results, fine-tuning the FL model on each client's private dataset achieves higher average mmWave downlink spectral efficiency than global FL. Moreover, in terms of average Top-1 and Top-3 classification accuracy, its performance improvement over the global FL model even exceeds that of global FL over the purely local models.
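The Top-1 and Top-3 accuracies mentioned in this abstract can be illustrated with a minimal sketch. It assumes the beam predictor outputs one score per candidate beam and that the label is the index of the optimal beam; the function name `top_k_accuracy` and the toy scores are hypothetical and are not taken from the thesis.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true beam index is among the k highest-scoring beams."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row (hypothetical per-beam scores).
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example: three samples, four candidate beams each (made-up numbers).
scores = [
    [0.10, 0.70, 0.20, 0.00],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.20, 0.60],
]
labels = [1, 2, 0]

top1 = top_k_accuracy(scores, labels, 1)
top3 = top_k_accuracy(scores, labels, 3)
print(top1, top3)
```

Top-3 accuracy is never lower than Top-1 accuracy, because a sample counted as a Top-1 hit is also a Top-3 hit.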
156

Re-weighted softmax cross-entropy to control forgetting in federated learning

Legate, Gwendolyne 12 1900
In federated learning, a global model is learned by aggregating model updates computed from a set of client nodes. A key challenge in this field is data heterogeneity across clients, which degrades model performance. Standard federated learning algorithms perform several gradient steps before synchronizing the model, which can lead clients to over-minimize their own local objective and drift away from the global solution. We show that in such a setting, individual client models suffer catastrophic forgetting with respect to other clients' data, and we propose a simple yet effective approach that modifies the cross-entropy objective on a per-client basis by re-weighting the softmax of the logits before computing the loss. This approach protects classes outside a client's label set from abrupt representation change. Through extensive empirical evaluation, we show that our approach can mitigate this problem, bringing consistent improvement to standard federated learning algorithms. The approach is particularly advantageous in the difficult federated learning settings most closely aligned with real-world scenarios, where data heterogeneity is high and client participation in each round is low. We also study the effects of using batch normalization and group normalization with our method and find that batch normalization, previously considered detrimental to federated learning, works exceptionally well with our re-weighted softmax, calling into question some earlier assumptions about normalization in a federated system.
/ In federated learning, a global model is learned by aggregating model updates computed from a set of client nodes. A key challenge in this domain is data heterogeneity across clients, which degrades model performance. Standard federated learning algorithms perform multiple gradient steps before synchronizing the model, which can lead to clients overly minimizing their own local objective and diverging from the global solution. We demonstrate that in such a setting, individual client models experience catastrophic forgetting with respect to data from other clients, and we propose a simple yet efficient approach that modifies the cross-entropy objective on a per-client basis by re-weighting the softmax of the logits prior to computing the loss. This approach shields classes outside a client's label set from abrupt representation change. Through extensive empirical evaluation, we demonstrate that our approach can alleviate this problem, providing consistent improvement to standard federated learning algorithms. It is particularly beneficial under the challenging federated learning settings most closely aligned with real-world scenarios, where data heterogeneity is high and client participation in each round is low. We also investigate the effects of using batch normalization and group normalization with our method and find that batch normalization, which has previously been considered detrimental to federated learning, performs particularly well with our re-weighted softmax, calling into question some prior assumptions about normalization in a federated setting.
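One way to picture "re-weighting the softmax of the logits prior to computing the loss" is the sketch below. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not say how the weights are chosen, so the per-class weights here (for example, a client's class proportions, with zero for classes the client never sees) and the function name `reweighted_softmax_ce` are hypothetical.

```python
import math


def reweighted_softmax_ce(logits, label, class_weights):
    """Cross-entropy of a softmax whose terms are scaled by per-class weights.

    loss = -log( w[label] * exp(z[label]) / sum_c w[c] * exp(z[c]) )
    The weight of the true label must be positive.
    """
    m = max(logits)  # subtract the max logit for numerical stability
    weighted = [w * math.exp(z - m) for z, w in zip(logits, class_weights)]
    return -math.log(weighted[label] / sum(weighted))


logits = [2.0, 1.0, 3.0]

# Uniform weights reduce to the ordinary softmax cross-entropy.
standard = reweighted_softmax_ce(logits, 0, [1.0, 1.0, 1.0])

# Assumed client that holds only classes 0 and 1: class 2 gets weight 0,
# so its logit drops out of the normalizer and receives no gradient.
shielded = reweighted_softmax_ce(logits, 0, [0.5, 0.5, 0.0])
print(standard, shielded)
```

With a zero weight, the absent class no longer contributes to the loss, which is one reading of how classes outside a client's label set could be shielded from abrupt change during local training.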
157

Federated Learning with FEDn for Financial Market Surveillance

Voltaire Edoh, Isak January 2022
Machine Learning (ML) is a current trend that most industries adopt to improve their business and operations. ML has also been adopted in the financial markets, where well-funded financial institutions employ the latest ML algorithms to gain a market advantage. The darker side of ML is the potential emergence of complex algorithmic trading schemes that are abusive and manipulative. It is therefore inevitable that ML will be applied to financial market surveillance in order to detect these abusive and manipulative trading strategies. Ideally, an accurate ML detection model would be developed with data from many financial institutions or trading venues. However, such ML models require vast quantities of data, which poses a problem in market surveillance, where data is sensitive or limited. Data sharing between companies or countries is typically accompanied by legal and privacy concerns. By training ML models on distributed datasets, Federated Learning (FL) overcomes these issues by eliminating the need to centralise sensitive data. This thesis aimed to address these ML-related issues in market surveillance by implementing and evaluating an FL model. FL enables a group of independent data-holding clients with the same intention to build a shared ML model collaboratively without compromising private data. In this work, an ML model is first deployed in a centralised data setting and trained to detect the manipulative trading scheme known as spoofing. An LSTM autoencoder was the model chosen for this task. The same model is also implemented in a federated setting with decentralised data, using the FL framework FEDn. Another FL framework, Flower, is also employed to evaluate the performance of FEDn. Experiments were conducted comparing the FL models to the conventional centralised learning model, as well as comparing the two frameworks to each other.
The results showed that under certain circumstances, the FL models performed better than the centralised model in detecting spoofing. FEDn was equivalent to Flower in terms of detection performance. In addition, the results indicated that Flower was marginally faster than FEDn. It is assumed that variations in the experimental setup and stochasticity account for the performance disparity.
158

Federated Learning for Time Series Forecasting Using LSTM Networks: Exploiting Similarities Through Clustering / Federerad inlärning för tidserieprognos genom LSTM-nätverk: utnyttjande av likheter genom klustring

Díaz González, Fernando January 2019
Federated learning poses a statistical challenge when training on highly heterogeneous sequence data. For example, time-series telecom data collected over long intervals regularly shows mixed fluctuations and patterns. These distinct distributions are a problem when a node plans not only to contribute to the creation of the global model but also to apply it to its local dataset. In this scenario, adopting a one-size-fits-all approach might be inadequate, even when using state-of-the-art machine learning techniques for time series forecasting, such as Long Short-Term Memory (LSTM) networks, which have proven able to capture many idiosyncrasies and generalise to new patterns. In this work, we show that clustering the clients using these patterns and selectively aggregating their updates in different global models can improve local performance with minimal overhead, as we demonstrate through experiments using real-world time series datasets and a basic LSTM model.
/ Federated learning poses a statistical challenge when training on strongly heterogeneous sequence data. For example, time series data in the telecom domain shows mixed variations and patterns over longer time intervals. These distinct distributions pose a challenge when a node is not only to contribute to the creation of a global model but also intends to apply that model to its local dataset. In this scenario, introducing a single global model meant to fit everyone may prove insufficient, even when using the most successful machine learning models for time series forecasting, Long Short-Term Memory (LSTM) networks, which have been shown to capture complex patterns and generalise well to new patterns. In this work, we show that by clustering the clients using these patterns and selectively aggregating their updates into different global models, we can improve local performance at minimal cost, which we demonstrate through experiments with real time series data and a basic LSTM model.
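The idea of selectively aggregating client updates into different global models can be sketched as a per-cluster weighted average. The sketch assumes the cluster assignment is already known, that each update is a flat list of floats, and that clients are weighted by their sample counts in the style of federated averaging; none of these details are given in the abstract, and the name `aggregate_per_cluster` is hypothetical.

```python
from collections import defaultdict


def aggregate_per_cluster(updates, cluster_of, num_samples):
    """Average client updates separately within each cluster.

    updates:     {client_id: [float, ...]}  locally computed model parameters
    cluster_of:  {client_id: cluster_id}    assumed to come from a prior clustering step
    num_samples: {client_id: int}           used as averaging weights (an assumption)
    Returns {cluster_id: [float, ...]}, one global model per cluster.
    """
    sums = {}
    totals = defaultdict(int)
    for client, params in updates.items():
        cluster = cluster_of[client]
        n = num_samples[client]
        if cluster not in sums:
            sums[cluster] = [0.0] * len(params)
        for i, p in enumerate(params):
            sums[cluster][i] += n * p
        totals[cluster] += n
    return {c: [s / totals[c] for s in vec] for c, vec in sums.items()}


# Toy example with made-up numbers: clients "a" and "b" share a cluster, "c" is alone.
updates = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [10.0, 20.0]}
cluster_of = {"a": 0, "b": 0, "c": 1}
num_samples = {"a": 1, "b": 3, "c": 5}

models = aggregate_per_cluster(updates, cluster_of, num_samples)
print(models)
```

Each client would then receive the model of its own cluster rather than a single model averaged over all clients.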
159

Federated Learning for Time Series Forecasting Using Hybrid Model

Li, Yuntao January 2019
Time series data has become ubiquitous thanks to affordable edge devices and sensors. Much of this data is valuable for decision making. For using this data in forecasting tasks, the conventional centralized approach has shown deficiencies regarding large-scale data communication and data privacy. Furthermore, neural network models cannot make use of the extra information in the time series, so they usually fail to provide time-series-specific results. Both issues pose a challenge to large-scale time series forecasting with neural network models. All these limitations lead to our research question: Can we realize decentralized time series forecasting with a Federated Learning mechanism that is comparable to the conventional centralized setup in forecasting performance? In this work, we propose a Federated Series Forecasting framework that resolves the challenge by allowing users to keep their data locally and learning a shared model by aggregating locally computed updates. In addition, we design a hybrid model that enables neural network models to use the extra information in the time series to achieve time-series-specific learning. In particular, the proposed hybrid model outperforms state-of-the-art data-central baseline models on NN5 and Ericsson KPI data. Meanwhile, the federated setting of the proposed model yields results comparable to the data-central setting on both NN5 and Ericsson KPI data. Together, these results answer the research question of this thesis.
/ Time series data has become ubiquitous thanks to affordable edge devices and sensors. Much of this data is valuable for decision making. For using the data in forecasting tasks, the conventional centralized method has shown shortcomings regarding large-scale data communication and privacy issues. Furthermore, neural network models have not been able to exploit the extra information in the time series, which leads to failures to deliver time-series-specific results. Both issues pose a challenge for large-scale time series forecasting with neural network models. All these limitations lead to our research question: Can we realize decentralized time series forecasting with a federated learning mechanism that performs comparably to conventional centralized solutions in forecasting? In this work, we propose a framework for federated time series forecasting that resolves the challenge by letting users keep data locally and learning a shared model by aggregating locally computed updates. In addition, we design a hybrid model that enables neural network models to exploit the extra information in the time series to achieve time-series-specific learning. The proposed hybrid model outperforms state-of-the-art centralized baseline models on NN5 and Ericsson KPI data. At the same time, the federated approach yields results comparable to the data-central approaches on both NN5 and Ericsson KPI data. Together, these results answer the research question of this thesis.
160

Models and Representation Learning Mechanisms for Graph Data

Susheel Suresh (14228138) 15 December 2022
Graph representation learning (GRL) has been increasingly used to model and understand data from a wide variety of complex systems spanning social, technological, bio-chemical and physical domains. GRL consists of two main components: (1) a parametrized encoder that provides representations of graph data and (2) a learning process to train the encoder parameters. Designing flexible encoders that capture the underlying invariances and characteristics of graph data is crucial to the success of GRL. The learning process, in turn, drives the quality of the encoder representations, and developing principled learning mechanisms is vital for a number of growing applications in self-supervised, transfer and federated learning settings. To this end, we propose a suite of models and learning algorithms for GRL, which form the two main thrusts of this dissertation.

In Thrust I, we propose two novel encoders that build upon a widely popular GRL encoder class called graph neural networks (GNNs). First, we empirically study the prediction performance of current GNN-based encoders when applied to graphs with heterogeneous node mixing patterns, using our proposed notion of local assortativity. We find that GNN performance in node prediction tasks strongly correlates with our local assortativity metric, thereby introducing a limit. We propose to transform the input graph into a computation graph with proximity and structural information as distinct types of edges. We then propose a novel GNN-based encoder that operates on this computation graph and adaptively chooses between structure and proximity information. Empirically, adopting our transformation and encoder framework leads to improved node classification performance compared to baselines in real-world graphs that exhibit diverse mixing.

Secondly, we study the trade-off between expressivity and efficiency of GNNs when applied to temporal graphs for the task of link ranking. We develop an encoder that incorporates a labeling approach designed to allow for efficient inference over the candidate set jointly, while provably boosting expressivity. We also propose to optimize a list-wise loss for improved ranking. With extensive evaluation on real-world temporal graphs, we demonstrate its improved performance and efficiency compared to baselines.

In Thrust II, we propose two principled encoder learning mechanisms for challenging and realistic graph data settings. First, we consider a scenario where only limited or even no labelled data is available for GRL. Recent research has converged on graph contrastive learning (GCL), where GNNs are trained to maximize the correspondence between representations of the same graph in its different augmented forms. However, we find that GNNs trained by traditional GCL often risk capturing redundant graph features and thus may be brittle and provide sub-par performance in downstream tasks. We then propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during training by optimizing the adversarial graph augmentation strategies used in GCL. We pair AD-GCL with theoretical explanations and design a practical instantiation based on trainable edge-dropping graph augmentation. We experimentally validate AD-GCL by comparing it with state-of-the-art GCL methods and achieve performance gains in semi-supervised, unsupervised and transfer learning settings using benchmark chemical and biological molecule datasets.

Secondly, we consider a scenario where graph data is siloed across clients for GRL. We focus on two unique challenges encountered when applying distributed training to GRL: (i) client task heterogeneity and (ii) label scarcity. We propose a novel learning framework called federated self-supervised graph learning (FedSGL), which first uses a self-supervised objective to train GNNs in a federated fashion across clients; each client then fine-tunes the obtained GNNs based on its local task and available labels. Our framework enables the federated GNN model to extract patterns from the common feature (attribute and graph topology) space without the need for labels and without being biased by heterogeneous local tasks. An extensive empirical study of FedSGL on both node and graph classification tasks yields insights into how the level of feature and task heterogeneity, the adopted federated algorithm and the level of label scarcity affect the clients' performance in their tasks.
