71

Personalized Federated Learning for mmWave Beam Prediction Using Non-IID Sub-6 GHz Channels / Personaliserad Federerad Inlärning för mmWave Beam Prediction Användning Icke-IID Sub-6 GHz-kanaler

Cheng, Yuan January 2022 (has links)
Since it is difficult for base stations to estimate millimeter wave (mmWave) channels and quickly find the optimal mmWave beam for user equipments (UEs), the sub-6 GHz channels, which are usually easier to obtain and more robust to blockages, can be used to reduce the time before initial access and to enhance the reliability of mmWave communication. Considering that the channel information is collected by a massive number of radio base stations and is sensitive with respect to privacy and security, Federated Learning (FL) is a natural fit for this use case. In practice, the channel vectors are usually not independent and identically distributed (non-IID) because the wireless communication environments vary greatly between different radio base stations and their UEs. To achieve satisfactory performance for all radio base stations instead of only the majority of them, a useful solution is to design personalized methods for each radio base station. In this thesis, we implement two personalized FL methods, 1) Finetuning the FL Model on the Private Dataset of Each Client and 2) Adaptive Expert Models for FL, to predict the optimal mmWave beamforming vector directly from the non-IID sub-6 GHz channel vectors generated from DeepMIMO. According to our experimental results, finetuning the FL model on the private dataset of each client achieves higher average mmWave downlink spectral efficiency than the global FL model. Moreover, in terms of average Top-1 and Top-3 classification accuracy, its improvement over the global FL model even exceeds the improvement of the global FL model over the purely local models.
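The first personalization method amounts to taking the converged global model and simply continuing training on each client's private data. The following is a minimal sketch of that step, assuming a PyTorch classifier that maps sub-6 GHz channel vectors to beam indices; the function and variable names are illustrative, not taken from the thesis:

```python
import copy
import torch

def finetune_personalize(global_model, client_loaders, lr=1e-3, epochs=1):
    """Fine-tune a copy of the converged global FL model on each client's
    private dataset, yielding one personalized model per client."""
    personalized = {}
    loss_fn = torch.nn.CrossEntropyLoss()  # beam prediction as classification
    for cid, loader in client_loaders.items():
        model = copy.deepcopy(global_model)   # keep the global model intact
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        model.train()
        for _ in range(epochs):
            for x, y in loader:   # x: sub-6 GHz channel vector, y: beam index
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                opt.step()
        personalized[cid] = model
    return personalized
```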
72

Re-weighted softmax cross-entropy to control forgetting in federated learning

Legate, Gwendolyne 12 1900 (has links)
In Federated Learning, a global model is learned by aggregating model updates computed from a set of client nodes. A key challenge in this domain is data heterogeneity across clients, which degrades model performance. Standard federated learning algorithms perform multiple gradient steps before synchronizing the model, which can lead to clients overly minimizing their own local objective and diverging from the global solution. We demonstrate that in such a setting, individual client models experience catastrophic forgetting with respect to data from other clients, and we propose a simple yet efficient approach that modifies the cross-entropy objective on a per-client basis by re-weighting the softmax of the logits prior to computing the loss. This approach shields classes outside a client's label set from abrupt representation change. Through extensive empirical evaluation, we demonstrate that our approach can alleviate this problem, providing consistent improvement to standard federated learning algorithms. It is particularly beneficial in the challenging federated learning settings most closely aligned with real-world scenarios, where data heterogeneity is high and client participation in each round is low. We also investigate the effects of using batch normalization and group normalization with our method and find that batch normalization, which has previously been considered detrimental to federated learning, performs exceptionally well with our re-weighted softmax, calling into question some prior assumptions about normalization in a federated setting.
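The re-weighting described above can be implemented by scaling each class's term inside the softmax, which is algebraically the same as adding log-weights to the logits before a standard cross-entropy. Below is a hedged PyTorch sketch; deriving the weights from local label frequencies is an illustrative choice and may not match the exact scheme in the thesis:

```python
import torch

def reweighted_softmax_ce(logits, targets, class_weights):
    """Cross-entropy with the softmax re-weighted per class before the loss,
    shielding classes absent from a client's label set from large updates.
    class_weights: (num_classes,) tensor, e.g. local label frequencies."""
    log_w = torch.log(class_weights.clamp_min(1e-12))
    shifted = logits + log_w   # w_c * exp(z_c) == exp(z_c + log w_c)
    return torch.nn.functional.cross_entropy(shifted, targets)

# example: a client that only holds classes 0 and 1 out of 4
logits = torch.randn(8, 4)
targets = torch.randint(0, 2, (8,))
weights = torch.tensor([0.5, 0.5, 0.0, 0.0]) + 1e-6  # tiny mass avoids log(0)
loss = reweighted_softmax_ce(logits, targets, weights)
```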
73

Federated Learning with FEDn for Financial Market Surveillance

Voltaire Edoh, Isak January 2022 (has links)
Machine Learning (ML) is a technology that most industries currently adopt to improve their business and operations. ML has also been adopted in the financial markets, where well-funded financial institutions employ the latest ML algorithms to gain an advantage in the market. The darker side of ML is the potential emergence of complex algorithmic trading schemes that are abusive and manipulative. Because of this, it is inevitable that ML will be applied to financial market surveillance in order to detect such abusive and manipulative trading strategies. Ideally, an accurate ML detection model would be developed with data from many financial institutions or trading venues. However, such ML models require vast quantities of data, which poses a problem in market surveillance, where data is sensitive or limited. Data sharing between companies or countries is typically accompanied by legal and privacy concerns. By training ML models on distributed datasets, Federated Learning (FL) overcomes these issues by eliminating the need to centralise sensitive data. This thesis aimed to address these ML-related issues in market surveillance by implementing and evaluating an FL model. FL enables a group of independent data-holding clients with the same intention to build a shared ML model collaboratively without compromising private data. In this work, an ML model is initially deployed in a centralised data setting and trained to detect the manipulative trading scheme known as spoofing. An LSTM-Autoencoder was the method chosen for this task. The same model is also implemented in a federated setting with decentralised data, using the FL framework FEDn. Another FL framework, Flower, is also employed to evaluate the performance of FEDn. Experiments were conducted comparing the FL models to the conventional centralised learning model, as well as comparing the two frameworks to each other. The results showed that under certain circumstances, the FL models performed better than the centralised model in detecting spoofing. FEDn was equivalent to Flower in terms of detection performance. In addition, the results indicated that Flower was marginally faster than FEDn. Variations in the experimental setup and stochasticity are assumed to account for the performance disparity.
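An LSTM-Autoencoder detects anomalies by learning to reconstruct normal sequences and flagging windows with high reconstruction error. The following is a minimal PyTorch sketch of that pattern, not the thesis implementation; the layer sizes and the quantile-thresholding rule are assumptions:

```python
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    """Reconstructs windows of trading-feature sequences; windows with high
    reconstruction error are flagged as potential spoofing (a sketch only)."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):                        # x: (batch, time, features)
        _, (h, _) = self.encoder(x)              # summarize the window into h
        z = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)
        dec, _ = self.decoder(z)                 # unroll back over time
        return self.out(dec)

def anomaly_scores(model, x):
    """Per-window reconstruction MSE; threshold e.g. at a high quantile of
    the scores observed on normal (non-manipulative) trading data."""
    model.eval()
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=(1, 2))
```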
74

Federated Learning for Time Series Forecasting Using LSTM Networks: Exploiting Similarities Through Clustering / Federerad inlärning för tidserieprognos genom LSTM-nätverk: utnyttjande av likheter genom klustring

Díaz González, Fernando January 2019 (has links)
Federated learning poses a statistical challenge when training on highly heterogeneous sequence data. For example, time-series telecom data collected over long intervals regularly shows mixed fluctuations and patterns. These distinct distributions are an inconvenience when a node not only plans to contribute to the creation of the global model but also plans to apply it to its local dataset. In this scenario, adopting a one-size-fits-all approach might be inadequate, even when using state-of-the-art machine learning techniques for time series forecasting, such as Long Short-Term Memory (LSTM) networks, which have proven able to capture many idiosyncrasies and generalise to new patterns. In this work, we show that clustering the clients by these patterns and selectively aggregating their updates into different global models can improve local performance with minimal overhead, as we demonstrate through experiments using real-world time series datasets and a basic LSTM model.
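The core mechanism is to group similar clients and run weighted averaging within each group, instead of maintaining one global average for everyone. Here is a minimal sketch that clusters clients on their flattened model weights with k-means; the thesis clusters on learned series patterns, so the clustering features used here are a simplifying assumption:

```python
import numpy as np
from sklearn.cluster import KMeans

def clustered_fedavg(client_weights, client_sizes, k=3):
    """Group clients by similarity of their (flattened) model weights and
    average within each cluster, yielding one global model per cluster.
    client_weights: per-client lists of same-shaped NumPy weight arrays."""
    flat = np.stack([np.concatenate([w.ravel() for w in ws])
                     for ws in client_weights])
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(flat)
    models = {}
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        total = sum(client_sizes[i] for i in idx)
        # data-size-weighted average, layer by layer (FedAvg within cluster)
        models[c] = [
            sum(client_weights[i][j] * (client_sizes[i] / total) for i in idx)
            for j in range(len(client_weights[0]))
        ]
    return labels, models
```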
75

Federated Learning for Time Series Forecasting Using Hybrid Model

Li, Yuntao January 2019 (has links)
Time series data has become ubiquitous thanks to affordable edge devices and sensors. Much of this data is valuable for decision making. When using these data for forecasting, the conventional centralized approach has shown deficiencies regarding large-scale data communication and data privacy. Furthermore, Neural Network models alone cannot make use of the extra information in the time series, so they usually fail to provide time-series-specific results. Both issues pose a challenge to large-scale time series forecasting with Neural Network models. These limitations lead to our research question: Can we realize decentralized time series forecasting with a Federated Learning mechanism whose forecasting performance is comparable to the conventional centralized setup? In this work, we propose a Federated Series Forecasting framework that resolves the challenge by allowing users to keep their data locally and learning a shared model by aggregating locally computed updates. In addition, we design a hybrid model that enables Neural Network models to utilize the extra information in the time series and thereby achieve time-series-specific learning. In particular, the proposed hybrid model outperforms state-of-the-art centralized baseline models on the NN5 and Ericsson KPI data. Meanwhile, the federated setting of the proposed model yields results comparable to the centralized setting on both the NN5 and Ericsson KPI data. Together, these results answer the research question of this thesis.
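A hybrid model in this sense pairs a simple statistical component with a neural component that learns what the statistical part cannot. The toy sketch below combines a seasonal-naive base forecast with a neural correction learned on the seasonal-difference residuals; this is only one illustrative hybrid design under stated assumptions (`nn_predict` is an already-trained residual predictor) and is not claimed to match the architecture in the thesis:

```python
import numpy as np

def hybrid_forecast(series, nn_predict, season=7):
    """One-step-ahead forecast: seasonal-naive base value plus a neural
    correction predicted from the history of seasonal differences."""
    base = series[-season]                       # value one season ago
    residuals = series[season:] - series[:-season]  # seasonal differences
    correction = nn_predict(residuals)           # NN predicts next residual
    return base + correction

# usage sketch: weekly-seasonal daily series as a NumPy array
# forecast = hybrid_forecast(history, trained_residual_model, season=7)
```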
76

Models and Representation Learning Mechanisms for Graph Data

Susheel Suresh (14228138) 15 December 2022 (has links)
Graph representation learning (GRL) has been increasingly used to model and understand data from a wide variety of complex systems spanning social, technological, bio-chemical and physical domains. GRL consists of two main components: (1) a parametrized encoder that provides representations of graph data and (2) a learning process to train the encoder parameters. Designing flexible encoders that capture the underlying invariances and characteristics of graph data is crucial to the success of GRL. On the other hand, the learning process drives the quality of the encoder representations, and developing principled learning mechanisms is vital for a number of growing applications in self-supervised, transfer and federated learning settings. To this end, we propose a suite of models and learning algorithms for GRL which form the two main thrusts of this dissertation.

In Thrust I, we propose two novel encoders which build upon a widely popular GRL encoder class called graph neural networks (GNNs). First, we empirically study the prediction performance of current GNN-based encoders when applied to graphs with heterogeneous node mixing patterns, using our proposed notion of local assortativity. We find that GNN performance in node prediction tasks strongly correlates with our local assortativity metric, revealing a limitation of these encoders. We propose to transform the input graph into a computation graph with proximity and structural information as distinct types of edges, and then propose a novel GNN-based encoder that operates on this computation graph and adaptively chooses between structure and proximity information. Empirically, adopting our transformation and encoder framework leads to improved node classification performance compared to baselines on real-world graphs that exhibit diverse mixing.

Secondly, we study the trade-off between expressivity and efficiency of GNNs when applied to temporal graphs for the task of link ranking. We develop an encoder that incorporates a labeling approach designed to allow for efficient joint inference over the candidate set, while provably boosting expressivity. We also propose to optimize a list-wise loss for improved ranking. With extensive evaluation on real-world temporal graphs, we demonstrate its improved performance and efficiency compared to baselines.

In Thrust II, we propose two principled encoder learning mechanisms for challenging and realistic graph data settings. First, we consider a scenario where only limited or even no labelled data is available for GRL. Recent research has converged on graph contrastive learning (GCL), where GNNs are trained to maximize the correspondence between representations of the same graph in its different augmented forms. However, we find that GNNs trained by traditional GCL often risk capturing redundant graph features and thus may be brittle and provide sub-par performance in downstream tasks. We then propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during training by optimizing adversarial graph augmentation strategies used in GCL. We pair AD-GCL with theoretical explanations and design a practical instantiation based on trainable edge-dropping graph augmentation. We experimentally validate AD-GCL by comparing it with state-of-the-art GCL methods, achieving performance gains in semi-supervised, unsupervised and transfer learning settings on benchmark chemical and biological molecule datasets.

Secondly, we consider a scenario where graph data is silo-ed across clients for GRL. We focus on two unique challenges encountered when applying distributed training to GRL: (i) client task heterogeneity and (ii) label scarcity. We propose a novel learning framework called federated self-supervised graph learning (FedSGL), which first utilizes a self-supervised objective to train GNNs in a federated fashion across clients; each client then fine-tunes the obtained GNN based on its local task and available labels. Our framework enables the federated GNN model to extract patterns from the common feature (attribute and graph topology) space without needing labels or being biased by heterogeneous local tasks. An extensive empirical study of FedSGL on both node and graph classification tasks yields fruitful insights into how the level of feature/task heterogeneity, the adopted federated algorithm and the level of label scarcity affect the clients' performance on their tasks.
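AD-GCL trains the encoder and the augmenter in opposite directions: the encoder maximizes agreement between a graph and its augmented view, while the learnable edge-dropping augmenter tries to minimize it. A compact PyTorch sketch of that alternation follows; `encoder`, `augmenter`, `batch` and the optimizers are assumed to exist, and the InfoNCE form shown is a common simplification rather than the dissertation's exact objective:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.2):
    """Contrastive loss: matching graph/view pairs in the batch are
    positives; every other pairing serves as a negative."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    return F.cross_entropy(logits, torch.arange(z1.size(0)))

def adgcl_step(encoder, augmenter, batch, opt_enc, opt_aug):
    # encoder step: minimize InfoNCE, i.e. maximize agreement between views
    opt_enc.zero_grad()
    loss = info_nce(encoder(batch), encoder(augmenter(batch)))
    loss.backward()
    opt_enc.step()
    # augmenter step: maximize the same loss (adversarial edge dropping),
    # removing redundant edges the encoder would otherwise rely on
    opt_aug.zero_grad()
    adv_loss = -info_nce(encoder(batch), encoder(augmenter(batch)))
    adv_loss.backward()
    opt_aug.step()
    return loss.item()
```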
77

Privacy leaks from deep linear networks : Information leak via shared gradients in federated learning systems / Sekretessläckor från djupa linjära nätverk : Informationsläckor via delning av gradienter i distribuerade lärande system

Shi, Guangze January 2022 (has links)
The field of Artificial Intelligence (AI) has always faced two major challenges. The first is that data is kept scattered and cannot be pooled for more efficient use. The second is that data privacy and security need to be continuously strengthened. Based on these two points, federated learning has been proposed as an emerging machine learning scheme. The idea of federated learning is to train neural networks collaboratively: each user receives the current weights of the network and then sends parameter updates (gradients) computed on their own data. Because the input data remains on-device and only the parameter gradients are shared, this scheme is considered effective in preserving data privacy. Some previous attacks also provide a false sense of security, since they only succeed in contrived settings, even for a single image. Our research mainly focuses on attacks on shared gradients, showing experimentally that private training data can be obtained from publicly shared gradients. We run experiments on both linear and convolutional deep networks; the results show that our attack is capable of threatening data privacy and that this threat is independent of the specific structure of the neural network. The method presented here only illustrates that recovering user data from shared gradients is feasible; it cannot be used as an attack to harvest private data at scale. The goal is to spark further research on federated learning, especially gradient security. We also briefly discuss possible defenses against our attack methods. Different defenses have their own advantages and disadvantages in terms of privacy protection. Therefore, data pre-processing and network structure adjustment may need further research, so that model training can achieve better privacy protection while maintaining high precision.
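The attack family described here (in the spirit of deep leakage from gradients) reconstructs training data by optimizing dummy inputs until their gradients match the ones a client shared. A hedged sketch follows, assuming the attacker knows the model architecture and current weights; variable names are illustrative:

```python
import torch

def invert_gradients(model, target_grads, x_shape, n_classes, steps=300):
    """DLG-style reconstruction sketch: optimize dummy inputs and soft labels
    so their gradients match the gradients shared by a client."""
    x = torch.randn(x_shape, requires_grad=True)
    y = torch.randn(x_shape[0], n_classes, requires_grad=True)  # soft labels
    opt = torch.optim.LBFGS([x, y])

    def closure():
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y.softmax(dim=1))
        grads = torch.autograd.grad(loss, model.parameters(),
                                    create_graph=True)
        # distance between dummy gradients and the client's shared gradients
        match = sum(((g - t) ** 2).sum() for g, t in zip(grads, target_grads))
        match.backward()
        return match

    for _ in range(steps):
        opt.step(closure)
    return x.detach(), y.softmax(dim=1).detach()
```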
78

NETWORK-AWARE FEDERATED LEARNING ACROSS HIGHLY HETEROGENEOUS EDGE/FOG NETWORKS

Su Wang (17592381) 09 December 2023 (has links)
<p dir="ltr">The parallel growth of contemporary machine learning (ML) technologies alongside edge/-fog networking has necessitated the development of novel paradigms to effectively manage their intersection. Specifically, the proliferation of edge devices equipped with data generation and ML model training capabilities has given rise to an alternative paradigm called federated learning (FL), moving away from traditional centralized ML common in cloud-based networks. FL involves training ML models directly on edge devices where data are generated.</p><p dir="ltr">A fundamental challenge of FL lies in the extensive heterogeneity inherent to edge/fog networks, which manifests in various forms such as (i) statistical heterogeneity: edge devices have distinct underlying data distributions, (ii) structural heterogeneity: edge devices have diverse physical hardware, (iii) data quality heterogeneity: edge devices have varying ratios of labeled and unlabeled data, and (iv) adversarial compromise: some edge devices may be compromised by adversarial attacks. This dissertation endeavors to capture and model these intricate relationships at the intersection of FL and highly heterogeneous edge/fog networks. To do so, this dissertation will initially develop closed-form expressions for the trade-offs between ML performance and resource cost considerations within edge/fog networks. Subsequently, it optimizes the fundamental processes of FL, encompassing aspects such as batch size control for stochastic gradient descent (SGD) and sampling for global aggregations. This optimization is jointly formulated with networking considerations, which include communication resource consumption and device-to-device (D2D) cooperation.</p><p dir="ltr">In the former half of the dissertation, the emphasis is first on optimizing device sampling for global aggregations in FL, and then on developing a self-sufficient hierarchical meta-learning approach for FL. These methodologies maximize expected ML model performance while addressing common challenges associated with statistical and system heterogeneity. Novel techniques, such as management of D2D data offloading, adaptive CPU clock cycle control, integration of meta-learning, and much more, enable these methodologies. In particular, the proposed hierarchical meta-learning approach enables rapid integration of new devices in large-scale edge/fog networks.</p><p dir="ltr">The latter half of the dissertation directs its ocus towards emerging forms of heterogeneity in FL scenarios, namely (i) heterogeneity in quantity and quality of local labeled and unlabeled data at edge devices and (ii) heterogeneity in terms of adversarially comprised edge devices. To deal with heterogeneous labeled/unlabeled data across edge networks, this dissertation proposes a novel methodology that enables multi-source to multi-target federated domain adaptation. This proposed methodology views edge devices as sources – devices with mostly labeled data that perform ML model training, or targets - devices with mostly unlabeled data that rely on sources’ ML models, and subsequently optimizes the network relationships. In the final chapter, a novel methodology to improve FL robustness is developed in part by viewing adversarial attacks on FL as a form of heterogeneity.</p>
79

Confidential Federated Learning with Homomorphic Encryption / Konfidentiellt federat lärande med homomorf kryptering

Wang, Zekun January 2023 (has links)
Federated Learning (FL), a variant of Machine Learning (ML) technology, has emerged as a prevalent method for multiple parties to collaboratively train ML models in a distributed manner with the help of a central server, normally supplied by a Cloud Service Provider (CSP). Nevertheless, many existing vulnerabilities threaten the advantages of FL and create potential risks to data security and privacy, such as data leakage, misuse of the central server, or eavesdroppers illicitly seeking sensitive information. Promising advanced cryptographic technologies such as Homomorphic Encryption (HE) and Confidential Computing (CC) can be utilized to enhance the security and privacy of FL. However, developing a framework that seamlessly combines these technologies to provide confidential FL while retaining efficiency remains an ongoing challenge. In this degree project, we develop a lightweight and user-friendly FL framework called Heflp, which integrates HE and CC to ensure data confidentiality and integrity throughout the entire FL lifecycle. Heflp supports four HE schemes to fit diverse user requirements, comprising three pre-existing schemes and one optimized scheme of our own design, named Flashev2, which achieves the highest time and spatial efficiency across most scenarios. The time and memory overheads of all four HE schemes are also evaluated, and their respective pros and cons are summarized. To validate its effectiveness, Heflp is tested on the MNIST dataset and the Threat Intelligence dataset provided by CanaryBit, and the results demonstrate that it successfully preserves data privacy without compromising model accuracy.
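The basic pattern behind HE-protected aggregation is that clients encrypt their updates with an additively homomorphic scheme, the server sums ciphertexts without ever decrypting, and only the key holder sees the aggregate. A minimal sketch with the python-paillier library follows; it illustrates the general pattern only, not Heflp's schemes or its Flashev2 design, and the toy update values are made up:

```python
from phe import paillier  # python-paillier: additively homomorphic encryption

pub, priv = paillier.generate_paillier_keypair(n_length=2048)

# each client encrypts its model update; the server adds ciphertexts
# without ever seeing the plaintext updates
client_updates = [[0.12, -0.05], [0.30, 0.07], [-0.11, 0.02]]
encrypted = [[pub.encrypt(w) for w in upd] for upd in client_updates]

agg = encrypted[0]
for upd in encrypted[1:]:
    agg = [a + b for a, b in zip(agg, upd)]   # ciphertext addition

# only the private-key holder can recover the (averaged) aggregate
average = [priv.decrypt(c) / len(client_updates) for c in agg]
```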
80

Dynamic GAN-based Clustering in Federated Learning

Kim, Yeongwoo January 2020 (has links)
As the era of Industry 4.0 arises, the number of devices connected to a network has increased. These devices continuously generate data carrying various information, from power consumption to the configuration of the devices. Since the data holds raw information about each local node in the network, exploiting this information has the potential to benefit the network through different methods. However, due to the large amount of non-IID data generated in each node, manually processing the data and tuning the methods has become challenging. To overcome this challenge, there have been attempts to apply automated methods: building accurate machine learning models from a subset of the collected data, or clustering network nodes by leveraging clustering algorithms and using machine learning models within each cluster. However, conventional clustering algorithms are imperfect in a distributed and dynamic network because of data privacy risks, non-dynamic clusters, and a fixed number of clusters. These limitations degrade the performance of the machine learning models because the clusters may become obsolete over time. Therefore, this thesis proposes a three-phase clustering algorithm for dynamic environments that leverages 1) GAN-based clustering, 2) cluster calibration, and 3) divisive clustering in federated learning. GAN-based clustering preserves data privacy because it eliminates the need to share raw data across the network to create clusters. Cluster calibration adds dynamics to fixed clusters by continuously updating them, which benefits the methods that manage the network. Moreover, divisive clustering explores different numbers of clusters by iteratively selecting a cluster and dividing it into multiple clusters. As a result, we create clusters suited to dynamic environments and improve the performance of the machine learning models within each cluster.
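One way to read the GAN-based clustering and calibration phases is that each cluster maintains a generative model, and a client joins the cluster whose discriminator rates its local data as most realistic, so only scores (never raw data) leave the node. The sketch below illustrates that assignment rule under these assumptions; the exact rule in the thesis may differ:

```python
import torch

def assign_cluster(local_data, discriminators):
    """Cluster-calibration sketch: re-assign a client to the cluster whose
    GAN discriminator scores the client's local data as most realistic.
    Scores are computed locally, so raw data never leaves the node."""
    scores = []
    for disc in discriminators:   # one trained discriminator per cluster
        with torch.no_grad():
            scores.append(disc(local_data).mean().item())
    return max(range(len(scores)), key=scores.__getitem__)

# periodic calibration keeps clusters from going stale as local data drifts:
# for each client: cluster_id = assign_cluster(client_batch, discriminators)
```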
